InterCellar
is a Bioconductor package that provides an
interactive Shiny application to enable the analysis of cell-cell
communication from single-cell RNA sequencing (scRNA-seq) data. Every
step of the analysis can be performed interactively, thus not requiring
any programming skills. Moreover, InterCellar
runs on your
local machine, avoiding issues related to data privacy.
InterCellar
is distributed as a Bioconductor package and
requires R (version 4.1) and Bioconductor (version 3.14).
To install the InterCellar package, enter:
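A sketch of the usual Bioconductor installation route, assuming the `BiocManager` installer from CRAN and the package name `InterCellar`:

```r
# Install the BiocManager helper from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install InterCellar from Bioconductor
BiocManager::install("InterCellar")
```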
Once InterCellar is successfully installed, it can be loaded as follows:
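Loading follows the standard R convention for attaching an installed package:

```r
# Attach the package to the current R session
library(InterCellar)
```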
In order to start the app, please run the following command:
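A sketch of the launch call, assuming the exported entry point is `run_app()` and that it accepts the `reproducible` flag mentioned in this guide:

```r
# Launch the Shiny app; reproducible = TRUE is the flag discussed in this guide
# (run_app() as the entry point is an assumption, check the package help pages)
InterCellar::run_app(reproducible = TRUE)
```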
InterCellar should open in a browser. If this does
not happen automatically, please open a browser and navigate to the
address shown (for example, Listening on http://127.0.0.1:6134).
The flag reproducible = TRUE ensures that your results will be
reproducible across R sessions.
The first step of the workflow requires the upload of pre-computed
results generated by an external tool capable of predicting cell-cell
communication mediated by ligand-receptor interactions.
InterCellar supports both published tools, such as CellPhoneDB v2 (Efremova et al. 2020),
CellChat (Jin et al. 2021), ICELLNET (Noël et al. 2021), and SingleCellSignalR
(Cabello-Aguilar et al. 2020), and custom
results produced by ad hoc methods, which must contain the necessary
information as described in the panel From custom analysis. For
this user guide, we will use CellPhoneDB (CPDB) results
computed on a scRNA-seq dataset from Chua et al. (2020). This dataset comprises data
of COVID-19 patients, divided into critical and moderate cases, as well as
healthy controls. The cell-cell interaction (CCI) data produced by CPDB for
each condition can be found at InterCellar-reproducibility.
By navigating to 1. Data and
Upload, we can import our three CCI datasets from the
Supported tools panel. We specify an existing local folder
where InterCellar
will create output folders to save
the figures and tables produced by the analysis. To upload a CCI dataset, we
must specify an ID and an output folder tag. Next, we can select the
folder containing the CellPhoneDB results from our local drive.
InterCellar
will read and pre-process the data and show the
resulting table in Table view. The pre-processing step
consists of:
Finally, we can switch active CCI data on the left menu, to easily analyze multiple datasets in parallel.
Once the input data has been uploaded, InterCellar
takes
us to the exploration of three Universes. Each universe
has its focus on a different biological domain: cell clusters, genes and
functions. Specific filtering options can be applied and multiple
visualization choices are available to enable a deep exploration of the
cellular communication.
This universe focuses on the clusters of cells participating in the communication. The filtering options allow the user to subset the dataset by:
The analyst will be able to see the effect of these filtering steps by looking at the box showing the number of total interactions.
Warning: these filters have global influence on the analysis, since they subset the input data!
Three tabs are part of the Cluster-verse: Network, Barplot and Table.
The Network of clusters shows the overall cellular communication. Nodes represent different clusters while edges show the (total or weighted by interaction score) number of paracrine interactions occurring between two clusters. Edges that fall back on the same cluster represent autocrine interactions.
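The edge weights described for the network can be illustrated with a small sketch. The table layout and the column names (`int_pair`, `clustA`, `clustB`, `score`) are assumptions made for illustration, and cluster pairs are treated as ordered, which is a simplification.

```r
# Toy CCI table; column names and values are illustrative assumptions
cci <- data.frame(
  int_pair = c("L1 & R1", "L2 & R2", "L1 & R1", "L3 & R3", "L2 & R2"),
  clustA   = c("Tcell", "Tcell", "Bcell", "Tcell", "Bcell"),
  clustB   = c("Bcell", "Bcell", "Bcell", "Tcell", "Tcell"),
  score    = c(0.8, 0.4, 0.6, 0.9, 0.2),
  stringsAsFactors = FALSE
)

# An interaction is autocrine when both partners belong to the same cluster
cci$type <- ifelse(cci$clustA == cci$clustB, "autocrine", "paracrine")

# Unweighted edges: number of interactions per cluster pair
n_edges <- aggregate(score ~ clustA + clustB + type, data = cci, FUN = length)
names(n_edges)[names(n_edges) == "score"] <- "n_interactions"

# Weighted edges: interaction scores summed per cluster pair
w_edges <- aggregate(score ~ clustA + clustB + type, data = cci, FUN = sum)
names(w_edges)[names(w_edges) == "score"] <- "weighted"

print(n_edges)
print(w_edges)
```

Rows of type `autocrine` correspond to edges that fall back on the same cluster.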
Barplot offers two different barplots representing: (1) the total number of interactions per cluster, divided into paracrine and autocrine interactions; and (2) the relative number of interactions for a certain cell type.
In the Table panel, the analyst can restrict the data exploration to a specific focus, by subsetting the data to one cluster of interest, called viewpoint, and one flow of communication among:
InterCellar's second universe focuses on genes.
Filtering options to exclude interaction pairs (int-pairs) are available
and are specific to the input tool chosen by the user. The
Table shows all distinct int-pairs enriched in our
data, regardless of the clusters in which these are found. Included in
this Table are Ensembl and UniProt IDs of each gene, with
hyperlinks to the respective web pages to facilitate investigation of
unfamiliar genes.
Upon selection of one or multiple int-pairs from the previous Table, a dot plot is generated and visible in the Dot Plot panel. The analyst can decide to select a subset of clusters for the visualization as well as choose different colors for high and low int-pair score.
The Network panel visualizes the selected int-pairs in a cluster network.
In the Function-verse, the analyst is required to
perform a functional annotation before proceeding to the next steps of
the analysis. For this purpose, InterCellar
offers multiple
sources of functional annotations in terms of Gene Ontology (queried from Ensembl, via the package biomaRt) and
pre-downloaded pathway databases (from the package graphite). After
selection of suitable sources, the annotation can be performed and a
Table showing all functional terms annotated to each
int-pair is displayed. Note that a functional term
is annotated to an int-pair only when the term is associated with
both genes (or gene complexes) that are partners in the interaction.
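The rule that a term is kept only when both interaction partners carry it amounts to a set intersection. A minimal sketch, in which the gene names, the term labels, and the helper `annotate_int_pair()` are all hypothetical:

```r
# Hypothetical gene-to-term annotations (placeholder names, not real data)
gene_terms <- list(
  geneA = c("term_1", "term_2", "term_3"),
  geneB = c("term_2", "term_3", "term_4")
)

# Keep only the terms shared by both partners of the int-pair
annotate_int_pair <- function(ligand, receptor, annotations) {
  intersect(annotations[[ligand]], annotations[[receptor]])
}

shared_terms <- annotate_int_pair("geneA", "geneB", gene_terms)
print(shared_terms)
```

For gene complexes, the same idea would require a choice on how to combine the terms of the subunits first; that choice is not specified here.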
The Barplot panel summarizes the number of functional terms annotated for each source.
In the following panel, Ranking, functional terms are listed individually, along with information on occurrence (i.e. how many int-pairs have been annotated to this term).
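The occurrence shown in the Ranking panel can be thought of as a count of distinct int-pairs per term. A sketch on a toy annotation table, whose layout and names are assumptions:

```r
# Toy annotation table: one row per (int-pair, functional term) association
annot <- data.frame(
  int_pair = c("p1", "p1", "p2", "p3", "p3", "p3"),
  term     = c("term_1", "term_2", "term_1", "term_1", "term_2", "term_3"),
  stringsAsFactors = FALSE
)

# Count each int-pair at most once per term, then rank terms by occurrence
annot <- unique(annot)
occurrence <- sort(table(annot$term), decreasing = TRUE)
ranking <- data.frame(
  term       = names(occurrence),
  occurrence = as.integer(occurrence),
  stringsAsFactors = FALSE
)
print(ranking)
```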
By selecting one row of the Ranking table, we can explore the term of interest in the Sunburst plot. This visualization allows the user to connect functions to int-pairs and clusters. On the left side of the panel, a table lists all int-pairs annotated to the term. The user can choose to visualize the number of interactions or the weighted number (by score). On the right side, the sunburst plot is composed as follows:
This step of InterCellar's workflow allows the analyst
to define and analyze Int-Pair Modules, i.e. groups of
int-pairs that share a similar functional profile. To this end, the
choice of a viewpoint cluster and communication flow
is required. InterCellar
will subset the input data
accordingly. This analysis can be repeated for each viewpoint and flow
of interest.
To define the number of int-pair modules in the data subset, four
visualizations are provided. On the left-hand side, the optimal number
of modules is calculated by InterCellar
using (1) the elbow
method on the total within-cluster sum of squares (which should be
minimized) and (2) the average silhouette width (which should be
maximized). Both methods are standard practice in cluster analysis and
are meant to guide the choice of the optimal number of modules.
However, the user is free to choose the best number of modules depending
on each case. In general, a high (low) number of groups is reflected in
a high (low) specificity of each module. Two further visualizations
offer another way to investigate the optimal number of modules. A
dendrogram of int-pairs shows the results of a
hierarchical clustering obtained on the first two components of the
UMAP underneath. Each point of the UMAP represents one
int-pair (shown by hovering) and color-coding is consistent for both
UMAP and dendrogram, showing the number of modules chosen. Moreover,
dendrogram and UMAP are initialized with the optimal number of modules
chosen by the elbow method (which usually gives a higher resolution than
the average silhouette).
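The two criteria can be sketched on a toy 2-D embedding standing in for the first two UMAP components. The data, the Ward linkage, and the helper `total_wss()` are assumptions made for illustration, not InterCellar's internal implementation; the `cluster` package is one of R's recommended packages.

```r
library(cluster)  # provides silhouette()

set.seed(1)
# Toy embedding: three well-separated groups of 20 points (int-pairs) each
emb <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2),
  cbind(rnorm(20, mean = 0, sd = 0.3), rnorm(20, mean = 5, sd = 0.3))
)

d  <- dist(emb)
hc <- hclust(d, method = "ward.D2")  # linkage choice is an assumption

# Total within-cluster sum of squares for a given partition
total_wss <- function(x, cl) {
  sum(vapply(split(as.data.frame(x), cl), function(g) {
    sum(scale(as.matrix(g), scale = FALSE)^2)
  }, numeric(1)))
}

ks  <- 2:6
# Elbow criterion: inspect where the decrease in wss flattens
wss <- vapply(ks, function(k) total_wss(emb, cutree(hc, k)), numeric(1))
# Silhouette criterion: pick the k with the largest average width
sil <- vapply(ks, function(k) {
  mean(silhouette(cutree(hc, k), d)[, "sil_width"])
}, numeric(1))

best_k_sil <- ks[which.max(sil)]
print(data.frame(k = ks, wss = wss, avg_silhouette = sil))
```

Plotting `wss` against `ks` gives the elbow curve, while `best_k_sil` is the number of modules favored by the average silhouette width.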
Once the int-pair modules have been defined, InterCellar
offers the possibility to visualize the int-pairs belonging to each
module and the respective clusters in a Circle plot.
Directed interactions are represented here by arrows originating from
ligands (double segment) towards receptors (single segment). The
Table panel summarizes the same information in a tabular
format.
The last step of the int-pair modules analysis concerns functional terms.
InterCellar performs a permutation test to calculate
empirical p-values assessing the significance of functional terms
annotated to int-pairs of each module. A Table displays
functional terms that are found significant (p-value <= 0.05 by
default) for the chosen int-pair module. The significant functional
terms listed in these tables can help the user to “manually” select
terms that are of biological interest and can be used to annotate the
UMAP, as we did in our manuscript.
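The idea of an empirical p-value from a permutation test can be sketched as follows. The null model used here (shuffling module labels among int-pairs) and the helper `empirical_p()` are assumptions for illustration; the exact procedure used by InterCellar is not described in this guide.

```r
set.seed(42)
# Toy data: 30 int-pairs in three modules, and whether each carries a given term
module   <- rep(c("m1", "m2", "m3"), each = 10)
has_term <- c(rep(TRUE, 8), rep(FALSE, 2),   # term concentrated in module m1
              rep(FALSE, 9), TRUE,           # one occurrence in m2
              rep(FALSE, 10))                # none in m3

empirical_p <- function(has_term, module, target, n_perm = 1000) {
  observed <- sum(has_term[module == target])
  permuted <- replicate(n_perm, {
    shuffled <- sample(module)  # break the link between module and term
    sum(has_term[shuffled == target])
  })
  # Add-one correction keeps the estimate away from exactly zero
  (sum(permuted >= observed) + 1) / (n_perm + 1)
}

p_m1 <- empirical_p(has_term, module, "m1")
p_m3 <- empirical_p(has_term, module, "m3")
print(c(m1 = p_m1, m3 = p_m3))
```

A term would then be reported for a module when its empirical p-value falls at or below the chosen threshold (0.05 by default in the app).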
The final step of InterCellar's analysis allows the
comparison of cell-cell communication from different conditions (up to
3). The user can choose which conditions to compare (we recommend having
the same, or a very similar, composition in terms of cell clusters). The
analysis is then structured as follows:
InterCellar computes
int-pair/cluster-pair couplets that are unique to each
condition and displays them in the Table. Upon
selection of one or multiple unique couplets, a Dot
Plot is generated, showing the occurrence of each couplet in
the respective condition. A Pie Chart summarizes the
contribution of the selected couplets to each condition.
InterCellar implements a
permutation test to calculate an empirical p-value of significance for
the functional terms that were annotated to these unique int-pairs.
Table-FuncTerms shows all functional terms that are
significantly enriched by int-pairs unique to each condition. Finally,
by selecting a term of interest, the user can visualize it in a
Sunburst Plot.
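The notion of couplets unique to one condition, used at the start of this step, can be sketched with set differences. The condition names follow the dataset used in this guide, while the couplet labels and their encoding as strings are hypothetical:

```r
# Hypothetical int-pair/cluster-pair couplets per condition (placeholder labels)
couplets <- list(
  control  = c("L1&R1|A::B", "L2&R2|A::C"),
  moderate = c("L1&R1|A::B", "L3&R3|B::C"),
  critical = c("L1&R1|A::B", "L3&R3|B::C", "L4&R4|C::A")
)

# A couplet is unique to a condition when no other condition contains it
unique_couplets <- lapply(names(couplets), function(cond) {
  others <- unlist(couplets[names(couplets) != cond], use.names = FALSE)
  setdiff(couplets[[cond]], others)
})
names(unique_couplets) <- names(couplets)

print(unique_couplets)
```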