| Title: | Coralysis sensitive identification of imbalanced cell types and states in single-cell data via multi-level integration |
|---|---|
| Description: | Coralysis is an R package featuring a multi-level integration algorithm for sensitive integration, reference-mapping, and cell-state identification in single-cell data. The multi-level integration algorithm is inspired by the process of assembling a puzzle, where one begins by grouping pieces based on low- to high-level features, such as color and shading, before looking into shape and patterns. This approach progressively blends the batch effects and separates cell types across multiple rounds of divisive clustering. |
| Authors: | António Sousa [cre, aut] (ORCID: <https://orcid.org/0000-0003-4779-6459>), Johannes Smolander [ctb, aut] (ORCID: <https://orcid.org/0000-0003-3872-9668>), Sini Junttila [aut] (ORCID: <https://orcid.org/0000-0003-3754-5584>), Laura L Elo [aut] (ORCID: <https://orcid.org/0000-0001-5648-4532>) |
| Maintainer: | António Sousa <[email protected]> |
| License: | GPL-3 |
| Version: | 1.3.0 |
| Built: | 2026-05-30 08:51:26 UTC |
| Source: | https://github.com/bioc/Coralysis |
The function aggregates feature expression by cell clusters, per batch if provided.

    AggregateDataByBatch.SingleCellExperiment(object, batch.label, nhvg, p, ...)

    ## S4 method for signature 'SingleCellExperiment'
    AggregateDataByBatch(object, batch.label, nhvg = 2000L, p = 30L, ...)

object |
An object of |
batch.label |
Cluster identities vector corresponding to the cells in
|
nhvg |
Integer of the number of highly variable features to select. By default 2000L. |
p |
Integer. By default 30L. |
... |
Parameters to be passed to |
A SingleCellExperiment object with feature expression aggregated by clusters.

    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Import data from Zenodo
    data.url <- "https://zenodo.org/records/14871436/files/pbmc_10Xassays.rds?download=1"
    sce <- readRDS(file = url(data.url))

    # Run with a batch
    set.seed(1204)
    sce <- AggregateDataByBatch(object = sce, batch.label = "batch")
    logcounts(sce)[1:10, 1:10]
    head(metadata(sce)$clusters)

    # Run without a batch
    set.seed(1204)
    sce <- AggregateDataByBatch(object = sce, batch.label = NULL)
    logcounts(sce)[1:10, 1:10]
    head(metadata(sce)$clusters)

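To make the aggregation step concrete, the following base R sketch averages feature expression over clusters that are computed separately within each batch. It is not the Coralysis implementation: the hierarchical clustering with `hclust()`, the choice of three clusters per batch, and the names `expr`, `cluster` and `agg` are assumptions made purely for illustration.

```r
# Toy expression matrix: 4 features (rows) x 150 cells (columns)
expr <- t(as.matrix(iris[, 1:4]))
colnames(expr) <- paste0("samp", seq_len(ncol(expr)))

# Hypothetical batch labels, assigned at random
set.seed(239)
batch <- sample(c("b1", "b2"), size = ncol(expr), replace = TRUE)

# Cluster the cells of each batch separately (assumption: 3 clusters per batch)
cluster <- character(ncol(expr))
for (b in unique(batch)) {
  idx <- which(batch == b)
  tree <- hclust(dist(t(expr[, idx])))
  cluster[idx] <- paste(b, cutree(tree, k = 3), sep = "_")
}

# Average every feature over the cells of each batch-specific cluster
sums <- rowsum(t(expr), group = cluster)                   # clusters x features
agg <- t(sums / as.vector(table(cluster)[rownames(sums)])) # features x clusters
dim(agg)
```

The result has one column per batch-specific cluster instead of one column per cell, which is the kind of reduction the function description refers to.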
Bin cell cluster probability by a given cell label.

    BinCellClusterProbability.SingleCellExperiment(
      object,
      label,
      icp.run,
      icp.round,
      funs,
      bins,
      aggregate.bins.by,
      use.assay
    )

    ## S4 method for signature 'SingleCellExperiment'
    BinCellClusterProbability(
      object,
      label,
      icp.run = NULL,
      icp.round = NULL,
      funs = "mean",
      bins = 20,
      aggregate.bins.by = "mean",
      use.assay = "logcounts"
    )

object |
An object of |
label |
Label of interest available in |
icp.run |
ICP run(s) to retrieve from |
icp.round |
ICP round(s) to retrieve from |
funs |
One function to summarise ICP cell cluster probability. One of |
bins |
Number of bins to bin cell cluster probability by cell |
aggregate.bins.by |
One function to aggregate One of |
use.assay |
Name of the assay that should be used to obtain the average expression
of features across cell |
A SingleCellExperiment class object with feature average expression by
cell label probability bins.

    # Packages
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Import data from Zenodo
    data.url <- "https://zenodo.org/records/14845751/files/pbmc_10Xassays.rds?download=1"
    sce <- readRDS(file = url(data.url))

    # Prepare data
    sce <- PrepareData(object = sce)

    # Multi-level integration - 'L = 4' just for highlighting purposes
    set.seed(123)
    sce <- RunParallelDivisiveICP(
        object = sce, batch.label = "batch",
        L = 4, threads = 2
    )

    # Cell states SCE object for a given cell type annotation or clustering
    cellstate.sce <- BinCellClusterProbability(
        object = sce,
        label = "cell_type",
        icp.round = 4,
        bins = 20
    )
    cellstate.sce

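The binning idea can be illustrated without the package: cells are ordered by a probability, split into bins, and feature expression is averaged within each bin. The sketch below uses simulated data and equal-width bins from `cut()`; both are assumptions, and the package may define its bins differently. The names `prob`, `expr` and `bin.expr` are hypothetical.

```r
set.seed(123)

# Hypothetical inputs: one summarised cluster probability per cell
# and a toy expression matrix (features x cells)
n.cells <- 200
prob <- runif(n.cells)
expr <- matrix(
  rpois(5 * n.cells, lambda = 3),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), NULL)
)

# Split the observed probability range into 20 equal-width bins
bins <- 20
bin.id <- cut(prob, breaks = bins, labels = FALSE)

# Average the expression of every feature across the cells in each bin
bin.expr <- sapply(split(seq_len(n.cells), bin.id), function(i) {
  rowMeans(expr[, i, drop = FALSE])
})
dim(bin.expr)
```

Each column of `bin.expr` then plays the role of one probability bin, so downstream steps work on a few bin profiles rather than on individual cells.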
Correlation between cell bins for the given labels and features.

    CellBinsFeatureCorrelation.SingleCellExperiment(object, labels, method)

    ## S4 method for signature 'SingleCellExperiment'
    CellBinsFeatureCorrelation(object, labels = NULL, method = "pearson")

object |
An object of |
labels |
Character of label(s) from the label provided to the function
|
method |
Character specifying the correlation method to use. One of
|
A data frame with the correlation coefficient for each feature (rows) across labels (columns).

    # Packages
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Import data from Zenodo
    data.url <- "https://zenodo.org/records/14845751/files/pbmc_10Xassays.rds?download=1"
    sce <- readRDS(file = url(data.url))

    # Prepare data
    sce <- PrepareData(object = sce)

    # Multi-level integration - 'L = 4' just for highlighting purposes
    set.seed(123)
    sce <- RunParallelDivisiveICP(
        object = sce, batch.label = "batch",
        L = 4, threads = 2
    )

    # Cell states SCE object for a given cell type annotation or clustering
    cellstate.sce <- BinCellClusterProbability(
        object = sce,
        label = "cell_type",
        icp.round = 4,
        bins = 20
    )
    cellstate.sce

    # Pearson correlated features with "Monocyte"
    cor.features.mono <- CellBinsFeatureCorrelation(
        object = cellstate.sce,
        labels = "Monocyte"
    )

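As a rough illustration of what a per-feature correlation across bins looks like, the base R sketch below correlates simulated bin-level expression with a simulated bin probability. The data, the label name `Monocyte` in the output column, and the variable names are assumptions; only `cor()` with `method = "pearson"` mirrors the documented default.

```r
set.seed(42)

# Hypothetical mean probability of each of 20 bins for one label
bins <- 20
bin.prob <- seq(0.05, 1, length.out = bins)

# Simulated average expression per bin for three features
bin.expr <- rbind(
  up   = 2 * bin.prob + rnorm(bins, sd = 0.05), # rises with probability
  down = 1 - bin.prob + rnorm(bins, sd = 0.05), # falls with probability
  flat = rnorm(bins, sd = 0.05)                 # unrelated to probability
)

# One correlation coefficient per feature (rows), one column per label
cors <- apply(bin.expr, 1, function(x) cor(x, bin.prob, method = "pearson"))
data.frame(Monocyte = cors)
```

Features with coefficients near 1 or -1 change steadily along the probability gradient, which is why such a table is useful for ranking candidate cell-state features.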
Plot cell cluster probability distribution per label by group.

    CellClusterProbabilityDistribution.SingleCellExperiment(
      object,
      label,
      group,
      probability
    )

    ## S4 method for signature 'SingleCellExperiment'
    CellClusterProbabilityDistribution(
      object,
      label,
      group,
      probability = "scaled_mean_probs"
    )

object |
An object of |
label |
Character specifying the |
group |
Character specifying the |
probability |
Character specifying the aggregated cell cluster probability
variable available in |
A plot of class ggplot.

    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Create toy SCE data
    batches <- c("b1", "b2")
    set.seed(239)
    batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
    sce <- SingleCellExperiment(
        assays = list(logcounts = t(iris[, 1:4])),
        colData = DataFrame(
            "Species" = iris$Species,
            "Batch" = batch
        )
    )
    colnames(sce) <- paste0("samp", 1:ncol(sce))

    # Prepare SCE object for analysis
    sce <- PrepareData(sce)

    # Multi-level integration (just for highlighting purposes; use default parameters)
    set.seed(123)
    sce <- RunParallelDivisiveICP(
        object = sce, batch.label = "Batch",
        k = 4, L = 25, C = 1, d = 0.5,
        train.with.bnn = FALSE, use.cluster.seed = FALSE,
        build.train.set = FALSE, ari.cutoff = 0.1,
        threads = 2, RNGseed = 1024
    )

    # Summarise cell cluster probability
    sce <- SummariseCellClusterProbability(object = sce, icp.round = 2) # saved in 'colData'

    # Search for differences in probabilities across group(s)
    # give an interesting variable to the "group" parameter
    prob.dist <- CellClusterProbabilityDistribution(
        object = sce,
        label = "Species",
        group = "Batch",
        probability = "scaled_mean_probs"
    )
    prob.dist # print plot

FindAllClusterMarkers enables identifying feature markers for all
clusters at once. This is done by differential expression analysis where
cells from one cluster are compared against the cells from the rest of the
clusters. Feature and cell filters can be applied to accelerate the analysis,
but this might lead to missing weak signals.

    FindAllClusterMarkers.SingleCellExperiment(
      object,
      clustering.label,
      test,
      log2fc.threshold,
      min.pct,
      min.diff.pct,
      min.cells.group,
      max.cells.per.cluster,
      return.thresh,
      only.pos
    )

    ## S4 method for signature 'SingleCellExperiment'
    FindAllClusterMarkers(
      object,
      clustering.label,
      test = "wilcox",
      log2fc.threshold = 0.25,
      min.pct = 0.1,
      min.diff.pct = NULL,
      min.cells.group = 3,
      max.cells.per.cluster = NULL,
      return.thresh = 0.01,
      only.pos = FALSE
    )

object |
A |
clustering.label |
A variable name (of class |
test |
Which test to use. Only "wilcox" (the Wilcoxon rank-sum test, AKA Mann-Whitney U test) is supported at the moment. |
log2fc.threshold |
Filters out features that have log2 fold-change of the
averaged feature expression values below this threshold. Default is 0.25. |
min.pct |
Filters out features that have dropout rate (fraction of cells
expressing a feature) below this threshold in both comparison groups. Default is 0.1. |
min.diff.pct |
Filters out features that do not have this minimum
difference in the dropout rates (fraction of cells expressing a feature)
between the two comparison groups. Default is NULL. |
min.cells.group |
The minimum number of cells in the two comparison
groups to perform the DE analysis. If the number of cells is below the
threshold, then the DE analysis of this cluster is skipped. Default is 3. |
max.cells.per.cluster |
The maximum number of cells per cluster if
downsampling is performed to speed up the DE analysis. Default is NULL. |
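return.thresh |
If only.pos=TRUE, then return only features that
have the adjusted p-value (adjusted by the Bonferroni method) below or
equal to this threshold. Default is 0.01. |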
only.pos |
Whether to return only features that have an adjusted p-value
(adjusted by the Bonferroni method) below or equal to the threshold. Default
is FALSE. |
A data frame of the results if positive results were found, else NULL.

    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Create toy SCE data
    batches <- c("b1", "b2")
    set.seed(239)
    batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
    sce <- SingleCellExperiment(
        assays = list(logcounts = t(iris[, 1:4])),
        colData = DataFrame(
            "Species" = iris$Species,
            "Batch" = batch
        )
    )
    colnames(sce) <- paste0("samp", 1:ncol(sce))

    # Markers
    dge <- FindAllClusterMarkers(sce, clustering.label = "Species")
    dge

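The one-versus-rest comparison described above can be sketched in base R with `wilcox.test()` and `p.adjust()`. This is an illustration of the general technique, not the package code: the output column names, the use of a plain difference in means instead of a log2 fold-change, and adjusting the p-values within each cluster are all assumptions.

```r
# Toy data: features x cells, with species standing in for clusters
expr <- t(as.matrix(iris[, 1:4]))
clusters <- iris$Species

# Compare each cluster against all remaining cells, feature by feature
results <- do.call(rbind, lapply(levels(clusters), function(cl) {
  in.cl <- clusters == cl
  p <- apply(expr, 1, function(x) {
    wilcox.test(x[in.cl], x[!in.cl], exact = FALSE)$p.value
  })
  data.frame(
    cluster = cl,
    feature = rownames(expr),
    diff.mean = rowMeans(expr[, in.cl]) - rowMeans(expr[, !in.cl]),
    p.value = p,
    adj.p.value = p.adjust(p, method = "bonferroni"),
    row.names = NULL
  )
}))

# Keep only the comparisons that pass an adjusted p-value threshold
results[results$adj.p.value <= 0.01, ]
```

Every cluster contributes one row per feature, so the combined table can be filtered or ranked per cluster afterwards.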
FindClusterMarkers enables identifying feature markers for one
cluster or two arbitrary combinations of clusters, e.g. 1_2 vs. 3_4_5. Feature
and cell filters can be applied to accelerate the analysis, but this might
lead to missing weak signals.

    FindClusterMarkers.SingleCellExperiment(
      object,
      clustering.label,
      clusters.1,
      clusters.2,
      test,
      log2fc.threshold,
      min.pct,
      min.diff.pct,
      min.cells.group,
      max.cells.per.cluster,
      return.thresh,
      only.pos
    )

    ## S4 method for signature 'SingleCellExperiment'
    FindClusterMarkers(
      object,
      clustering.label,
      clusters.1 = NULL,
      clusters.2 = NULL,
      test = "wilcox",
      log2fc.threshold = 0.25,
      min.pct = 0.1,
      min.diff.pct = NULL,
      min.cells.group = 3,
      max.cells.per.cluster = NULL,
      return.thresh = 0.01,
      only.pos = FALSE
    )

object |
A |
clustering.label |
A variable name (of class |
clusters.1 |
a character or numeric vector denoting which clusters to use in the first group (named group.1 in the results) |
clusters.2 |
a character or numeric vector denoting which clusters to use in the second group (named group.2 in the results) |
test |
Which test to use. Only "wilcox" (the Wilcoxon rank-sum test, AKA Mann-Whitney U test) is supported at the moment. |
log2fc.threshold |
Filters out features that have log2 fold-change of the
averaged feature expression values below this threshold.
Default is 0.25. |
min.pct |
Filters out features that have dropout rate (fraction of cells
expressing a feature) below this threshold in both comparison groups.
Default is 0.1. |
min.diff.pct |
Filters out features that do not have this minimum
difference in the dropout rates (fraction of cells expressing a feature)
between the two comparison groups. Default is NULL. |
min.cells.group |
The minimum number of cells in the two comparison
groups to perform the DE analysis. If the number of cells is below the
threshold, then the DE analysis is not performed.
Default is 3. |
max.cells.per.cluster |
The maximum number of cells per cluster
if downsampling is performed to speed up the DE analysis.
Default is NULL. |
return.thresh |
If only.pos=TRUE, then return only features that
have the adjusted p-value (adjusted by the Bonferroni method) below or
equal to this threshold. Default is 0.01. |
only.pos |
Whether to return only features that have an adjusted
p-value (adjusted by the Bonferroni method) below or equal to the
threshold. Default is FALSE. |
A data frame of the results if positive results were found, else NULL.

    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Create toy SCE data
    batches <- c("b1", "b2")
    set.seed(239)
    batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
    sce <- SingleCellExperiment(
        assays = list(logcounts = t(iris[, 1:4])),
        colData = DataFrame(
            "Species" = iris$Species,
            "Batch" = batch
        )
    )
    colnames(sce) <- paste0("samp", 1:ncol(sce))

    # Markers between versicolor vs virginica
    dge <- FindClusterMarkers(sce,
        clustering.label = "Species",
        clusters.1 = "versicolor",
        clusters.2 = "virginica"
    )
    dge

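The feature filters applied before testing can be pictured with a small hand-made matrix. The sketch below is a guess at the general shape of such filters, not the package's code: treating a value above zero as "expressed", the pseudocount of 1 in the fold-change, and the illustrative `min.pct` of 0.25 (chosen to suit a toy data set with six cells per group) are all assumptions.

```r
# Toy log-expression matrix (features x cells); the first 6 cells form group 1
expr <- rbind(
  marker = c(3, 2, 4, 3, 0, 2,   0, 0, 1, 0, 0, 0),
  rare   = c(0, 0, 0, 0, 0, 1,   0, 0, 0, 0, 0, 0),
  shared = c(2, 2, 3, 2, 2, 3,   2, 3, 2, 2, 3, 2)
)
group.1 <- rep(c(TRUE, FALSE), each = 6)

# Fraction of cells expressing each feature in the two groups
pct.1 <- rowMeans(expr[, group.1] > 0)
pct.2 <- rowMeans(expr[, !group.1] > 0)

# log2 fold-change of the averaged expression (pseudocount is an assumption)
log2fc <- log2(rowMeans(expr[, group.1]) + 1) - log2(rowMeans(expr[, !group.1]) + 1)

# Illustrative thresholds for this toy data
min.pct <- 0.25
log2fc.threshold <- 0.25

# Keep features detected often enough in at least one group
# and with a large enough absolute fold-change
keep <- pmax(pct.1, pct.2) >= min.pct & abs(log2fc) >= log2fc.threshold
names(keep)[keep]
```

Here `rare` is dropped by the detection filter and `shared` by the fold-change filter, which shows how such filters save testing time at the risk of discarding weak signals.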
Get ICP cell cluster probability table(s)

    GetCellClusterProbability.SingleCellExperiment(
      object,
      icp.run,
      icp.round,
      concatenate
    )

    ## S4 method for signature 'SingleCellExperiment'
    GetCellClusterProbability(
      object,
      icp.run = NULL,
      icp.round = NULL,
      concatenate = TRUE
    )

object |
An object of |
icp.run |
ICP run(s) to retrieve from |
icp.round |
ICP round(s) to retrieve from |
concatenate |
Concatenate list of ICP cell cluster probability tables retrieved.
By default TRUE. |
A list with ICP cell cluster probability tables or a matrix with concatenated tables.

    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Create toy SCE data
    batches <- c("b1", "b2")
    set.seed(239)
    batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
    sce <- SingleCellExperiment(
        assays = list(logcounts = t(iris[, 1:4])),
        colData = DataFrame(
            "Species" = iris$Species,
            "Batch" = batch
        )
    )
    colnames(sce) <- paste0("samp", 1:ncol(sce))

    # Prepare SCE object for analysis
    sce <- PrepareData(sce)

    # Multi-level integration (just for highlighting purposes; use default parameters)
    set.seed(123)
    sce <- RunParallelDivisiveICP(
        object = sce, batch.label = "Batch",
        k = 2, L = 25, C = 1, train.k.nn = 10,
        train.k.nn.prop = NULL, use.cluster.seed = FALSE,
        build.train.set = FALSE, ari.cutoff = 0.1,
        threads = 2, RNGseed = 1024
    )

    # Get cluster probability for all ICP runs
    probs <- GetCellClusterProbability(object = sce, icp.round = 1, concatenate = TRUE)
    probs[1:10, 1:5]

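The difference between the two return shapes (a list of tables versus one concatenated matrix) can be shown with simulated probability tables. The helper `make.probs()`, the `icp_*` names, and the column naming scheme are hypothetical; the sketch only assumes that concatenation means binding the per-run tables column-wise.

```r
set.seed(7)

# Hypothetical helper: a cells x clusters table whose rows sum to 1
make.probs <- function(n.cells, k) {
  m <- matrix(runif(n.cells * k), nrow = n.cells)
  m / rowSums(m)
}

# One probability table per (simulated) ICP run
prob.list <- list(icp_1 = make.probs(6, 2), icp_2 = make.probs(6, 2))
for (nm in names(prob.list)) {
  colnames(prob.list[[nm]]) <- paste(nm, seq_len(ncol(prob.list[[nm]])), sep = "_")
}

# Column-wise concatenation into a single cells x (runs * clusters) matrix
joint <- do.call(cbind, prob.list)
dim(joint)
```

In the concatenated form every cell is described by its probabilities from all runs at once, so each row sums to the number of tables that were bound together.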
Get feature coefficients from ICP models.

    GetFeatureCoefficients.SingleCellExperiment(
      object,
      icp.run = NULL,
      icp.round = NULL
    )

    ## S4 method for signature 'SingleCellExperiment'
    GetFeatureCoefficients(object, icp.run = NULL, icp.round = NULL)

object |
An object of |
icp.run |
ICP run(s) to retrieve from |
icp.round |
ICP round(s) to retrieve from |
A list of feature coefficient weights per cluster per ICP run/round.

    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Create toy SCE data
    batches <- c("b1", "b2")
    set.seed(239)
    batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
    sce <- SingleCellExperiment(
        assays = list(logcounts = t(iris[, 1:4])),
        colData = DataFrame(
            "Species" = iris$Species,
            "Batch" = batch
        )
    )
    colnames(sce) <- paste0("samp", 1:ncol(sce))

    # Prepare SCE object for analysis
    sce <- PrepareData(sce)

    # Multi-level integration (just for highlighting purposes; use default parameters)
    set.seed(123)
    sce <- RunParallelDivisiveICP(
        object = sce, batch.label = "Batch",
        k = 4, L = 25, C = 1, d = 0.5,
        train.with.bnn = FALSE, use.cluster.seed = FALSE,
        build.train.set = FALSE, ari.cutoff = 0.1,
        threads = 2, RNGseed = 1024
    )

    # GetFeatureCoefficients
    gene_coefficients_icp_7_1 <- GetFeatureCoefficients(object = sce, icp.run = 7, icp.round = 1)
    head(gene_coefficients_icp_7_1$icp_13)

The HeatmapFeatures function draws a heatmap of features
by cluster identity.

    HeatmapFeatures.SingleCellExperiment(
      object,
      clustering.label,
      features,
      use.color,
      seed.color,
      ...
    )

    ## S4 method for signature 'SingleCellExperiment'
    HeatmapFeatures(
      object,
      clustering.label,
      features,
      use.color = NULL,
      seed.color = 123,
      ...
    )

object |
of |
clustering.label |
A variable name (of class |
features |
Feature names to plot by cluster ( |
use.color |
Character specifying the colors for the clusters. By default NULL. |
seed.color |
Seed to randomly select colors for the clusters. By default 123. |
... |
Parameters to pass to |
Nothing is returned; the function draws the heatmap.

    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Create toy SCE data
    batches <- c("b1", "b2")
    set.seed(239)
    batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
    sce <- SingleCellExperiment(
        assays = list(logcounts = t(iris[, 1:4])),
        colData = DataFrame(
            "Species" = iris$Species,
            "Batch" = batch
        )
    )
    colnames(sce) <- paste0("samp", 1:ncol(sce))

    # Plot features by clustering, i.e., grouping variable
    # without scaling rows (using 'logcounts' expression):
    HeatmapFeatures(
        object = sce, clustering.label = "Species",
        features = row.names(sce)[1:4]
    )

    # scaling rows:
    HeatmapFeatures(
        object = sce, clustering.label = "Species",
        features = row.names(sce)[1:4], scale = "row"
    )

Get ICP feature coefficients for a label of interest by majority voting label across ICP clusters.

    MajorityVotingFeatures.SingleCellExperiment(object, label)

    ## S4 method for signature 'SingleCellExperiment'
    MajorityVotingFeatures(object, label)

object |
An object of |
label |
Label of interest available in |
A list with a list of data frames with feature weights per label and a data frame with a summary by label.

    ## Not run:
    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Import data from Zenodo
    data.url <- "https://zenodo.org/records/14845751/files/pbmc_10Xassays.rds?download=1"
    sce <- readRDS(file = url(data.url))

    # Multi-level integration (just for highlighting purposes; use default parameters)
    set.seed(123)
    sce <- RunParallelDivisiveICP(
        object = sce, batch.label = "batch",
        k = 4, L = 10, C = 1, d = 0.5,
        train.with.bnn = FALSE, use.cluster.seed = FALSE,
        build.train.set = FALSE, ari.cutoff = 0.1,
        threads = 2
    )

    # Get coefficients by majority voting for a given categorical variable
    coeff <- MajorityVotingFeatures(object = sce, label = "cell_type")
    coeff$summary  # summary by label (the object is named 'coeff')
    order.rows <- order(coeff$feature_coeff$Monocyte$coeff_clt2,
        decreasing = TRUE
    )
    head(coeff$feature_coeff$Monocyte[order.rows, ], n = 10)
    ## End(Not run)

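Majority voting itself is simple to demonstrate: each cluster is assigned the label carried by most of its cells. The base R sketch below uses made-up labels and cluster assignments, and shows only the voting step; how the package then links the winning label to the feature coefficients of that cluster is not covered here.

```r
# Hypothetical cell labels and ICP cluster assignments for 10 cells
label   <- c("B", "B", "T", "T", "T", "Mono", "Mono", "B", "T", "Mono")
cluster <- c(1, 1, 2, 2, 2, 3, 3, 3, 1, 3)

# Count the labels within each cluster (rows: clusters, columns: labels)
votes <- table(cluster, label)

# The most frequent label wins; which.max() keeps the first label on ties
majority <- colnames(votes)[apply(votes, 1, which.max)]
names(majority) <- rownames(votes)
majority
```

In this toy case cluster 1 is assigned `B`, cluster 2 `T` and cluster 3 `Mono`, even though clusters 1 and 3 each contain one cell with a different label.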
Draw an elbow plot of the standard deviations of the principal
components to deduce an appropriate value for p.

    PCAElbowPlot.SingleCellExperiment(object, dimred.name, return.plot)

    ## S4 method for signature 'SingleCellExperiment'
    PCAElbowPlot(object, dimred.name = "PCA", return.plot = FALSE)

object |
A |
dimred.name |
Dimensional reduction name of the PCA to select from
|
return.plot |
logical indicating if the ggplot2 object should be returned.
By default FALSE. |
A ggplot2 object, if return.plot=TRUE.

    # Import package
    suppressPackageStartupMessages(library("SingleCellExperiment"))

    # Create toy SCE data
    batches <- c("b1", "b2")
    set.seed(239)
    batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
    sce <- SingleCellExperiment(
        assays = list(logcounts = t(iris[, 1:4])),
        colData = DataFrame(
            "Species" = iris$Species,
            "Batch" = batch
        )
    )
    colnames(sce) <- paste0("samp", 1:ncol(sce))

    # Prepare SCE object for analysis
    sce <- PrepareData(sce)

    # Multi-level integration (just for highlighting purposes; use default parameters)
    set.seed(123)
    sce <- RunParallelDivisiveICP(
        object = sce, batch.label = "Batch",
        k = 2, L = 25, C = 1, train.k.nn = 10,
        train.k.nn.prop = NULL, use.cluster.seed = FALSE,
        build.train.set = FALSE, ari.cutoff = 0.1,
        threads = 2, RNGseed = 1024
    )

    # Integrated PCA
    set.seed(125) # to ensure reproducibility for the default 'irlba' method
    sce <- RunPCA(object = sce, assay.name = "joint.probability", p = 10)

    # Plot result
    cowplot::plot_grid(
        PlotDimRed(
            object = sce, color.by = "Batch",
            legend.nrow = 1
        ),
        PlotDimRed(
            object = sce, color.by = "Species",
            legend.nrow = 1
        ),
        ncol = 2
    )

    # Plot Elbow
    PCAElbowPlot(sce)

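For readers unfamiliar with elbow plots, the quantity being plotted is the standard deviation of each principal component in order. The base R sketch below uses `prcomp()` on the iris measurements as a stand-in for the PCA stored in the object; the data, the scaling choice and the `elbow` data frame are assumptions for illustration only.

```r
# PCA on a small numeric table (stand-in for the stored PCA)
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Standard deviation of each principal component, in order
elbow <- data.frame(PC = seq_along(pca$sdev), sdev = pca$sdev)
elbow

# The "elbow" is where the curve flattens; components beyond it add little
plot(elbow$PC, elbow$sdev,
  type = "b",
  xlab = "Principal component", ylab = "Standard deviation"
)
```

A reasonable value for p is the number of components before the curve levels off.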
Plot cluster tree by cluster probability or categorical variable.

    PlotClusterTree.SingleCellExperiment(
      object,
      icp.run,
      color.by,
      use.color,
      seed.color,
      legend.title,
      return.data
    )

    ## S4 method for signature 'SingleCellExperiment'
    PlotClusterTree(
      object,
      icp.run,
      color.by = NULL,
      use.color = NULL,
      seed.color = 123,
      legend.title = color.by,
      return.data = FALSE
    )

object |
An object of |
icp.run |
ICP run(s) to retrieve from |
color.by |
Categorical variable available in |
use.color |
Character specifying the colors. By default NULL. |
seed.color |
Seed to randomly select colors. By default 123. |
legend.title |
Legend title. By default the same as given at color.by. |
return.data |
Return data frame used to plot. Logical. By default FALSE. |
A plot of class ggplot or a list with a plot of class ggplot and a data frame.
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) # Prepare SCE object for analysis sce <- PrepareData(sce) # Multi-level integration (just for highlighting purposes; use default parameters) set.seed(123) sce <- RunParallelDivisiveICP( object = sce, batch.label = "Batch", k = 4, L = 25, C = 1, d = 0.5, train.with.bnn = FALSE, use.cluster.seed = FALSE, build.train.set = FALSE, ari.cutoff = 0.1, threads = 2, RNGseed = 1024 ) # Plot probability PlotClusterTree(object = sce, icp.run = 2) # Plot batch label distribution PlotClusterTree(object = sce, icp.run = 2, color.by = "Batch") # Plot species label distribution PlotClusterTree(object = sce, icp.run = 2, color.by = "Species")
Plot categorical variables in dimensional reduction.
PlotDimRed.SingleCellExperiment( object, color.by, dimred, dims, use.color, point.size, point.stroke, legend.nrow, seed.color, label, plot.theme, rasterise, rasterise.dpi, legend.justification, legend.size, legend.title ) ## S4 method for signature 'SingleCellExperiment' PlotDimRed( object, color.by, dimred = tail(reducedDimNames(object), n = 1), dims = 1:2, use.color = NULL, point.size = 1, point.stroke = 1, legend.nrow = 2, seed.color = 123, label = FALSE, plot.theme = theme_classic(), rasterise = (ncol(object) <= 30000), rasterise.dpi = 300, legend.justification = "center", legend.size = 10, legend.title = color.by )
| Argument | Description |
|---|---|
| object | An object of SingleCellExperiment class. |
| color.by | Categorical variable available in |
| dimred | Dimensional reduction available in |
| dims | Dimensions from the dimensional reduction embedding to plot. |
| use.color | Character specifying the colors. By default NULL. |
| point.size | Size of points. By default 1. |
| point.stroke | Size of stroke. By default 1. |
| legend.nrow | Display legend items by this number of rows. By default 2. |
| seed.color | Seed to randomly select colors. By default 123. |
| label | Logical specifying whether to add categorical labels at the category centroids. By default FALSE. |
| plot.theme | Plot theme available in |
| rasterise | Logical specifying whether points should be rasterised. By default ncol(object) <= 30000. |
| rasterise.dpi | In case |
| legend.justification | Legend justification. By default "center". |
| legend.size | Legend size. By default 10. |
| legend.title | Legend title. By default the same as given at color.by. |
A plot of class ggplot.
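To make the label and dimred arguments more concrete, here is a minimal base-R sketch. It assumes that label positions are per-category means of the two plotted dimensions (the source only says labels are added to the category centroids), it uses prcomp on iris as a stand-in embedding rather than a Coralysis call, and the dimred_names vector is hypothetical.

```r
# Stand-in embedding: PCA of the iris measurements (not a Coralysis call)
embedding <- prcomp(iris[, 1:4])$x[, 1:2]

# Assumed label positions: mean coordinates per category
centroids <- aggregate(
    x = as.data.frame(embedding),
    by = list(Species = iris$Species),
    FUN = mean
)
print(centroids)

# Hypothetical list of stored reductions; the documented default,
# tail(reducedDimNames(object), n = 1), picks the last one
dimred_names <- c("PCA", "UMAP")
default_dimred <- tail(dimred_names, n = 1)
print(default_dimred)
```

If the last stored reduction is not the one wanted, passing dimred explicitly (as the documented example does with dimred = "PCA") avoids relying on storage order.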
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) # Compute dimensional reduction sce <- RunPCA( object = sce, assay.name = "logcounts", p = 4, pca.method = "stats" ) # Plot batch PlotDimRed(object = sce, color.by = "Batch", dimred = "PCA", legend.nrow = 1) # Plot cell type annotations PlotDimRed( object = sce, color.by = "Species", legend.nrow = 1, dimred = "PCA", label = TRUE )
Plot feature expression in dimensional reduction.
PlotExpression.SingleCellExperiment( object, color.by, dimred, scale.values, color.scale, plot.theme, legend.title, point.size, point.stroke ) ## S4 method for signature 'SingleCellExperiment' PlotExpression( object, color.by, dimred = tail(reducedDimNames(object), n = 1), scale.values = FALSE, color.scale = "inferno", plot.theme = theme_classic(), legend.title = color.by, point.size = 1, point.stroke = 1 )
| Argument | Description |
|---|---|
| object | An object of SingleCellExperiment class. |
| color.by | Categorical variable available in |
| dimred | Dimensional reduction available in |
| scale.values | Logical specifying if values should be scaled. By default FALSE. |
| color.scale | Character of color scale palette to be passed to |
| plot.theme | Plot theme available in |
| legend.title | Legend title. By default the same as given at color.by. |
| point.size | Size of points. By default 1. |
| point.stroke | Size of stroke. By default 1. |
A plot of class ggplot.
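The source does not say which transformation scale.values = TRUE applies. As a rough illustration only, the sketch below assumes z-score standardisation (centre to mean 0, scale to unit variance) of one feature across cells, using base R and the iris data from the examples; the actual scaling in Coralysis may differ.

```r
# Feature values for one feature across cells (toy data)
values <- iris$Petal.Width

# Assumed scaling: centre to mean 0 and scale to unit variance
scaled <- as.numeric(scale(values))

summary(scaled)
```

Scaling of this kind puts features with different ranges on a comparable color scale, which is useful when several features are plotted side by side, as in the documented example.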
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) # Compute dimensional reduction sce <- RunPCA( object = sce, assay.name = "logcounts", p = 4, pca.method = "stats" ) # Plot expression level of one or more features ## one PlotExpression(object = sce, color.by = "Petal.Width") ## more than one features <- row.names(sce)[1:4] exp.plots <- lapply(X = features, FUN = function(x) { PlotExpression(object = sce, color.by = x, scale.values = TRUE) }) cowplot::plot_grid(plotlist = exp.plots, ncol = 2, align = "vh")
This function prepares the SingleCellExperiment object for analysis. The only required input is an object of class SingleCellExperiment with data in at least the logcounts slot.
PrepareData.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' PrepareData(object)
| Argument | Description |
|---|---|
| object | An object of SingleCellExperiment class. |
An object of SingleCellExperiment class.
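The toy examples in this manual build the logcounts input from t(iris[, 1:4]). The base-R sketch below shows why the transpose is needed: SingleCellExperiment assays store features in rows and cells in columns, whereas iris has one sample per row. The variable name logcounts_mat is hypothetical.

```r
# iris has samples in rows; SingleCellExperiment assays expect
# features in rows and cells (samples) in columns, hence t()
logcounts_mat <- t(as.matrix(iris[, 1:4]))
colnames(logcounts_mat) <- paste0("samp", seq_len(ncol(logcounts_mat)))

dim(logcounts_mat)
rownames(logcounts_mat)
```

A matrix in this orientation can be passed as assays = list(logcounts = ...) when constructing the object.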
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) sce <- PrepareData(sce)
This function projects new query data sets onto a reference built with Coralysis and transfers cell labels from the reference to the queries.
ReferenceMapping.SingleCellExperiment( ref, query, ref.label, label.prune.cutoff, scale.query.by, project.umap, select.icp.models, k.nn, dimred.name.prefix ) ## S4 method for signature 'SingleCellExperiment,SingleCellExperiment' ReferenceMapping( ref, query, ref.label, label.prune.cutoff = 0.5, scale.query.by = NULL, project.umap = FALSE, select.icp.models = metadata(ref)$coralysis$pca.params$select.icp.tables, k.nn = 10, dimred.name.prefix = "" )
| Argument | Description |
|---|---|
| ref | An object of SingleCellExperiment class. |
| query | An object of SingleCellExperiment class. |
| ref.label | A character cell metadata column name from the |
| label.prune.cutoff | A numeric cutoff value used to prune low-confidence predicted cell labels, based on the confidence probability scores stored in the |
| scale.query.by | Should the query data be scaled by |
| project.umap | Project query data onto reference UMAP (logical). By default FALSE. |
| select.icp.models | Select the reference ICP models to use for query cluster probability prediction. By default metadata(ref)$coralysis$pca.params$select.icp.tables. |
| k.nn | The number of |
| dimred.name.prefix | Dimensional reduction name prefix to add to the computed PCA and UMAP. By default nothing is added, i.e., dimred.name.prefix = "". |
An object of SingleCellExperiment class.
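The effect of label.prune.cutoff and the accuracy calculation used in the documented example can be illustrated with plain vectors. This is a sketch under stated assumptions: the predictions, confidence scores and ground truth below are made up, and the pruning rule (keep labels with confidence at or above the cutoff, otherwise NA) is an assumption, since the source does not say how pruned labels are represented or whether the comparison is strict.

```r
# Hypothetical predictions and confidence scores for five query cells
predicted <- c("setosa", "versicolor", "virginica", "versicolor", "setosa")
confidence <- c(0.95, 0.40, 0.80, 0.55, 0.30)
truth <- c("setosa", "versicolor", "virginica", "virginica", "setosa")
label.prune.cutoff <- 0.5

# Assumed pruning rule: drop labels whose confidence is below the cutoff
pruned <- ifelse(confidence >= label.prune.cutoff, predicted, NA_character_)
print(pruned)

# Accuracy over all cells: diagonal of the confusion matrix over the total
lvls <- sort(unique(truth))
confusion <- table(
    factor(predicted, levels = lvls),
    factor(truth, levels = lvls)
)
accuracy <- sum(diag(confusion)) / sum(confusion) * 100
print(accuracy)
```

Fixing the factor levels on both sides keeps the confusion matrix square, so diag() lines up predictions with the matching ground-truth class even when a class is never predicted.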
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) # Create reference & query SCE objects ref <- sce[, sce$Batch == "b1"] query <- sce[, sce$Batch == "b2"] # 1) Train the reference set.seed(123) ref <- RunParallelDivisiveICP( object = ref, k = 2, L = 25, C = 1, train.k.nn = 10, train.k.nn.prop = NULL, use.cluster.seed = FALSE, build.train.set = FALSE, ari.cutoff = 0.1, threads = 2, RNGseed = 1024 ) # 2) Compute reference PCA & UMAP ref <- RunPCA(ref, p = 5, return.model = TRUE, pca.method = "stats") set.seed(123) ref <- RunUMAP(ref, return.model = TRUE) # Plot PlotDimRed(object = ref, color.by = "Species", legend.nrow = 1) # 3) Project & predict query cell labels map <- ReferenceMapping( ref = ref, query = query, ref.label = "Species", project.umap = TRUE ) # Confusion matrix: predictions (rows) x ground-truth (cols) preds_x_truth <- table(map$coral_labels, map$Species) print(preds_x_truth) # Accuracy score acc <- sum(diag(preds_x_truth)) / sum(preds_x_truth) * 100 print(paste0("Coralysis accuracy score: ", round(acc), "%")) # Visualize: ground-truth, prediction, confidence scores cowplot::plot_grid( PlotDimRed( object = map, color.by = "Species", legend.nrow = 1 ), PlotDimRed( object = map, color.by = "coral_labels", legend.nrow = 1 ), PlotExpression( object = map, color.by = "coral_probability", color.scale = "viridis" ), ncol = 2, align = "vh" )
Run divisive ICP clustering in parallel in order to perform multi-level integration.
RunParallelDivisiveICP.SingleCellExperiment( object, batch.label, k, d, L, r, C, reg.type, max.iter, threads, icp.batch.size, train.with.bnn, train.k.nn, train.k.nn.prop, build.train.set, build.train.params, scale.by, use.cluster.seed, divisive.method, allow.free.k, ari.cutoff, verbose, RNGseed, BPPARAM ) ## S4 method for signature 'SingleCellExperiment' RunParallelDivisiveICP( object, batch.label = NULL, k = 16, d = 0.3, L = 50, r = 5, C = 0.3, reg.type = "L1", max.iter = 200, threads = 0, icp.batch.size = Inf, train.with.bnn = TRUE, train.k.nn = 10, train.k.nn.prop = 0.3, build.train.set = TRUE, build.train.params = list(), scale.by = NULL, use.cluster.seed = TRUE, divisive.method = "cluster.batch", allow.free.k = TRUE, ari.cutoff = 0.3, verbose = FALSE, RNGseed = 123, BPPARAM = NULL )
| Argument | Description |
|---|---|
| object | An object of SingleCellExperiment class. |
| batch.label | A variable name (of class |
| k | A positive integer power of two, i.e., |
| d | A numeric greater than |
| L | A positive integer greater than |
| r | A positive integer that denotes the number of reiterations performed until the ICP algorithm stops. Increasing r is recommended for significantly larger sample sizes (tens of thousands of cells). Default is 5. |
| C | A positive real number denoting the cost of constraints violation in the L1-regularized logistic regression model from the LIBLINEAR library. Decreasing C leads to more stringent feature selection, i.e., fewer features are selected to build the projection classifier. Decreasing to a very low value (~ |
| reg.type | "L1" or "L2". L2-regularization was not investigated in the manuscript, but it leads to a more conventional outcome (fewer subpopulations). Default is "L1". |
| max.iter | A positive integer that denotes the maximum number of iterations performed until ICP stops. This parameter is only useful in situations where ICP converges extremely slowly, preventing the algorithm from running too long. In most cases, reaching the number of reiterations ( |
| threads | A positive integer that specifies how many logical processors (threads) to use in parallel computation. Set |
| icp.batch.size | A positive integer that specifies how many cells to randomly select. It behaves differently depending on |
| train.with.bnn | Train data with batch nearest neighbors. Default is TRUE. |
| train.k.nn | Train data with batch nearest neighbors using |
| train.k.nn.prop | A numeric (higher than 0 and lower than 1) corresponding to the fraction of cells per cluster to use as |
| build.train.set | Logical specifying whether a training set should be built from the data or the whole data should be used for training. By default TRUE. |
| build.train.params | A list of parameters to be passed to the function |
| scale.by | A character specifying if the data should be scaled by |
| use.cluster.seed | Should the same starting clustering result be provided to ensure more reproducible results (logical). If |
| divisive.method | Divisive method (character). One of |
| allow.free.k | Allow free |
| ari.cutoff | Include ICP models and probability tables with an Adjusted Rand Index higher than |
| verbose | A logical value specifying whether to print verbose output during the ICP run. Default is FALSE. |
| RNGseed | Seed number passed to the parallel backend via |
| BPPARAM | A |
A SingleCellExperiment object.
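The k argument must be a positive integer power of two. The base-R sketch below shows one way such a value could be checked, together with the cluster counts that would be reached if the number of clusters doubled at every divisive round. The doubling schedule is an assumption drawn from the package description of multiple rounds of divisive clustering, not a confirmed detail of the implementation, and the helper names is_power_of_two and divisive_schedule are hypothetical.

```r
# Check that k is a positive integer power of two (2, 4, 8, 16, ...)
is_power_of_two <- function(k) {
    k >= 2 && k == round(k) &&
        bitwAnd(as.integer(k), as.integer(k) - 1L) == 0L
}

# Assumed divisive schedule: clusters double each round until reaching k
divisive_schedule <- function(k) {
    stopifnot(is_power_of_two(k))
    2^seq_len(log2(k))
}

print(divisive_schedule(16))
print(is_power_of_two(12))
```

Under this assumption, halving k removes the last round, which is one reason a smaller k is a natural choice for quick tests such as the k = 2 toy example.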
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) # Prepare SCE object for analysis sce <- PrepareData(sce) # Multi-level integration (just for highlighting purposes; use default parameters) set.seed(123) sce <- RunParallelDivisiveICP( object = sce, batch.label = "Batch", k = 2, L = 25, C = 1, train.k.nn = 10, train.k.nn.prop = NULL, use.cluster.seed = FALSE, build.train.set = FALSE, ari.cutoff = 0.1, threads = 2, RNGseed = 1024 ) # Integrated PCA set.seed(125) # to ensure reproducibility for the default 'irlba' method sce <- RunPCA(object = sce, assay.name = "joint.probability", p = 10) # Plot result cowplot::plot_grid( PlotDimRed( object = sce, color.by = "Batch", legend.nrow = 1 ), PlotDimRed( object = sce, color.by = "Species", legend.nrow = 1 ), ncol = 2 )
Perform principal component analysis using assays or the joint probability matrix as input.
RunPCA.SingleCellExperiment( object, assay.name, p, scale, center, threshold, pca.method, return.model, select.icp.tables, features, dimred.name ) ## S4 method for signature 'SingleCellExperiment' RunPCA( object, assay.name = "joint.probability", p = 50, scale = TRUE, center = TRUE, threshold = 0, pca.method = "irlba", return.model = FALSE, select.icp.tables = NULL, features = NULL, dimred.name = "PCA" )
| Argument | Description |
|---|---|
| object | A SingleCellExperiment object. |
| assay.name | Name of the assay to compute PCA. One of |
| p | A positive integer denoting the number of principal components to calculate and select. Default is 50. |
| scale | A logical specifying whether the probabilities should be standardized to unit-variance before running PCA. Default is TRUE. |
| center | A logical specifying whether the probabilities should be centered before running PCA. Default is TRUE. |
| threshold | A threshold for filtering out, before PCA, ICP runs whose lower terminal projection accuracy is below the threshold. Default is 0. |
| pca.method | A character specifying the PCA method. One of |
| return.model | A logical specifying whether the PCA model should be returned. By default FALSE. |
| select.icp.tables | Select the ICP cluster probability tables to perform PCA. By default NULL. |
| features | A character of feature names matching |
| dimred.name | Dimensional reduction name given to the returned PCA. By default "PCA". |
An object of SingleCellExperiment class.
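The roles of p, center and scale can be illustrated with stats::prcomp on a toy matrix. This is only a stand-in: it is an assumption that pca.method = "stats" corresponds to prcomp, and the iris measurements below take the place of the joint probability matrix, which is not reproduced here.

```r
# Toy cell-by-feature matrix standing in for a probability table
x <- as.matrix(iris[, 1:4])
p <- 2

# Centre and scale to unit variance, then keep the first p components
pca <- prcomp(x, center = TRUE, scale. = TRUE, rank. = p)

dim(pca$x)
head(pca$x)
```

The rows of pca$x are cells and the columns are the retained components, which is the shape a downstream t-SNE or UMAP step would consume.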
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) # Prepare SCE object for analysis sce <- PrepareData(sce) # Multi-level integration (just for highlighting purposes; use default parameters) set.seed(123) sce <- RunParallelDivisiveICP( object = sce, batch.label = "Batch", k = 2, L = 25, C = 1, train.k.nn = 10, train.k.nn.prop = NULL, use.cluster.seed = FALSE, build.train.set = FALSE, ari.cutoff = 0.1, threads = 2, RNGseed = 1024 ) # Integrated PCA set.seed(125) # to ensure reproducibility for the default 'irlba' method sce <- RunPCA(object = sce, assay.name = "joint.probability", p = 10) # Plot result cowplot::plot_grid( PlotDimRed( object = sce, color.by = "Batch", legend.nrow = 1 ), PlotDimRed( object = sce, color.by = "Species", legend.nrow = 1 ), ncol = 2 )
Run nonlinear dimensionality reduction using t-SNE with the PCA-transformed consensus matrix as input.
RunTSNE.SingleCellExperiment( object, dims, dimred.type, perplexity, dimred.name, ... ) ## S4 method for signature 'SingleCellExperiment' RunTSNE( object, dims = NULL, dimred.type = "PCA", perplexity = 30, dimred.name = "TSNE", ... )
| Argument | Description |
|---|---|
| object | Object of SingleCellExperiment class. |
| dims | Dimensions to select from |
| dimred.type | Dimensional reduction type to use. By default "PCA". |
| perplexity | Perplexity of t-SNE. |
| dimred.name | Dimensional reduction name given to the returned t-SNE. By default "TSNE". |
| ... | Parameters to be passed to the |
A SingleCellExperiment object.
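The documented example passes check_duplicates = FALSE through the ... argument, which suggests (as an assumption, since the target function name is missing from the argument description) that extra parameters are forwarded to Rtsne::Rtsne. Under that assumption, Rtsne rejects a perplexity for which 3 * perplexity exceeds the number of cells minus one, so small data sets may need a value below the default of 30. The helper max_perplexity is hypothetical.

```r
# Largest perplexity accepted for n cells under the assumed Rtsne rule:
# 3 * perplexity must not exceed n - 1
max_perplexity <- function(n) floor((n - 1) / 3)

n_cells <- nrow(iris) # 150 cells in the toy example
print(max_perplexity(n_cells))

# Is the default perplexity of 30 usable for this many cells?
print(30 <= max_perplexity(n_cells))
```

For a data set that fails this check, a smaller value can be passed explicitly through the perplexity argument.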
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) # Run PCA set.seed(125) # to ensure reproducibility for the default 'irlba' method sce <- RunPCA( object = sce, assay.name = "logcounts", pca.method = "stats", p = nrow(sce) ) # Run t-SNE set.seed(125) # t-SNE is stochastic; set a seed for reproducibility sce <- RunTSNE(object = sce, dimred.type = "PCA", check_duplicates = FALSE) # Plot result cowplot::plot_grid( PlotDimRed( object = sce, color.by = "Batch", legend.nrow = 1 ), PlotDimRed( object = sce, color.by = "Species", legend.nrow = 1 ), ncol = 2 )
Run nonlinear dimensionality reduction using UMAP with a dimensional reduction as input.
RunUMAP.SingleCellExperiment( object, dims, dimred.type, return.model, umap.method, dimred.name, ... ) ## S4 method for signature 'SingleCellExperiment' RunUMAP( object, dims = NULL, dimred.type = "PCA", return.model = FALSE, umap.method = "umap", dimred.name = "UMAP", ... )
| Argument | Description |
|---|---|
| object | An object of SingleCellExperiment class. |
| dims | Dimensions to select from |
| dimred.type | Dimensional reduction type to use. By default "PCA". |
| return.model | Return UMAP model. By default FALSE. |
| umap.method | UMAP method to use: |
| dimred.name | Dimensional reduction name given to the returned UMAP. By default "UMAP". |
| ... | Parameters to be passed to the |
A SingleCellExperiment object.
# Import package suppressPackageStartupMessages(library("SingleCellExperiment")) # Create toy SCE data batches <- c("b1", "b2") set.seed(239) batch <- sample(x = batches, size = nrow(iris), replace = TRUE) sce <- SingleCellExperiment( assays = list(logcounts = t(iris[, 1:4])), colData = DataFrame( "Species" = iris$Species, "Batch" = batch ) ) colnames(sce) <- paste0("samp", 1:ncol(sce)) # Prepare SCE object for analysis sce <- PrepareData(sce) # Multi-level integration (just for highlighting purposes; use default parameters) set.seed(123) sce <- RunParallelDivisiveICP( object = sce, batch.label = "Batch", k = 2, L = 25, C = 1, train.k.nn = 10, train.k.nn.prop = NULL, use.cluster.seed = FALSE, build.train.set = FALSE, ari.cutoff = 0.1, threads = 2, RNGseed = 1024 ) # Integrated PCA set.seed(125) # to ensure reproducibility for the default 'irlba' method sce <- RunPCA(object = sce, assay.name = "joint.probability", p = 10) # Plot result cowplot::plot_grid( PlotDimRed( object = sce, color.by = "Batch", legend.nrow = 1 ), PlotDimRed( object = sce, color.by = "Species", legend.nrow = 1 ), ncol = 2 ) # Run UMAP set.seed(123) sce <- RunUMAP(sce, dimred.type = "PCA") # Plot result cowplot::plot_grid( PlotDimRed( object = sce, color.by = "Batch", legend.nrow = 1 ), PlotDimRed( object = sce, color.by = "Species", legend.nrow = 1 ), ncol = 2 )
Summarise ICP cell cluster probability table(s)
SummariseCellClusterProbability.SingleCellExperiment(
    object, icp.run, icp.round, funs, scale.funs, save.in.sce
)

## S4 method for signature 'SingleCellExperiment'
SummariseCellClusterProbability(
    object,
    icp.run = NULL,
    icp.round = NULL,
    funs = c("mean", "median"),
    scale.funs = TRUE,
    save.in.sce = TRUE
)
object |
An object of |
icp.run |
ICP run(s) to retrieve from |
icp.round |
ICP round(s) to retrieve from |
funs |
Functions to summarise ICP cell cluster probability: |
scale.funs |
Scale in the range 0-1 the summarised probability obtained with
|
save.in.sce |
Save the data frame into the cell metadata from the
|
A data frame or a SingleCellExperiment object with ICP cell cluster probability summarised.
# Import package
suppressPackageStartupMessages(library("SingleCellExperiment"))

# Create toy SCE data
batches <- c("b1", "b2")
set.seed(239)
batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
sce <- SingleCellExperiment(
    assays = list(logcounts = t(iris[, 1:4])),
    colData = DataFrame(
        "Species" = iris$Species,
        "Batch" = batch
    )
)
colnames(sce) <- paste0("samp", 1:ncol(sce))

# Prepare SCE object for analysis
sce <- PrepareData(sce)

# Multi-level integration (just for highlighting purposes; use default parameters)
set.seed(123)
sce <- RunParallelDivisiveICP(
    object = sce, batch.label = "Batch",
    k = 2, L = 25, C = 1, train.k.nn = 10,
    train.k.nn.prop = NULL, use.cluster.seed = FALSE,
    build.train.set = FALSE, ari.cutoff = 0.1,
    threads = 2, RNGseed = 1024
)

# Integrated PCA
set.seed(125) # to ensure reproducibility for the default 'irlba' method
sce <- RunPCA(object = sce, assay.name = "joint.probability", p = 10)

# Summarise cluster probability
sce <- SummariseCellClusterProbability(
    object = sce, icp.round = 1,
    save.in.sce = TRUE
) # saved in 'colData'

# Plot the clustering result for ICP run no. 3
PlotDimRed(object = sce, color.by = "icp_run_round_3_1_clusters")

# Plot Coralysis mean cell cluster probabilities
PlotExpression(
    object = sce, color.by = "mean_probs",
    color.scale = "viridis"
)
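The summarisation described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the probability matrix is invented toy data, the `scale01()` helper and the `scaled_*` column names are hypothetical, and min-max scaling is only an assumed reading of `scale.funs = TRUE`. Only the `mean_probs` column name is taken from the example above.

```r
# Toy matrix of per-cell cluster probabilities: rows are cells, columns are
# hypothetical ICP runs (values invented for illustration only).
probs <- matrix(
    c(0.90, 0.80, 0.70,
      0.60, 0.50, 0.70,
      0.99, 0.95, 0.90),
    nrow = 3, byrow = TRUE,
    dimnames = list(paste0("cell", 1:3), paste0("icp_", 1:3))
)

# Hypothetical helper: min-max scaling to the range 0-1
# (an assumption about what 'scale.funs = TRUE' does).
scale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Summarise each cell across runs with the mean and the median
summary.df <- data.frame(
    mean_probs = apply(probs, 1, mean),
    median_probs = apply(probs, 1, median)
)

# Scaled versions of both summaries (hypothetical column names)
summary.df$scaled_mean_probs <- scale01(summary.df$mean_probs)
summary.df$scaled_median_probs <- scale01(summary.df$median_probs)

summary.df
```

A per-cell summary of this kind has one row per cell, which is why it can be stored alongside the cell metadata when `save.in.sce = TRUE`.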
Tabulates the frequency of cells per cell cluster probability bin by group for each label.
The label must be specified beforehand in the call to BinCellClusterProbability().
TabulateCellBinsByGroup.SingleCellExperiment(object, group, relative, margin)

## S4 method for signature 'SingleCellExperiment'
TabulateCellBinsByGroup(object, group, relative = FALSE, margin = 1)
object |
An object of |
group |
Character specifying the |
relative |
Logical specifying if relative proportions of cell bins per
|
margin |
If |
A list of tables with the frequency of cells per bin of cell cluster probability by group for each label.
# Packages
suppressPackageStartupMessages(library("SingleCellExperiment"))

# Import data from Zenodo
data.url <- "https://zenodo.org/records/14845751/files/pbmc_10Xassays.rds?download=1"
sce <- readRDS(file = url(data.url))

# Prepare data
sce <- PrepareData(object = sce)

# Multi-level integration - 'L = 4' just for highlighting purposes
set.seed(123)
sce <- RunParallelDivisiveICP(
    object = sce, batch.label = "batch",
    L = 4, threads = 2
)

# Cell states SCE object for a given cell type annotation or clustering
cellstate.sce <- BinCellClusterProbability(
    object = sce, label = "cell_type",
    icp.round = 4, bins = 20
)
cellstate.sce

# Tabulate cell bins by group
# give an interesting variable to the "group" parameter
cellbins.tables <- TabulateCellBinsByGroup(
    object = cellstate.sce, group = "batch",
    relative = TRUE, margin = 1
)
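As a rough illustration of what a frequency table of cell bins by group looks like, the base R sketch below tabulates invented bin and batch assignments with `table()` and converts them to row-wise proportions with `prop.table()`. The data are toy values, and the assumption that `relative` and `margin` behave like `prop.table()` and its `margin` argument is not confirmed by the documentation above.

```r
# Toy assignments (invented): ten cells, each with a probability bin and a batch
bin <- factor(rep(c("bin1", "bin2", "bin3"), times = c(4, 3, 3)))
batch <- c("b1", "b1", "b2", "b2",   # bin1
           "b1", "b2", "b2",         # bin2
           "b1", "b1", "b1")         # bin3

# Absolute frequency of cells per bin (rows) and batch (columns)
counts <- table(bin, batch)
counts

# Relative proportions within each bin, i.e. each row sums to 1
# (assumed analogue of 'relative = TRUE, margin = 1')
props <- prop.table(counts, margin = 1)
round(props, 2)
```

With `margin = 2` in `prop.table()`, the proportions would instead be computed within each batch, so that each column sums to 1.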
The VlnPlot function visualizes the expression levels
of one or more features across clusters using violin plots.
VlnPlot.SingleCellExperiment(
    object, clustering.label, features, return.plot, rotate.x.axis.labels
)

## S4 method for signature 'SingleCellExperiment'
VlnPlot(
    object,
    clustering.label,
    features,
    return.plot = FALSE,
    rotate.x.axis.labels = FALSE
)
object |
of |
clustering.label |
A variable name (of class |
features |
Feature names to plot by cluster ( |
return.plot |
return.plot whether to return the |
rotate.x.axis.labels |
a logical denoting whether the x-axis labels should
be rotated 90 degrees or just draw it. Default is |
A ggplot2 object if return.plot=TRUE.
# Import package
suppressPackageStartupMessages(library("SingleCellExperiment"))

# Create toy SCE data
batches <- c("b1", "b2")
set.seed(239)
batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
sce <- SingleCellExperiment(
    assays = list(logcounts = t(iris[, 1:4])),
    colData = DataFrame(
        "Species" = iris$Species,
        "Batch" = batch
    )
)
colnames(sce) <- paste0("samp", 1:ncol(sce))

# Plot features by clustering/grouping variable
VlnPlot(sce,
    clustering.label = "Species",
    features = row.names(sce)[1:4],
    rotate.x.axis.labels = TRUE
)
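For readers who want to see the kind of plot involved without a SingleCellExperiment object, the sketch below reshapes the same iris matrix into long format and, if ggplot2 is installed, draws one violin panel per feature. This is an independent illustration, not the code behind VlnPlot(); the `long.df` name and the facet layout are assumptions.

```r
# Feature-by-sample matrix, as used for the 'logcounts' assay above
expr <- t(iris[, 1:4])

# Long format: one row per (feature, sample) pair. as.vector() reads the
# matrix column by column, so features cycle fastest within each sample.
long.df <- data.frame(
    feature = rep(rownames(expr), times = ncol(expr)),
    cluster = rep(iris$Species, each = nrow(expr)),
    expression = as.vector(expr)
)
head(long.df)

# Draw the violins only if ggplot2 is available (optional dependency here)
if (requireNamespace("ggplot2", quietly = TRUE)) {
    p <- ggplot2::ggplot(
        long.df,
        ggplot2::aes(x = cluster, y = expression, fill = cluster)
    ) +
        ggplot2::geom_violin() +
        ggplot2::facet_wrap(~feature, scales = "free_y") +
        # Rotate x-axis labels by 90 degrees
        ggplot2::theme(
            axis.text.x = ggplot2::element_text(angle = 90, hjust = 1)
        )
    print(p)
}
```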