Title: | Easy single cell analysis platform for enrichment |
---|---|
Description: | A bridging R package to facilitate gene set enrichment analysis (GSEA) in the context of single-cell RNA sequencing. Using raw count information, Seurat objects, or SingleCellExperiment format, users can perform and visualize ssGSEA, GSVA, AUCell, and UCell-based enrichment calculations across individual cells. |
Authors: | Nick Borcherding [aut, cre], Jared Andrews [aut], Alexei Martsinkovskiy [ctb] |
Maintainer: | Nick Borcherding <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2.3.0 |
Built: | 2024-10-30 07:47:55 UTC |
Source: | https://github.com/bioc/escape |
This function allows to the user to examine the mean ranking within the groups across the gene set. The visualization uses the density function to display the relative position and distribution of rank.
densityEnrichment( input.data, gene.set.use = NULL, gene.sets = NULL, group.by = NULL, palette = "inferno" )
densityEnrichment( input.data, gene.set.use = NULL, gene.sets = NULL, group.by = NULL, palette = "inferno" )
input.data |
The single-cell object to use. |
gene.set.use |
Selected individual gene set. |
gene.sets |
The gene set library to use to extract the individual gene set information from. |
group.by |
Categorical parameter to plot along the x.axis. If input is a single-cell object the default will be cluster. |
palette |
Colors to use in visualization - input any hcl.pals. |
ggplot2 object mean rank gene density across groups
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small densityEnrichment(pbmc_small, gene.set.use = "Tcells", gene.sets = GS)
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small densityEnrichment(pbmc_small, gene.set.use = "Tcells", gene.sets = GS)
A list of gene sets derived from Azizi, et al 2018 PMID: 29961579) relating to tumor immunity.
This function allows users to input both the single-cell RNA-sequencing counts and output the enrichment scores as a matrix.
escape.matrix( input.data, gene.sets = NULL, method = "ssGSEA", groups = 1000, min.size = 5, normalize = FALSE, make.positive = FALSE, BPPARAM = SerialParam(), ... )
escape.matrix( input.data, gene.sets = NULL, method = "ssGSEA", groups = 1000, min.size = 5, normalize = FALSE, make.positive = FALSE, BPPARAM = SerialParam(), ... )
input.data |
The count matrix, Seurat, or Single-Cell Experiment object. |
gene.sets |
Gene sets can be a list, output from
|
method |
Select the method to calculate enrichment, AUCell, GSVA, ssGSEA or UCell. |
groups |
The number of cells to separate the enrichment calculation. |
min.size |
Minimum number of gene necessary to perform the enrichment calculation |
normalize |
Whether to divide the enrichment score by the number of genes TRUE or report unnormalized FALSE. |
make.positive |
During normalization shift enrichment values to a positive range TRUE for downstream analysis or not TRUE (default). Will only be applied if normalize = TRUE. |
BPPARAM |
A BiocParallel::bpparam() object that for parallelization. |
... |
pass arguments to AUCell GSVA, ssGSEA, or UCell call |
matrix of enrichment scores
Nick Borcherding, Jared Andrews
getGeneSets
to collect gene sets.
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small ES <- escape.matrix(pbmc_small, gene.sets = GS, min.size = NULL)
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small ES <- escape.matrix(pbmc_small, gene.sets = GS, min.size = NULL)
This function allows users to select libraries and specific gene.sets to form a GeneSetCollection that is a list of gene sets.
getGeneSets( species = "Homo sapiens", library = NULL, subcategory = NULL, gene.sets = NULL )
getGeneSets( species = "Homo sapiens", library = NULL, subcategory = NULL, gene.sets = NULL )
species |
The scientific name of the species of interest in order to get correct gene nomenclature |
library |
Individual collection(s) of gene sets, e.g. c("H", "C5"). See msigdbrfor all MSigDB collections. |
subcategory |
MSigDB sub-collection abbreviation, such as CGP or BP. |
gene.sets |
Select gene sets or pathways, using specific names, example: pathways = c("HALLMARK_TNFA_SIGNALING_VIA_NFKB"). Will only be honored if library is set, too. |
A list of gene sets from msigdbr.
Nick Borcherding, Jared Andrews
GS <- getGeneSets(library = "H")
GS <- getGeneSets(library = "H")
This function allows to the user to examine the distribution of enrichment across groups by generating a ridge plot.
geyserEnrichment( input.data, assay = NULL, group.by = NULL, gene.set = NULL, color.by = "group", order.by = NULL, scale = FALSE, facet.by = NULL, palette = "inferno" )
geyserEnrichment( input.data, assay = NULL, group.by = NULL, gene.set = NULL, color.by = "group", order.by = NULL, scale = FALSE, facet.by = NULL, palette = "inferno" )
input.data |
Enrichment output from |
assay |
Name of the assay to plot if data is a single-cell object. |
group.by |
Categorical parameter to plot along the x.axis. If input is a single-cell object the default will be cluster. |
gene.set |
Gene set to plot (on y-axis). |
color.by |
How the color palette applies to the graph - can be "group" for a categorical color palette based on the group.by parameter or use the gene.set name if wanting to apply a gradient palette. |
order.by |
Method to organize the x-axis: "mean" will arrange the x-axis by the mean of the gene.set, while "group" will arrange the x-axis by in alphanumerical order. Using NULL will not reorder the x-axis. |
scale |
Visualize raw values FALSE or Z-transform enrichment values TRUE. |
facet.by |
Variable to facet the plot into n distinct graphs. |
palette |
Colors to use in visualization - input any hcl.pals. |
ggplot2 object with geyser-based distributions of selected gene.set
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) geyserEnrichment(pbmc_small, assay = "escape", gene.set = "Tcells")
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) geyserEnrichment(pbmc_small, assay = "escape", gene.set = "Tcells")
This function allows to the user to examine the heatmap with the mean enrichment values by group. The heatmap will have the gene sets as rows and columns will be the grouping variable.
heatmapEnrichment( input.data, assay = NULL, group.by = NULL, gene.set.use = "all", cluster.rows = FALSE, cluster.columns = FALSE, scale = FALSE, facet.by = NULL, palette = "inferno" )
heatmapEnrichment( input.data, assay = NULL, group.by = NULL, gene.set.use = "all", cluster.rows = FALSE, cluster.columns = FALSE, scale = FALSE, facet.by = NULL, palette = "inferno" )
input.data |
Enrichment output from |
assay |
Name of the assay to plot if data is a single-cell object. |
group.by |
Categorical parameter to plot along the x.axis. If input is a single-cell object the default will be cluster. |
gene.set.use |
Selected gene sets to visualize. If "all", the heatmap will be generated across all gene sets. |
cluster.rows |
Use Euclidean distance to order the row values. |
cluster.columns |
Use Euclidean distance to order the column values. |
scale |
Visualize raw values FALSE or Z-transform enrichment values TRUE. |
facet.by |
Variable to facet the plot into n distinct graphs. |
palette |
Colors to use in visualization - input any hcl.pals. |
ggplot2 object with heatmap of mean enrichment values
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) heatmapEnrichment(pbmc_small, assay = "escape")
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) heatmapEnrichment(pbmc_small, assay = "escape")
This function allows to the user to examine the distribution of principal components run on the enrichment values.
pcaEnrichment( input.data, dimRed = NULL, x.axis = "PC1", y.axis = "PC2", facet.by = NULL, style = "point", add.percent.contribution = TRUE, display.factors = FALSE, number.of.factors = 10, palette = "inferno" )
pcaEnrichment( input.data, dimRed = NULL, x.axis = "PC1", y.axis = "PC2", facet.by = NULL, style = "point", add.percent.contribution = TRUE, display.factors = FALSE, number.of.factors = 10, palette = "inferno" )
input.data |
PCA from |
dimRed |
Name of the dimensional reduction to plot if data is a single-cell object. |
x.axis |
Component to plot on the x.axis. |
y.axis |
Component set to plot on the y.axis. |
facet.by |
Variable to facet the plot into n distinct graphs. |
style |
Return a "hex" bin plot or a "point"-based plot. |
add.percent.contribution |
Add the relative percent of contribution of the selected components to the axis labels. |
display.factors |
Add an arrow overlay to show the direction and magnitude of individual gene sets on the PCA dimensions. |
number.of.factors |
The number of gene.sets to display on the overlay. |
palette |
Colors to use in visualization - input any hcl.pals. |
ggplot2 object with PCA distribution
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) pbmc_small <- performPCA(pbmc_small, assay = "escape") pcaEnrichment(pbmc_small, x.axis = "PC1", y.axis = "PC2", dimRed = "escape.PCA")
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) pbmc_small <- performPCA(pbmc_small, assay = "escape") pcaEnrichment(pbmc_small, x.axis = "PC1", y.axis = "PC2", dimRed = "escape.PCA")
This function allows users to normalize the enrichment calculations by accounting for single-cell dropout and producing positive values for downstream differential enrichment analyses. A positive range values is useful for several downstream analyses, like differential evaluation for log2-fold change, but will alter the original enrichment values.
performNormalization( sc.data, enrichment.data = NULL, assay = "escape", gene.sets = NULL, make.positive = FALSE, scale.factor = NULL, groups = NULL )
performNormalization( sc.data, enrichment.data = NULL, assay = "escape", gene.sets = NULL, make.positive = FALSE, scale.factor = NULL, groups = NULL )
sc.data |
Single-cell object or matrix used in the gene set enrichment calculation in
|
enrichment.data |
The enrichment results from |
assay |
Name of the assay to normalize if using a single-cell object |
gene.sets |
The gene set library to use to extract the individual gene set information from |
make.positive |
Shift enrichment values to a positive range TRUE for downstream analysis or not TRUE (default). |
scale.factor |
A vector to use for normalizing enrichment scores per cell. |
groups |
the number of cells to calculate normalization on at once. chunks matrix into groups sized chunks. Useful in case of memory issues. |
Single-cell object or matrix of normalized enrichment scores
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) pbmc_small <- performNormalization(pbmc_small, assay = "escape", gene.sets = GS)
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) pbmc_small <- performNormalization(pbmc_small, assay = "escape", gene.sets = GS)
This function allows users to calculate the principal components
for the gene set enrichment values. For single-cell data, the PCA
will be stored with the dimensional reductions. If a matrix is used
as input, the output is a list for further plotting. Alternatively,
users can use functions for PCA calculations based on their desired
workflow in lieu of using performPCA
, but will not be
compatible with downstream pcaEnrichment
visualization.
performPCA( input.data, assay = NULL, scale = TRUE, n.dim = 1:10, reduction.name = "escape.PCA", reduction.key = "PCA" )
performPCA( input.data, assay = NULL, scale = TRUE, n.dim = 1:10, reduction.name = "escape.PCA", reduction.key = "PCA" )
input.data |
Enrichment output from |
assay |
Name of the assay to plot if data is a single-cell object. |
scale |
Standardize the enrichment value (TRUE) or not (FALSE) |
n.dim |
The number of components to calculate. |
reduction.name |
Name of the reduced dimensions object to add if data is a single-cell object. |
reduction.key |
Name of the key to use with the components. |
single-cell object or list with PCA components to plot.
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) pbmc_small <- performPCA(pbmc_small, assay = "escape")
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) pbmc_small <- performPCA(pbmc_small, assay = "escape")
This function allows to the user to examine the distribution of enrichment across groups by generating a ridge plot.
ridgeEnrichment( input.data, assay = NULL, group.by = NULL, gene.set = NULL, color.by = "group", order.by = NULL, scale = FALSE, facet.by = NULL, add.rug = FALSE, palette = "inferno" )
ridgeEnrichment( input.data, assay = NULL, group.by = NULL, gene.set = NULL, color.by = "group", order.by = NULL, scale = FALSE, facet.by = NULL, add.rug = FALSE, palette = "inferno" )
input.data |
Enrichment output from |
assay |
Name of the assay to plot if data is a single-cell object. |
group.by |
Categorical parameter to plot along the x.axis. If input is a single-cell object the default will be cluster. |
gene.set |
Gene set to plot (on y-axis). |
color.by |
How the color palette applies to the graph - can be "group" for a categorical color palette based on the group.by parameter or use the gene.set name if wanting to apply a gradient palette. |
order.by |
Method to organize the x-axis: "mean" will arrange the x-axis by the mean of the gene.set, while "group" will arrange the x-axis by in alphanumerical order. Using NULL will not reorder the x-axis. |
scale |
Visualize raw values FALSE or Z-transform enrichment values TRUE. |
facet.by |
Variable to facet the plot into n distinct graphs. |
add.rug |
Add visualization of the discrete cells along the ridge plot (TRUE). |
palette |
Colors to use in visualization - input any hcl.pals. |
ggplot2 object with ridge-based distributions of selected gene.set
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) ridgeEnrichment(pbmc_small, assay = "escape", gene.set = "Tcells") ridgeEnrichment(pbmc_small, assay = "escape", gene.set = "Tcells", color.by = "Tcells")
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) ridgeEnrichment(pbmc_small, assay = "escape", gene.set = "Tcells") ridgeEnrichment(pbmc_small, assay = "escape", gene.set = "Tcells", color.by = "Tcells")
Run the escape-based gene-set enrichment calculation with Seurat or SingleCellExperiment pipelines
runEscape( input.data, gene.sets = NULL, method = "ssGSEA", groups = 1000, min.size = 5, normalize = FALSE, make.positive = FALSE, new.assay.name = "escape", BPPARAM = SerialParam(), ... )
runEscape( input.data, gene.sets = NULL, method = "ssGSEA", groups = 1000, min.size = 5, normalize = FALSE, make.positive = FALSE, new.assay.name = "escape", BPPARAM = SerialParam(), ... )
input.data |
The count matrix, Seurat, or Single-Cell Experiment object. |
gene.sets |
Gene sets can be a list, output from
|
method |
Select the method to calculate enrichment, AUCell, GSVA, ssGSEA or UCell. |
groups |
The number of cells to separate the enrichment calculation. |
min.size |
Minimum number of gene necessary to perform the enrichment calculation |
normalize |
Whether to divide the enrichment score by the number of genes TRUE or report unnormalized FALSE. |
make.positive |
During normalization shift enrichment values to a positive range TRUE for downstream analysis or not TRUE (default). Will only be applied if normalize = TRUE. |
new.assay.name |
The new name of the assay to append to the single-cell object containing the enrichment scores. |
BPPARAM |
A BiocParallel::bpparam() object that for parallelization. |
... |
pass arguments to AUCell GSVA, ssGSEA or UCell call |
Seurat or Single-Cell Experiment object with escape enrichment scores in the assay slot.
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL)
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL)
This function allows to the user to examine the distribution of 2 gene sets along the x.axis and y.axis. The color gradient is generated using the a density estimate. See ggpointdensity) for more information.
scatterEnrichment( input.data, assay = NULL, x.axis = NULL, y.axis = NULL, scale = FALSE, facet.by = NULL, style = "point", palette = "inferno" )
scatterEnrichment( input.data, assay = NULL, x.axis = NULL, y.axis = NULL, scale = FALSE, facet.by = NULL, style = "point", palette = "inferno" )
input.data |
Enrichment output from |
assay |
Name of the assay to plot if data is a single-cell object. |
x.axis |
Gene set to plot on the x.axis. |
y.axis |
Gene set to plot on the y.axis. group.by parameter or use the gene.set name if wanting to apply a gradient palette. |
scale |
Visualize raw values FALSE or Z-transform enrichment values TRUE. |
facet.by |
Variable to facet the plot into n distinct graphs. |
style |
Return a "hex" bin plot or a "point"-based plot. |
palette |
Colors to use in visualization - input any hcl.pals. |
ggplot2 object with a scatter plot of selected gene.sets
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) scatterEnrichment(pbmc_small, assay = "escape", x.axis = "Tcells", y.axis = "Bcells")
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) scatterEnrichment(pbmc_small, assay = "escape", x.axis = "Tcells", y.axis = "Bcells")
This function allows to the user to examine the distribution of enrichment across groups by generating a split violin plot.
splitEnrichment( input.data, assay = NULL, split.by = NULL, group.by = NULL, gene.set = NULL, order.by = NULL, facet.by = NULL, scale = TRUE, palette = "inferno" )
splitEnrichment( input.data, assay = NULL, split.by = NULL, group.by = NULL, gene.set = NULL, order.by = NULL, facet.by = NULL, scale = TRUE, palette = "inferno" )
input.data |
Enrichment output from |
assay |
Name of the assay to plot if data is a single-cell object. |
split.by |
Variable to form the split violin, must have 2 levels. |
group.by |
Categorical parameter to plot along the x.axis. If input is a single-cell object the default will be cluster. |
gene.set |
Gene set to plot (on y-axis). |
order.by |
Method to organize the x-axis - "mean" will arrange the x-axis by the mean of the gene.set, while "group" will arrange the x-axis by in alphanumerical order. Using NULL will not reorder the x-axis. |
facet.by |
Variable to facet the plot into n distinct graphs. |
scale |
Visualize raw values FALSE or Z-transform enrichment values TRUE. |
palette |
Colors to use in visualization - input any hcl.pals. |
ggplot2 object violin-based distributions of selected gene.set
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) splitEnrichment(pbmc_small, assay = "escape", split.by = "groups", gene.set = "Tcells")
GS <- list(Bcells = c("MS4A1", "CD79B", "CD79A", "IGH1", "IGH2"), Tcells = c("CD3E", "CD3D", "CD3G", "CD7","CD8A")) pbmc_small <- SeuratObject::pbmc_small pbmc_small <- runEscape(pbmc_small, gene.sets = GS, min.size = NULL) splitEnrichment(pbmc_small, assay = "escape", split.by = "groups", gene.set = "Tcells")