Title: | Rapid benchmarking of methods for *in silico* deconvolution of bulk RNA-seq data |
---|---|
Description: | granulator is an R package for the cell type deconvolution of heterogeneous tissues based on bulk RNA-seq data or single cell RNA-seq expression profiles. The package provides a unified testing interface to rapidly run and benchmark multiple state-of-the-art deconvolution methods. Data for the deconvolution of peripheral blood mononuclear cells (PBMCs) into individual immune cell types is provided as well. |
Authors: | Sabina Pfister [aut, cre], Vincent Kuettel [aut], Enrico Ferrero [aut] |
Maintainer: | Sabina Pfister <[email protected]> |
License: | GPL-3 |
Version: | 1.15.0 |
Built: | 2024-12-24 06:02:24 UTC |
Source: | https://github.com/bioc/granulator |
regress
computes regression between estimated cell
type proportions and the measured cell type proportions (ground truth).
benchmark(deconvoluted, ground_truth)
benchmark(deconvoluted, ground_truth)
deconvoluted |
Output object of the function |
ground_truth |
A matrix containing measured cell type proportions in percentages. Samples names are inlcuded in rownames. |
Returns a list containing thres elements:
data: a list of data frames with celltype matched estimated and predicted proportions
stats: a list of data frames with regression statistics comprising Pearson Correlation Coefficient ('pcc'), Concordance Correlation Coefficient ('ccc'), Coefficient of Determination ('adj.r2') and Root Mean Square Error ('rmse')
summary: a data frame with summary statistics by cell type
rank: ranking of deconvolution alghoritms by highest all-to-all correlation of coefficients
summay: summary statistics of regression coefficients by method, signature and cell type
rank: ranking of methods and signatures by highest average regression coefficient
combinations: combination of methods and signatures tested
Vincent Kuettel, Sabina Pfister
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # bechmark bench <- benchmark(deconvoluted = decon, ground_truth = groundTruth_ABIS)
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # bechmark bench <- benchmark(deconvoluted = decon, ground_truth = groundTruth_ABIS)
Public dataset (GSE107011) containing the TPM-normalized gene expression values from bulk RNAseq of PBMCs of 12 healthy individuals. We include here only genes selected in the signature matrices.
data(bulkRNAseq_ABIS)
data(bulkRNAseq_ABIS)
A matrix with 1296 rows (genes) and 12 variables (samples)
Monaco et al. (2019) Cell Reports 26, 1627–1640 (Cell Reports)
correlate
computes Pearson correlations between
estimated cell type proportions generated by different methods.
correlate(deconvoluted, scale = TRUE)
correlate(deconvoluted, scale = TRUE)
deconvoluted |
A list: output object from |
scale |
Boolean: indicate whether the coefficients should be transformed to standard scores (default: scale = TRUE). |
correlation_analysis
is particularly useful to assess the
performance of the different methods when no ground truth is available.
If several methods agree on similar relative abundances of cell types across
samples, the results are more likely to reflect true differences in cell
type composition.
Returns a list encompassing two data frames:
the pearson correlation of coefficients with all other coefficients
summay: summary statistics of all-to-all correlation of coefficients by cell type
rank: ranking of deconvolution alghoritms by highest all-to-all correlation of coefficients
rank: ranking of deconvolution alghoritms by highest average regression all-to-all correlation of coefficients
combinations: combination of methods and signatures tested
Vincent Kuettel, Sabina Pfister
# load data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # correlate correl <- correlate(deconvoluted = decon)
# load data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # correlate correl <- correlate(deconvoluted = decon)
deconvolute
predicts cell type proportions from bulk
RNAseq data by applying multiple deconvolution methods.
deconvolute(m, sigMatrix, methods = get_decon_methods(), use_cores = 1)
deconvolute(m, sigMatrix, methods = get_decon_methods(), use_cores = 1)
m |
Bulk RNAseq: a genes (rows) by samples (columns) matrix containing transcript-per-million (TPM)-normalized gene expression values. |
sigMatrix |
Reference profile: a matrix or a named list of matrices. Each signature matrix should be a genes (rows) by cell types (columns) data frame containing TPM-normalized gene expression values of signature genes. |
methods |
Deconvolution methods: a character vector containing the names of the deconvolution methods to be applied. By default, all methods are run. Functions are either reimplementations of published methods or wrapper functions for published packages:
|
use_cores |
Number of cores to use for parallel processing |
Returns a list containing two elements:
coefficients: estimated cell type coefficients
proportions: estimated cell type proportions in percentage
combinations: combination of methods and signatures tested
Vincent Kuettel, Sabina Pfister
# load demo PBMCS data load_ABIS() # generate list of reference profiles to be tested sigMatrix <- list( sig1 = sigMatrix_ABIS_S0, sig2 = sigMatrix_ABIS_S1) # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix)
# load demo PBMCS data load_ABIS() # generate list of reference profiles to be tested sigMatrix <- list( sig1 = sigMatrix_ABIS_S0, sig2 = sigMatrix_ABIS_S1) # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix)
get_decon_methods
returns supported deconvolution
methods acronyms.
get_decon_methods()
get_decon_methods()
vector containing the acronyms of deconvolution methods.
Vincent Kuettel, Sabina Pfister
# get available deconvolution methods get_decon_methods()
# get available deconvolution methods get_decon_methods()
get_TPM
is used to convert raw counts to TPMs,
which is the most suitable normalization for deconvolution.
get_TPM(counts, effLen)
get_TPM(counts, effLen)
counts |
Bulk RNAseq: a genes (rows) by samples (columns) matrix containing gene raw counts. |
effLen |
Vector of gene lengths. |
Returns a transcript-per-million (TPM)-normalized matrix.
Vincent Kuettel, Sabina Pfister
# get TPMs from raw counts and gene lengths. mat <- round(matrix(rexp(200, rate=.01), ncol=20)) len <- round(matrix(rexp(10, rate=.001), ncol=1))+10 tpm <- get_TPM(mat,as.vector(len))
# get TPMs from raw counts and gene lengths. mat <- round(matrix(rexp(200, rate=.01), ncol=20)) len <- round(matrix(rexp(10, rate=.001), ncol=1))+10 tpm <- get_TPM(mat,as.vector(len))
Public dataset (GSE107011)containing the true proportions for all combinations of cell types (PBMCs) for 12 individuals.
data(groundTruth_ABIS)
data(groundTruth_ABIS)
A matrix with 12 rows (samples) and 24 variables (cell types)
Monaco et al. (2019) Cell Reports 26, 1627–1640 (Cell Reports)
load_ABIS
is used to load a demo dataset for the
deconvolution of PBMCs samples from published data under the accession
number GSE107011. The dataset consists of the following datasets:
bulkRNAseq_ABIS: PBMCs expression profiles
sigMatrix_ABIS_S0: Signature matrix for deconvolution of PBMCs in 17 cell types
sigMatrix_ABIS_S1: Signature matrix for deconvolution of PBMCs in 13 cell types
sigMatrix_ABIS_S2: Signature matrix for deconvolution of PBMCs in 11 cell types
sigMatrix_ABIS_S3: Signature matrix for deconvolution of PBMCs in 9 cell types
groundTruth_ABIS: PBMCS true cell type proprotions
load_ABIS()
load_ABIS()
Returns string confirming successfull loading of the data.
Vincent Kuettel, Sabina Pfister
# load data load_ABIS()
# load data load_ABIS()
plot_benchmark
plots the median correlation scores
between estimated and measured cell types across methods and cell types.
plot_benchmark(benchmarked, metric = "pcc")
plot_benchmark(benchmarked, metric = "pcc")
benchmarked |
List: output object from function |
metric |
Character: the metric of evaluation. Options include Pearson Correlation Coefficient ('pcc'), Concordance Correlation Coefficient ('ccc'), Coefficient of Determination ('adj.r2') and Root Mean Square Error ('rmse') of the linear regression model. |
Plot showing correlations across algorithms and cell types.
Vincent Kuettel, Sabina Pfister
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # bechmark bench<- benchmark(deconvoluted = decon, ground_truth = groundTruth_ABIS) # plot bechmark plot_benchmark(benchmarked = bench, metric = 'pcc')
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # bechmark bench<- benchmark(deconvoluted = decon, ground_truth = groundTruth_ABIS) # plot bechmark plot_benchmark(benchmarked = bench, metric = 'pcc')
plot_correlate
is used to visualize the results
obtained by correlation_analysis
.
plot_correlate(correlated, method = "heatmap", legend = TRUE)
plot_correlate(correlated, method = "heatmap", legend = TRUE)
correlated |
output object from |
method |
plot type ("heatmap" or "boxplot") |
legend |
boolean to display color legend |
plot_correlate
plots the correlation of cell type
proportions across methods in form of a heatmap or a violin plot.
If methods agree, cell type proportions of the same cell type should
by strongly correlated. For cell types with weak correlation across
methods, corresploding estimated cell type proportions should be
interpreted with caution.
Returns a heatmap or violin plot showing the correlation distribution of by different methods/signature matrices for each cell type
Vincent Kuettel, Sabina Pfister
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # correlate correl <- correlate(deconvoluted = decon) # plot correlate plot_correlate(correlated = correl, method="heatmap")
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # correlate correl <- correlate(deconvoluted = decon) # plot correlate plot_correlate(correlated = correl, method="heatmap")
plot_deconvolute
allows to compare methods across cell
types, where the different methods show a high level of agreement or
potentially generate diverging proportion estimates.
plot_deconvolute( deconvoluted = deconvoluted, scale = TRUE, labels = TRUE, markers = TRUE )
plot_deconvolute( deconvoluted = deconvoluted, scale = TRUE, labels = TRUE, markers = TRUE )
deconvoluted |
output object from function |
scale |
Boolean: indicate whether the coefficients should be transformed to standard scores (default: scale = TRUE). |
labels |
Boolean: indicate if x axis labels should be included (default: labels = TRUE). |
markers |
Boolean: indicate if data points markers should be drawn (default: markers = TRUE). |
Plots the estimated cell types generated by different deconvolution methods/signature matrices across samples. Scaling is used to directly compare deconvolution outputs across methods.
line plot
Vincent Kuettel, Sabina Pfister
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # plot deconvolute plot_deconvolute(deconvoluted = decon)
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # plot deconvolute plot_deconvolute(deconvoluted = decon)
plot_proportions
plots the estimated cell type
proportions as computed by a given method and signature matrix.
plot_proportions(deconvoluted, method = "svr", signature = "sig1")
plot_proportions(deconvoluted, method = "svr", signature = "sig1")
deconvoluted |
Output object from function |
method |
Character string with name of method to be regressed. |
signature |
Character string with name of signature to be regressed. |
Plot showing regression of estimated versus measured cell type coefficients.
Vincent Kuettel, Sabina Pfister
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # plot cell type proportions plot_proportions(deconvoluted = decon, method = 'svr', signature = 'sig1')
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # plot cell type proportions plot_proportions(deconvoluted = decon, method = 'svr', signature = 'sig1')
plot_regress
depicts the measured cell type
proportions (x-axis) vs. the estimated proportions (y-axis).
plot_regress(benchmarked, method = "svr", signature = "sig1")
plot_regress(benchmarked, method = "svr", signature = "sig1")
benchmarked |
List: output object from function |
method |
Character string with name of method to be regressed. |
signature |
Character string with name of signature to be regressed. |
Plot showing regression of estimated versus measured cell type coefficients.
Vincent Kuettel, Sabina Pfister
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # bechmark bench <- benchmark(deconvoluted = decon, ground_truth = groundTruth_ABIS) # plot regress plot_regress(benchmarked = bench, method = 'svr', signature = 'sig1')
# load demo PBMCS data load_ABIS() # deconvolute decon <- deconvolute(m = bulkRNAseq_ABIS, sigMatrix = sigMatrix_ABIS_S0) # bechmark bench <- benchmark(deconvoluted = decon, ground_truth = groundTruth_ABIS) # plot regress plot_regress(benchmarked = bench, method = 'svr', signature = 'sig1')
plot_similarity
plots cell type similarity matrix by
computing the Kendall rank correlations between cell type expression profiles.
Kendall rank correlation is used to test the similarities in the ordering of data
when it is ranked by quantities, and provides a less inflated measure of accuracy
than Pearson correlation by accounting for ties in the data.
plot_similarity(sigMatrix)
plot_similarity(sigMatrix)
sigMatrix |
Signature matrix: a data frame or a named list of data frames. Each signature matrix should be a genes (rows) by cell types (columns) data frame containing TPM-normalized gene expression values of signature genes. |
Plot showing the Kendall rank correlations similariy matrix.
Vincent Kuettel, Sabina Pfister
# load demo PBMCS data load_ABIS() # generate list of reference profiles to be tested sigMatrix <- list(sig1 = sigMatrix_ABIS_S0, sig2 = sigMatrix_ABIS_S2) # plot similarity plot_similarity(sigMatrix = sigMatrix)
# load demo PBMCS data load_ABIS() # generate list of reference profiles to be tested sigMatrix <- list(sig1 = sigMatrix_ABIS_S0, sig2 = sigMatrix_ABIS_S2) # plot similarity plot_similarity(sigMatrix = sigMatrix)
A dataset containing the TPM-normalized RNA-seq gene expression values for signature genes of 17 PBMCs.
data(sigMatrix_ABIS_S0)
data(sigMatrix_ABIS_S0)
A matrix with 1296 rows (genes) and 17 variables (cell types)
Monaco et al. (2019) Cell Reports 26, 1627–1640 (Cell Reports)
A dataset containing the TPM-normalized RNA-seq gene expression values for signature genes of 17 PBMCs.
data(sigMatrix_ABIS_S1)
data(sigMatrix_ABIS_S1)
A matrix with 1296 rows (genes) and 13 variables (cell types)
Monaco et al. (2019) Cell Reports 26, 1627–1640 (Cell Reports)
A dataset containing the TPM-normalized RNA-seq gene expression values for signature genes of 17 PBMCs.
data(sigMatrix_ABIS_S2)
data(sigMatrix_ABIS_S2)
A matrix with 1296 rows (genes) and 11 variables (cell types)
Monaco et al. (2019) Cell Reports 26, 1627–1640 (Cell Reports)
A dataset containing the TPM-normalized RNA-seq gene expression values for signature genes of 17 PBMCs.
data(sigMatrix_ABIS_S3)
data(sigMatrix_ABIS_S3)
A matrix with 1296 rows (genes) and 9 variables (cell types)
Monaco et al. (2019) Cell Reports 26, 1627–1640 (Cell Reports)