| Title: | Helper Functions for LIBD Deconvolution |
|---|---|
| Description: | Functions helpful for LIBD deconvolution project. Includes tools for marker finding with mean ratio, expression plotting, and plotting deconvolution results. Working to include DLPFC datasets. |
| Authors: | Louise Huuki-Myers [aut, cre] (ORCID: <https://orcid.org/0000-0001-5148-3602>), Leonardo Collado-Torres [ctb] (ORCID: <https://orcid.org/0000-0003-2140-308X>), Nicholas J. Eagles [ctb] (ORCID: <https://orcid.org/0000-0002-9808-5254>) |
| Maintainer: | Louise Huuki-Myers <[email protected]> |
| License: | Artistic-2.0 |
| Version: | 1.5.0 |
| Built: | 2026-05-28 06:34:11 UTC |
| Source: | https://github.com/bioc/DeconvoBuddies |
This function returns a character() vector with valid R colors for a given
input character() of unique cell types. These were colors that have been
useful in our experience.
    create_cell_colors(
        cell_types = c("Astro", "Micro", "Endo", "Oligo", "OPC", "Excit", "Inhib", "Other"),
        palette_name = c("classic", "gg", "tableau"),
        palette = NULL,
        split = NA,
        preview = FALSE
    )
cell_types |
A |
palette_name |
A
|
palette |
A |
split |
delineating |
preview |
A |
A named character() vector of R and hex color values compatible
with ggplot2::scale_color_manual().
    ## create cell colors with included palettes
    create_cell_colors(palette_name = "classic")
    create_cell_colors(palette_name = "classic", preview = TRUE)
    create_cell_colors(palette_name = "tableau", preview = TRUE)

    ## use custom colors
    my_colors <- c("darkorchid4", "deeppink4", "aquamarine3", "darkolivegreen1")
    create_cell_colors(
        cell_types = c("A", "B", "C", "D"),
        palette = my_colors,
        preview = TRUE
    )

    ## use RColorBrewer
    create_cell_colors(
        cell_types = c("A", "B", "C"),
        palette = RColorBrewer::brewer.pal(n = 3, name = "Set1"),
        preview = TRUE
    )

    ## Options for subtype handling
    ## Provide unique colors for cell subtypes (DEFAULT) - returns one level list
    create_cell_colors(
        cell_types = c("A.1", "A.2", "B.1", "C", "D"),
        palette_name = "classic",
        preview = FALSE
    )

    ## Provide gradient colors for A.1 and A.2 by using the "split" argument
    ## returns a nested list with broad & fine cell type colors, fine cell types
    ## are gradient with the top level matching the broad cell type
    create_cell_colors(
        cell_types = c("A.1", "A.2", "B.1", "C", "D"),
        split = "\\.",
        palette_name = "classic",
        preview = TRUE
    )

    ## try with custom colors
    create_cell_colors(
        cell_types = c("A.1", "A.2", "B.1", "C", "D"),
        split = "\\.",
        palette = my_colors,
        preview = TRUE
    )
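The subtype gradient described in the examples (one base color per broad cell type, fading across its subtypes) can be approximated in base R. This is only an illustrative sketch and not the package's implementation: the helper `subtype_gradient()`, the hex colors, and the choice to fade toward white are all assumptions made for this example.

```r
## Hypothetical helper (not part of DeconvoBuddies): build a gradient of
## colors for the subtypes of each broad cell type.
subtype_gradient <- function(cell_types, broad_colors, split = "\\.") {
    ## Broad type = text before the first match of `split`
    broad <- vapply(strsplit(cell_types, split), `[`, character(1), 1)

    fine <- unlist(lapply(unique(broad), function(b) {
        subtypes <- cell_types[broad == b]
        ## Ramp from the broad color toward white, then drop the white end
        ramp <- grDevices::colorRampPalette(c(broad_colors[[b]], "white"))
        cols <- ramp(length(subtypes) + 1)[seq_along(subtypes)]
        stats::setNames(cols, subtypes)
    }))
    fine
}

## Made-up broad colors for three broad cell types
broad_colors <- c(A = "#3BB273", B = "#663894", C = "#E07000")
fine_colors <- subtype_gradient(c("A.1", "A.2", "B.1", "C"), broad_colors)
fine_colors
```

The first subtype of each broad type keeps the broad color, and later subtypes get progressively lighter shades. The result is a named character vector, so it can be passed wherever a named color vector is expected.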
Cell type proportions estimated by Bisque for DLPFC bulk RNA-seq data set, utilizing DLPFC snRNA-seq data as the reference data.
    data("est_prop")
A data.frame object.
16.79 kB
These are the columns of the data.frame object:
Astro: estimated proportions of Astrocyte cells
EndoMural: estimated proportions of Endothelial + Mural cells
Micro: estimated proportions of Microglia cells
Oligo: estimated proportions of Oligodendrocyte Cells
OPC: estimated proportions of Oligodendrocyte Precursor Cells
Excit: estimated proportions for Excitatory Neurons
Inhib: estimated proportions for Inhibitory Neurons
https://github.com/LieberInstitute/DeconvoBuddies/blob/master/inst/extdata/data-raw/est_prop.R
    ## Note that the `rowSums(est_prop)` is equal to 1,
    ## with a small error tolerance.
    data("est_prop")
    summary(rowSums(est_prop) - 1)

    ## You can check this yourself with:
    all(round(rowSums(est_prop), 3) == 1)

    # To view source
    system.file("extdata", "data-raw", "est_prop.R", package = "DeconvoBuddies")
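The row-sum check can also be written with an explicit numeric tolerance instead of rounding. The sketch below uses a small made-up data.frame (`toy_prop` is not package data) so that it runs without loading `est_prop`; the tolerance of `1e-8` is an arbitrary choice for illustration.

```r
## Toy proportions (made-up values), one row per sample
toy_prop <- data.frame(
    Astro = c(0.10, 0.25, 0.05),
    Oligo = c(0.60, 0.40, 0.70),
    Excit = c(0.30, 0.35, 0.25),
    row.names = c("S1", "S2", "S3")
)

## Largest absolute deviation of the per-sample sums from 1
max_dev <- max(abs(rowSums(toy_prop) - 1))

## TRUE when every sample sums to 1 within the chosen tolerance
sums_to_one <- max_dev < 1e-8
sums_to_one
```

Reporting the maximum deviation is useful when the check fails, because it shows how far off the worst sample is.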
A test dataset of estimated proportions for 5 cell types over 100 samples.
    data("est_prop_test")
A data.frame object.
11.62 kB
These are the columns of the data.frame object:
cell_A: estimated proportions for cell type A
cell_B: estimated proportions for cell type B
cell_C: estimated proportions for cell type C
cell_D: estimated proportions for cell type D
cell_E: estimated proportions for cell type E
https://github.com/LieberInstitute/DeconvoBuddies/blob/master/inst/extdata/data-raw/est_prop_test.R
    ## Note that the `rowSums(est_prop_test)` is equal to 1,
    ## with a small error tolerance.
    data("est_prop_test")
    summary(rowSums(est_prop_test) - 1)

    ## You can check this yourself with:
    all(round(rowSums(est_prop_test), 3) == 1)

    # To view source
    system.file("extdata", "data-raw", "est_prop_test.R", package = "DeconvoBuddies")
This function downloads the processed data for the experiment documented
at https://github.com/LieberInstitute/Human_DLPFC_Deconvolution.
Internally, this function downloads the data from ExperimentHub.
    fetch_deconvo_data(
        type = c("rse_gene", "sce", "sce_DLPFC_example"),
        destdir = tempdir(),
        eh = ExperimentHub::ExperimentHub(),
        bfc = BiocFileCache::BiocFileCache()
    )
type |
A
|
destdir |
The destination directory to where files will be downloaded
to in case the |
eh |
An |
bfc |
A |
We are currently waiting for https://doi.org/10.1101/2024.02.09.579665 to
pass peer review at a journal, which could lead to changes requested by the
peer reviewers on the processed data for this study. Thus, this function
temporarily downloads the files from Dropbox using
BiocFileCache::bfcrpath() unless the files are present already at
destdir.
Note that ExperimentHub and BiocFileCache will cache the data and
automatically detect if you have previously downloaded it, thus making it
the preferred way to interact with the data.
This function is based on spatialLIBD::fetch_data().
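The requested object (for example, rse_gene), which you should assign to a variable. For type = "sce", the examples indicate that the return value is the path to a zip file that has to be unzipped and loaded.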
    ## Download the bulk RNA gene expression data
    ## A RangedSummarizedExperiment (41.16 MB)
    if (!exists("rse_gene")) rse_gene <- fetch_deconvo_data("rse_gene")

    ## explore bulk data
    rse_gene

    ## load example snRNA-seq data
    ## A SingleCellExperiment (4.79 MB)
    if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")

    ## explore example sce data
    sce_DLPFC_example

    ## check the logcounts
    SingleCellExperiment::logcounts(sce_DLPFC_example)[1:5, 1:5]

    ## download the full sce experiment object
    sce_path_zip <- fetch_deconvo_data("sce")
    sce_path <- unzip(sce_path_zip, exdir = tempdir())
    sce <- HDF5Array::loadHDF5SummarizedExperiment(
        file.path(tempdir(), "sce_DLPFC_annotated")
    )
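The "download unless the file is already present at destdir" behavior mentioned in the details can be pictured with a small base R sketch. Everything here is hypothetical: `fetch_once()` and `fake_download()` are illustration-only helpers, no network access is involved, and the real function relies on ExperimentHub and BiocFileCache rather than this logic.

```r
## Hypothetical helper: return a local path, calling `download_fun`
## only when the file is not already present in `destdir`.
fetch_once <- function(file_name, destdir = tempdir(), download_fun) {
    dest <- file.path(destdir, file_name)
    if (!file.exists(dest)) {
        download_fun(dest)
    }
    dest
}

## Stand-in for a real download: writes a placeholder and counts calls
n_downloads <- 0
fake_download <- function(dest) {
    n_downloads <<- n_downloads + 1
    writeLines("placeholder", dest)
}

## Use a fresh directory so the first call always triggers a "download"
demo_dir <- tempfile("deconvo_demo_")
dir.create(demo_dir)

path_1 <- fetch_once("rse_gene.txt", demo_dir, fake_download)
path_2 <- fetch_once("rse_gene.txt", demo_dir, fake_download)

## The second call reuses the existing file
n_downloads
```

The point of the pattern is that repeated calls are cheap, which is why caching is described as the preferred way to interact with the data.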
This function wraps scran::findMarkers(). For each cell type, it computes the statistics comparing that cell type (the "1") against all other cell types combined (the "All").
    findMarkers_1vAll(
        sce,
        assay_name = "counts",
        cellType_col = "cellType",
        add_symbol = FALSE,
        mod = NULL,
        verbose = TRUE,
        direction = "up",
        BPPARAM = BiocParallel::SerialParam(),
        raw_logFC = FALSE
    )
sce |
A SingleCellExperiment object. |
assay_name |
Name of the assay to use for calculation. See
see |
cellType_col |
Column name on |
add_symbol |
A |
mod |
A |
verbose |
A |
direction |
A |
BPPARAM |
A BiocParallelParam object specifying how to parallelize computation across cell types. |
raw_logFC |
A |
See https://github.com/MarioniLab/scran/issues/57 for a more in depth
discussion about the standard log fold change statistics provided by
scran::findMarkers().
See also https://youtu.be/IaclszgZb-g for a LIBD rstats club presentation on "Finding and interpreting marker genes in sc/snRNA-seq data". The companion notes are available at https://docs.google.com/document/d/1BeMtKgE7gpmNywInndVC9o_ufopn-U2EZHB32bO7ObM/edit?usp=sharing.
A tibble::tibble() of 1 vs. ALL standard log fold change + p-values
for each gene x cell type.
gene is the name of the gene (from rownames(sce)).
logFC the log fold change from the DE test, only returned if raw_logFC = TRUE.
log.p.value the log of the p-value of the DE test
log.FDR the log of the False Discovery Rate adjusted p.value
std.logFC the standard logFC.
cellType.target the cell type we're finding marker genes for
std.logFC.rank the rank of std.logFC for each cell type
std.logFC.anno is an annotation of the std.logFC value
helpful for plotting.
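The per-cell-type ranking behind a column like `std.logFC.rank` can be sketched in base R on made-up values. This assumes that rank 1 corresponds to the largest `std.logFC` within each target cell type, which is an assumption of this example and not confirmed by the text above; `toy_stats` is not package data.

```r
## Toy marker statistics (made-up values): 3 genes x 2 target cell types
toy_stats <- data.frame(
    gene = rep(c("g1", "g2", "g3"), times = 2),
    cellType.target = rep(c("Astro", "Micro"), each = 3),
    std.logFC = c(2.5, 0.3, 1.1, -0.2, 3.0, 0.8)
)

## Rank within each target cell type; negate so the largest value gets rank 1
toy_stats$std.logFC.rank <- ave(
    -toy_stats$std.logFC,
    toy_stats$cellType.target,
    FUN = rank
)

## Keep the top-ranked gene for each cell type
top_markers <- toy_stats[toy_stats$std.logFC.rank == 1, ]
top_markers
```

Filtering on the rank column in this way is a convenient route to a short list of candidate markers per cell type.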
Other marker gene functions:
get_mean_ratio()
    ## load example SingleCellExperiment
    if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")

    ## Explore properties of the sce object
    sce_DLPFC_example

    ## this data contains logcounts of gene expression
    SummarizedExperiment::assays(sce_DLPFC_example)$logcounts[1:5, 1:5]

    ## nuclei are classified in to cell types
    table(sce_DLPFC_example$cellType_broad_hc)

    ## Get the 1vALL stats for each gene for each cell type defined in
    ## `cellType_broad_hc`
    marker_stats_1vAll <- findMarkers_1vAll(
        sce = sce_DLPFC_example,
        assay_name = "logcounts",
        cellType_col = "cellType_broad_hc",
        mod = "~BrNum"
    )

    ## explore output, top markers have high logFC
    head(marker_stats_1vAll)
Calculate the Mean Ratio value and rank for each gene for each cell type in
the sce object, to identify effective marker genes for deconvolution.
    get_mean_ratio(
        sce,
        cellType_col,
        assay_name = "logcounts",
        gene_ensembl = NULL,
        gene_name = NULL,
        BPPARAM = BiocParallel::SerialParam()
    )
sce |
SummarizedExperiment-class (or any derivative class) object containing single cell/nucleus gene expression data. |
cellType_col |
A |
assay_name |
A |
gene_ensembl |
A |
gene_name |
A |
BPPARAM |
A BiocParallelParam object specifying how to potentially parallelize key matrix operations. |
Note: if a cell type has fewer than 10 cells, the MeanRatio results may be unstable. See the rationale in OSCA: https://bioconductor.org/books/3.19/OSCA.multisample/multi-sample-comparisons.html#performing-the-de-analysis.
A tibble::tibble() with the MeanRatio values for each gene x cell
type.
gene is the name of the gene (from rownames(sce)).
cellType.target is the cell type we're finding marker genes for.
mean.target is the mean expression of gene for cellType.target.
cellType.2nd is the non-target cell type with the highest mean expression of gene (the second highest overall when the target cell type ranks first).
mean.2nd is the mean expression of gene for cellType.2nd.
MeanRatio is the ratio of mean.target/mean.2nd.
MeanRatio.rank is the rank of MeanRatio for the cell type.
MeanRatio.anno is an annotation of the MeanRatio calculation helpful
for plotting.
gene_ensembl & gene_name optional columns from rowData(sce) specified
by the user to add gene information.
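The columns listed above can be related to each other with a small base R sketch on a made-up expression matrix. This is not the package's implementation; `expr`, `cell_type`, and the helper `mean_ratio()` are hypothetical, and the sketch assumes that `mean.2nd` is the highest mean among the non-target cell types.

```r
## Toy log-expression matrix: 2 genes x 6 cells (made-up values)
expr <- matrix(
    c(4, 5, 1, 1, 2, 2,
      1, 1, 3, 3, 3, 3),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("geneA", "geneB"), paste0("cell", 1:6))
)
cell_type <- c("Astro", "Astro", "Micro", "Micro", "Oligo", "Oligo")

## Mean expression of each gene within each cell type (genes x cell types)
ct_means <- sapply(split(seq_along(cell_type), cell_type), function(idx) {
    rowMeans(expr[, idx, drop = FALSE])
})

## Hypothetical helper: MeanRatio for one gene and one target cell type
mean_ratio <- function(gene, target) {
    mean_target <- ct_means[gene, target]
    others <- ct_means[gene, colnames(ct_means) != target]
    second <- names(which.max(others))  # highest non-target cell type
    data.frame(
        gene = gene,
        cellType.target = target,
        mean.target = mean_target,
        cellType.2nd = second,
        mean.2nd = others[[second]],
        MeanRatio = mean_target / others[[second]],
        row.names = NULL
    )
}

res <- mean_ratio("geneA", "Astro")
res
```

Here geneA has mean 4.5 in Astro and its highest non-target mean is 2 (Oligo), giving a ratio of 2.25. A ratio well above 1 indicates that the gene is expressed more in the target cell type than in any other single cell type.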
Other marker gene functions:
findMarkers_1vAll()
    ## load example SingleCellExperiment
    if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")

    ## Explore properties of the sce object
    sce_DLPFC_example

    ## this data contains logcounts of gene expression
    SummarizedExperiment::assays(sce_DLPFC_example)$logcounts[1:5, 1:5]

    ## nuclei are classified in to cell types
    table(sce_DLPFC_example$cellType_broad_hc)

    ## Get the mean ratio for each gene for each cell type defined in
    ## `cellType_broad_hc`
    get_mean_ratio(sce_DLPFC_example, cellType_col = "cellType_broad_hc")

    # Option to specify gene_name as the "Symbol" column from rowData
    # this will be added to the marker stats output
    SummarizedExperiment::rowData(sce_DLPFC_example)

    ## specify rowData col names for gene_name and gene_ensembl
    get_mean_ratio(sce_DLPFC_example,
        cellType_col = "cellType_broad_hc",
        gene_name = "gene_name",
        gene_ensembl = "gene_id"
    )
The counts are simulated from a Poisson distribution with stats::rpois().
Use set.seed() if you want the results to be reproducible.
    make_test_sce(n_cell = 100, n_gene = 100, n_cellType = 4, n_donor = 2)
n_cell |
An |
n_gene |
An |
n_cellType |
An |
n_donor |
An |
A SingleCellExperiment object with randomly generated counts and colData().
    ## Create an example sce using default values.
    set.seed(20240823)
    test <- make_test_sce()

    ## Let's check the number of cells per cell type from each donor
    addmargins(table(test$cellType, test$donor))
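The effect of set.seed() on Poisson-simulated counts can be seen without the package. In this sketch `simulate_counts()` is a hypothetical stand-in that only draws a count matrix with stats::rpois(); it does not reproduce what make_test_sce() does internally, and `lambda = 3` is an arbitrary choice.

```r
## Hypothetical helper: simulate a genes x cells matrix of Poisson counts
simulate_counts <- function(n_gene = 5, n_cell = 4, lambda = 3) {
    matrix(stats::rpois(n_gene * n_cell, lambda = lambda), nrow = n_gene)
}

## Same seed, same counts
set.seed(20240823)
counts_1 <- simulate_counts()
set.seed(20240823)
counts_2 <- simulate_counts()

## Without resetting the seed, the next draw continues the random stream
counts_3 <- simulate_counts()

identical(counts_1, counts_2)
```

Resetting the seed immediately before the simulation is what makes the draw repeatable; a later call without a reset continues from wherever the random number generator left off.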
A tibble containing the marker statistics calculated for 5k genes from
the DLPFC snRNA-seq dataset by findMarkers_1vAll().
    data("marker_stats_1vAll")
A data.frame object.
3.47 MB
https://github.com/LieberInstitute/DeconvoBuddies/blob/master/inst/extdata/data-raw/RNAScope_prop.R
    # To view source
    system.file("extdata", "data-raw", "marker_stats_1vAll.R", package = "DeconvoBuddies")
A tibble containing the marker stats from get_mean_ratio() for
sce_DLPFC_example.
    data("marker_test")
A tibble::tibble(). See get_mean_ratio() for more details on the column
names.
402.60 kB
https://github.com/LieberInstitute/DeconvoBuddies/blob/master/inst/extdata/data-raw/marker_test.R
    # To view source
    system.file("extdata", "data-raw", "marker_test.R", package = "DeconvoBuddies")
Given a long formatted data.frame, this function creates a barplot for
the average cell type composition among a set of samples (donors) using
ggplot2.
    plot_composition_bar(
        prop_long,
        sample_col = "RNum",
        x_col = "ALL",
        prop_col = "prop",
        ct_col = "cell_type",
        add_text = TRUE,
        min_prop_text = 0
    )
prop_long |
A |
sample_col |
A |
x_col |
A |
prop_col |
A |
ct_col |
A |
add_text |
A |
min_prop_text |
A |
A stacked barplot ggplot2 object representing the mean proportion
of cell types for each group.
    # Load example data
    data("rse_bulk_test")
    data("est_prop_test")

    # extract relevant colData from the example RangedSummarizedExperiment object
    pd <- SummarizedExperiment::colData(rse_bulk_test) |>
        as.data.frame()

    # combine with the example estimated proportions in a long style table
    est_prop_test_long <- est_prop_test |>
        tibble::rownames_to_column("RNum") |>
        tidyr::pivot_longer(!RNum, names_to = "cell_type", values_to = "prop") |>
        dplyr::inner_join(pd |> dplyr::select(RNum, Dx))

    est_prop_test_long

    # Create composition bar plots
    # Mean composition of all samples
    plot_composition_bar(est_prop_test_long)

    # Mean composition by Dx
    plot_composition_bar(est_prop_test_long, x_col = "Dx")

    # control minimum value of text to add
    plot_composition_bar(est_prop_test_long, x_col = "Dx", min_prop_text = 0.1)

    # plot all samples, then facet by Dx
    plot_composition_bar(est_prop_test_long, x_col = "RNum", add_text = FALSE) +
        ggplot2::facet_wrap(~Dx, scales = "free_x")
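The quantity shown by each bar segment, the mean proportion of a cell type within a group, can be computed directly from a long-format table. The sketch below uses base R on a made-up table (`toy_long` is not package data) and is meant only to illustrate the summary, not the plotting code.

```r
## Toy long-format proportions (made-up values): 4 samples, 2 cell types
toy_long <- data.frame(
    RNum = rep(c("R1", "R2", "R3", "R4"), each = 2),
    Dx = rep(c("Case", "Control"), each = 4),
    cell_type = rep(c("cell_A", "cell_B"), times = 4),
    prop = c(0.2, 0.8, 0.4, 0.6, 0.5, 0.5, 0.7, 0.3)
)

## Mean proportion of each cell type within each Dx group
mean_prop <- aggregate(prop ~ Dx + cell_type, data = toy_long, FUN = mean)
mean_prop

## Within each group the mean proportions still sum to 1
group_totals <- tapply(mean_prop$prop, mean_prop$Dx, sum)
group_totals
```

Because every sample's proportions sum to 1, the group means do as well, which is why the stacked bars all reach the same height.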
This function plots the expression of one or more genes as a violin plot,
over a user defined category, typically a cell type annotation. The plots are
made using ggplot2.
    plot_gene_express(
        sce,
        genes,
        assay_name = "logcounts",
        category = "cellType",
        color_pal = NULL,
        title = NULL,
        plot_points = FALSE,
        ncol = 2,
        plot_type = c("violin", "boxplot"),
        free_y = FALSE,
        label_points = NULL
    )
sce |
A SummarizedExperiment-class object or one inheriting it. |
genes |
A |
assay_name |
A |
category |
A |
color_pal |
A named |
title |
A |
plot_points |
A |
ncol |
An |
plot_type |
A |
free_y |
|
label_points |
A |
A ggplot() violin plot for selected genes.
Other expression plotting functions:
plot_marker_express(),
plot_marker_express_ALL(),
plot_marker_express_List()
    ## Using Symbol as rownames makes this more human readable
    data("sce_ab")
    plot_gene_express(sce = sce_ab, genes = c("G-D1_A"))
    plot_gene_express(sce = sce_ab, genes = c("G-D1_A"), plot_type = "boxplot")

    # Access example data
    if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")

    ## plot expression of two genes
    plot_gene_express(
        sce = sce_DLPFC_example,
        category = "cellType_broad_hc",
        genes = c("GAD2", "CD22")
    )

    ## plot as boxplot
    plot_gene_express(
        sce = sce_DLPFC_example,
        category = "cellType_broad_hc",
        genes = c("GAD2", "CD22"),
        plot_type = "boxplot"
    )

    ## plot points - note this creates large images and is easy to over plot
    plot_gene_express(
        sce = sce_DLPFC_example,
        category = "cellType_broad_hc",
        genes = c("GAD2", "CD22"),
        plot_points = TRUE
    )

    ## with boxplot
    plot_gene_express(
        sce = sce_DLPFC_example,
        category = "cellType_broad_hc",
        genes = c("GAD2", "CD22"),
        plot_points = TRUE,
        plot_type = "boxplot"
    )

    ## Use free y-axis between genes
    plot_gene_express(
        sce = sce_DLPFC_example,
        category = "cellType_broad_hc",
        genes = c("GAD2", "CD22"),
        plot_points = TRUE,
        plot_type = "boxplot",
        free_y = TRUE
    )

    ## Add title
    plot_gene_express(
        sce = sce_DLPFC_example,
        category = "cellType_broad_hc",
        genes = c("GAD2", "CD22"),
        title = "My Genes"
    )

    ## Add color palette
    my_cell_colors <- create_cell_colors(cell_types = levels(sce_DLPFC_example$cellType_broad_hc))

    plot_gene_express(
        sce = sce_DLPFC_example,
        category = "cellType_broad_hc",
        genes = c("GAD2", "CD22"),
        color_pal = my_cell_colors,
        plot_type = "boxplot",
        plot_points = TRUE
    )

    ## Add labels to points
    plot_gene_express(
        sce = sce_ab,
        genes = c("G-D1_A"),
        assay_name = "counts",
        plot_points = TRUE,
        label_points = "donor"
    )

    select_cells <- colnames(sce_DLPFC_example)[sce_DLPFC_example$cellType_broad_hc %in% c("Excit", "Inhib")]
    select_cells <- sample(select_cells, 10)

    plot_gene_express(
        sce = sce_DLPFC_example[, select_cells],
        category = "cellType_broad_hc",
        genes = c("GAD2", "CD22"),
        color_pal = my_cell_colors,
        plot_type = "boxplot",
        plot_points = TRUE,
        label_points = "Sample"
    )
This function plots the top n marker genes for a specified cell type, based on
the stats table from get_mean_ratio().
The gene expression is plotted as a violin plot with plot_gene_express(), and
annotations are added to each plot.
    plot_marker_express(
        sce,
        stats,
        cell_type,
        n_genes = 4,
        rank_col = "MeanRatio.rank",
        anno_col = "MeanRatio.anno",
        gene_col = "gene",
        cellType_col = "cellType",
        color_pal = NULL,
        plot_points = FALSE,
        ncol = 2
    )
sce |
SummarizedExperiment-class object |
stats |
A |
cell_type |
A |
n_genes |
An |
rank_col |
The |
anno_col |
The |
gene_col |
The |
cellType_col |
The |
color_pal |
A named |
plot_points |
A |
ncol |
An |
A ggplot2 object created with plot_gene_express(). It is
a scater::plotExpression() style violin plot for selected marker genes.
Other expression plotting functions:
plot_gene_express(),
plot_marker_express_ALL(),
plot_marker_express_List()
    ## Download the processed study data from
    ## <https://github.com/LieberInstitute/Human_DLPFC_Deconvolution>.
    if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")

    ## load example marker stats
    data("marker_test")

    ## Plot the top markers for Astrocytes
    plot_marker_express(
        sce = sce_DLPFC_example,
        stats = marker_test,
        cellType_col = "cellType_broad_hc",
        cell_type = "Astro",
        gene_col = "gene"
    )
This function plots the top n marker genes for all cell types, based on
the stats table from get_mean_ratio(), in a multi-page PDF file.
The gene expression is plotted as a violin plot with plot_gene_express(), and
annotations are added to each plot.
    plot_marker_express_ALL(
        sce,
        stats,
        pdf_fn = NULL,
        n_genes = 10,
        rank_col = "MeanRatio.rank",
        anno_col = "MeanRatio.anno",
        gene_col = "gene",
        cellType_col = "cellType",
        color_pal = NULL,
        plot_points = FALSE
    )
sce |
SummarizedExperiment-class object |
stats |
A |
pdf_fn |
A |
n_genes |
An |
rank_col |
The |
anno_col |
The |
gene_col |
The |
cellType_col |
The |
color_pal |
A named |
plot_points |
A |
A PDF file with violin plots for the expression of top marker genes for all cell types.
Other expression plotting functions:
plot_gene_express(),
plot_marker_express(),
plot_marker_express_List()
    ## Fetch sce example data
    if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")

    ## load example marker stats
    data("marker_test")

    # Plot marker gene expression to PDF, one page per cell type in stats
    pdf_file <- tempfile("test_marker_expression_ALL", fileext = ".pdf")

    plot_marker_express_ALL(
        sce_DLPFC_example,
        cellType_col = "cellType_broad_hc",
        stats = marker_test,
        pdf_fn = pdf_file
    )

    if (interactive()) browseURL(pdf_file)
This function plots a nested list of genes as a multi-page PDF, with one page for each sub-list. A use case is plotting known marker genes for multiple cell types over cell type clusters with unknown identities.
    plot_marker_express_List(
        sce,
        gene_list,
        pdf_fn = NULL,
        cellType_col = "cellType",
        gene_name_col = "gene_name",
        color_pal = NULL,
        plot_points = FALSE
    )
sce |
SummarizedExperiment-class object |
gene_list |
A named |
pdf_fn |
A |
cellType_col |
The |
gene_name_col |
The |
color_pal |
A named |
plot_points |
A |
A PDF file with violin plots for the expression of top marker genes for all cell types.
Other expression plotting functions:
plot_gene_express(),
plot_marker_express(),
plot_marker_express_ALL()
    ## Fetch sce example data
    if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")

    ## Create list-of-lists of genes to plot, names of sub-list become title of page
    my_gene_list <- list(Inhib = c("GAD2", "SAMD5"), Astro = c("RGS20", "PRDM16"))

    # Return a list of plots
    plots <- plot_marker_express_List(
        sce_DLPFC_example,
        gene_list = my_gene_list,
        cellType_col = "cellType_broad_hc"
    )

    print(plots[[1]])

    # Plot marker gene expression to PDF, one page per cell type in stats
    pdf_file <- tempfile("test_marker_expression_List", fileext = ".pdf")

    plot_marker_express_List(
        sce_DLPFC_example,
        gene_list = my_gene_list,
        pdf_fn = pdf_file,
        cellType_col = "cellType_broad_hc"
    )

    if (interactive()) browseURL(pdf_file)
Cell type proportion estimates from high quality images from Huuki-Myers et al., bioRxiv, 2024, doi: https://doi.org/10.1101/2024.02.09.579665.
    data("RNAScope_prop")
A data.frame object.
11.49 kB
These are the columns of the data.frame object:
SAMPLE_ID : DLPFC Tissue block + RNAScope combination.
Sample : DLPFC Tissue block (Donor BrNum + DLPFC position).
Combo : RNAScope probe combination, either "Circle" marking cell types Astro,
Endo, and Inhib, or "Star" marking Excit, Micro, and OligoOPC.
cell_type : The cell type measured.
Confidence : Image confidence, this dataset has been filtered to the high & Ok confidence images.
n_cell : the number of cells counted for the Sample and cell type.
prop : the calculated cell type proportion from n_cell.
n_cell_sn : number of nuclei in the corresponding snRNA-seq data.
prop_sn : cell type proportion from the snRNA-seq data.
NOTE: For the RNAScope assay utilized here, only 3 cell types could be measured at once. Two consecutive tissue slices were used from each brain block: one to measure one combination of three major cell types, and the other to measure the remaining three (differentiated as circle and star). DAPI was used to mark the nuclei in each tissue section, so the overall number of cells is recorded but only a fraction has a cell type label. Unlabeled nuclei are classified as "other".
The two sections combined should get close to identifying the type of all the cells, but often the combined "non-other" fractions are around 0.7 and not 1. This could be due to a few reasons: the sections, while as similar as possible, are not the same tissue; errors in the assay and/or image processing may leave some cells unlabeled; or rare cell types not marked by the selected probes may be present.
With all of this considered, we still think the RNAScope estimates are good approximations of the cell type proportions in these samples.
For more info check out https://doi.org/10.1101/2024.02.09.579665 Figure 2, and Methods: RNAScope/Immunofluorescence Data Generation and HALO Analysis.
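The relationship between `n_cell`, `prop`, and the combined "non-other" fraction discussed in the note can be illustrated with made-up counts. The sketch assumes that `prop` is `n_cell` divided by the total number of nuclei counted in the same tissue section (one `Combo`); that denominator, the sample name, and all counts are assumptions for illustration only.

```r
## Made-up counts for one tissue block imaged with both probe combinations
toy_counts <- data.frame(
    Sample = "Br0000_mid",  # hypothetical sample name
    Combo = rep(c("Circle", "Star"), each = 4),
    cell_type = c("Astro", "Endo", "Inhib", "Other",
                  "Excit", "Micro", "OligoOPC", "Other"),
    n_cell = c(150, 50, 100, 700,
               250, 50, 100, 600)
)

## Proportion of each cell type within its own section (Combo)
toy_counts$prop <- ave(
    toy_counts$n_cell,
    toy_counts$Combo,
    FUN = function(x) x / sum(x)
)

## Labeled ("non-other") fraction per section, then combined across sections
labeled <- toy_counts[toy_counts$cell_type != "Other", ]
labeled_by_combo <- tapply(labeled$prop, labeled$Combo, sum)
combined_labeled <- sum(labeled_by_combo)

labeled_by_combo
combined_labeled
```

With these toy numbers the two sections label 0.3 and 0.4 of their nuclei, so the combined labeled fraction is 0.7, mirroring the situation described in the note where the total falls short of 1.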
https://github.com/LieberInstitute/DeconvoBuddies/blob/master/inst/extdata/data-raw/RNAScope_prop.R
    # To view source
    system.file("extdata", "data-raw", "RNAScope_prop.R", package = "DeconvoBuddies")
A test rse_gene object with data for 1000 genes across 100 samples.
    data("rse_bulk_test")
A SummarizedExperiment object.
976.77 kB
https://github.com/LieberInstitute/DeconvoBuddies/blob/master/inst/extdata/data-raw/rse_bulk_test.R
    # To view source
    system.file("extdata", "data-raw", "rse_bulk_test.R", package = "DeconvoBuddies")
An example sce object for testing, with two cell types A and B.
    data("sce_ab")
A SingleCellExperiment object.
Generated with DeconvoBuddies::make_test_sce()
38.26 kB
https://github.com/LieberInstitute/DeconvoBuddies/blob/master/inst/extdata/data-raw/sce_ab.R
    # To view source
    system.file("extdata", "data-raw", "sce_ab.R", package = "DeconvoBuddies")