| Title: | SpatialeXpressionR: Cell Type Identification in Spatial Transcriptomics |
|---|---|
| Description: | Spatial-eXpression-R (spacexr) is a package for analyzing cell types in spatial transcriptomics data. This implementation is a fork of the spacexr GitHub repo (https://github.com/dmcable/spacexr), adapted to work with Bioconductor objects. The original package implements two statistical methods: RCTD for learning cell types and CSIDE for inferring cell type-specific differential expression. Currently, this fork only implements RCTD, which learns cell type profiles from annotated RNA sequencing (RNA-seq) reference data and uses these profiles to identify cell types in spatial transcriptomic pixels while accounting for platform-specific effects. Future releases will include an implementation of CSIDE. |
| Authors: | Dylan Cable [aut], Rafael Irizarry [aut] (ORCID: <https://orcid.org/0000-0002-3944-4309>), Gabriel Grajeda [cre] (ORCID: <https://orcid.org/0009-0003-7242-7476>), Fannie and John Hertz Foundation [fnd] |
| Maintainer: | Gabriel Grajeda <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.5.0 |
| Built: | 2026-05-30 09:47:54 UTC |
| Source: | https://github.com/bioc/spacexr |
Spatial-eXpression-R (spacexr) is a package for analyzing cell types in spatial transcriptomics data. This implementation is a fork of the spacexr GitHub repo (https://github.com/dmcable/spacexr), adapted to work with Bioconductor objects. The original package implements two statistical methods: RCTD for learning cell types and CSIDE for inferring cell type-specific differential expression. Currently, this fork only implements RCTD, which learns cell type profiles from annotated RNA sequencing (RNA-seq) reference data and uses these profiles to identify cell types in spatial transcriptomic pixels while accounting for platform-specific effects. Future releases will include an implementation of CSIDE.
To get started, create a SpatialExperiment
object (called spatial here) for the spatial transcriptomics data
and a SummarizedExperiment object
(called reference here) for the RNA-seq data. Then simply run RCTD
as:
rctd_data <- createRctd(spatial, reference)
results <- runRctd(rctd_data)
Maintainer: Gabriel Grajeda [email protected] (ORCID)
Authors:
Dylan Cable [email protected]
Other contributors:
Fannie and John Hertz Foundation [funder]
Useful links:
Performs initial preprocessing steps on a spatial transcriptomics dataset
and a reference dataset prior to running RCTD. This function filters pixels
and genes based on UMI counts and other thresholds, and identifies
differentially expressed genes. The output of this function should be passed
to runRctd to perform the cell type deconvolution.
createRctd( spatial_experiment, reference_experiment, cell_type_col = "cell_type", require_int = TRUE, gene_cutoff = 0.000125, fc_cutoff = 0.5, gene_cutoff_reg = 2e-04, fc_cutoff_reg = 0.75, gene_obs_min = 3, pixel_count_min = 10, UMI_min = 100, UMI_max = 2e+07, UMI_min_sigma = 300, ref_UMI_min = 100, ref_n_cells_min = 25, ref_n_cells_max = 10000, cell_type_profiles = NULL, class_df = NULL, cell_type_names = NULL )createRctd( spatial_experiment, reference_experiment, cell_type_col = "cell_type", require_int = TRUE, gene_cutoff = 0.000125, fc_cutoff = 0.5, gene_cutoff_reg = 2e-04, fc_cutoff_reg = 0.75, gene_obs_min = 3, pixel_count_min = 10, UMI_min = 100, UMI_max = 2e+07, UMI_min_sigma = 300, ref_UMI_min = 100, ref_n_cells_min = 25, ref_n_cells_max = 10000, cell_type_profiles = NULL, class_df = NULL, cell_type_names = NULL )
spatial_experiment |
|
reference_experiment |
|
cell_type_col |
character, name of the entry in
|
require_int |
logical, whether counts and nUMI are required to be
integers (default: |
gene_cutoff |
numeric, minimum normalized gene expression for genes to be included in the platform effect normalization step (default: 0.000125) |
fc_cutoff |
numeric, minimum log fold change (across cell types) for genes to be included in the platform effect normalization step (default: 0.5) |
gene_cutoff_reg |
numeric, minimum normalized gene expression for genes to be included in the RCTD step (default: 0.0002) |
fc_cutoff_reg |
numeric, minimum log fold change (across cell types) for genes to be included in the RCTD step (default: 0.75) |
gene_obs_min |
numeric, minimum number of times a gene must appear in the spatial transcriptomics data to be included in the analysis (default: 3) |
pixel_count_min |
numeric, minimum total gene count for a pixel to be included in the analysis (default: 10) |
UMI_min |
numeric, minimum UMI count per pixel (default: 100) |
UMI_max |
numeric, maximum UMI count per pixel (default: 20,000,000) |
UMI_min_sigma |
numeric, minimum UMI count for pixels used in platform effect normalization (default: 300) |
ref_UMI_min |
numeric, minimum UMI count for cells to be included in the reference (default: 100) |
ref_n_cells_min |
numeric, minimum number of cells per cell type in the reference (default: 25) |
ref_n_cells_max |
numeric, maximum number of cells per cell type in the reference. Will downsample if this number is exceeded. (default: 10,000) |
cell_type_profiles |
matrix of precomputed cell type expression profiles
(genes by cell type), optional. If this option is used, gene names and cell
type names must be present in the |
class_df |
data frame mapping cell types to classes, optional. If specified, RCTD will report confidence on the class level. |
cell_type_names |
character vector of cell type names to include, optional |
A list with four elements:
spatial_experiment: Preprocessed
SummarizedExperiment object
containing spatial transcriptomics data with filtered pixels and genes
cell_type_info: List containing cell type information,
including expression profiles and metadata
internal_vars: List of internal variables used by RCTD,
including differentially expressed gene lists and class information
config: List of configuration parameters used for RCTD
data(rctdSim) # Spatial transcriptomics data library(SpatialExperiment) spatial_spe <- SpatialExperiment( assay = rctdSim$spatial_rna_counts, spatialCoords = rctdSim$spatial_rna_coords ) # Reference data library(SummarizedExperiment) reference_se <- SummarizedExperiment( assays = list(counts = rctdSim$reference_counts), colData = rctdSim$reference_cell_types ) # Filter spatial transcriptomics data and aggregate reference data rctd_data <- createRctd(spatial_spe, reference_se) # Run RCTD on filtered data results <- runRctd(rctd_data, rctd_mode = "doublet", max_cores = 1) # Access the cell type proportions (cell types as rows, pixels as columns) assay(results, "weights") # Check spot classifications for doublet mode colData(results)$spot_class # Access spatial coordinates head(spatialCoords(results))data(rctdSim) # Spatial transcriptomics data library(SpatialExperiment) spatial_spe <- SpatialExperiment( assay = rctdSim$spatial_rna_counts, spatialCoords = rctdSim$spatial_rna_coords ) # Reference data library(SummarizedExperiment) reference_se <- SummarizedExperiment( assays = list(counts = rctdSim$reference_counts), colData = rctdSim$reference_cell_types ) # Filter spatial transcriptomics data and aggregate reference data rctd_data <- createRctd(spatial_spe, reference_se) # Run RCTD on filtered data results <- runRctd(rctd_data, rctd_mode = "doublet", max_cores = 1) # Access the cell type proportions (cell types as rows, pixels as columns) assay(results, "weights") # Check spot classifications for doublet mode colData(results)$spot_class # Access spatial coordinates head(spatialCoords(results))
Generates a visualization where each pixel is represented by a pie chart
showing the proportions of different cell types at that location. Users
should run this function on the result of runRctd.
plotAllWeights( rctd_spe, assay_name = "weights", cell_type_colors = NA, r = 0.4, lwd = 1, title = NA )plotAllWeights( rctd_spe, assay_name = "weights", cell_type_colors = NA, r = 0.4, lwd = 1, title = NA )
rctd_spe |
|
assay_name |
character, name of the assay to plot
(default: |
cell_type_colors |
vector of colors for the different cell types (default: rainbow) |
r |
numeric, radius of the pie charts (default: 0.4) |
lwd |
numeric, line width of the pie chart borders (default: 1) |
title |
character, plot title (default: |
This function is adapted from vizAllTopics in the STdeconvolve
package.
ggplot object showing cell type proportions at each pixel
using pie charts
data(rctdSim) # In practice, results_spe should contain the results of an RCTD run. results_spe <- rctdSim$proportions_spe plotAllWeights( results_spe, r = 0.05, lwd = 0.5, title = "Cell Type Proportions" )data(rctdSim) # In practice, results_spe should contain the results of an RCTD run. results_spe <- rctdSim$proportions_spe plotAllWeights( results_spe, r = 0.05, lwd = 0.5, title = "Cell Type Proportions" )
Creates a visualization showing how the proportion of a specific cell type
varies across space, represented by point color intensity. Users should run
this function on the result of runRctd.
plotCellTypeWeight( rctd_spe, cell_type, assay_name = "weights", size = 10, stroke = 1, alpha = 1, low = "white", high = "red", title = NA )plotCellTypeWeight( rctd_spe, cell_type, assay_name = "weights", size = 10, stroke = 1, alpha = 1, low = "white", high = "red", title = NA )
rctd_spe |
|
cell_type |
character, name of cell type to plot |
assay_name |
character, name of the assay to plot
(default: |
size |
numeric, size of the points (default: 10) |
stroke |
numeric, border width of the points (default: 1) |
alpha |
numeric, point transparency between 0 and 1 (default: 1) |
low |
color for the low end of the proportion color scale
(default: |
high |
color for the high end of the proportion color scale
(default: |
title |
character, plot title (default: |
This function is adapted from vizTopic in the STdeconvolve
package.
ggplot object showing the proportion of a specified cell type
at each pixel
data(rctdSim) # In practice, results_spe should contain the results of an RCTD run. results_spe <- rctdSim$proportions_spe plotCellTypeWeight( results_spe, "ct1", size = 5, title = "Cell Type Density (ct1)" )data(rctdSim) # In practice, results_spe should contain the results of an RCTD run. results_spe <- rctdSim$proportions_spe plotCellTypeWeight( results_spe, "ct1", size = 5, title = "Cell Type Density (ct1)" )
A simulated dataset containing both reference single-cell RNA-seq data and spatial transcriptomics data. The dataset includes 750 genes across 3 cell types, with 50% of genes being differentially expressed between cell types. The spatial data consists of two kinds of cell type mixtures, documented below.
data(rctdSim)data(rctdSim)
A list containing five components:
A matrix of simulated reference counts with 750
rows (genes) and 75 columns (25 samples per cell type). Gene names are of
the form "g1", "g2", etc.
A data frame specifying the cell type
("ct1", "ct2", "ct3") for each reference sample.
A matrix with columns x and y
giving the coordinates of each spatial transcriptomics pixel.
A matrix of simulated spatial transcriptomics counts with 750 rows (genes) and 12 columns (spatial locations).
A
SpatialExperiment object containing
the true cell type proportions for each spatial location. The
weights assay contains a matrix with 3 rows (cell types) and 12
columns (spatial locations).
The dataset was generated using the following parameters:
750 genes, with 50% probability of differential expression
3 cell types with 25 reference samples each
12 spatial locations total:
6 locations with mixture type 1 (90% ct1, 10% ct2)
6 locations with mixture type 2 (20% ct1, 40% ct2,
40% ct3)
Base expression levels were sampled uniformly between 0 and 10. Differentially expressed genes were randomly selected to be either up-regulated (2x) or down-regulated (0.5x) in specific cell types. Final counts were generated using a Poisson distribution.
data(rctdSim) # Spatial transcriptomics data library(SpatialExperiment) spatial_spe <- SpatialExperiment( assay = rctdSim$spatial_rna_counts, spatialCoords = rctdSim$spatial_rna_coords ) # Reference data library(SummarizedExperiment) reference_se <- SummarizedExperiment( assays = list(counts = rctdSim$reference_counts), colData = rctdSim$reference_cell_types ) # Access true cell type proportions true_proportions <- assay(rctdSim$proportions_spe, "weights")data(rctdSim) # Spatial transcriptomics data library(SpatialExperiment) spatial_spe <- SpatialExperiment( assay = rctdSim$spatial_rna_counts, spatialCoords = rctdSim$spatial_rna_coords ) # Reference data library(SummarizedExperiment) reference_se <- SummarizedExperiment( assays = list(counts = rctdSim$reference_counts), colData = rctdSim$reference_cell_types ) # Access true cell type proportions true_proportions <- assay(rctdSim$proportions_spe, "weights")
Robust Cell Type Decomposition (RCTD) is a computational method for deconvolving cell type mixtures in spatial transcriptomics data. RCTD learns cell type profiles from annotated RNA sequencing (RNA-seq) reference data and uses these profiles to identify cell types in spatial transcriptomic pixels while accounting for platform-specific effects. The RCTD algorithm has three modes suited for different spatial technologies:
doublet: Fits at most two cell types per pixel and classifies
each pixel as a "singlet" (one cell type) or "doublet"
(two cell types). Best for high spatial resolution technologies like
Slide-seq or MERFISH, where pixels are more likely to contain only 1 or
2 cells.
multi: Uses a greedy algorithm to fit up to
max_multi_types cell types per pixel (default: 4). Best for lower
resolution technologies like 100-micron Visium spots, which can contain
more cell types.
full: Fits any number of cell types per pixel without
restrictions.
runRctd( rctd_data, rctd_mode = c("doublet", "multi", "full"), max_cores = 4, max_multi_types = 4, confidence_threshold = 5, doublet_threshold = 20 )runRctd( rctd_data, rctd_mode = c("doublet", "multi", "full"), max_cores = 4, max_multi_types = 4, confidence_threshold = 5, doublet_threshold = 20 )
rctd_data |
list containing |
rctd_mode |
character string specifying the RCTD mode:
|
max_cores |
numeric, maximum number of cores to use for parallel processing (default: 4) |
max_multi_types |
numeric, maximum number of cell types per pixel in multi mode (default: 4) |
confidence_threshold |
numeric, minimum change in likelihood (compared to other cell types) necessary to determine a cell type identity with confidence (default: 5) |
doublet_threshold |
numeric, penalty weight of predicting a doublet instead of a singlet for a pixel (default: 20) |
A SpatialExperiment object
containing the RCTD results with:
Three assays (one in full mode):
weights: Cell type proportions restricted according to
the specified mode
weights_unconfident: Cell type proportions restricted
according to the specified mode, including unconfident predictions
(not available in full mode)
weights_full: Unrestricted cell type proportions (not
available in full mode, use weights instead)
Assays have cell types as rows and pixels as columns, with values representing the proportion (0 to 1) of each cell type in each pixel. Assay columns sum to 1 (except for rejected pixels, which sum to 0).
spatialCoords containing spatial coordinates for each pixel
colData containing:
For doublet mode:
spot_class: Classification as "singlet",
"doublet_certain", "doublet_uncertain", or
"reject"
first_type, second_type: Predicted cell types
first_class, second_class: Whether
predictions were made at the class level
Additional metrics like min_score,
singlet_score
For multi mode:
cell_type_list: List of cell types per pixel
conf_list: List of whether cell type predictions
are confident
Additional metrics like min_score
data(rctdSim) # Spatial transcriptomics data library(SpatialExperiment) spatial_spe <- SpatialExperiment( assay = rctdSim$spatial_rna_counts, spatialCoords = rctdSim$spatial_rna_coords ) # Reference data library(SummarizedExperiment) reference_se <- SummarizedExperiment( assays = list(counts = rctdSim$reference_counts), colData = rctdSim$reference_cell_types ) # Filter spatial transcriptomics data and aggregate reference data rctd_data <- createRctd(spatial_spe, reference_se) # Run RCTD on filtered data results <- runRctd(rctd_data, rctd_mode = "doublet", max_cores = 1) # Access the cell type proportions (cell types as rows, pixels as columns) assay(results, "weights") # Check spot classifications for doublet mode colData(results)$spot_class # Access spatial coordinates head(spatialCoords(results))data(rctdSim) # Spatial transcriptomics data library(SpatialExperiment) spatial_spe <- SpatialExperiment( assay = rctdSim$spatial_rna_counts, spatialCoords = rctdSim$spatial_rna_coords ) # Reference data library(SummarizedExperiment) reference_se <- SummarizedExperiment( assays = list(counts = rctdSim$reference_counts), colData = rctdSim$reference_cell_types ) # Filter spatial transcriptomics data and aggregate reference data rctd_data <- createRctd(spatial_spe, reference_se) # Run RCTD on filtered data results <- runRctd(rctd_data, rctd_mode = "doublet", max_cores = 1) # Access the cell type proportions (cell types as rows, pixels as columns) assay(results, "weights") # Check spot classifications for doublet mode colData(results)$spot_class # Access spatial coordinates head(spatialCoords(results))