Title: | Spatially Informed Cell Type Deconvolution for Spatial Transcriptomics |
---|---|
Description: | CARD is a reference-based deconvolution method that estimates cell type composition in spatial transcriptomics based on cell type specific expression information obtained from a reference scRNA-seq data. A key feature of CARD is its ability to accommodate spatial correlation in the cell type composition across tissue locations, enabling accurate and spatially informed cell type deconvolution as well as refined spatial map construction. CARD relies on an efficient optimization algorithm for constrained maximum likelihood estimation and is scalable to spatial transcriptomics with tens of thousands of spatial locations and tens of thousands of genes. |
Authors: | Ying Ma [aut], Jing Fu [cre] |
Maintainer: | Jing Fu <[email protected]> |
License: | GPL-3 + file LICENSE |
Version: | 0.99.5 |
Built: | 2025-02-22 03:30:01 UTC |
Source: | https://github.com/bioc/CARDspa |
The function to assign the spatial location information for each single cell
assign_sc_cords(mappint_spot_cell_cor, cords_new, numcell, sc_eset, ct_varname)
assign_sc_cords(mappint_spot_cell_cor, cords_new, numcell, sc_eset, ct_varname)
mappint_spot_cell_cor |
a mapped correlation matrix indicating the relashionship between each measured spatial location and the single cell in the scRNAseq reference |
cords_new |
output from the function get_high_res_cords |
numcell |
a numeric value indicating the number of single cells in each measured location, we suggest 20 for ST technology, 7 for 10x Viisum and 2 for Slide-seq |
sc_eset |
a single cell experiment object stored in CARD object |
ct_varname |
character, the name of the column in metaData that specifies the cell type annotation information, stroed in CARD object |
Return the assigned spatial location information for the mapped single cell
Spatially Informed Cell Type Deconvolution for Spatial Transcriptomics by CARD
CARD_deconvolution( sc_count, sc_meta, spatial_count, spatial_location, ct_varname, ct_select, sample_varname, mincountgene = 100, mincountspot = 5, sce = NULL, spe = NULL )
CARD_deconvolution( sc_count, sc_meta, spatial_count, spatial_location, ct_varname, ct_select, sample_varname, mincountgene = 100, mincountspot = 5, sce = NULL, spe = NULL )
sc_count |
Raw scRNA-seq count data, each column is a cell and each row is a gene. |
sc_meta |
data frame, with each row representing the cell type and/or sample information of a specific cell. The row names of this data frame should match exactly with the column names of the sc_count data |
spatial_count |
Raw spatial resolved transcriptomics data, each column is a spatial location, and each row is a gene. |
spatial_location |
data frame, with two columns representing the x and y coordinates of the spatial location. The rownames of this data frame should match eaxctly with the columns of the spatial_count. |
ct_varname |
character, the name of the column in metaData that specifies the cell type annotation information |
ct_select |
vector of cell type names that you are interested in to deconvolute, default as NULL. If NULL, then use all cell types provided by single cell dataset; |
sample_varname |
character,the name of the column in metaData that specifies the sample information. If NULL, we just use the whole as one sample. |
mincountgene |
Minimum counts for each gene |
mincountspot |
Minimum counts for each spatial location |
sce |
a |
spe |
a |
Returns a SpatialExperiment object with estimated cell type proportion stored in object$Proportion_CARD.
library(RcppML) library(NMF) library(RcppArmadillo) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 )
library(RcppML) library(NMF) library(RcppArmadillo) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 )
Construct an enhanced spatial expression map on the unmeasured tissue locations
CARD_imputation(CARD_object, num_grids, ineibor = 10, exclude = NULL)
CARD_imputation(CARD_object, num_grids, ineibor = 10, exclude = NULL)
CARD_object |
SpatialExperiment Object created by CARD_deconvolution with estimated cell type compositions on the original spatial resolved transcriptomics data. |
num_grids |
Initial number of newly grided spatial locations. The final number of newly grided spatial locations will be lower than this value since the newly grided locations outside the shape of the tissue will be filtered |
ineibor |
Numeric, number of neighbors used in the imputation on newly grided spatial locations, default is 10. |
exclude |
Vector, the rownames of spatial location data on the original resolution that you want to exclude. This is to avoid the weird detection of the shape. |
Return a SpatialExperiment object with the refined cell type compositions estimated for newly grided spots and the refined predicted gene expression (normalized).
data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) CARD_obj <- CARD_imputation( CARD_obj, num_grids = 200, ineibor = 10, exclude = NULL )
data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) CARD_obj <- CARD_imputation( CARD_obj, num_grids = 200, ineibor = 10, exclude = NULL )
Extension of CARD into a reference-free version of deconvolution: CARDfree.
CARD_refFree( markerlist, spatial_count, spatial_location, mincountgene = 100, mincountspot = 5, spe = NULL )
CARD_refFree( markerlist, spatial_count, spatial_location, mincountgene = 100, mincountspot = 5, spe = NULL )
markerlist |
a list of marker genes, with each element of the list being the vector of cell type specific marker genes |
spatial_count |
Raw spatial resolved transcriptomics data, each column is a spatial location, and each row is a gene. |
spatial_location |
data frame, with two columns representing the x and y coordinates of the spatial location. The rownames of this data frame should match eaxctly with the columns of the spatial_count. |
mincountgene |
Minimum counts for each gene |
mincountspot |
Minimum counts for each spatial location |
spe |
a |
Returns a SpatialExperiment object with estimated cell type proportion stored in object$Proportion_CARD. Because this is a reference-free version, the columns of estimated proportion is not cell type but cell type cluster
library(RcppML) library(NMF) library(RcppArmadillo) data(markerList) data(spatial_count) data(spatial_location) CARDfree_obj <- CARD_refFree( markerlist = markerList[8:16], spatial_count = spatial_count[1:2500, ], spatial_location = spatial_location, mincountgene = 100, mincountspot = 5 )
library(RcppML) library(NMF) library(RcppArmadillo) data(markerList) data(spatial_count) data(spatial_location) CARDfree_obj <- CARD_refFree( markerlist = markerList[8:16], spatial_count = spatial_count[1:2500, ], spatial_location = spatial_location, mincountgene = 100, mincountspot = 5 )
Extension of CARD into performing single cell Mapping from non-single cell spatial transcriptomics dataset.
CARD_scmapping(CARD_object, shapeSpot = "Square", numcell, ncore = 10)
CARD_scmapping(CARD_object, shapeSpot = "Square", numcell, ncore = 10)
CARD_object |
CARD object create by the CARD_deconvolution function. |
shapeSpot |
a character indicating whether the sampled spatial coordinates for single cells locating in a Square-like region or a Circle-like region. The center of this region is the measured spatial location in the non-single cell resolution spatial transcriptomics data. The default is 'Square', the other shape is 'Circle' |
numcell |
a numeric value indicating the number of single cells in each measured location, we suggest 20 for ST technology, 7 for 10x Viisum and 2 for Slide-seq |
ncore |
a numeric value indicating the number of cores used to accelerating the procedure |
Returns a SingleCellExperiment SCE object with the mapped expression at single cell resolution and the spatial location information of each single cell
library(SingleCellExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) scMapping <- CARD_scmapping( CARD_obj, shapeSpot = "Square", numcell = 20, ncore = 2) print(scMapping)
library(SingleCellExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) scMapping <- CARD_scmapping( CARD_obj, shapeSpot = "Square", numcell = 20, ncore = 2) print(scMapping)
Visualize the cell type proportion correlation
CARD_visualize_Cor(proportion, colors = colors)
CARD_visualize_Cor(proportion, colors = colors)
proportion |
Data frame, cell type proportion estimated by CARD in either original resolution or enhanced resolution. |
colors |
Vector of color names that you want to use, if NULL, we will use the default color scale c("#91a28c","white","#8f2c37") |
Returns a ggcorrplot figure.
library(ggplot2) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) CARD_visualize_Cor(CARD_obj$Proportion_CARD, colors = NULL)
library(ggplot2) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) CARD_visualize_Cor(CARD_obj$Proportion_CARD, colors = NULL)
Visualize the spatial distribution of cell type proportion
CARD_visualize_gene( spatial_expression, spatial_location, gene_visualize, colors = colors, NumCols )
CARD_visualize_gene( spatial_expression, spatial_location, gene_visualize, colors = colors, NumCols )
spatial_expression |
Data frame, spatial gene expression in either original resolution or enhanced resolution. |
spatial_location |
Data frame, spatial location information. |
gene_visualize |
Vector of selected gene names that are interested to visualize |
colors |
Vector of color names that you want to use, if NULL, we will use the default color scale in virdis palette |
NumCols |
Numeric, number of columns in the figure panel, it depends on the number of cell types you want to visualize. |
Returns a ggplot2 figure.
library(ggplot2) library(SummarizedExperiment) library(SpatialExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) CARD_visualize_gene( spatial_expression = assays(CARD_obj)$spatial_countMat, spatial_location = spatialCoords(CARD_obj), gene_visualize = c("A4GNT", "AAMDC", "CD248"), colors = NULL, NumCols = 3 )
library(ggplot2) library(SummarizedExperiment) library(SpatialExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) CARD_visualize_gene( spatial_expression = assays(CARD_obj)$spatial_countMat, spatial_location = spatialCoords(CARD_obj), gene_visualize = c("A4GNT", "AAMDC", "CD248"), colors = NULL, NumCols = 3 )
Visualize the spatial distribution of cell type proportion in a geom scatterpie plot
CARD_visualize_pie(proportion, spatial_location, colors = NULL, radius = NULL)
CARD_visualize_pie(proportion, spatial_location, colors = NULL, radius = NULL)
proportion |
Data frame, cell type proportion estimated by CARD in either original resolution or enhanced resolution. |
spatial_location |
Data frame, spatial location information. |
colors |
Vector of color names that you want to use, if NULL, we will use the color palette "Spectral" from RColorBrewer package. |
radius |
Numeric value about the radius of each pie chart, if NULL, we will calculate it inside the function. |
Returns a ggplot2 figure.
library(ggplot2) library(SpatialExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) colors <- c( "#FFD92F", "#4DAF4A", "#FCCDE5", "#D9D9D9", "#377EB8", "#7FC97F", "#BEAED4", "#FDC086", "#FFFF99", "#386CB0", "#F0027F", "#BF5B17", "#666666", "#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E", "#E6AB02", "#A6761D" ) CARD_visualize_pie( proportion = CARD_obj$Proportion_CARD, spatial_location = spatialCoords(CARD_obj), colors = colors, radius = 0.52 )
library(ggplot2) library(SpatialExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) colors <- c( "#FFD92F", "#4DAF4A", "#FCCDE5", "#D9D9D9", "#377EB8", "#7FC97F", "#BEAED4", "#FDC086", "#FFFF99", "#386CB0", "#F0027F", "#BF5B17", "#666666", "#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E", "#E6AB02", "#A6761D" ) CARD_visualize_pie( proportion = CARD_obj$Proportion_CARD, spatial_location = spatialCoords(CARD_obj), colors = colors, radius = 0.52 )
Visualize the spatial distribution of cell type proportion
CARD_visualize_prop( proportion, spatial_location, ct_visualize = ct_visualize, colors = c("lightblue", "lightyellow", "red"), NumCols, pointSize = 3 )
CARD_visualize_prop( proportion, spatial_location, ct_visualize = ct_visualize, colors = c("lightblue", "lightyellow", "red"), NumCols, pointSize = 3 )
proportion |
Data frame, cell type proportion estimated by CARD in either original resolution or enhanced resolution. |
spatial_location |
Data frame, spatial location information. |
ct_visualize |
Vector of selected cell type names that are interested to visualize |
colors |
Vector of color names that you want to use, if NULL, we will use the default color scale c("lightblue","lightyellow","red") |
NumCols |
Numeric, number of columns in the figure panel, it depends on the number of cell types you want to visualize. |
pointSize |
Size of each point used for plotting |
Returns a ggplot2 figure.
library(ggplot2) library(SpatialExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) ct_visualize <- c( "Acinar_cells", "Cancer_clone_A", "Cancer_clone_B", "Ductal_terminal_ductal_like", "Ductal_CRISP3_high-centroacinar_like", "Ductal_MHC_Class_II", "Ductal_APOL1_high-hypoxic", "Fibroblasts" ) CARD_visualize_prop( proportion = CARD_obj$Proportion_CARD, spatial_location = spatialCoords(CARD_obj), ct_visualize = ct_visualize, colors = c("lightblue", "lightyellow", "red"), NumCols = 4, pointSize = 3.0 )
library(ggplot2) library(SpatialExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) ct_visualize <- c( "Acinar_cells", "Cancer_clone_A", "Cancer_clone_B", "Ductal_terminal_ductal_like", "Ductal_CRISP3_high-centroacinar_like", "Ductal_MHC_Class_II", "Ductal_APOL1_high-hypoxic", "Fibroblasts" ) CARD_visualize_prop( proportion = CARD_obj$Proportion_CARD, spatial_location = spatialCoords(CARD_obj), ct_visualize = ct_visualize, colors = c("lightblue", "lightyellow", "red"), NumCols = 4, pointSize = 3.0 )
Visualize the spatial distribution of two cell type proportions on the same plot
CARD_visualize_prop_2CT( proportion, spatial_location, ct2_visualize = ct2_visualize, colors = NULL )
CARD_visualize_prop_2CT( proportion, spatial_location, ct2_visualize = ct2_visualize, colors = NULL )
proportion |
Data frame, cell type proportion estimated by CARD in either original resolution or enhanced resolution. |
spatial_location |
Data frame, spatial location information. |
ct2_visualize |
Vector of selected two cell type names that are interested to visualize, here we only focus on two cell types |
colors |
list of color names that you want to use for each cell type, if NULL, we will use the default color scale list list(c("lightblue","lightyellow","red"),c("lightblue","lightyellow","black") |
Returns a ggplot2 figure.
library(ggplot2) library(SpatialExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) CARD_visualize_prop_2CT( proportion = CARD_obj$Proportion_CARD, spatial_location = spatialCoords(CARD_obj), ct2_visualize = c("Cancer_clone_A", "Cancer_clone_B"), colors = list(c("lightblue", "lightyellow", "red"), c( "lightblue", "lightyellow", "black" )) )
library(ggplot2) library(SpatialExperiment) data(spatial_count) data(spatial_location) data(sc_count) data(sc_meta) CARD_obj <- CARD_deconvolution( sc_count = sc_count, sc_meta = sc_meta, spatial_count = spatial_count, spatial_location = spatial_location, ct_varname = "cellType", ct_select = unique(sc_meta$cellType), sample_varname = "sampleInfo", mincountgene = 100, mincountspot = 5 ) CARD_visualize_prop_2CT( proportion = CARD_obj$Proportion_CARD, spatial_location = spatialCoords(CARD_obj), ct2_visualize = c("Cancer_clone_A", "Cancer_clone_B"), colors = list(c("lightblue", "lightyellow", "red"), c( "lightblue", "lightyellow", "black" )) )
Each CARD object has a number of slots which store information. Key slots to access are listed below.
Return an object of CARD class
sc_eset
The filtered scRNA-seq data along with meta data stored in the format of SingleCellExperiment.
spatial_countMat
The filtered spatial count data.
spatial_location
The weights for combining p-values from multiple kernels.
Proportion_CARD
The estimated cell type proportion by CARD with each row is a spatial location and each column is a cell type.
project
The name of the project, default is deconvolution.
info_parameters
The paramters that are used in model fitting.
algorithm_matrix
The intermediate matrices that are used in the model fitting step.
refined_prop
The refined cell type proportion matrix estimated by CARD for the newly grided spatial locations. The number of initial grids are defined by the user.
refined_expression
The refined predicted expression matrix (normalized) estimated by CARD for the newly grided spatial locations. The number of initial grids are defined by the user.
SpatialDeconv function based on Conditional Autoregressive model
CARDfree( XinputIn, UIn, WIn, phiIn, max_iterIn, epsilonIn, initV, initb, initSigma_e2, initLambda )
CARDfree( XinputIn, UIn, WIn, phiIn, max_iterIn, epsilonIn, initV, initb, initSigma_e2, initLambda )
XinputIn |
The input of normalized spatial data |
UIn |
The input of cell type specific basis matrix B |
WIn |
The constructed W weight matrix from Gaussian kernel |
phiIn |
The phi value |
max_iterIn |
Maximum iterations |
epsilonIn |
epsilon for convergence |
initV |
Initial matrix of cell type compositions V |
initb |
Initial vector of cell type specific intercept |
initSigma_e2 |
Initial value of residual variance |
initLambda |
Initial vector of cell type sepcific scalar. |
A list
Each CARDfree object has a number of slots which store information. Key slots to access are listed below.
Return an object of CARDfree class
spatial_countMat
The filtered spatial count data.
spatial_location
The weights for combining p-values from multiple kernels.
Proportion_CARD
The estimated cell type proportion by CARD with each row is a spatial location and each column is a cell type.
estimated_refMatrix
The estimated reference matrix by CARDfree with each row represents a gene and each column represents a cell type cluster.
project
The name of the project, default is deconvolution.
markerList
The nlist of cell type specific markers, with each element represents the vector of cell type specific markers
info_parameters
The paramters that are used in model fitting.
algorithm_matrix
The intermediate matrices that are used in the model fitting step.
refined_prop
The refined cell type proportion matrix estimated by CARD for the newly grided spatial locations. The number of initial grids are defined by the user.
refined_expression
The refined predicted expression matrix (normalized) estimated by CARD for the newly grided spatial locations. The number of initial grids are defined by the user.
SpatialDeconv function based on Conditional Autoregressive model
CARDref( XinputIn, UIn, WIn, phiIn, max_iterIn, epsilonIn, initV, initb, initSigma_e2, initLambda )
CARDref( XinputIn, UIn, WIn, phiIn, max_iterIn, epsilonIn, initV, initb, initSigma_e2, initLambda )
XinputIn |
The input of normalized spatial data |
UIn |
The input of cell type specific basis matrix B |
WIn |
The constructed W weight matrix from Gaussian kernel |
phiIn |
The phi value |
max_iterIn |
Maximum iterations |
epsilonIn |
epsilon for convergence |
initV |
Initial matrix of cell type compositions V |
initb |
Initial vector of cell type specific intercept |
initSigma_e2 |
Initial value of residual variance |
initLambda |
Initial vector of cell type sepcific scalar. |
A list
Construct the mean gene expression basis matrix (B), this is the faster version
create_ref(sc_eset, ct_select = NULL, ct_varname, sample_varname = NULL)
create_ref(sc_eset, ct_select = NULL, ct_varname, sample_varname = NULL)
sc_eset |
S4 class for storing data from single-cell experiments. This format is usually created by the package SingleCellExperiment with stored counts, along with the usual metadata for genes and cells. |
ct_select |
vector of cell type names that you are interested in to deconvolute, default as NULL. If NULL, then use all cell types provided by single cell dataset; |
ct_varname |
character, the name of the column in metaData that specifies the cell type annotation information |
sample_varname |
character,the name of the column in metaData that specifies the sample information. If NULL, we just use the whole as one sample. |
Return a list of basis (B) matrix
Create the CARD object
createCARDfreeObject( markerlist, spatial_count, spatial_location, mincountgene = 100, mincountspot = 5, spe = NULL )
createCARDfreeObject( markerlist, spatial_count, spatial_location, mincountgene = 100, mincountspot = 5, spe = NULL )
markerlist |
a list of marker genes, with each element of the list being the vector of cell type specific marker genes |
spatial_count |
Raw spatial resolved transcriptomics data, each column is a spatial location, and each row is a gene. |
spatial_location |
data frame, with two columns representing the x and y coordinates of the spatial location. The rownames of this data frame should match eaxctly with the columns of the spatial_count. |
mincountgene |
Minimum counts for each gene |
mincountspot |
Minimum counts for each spatial location |
spe |
a |
Returns CARDfree object with filtered spatial count and marker gene list.
Create the CARD object
createCARDObject( sc_count, sc_meta, spatial_count, spatial_location, ct_varname, ct_select, sample_varname, mincountgene = 100, mincountspot = 5, sce = NULL, spe = NULL )
createCARDObject( sc_count, sc_meta, spatial_count, spatial_location, ct_varname, ct_select, sample_varname, mincountgene = 100, mincountspot = 5, sce = NULL, spe = NULL )
sc_count |
Raw scRNA-seq count data, each column is a cell and each row is a gene. |
sc_meta |
data frame, with each row representing the cell type and/or sample information of a specific cell. The row names of this data frame should match exactly with the column names of the sc_count data |
spatial_count |
Raw spatial resolved transcriptomics data, each column is a spatial location, and each row is a gene. |
spatial_location |
data frame, with two columns representing the x and y coordinates of the spatial location. The rownames of this data frame should match eaxctly with the columns of the spatial_count. |
ct_varname |
character, the name of the column in metadata that specifies the cell type annotation information |
ct_select |
vector of cell type names that you are interested in to deconvolute, default as NULL. If NULL, then use all cell types provided by single cell dataset; |
sample_varname |
character,the name of the column in metadata that specifies the sample information. If NULL, we just use the whole as one sample. |
mincountgene |
Minimum counts for each gene |
mincountspot |
Minimum counts for each spatial location |
sce |
a |
spe |
a |
Returns CARD object with filtered spatial count and single cell RNA-seq dataset.
The function to sample the spatial location information for each single cell
get_high_res_cords(cords, numcell, shape = "Square")
get_high_res_cords(cords, numcell, shape = "Square")
cords |
The spatial location information in the measure spatial locations, with the first and second columns represent the 2-D x-y coordinate system |
numcell |
a numeric value indicating the number of single cells in each measured location, we suggest 20 for ST technology, 7 for 10x Viisum and 2 for Slide-seq |
shape |
a character indicating whether the sampled spatial coordinates for single cells locating in a Square-like region or a Circle-like region. The center of this region is the measured spatial location in the non-single cell resolution spatial transcriptomics data. The default is 'Square', the other shape is 'Circle' |
Returns a dataframe with the sampled spatial location information for each single cell
The function to estimate the cell type composition signature for each single cell in the scRNaseq reference data
get_weight_for_cell(sc_eset, ct_varname, ct_select, sample_varname, B)
get_weight_for_cell(sc_eset, ct_varname, ct_select, sample_varname, B)
sc_eset |
the sc_eset stored in the CARD object |
ct_varname |
character, the name of the column in metaData that specifies the cell type annotation information, stored in the CARD object |
ct_select |
vector of cell type names that you are interested in to deconvolute, default as NULL. stored in the CARD object |
sample_varname |
character,the name of the column in metaData that specifies the sample information. stored in the CARD object |
B |
reference basis matrix stored in the CARD object. |
Returns a matrix of the cell type composition signature for each single cell in the scRNaseq reference
The marker gene list is a list format with each element of the list being the cell type specific gene markers.
data(markerList)
data(markerList)
An object of class list
of length 20.
Imputation and Construction of High-Resolution Spatial Maps for Cell Type Composition and Gene Expression by the spatial correlation structure between original spatial locations and new grided spatial locations
mvn_cv( vtrain, location_orig, train_ind, test_ind, B, xinput_norm, optimal_b, optimal_phi, lambda, ineibor )
mvn_cv( vtrain, location_orig, train_ind, test_ind, B, xinput_norm, optimal_b, optimal_phi, lambda, ineibor )
vtrain |
Matrix, estimated V matrix from CARD |
location_orig |
Data frame, spatial location data frame of the original spatial resolved transcriptomics dataset, stored in the spatialCoords(CARD_object) |
train_ind |
Vector, index of the original spatial locations |
test_ind |
Vector, index of the newly grided spatial locations |
B |
Matrix, used in the deconvolution as the reference basis matrix |
xinput_norm |
Matrix, used in the deconvolution as the normalized spatial count data |
optimal_b |
Vector, vector of the intercept for each cel type estimated based on the original spatial resolution |
optimal_phi |
Numeric, the optimal phi value stored in CARD_object |
lambda |
Vector, vector of cell type specific scalar in the CAR model |
ineibor |
Numeric, number of neighbors used in the imputation on newly grided spatial locations, default is 10. |
Return a list with the imputed Cell type composition Vtest matrix on the newly grided spatial locations and predicted normalized gene expression
Normalize the new spatial locations without changing the shape and relative positions
norm_coords_train_test(location_orig, train_ind, test_ind)
norm_coords_train_test(location_orig, train_ind, test_ind)
location_orig |
Data frame, spatial location data frame of the original spatial resolved transcriptomics dataset, stored in the spatialCoords(CARD_object) |
train_ind |
Vector, Index of the original spatial locations |
test_ind |
Vector, Index of the newly grided spatial locations |
Return the normalized spatial location data frame
Make new spatial locations on unmeasured tissue through grids.
sample_grid_within(location, num_sample, concavity = 2)
sample_grid_within(location, num_sample, concavity = 2)
location |
Data frame, spatial location data frame of the original spatial resolved transcriptomics dataset, stored in the spatialCoords(CARD_object) |
num_sample |
Numeric, approximate number of cells in grid within the shape of the spatial location data frame |
concavity |
Numeric, a relative measure of concavity. The default is 2.0, which can prodecure detailed enough shapes. Infinity results in a convex hull while 1 results in a more detailed shape. |
Return a list of data frame with newly grided points
The scRNA-seq count data must be in the format of matrix or sparseMatrix, while each row represents a gene and each column represents a cell.
data(sc_count)
data(sc_count)
An object of class dgCMatrix
with 7000 rows and 1926 columns.
The scRNAseq meta data must be in the format of data frame while each row represents a cell. The rownames of the scRNAseq meta data should match exactly with the column names of the scRNAseq count data. The sc_meta data must contain the column indicating the cell type assignment for each cell (e.g., “cellType” column in the example sc_meta data). Sample/subject information should be provided, if there is only one sample, we can add a column by sc_meta$sampleInfo = "sample1".
data(sc_meta)
data(sc_meta)
An object of class data.frame
with 1926 rows and 3 columns.
Quality control of scRNA-seq count data
sc_QC( counts_in, metadata, ct_varname, ct_select, sample_varname = NULL, min_cells = 0, min_genes = 0 )
sc_QC( counts_in, metadata, ct_varname, ct_select, sample_varname = NULL, min_cells = 0, min_genes = 0 )
counts_in |
Raw scRNAseq count data, each column is a cell and each row is a gene. |
metadata |
data frame, metadata with "ct_varname" specify the cell type annotation information and "sample_varname" specify the sample information |
ct_varname |
character, the name of the column in metadata that specifies the cell type annotation information |
ct_select |
vector of cell type names that you are interested in to deconvolute, default as NULL. If NULL, then use all cell types provided by single cell dataset; |
sample_varname |
character,the name of the column in metadata that specifies the sample information. If NULL, we just use the whole as one sample. |
min_cells |
numeric, we filtered out the non-expressed cells. |
min_genes |
numeric we filtered out the non-expressed genes |
Return the filtered scRNA-seq data and meta data stored in a S4 class (SingleCellExperiment)
Select Informative Genes used in the deconvolution
select_info(basis, sc_eset, commongene, ct_select, ct_varname)
select_info(basis, sc_eset, commongene, ct_select, ct_varname)
basis |
Reference basis matrix. |
sc_eset |
scRNAseq data along with meta data stored in the S4 class format (SingleCellExperiment). |
commongene |
common genes between scRNAseq count data and spatial resolved transcriptomics data. |
ct_select |
vector of cell type names that you are interested in to deconvolute, default as NULL. If NULL, then use all cell types provided by single cell dataset; |
ct_varname |
character, the name of the column in metaData that specifies the cell type annotation information |
a vector of informative genes selected
This method provides a concise summary of an object of class CARD
,
displaying key information including the project name, the number of spots,
the number of cell types, and a sample of the
Proportion_CARD
matrix.
## S4 method for signature 'CARD' show(object)
## S4 method for signature 'CARD' show(object)
object |
An object of class |
A concise summary of the CARD
object is printed to the
console.
This method provides a concise summary of an object of
class CARDfree
, displaying key information including the project
name, the number of spots, the number of cell types, and a sample of the
Proportion_CARD
matrix.
## S4 method for signature 'CARDfree' show(object)
## S4 method for signature 'CARDfree' show(object)
object |
An object of class |
A concise summary of the CARDfree
object is
printed to the console.
Calculate the variance covariance matrix used in the imputation of the new grided locations
Sigma(location_orig, train_ind, test_ind, optimal_phi, ineibor)
Sigma(location_orig, train_ind, test_ind, optimal_phi, ineibor)
location_orig |
Data frame, spatial location data frame of the original spatial resolved transcriptomics dataset, stored in the spatialCoords(CARD_object) |
train_ind |
Vector, index of the original spatial locations |
test_ind |
Vector, index of the newly grided spatial locations |
optimal_phi |
Numeric, the optimal phi value stored in CARD_object |
ineibor |
Numeric, number of neighbors used in the imputation on newly grided spatial locations, default is 10. |
Return a list with the imputed Cell type composition Vtest matrix on the newly grided spatial locations and predicted normalized gene expression
The spatial transcriptomics count data must be in the format of matrix or sparseMatrix, while each row represents a gene and each column represents a spatial location. The column names of the spatial data can be in the “XcoordxYcoord” (i.e., 10x10) format, but you can also maintain your original spot names, for example, barcode names.
data(spatial_count)
data(spatial_count)
An object of class dgCMatrix
with 11000 rows and 428 columns.
The spatial location data must be in the format of data frame while each row represents a spatial location, the first column represents the x coordinate and the second column represents the y coordinate. The rownames of the spatial location data frame should match exactly with the column names of the spatial_count.
data(spatial_location)
data(spatial_location)
An object of class data.frame
with 428 rows and 2 columns.