We developed a statistical method for spatially informed cell type deconvolution for spatial transcriptomics. Briefly, CARD is a reference-based deconvolution method that estimates cell type composition in spatial transcriptomics based on cell type specific expression information obtained from a reference scRNA-seq dataset. A key feature of CARD is its ability to accommodate spatial correlation in the cell type composition across tissue locations, enabling accurate and spatially informed cell type deconvolution as well as refined spatial map construction. CARD relies on an efficient optimization algorithm for constrained maximum likelihood estimation and is scalable to spatial transcriptomics with tens of thousands of spatial locations and tens of thousands of genes.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("CARDspa")
This tutorial is an example analysis with CARDspa on the human pancreatic ductal adenocarcinoma data from Moncada et al., 2020. Please note that we are using an edited version of the data; see the tutorial for an example using the complete data.
CARD requires two types of input data:
- spatial transcriptomics count data, along with spatial location information.
- single cell RNA-seq (scRNA-seq) count data, along with meta information indicating the cell type information and the sample (subject) information for each cell.
The example data for running the tutorial is included in the package. Here are the details about the required data input illustrated by the example datasets.
library(CARDspa)
library(RcppML)
library(NMF)
#> Loading required package: registry
#> Registered S3 methods overwritten by 'registry':
#> method from
#> print.registry_field proxy
#> print.registry_entry proxy
#> Loading required package: rngtools
#> Loading required package: cluster
#> NMF - BioConductor layer [OK] | Shared memory capabilities [NO: bigmemory] | Cores 2/2
#> To enable shared memory capabilities, try: install.extras('NMF')
#>
#> Attaching package: 'NMF'
#> The following object is masked from 'package:generics':
#>
#> fit
#> The following object is masked from 'package:RcppML':
#>
#> nmf
library(RcppArmadillo)
library(SingleCellExperiment)
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#>
#> Attaching package: 'matrixStats'
#> The following objects are masked from 'package:Biobase':
#>
#> anyMissing, rowMedians
#>
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#>
#> colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#> colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#> colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#> colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#> colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#> colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#> colWeightedMeans, colWeightedMedians, colWeightedSds,
#> colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#> rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#> rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#> rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#> rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#> rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#> rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#> rowWeightedSds, rowWeightedVars
#> The following object is masked from 'package:Biobase':
#>
#> rowMedians
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:NMF':
#>
#> nrun
#> The following object is masked from 'package:utils':
#>
#> findMatches
#> The following objects are masked from 'package:base':
#>
#> I, expand.grid, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
library(SpatialExperiment)
library(ggplot2)
#### load the example spatial transcriptomics count data,
data(spatial_count)
spatial_count[1:4, 1:4]
#> 4 x 4 sparse Matrix of class "dgCMatrix"
#> 10x10 10x13 10x14 10x15
#> ALDH1L1.AS2 . . . .
#> EIF4A1P11 . . . .
#> CYTH2 . . . .
#> AL121612.1 7 . 1 .
The spatial transcriptomics count data must be a matrix or sparseMatrix in which each row represents a gene and each column represents a spatial location. The column names of the spatial data can be in the “XcoordxYcoord” format (e.g., 10x10), but you can also keep your original spot names, for example barcode names.
#### load the example spatial location data,
data(spatial_location)
spatial_location[1:4, ]
#> x y
#> 10x10 10 10
#> 10x13 10 13
#> 10x14 10 14
#> 10x15 10 15
The spatial location data must be a data frame in which each row represents a spatial location, the first column holds the x coordinate, and the second column holds the y coordinate. The row names of the spatial location data frame must match the column names of spatial_count exactly.
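A quick consistency check before deconvolution can catch mismatched names early. The sketch below is only an illustration on small made-up objects (toy_count and toy_location are hypothetical stand-ins for spatial_count and spatial_location), and it assumes the spot names follow the “XcoordxYcoord” convention.

```r
# Toy stand-in for spatial_count: 2 genes (rows) x 3 spots (columns).
toy_count <- matrix(
    c(0, 7, 1, 0, 2, 0),
    nrow = 2,
    dimnames = list(c("GeneA", "GeneB"), c("10x10", "10x13", "10x14"))
)

# Derive coordinates from spot names of the form "XcoordxYcoord".
parts <- strsplit(colnames(toy_count), split = "x", fixed = TRUE)
toy_location <- data.frame(
    x = as.numeric(vapply(parts, `[`, character(1), 1)),
    y = as.numeric(vapply(parts, `[`, character(1), 2)),
    row.names = colnames(toy_count)
)

# The row names of the location data must match the count columns exactly.
names_match <- identical(rownames(toy_location), colnames(toy_count))
print(names_match)
```

If the spots are named by barcode instead, the name parsing does not apply, but the final identical() check is still useful.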
data(sc_count)
sc_count[1:4, 1:4]
#> 4 x 4 sparse Matrix of class "dgCMatrix"
#> Cell1 Cell2 Cell3 Cell4
#> ZNF641 . . . .
#> XAB2 . . 1 .
#> C1orf87 . . . .
#> USP30 . . . .
The scRNA-seq count data must be a matrix or sparseMatrix in which each row represents a gene and each column represents a cell.
data(sc_meta)
sc_meta[1:4, ]
#> cellID cellType sampleInfo
#> Cell1 Cell1 Acinar_cells sample1
#> Cell2 Cell2 Ductal_terminal_ductal_like sample1
#> Cell3 Cell3 Ductal_terminal_ductal_like sample1
#> Cell4 Cell4 Ductal_CRISP3_high-centroacinar_like sample1
The scRNA-seq meta data must be a data frame in which each row represents
a cell. The row names of the scRNA-seq meta data must match the column
names of the scRNA-seq count data exactly. The sc_meta data must contain
a column indicating the cell type assignment for each cell (e.g., the
“cellType” column in the example sc_meta data). Sample/subject
information should also be provided; if there is only one sample, we can
add a column with sc_meta$sampleInfo = "sample1".
We suggest that users check their single cell RNA-seq data carefully before running CARD. In particular, each cell type in the input data should contain at least 2 cells; the counts per cell type can be inspected with print(table(sc_meta$cellType, useNA = "ifany")).
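The check on cell type counts can be extended to drop under-represented cell types. This is a minimal sketch on a made-up meta data frame (toy_meta is hypothetical); it only shows the counting and filtering logic, not anything CARDspa does internally.

```r
# Hypothetical meta data: one cell type ("Tuft_cells") has a single cell.
toy_meta <- data.frame(
    cellID = paste0("Cell", 1:5),
    cellType = c("Acinar_cells", "Acinar_cells", "Fibroblasts",
        "Fibroblasts", "Tuft_cells"),
    sampleInfo = "sample1",
    row.names = paste0("Cell", 1:5),
    stringsAsFactors = FALSE
)

# Count cells per type, including missing labels.
ct_counts <- table(toy_meta$cellType, useNA = "ifany")
print(ct_counts)

# Keep only cell types represented by at least 2 cells.
keep_types <- names(ct_counts)[ct_counts >= 2]
toy_meta_filtered <- toy_meta[toy_meta$cellType %in% keep_types, ]
print(rownames(toy_meta_filtered))
```

With real data, the count matrix would need the same subsetting so that its column names still match the row names of the meta data, for example by indexing its columns with rownames(toy_meta_filtered).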
We can use CARD_deconvolution to deconvolute the spatial transcriptomics data. The essential inputs are:
- sc_count: Matrix or sparse matrix of raw scRNA-seq count data; each row represents a gene and each column represents a cell. This sc_count data serves as a reference for the cell type deconvolution of the spatial transcriptomics data.
- sc_meta: Data frame, with each row representing the cell type and/or sample information of a specific cell. The row names of this data frame should match exactly with the column names of the sc_count data. The sc_meta data must contain the column indicating the cell type assignment for each cell (e.g., the “cellType” column in the example sc_meta data).
- spatial_count: Matrix or sparse matrix of raw spatially resolved transcriptomics count data; each row represents a gene and each column represents a spatial location. This is the spatial transcriptomics data that we want to deconvolute.
- spatial_location: Data frame, with two columns representing the x and y coordinates of the spatial location. The row names of this data frame should match exactly with the column names of spatial_count.
- ct_varname: Character, the name of the column in sc_meta that specifies the cell type assignment.
- ct_select: Vector of names of the cell types to deconvolute; default is NULL. If NULL, all cell types provided by the single cell dataset are used.
- sample_varname: Character, the name of the column in sc_meta that specifies the sample/subject information. If NULL, the whole dataset is treated as one sample/subject.
- mincountgene: Numeric, include spatial locations where at least this number of counts is detected. Default is 100.
- mincountspot: Numeric, include genes that have non-zero expression in at least this number of spatial locations. Default is 5.
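To make the two QC thresholds concrete, here is a small sketch of the filtering they describe, applied to a made-up count matrix. It reflects one reading of the parameter descriptions above (total counts per spot, and number of spots with non-zero expression per gene). The order of the two filters and the use of `>=` are assumptions, and this is not the package's internal code.

```r
# Toy spatial count matrix: 4 genes (rows) x 3 spots (columns); values are made up.
toy_count <- matrix(
    c(60, 0, 50, 0,
        3, 0, 1, 0,
        80, 5, 40, 0),
    nrow = 4,
    dimnames = list(paste0("Gene", 1:4), c("s1", "s2", "s3"))
)
mincountgene <- 100 # minimum total count per spatial location
mincountspot <- 2 # minimum number of locations with non-zero expression per gene

# Keep spots whose total count reaches mincountgene.
keep_spot <- colSums(toy_count) >= mincountgene
# Among the kept spots, keep genes expressed in at least mincountspot spots.
keep_gene <- rowSums(toy_count[, keep_spot, drop = FALSE] > 0) >= mincountspot
filtered <- toy_count[keep_gene, keep_spot, drop = FALSE]
print(filtered)
```

In this toy example spot s2 is dropped for having too few counts, and Gene2 and Gene4 are dropped because they are expressed in fewer than two of the remaining spots.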
CARD_deconvolution first creates a CARD object, then performs the
deconvolution, and finally returns a SpatialExperiment object. The
results are stored in CARD_obj$Proportion_CARD.
CARD is computationally fast and memory efficient. For the example dataset with 428 spatial locations, the deconvolution finishes within a minute.
set.seed(seed = 20200107)
CARD_obj <- CARD_deconvolution(
sc_count = sc_count,
sc_meta = sc_meta,
spatial_count = spatial_count,
spatial_location = spatial_location,
ct_varname = "cellType",
ct_select = unique(sc_meta$cellType),
sample_varname = "sampleInfo",
mincountgene = 100,
mincountspot = 5
)
#> ## QC on scRNASeq dataset! ...
#> ## QC on spatially-resolved dataset! ...
#> ## create reference matrix from scRNASeq...
#> ## Select Informative Genes! ...
#> ## Deconvolution Starts! ...
#> ## Deconvolution Finish! ...
CARDspa also supports SingleCellExperiment and SpatialExperiment
objects as input; you can run the following code:
## create sce object (counts plus per-cell meta data)
sce <- SingleCellExperiment(assays = list(counts = sc_count),
    colData = sc_meta)
## create spe object (counts plus spot coordinates)
spe <- SpatialExperiment(assays = list(counts = spatial_count),
    spatialCoords = as.matrix(spatial_location)
)
celltypes <- unique(sc_meta$cellType)
set.seed(seed = 20200107)
CARD_obj <- CARD_deconvolution(
spe = spe,
sce = sce,
sc_count = NULL,
sc_meta = NULL,
spatial_count = NULL,
spatial_location = NULL,
ct_varname = "cellType",
ct_select = celltypes,
sample_varname = "sampleInfo",
mincountgene = 100,
mincountspot = 5
)
#> ## QC on scRNASeq dataset! ...
#> ## QC on spatially-resolved dataset! ...
#> ## create reference matrix from scRNASeq...
#> ## Select Informative Genes! ...
#> ## Deconvolution Starts! ...
#> ## Deconvolution Finish! ...
The spatial data are stored in assays(CARD_obj)$spatial_countMat and
spatialCoords(CARD_obj), while the scRNA-seq data is stored in
CARD_obj@metadata$sc_eset as a SingleCellExperiment. The results are
stored in CARD_obj$Proportion_CARD.
print(CARD_obj$Proportion_CARD[1:2, ])
#> Acinar_cells Ductal_terminal_ductal_like
#> 10x10 0.02822940 0.002398456
#> 10x13 0.01838797 0.037258897
#> Ductal_CRISP3_high-centroacinar_like Cancer_clone_A Ductal_MHC_Class_II
#> 10x10 0.01616564 0.0002066411 0.1180560
#> 10x13 0.50388060 0.0958701424 0.1474945
#> Cancer_clone_B mDCs_A Ductal_APOL1_high-hypoxic Tuft_cells
#> 10x10 0.01831038 0.034041601 0.005807915 0.023857592
#> 10x13 0.04352440 0.005606759 0.009695791 0.002570037
#> mDCs_B pDCs Endocrine_cells Endothelial_cells Macrophages_A
#> 10x10 0.204028802 0.001366423 0.0006522617 0.08394567 1.249407e-06
#> 10x13 0.001295846 0.001051143 0.0039317752 0.02744028 7.680739e-09
#> Mast_cells Macrophages_B T_cells_&_NK_cells Monocytes RBCs
#> 10x10 2.390952e-07 3.830345e-05 4.859000e-02 4.281992e-07 0.00186420
#> 10x13 2.075633e-02 2.655632e-09 9.415854e-05 5.860814e-05 0.00359144
#> Fibroblasts
#> 10x10 0.41243879
#> 10x13 0.07749133
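The proportion matrix has spots in rows and cell types in columns, and the two printed rows above each sum to approximately one. A short sketch of two common follow-up steps, checking the row sums and extracting the dominant cell type per spot, is shown below on a made-up matrix (toy_prop is hypothetical and much smaller than the real output).

```r
# Hypothetical proportion matrix: spots in rows, cell types in columns.
toy_prop <- matrix(
    c(0.10, 0.60,
        0.20, 0.30,
        0.70, 0.10),
    nrow = 2,
    dimnames = list(
        c("10x10", "10x13"),
        c("Acinar_cells", "Cancer_clone_A", "Fibroblasts")
    )
)

# Each row should sum to approximately 1.
row_totals <- rowSums(toy_prop)
stopifnot(all(abs(row_totals - 1) < 1e-6))

# Dominant cell type and its proportion for each spot.
dominant <- data.frame(
    cellType = colnames(toy_prop)[max.col(toy_prop, ties.method = "first")],
    proportion = apply(toy_prop, 1, max),
    row.names = rownames(toy_prop),
    stringsAsFactors = FALSE
)
print(dominant)
```

The same lines should work on CARD_obj$Proportion_CARD after as.matrix(), assuming it is a numeric matrix-like object as the printed output suggests.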
First, we jointly visualize the cell type proportion matrix as a scatterpie plot. Because the number of spots here is relatively small, jointly visualizing the cell type proportion matrix in the scatterpie format is feasible. We do not recommend this plot when the number of spots is > 500. Instead, we recommend visualizing the proportions directly with the function CARD_visualize_prop(). For details on using this function, see the next example.
## set the colors. Here, I just use the colors in the manuscript, if the color
## is not provided, the function will use default color in the package.
colors <- c(
"#FFD92F", "#4DAF4A", "#FCCDE5", "#D9D9D9", "#377EB8", "#7FC97F",
"#BEAED4", "#FDC086", "#FFFF99", "#386CB0", "#F0027F", "#BF5B17",
"#666666", "#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E",
"#E6AB02", "#A6761D"
)
p1 <- CARD_visualize_pie(
proportion = CARD_obj$Proportion_CARD,
spatial_location = spatialCoords(CARD_obj),
colors = colors,
radius = 0.52
) ### You can choose radius = NULL or your own radius number
print(p1)
Then, we can select some cell types of interest to visualize separately.
## select the cell type that we are interested
ct.visualize <- c(
"Acinar_cells", "Cancer_clone_A", "Cancer_clone_B",
"Ductal_terminal_ductal_like",
"Ductal_CRISP3_high-centroacinar_like",
"Ductal_MHC_Class_II",
"Ductal_APOL1_high-hypoxic",
"Fibroblasts"
)
## visualize the spatial distribution of the cell type proportion
p2 <- CARD_visualize_prop(
proportion = CARD_obj$Proportion_CARD,
spatial_location = spatialCoords(CARD_obj),
### selected cell types to visualize
ct_visualize = ct.visualize,
### if not provide, we will use the default colors
colors = c("lightblue", "lightyellow", "red"),
### number of columns in the figure panel
NumCols = 4,
### point size in ggplot2 scatterplot
pointSize = 3.0
)
print(p2)
We added a new visualization function to visualize the distribution of two cell types on the same plot.
## visualize the spatial distribution of two cell types on the same plot
p3 <- CARD_visualize_prop_2CT(
### Cell type proportion estimated by CARD
proportion = CARD_obj$Proportion_CARD,
### spatial location information
spatial_location = spatialCoords(CARD_obj),
### two cell types you want to visualize
ct2_visualize = c("Cancer_clone_A", "Cancer_clone_B"),
### two color scales
colors = list(
c("lightblue", "lightyellow", "red"),
c("lightblue", "lightyellow", "black")
)
)
print(p3)
A unique feature of CARD is its ability to model the spatial
correlation in cell type composition across tissue locations, thus
enabling spatially informed cell type deconvolution. Modeling spatial
correlation allows us not only to accurately infer the cell type
composition at each spatial location, but also to impute cell type
compositions and gene expression levels at unmeasured tissue locations,
facilitating the construction of a refined spatial tissue map with a
resolution much higher than that measured in the original study.
Specifically, CARD constructs a refined spatial map through the
function CARD_imputation. The essential inputs are the deconvolved
CARD object, num_grids, ineibor, and exclude, as in the call further
down.
Briefly, CARD first outlines the shape of the tissue by applying a two-dimensional concave hull algorithm to the existing locations, then performs imputation on the newly gridded spatial locations. We recommend checking the existing spatial locations for outliers that could seriously affect the detection of the shape.
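To give a feel for what "imputation on a grid" means, the sketch below lays a regular grid over made-up spot locations and fills in a value at each grid point from its nearest measured spots. It is deliberately simplified: it uses the bounding box instead of a concave hull, and an inverse-distance-weighted average of the k nearest spots instead of CARD's spatial model. All names (loc, prop, k) are hypothetical.

```r
set.seed(1)
# Hypothetical measured spots and one cell-type proportion per spot.
loc <- data.frame(x = rep(1:5, each = 5), y = rep(1:5, times = 5))
prop <- runif(nrow(loc))

# Regular grid over the bounding box of the measured locations.
grid <- expand.grid(
    x = seq(min(loc$x), max(loc$x), length.out = 9),
    y = seq(min(loc$y), max(loc$y), length.out = 9)
)

# Distances between every grid point (rows) and every measured spot (columns).
d <- sqrt(outer(grid$x, loc$x, "-")^2 + outer(grid$y, loc$y, "-")^2)

# Inverse-distance-weighted mean of the k nearest measured spots
# (a stand-in for CARD's model-based imputation).
k <- 4
imputed <- apply(d, 1, function(di) {
    nn <- order(di)[seq_len(k)]
    w <- 1 / (di[nn] + 1e-8)
    sum(w * prop[nn]) / sum(w)
})
grid$prop <- imputed
print(head(grid))
```

Because each imputed value is a weighted average, it always stays within the range of the measured values. An outlying spot would stretch the bounding box (or the hull) and create grid points far from any real tissue, which is why checking for outliers first is worthwhile.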
CARD_obj <- CARD_imputation(
CARD_obj,
num_grids = 2000,
ineibor = 10,
exclude = NULL)
#> ## The rownames of locations are matched ...
#> ## Make grids on new spatial locations ...
The results are stored in CARD_obj$refined_prop and
assays(CARD_obj)$refined_expression.
## Visualize the newly gridded spatial locations to check that the shape is
## correctly detected. If not, the user can pass the row names of the spatial
## locations to exclude to the CARD_imputation function
location_imputation <- cbind.data.frame(
x = as.numeric(sapply(
strsplit(rownames(CARD_obj$refined_prop), split = "x"), "[", 1
)),
y = as.numeric(sapply(
strsplit(rownames(CARD_obj$refined_prop), split = "x"), "[", 2
))
)
rownames(location_imputation) <- rownames(CARD_obj$refined_prop)
library(ggplot2)
p5 <- ggplot(
location_imputation,
aes(x = x, y = y)
) +
geom_point(shape = 22, color = "#7dc7f5") +
theme(
plot.margin = margin(0.1, 0.1, 0.1, 0.1, "cm"),
legend.position = "bottom",
panel.background = element_blank(),
plot.background = element_blank(),
panel.border = element_rect(colour = "grey89", fill = NA,
linewidth = 0.5)
)
print(p5)
Now we can use the same CARD_visualize_prop function to visualize the
cell type proportions at the enhanced resolution. This time, the inputs
of the function should be the imputed cell type proportions and the
corresponding newly gridded spatial locations.
p6 <- CARD_visualize_prop(
proportion = CARD_obj$refined_prop,
spatial_location = location_imputation,
ct_visualize = ct.visualize,
colors = c("lightblue", "lightyellow", "red"),
NumCols = 4
)
print(p6)
After obtaining the cell type proportions at the enhanced resolution with CARD, we can predict the spatial gene expression at the enhanced resolution. The following code visualizes marker gene expression at the enhanced resolution.
p7 <- CARD_visualize_gene(
spatial_expression = assays(CARD_obj)$refined_expression,
spatial_location = location_imputation,
gene_visualize = c("A4GNT", "AAMDC", "CD248"),
colors = NULL,
NumCols = 6
)
print(p7)
For comparison, the following code plots the same genes at the original resolution. CARD facilitates the construction of a refined spatial tissue map with a resolution much higher than that measured in the original study.
p8 <- CARD_visualize_gene(
spatial_expression = metadata(CARD_obj)$spatial_countMat,
spatial_location = metadata(CARD_obj)$spatial_location,
gene_visualize = c("A4GNT", "AAMDC", "CD248"),
colors = NULL,
NumCols = 6
)
print(p8)
We extended CARD to enable reference-free cell type deconvolution and eliminate the need for single-cell reference data. We refer to this extension as the reference-free version of CARD, or simply as CARDfree. Unlike CARD, CARDfree no longer requires an external single-cell reference dataset and only needs users to input a list of gene names for previously known cell type markers. We use the same example dataset to illustrate the use of CARDfree. In addition to the example dataset, CARDfree also requires a marker gene list as input, which is a list in which each element holds the gene markers specific to one cell type. The example marker list for running the tutorial is included.
Similar to CARD, we first need to create a CARDfree object with the spatial transcriptomics dataset and the marker gene list.
We can use CARD_refFree to do reference-free deconvolution. This
function first creates a CARDfree object and then performs the
deconvolution. Briefly, the essential inputs are the same as for the
function CARD_deconvolution, except that this function does not require
the single cell count and meta information. Instead, it requires a
marker list, passed as markerlist.
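The expected shape of the marker list can be illustrated with a small made-up example. The gene symbols and cell type names below are for illustration only and are not taken from the packaged markerList; toy_genes is a hypothetical stand-in for rownames(spatial_count).

```r
# Hypothetical marker list: one element per cell type, each a vector of gene symbols.
toy_markers <- list(
    Acinar_cells = c("PRSS1", "CPA1", "CELA3A"),
    Fibroblasts = c("COL1A1", "DCN", "LUM"),
    Endothelial_cells = c("PECAM1", "VWF", "DCN")
)

# Unique marker genes across all cell types (a gene may mark several types).
unique_markers <- unique(unlist(toy_markers, use.names = FALSE))
cat("Unique marker genes:", length(unique_markers),
    "for", length(toy_markers), "cell types\n")

# Check which markers are missing from the spatial data before deconvolution.
toy_genes <- c("PRSS1", "CPA1", "COL1A1", "DCN", "PECAM1", "VWF")
missing_markers <- setdiff(unique_markers, toy_genes)
print(missing_markers)
```

Checking for missing markers up front is a simple way to see how much of the list overlaps with the genes measured in the spatial data.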
## deconvolution using CARDfree
data(markerList)
set.seed(seed = 20200107)
CARDfree_obj <- CARD_refFree(
markerlist = markerList,
spatial_count = spatial_count,
spatial_location = spatial_location,
mincountgene = 100,
mincountspot = 5
)
#> ## Number of unique marker genes: 1711 for 20 cell types ...
#> ## Deconvolution Finish! ...
Similarly, you can use a SpatialExperiment object as input.
data(markerList)
set.seed(seed = 20200107)
CARDfree_obj <- CARD_refFree(
markerlist = markerList,
spatial_count = NULL,
spatial_location = NULL,
spe = spe,
mincountgene = 100,
mincountspot = 5
)
#> ## Number of unique marker genes: 1711 for 20 cell types ...
#> ## Deconvolution Finish! ...
The spatial data are stored in assays(CARDfree_obj)$spatial_countMat
and spatialCoords(CARDfree_obj), while the marker list is stored in
CARDfree_obj@metadata$markerList as a list. The results are stored in
CARDfree_obj$Proportion_CARD.
## One limitation of reference-free version of CARD is that the cell
## types inferred
## from CARDfree do not come with a cell type label. It might be difficult to
## interpret the results.
print(CARDfree_obj$Proportion_CARD[1:2, ])
#> CT1 CT2 CT3 CT4 CT5
#> 10x10 0.56801726 6.769247e-02 1.588566e-12 4.774388e-02 4.516061e-03
#> 10x13 0.01074093 6.617604e-42 1.145420e-01 2.500291e-21 2.693620e-27
#> CT6 CT7 CT8 CT9 CT10
#> 10x10 7.621232e-30 0.09191463 0.0008195946 1.756793e-02 1.014314e-02
#> 10x13 1.316703e-01 0.21314265 0.1151354998 1.816705e-06 9.278407e-45
#> CT11 CT12 CT13 CT14 CT15
#> 10x10 8.808949e-02 0.004413839 1.169808e-14 1.195882e-14 3.446914e-03
#> 10x13 7.412540e-12 0.163930305 5.222797e-61 7.274610e-09 1.180371e-41
#> CT16 CT17 CT18 CT19 CT20
#> 10x10 0.0058732826 0.01014773 0.05644397 8.872218e-24 0.02316981
#> 10x13 0.0001647055 0.02645184 0.01083576 2.048134e-109 0.21338415
Because the number of spots here is relatively small, jointly visualizing the cell type proportion matrix in the scatterpie format is feasible. We do not recommend this plot when the number of spots is > 500. Instead, we recommend visualizing the proportions directly with the function CARD_visualize_prop().
colors <- c(
"#FFD92F", "#4DAF4A", "#FCCDE5", "#D9D9D9", "#377EB8", "#7FC97F", "#BEAED4",
"#FDC086", "#FFFF99", "#386CB0", "#F0027F", "#BF5B17", "#666666",
"#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E", "#E6AB02",
"#A6761D"
)
### To match the original results of CARD as closely as possible, we order the
### colors to roughly match the results inferred by CARD
current_data <- CARDfree_obj$Proportion_CARD
new_order <- current_data[, c(
8, 10, 14, 2, 1, 6, 12, 18, 7, 13, 20, 19, 16,
17, 11, 15, 4, 9, 3, 5
)]
CARDfree_obj$Proportion_CARD <- new_order
colnames(CARDfree_obj$Proportion_CARD) <- paste0("CT", 1:20)
p9 <- CARD_visualize_pie(CARDfree_obj$Proportion_CARD,
spatialCoords(CARDfree_obj),
colors = colors
)
print(p9)
We also extended CARD to facilitate the construction of single-cell resolution spatial transcriptomics from non-single-cell resolution spatial transcriptomics. For details of the algorithm, see the main text (Ma and Zhou, 2022). Briefly, we infer the single cell resolution gene expression for each measured spatial location from the non-single-cell resolution spatial transcriptomics data, based on the reference scRNA-seq data we used for deconvolution. The procedure is implemented in the function CARD_scmapping. The essential inputs are:
- CARD_object: the CARD object obtained once the deconvolution procedure has finished.
- shapeSpot: a character indicating whether the sampled spatial coordinates for single cells are located in a square-like region or a circle-like region. The center of this region is the measured spatial location in the non-single-cell resolution spatial transcriptomics data. The default is “Square”, and the other option is “Circle”.
- numcell: a numeric value indicating the number of single cells to map to each measured spatial location.
- ncore: a numeric value indicating the number of cores used to accelerate the procedure.
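The role of shapeSpot can be pictured as drawing coordinates for each mapped cell inside a small region centred on the measured spot. The helper below is a hypothetical illustration of that idea only: sample_cell_coords and its half_width argument (half the side length, or the radius) are not part of CARDspa, and the region size the package actually uses is not stated here.

```r
set.seed(20210107)

# Sample n coordinates around a spot centre, in a square or a circle.
# `half_width` is a hypothetical parameter, not a CARDspa argument.
sample_cell_coords <- function(centre, n, shape = c("Square", "Circle"),
    half_width = 0.5) {
    shape <- match.arg(shape)
    if (shape == "Square") {
        dx <- runif(n, -half_width, half_width)
        dy <- runif(n, -half_width, half_width)
    } else {
        # Taking sqrt() of a uniform draw gives a uniform density over the disc.
        r <- half_width * sqrt(runif(n))
        theta <- runif(n, 0, 2 * pi)
        dx <- r * cos(theta)
        dy <- r * sin(theta)
    }
    data.frame(x = centre[1] + dx, y = centre[2] + dy)
}

coords_circle <- sample_cell_coords(c(10, 10), n = 20, shape = "Circle")
coords_square <- sample_cell_coords(c(10, 10), n = 20, shape = "Square")
print(head(coords_circle))
```

A square region tiles a regular grid of spots without gaps, while a circle is closer to the physical capture area on platforms with round spots, so the choice depends on the technology.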
#### Note that here the shapeSpot is the user defined variable which
#### indicates the capturing area of single cells. Details see above.
set.seed(seed = 20210107)
scMapping <- CARD_scmapping(CARD_obj, shapeSpot = "Square", numcell = 20,
ncore = 2)
print(scMapping)
#> class: SingleCellExperiment
#> dim: 5814 8460
#> metadata(0):
#> assays(1): counts
#> rownames(5814): ZNF641 XAB2 ... APOBEC3G PCYT1B
#> rowData names(1): rownames(count_ct)
#> colnames(8460): Cell1531:10x10:9.97518192091957x9.83765210071579
#> Cell973:10x10:10.1896061601583x9.94081321195699 ...
#> Cell738:9x32:9.26695604040287x32.4533654800616
#> Cell1002:9x32:9.40997453313321x32.3579135271721
#> colData names(7): x y ... CT Cell
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
### spatial location info and expression count of the single cell resolution
### data
MapCellCords <- as.data.frame(colData(scMapping))
count_SC <- assays(scMapping)$counts
The results are stored in a SingleCellExperiment object. The mapped single cell resolution counts are stored in the assays slot, and the spatial location of each single cell, as well as its relationship to the original measured spatial location, is stored in the colData slot.
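The column names printed above appear to encode the reference cell, the original spot, and the sampled coordinates, separated by colons. Assuming that layout (it is inferred from the printed output, not from the package documentation), the names can be split as in the sketch below; the two names are copied from the output above.

```r
# Column names copied from the printed SingleCellExperiment summary.
cell_names <- c(
    "Cell1531:10x10:9.97518192091957x9.83765210071579",
    "Cell738:9x32:9.26695604040287x32.4533654800616"
)

# Split "cell:spot:coordinates" into its three fields.
fields <- strsplit(cell_names, split = ":", fixed = TRUE)
mapped <- data.frame(
    cell = vapply(fields, `[`, character(1), 1),
    spot = vapply(fields, `[`, character(1), 2),
    coord = vapply(fields, `[`, character(1), 3),
    stringsAsFactors = FALSE
)

# Split the coordinate field into numeric x and y.
xy <- strsplit(mapped$coord, split = "x", fixed = TRUE)
mapped$x <- as.numeric(vapply(xy, `[`, character(1), 1))
mapped$y <- as.numeric(vapply(xy, `[`, character(1), 2))

# Number of mapped cells per original spot.
print(table(mapped$spot))
```

In practice the x and y columns of colData already hold the coordinates, so this parsing is mainly useful when only the column names are available.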
Next, we visualize the cell type of each single cell together with its spatial location.
df <- MapCellCords
colors <- c(
"#8DD3C7", "#CFECBB", "#F4F4B9", "#CFCCCF", "#D1A7B9", "#E9D3DE", "#F4867C",
"#C0979F", "#D5CFD6", "#86B1CD", "#CEB28B", "#EDBC63", "#C59CC5",
"#C09CBF", "#C2D567", "#C9DAC3", "#E1EBA0",
"#FFED6F", "#CDD796", "#F8CDDE"
)
p10 <- ggplot(df, aes(x = x, y = y, colour = CT)) +
geom_point(size = 3.0) +
scale_colour_manual(values = colors) +
# facet_wrap(~Method,ncol = 2,nrow = 3) +
theme(
plot.margin = margin(0.1, 0.1, 0.1, 0.1, "cm"),
panel.background = element_rect(colour = "white", fill = "white"),
plot.background = element_rect(colour = "white", fill = "white"),
legend.position = "bottom",
panel.border = element_rect(
colour = "grey89",
fill = NA,
linewidth = 0.5),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
legend.title = element_text(size = 13, face = "bold"),
legend.text = element_text(size = 12),
legend.key = element_rect(colour = "transparent", fill = "white"),
legend.key.size = unit(0.45, "cm"),
strip.text = element_text(size = 15, face = "bold")
) +
guides(color = guide_legend(title = "Cell Type"))
print(p10)
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] ggplot2_3.5.1 SpatialExperiment_1.17.0
#> [3] SingleCellExperiment_1.29.1 SummarizedExperiment_1.37.0
#> [5] GenomicRanges_1.59.1 GenomeInfoDb_1.43.4
#> [7] IRanges_2.41.3 S4Vectors_0.45.4
#> [9] MatrixGenerics_1.19.1 matrixStats_1.5.0
#> [11] RcppArmadillo_14.2.3-1 NMF_0.28
#> [13] Biobase_2.67.0 BiocGenerics_0.53.6
#> [15] generics_0.1.3 cluster_2.1.8
#> [17] rngtools_1.5.2 registry_0.5-1
#> [19] RcppML_0.3.7 CARDspa_0.99.5
#>
#> loaded via a namespace (and not attached):
#> [1] DBI_1.2.3 deldir_2.0-4 rlang_1.1.5
#> [4] magrittr_2.0.3 gridBase_0.4-7 spatstat.geom_3.3-5
#> [7] e1071_1.7-16 compiler_4.4.2 vctrs_0.6.5
#> [10] maps_3.4.2.1 reshape2_1.4.4 quantreg_6.00
#> [13] stringr_1.5.1 pkgconfig_2.0.3 crayon_1.5.3
#> [16] fastmap_1.2.0 magick_2.8.5 XVector_0.47.2
#> [19] mcmc_0.9-8 labeling_0.4.3 rmarkdown_2.29
#> [22] UCSC.utils_1.3.1 MatrixModels_0.5-3 purrr_1.0.4
#> [25] xfun_0.51 cachem_1.1.0 jsonlite_1.9.0
#> [28] DelayedArray_0.33.6 spatstat.utils_3.1-2 BiocParallel_1.41.2
#> [31] tweenr_2.0.3 wrMisc_1.15.2 parallel_4.4.2
#> [34] R6_2.6.1 spatstat.data_3.1-4 bslib_0.9.0
#> [37] stringi_1.8.4 RColorBrewer_1.1-3 spatstat.univar_3.1-1
#> [40] jquerylib_0.1.4 iterators_1.0.14 Rcpp_1.0.14
#> [43] knitr_1.49 fields_16.3 Matrix_1.7-2
#> [46] nnls_1.6 splines_4.4.2 tidyselect_1.2.1
#> [49] abind_1.4-8 yaml_2.3.10 doParallel_1.0.17
#> [52] spatstat.random_3.3-2 codetools_0.2-20 curl_6.2.1
#> [55] lattice_0.22-6 tibble_3.2.1 plyr_1.8.9
#> [58] withr_3.0.2 coda_0.19-4.1 evaluate_1.0.3
#> [61] survival_3.8-3 sf_1.0-19 units_0.8-5
#> [64] proxy_0.4-27 scatterpie_0.2.4 polyclip_1.10-7
#> [67] BiocManager_1.30.25 pillar_1.10.1 KernSmooth_2.23-26
#> [70] foreach_1.5.2 ggfun_0.1.8 sp_2.2-0
#> [73] munsell_0.5.1 scales_1.3.0 gtools_3.9.5
#> [76] class_7.3-23 glue_1.8.0 maketools_1.3.2
#> [79] tools_4.4.2 sys_3.4.3 SparseM_1.84-2
#> [82] RANN_2.6.2 buildtools_1.0.0 fs_1.6.5
#> [85] dotCall64_1.2 grid_4.4.2 tidyr_1.3.1
#> [88] MCMCpack_1.7-1 colorspace_2.1-1 GenomeInfoDbData_1.2.13
#> [91] ggforce_0.4.2 cli_3.6.4 spam_2.11-1
#> [94] S4Arrays_1.7.3 viridisLite_0.4.2 dplyr_1.1.4
#> [97] concaveman_1.1.0 V8_6.0.1 gtable_0.3.6
#> [100] ggcorrplot_0.1.4.1 yulab.utils_0.2.0 sass_0.4.9
#> [103] digest_0.6.37 classInt_0.4-11 SparseArray_1.7.6
#> [106] rjson_0.2.23 farver_2.1.2 htmltools_0.5.8.1
#> [109] lifecycle_1.0.4 httr_1.4.7 MASS_7.3-64
Ma, Y., Zhou, X. Spatially informed cell-type deconvolution for spatial transcriptomics. Nat Biotechnol 40, 1349–1359 (2022). https://doi.org/10.1038/s41587-022-01273-7
Moncada, R., Barkley, D., Wagner, F. et al. Integrating microarray-based spatial transcriptomics and single-cell RNA-seq reveals tissue architecture in pancreatic ductal adenocarcinomas. Nat Biotechnol 38, 333–342 (2020). https://doi.org/10.1038/s41587-019-0392-8