Package 'SpatialDecon'

Title:	Deconvolution of mixed cells from spatial and/or bulk gene expression data
Description:	Using spatial or bulk gene expression data, estimates abundance of mixed cell types within each observation. Based on "Advances in mixed cell deconvolution enable quantification of cell types in spatial transcriptomic data", Danaher (2022). Designed for use with the NanoString GeoMx platform, but applicable to any gene expression data.
Authors:	Maddy Griswold [cre, aut], Patrick Danaher [aut]
Maintainer:	Maddy Griswold <[email protected]>
License:	MIT + file LICENSE
Version:	1.17.0
Built:	2025-02-21 06:20:05 UTC
Source:	https://github.com/bioc/SpatialDecon

Help Index

SpatialDecon: A package for estimating mixed cell type abundance in the regions of spatially-resolved gene expression studies
Default colors for the cell types in the safeTME matrix
Collapse related cell types within a deconvolution result
Create Custom Cell Profile Matrix
Derive background at the scale of the normalized data for GeoMx data
Download a cell profile matrix
Draw coxcomb plots as points in a graphics window
Genes' biological variability in immune deconvolution from TCGA.
Estimate a tumor-specific profile and merge it with the pre-specified cell profile matrix (X)
Small example GeoMx data
Mini human colon single cell dataset
Large example GeoMx data
Reverse deconvolution
Run collapseCellTypes
Apply error model to estimate technical SD from raw counts
Run MergeTumorIntoX
Run Reversedecon
Run spatialdecon
SafeTME matrix
Mapping from granularly-defined cell populations to broaded cell populations
Mixed cell deconvolution of spatiall-resolved gene expression data
Barplot of abundance estimates

SpatialDecon: A package for estimating mixed cell type abundance in the regions of spatially-resolved gene expression studies

Description

The SpatialDecon package estimates mixed cell type abundance in the regions of spatially-resolved gene expression studies, using the method of Danaher & Kim (2020), "Advances in mixed cell deconvolution enable quantification of cell types in spatially-resolved gene expression data." It is also appropriate to apply to bulk gene expression data.

functions

Functions to help set up deconvolution:

derive_GeoMx_background Estimates the background levels from GeoMx experiments
collapseCellTypes reformats deconvolution results to merge closely-related cell types
download_profile_matrix Downloads a cell profile matrix.
safeTME: a data object, a matrix of immune cell profiles for use in tumor-immune deconvolution.

Deconvolution functions:

spatialdecon runs the core deconvolution function
reverseDecon runs a transposed/reverse deconvolution problem, fitting the data as a function of cell abundance estimates. Used to measure genes' dependency on cell mixing and to calculate gene residuals from cell mixing.

Plotting functions:

florets Plot cell abundance on a specified x-y space, with each point a cockscomb plot showing the cell abundances of that region/sample.
TIL_barplot Plot abundances of tumor infiltrating lymphocytes (TILs) estimated from the safeTME cell profile matrix

Examples

data(mini_geomx_dataset)
data(safeTME)
data(safeTME.matches)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# run decon with bells and whistles:
res <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME,
  cellmerges = safeTME.matches,
  cell_counts = mini_geomx_dataset$annot$nuclei,
  is_pure_tumor = mini_geomx_dataset$annot$AOI.name == "Tumor"
)
data(mini_geomx_dataset)
data(safeTME)
data(safeTME.matches)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# run decon with bells and whistles:
res <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME,
  cellmerges = safeTME.matches,
  cell_counts = mini_geomx_dataset$annot$nuclei,
  is_pure_tumor = mini_geomx_dataset$annot$AOI.name == "Tumor"
)

Default colors for the cell types in the safeTME matrix

Description

A named vector of colors, giving colors for the cell types of the safeTME matrix.

Usage

cellcols
cellcols

Format

A named vector

Collapse related cell types within a deconvolution result

Description

Given the input of an SpatialDecon result output and a list of which cell types to combine, returns a reshaped deconvolution result object with the specified cell types merged.

Usage

collapseCellTypes(fit, matching)
collapseCellTypes(fit, matching)

Arguments

`fit`	The object (a list) returned by the SpatialDecon algorithm
`matching`	A list object holding the mapping from beta's cell names to official cell names. See str(safeTME.matches) for an example.

Value

A reshaped deconvolution result object

Examples

data(mini_geomx_dataset)
data(safeTME)
data(safeTME.matches)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
res1 <- collapseCellTypes(
    fit = res0,
    matching = safeTME.matches
)
data(mini_geomx_dataset)
data(safeTME)
data(safeTME.matches)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
res1 <- collapseCellTypes(
    fit = res0,
    matching = safeTME.matches
)

Create Custom Cell Profile Matrix

Description

Create custom cell profile matrix using single cell data. The average gene expression for each cell type is returned.

Usage

create_profile_matrix(
  mtx,
  cellAnnots,
  cellTypeCol,
  cellNameCol,
  matrixName = "Custom",
  outDir = "./",
  geneList = NULL,
  normalize = FALSE,
  scalingFactor = 5,
  minCellNum = 15,
  minGenes = 100,
  discardCellTypes = FALSE
)
create_profile_matrix(
  mtx,
  cellAnnots,
  cellTypeCol,
  cellNameCol,
  matrixName = "Custom",
  outDir = "./",
  geneList = NULL,
  normalize = FALSE,
  scalingFactor = 5,
  minCellNum = 15,
  minGenes = 100,
  discardCellTypes = FALSE
)

Arguments

`mtx`	gene x cell count matrix
`cellAnnots`	cell annotations with cell type and cell name as columns
`cellTypeCol`	column containing cell type
`cellNameCol`	column containing cell ID/name
`matrixName`	name of final profile matrix
`outDir`	path to desired output directory, set to NULL if matrix should not be written
`geneList`	gene list to filter profile matrix to
`normalize`	Should data be normalized? (TRUE/FALSE) if TRUE data will be normalize using total gene count
`scalingFactor`	what should all values be multiplied by for final matrix, set to 1 if no scaling is wanted
`minCellNum`	minimum number of cells of one type needed to create profile, exclusive
`minGenes`	minimum number of genes expressed in a cell, exclusive
`discardCellTypes`	should cell types be filtered for types like mitotic, doublet, low quality, unknown, etc.

Value

A custom cell profile matrix genes (rows) by cell types (columns), matrix gets written to disk and outDir

Examples

cellNames <- paste0("Cell", seq_len(1500))
geneNames <- paste0("Gene", seq_len(1500))
mtx <- matrix(data=sample(size = length(cellNames)*length(geneNames),
                          replace = TRUE,
                          x = c(0,seq_len(100)), 
                          prob = c(0.6784, rep(0.0075, 15), rep(0.005, 25),
                                   rep(0.002, 25), rep(0.001, 35))), 
                          ncol = length(cellNames), nrow = length(geneNames), 
                          dimnames = list(geneNames, cellNames))
cellAnnots <- as.data.frame(cbind(CellID=cellNames, 
                                  cellType=sample(size = length(cellNames), 
                                                  replace = TRUE,
                                                  x = c("A", "B", "C", "D"),
                                                  prob = c(0.1, 0.4, 0.3, 0.2))))
table(cellAnnots$cellType)
profile_matrix <- create_profile_matrix(mtx = mtx,    
                                        cellAnnots = cellAnnots, 
                                        cellTypeCol = "cellType",
                                        cellNameCol = "CellID",
                                        minGenes = 10,
                                        scalingFactor = 1)
head(profile_matrix)
cellNames <- paste0("Cell", seq_len(1500))
geneNames <- paste0("Gene", seq_len(1500))
mtx <- matrix(data=sample(size = length(cellNames)*length(geneNames),
                          replace = TRUE,
                          x = c(0,seq_len(100)), 
                          prob = c(0.6784, rep(0.0075, 15), rep(0.005, 25),
                                   rep(0.002, 25), rep(0.001, 35))), 
                          ncol = length(cellNames), nrow = length(geneNames), 
                          dimnames = list(geneNames, cellNames))
cellAnnots <- as.data.frame(cbind(CellID=cellNames, 
                                  cellType=sample(size = length(cellNames), 
                                                  replace = TRUE,
                                                  x = c("A", "B", "C", "D"),
                                                  prob = c(0.1, 0.4, 0.3, 0.2))))
table(cellAnnots$cellType)
profile_matrix <- create_profile_matrix(mtx = mtx,    
                                        cellAnnots = cellAnnots, 
                                        cellTypeCol = "cellType",
                                        cellNameCol = "CellID",
                                        minGenes = 10,
                                        scalingFactor = 1)
head(profile_matrix)

Derive background at the scale of the normalized data for GeoMx data

Description

Estimates per-datapoint background levels from a GeoMx experiment. In studies with two or more probe pools, different probes will have different background levels. This function provides a convenient way to account for this phenomenon.

Usage

derive_GeoMx_background(norm, probepool, negnames)
derive_GeoMx_background(norm, probepool, negnames)

Arguments

`norm`	Matrix of normalized data, genes in rows and segments in columns. Must include negprobes, and must have rownames.
`probepool`	Vector of probe pool names for each gene, aligned to the rows of "norm".
`negnames`	Names of all negProbes in the dataset. Must be at least one neg.name within each probe pool.

Value

A matrix of expected background values, in the same scale and dimensions as the "norm" argument.

Examples

data(mini_geomx_dataset)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
data(mini_geomx_dataset)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)

Download a cell profile matrix

Description

Download a cell profile matrix from the online library

Usage

download_profile_matrix(species, age_group, matrixname)
download_profile_matrix(species, age_group, matrixname)

Arguments

`species`	species of profile matrix
`age_group`	age_group of profile matrix, if fetal mouse please add the developmental stage separated with /, i.e. Fetal/E14.5
`matrixname`	name of profile matrix

Details

Valid matrices can be found on the github site https://github.com/Nanostring-Biostats/CellProfileLibrary/tree/master

Value

A cell profile matrix, suggested cell groups, and paper metadata

Examples

download_profile_matrix(species = "Human", age_group = "Adult", matrixname = "Colon_HCA")
head(profile_matrix)
print(cellGroups)
print(metadata)
download_profile_matrix(species = "Human", age_group = "Adult", matrixname = "Colon_HCA")
head(profile_matrix)
print(cellGroups)
print(metadata)

Draw coxcomb plots as points in a graphics window

Description

Draws a scatterplot where each point is a circular barplot, intended to show decon results

Usage

florets(
  x,
  y,
  b,
  col = NULL,
  legendwindow = FALSE,
  rescale.by.sqrt = TRUE,
  border = NA,
  add = FALSE,
  cex = 1,
  bty = "n",
  xaxt = "n",
  yaxt = "n",
  xlab = "",
  ylab = "",
  ...
)
florets(
  x,
  y,
  b,
  col = NULL,
  legendwindow = FALSE,
  rescale.by.sqrt = TRUE,
  border = NA,
  add = FALSE,
  cex = 1,
  bty = "n",
  xaxt = "n",
  yaxt = "n",
  xlab = "",
  ylab = "",
  ...
)

Arguments

`x`	Vector of x coordinates
`y`	Vector of y coordinates
`b`	matrix or cell abundances, with columns aligned with the elements of x and y
`col`	vector of colors, aligned to the rows of b.
`legendwindow`	Logical. If TRUE, the function draws a color legend in a new window
`rescale.by.sqrt`	Logical, for whether to rescale b by its square root to make value proportional to shape area, not shape length.
`border`	Color of pie segment border, defauls to NA/none
`add`	Logical. If TRUE, the function draws florets atop an existing graphics device (TRUE) or call a new device (FALSE).
`cex`	Floret size. Florets are scaled relative to the range of x and y; this further scales up or down.
`bty`	bty argument passed to plot()
`xaxt`	xaxt argument passed to plot()
`yaxt`	yaxt argument passed to plot()
`xlab`	xlab, defaults to ""
`ylab`	ylab, defaults to ""
`...`	additional arguments passed to plot()

Value

Draws a coxcomb plot, returns no data.

Examples

data(mini_geomx_dataset)
data(safeTME)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# draw florets:
florets(
  x = mini_geomx_dataset$annot$x,
  y = mini_geomx_dataset$annot$y,
  b = res0$beta, cex = 2
)
data(mini_geomx_dataset)
data(safeTME)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# draw florets:
florets(
  x = mini_geomx_dataset$annot$x,
  y = mini_geomx_dataset$annot$y,
  b = res0$beta, cex = 2
)

Genes' biological variability in immune deconvolution from TCGA.

Description

Genes' biological SDs, as estimated from immune deconvolution from TCGA. Used to weight genes in spatialdecon.

Usage

mean.resid.sd
mean.resid.sd

Format

A named vector giving SDs of 1179 genes.

Estimate a tumor-specific profile and merge it with the pre-specified cell profile matrix (X)

Description

Given the input of "tumor-only" AOI's, estimates an collection of tumor-specific expression profiles and merges them with the immune cell expression training matrix. The process:

log2/normalized data from tumor-only AOIs is clustered with hclust, and cutree() is used to define clusters.
2. Each cluster's geomean profile is merged into the immune cell profile matrix.

Usage

mergeTumorIntoX(norm, bg, pure_tumor_ids, X, K = 10)
mergeTumorIntoX(norm, bg, pure_tumor_ids, X, K = 10)

Arguments

`norm`	matrix of normalized data
`bg`	matrix of expected background, on the scale of norm.
`pure_tumor_ids`	Vector identifying columns of norm that are pure tumor. Can be indices, logicals or column names.
`X`	The training matrix
`K`	the number of clusters to fit

Value

an updated X matrix with new columns, "tumor.1", "tumor.2", ...

Examples

data(mini_geomx_dataset)
data(safeTME)
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
safeTME.with.tumor <- mergeTumorIntoX(
  norm = mini_geomx_dataset$norm,
  bg = mini_geomx_dataset$bg,
  pure_tumor_ids = mini_geomx_dataset$annot$AOI.name == "Tumor",
  X = safeTME,
  K = 3
)
data(mini_geomx_dataset)
data(safeTME)
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
safeTME.with.tumor <- mergeTumorIntoX(
  norm = mini_geomx_dataset$norm,
  bg = mini_geomx_dataset$bg,
  pure_tumor_ids = mini_geomx_dataset$annot$AOI.name == "Tumor",
  X = safeTME,
  K = 3
)

Small example GeoMx data

Description

A miniature GeoMx dataset used by the spatialdecon examples.

Usage

mini_geomx_dataset
mini_geomx_dataset

Format

A list with the following elements:

normalized: normalized data matrix
raw: raw data matrix
annot: AOI annotation data frame

Mini human colon single cell dataset

Description

Random 250 cells and most informative genes (CV > 10) between cell types from Kinchen, J. et al. Structural Remodeling of the Human Colonic Mesenchyme in Inflammatory Bowel Disease. Cell 175, 372-386.e17 (2018).

Usage

mini_singleCell_dataset
mini_singleCell_dataset

Format

A list with the following elements:

mtx: sparse count matrix
annots: cell type annotation data frame

Large example GeoMx data

Description

A GeoMx dataset with dense AOIs gridded over a NSCLC tumor. Each AOI is split into tumor and microenvironment segments.

Usage

nsclc
nsclc

Format

GeoMxSet Object

Reverse deconvolution

Description

Performs "reverse deconvolution", modelling each gene expression's ~ cell scores. Returns a matrix of "fitted" expression values, a matrix of residuals, a matrix of reverse decon coefficients for genes * cells.

Usage

reverseDecon(norm, beta, epsilon = NULL)
reverseDecon(norm, beta, epsilon = NULL)

Arguments

`norm`	Matrix of normalized data, with genes in rows and observations in columns
`beta`	Matrix of cell abundance estimates, with cells in rows and observations in columns. Columns are aligned to "norm".
`epsilon`	All y and yhat values are thresholded up to this point when performing decon. Essentially says, "ignore variability in counts below this threshold."

Value

A list:

coefs, a matrix of coefficients for genes * cells, where element i,j is interpreted as "every unit increase in cell score j is expected to increase expression of gene i by _".
yhat, a matrix of fitted values, in the same dimension as norm
resids, a matrix of log2-scale residuals from the reverse decon fit, in the same dimension as norm
cors, a vector giving each gene's correlation between fitted and observed expression
resid.sd, a vector of each gene's residual SD, a metric of how much variability genes have independend of cell mixing.

Examples

data(mini_geomx_dataset)
data(safeTME)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# run reverse decon:
rdecon <- reverseDecon(
  norm = mini_geomx_dataset$norm,
  beta = res0$beta
)
data(mini_geomx_dataset)
data(safeTME)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# run reverse decon:
rdecon <- reverseDecon(
  norm = mini_geomx_dataset$norm,
  beta = res0$beta
)

Run collapseCellTypes

Description

Runs collapseCellTypes from an S4 object

Given the input of an SpatialDecon result output and a list of which cell types to combine, returns a reshaped deconvolution result object with the specified cell types merged.

Usage

runCollapseCellTypes(object, ...)

## S4 method for signature 'NanoStringGeoMxSet'
runCollapseCellTypes(object, matching = NULL)
runCollapseCellTypes(object, ...)

## S4 method for signature 'NanoStringGeoMxSet'
runCollapseCellTypes(object, matching = NULL)

Arguments

`object`	An S4 object such as a GeoMxSet object
`...`	Arguments passed to collapseCellTypes
`matching`	A list object holding the mapping from beta's cell names to official cell names. See str(safeTME.matches) for an example.

Value

A reshaped deconvolution result object

Examples

library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

target_demoData <- normalize(target_demoData, "quant")
                
# run basic decon:
res0 <- runspatialdecon(object = target_demoData,
                        norm_elt = "exprs_norm",
                        raw_elt = "exprs")

# run reverse decon:
target_demoData <- runReverseDecon(object = target_demoData,
                                   norm_elt = "exprs_norm",
                                   beta = pData(res0)$beta)

library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

target_demoData <- normalize(target_demoData, "quant")
                
# run basic decon:
res0 <- runspatialdecon(object = target_demoData,
                        norm_elt = "exprs_norm",
                        raw_elt = "exprs")

# run reverse decon:
target_demoData <- runReverseDecon(object = target_demoData,
                                   norm_elt = "exprs_norm",
                                   beta = pData(res0)$beta)

Apply error model to estimate technical SD from raw counts

Description

Based on raw counts, uses past data to estimate each raw count's log-scale SD from technical noise. Specifies different error models for different platforms.

Usage

runErrorModel(counts, platform = "general")
runErrorModel(counts, platform = "general")

Arguments

`counts`	vector or matrix of raw counts
`platform`	String specifying which platform was used to create "rawCounts". Default to "dsp", for digital spatial profiler/ aka GeoMx. Other options include "ncounter", "rsem", "quantile", and "st" for spatial transcriptomics/visium.

Value

a matrix of log2-scale SDs

Examples

library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

sd_from_noise <- runErrorModel(counts = exprs(target_demoData), platform = "dsp") 
wts <- 1 / sd_from_noise
library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

sd_from_noise <- runErrorModel(counts = exprs(target_demoData), platform = "dsp") 
wts <- 1 / sd_from_noise

Run MergeTumorIntoX

Description

Runs mergeTumorIntoX from an S4 object

A wrapper for applying mergeTumorIntoX to a NanostringGeomxSet object.

Usage

runMergeTumorIntoX(object, ...)

## S4 method for signature 'NanoStringGeoMxSet'
runMergeTumorIntoX(object, X, K = 10, pure_tumor_ids = NULL, norm_elt = NULL)
runMergeTumorIntoX(object, ...)

## S4 method for signature 'NanoStringGeoMxSet'
runMergeTumorIntoX(object, X, K = 10, pure_tumor_ids = NULL, norm_elt = NULL)

Arguments

`object`	An S4 object such as a GeoMxSet object
`...`	Arguments passed to mergeTumorIntoX
`X`	The training matrix
`K`	the number of clusters to fit
`pure_tumor_ids`	Vector identifying columns of norm that are pure tumor. Can be indices, logicals or column names.
`norm_elt`	norm data element in assayData

Value

updated X matrix with new columns, "tumor.1", "tumor.2", ...

Examples

library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

target_demoData <- normalize(target_demoData, "quant")

data(safeTME)
tumor.ids <- as.logical(sample(x = c("TRUE","FALSE"), size = 88, replace = TRUE))
safeTME.with.tumor <- runMergeTumorIntoX(object = target_demoData,
                                         X = safeTME,
                                         K = 3,
                                         pure_tumor_ids = tumor.ids,
                                         norm_elt = "exprs_norm")

library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

target_demoData <- normalize(target_demoData, "quant")

data(safeTME)
tumor.ids <- as.logical(sample(x = c("TRUE","FALSE"), size = 88, replace = TRUE))
safeTME.with.tumor <- runMergeTumorIntoX(object = target_demoData,
                                         X = safeTME,
                                         K = 3,
                                         pure_tumor_ids = tumor.ids,
                                         norm_elt = "exprs_norm")

Run Reversedecon

Description

Runs reversedecon from an S4 object

A wrapper for applying reversedecon to a NanostringGeomxSet object.

Usage

runReverseDecon(object, ...)

## S4 method for signature 'NanoStringGeoMxSet'
runReverseDecon(object, norm_elt = NULL, beta, epsilon = NULL)
runReverseDecon(object, ...)

## S4 method for signature 'NanoStringGeoMxSet'
runReverseDecon(object, norm_elt = NULL, beta, epsilon = NULL)

Arguments

`object`	An S4 object such as a GeoMxSet object
`...`	Arguments passed to reversedecon
`norm_elt`	normalized data element in assayData.
`beta`	Matrix of cell abundance estimates, with cells in columns and observations in rows. Columns are aligned to "norm".
`epsilon`	All y and yhat values are thresholded up to this point when performing decon. Essentially says, "ignore variability in counts below this threshold."

Value

a valid GeoMx S4 object including the following items:

in fData
- coefs, a matrix of coefficients for genes * cells, where element i,j is interpreted as "every unit increase in cell score j is expected to increase expression of gene i by _".
- cors, a vector giving each gene's correlation between fitted and observed expression
- resid.sd, a vector of each gene's residual SD, a metric of how much variability genes have independend of cell mixing.
in assayData
- yhat, a matrix of fitted values, in the same dimension as norm
- resids, a matrix of log2-scale residuals from the reverse decon fit, in the same dimension as norm

Examples

library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

target_demoData <- normalize(target_demoData, "quant")
                
# run basic decon:
res0 <- runspatialdecon(object = target_demoData,
                        norm_elt = "exprs_norm",
                        raw_elt = "exprs")

# run reverse decon:
target_demoData <- runReverseDecon(object = target_demoData,
                                   norm_elt = "exprs_norm",
                                   beta = pData(res0)$beta)

library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

target_demoData <- normalize(target_demoData, "quant")
                
# run basic decon:
res0 <- runspatialdecon(object = target_demoData,
                        norm_elt = "exprs_norm",
                        raw_elt = "exprs")

# run reverse decon:
target_demoData <- runReverseDecon(object = target_demoData,
                                   norm_elt = "exprs_norm",
                                   beta = pData(res0)$beta)

Run spatialdecon

Description

Runs spatialdecon from an S4 object

A wrapper for applying spatialdecon to a NanostringGeomxSet object.

A wrapper for applying spatialdecon to the Spatial data element in a Seurat object. Unlike spatialdecon, which expects a normalized data matrix, this function operates on raw counts. Scaling for total cells

Usage

runspatialdecon(object, ...)

## S4 method for signature 'NanoStringGeoMxSet'
runspatialdecon(
  object,
  X = NULL,
  norm_elt = NULL,
  raw_elt = NULL,
  wts = NULL,
  resid_thresh = 3,
  lower_thresh = 0.5,
  align_genes = TRUE,
  is_pure_tumor = NULL,
  n_tumor_clusters = 10,
  cell_counts = NULL,
  cellmerges = NULL,
  maxit = 1000
)

## S4 method for signature 'Seurat'
runspatialdecon(
  object,
  X = NULL,
  bg = 0.1,
  wts = NULL,
  resid_thresh = 3,
  lower_thresh = 0.5,
  align_genes = TRUE,
  is_pure_tumor = NULL,
  n_tumor_clusters = 10,
  cell_counts = NULL,
  cellmerges = NULL,
  maxit = 1000
)
runspatialdecon(object, ...)

## S4 method for signature 'NanoStringGeoMxSet'
runspatialdecon(
  object,
  X = NULL,
  norm_elt = NULL,
  raw_elt = NULL,
  wts = NULL,
  resid_thresh = 3,
  lower_thresh = 0.5,
  align_genes = TRUE,
  is_pure_tumor = NULL,
  n_tumor_clusters = 10,
  cell_counts = NULL,
  cellmerges = NULL,
  maxit = 1000
)

## S4 method for signature 'Seurat'
runspatialdecon(
  object,
  X = NULL,
  bg = 0.1,
  wts = NULL,
  resid_thresh = 3,
  lower_thresh = 0.5,
  align_genes = TRUE,
  is_pure_tumor = NULL,
  n_tumor_clusters = 10,
  cell_counts = NULL,
  cellmerges = NULL,
  maxit = 1000
)

Arguments

`object`	An S4 object such as a Seurat object that includse a "Spatial" element in the "assays" slot or a GeoMxSet object
`...`	Arguments passed to spatialdecon
`X`	Cell profile matrix. If NULL, the safeTME matrix is used.
`norm_elt`	normalized data element in assayData in NanostringGeomxSet object
`raw_elt`	raw data element in assayData in NanostringGeomxSet object
`wts`	Optional, a matrix of weights.
`resid_thresh`	A scalar, sets a threshold on how extreme individual data points' values can be (in log2 units) before getting flagged as outliers and set to NA.
`lower_thresh`	A scalar. Before log2-scale residuals are calculated, both observed and fitted values get thresholded up to this value. Prevents log2-scale residuals from becoming extreme in points near zero.
`align_genes`	Logical. If TRUE, then Y, X, bg, and wts are row-aligned by shared genes.
`is_pure_tumor`	A logical vector denoting whether each AOI consists of pure tumor. If specified, then the algorithm will derive a tumor expression profile and merge it with the immune profiles matrix.
`n_tumor_clusters`	Number of tumor-specific columns to merge into the cell profile matrix. Has an impact only when is_pure_tumor argument is used to indicate pure tumor AOIs. Takes this many clusters from the pure-tumor AOI data and gets the average expression profile in each cluster. Default 10.
`cell_counts`	Number of cells estimated to be within each sample. If provided alongside norm_factors, then the algorithm will additionally output cell abundance esimtates on the scale of cell counts.
`cellmerges`	A list object holding the mapping from beta's cell names to combined cell names. If left NULL, then defaults to a mapping of granular immune cell definitions to broader categories.
`maxit`	Maximum number of iterations. Default 1000.
`bg`	Expected background counts. Either a scalar applied equally to all points in the count matrix, or a matrix with the same dimensions as the count matrix in GetAssayData(object, assay = "Spatial"). Recommended to use a small non-zero value, default of 0.1.

Value

decon results in list or in GeoMxSet object

For GeoMxSet object, if not given cellmerges and cell_counts, a valid GeoMx S4 object including the following items

In pData
- beta: matrix of cell abundance estimates, cells in rows and observations in columns
- p: matrix of p-values for H0: beta == 0
- t: matrix of t-statistics for H0: beta == 0
- se: matrix of standard errors of beta values
- prop_of_all: rescaling of beta to sum to 1 in each observation
- prop_of_nontumor: rescaling of beta to sum to 1 in each observation, excluding tumor abundance estimates
- sigmas: covariance matrices of each observation's beta estimates
In assayData
- yhat: a matrix of fitted values
- resids: a matrix of residuals from the model fit. (log2(pmax(y, lower_thresh)) - log2(pmax(xb, lower_thresh))).
In experimentData
- SpatialDeconMatrix: the cell profile matrix used in the decon fit.

if given cellmerges, the valid GeoMx S4 object will additionally include the following items

In pData
- beta.granular: cell abundances prior to combining closely-related cell types
- sigma.granular: sigmas prior to combining closely-related cell types

if given cell_counts, the valid GeoMx S4 object will additionally include the following items

In pData
- cell.counts: beta rescaled to estimate cell numbers, based on prop_of_all and nuclei count

if given both cellmerges and cell_counts, the valid GeoMx S4 object will additionally include the following items

In pData
- cell.counts.granular: cell.counts prior to combining closely-related cell types

For Seurat Object, if not given cellmerges and cell_counts, a list including the following items:

beta: matrix of cell abundance estimates, cells in rows and observations in columns
p: matrix of p-values for H0: beta == 0
t: matrix of t-statistics for H0: beta == 0
se: matrix of standard errors of beta values
prop_of_all: rescaling of beta to sum to 1 in each observation
prop_of_nontumor: rescaling of beta to sum to 1 in each observation, excluding tumor abundance estimates
yhat: a matrix of fitted values
resids: a matrix of residuals from the model fit. (log2(pmax(y, lower_thresh)) - log2(pmax(xb, lower_thresh))).
X: the cell profile matrix used in the decon fit.
sigmas: covariance matrices of each observation's beta estimates

if given cellmerges, the list will additionally include the following items

beta.granular: cell abundances prior to combining closely-related cell types
sigma.granular: sigmas prior to combining closely-related cell types

if given cell_counts, the list will additionally include the following items

cell.counts: beta rescaled to estimate cell numbers, based on prop_of_all and nuclei count

if given both cellmerges and cell_counts, the list will additionally include the following items

cell.counts.granular: cell.counts prior to combining closely-related cell types

Examples


## GeoMxSet Object ##
library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

target_demoData <- normalize(target_demoData, "quant")

demoData <- runspatialdecon(object = target_demoData, 
                            norm_elt = "exprs_norm",
                            raw_elt = "exprs")

## Seurat Object ##
# get dataset
con <- gzcon(url("https://github.com/almaan/her2st/raw/master/data/ST-cnts/G1.tsv.gz"))
txt <- readLines(con)
temp <- read.table(textConnection(txt), sep = "\t", header = TRUE, row.names = 1)
# parse data
raw = t(as.matrix(temp))
norm = sweep(raw, 2, colSums(raw), "/") * mean(colSums(raw))
x = as.numeric(substr(rownames(temp), 1, unlist(gregexpr("x", rownames(temp))) - 1))
y = -as.numeric(substr(rownames(temp), 
                 unlist(gregexpr("x", rownames(temp))) + 1, nchar(rownames(temp))))
# put into a seurat object:
andersson_g1 = SeuratObject::CreateSeuratObject(counts = raw, assay="Spatial")
[email protected]$x = x
[email protected]$y = y

res <- runspatialdecon(andersson_g1)
str(res)
## GeoMxSet Object ##
library(GeomxTools)
datadir <- system.file("extdata", "DSP_NGS_Example_Data", package = "GeomxTools")
demoData <- readRDS(file.path(datadir, "/demoData.rds"))

demoData <- shiftCountsOne(demoData)
target_demoData <- aggregateCounts(demoData)

target_demoData <- normalize(target_demoData, "quant")

demoData <- runspatialdecon(object = target_demoData, 
                            norm_elt = "exprs_norm",
                            raw_elt = "exprs")

## Seurat Object ##
# get dataset
con <- gzcon(url("https://github.com/almaan/her2st/raw/master/data/ST-cnts/G1.tsv.gz"))
txt <- readLines(con)
temp <- read.table(textConnection(txt), sep = "\t", header = TRUE, row.names = 1)
# parse data
raw = t(as.matrix(temp))
norm = sweep(raw, 2, colSums(raw), "/") * mean(colSums(raw))
x = as.numeric(substr(rownames(temp), 1, unlist(gregexpr("x", rownames(temp))) - 1))
y = -as.numeric(substr(rownames(temp), 
                 unlist(gregexpr("x", rownames(temp))) + 1, nchar(rownames(temp))))
# put into a seurat object:
andersson_g1 = SeuratObject::CreateSeuratObject(counts = raw, assay="Spatial")
andersson_g1@meta.data$x = x
andersson_g1@meta.data$y = y

res <- runspatialdecon(andersson_g1)
str(res)

SafeTME matrix

Description

A matrix of expression profiles of 906 genes over 18 cell types.

Usage

safeTME
safeTME

Format

A matrix with 906 genes (rows) and 18 cell types (columns)

Mapping from granularly-defined cell populations to broaded cell populations

Description

Mapping from granularly-defined cell populations to broaded cell populations, for use by the convertCellTypes function.

Usage

safeTME.matches
safeTME.matches

Format

A list. Each element of the list contains the granular cell types that roll up to a single coarse cell type.

Mixed cell deconvolution of spatiall-resolved gene expression data

Description

Runs the spatialdecon algorithm with added optional functionalities. Workflow is:

compute weights from raw data
Estimate a tumor profile and merge it into the cell profiles matrix
run deconvolution once
remove poorly-fit genes from first round of decon
re-run decon with cleaned-up gene set
combine closely-related cell types
compute p-values
rescale abundance estimates, to proportions of total, proportions of immune, cell counts

Usage

spatialdecon(
  norm,
  bg,
  X = NULL,
  raw = NULL,
  wts = NULL,
  resid_thresh = 3,
  lower_thresh = 0.5,
  align_genes = TRUE,
  is_pure_tumor = NULL,
  n_tumor_clusters = 10,
  cell_counts = NULL,
  cellmerges = NULL,
  maxit = 1000
)
spatialdecon(
  norm,
  bg,
  X = NULL,
  raw = NULL,
  wts = NULL,
  resid_thresh = 3,
  lower_thresh = 0.5,
  align_genes = TRUE,
  is_pure_tumor = NULL,
  n_tumor_clusters = 10,
  cell_counts = NULL,
  cellmerges = NULL,
  maxit = 1000
)

Arguments

`norm`	p-length expression vector or p * N expression matrix - the actual (linear-scale) data
`bg`	Same dimension as norm: the background expected at each data point.
`X`	Cell profile matrix. If NULL, the safeTME matrix is used.
`raw`	Optional for using an error model to weight the data points. p-length expression vector or p * N expression matrix - the raw (linear-scale) data
`wts`	Optional, a matrix of weights.
`resid_thresh`	A scalar, sets a threshold on how extreme individual data points' values can be (in log2 units) before getting flagged as outliers and set to NA.
`lower_thresh`	A scalar. Before log2-scale residuals are calculated, both observed and fitted values get thresholded up to this value. Prevents log2-scale residuals from becoming extreme in points near zero.
`align_genes`	Logical. If TRUE, then Y, X, bg, and wts are row-aligned by shared genes.
`is_pure_tumor`	A logical vector denoting whether each AOI consists of pure tumor. If specified, then the algorithm will derive a tumor expression profile and merge it with the immune profiles matrix.
`n_tumor_clusters`	Number of tumor-specific columns to merge into the cell profile matrix. Has an impact only when is_pure_tumor argument is used to indicate pure tumor AOIs. Takes this many clusters from the pure-tumor AOI data and gets the average expression profile in each cluster. Default 10.
`cell_counts`	Number of cells estimated to be within each sample. If provided alongside norm_factors, then the algorithm will additionally output cell abundance esimtates on the scale of cell counts.
`cellmerges`	A list object holding the mapping from beta's cell names to combined cell names. If left NULL, then defaults to a mapping of granular immune cell definitions to broader categories.
`maxit`	Maximum number of iterations. Default 1000.

Value

a list:

beta: matrix of cell abundance estimates, cells in rows and observations in columns
sigmas: covariance matrices of each observation's beta estimates
p: matrix of p-values for H0: beta == 0
t: matrix of t-statistics for H0: beta == 0
se: matrix of standard errors of beta values
prop_of_all: rescaling of beta to sum to 1 in each observation
prop_of_nontumor: rescaling of beta to sum to 1 in each observation, excluding tumor abundance estimates
cell.counts: beta rescaled to estimate cell numbers, based on prop_of_all and nuclei count
beta.granular: cell abundances prior to combining closely-related cell types
sigma.granular: sigmas prior to combining closely-related cell types
cell.counts.granular: cell.counts prior to combining closely-related cell types
resids: a matrix of residuals from the model fit. (log2(pmax(y, lower_thresh)) - log2(pmax(xb, lower_thresh))).
X: the cell profile matrix used in the decon fit.

Examples

data(mini_geomx_dataset)
data(safeTME)
data(safeTME.matches)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# run decon with bells and whistles:
res <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME,
  cellmerges = safeTME.matches,
  cell_counts = mini_geomx_dataset$annot$nuclei,
  is_pure_tumor = mini_geomx_dataset$annot$AOI.name == "Tumor"
)
data(mini_geomx_dataset)
data(safeTME)
data(safeTME.matches)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# run decon with bells and whistles:
res <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME,
  cellmerges = safeTME.matches,
  cell_counts = mini_geomx_dataset$annot$nuclei,
  is_pure_tumor = mini_geomx_dataset$annot$AOI.name == "Tumor"
)

Barplot of abundance estimates

Description

Draw barplot of the "betas" from a decon fit

Usage

TIL_barplot(mat, draw_legend = FALSE, main = "", col = NULL, ...)
TIL_barplot(mat, draw_legend = FALSE, main = "", col = NULL, ...)

Arguments

`mat`	Matrix of cell proportions or abundances, in the same dimensions output by spatialdecon (cells in rows, observations in columns). User is free to re-order columns/observations in whatever order is best for display.
`draw_legend`	Logical. If TRUE, the function draws a legend in a new plot frame.
`main`	Title for barplot
`col`	Vector of colors for cell types. Defaults to pre-set colors for the safeTME cell types.
`...`	Arguments passed to barplot()

Value

Draws a barplot.

Examples

data(mini_geomx_dataset)
data(safeTME)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# run barplot:
TIL_barplot(mat = res0$beta)
# run barplot and draw a color legend
TIL_barplot(mat = res0$beta, draw_legend = TRUE)
data(mini_geomx_dataset)
data(safeTME)
# estimate background:
mini_geomx_dataset$bg <- derive_GeoMx_background(
  norm = mini_geomx_dataset$normalized,
  probepool = rep(1, nrow(mini_geomx_dataset$normalized)),
  negnames = "NegProbe"
)
# run basic decon:
res0 <- spatialdecon(
  norm = mini_geomx_dataset$normalized,
  bg = mini_geomx_dataset$bg,
  X = safeTME
)
# run barplot:
TIL_barplot(mat = res0$beta)
# run barplot and draw a color legend
TIL_barplot(mat = res0$beta, draw_legend = TRUE)