Title: | From geospatial to spatial omics |
---|---|
Description: | SpatialFeatureExperiment (SFE) is a new S4 class for working with spatial single-cell genomics data. The voyager package implements basic exploratory spatial data analysis (ESDA) methods for SFE. Univariate methods include univariate global spatial ESDA methods such as Moran's I, permutation testing for Moran's I, and correlograms. Bivariate methods include Lee's L and cross variogram. Multivariate methods include MULTISPATI PCA and multivariate local Geary's C recently developed by Anselin. The Voyager package also implements plotting functions to plot SFE data and ESDA results. |
Authors: | Lambda Moses [aut, cre] , Alik Huseynov [aut] , Kayla Jackson [aut] , Laura Luebbert [aut] , Lior Pachter [aut, rev] |
Maintainer: | Lambda Moses <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.9.2 |
Built: | 2024-12-25 03:29:02 UTC |
Source: | https://github.com/bioc/Voyager |
These functions perform bivariate spatial analysis. In this version, the
bivariate global method supported are lee
,
lee.mc
, and lee.test
from spdep
, and cross
variograms from gstat
(use cross_variogram
and
cross_variogram_map
for type
argument, see
variogram-internal
). Global Lee statistic is computed by my own
implementation that is much faster than that in spdep
. Bivariate local
methods supported are lee
(use locallee
for type
argument) and localmoran_bv
a bivariate version of Local Moran
in spdep
.
## S4 method for signature 'ANY' calculateBivariate( x, y = NULL, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = "BH", name = NULL, ... ) ## S4 method for signature 'SpatialFeatureExperiment' calculateBivariate( x, type, feature1, feature2 = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = "BH", swap_rownames = NULL, name = NULL, ... ) runBivariate( x, type, feature1, feature2 = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, p.adjust.method = "BH", name = NULL, overwrite = FALSE, ... )
## S4 method for signature 'ANY' calculateBivariate( x, y = NULL, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = "BH", name = NULL, ... ) ## S4 method for signature 'SpatialFeatureExperiment' calculateBivariate( x, type, feature1, feature2 = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = "BH", swap_rownames = NULL, name = NULL, ... ) runBivariate( x, type, feature1, feature2 = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, p.adjust.method = "BH", name = NULL, overwrite = FALSE, ... )
x |
A numeric matrix whose rows are features/genes, or a numeric vector
(then |
y |
A numeric matrix whose rows are features/genes, or a numeric vector. Bivariate statics will be computed for all pairwise combinations of row names of x and row names of y, except in cross variogram where combinations within x and y are also computed. |
type |
An |
listw |
Weighted neighborhood graph as a |
coords_df |
A |
BPPARAM |
A |
zero.policy |
default |
returnDF |
Logical, when the results are not added to a SFE object,
whether the results should be formatted as a |
p.adjust.method |
Method to correct for multiple testing, passed to
|
name |
Name to use to store the results, defaults to the name in the
|
... |
Other arguments passed to S4 method (for convenience wrappers like
|
feature1 |
ID or symbol of the first genes in SFE object, for the
argument |
feature2 |
ID or symbol of the second genes in SFE object, for the
argument |
colGraphName |
Name of the listw graph in the SFE object that
corresponds to entities represented by columns of the gene count matrix.
Use |
colGeometryName |
Name of a |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
exprs_values |
Integer scalar or string indicating which assay of x contains the expression values. |
swap_rownames |
Column name of |
overwrite |
Logical, whether to overwrite existing results with the same
name. Defaults to |
The calculateBivariate
function returns a correlation matrix
for global Lee, and the results for the each pair of genes for other
methods. Global results are not stored in the SFE object. Some methods
return one result for each pair of genes, while some return pairwise
results for more than 2 genes jointly. Local results are stored in the
localResults
field in the SFE object, with name the
concatenation the two gene names separated by two underscores (__
).
library(SFEData) library(scater) library(scran) library(SpatialFeatureExperiment) library(SpatialExperiment) sfe <- McKellarMuscleData() sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) gs <- modelGeneVar(sfe) hvgs <- getTopHVGs(gs, fdr.threshold = 0.01) g <- colGraph(sfe, "visium") <- findVisiumGraph(sfe) # Matrix method mat <- logcounts(sfe)[hvgs[1:5],] df <- df2sf(spatialCoords(sfe), spatialCoordsNames(sfe)) out <- calculateBivariate(mat, type = "lee", listw = g) out <- calculateBivariate(mat, type = "cross_variogram", coords_df = df) # SFE method out <- calculateBivariate(sfe, type = "lee", feature1 = c("Myh1", "Myh2", "Csrp3"), swap_rownames = "symbol") out2 <- calculateBivariate(sfe, type = "lee.test", feature1 = "Myh1", feature2 = "Myh2", swap_rownames = "symbol") sfe <- runBivariate(sfe, type = "locallee", feature1 = "Myh1", feature2 = "Myh2", swap_rownames = "symbol")
library(SFEData) library(scater) library(scran) library(SpatialFeatureExperiment) library(SpatialExperiment) sfe <- McKellarMuscleData() sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) gs <- modelGeneVar(sfe) hvgs <- getTopHVGs(gs, fdr.threshold = 0.01) g <- colGraph(sfe, "visium") <- findVisiumGraph(sfe) # Matrix method mat <- logcounts(sfe)[hvgs[1:5],] df <- df2sf(spatialCoords(sfe), spatialCoordsNames(sfe)) out <- calculateBivariate(mat, type = "lee", listw = g) out <- calculateBivariate(mat, type = "cross_variogram", coords_df = df) # SFE method out <- calculateBivariate(sfe, type = "lee", feature1 = c("Myh1", "Myh2", "Csrp3"), swap_rownames = "symbol") out2 <- calculateBivariate(sfe, type = "lee.test", feature1 = "Myh1", feature2 = "Myh2", swap_rownames = "symbol") sfe <- runBivariate(sfe, type = "locallee", feature1 = "Myh1", feature2 = "Myh2", swap_rownames = "symbol")
These functions perform multivariate spatial data analysis, usually spatially informed dimension reduction.
## S4 method for signature 'ANY,SFEMethod' calculateMultivariate( x, type, listw = NULL, transposed = FALSE, zero.policy = TRUE, p.adjust.method = "BH", ... ) ## S4 method for signature 'ANY,character' calculateMultivariate(x, type, listw = NULL, transposed = FALSE, ...) ## S4 method for signature 'SpatialFeatureExperiment,ANY' calculateMultivariate( x, type, colGraphName = 1L, subset_row = NULL, exprs_values = "logcounts", sample_action = c("joint", "separate"), BPPARAM = SerialParam(), ... ) runMultivariate( x, type, colGraphName = 1L, subset_row = NULL, exprs_values = "logcounts", sample_action = c("joint", "separate"), BPPARAM = SerialParam(), name = NULL, dest = c("reducedDim", "colData"), ... )
## S4 method for signature 'ANY,SFEMethod' calculateMultivariate( x, type, listw = NULL, transposed = FALSE, zero.policy = TRUE, p.adjust.method = "BH", ... ) ## S4 method for signature 'ANY,character' calculateMultivariate(x, type, listw = NULL, transposed = FALSE, ...) ## S4 method for signature 'SpatialFeatureExperiment,ANY' calculateMultivariate( x, type, colGraphName = 1L, subset_row = NULL, exprs_values = "logcounts", sample_action = c("joint", "separate"), BPPARAM = SerialParam(), ... ) runMultivariate( x, type, colGraphName = 1L, subset_row = NULL, exprs_values = "logcounts", sample_action = c("joint", "separate"), BPPARAM = SerialParam(), name = NULL, dest = c("reducedDim", "colData"), ... )
x |
A numeric matrix whose rows are features/genes, or a
|
type |
An |
listw |
Weighted neighborhood graph as a |
transposed |
Logical, whether the matrix has genes in columns and cells in rows. |
zero.policy |
default |
p.adjust.method |
Method to correct for multiple testing, passed to
|
... |
Extra arguments passed to the specific multivariate method. For
example, see |
colGraphName |
Name of the listw graph in the SFE object that
corresponds to entities represented by columns of the gene count matrix.
Use |
subset_row |
Vector specifying the subset of features to use for dimensionality reduction. This can be a character vector of row names, an integer vector of row indices or a logical vector. |
exprs_values |
Integer scalar or string indicating which assay of x contains the expression values. |
sample_action |
Character, either "joint" or "separate". Spatial methods
depend on the spatial coordinates and/or spatial neighborhood graph, which
is why |
BPPARAM |
A |
name |
Name to use to store the results, defaults to the name in the
|
dest |
Character, either "reducedDim" or "colData". If the output of the
multivariate method is a matrix or array, as in spatially informed
dimension reduction, then the only option is "reducedDim", so the results
will be stored in |
For the argument type
, this package supports "multispati" for
MULTISPATI PCA, "localC_multi" for a multivariate generalization of Geary's
C, "localC_perm_multi" for the multivariate Geary's C with permutation
testing, and "gwpca" for geographically weighted PCA.
In calculateMultivariate
, a matrix for cell embeddings whose
attributes include loadings and eigenvalues if relevant, ready to be added
to the SFE object with reducedDim
setter. For run*
, a
SpatialFeatureExperiment
object with the results added. See Details
for where the results are stored.
Dray, S., Said, S. and Debias, F. (2008) Spatial ordination of vegetation data using a generalization of Wartenberg's multivariate spatial correlation. Journal of vegetation science, 19, 45-56.
Anselin, L. (2019), A Local Indicator of Multivariate Spatial Association: Extending Geary's c. Geogr Anal, 51: 133-150. doi:10.1111/gean.12164
# example code library(SFEData) library(scater) library(scran) sfe <- McKellarMuscleData() sfe <- logNormCounts(sfe) gvs <- modelGeneVar(sfe) hvgs <- getTopHVGs(gvs, fdr.threshold = 0.05) colGraph(sfe, "visium") <- findVisiumGraph(sfe) sfe <- runMultivariate(sfe, "multispati", subset_row = hvgs)
# example code library(SFEData) library(scater) library(scran) sfe <- McKellarMuscleData() sfe <- logNormCounts(sfe) gvs <- modelGeneVar(sfe) hvgs <- getTopHVGs(gvs, fdr.threshold = 0.05) colGraph(sfe, "visium") <- findVisiumGraph(sfe) sfe <- runMultivariate(sfe, "multispati", subset_row = hvgs)
These functions compute univariate spatial statistics, both global and local,
on matrices, data frames, and SFE objects. For SFE objects, the statistics
can be computed for numeric columns of colData
, colGeometries
,
and annotGeometries
, and the results are stored within the SFE object.
calculateMoransI
and runMoransI
are convenience wrappers for
calculateUnivariate
and runUnivariate
respectively.
## S4 method for signature 'ANY,SFEMethod' calculateUnivariate( x, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = "BH", name = NULL, ... ) ## S4 method for signature 'ANY,character' calculateUnivariate( x, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = "BH", name = NULL, ... ) ## S4 method for signature 'SpatialFeatureExperiment,ANY' calculateUnivariate( x, type, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, include_self = FALSE, p.adjust.method = "BH", swap_rownames = NULL, name = NULL, ... ) ## S4 method for signature 'ANY' calculateMoransI( x, ..., BPPARAM = SerialParam(), zero.policy = NULL, name = "moran" ) ## S4 method for signature 'SpatialFeatureExperiment' calculateMoransI( x, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, include_self = FALSE, p.adjust.method = "BH", swap_rownames = NULL, name = NULL, ... ) colDataUnivariate( x, type, features, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) colDataMoransI( x, features, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) colGeometryUnivariate( x, type, features, colGeometryName = 1L, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) colGeometryMoransI( x, features, colGeometryName = 1L, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) annotGeometryUnivariate( x, type, features, annotGeometryName = 1L, annotGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) annotGeometryMoransI( x, features, annotGeometryName = 1L, annotGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) runUnivariate( x, type, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, overwrite = FALSE, ... ) runMoransI( x, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) reducedDimUnivariate( x, type, dimred = 1L, components = 1L, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) reducedDimMoransI( x, dimred = 1L, components = 1L, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... )
## S4 method for signature 'ANY,SFEMethod' calculateUnivariate( x, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = "BH", name = NULL, ... ) ## S4 method for signature 'ANY,character' calculateUnivariate( x, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = "BH", name = NULL, ... ) ## S4 method for signature 'SpatialFeatureExperiment,ANY' calculateUnivariate( x, type, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, include_self = FALSE, p.adjust.method = "BH", swap_rownames = NULL, name = NULL, ... ) ## S4 method for signature 'ANY' calculateMoransI( x, ..., BPPARAM = SerialParam(), zero.policy = NULL, name = "moran" ) ## S4 method for signature 'SpatialFeatureExperiment' calculateMoransI( x, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, include_self = FALSE, p.adjust.method = "BH", swap_rownames = NULL, name = NULL, ... ) colDataUnivariate( x, type, features, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) colDataMoransI( x, features, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) colGeometryUnivariate( x, type, features, colGeometryName = 1L, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) colGeometryMoransI( x, features, colGeometryName = 1L, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) annotGeometryUnivariate( x, type, features, annotGeometryName = 1L, annotGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) annotGeometryMoransI( x, features, annotGeometryName = 1L, annotGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) runUnivariate( x, type, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, overwrite = FALSE, ... ) runMoransI( x, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", exprs_values = "logcounts", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) reducedDimUnivariate( x, type, dimred = 1L, components = 1L, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... ) reducedDimMoransI( x, dimred = 1L, components = 1L, colGraphName = 1L, sample_id = "all", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = "BH", name = NULL, ... )
x |
A numeric matrix whose rows are features/genes, or a
|
type |
An |
listw |
Weighted neighborhood graph as a |
coords_df |
A |
BPPARAM |
A |
zero.policy |
default |
returnDF |
Logical, when the results are not added to a SFE object,
whether the results should be formatted as a |
p.adjust.method |
Method to correct for multiple testing, passed to
|
name |
Name to use to store the results, defaults to the name in the
|
... |
Other arguments passed to S4 method (for convenience wrappers like
|
features |
Genes ( |
colGraphName |
Name of the listw graph in the SFE object that
corresponds to entities represented by columns of the gene count matrix.
Use |
colGeometryName |
Name of a |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
exprs_values |
Integer scalar or string indicating which assay of x contains the expression values. |
include_self |
Logical, whether the spatial neighborhood graph should
include edges from each location to itself. This is for Getis-Ord Gi* as in
|
swap_rownames |
Column name of |
annotGeometryName |
Name of a |
annotGraphName |
Name of the listw graph in the SFE object that
corresponds to the |
overwrite |
Logical, whether to overwrite existing results with the same
name. Defaults to |
dimred |
Name of a dimension reduction, can be seen in
|
components |
Numeric vector of which components in the dimension reduction to compute spatial statistics on. |
Most univariate methods in the package spdep
are supported here. These
methods are global, meaning returning one result for all spatial locations in
the dataset: moran
, geary
,
moran.mc
, geary.mc
,
moran.test
, geary.test
,
globalG.test
, sp.correlogram
. The
variogram and variogram map from the gstat
package are also supported.
The following methods are local, meaning each location has its own results:
moran.plot
, localmoran
,
localmoran_perm
, localC
,
localC_perm
, localG
,
localG_perm
, LOSH
,
LOSH.mc
, LOSH.cs
. The
GWmodel::gwss
method will be supported soon, but is not supported yet.
Global results for genes are stored in rowData
. For colGeometry
and annotGeometry
, the results are added to an attribute of the data
frame called featureData
, which is a DataFrame analogous to
rowData
for the gene count matrix, and can be accessed with the
geometryFeatureData
function. New column names in
featureData
would follow the same rules as in rowData
. For
colData
, the results can be accessed with the
colFeatureData
function.
Local results are stored in the field localResults
field of the SFE
object, which can be accessed with
localResults
or
localResult
. If the results have
p-values, then -log10 p and adjusted -log10 p are added. Note that in the
multiple testing correction, p.adjustSP
is used.
When the results are stored in the SFE object, parameters used to compute the
results as well as to construct the spatial neighborhood graph are also
added. For localResults
, the parameters are added to the metadata
field params
of the localResults
sorted by name
, which
defaults to the name in the SFEMethod
object as specified in the
type
argument. For global methods, parameters for results for genes
are in the metadata of rowData(x)
, organized by name
(metadata(rowData(x))$params[[name]]
). For colData
, the global
method parameters are stored in metadata of colData
in the field
params
(metadata(colData(x))$params[[name]]
). For geometries,
the global method parameters are in an attribute named "params" of the
corresponding sf
data frame (attr(df, "params")[[name]]
).
In calculateUnivariate
, if returnDF = TRUE
, then a
DataFrame
, otherwise a list each element of which is the results for
each feature. For run*
, a SpatialFeatureExperiment
object
with the results added. See Details for where the results are stored.
Cliff, A. D., Ord, J. K. 1981 Spatial processes, Pion, p. 17.
Anselin, L. (1995), Local Indicators of Spatial Association-LISA. Geographical Analysis, 27: 93-115. doi:10.1111/j.1538-4632.1995.tb00338.x
Ord, J. K., & Getis, A. 2012. Local spatial heteroscedasticity (LOSH), The Annals of Regional Science, 48 (2), 529-539.
Ord, J. K. and Getis, A. 1995 Local spatial autocorrelation statistics: distributional issues and an application. Geographical Analysis, 27, 286-306
library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) features_use <- rownames(sfe)[1:5] # Moran's I moran_results <- calculateMoransI(sfe, features = features_use, colGraphName = "visium", exprs_values = "counts" ) # This does not advocate for computing Moran's I on raw counts. # Just an example for function usage. sfe <- runMoransI(sfe, features = features_use, colGraphName = "visium", exprs_values = "counts" ) # Look at the results head(rowData(sfe)) # Local Moran's I sfe <- runUnivariate(sfe, type = "localmoran", features = features_use, colGraphName = "visium", exprs_values = "counts" ) head(localResult(sfe, "localmoran", features_use[1])) # For colData sfe <- colDataUnivariate(sfe, type = "localmoran", features = "nCounts", colGraphName = "visium" ) head(localResult(sfe, "localmoran", "nCounts")) # For annotGeometries annotGraph(sfe, "myofiber_tri2nb") <- findSpatialNeighbors(sfe, type = "myofiber_simplified", MARGIN = 3L, method = "tri2nb", dist_type = "idw", zero.policy = TRUE ) sfe <- annotGeometryUnivariate(sfe, type = "localG", features = "area", annotGraphName = "myofiber_tri2nb", annotGeometryName = "myofiber_simplified", zero.policy = TRUE ) head(localResult(sfe, "localG", "area", annotGeometryName = "myofiber_simplified" ))
library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) features_use <- rownames(sfe)[1:5] # Moran's I moran_results <- calculateMoransI(sfe, features = features_use, colGraphName = "visium", exprs_values = "counts" ) # This does not advocate for computing Moran's I on raw counts. # Just an example for function usage. sfe <- runMoransI(sfe, features = features_use, colGraphName = "visium", exprs_values = "counts" ) # Look at the results head(rowData(sfe)) # Local Moran's I sfe <- runUnivariate(sfe, type = "localmoran", features = features_use, colGraphName = "visium", exprs_values = "counts" ) head(localResult(sfe, "localmoran", features_use[1])) # For colData sfe <- colDataUnivariate(sfe, type = "localmoran", features = "nCounts", colGraphName = "visium" ) head(localResult(sfe, "localmoran", "nCounts")) # For annotGeometries annotGraph(sfe, "myofiber_tri2nb") <- findSpatialNeighbors(sfe, type = "myofiber_simplified", MARGIN = 3L, method = "tri2nb", dist_type = "idw", zero.policy = TRUE ) sfe <- annotGeometryUnivariate(sfe, type = "localG", features = "area", annotGraphName = "myofiber_tri2nb", annotGeometryName = "myofiber_simplified", zero.policy = TRUE ) head(localResult(sfe, "localG", "area", annotGeometryName = "myofiber_simplified" ))
Cluster the correlograms to find patterns in length scales of spatial autocorrelation. All the correlograms clustered must be computed with the same method and have the same number of lags. Correlograms are clustered jointly across samples.
clusterCorrelograms( sfe, features, BLUSPARAM, sample_id = "all", method = "I", colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = "sp.correlogram" )
clusterCorrelograms( sfe, features, BLUSPARAM, sample_id = "all", method = "I", colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = "sp.correlogram" )
sfe |
A |
features |
Features whose correlograms to cluster. |
BLUSPARAM |
A BlusterParam object specifying the algorithm to use. |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
method |
"corr" for correlation, "I" for Moran's I, "C" for Geary's C |
colGeometryName |
Name of colGeometry from which to look for features. |
annotGeometryName |
Name of annotGeometry from which to look for features. |
reducedDimName |
Name of a dimension reduction, can be seen in
|
swap_rownames |
Column name of |
name |
Name under which the correlogram results are stored, which is by default "sp.correlogram". |
A data frame with 3 columns: feature
for the features,
cluster
a factor for cluster membership of the features within each
sample, and sample_id
for the sample.
library(SpatialFeatureExperiment) library(SFEData) library(bluster) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) inds <- c(1, 3, 4, 5) sfe <- runUnivariate(sfe, type = "sp.correlogram", features = rownames(sfe)[inds], exprs_values = "counts", order = 5 ) clust <- clusterCorrelograms(sfe, features = rownames(sfe)[inds], BLUSPARAM = KmeansParam(2) )
library(SpatialFeatureExperiment) library(SFEData) library(bluster) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) inds <- c(1, 3, 4, 5) sfe <- runUnivariate(sfe, type = "sp.correlogram", features = rownames(sfe)[inds], exprs_values = "counts", order = 5 ) clust <- clusterCorrelograms(sfe, features = rownames(sfe)[inds], BLUSPARAM = KmeansParam(2) )
The Moran plot plots the value at each location on the x axis, and the average of the neighbors of each locations on the y axis. Sometimes clusters can be seen on the Moran plot, indicating different types of neighborhoods.
clusterMoranPlot( sfe, features, BLUSPARAM, sample_id = "all", colGeometryName = NULL, annotGeometryName = NULL, swap_rownames = NULL )
clusterMoranPlot( sfe, features, BLUSPARAM, sample_id = "all", colGeometryName = NULL, annotGeometryName = NULL, swap_rownames = NULL )
sfe |
A |
features |
Features whose Moran plot are to be cluster. Features whose Moran plots have not been computed will be skipped, with a warning. |
BLUSPARAM |
A BlusterParam object specifying the algorithm to use. |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
colGeometryName |
Name of colGeometry from which to look for features. |
annotGeometryName |
Name of annotGeometry from which to look for features. |
swap_rownames |
Column name of |
A data frame each column of which is a factor for cluster membership of each feature. The column names are the features.
library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) library(bluster) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) # Compute moran plot sfe <- runUnivariate(sfe, type = "moran.plot", features = rownames(sfe)[1], exprs_values = "counts" ) clusts <- clusterMoranPlot(sfe, rownames(sfe)[1], BLUSPARAM = KmeansParam(2) )
library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) library(bluster) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) # Compute moran plot sfe <- runUnivariate(sfe, type = "moran.plot", features = rownames(sfe)[1], exprs_values = "counts" ) clusts <- clusterMoranPlot(sfe, rownames(sfe)[1], BLUSPARAM = KmeansParam(2) )
This function clusters variograms of features across samples to find patterns in decays in spatial autocorrelation. The fitted variograms are clustered as different samples can have different distance bins.
clusterVariograms( sfe, features, BLUSPARAM, n = 20, sample_id = "all", colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = "variogram" )
clusterVariograms( sfe, features, BLUSPARAM, n = 20, sample_id = "all", colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = "variogram" )
sfe |
A |
features |
Features whose correlograms to cluster. |
BLUSPARAM |
A BlusterParam object specifying the algorithm to use. |
n |
Number of points on the fitted variogram line. |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
colGeometryName |
Name of colGeometry from which to look for features. |
annotGeometryName |
Name of annotGeometry from which to look for features. |
reducedDimName |
Name of a dimension reduction, can be seen in
|
swap_rownames |
Column name of |
name |
Name under which the correlogram results are stored, which is by default "sp.correlogram". |
A data frame with 3 columns: feature
for the features,
cluster
a factor for cluster membership of the features within each
sample, and sample_id
for the sample.
library(SFEData) library(scater) library(bluster) library(Matrix) sfe <- McKellarMuscleData() sfe <- logNormCounts(sfe) # Just the highly expressed genes gs <- order(Matrix::rowSums(counts(sfe)), decreasing = TRUE)[1:10] genes <- rownames(sfe)[gs] sfe <- runUnivariate(sfe, "variogram", features = genes) clusts <- clusterVariograms(sfe, genes, BLUSPARAM = HclustParam(), swap_rownames = "symbol") # Plot the clustering plotVariogram(sfe, genes, color_by = clusts, group = "feature", use_lty = FALSE, swap_rownames = "symbol", show_np = FALSE)
library(SFEData) library(scater) library(bluster) library(Matrix) sfe <- McKellarMuscleData() sfe <- logNormCounts(sfe) # Just the highly expressed genes gs <- order(Matrix::rowSums(counts(sfe)), decreasing = TRUE)[1:10] genes <- rownames(sfe)[gs] sfe <- runUnivariate(sfe, "variogram", features = genes) clusts <- clusterVariograms(sfe, genes, BLUSPARAM = HclustParam(), swap_rownames = "symbol") # Plot the clustering plotVariogram(sfe, genes, color_by = clusts, group = "feature", use_lty = FALSE, swap_rownames = "symbol", show_np = FALSE)
Just to get the palette without having to install all those dependencies of dittoSeq.
ditto_colors
ditto_colors
A character vector of hex colors of the palette. There are 40 colors.
The dittoSeq package.
Apparently, there is no apparent way to plot the PC elbow plot other than extracting the variance explained attribute of the dimred slot, because even the OSCA book makes the elbow plot this way, which I find kind of cumbersome compared to Seurat. So I'm writing this function to make the elbow plot with SCE less cumbersome.
ElbowPlot( sce, ndims = 20, nfnega = 0, reduction = "PCA", sample_id = "all", facet = FALSE, ncol = NULL )
ElbowPlot( sce, ndims = 20, nfnega = 0, reduction = "PCA", sample_id = "all", facet = FALSE, ncol = NULL )
sce |
A |
ndims |
Number of components with positive eigenvalues, such as PCs in non-spatial PCA. |
nfnega |
Number of nega eigenvalues and their eigenvectors to compute. These indicate negative spatial autocorrelation. |
reduction |
Name of the dimension reduction to use. It must have an attribute called either "percentVar" or "eig" for eigenvalues. Defaults to "PCA". |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
facet |
Logical, whether to facet by samples when multiple samples are present. This is relevant when spatial PCA is run separately for each sample, which gives different results from running jointly for all samples. |
ncol |
Number of columns of facets if facetting. |
A ggplot object. The y axis is eigenvalues or percentage of variance explained if relevant.
library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- runPCA(sfe, ncomponents = 10, exprs_values = "counts") ElbowPlot(sfe, ndims = 10)
library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- runPCA(sfe, ncomponents = 10, exprs_values = "counts") ElbowPlot(sfe, ndims = 10)
This function is no longer used internally as it's unnecessary for
scico
divergent palettes. But it can be useful when using divergent
palettes outside scico
where one must specify beginning and end but
not midpoint, to override the default palette.
getDivergeRange(values, diverge_center = 0)
getDivergeRange(values, diverge_center = 0)
values |
Numeric vector to be colored. |
diverge_center |
Value to center on, defaults to 0. |
A numeric vector of length 2, the first element is for beginning, and the second for end. The values are between 0 and 1.
v <- rnorm(10) getDivergeRange(v, diverge_center = 0)
v <- rnorm(10) getDivergeRange(v, diverge_center = 0)
This package ships with many spatial statistics methods as
SFEMethod
objects. The user can adapt the uniform user
interface of this package to other spatial methods by creating new
SFEMethod
objects. This function lists the names of all methods within
Voyager
, to use for the type
argument in
calculateUnivariate
, calculateBivariate
, and
calculateMultivariate
.
listSFEMethods(variate = c("uni", "bi", "multi"), scope = c("global", "local"))
listSFEMethods(variate = c("uni", "bi", "multi"), scope = c("global", "local"))
variate |
Uni-, bi-, or multi-variate. |
scope |
whether it's local or global. |
A data frame with a column for the name and another for a brief description.
listSFEMethods("uni", "local")
listSFEMethods("uni", "local")
Values Moran's I can take depends on the spatial neighborhood graph. The
bounds of Moran's I given the graph, C, are given by the minimum and maximum
eigenvalues of the double centered – i.e. subtracting column means and row
means – adjacency matrix , where
is a vector of all 1's.
This implementation follows the implementation in
adespatial
and uses
the RSpectra
package to more quickly find only the minimum and maximum
eigenvalues without performing unnecessary work to find the full spectrum as
done in base R's eigen
.
moranBounds(listw)
moranBounds(listw)
listw |
A listw object for the spatial neighborhood graph. |
A numeric vector of minimum and maximum Moran's I given the spatial neighborhood graph.
After double centering, the adjacency matrix is no longer sparse, so this function can take up a lot of memory for larger datasets.
de Jong, P., Sprenger, C., & van Veen, F. (1984). On extreme values of Moran's I and Geary's C. Geographical Analysis, 16(1), 17-24.
# example code library(SFEData) sfe <- McKellarMuscleData("small") g <- findVisiumGraph(sfe) moranBounds(g)
# example code library(SFEData) sfe <- McKellarMuscleData("small") g <- findVisiumGraph(sfe) moranBounds(g)
This function uses ggplot2
to plot the Moran plot. The plot would be
more aesthetically pleasing than the base R version implemented in
spdep
. In addition, contours are plotted to show point density on the
plot, and the points can be colored by a variable, such as clusters. The
contours may also be filled and only influential points plotted. When filled,
the viridis E option is used.
moranPlot( sfe, feature, graphName = 1L, sample_id = "all", contour_color = "cyan", color_by = NULL, colGeometryName = NULL, annotGeometryName = NULL, plot_singletons = TRUE, binned = FALSE, filled = FALSE, divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, bins = 100, binwidth = NULL, hex = FALSE, plot_influential = TRUE, bins_contour = NULL, name = "moran.plot", ... )
moranPlot( sfe, feature, graphName = 1L, sample_id = "all", contour_color = "cyan", color_by = NULL, colGeometryName = NULL, annotGeometryName = NULL, plot_singletons = TRUE, binned = FALSE, filled = FALSE, divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, bins = 100, binwidth = NULL, hex = FALSE, plot_influential = TRUE, bins_contour = NULL, name = "moran.plot", ... )
sfe |
A |
feature |
Name of one variable to show on the plot. It will be converted
to sentence case on the x axis and lower case in the y axis appended after
"Spatially lagged". One feature at a time since the colors in
|
graphName |
Name of the |
sample_id |
One sample_id for the sample whose graph to plot. |
contour_color |
Color of the point density contours, which can be changed so the contours stand out from the points. |
color_by |
Variable to color the points by. It can be the name of a column in colData, a gene, or the name of a column in the colGeometry specified in colGeometryName. Or it can be a vector of the same length as the number of cells/spots in the sample_id of interest. |
colGeometryName |
Name of a |
annotGeometryName |
Name of a |
plot_singletons |
Logical, whether to plot items that don't have spatial neighbors. |
binned |
Logical, whether to plot 2D histograms. This argument has
precedence to |
filled |
Logical, whether to plot filled contours for the non-influential points and only plot influential points as points. |
divergent |
Logical, whether a divergent palette should be used. |
diverge_center |
If |
swap_rownames |
Column name of |
bins |
If binning the |
binwidth |
Width of bins, passed to |
hex |
Logical, whether to use |
plot_influential |
Logical, whether to plot influential points with
different palette if |
bins_contour |
Number of bins in the point density contour. Use a smaller number to make sparser contours. |
name |
Name under which the Moran plot results are stored. By default "moran.plot". |
... |
Other arguments to pass to |
A ggplot object.
library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) library(bluster) library(scater) sfe <- McKellarMuscleData("full") sfe <- sfe[, colData(sfe)$in_tissue] sfe <- logNormCounts(sfe) colGraph(sfe, "visium") <- findVisiumGraph(sfe) sfe <- runUnivariate(sfe, type = "moran.plot", features = "Myh1", swap_rownames = "symbol") clust <- clusterMoranPlot(sfe, "Myh1", BLUSPARAM = KmeansParam(2), swap_rownames = "symbol") moranPlot(sfe, "Myh1", graphName = "visium", color_by = clust[, 1], swap_rownames = "symbol")
library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) library(bluster) library(scater) sfe <- McKellarMuscleData("full") sfe <- sfe[, colData(sfe)$in_tissue] sfe <- logNormCounts(sfe) colGraph(sfe, "visium") <- findVisiumGraph(sfe) sfe <- runUnivariate(sfe, type = "moran.plot", features = "Myh1", swap_rownames = "symbol") clust <- clusterMoranPlot(sfe, "Myh1", BLUSPARAM = KmeansParam(2), swap_rownames = "symbol") moranPlot(sfe, "Myh1", graphName = "visium", color_by = clust[, 1], swap_rownames = "symbol")
This implementation uses the RSpectra
package to efficiently compute a
small subset of eigenvalues and eigenvectors, as a small subset is typically
used. Hence it's much faster and memory efficient than the original
implementation in adespatial
. However, this implementation here does
not support row and column weighting other than the standard ones for PCA.,
so the adespatial
implementation is more general.
multispati_rsp(x, listw, nfposi = 30L, nfnega = 30L, scale = TRUE)
multispati_rsp(x, listw, nfposi = 30L, nfnega = 30L, scale = TRUE)
x |
A matrix whose columns are features and rows are cells. |
listw |
A |
nfposi |
Number of positive eigenvalues and their eigenvectors to compute. |
nfnega |
Number of nega eigenvalues and their eigenvectors to compute. These indicate negative spatial autocorrelation. |
scale |
Logical, whether to scale the data. |
A matrix for the cell embeddings in each spatial PC, with attribute
loading
for the eigenvectors or gene loadings, and attribute
eig
for the eigenvalues.
Eigen decomposition will fail if any feature has variance zero leading to NaN in the scaled matrix.
Dray, S., Said, S. and Debias, F. (2008) Spatial ordination of vegetation data using a generalization of Wartenberg's multivariate spatial correlation. Journal of vegetation science, 19, 45-56.
library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) inds <- order(rowSums(logcounts(sfe)), decreasing = TRUE)[1:50] mat <- logcounts(sfe)[inds,] g <- findVisiumGraph(sfe) out <- multispati_rsp(t(mat), listw = g, nfposi = 10, nfnega = 10)
library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) inds <- order(rowSums(logcounts(sfe)), decreasing = TRUE)[1:50] mat <- logcounts(sfe)[inds,] g <- findVisiumGraph(sfe) out <- multispati_rsp(t(mat), listw = g, nfposi = 10, nfnega = 10)
This function plots cell density in histological space as 2D histograms, especially helpful for larger smFISH-based datasets.
plotCellBin2D( sfe, sample_id = "all", bins = 200, binwidth = NULL, hex = FALSE, ncol = NULL, bbox = NULL )
plotCellBin2D( sfe, sample_id = "all", bins = 200, binwidth = NULL, hex = FALSE, ncol = NULL, bbox = NULL )
sfe |
A |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
bins |
Number of bins. Can be a vector of length 2 to specify for x and y axes separately. |
binwidth |
Width of bins, passed to |
hex |
Logical, whether to use hexagonal bins. |
ncol |
Number of columns if plotting multiple features. Defaults to
|
bbox |
A bounding box to specify a smaller region to plot, useful when
the dataset is large. Can be a named numeric vector with names "xmin",
"xmax", "ymin", and "ymax", in any order. If plotting multiple samples, it
should be a matrix with sample IDs as column names and "xmin", "ymin",
"xmax", and "ymax" as row names. If multiple samples are plotted but
|
A ggplot object.
library(SFEData) sfe <- HeNSCLCData() plotCellBin2D(sfe)
library(SFEData) sfe <- HeNSCLCData() plotCellBin2D(sfe)
This function is recommended instead of plotColDataHistogram
when coloring by multiple categories and log transforming the y axis, which
causes problems in stacked histograms.
plotColDataFreqpoly( sce, feature, color_by = NULL, subset = NULL, bins = 100, binwidth = NULL, linewidth = 1.2, scales = "free", ncol = 1, position = "identity" ) plotRowDataFreqpoly( sce, feature, color_by = NULL, subset = NULL, bins = 100, binwidth = NULL, linewidth = 1.2, scales = "free", ncol = 1, position = "identity" )
plotColDataFreqpoly( sce, feature, color_by = NULL, subset = NULL, bins = 100, binwidth = NULL, linewidth = 1.2, scales = "free", ncol = 1, position = "identity" ) plotRowDataFreqpoly( sce, feature, color_by = NULL, subset = NULL, bins = 100, binwidth = NULL, linewidth = 1.2, scales = "free", ncol = 1, position = "identity" )
sce |
A |
feature |
Names of columns in |
color_by |
Name of a categorical column in |
subset |
Name of a logical column to only plot a subset of the data. |
bins |
Number of bins. Overridden by |
binwidth |
The width of the bins. Can be specified as a numeric value
or as a function that calculates width from unscaled x. Here, "unscaled x"
refers to the original x values in the data, before application of any
scale transformation. When specifying a function along with a grouping
structure, the function will be called once per group.
The default is to use the number of bins in The bin width of a date variable is the number of days in each time; the bin width of a time variable is the number of seconds. |
linewidth |
Line width of the polygons, defaults to a thicker 1.2. |
scales |
Should scales be fixed ( |
ncol |
Number of columns in the facetting. |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
plotColDataHistogram
library(SFEData) sfe <- McKellarMuscleData() plotColDataFreqpoly(sfe, c("nCounts", "nGenes"), color_by = "in_tissue", bins = 50) plotColDataFreqpoly(sfe, "nCounts", subset = "in_tissue") sfe2 <- sfe[, sfe$in_tissue] plotColDataFreqpoly(sfe2, c("nCounts", "nGenes"), bins = 50)
library(SFEData) sfe <- McKellarMuscleData() plotColDataFreqpoly(sfe, c("nCounts", "nGenes"), color_by = "in_tissue", bins = 50) plotColDataFreqpoly(sfe, "nCounts", subset = "in_tissue") sfe2 <- sfe[, sfe$in_tissue] plotColDataFreqpoly(sfe2, c("nCounts", "nGenes"), bins = 50)
Plot histograms for colData and rowData columns
plotColDataHistogram( sce, feature, fill_by = NULL, facet_by = NULL, subset = NULL, bins = 100, binwidth = NULL, scales = "free", ncol = 1, position = "stack", ... ) plotRowDataHistogram( sce, feature, fill_by = NULL, facet_by = NULL, subset = NULL, bins = 100, binwidth = NULL, scales = "free", ncol = 1, position = "stack", ... )
plotColDataHistogram( sce, feature, fill_by = NULL, facet_by = NULL, subset = NULL, bins = 100, binwidth = NULL, scales = "free", ncol = 1, position = "stack", ... ) plotRowDataHistogram( sce, feature, fill_by = NULL, facet_by = NULL, subset = NULL, bins = 100, binwidth = NULL, scales = "free", ncol = 1, position = "stack", ... )
sce |
A |
feature |
Names of columns in |
fill_by |
Name of a categorical column in |
facet_by |
Column in |
subset |
Name of a logical column to only plot a subset of the data. |
bins |
Numeric vector giving number of bins in both vertical and horizontal directions. Set to 100 by default. |
binwidth |
The width of the bins. Can be specified as a numeric value
or as a function that calculates width from unscaled x. Here, "unscaled x"
refers to the original x values in the data, before application of any
scale transformation. When specifying a function along with a grouping
structure, the function will be called once per group.
The default is to use the number of bins in The bin width of a date variable is the number of days in each time; the bin width of a time variable is the number of seconds. |
scales |
Should scales be fixed ( |
ncol |
Number of columns in the facetting. |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
A ggplot object
plotColDataFreqpoly
library(SFEData) sfe <- McKellarMuscleData() plotColDataHistogram(sfe, c("nCounts", "nGenes"), fill_by = "in_tissue", bins = 50, position = "stack") plotColDataHistogram(sfe, "nCounts", subset = "in_tissue") sfe2 <- sfe[, sfe$in_tissue] plotColDataHistogram(sfe2, c("nCounts", "nGenes"), bins = 50)
library(SFEData) sfe <- McKellarMuscleData() plotColDataHistogram(sfe, c("nCounts", "nGenes"), fill_by = "in_tissue", bins = 50, position = "stack") plotColDataHistogram(sfe, "nCounts", subset = "in_tissue") sfe2 <- sfe[, sfe$in_tissue] plotColDataHistogram(sfe2, c("nCounts", "nGenes"), bins = 50)
A ggplot version of spdep::plot.nb
, reducing boilerplate for SFE
objects.
plotColGraph( sfe, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", weights = FALSE, segment_size = 0.5, geometry_size = 0.5, ncol = NULL, bbox = NULL ) plotAnnotGraph( sfe, annotGraphName = 1L, annotGeometryName = 1L, sample_id = "all", weights = FALSE, segment_size = 0.5, geometry_size = 0.5, ncol = NULL, bbox = NULL )
plotColGraph( sfe, colGraphName = 1L, colGeometryName = 1L, sample_id = "all", weights = FALSE, segment_size = 0.5, geometry_size = 0.5, ncol = NULL, bbox = NULL ) plotAnnotGraph( sfe, annotGraphName = 1L, annotGeometryName = 1L, sample_id = "all", weights = FALSE, segment_size = 0.5, geometry_size = 0.5, ncol = NULL, bbox = NULL )
sfe |
A |
colGraphName |
Name of graph associated with columns of the gene count matrix to be plotted. |
colGeometryName |
Name of a |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
weights |
Whether to plot weights. If |
segment_size |
Thickness of the segments that represent graph edges. |
geometry_size |
Point size (for POINT geometries) or line thickness (for LINESTRING and POLYGON) to plot the geometry in the background. |
ncol |
Number of columns if plotting multiple features. Defaults to
|
bbox |
A bounding box to specify a smaller region to plot, useful when
the dataset is large. Can be a named numeric vector with names "xmin",
"xmax", "ymin", and "ymax", in any order. If plotting multiple samples, it
should be a matrix with sample IDs as column names and "xmin", "ymin",
"xmax", and "ymax" as row names. If multiple samples are plotted but
|
annotGraphName |
Name of the annotation graph to plot. |
annotGeometryName |
Name of the |
A ggplot2 object.
library(SpatialFeatureExperiment) library(SFEData) library(sf) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) plotColGraph(sfe, colGraphName = "visium", colGeometryName = "spotPoly") # Make the myofiber segmentations a valid POLYGON geometry ag <- annotGeometry(sfe, "myofiber_simplified") ag <- st_buffer(ag, 0) ag <- ag[!st_is_empty(ag), ] annotGeometry(sfe, "myofiber_simplified") <- ag annotGraph(sfe, "myofibers") <- findSpatialNeighbors(sfe, type = "myofiber_simplified", MARGIN = 3, method = "tri2nb", dist_type = "idw" ) plotAnnotGraph(sfe, annotGraphName = "myofibers", annotGeometryName = "myofiber_simplified", weights = TRUE )
library(SpatialFeatureExperiment) library(SFEData) library(sf) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) plotColGraph(sfe, colGraphName = "visium", colGeometryName = "spotPoly") # Make the myofiber segmentations a valid POLYGON geometry ag <- annotGeometry(sfe, "myofiber_simplified") ag <- st_buffer(ag, 0) ag <- ag[!st_is_empty(ag), ] annotGeometry(sfe, "myofiber_simplified") <- ag annotGraph(sfe, "myofibers") <- findSpatialNeighbors(sfe, type = "myofiber_simplified", MARGIN = 3, method = "tri2nb", dist_type = "idw" ) plotAnnotGraph(sfe, annotGraphName = "myofibers", annotGeometryName = "myofiber_simplified", weights = TRUE )
Use ggplot2
to plot correlograms computed by
runUnivariate
, pulling results from rowData
.
Correlograms of multiple genes with error bars can be plotted, and they can
be colored by any numeric or categorical column in rowData
or a vector
with the same length as nrow
of the SFE object. The coloring is useful
when the correlograms are clustered to show types of length scales or
patterns of decay of spatial autocorrelation. For method = "I"
, the
error bars are twice the standard deviation of the estimated Moran's I value.
plotCorrelogram( sfe, features, sample_id = "all", method = "I", color_by = NULL, facet_by = c("sample_id", "features"), ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, plot_signif = TRUE, p_adj_method = "BH", divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, name = "sp.correlogram" )
plotCorrelogram( sfe, features, sample_id = "all", method = "I", color_by = NULL, facet_by = c("sample_id", "features"), ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, plot_signif = TRUE, p_adj_method = "BH", divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, name = "sp.correlogram" )
sfe |
A |
features |
Features to plot, must be in rownames of the gene count
matrix, colnames of colData or a colGeometry, colnames of cell embeddings
in |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
method |
"corr" for correlation, "I" for Moran's I, "C" for Geary's C |
color_by |
Name of a column in |
facet_by |
Whether to facet by sample_id (default) or features. If facetting by sample_id, then different features will be plotted in the same facet for comparison. If facetting by features, then different samples will be compared for each feature. Ignored if only one sample is specified. |
ncol |
Number of columns if facetting. |
colGeometryName |
Name of a |
annotGeometryName |
Name of a |
reducedDimName |
Name of a dimension reduction, can be seen in
|
plot_signif |
Logical, whether to plot significance symbols: p < 0.001:
***, p < 0.01: **, p < 0.05 *, p < 0.1: ., otherwise no symbol. The
p-values are two sided, based on the assumption that the estimated Moran's
I is normally distributed with mean from a randomized version of the data.
The mean and variance come from |
p_adj_method |
Multiple testing correction method as in
|
divergent |
Logical, whether a divergent palette should be used. |
diverge_center |
If |
swap_rownames |
Column name of |
name |
Name under which the correlogram results are stored, which is by default "sp.correlogram". |
A ggplot object.
library(SpatialFeatureExperiment) library(SFEData) library(bluster) library(scater) sfe <- McKellarMuscleData("small") sfe <- logNormCounts(sfe) colGraph(sfe, "visium") <- findVisiumGraph(sfe) inds <- c(1, 3, 4, 5) features <- rownames(sfe)[inds] sfe <- runUnivariate(sfe, type = "sp.correlogram", features = features, exprs_values = "counts", order = 5 ) clust <- clusterCorrelograms(sfe, features = features, BLUSPARAM = KmeansParam(2) ) # Color by features plotCorrelogram(sfe, features) # Color by something else plotCorrelogram(sfe, features, color_by = clust$cluster) # Facet by features plotCorrelogram(sfe, features, facet_by = "features")
library(SpatialFeatureExperiment) library(SFEData) library(bluster) library(scater) sfe <- McKellarMuscleData("small") sfe <- logNormCounts(sfe) colGraph(sfe, "visium") <- findVisiumGraph(sfe) inds <- c(1, 3, 4, 5) features <- rownames(sfe)[inds] sfe <- runUnivariate(sfe, type = "sp.correlogram", features = features, exprs_values = "counts", order = 5 ) clust <- clusterCorrelograms(sfe, features = features, BLUSPARAM = KmeansParam(2) ) # Color by features plotCorrelogram(sfe, features) # Color by something else plotCorrelogram(sfe, features, color_by = clust$cluster) # Facet by features plotCorrelogram(sfe, features, facet_by = "features")
Equivalent to gstat::plot.gstatVariogram
, but using ggplot2 to be more
customizable.
plotCrossVariogram(res, show_np = TRUE)
plotCrossVariogram(res, show_np = TRUE)
res |
Cross variogram results for one sample, from
|
show_np |
Logical, whether to show number of pairs of cells at each distance bin. |
A ggplot object. Unfortunately I haven't figured out a way to collect all the facet labels to the top of the entire plot.
plotCrossVariogramMap
library(SFEData) library(scater) sfe <- McKellarMuscleData() sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) res <- calculateBivariate(sfe, type = "cross_variogram", feature1 = c("Myh1", "Myh2", "Csrp3"), swap_rownames = "symbol") plotCrossVariogram(res)
library(SFEData) library(scater) sfe <- McKellarMuscleData() sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) res <- calculateBivariate(sfe, type = "cross_variogram", feature1 = c("Myh1", "Myh2", "Csrp3"), swap_rownames = "symbol") plotCrossVariogram(res)
Equivalent to gstat::plot.gstatVariogram
, but using ggplot2 to be more
customizable.
plotCrossVariogramMap(res, plot_np = FALSE)
plotCrossVariogramMap(res, plot_np = FALSE)
res |
Cross variogram results for one sample, from
|
plot_np |
Logical, whether to plot the number of pairs in each distance bin instead of the variance. |
A ggplot object.
plotCrossVariogram
library(SFEData) library(scater) sfe <- McKellarMuscleData() sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) res <- calculateBivariate(sfe, type = "cross_variogram_map", feature1 = c("Myh1", "Myh2", "Csrp3"), swap_rownames = "symbol", width = 500, cutoff = 2000) plotCrossVariogramMap(res)
library(SFEData) library(scater) sfe <- McKellarMuscleData() sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) res <- calculateBivariate(sfe, type = "cross_variogram_map", feature1 = c("Myh1", "Myh2", "Csrp3"), swap_rownames = "symbol", width = 500, cutoff = 2000) plotCrossVariogramMap(res)
Just like Seurat's VizDimLoadings function. I haven't found an equivalent for SCE but find it useful. But I'm not trying to reproduce that Seurat function exactly. For instance, I don't like it when Seurat imposes a ggplot theme, and I don't like the cowplot theme. Maybe I should rewrite it in base R but for now I'm using Tidyverse.
plotDimLoadings( sce, dims = 1:4, nfeatures = 10, swap_rownames = NULL, reduction = "PCA", balanced = TRUE, ncol = 2, sample_id = "all" )
plotDimLoadings( sce, dims = 1:4, nfeatures = 10, swap_rownames = NULL, reduction = "PCA", balanced = TRUE, ncol = 2, sample_id = "all" )
sce |
A |
dims |
Numeric vector specifying which PCs to plot. For MULTISPATI, PCs
with negative eigenvalues are in the right most columns of the embedding and
loading matrices. See the |
nfeatures |
Number of genes to plot. |
swap_rownames |
Column name of |
reduction |
Name of the dimension reduction to use. It must have an attribute called either "percentVar" or "eig" for eigenvalues. Defaults to "PCA". |
balanced |
Return an equal number of genes with + and - scores. If FALSE, returns the top genes ranked by the scores absolute values. |
ncol |
Number of columns in the facetted plot. |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
A ggplot object. Loadings for different PCs are plotted in different facets so one ggplot object is returned.
library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- runPCA(sfe, ncomponents = 10, exprs_values = "counts") plotDimLoadings(sfe, dims = 1:2)
library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- runPCA(sfe, ncomponents = 10, exprs_values = "counts") plotDimLoadings(sfe, dims = 1:2)
Different samples are plotted in separate facets. When multiple geometries
are plotted at the same time, they will be differentiated by color, by
default using the dittoSeq
palette, but this can be overridden with
scale_color_*
functions. Transcript spots of different genes are
differentiated by point shape if plotted, so the number of genes plotted
shouldn't exceed about 6 or a warning will be issued.
plotGeometry( sfe, type = deprecated(), MARGIN = deprecated(), colGeometryName = NULL, annotGeometryName = NULL, rowGeometryName = NULL, gene = "all", sample_id = "all", fill = TRUE, ncol = NULL, bbox = NULL, tx_alpha = 1, tx_size = 1, tx_file = NULL, image_id = NULL, channel = NULL, maxcell = 5e+05, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE )
plotGeometry( sfe, type = deprecated(), MARGIN = deprecated(), colGeometryName = NULL, annotGeometryName = NULL, rowGeometryName = NULL, gene = "all", sample_id = "all", fill = TRUE, ncol = NULL, bbox = NULL, tx_alpha = 1, tx_size = 1, tx_file = NULL, image_id = NULL, channel = NULL, maxcell = 5e+05, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE )
sfe |
A |
type |
Name of the geometry associated with the MARGIN of interest for which to compute the graph. |
MARGIN |
Just like in |
colGeometryName |
Name of a |
annotGeometryName |
Name of a |
rowGeometryName |
Name of a |
gene |
Character vector of names of genes to plot. If "all" then transcript spots of all genes are plotted. |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
fill |
Logical, whether to fill polygons. |
ncol |
Number of columns if plotting multiple features. Defaults to
|
bbox |
A bounding box to specify a smaller region to plot, useful when
the dataset is large. Can be a named numeric vector with names "xmin",
"xmax", "ymin", and "ymax", in any order. If plotting multiple samples, it
should be a matrix with sample IDs as column names and "xmin", "ymin",
"xmax", and "ymax" as row names. If multiple samples are plotted but
|
tx_alpha |
Transparency for transcript spots, helpful when the transcript spots are overplotting. |
tx_size |
Point size for transcript spots. |
tx_file |
File path to GeoParquet file of the transcript spots if you
don't wish to load all transcript spots into the SFE object. See
|
image_id |
ID of the image to plot behind the geometries. If
|
channel |
Numeric vector indicating which channels in a multi-channel
image to plot. If |
maxcell |
Maximum number of pixels to plot in the image. If the image is larger, it will be resampled so it have less than this number of pixels to save memory and for faster plotting. We recommend reducing this number when plotting multiple facets. |
show_axes |
Logical, whether to show axes. |
dark |
Logical, whether to use dark theme. When using dark theme, the palette will have lighter color represent higher values as if glowing in the dark. This is intended for plotting gene expression on top of fluorescent images. |
palette |
Vector of colors to use to color grayscale images. |
normalize_channels |
Logical, whether to normalize each channel of the
image individually. Should be |
A ggplot
object.
library(SFEData) sfe1 <- McKellarMuscleData("small") sfe2 <- McKellarMuscleData("small2") sfe <- cbind(sfe1, sfe2) sfe <- removeEmptySpace(sfe) plotGeometry(sfe, colGeometryName = "spotPoly") plotGeometry(sfe, annotGeometryName = "myofiber_simplified")
library(SFEData) sfe1 <- McKellarMuscleData("small") sfe2 <- McKellarMuscleData("small2") sfe <- cbind(sfe1, sfe2) sfe <- removeEmptySpace(sfe) plotGeometry(sfe, colGeometryName = "spotPoly") plotGeometry(sfe, annotGeometryName = "myofiber_simplified")
This function plots the images in SFE objects without plotting geometries. When showing axes, the numbers are coordinates within the image itself and have the same units as the spatial extent, but are not the actual spatial extent when plotting multiple samples to avoid excessive empty space.
plotImage( sfe, sample_id = "all", image_id = NULL, channel = NULL, ncol = NULL, bbox = NULL, maxcell = 5e+05, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE )
plotImage( sfe, sample_id = "all", image_id = NULL, channel = NULL, ncol = NULL, bbox = NULL, maxcell = 5e+05, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE )
sfe |
A |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
image_id |
ID of the image(s) to plot. If |
channel |
Numeric vector indicating which channels in a multi-channel
image to plot. If |
ncol |
Number of columns if plotting multiple features. Defaults to
|
bbox |
A bounding box to specify a smaller region to plot, useful when
the dataset is large. Can be a named numeric vector with names "xmin",
"xmax", "ymin", and "ymax", in any order. If plotting multiple samples, it
should be a matrix with sample IDs as column names and "xmin", "ymin",
"xmax", and "ymax" as row names. If multiple samples are plotted but
|
maxcell |
Maximum number of pixels to plot in the image. If the image is larger, it will be resampled so it have less than this number of pixels to save memory and for faster plotting. We recommend reducing this number when plotting multiple facets. |
show_axes |
Logical, whether to show axes. |
dark |
Logical, whether to use dark theme. When using dark theme, the palette will have lighter color represent higher values as if glowing in the dark. This is intended for plotting gene expression on top of fluorescent images. |
palette |
Vector of colors to use to color grayscale images. |
normalize_channels |
Logical, whether to normalize each channel of the
image individually. Should be |
A ggplot
object.
library(SFEData) library(SpatialFeatureExperiment) fn <- XeniumOutput("v2", file_path = "xenium_example") # Weird RBioFormats null pointer error the first time it's run try(sfe <- readXenium(fn)) sfe <- readXenium(fn) # Plot one channel plotImage(sfe, image_id = "morphology_focus", channel = 1L) plotImage(sfe, image_id = "morphology_focus", channel = 1L, show_axes = TRUE, dark = TRUE) # Colorize based on different channels plotImage(sfe, image_id = "morphology_focus", channel = c(2,4,1), show_axes = TRUE, dark = TRUE) unlink("xenium_example", recursive = TRUE)
library(SFEData) library(SpatialFeatureExperiment) fn <- XeniumOutput("v2", file_path = "xenium_example") # Weird RBioFormats null pointer error the first time it's run try(sfe <- readXenium(fn)) sfe <- readXenium(fn) # Plot one channel plotImage(sfe, image_id = "morphology_focus", channel = 1L) plotImage(sfe, image_id = "morphology_focus", channel = 1L, show_axes = TRUE, dark = TRUE) # Colorize based on different channels plotImage(sfe, image_id = "morphology_focus", channel = c(2,4,1), show_axes = TRUE, dark = TRUE) unlink("xenium_example", recursive = TRUE)
Plot results of local spatial analyses in space, such as local Getis-Ord Gi* values.
plotLocalResult( sfe, name, features, attribute = NULL, sample_id = "all", colGeometryName = NULL, annotGeometryName = NULL, rowGeometryName = NULL, rowGeometryFeatures = NULL, ncol = NULL, ncol_sample = NULL, annot_aes = list(), annot_fixed = list(), tx_fixed = list(), bbox = NULL, tx_file = NULL, image_id = NULL, channel = NULL, maxcell = 5e+05, aes_use = c("fill", "color", "shape", "linetype"), divergent = FALSE, diverge_center = NULL, annot_divergent = FALSE, annot_diverge_center = NULL, size = 0.5, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = "black", fill = "gray80", swap_rownames = NULL, scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE, type = name, ... )
plotLocalResult( sfe, name, features, attribute = NULL, sample_id = "all", colGeometryName = NULL, annotGeometryName = NULL, rowGeometryName = NULL, rowGeometryFeatures = NULL, ncol = NULL, ncol_sample = NULL, annot_aes = list(), annot_fixed = list(), tx_fixed = list(), bbox = NULL, tx_file = NULL, image_id = NULL, channel = NULL, maxcell = 5e+05, aes_use = c("fill", "color", "shape", "linetype"), divergent = FALSE, diverge_center = NULL, annot_divergent = FALSE, annot_diverge_center = NULL, size = 0.5, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = "black", fill = "gray80", swap_rownames = NULL, scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE, type = name, ... )
sfe |
A |
name |
Which local spatial results. Use
|
features |
Character vector of vectors. To see which features have the
results of a given type, see
|
attribute |
Which field in the local results of the type and features.
If the result of each feature is a vector, the this argument is ignored.
But if the result is a data frame or a matrix, then this is the column name
of the result, such as "Ii" for local Moran's I. For each local spatial
analysis method, there's a default attribute. See Details. Use
|
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
colGeometryName |
Name of a |
annotGeometryName |
Name of a |
rowGeometryName |
Name of a |
rowGeometryFeatures |
Which features from |
ncol |
Number of columns if plotting multiple features. Defaults to
|
ncol_sample |
If plotting multiple samples as facets, how many columns
of such facets. This is distinct from |
annot_aes |
A named list of plotting parameters for the annotation sf data frame. The names are which geom (as in ggplot2, such as color and fill), and the values are column names in the annotation sf data frame. Tidyeval is NOT supported. |
annot_fixed |
Similar to |
tx_fixed |
Similar to |
bbox |
A bounding box to specify a smaller region to plot, useful when
the dataset is large. Can be a named numeric vector with names "xmin",
"xmax", "ymin", and "ymax", in any order. If plotting multiple samples, it
should be a matrix with sample IDs as column names and "xmin", "ymin",
"xmax", and "ymax" as row names. If multiple samples are plotted but
|
tx_file |
File path to GeoParquet file of the transcript spots if you
don't wish to load all transcript spots into the SFE object. See
|
image_id |
ID of the image to plot behind the geometries. If
|
channel |
Numeric vector indicating which channels in a multi-channel
image to plot. If |
maxcell |
Maximum number of pixels to plot in the image. If the image is larger, it will be resampled so it have less than this number of pixels to save memory and for faster plotting. We recommend reducing this number when plotting multiple facets. |
aes_use |
Aesthetic to use for discrete variables. For continuous variables, it's always "fill" for polygons and point shapes 21-25. For discrete variables, it can be fill, color, shape, or linetype, whenever applicable. The specified value will be changed to the applicable equivalent. For example, if the geometry is point but "linetype" is specified, then "shaped" will be used instead. |
divergent |
Logical, whether a divergent palette should be used. |
diverge_center |
If |
annot_divergent |
Just as |
annot_diverge_center |
Just as |
size |
Fixed size of points. For points defaults to 0.5. Ignored if
|
shape |
Fixed shape of points, ignored if |
linewidth |
Width of lines, including outlines of polygons. For polygons, this defaults to 0, meaning no outlines. |
linetype |
Fixed line type, ignored if |
alpha |
Transparency. |
color |
Fixed color for |
fill |
Similar to |
swap_rownames |
Column name of |
scattermore |
Logical, whether to use the |
pointsize |
Radius of rasterized point in |
bins |
If binning the |
summary_fun |
Function to summarize the feature value when the
|
hex |
Logical, whether to use |
show_axes |
Logical, whether to show axes. |
dark |
Logical, whether to use dark theme. When using dark theme, the palette will have lighter color represent higher values as if glowing in the dark. This is intended for plotting gene expression on top of fluorescent images. |
palette |
Vector of colors to use to color grayscale images. |
normalize_channels |
Logical, whether to normalize each channel of the
image individually. Should be |
type |
An |
... |
Other arguments passed to |
Many local spatial analyses return a data frame or matrix as the results,
whose columns can be the statistic of interest at each location, its
variance, expected value from permutation, p-value, and etc. The
attribute
argument specifies which column to use when there are
multiple columns. Below are the defaults for each local method supported by
this package what what they mean:
localmoran
and localmoran_perm
Ii
, local Moran's
I statistic at each location.
localC_perm
localC
, the local Geary C statistic at each
location.
localG
and localG_perm
localG
, the local
Getis-Ord Gi or Gi* statistic. If include_self = TRUE
when
calculateUnivariate
or runUnivariate
was called,
then it would be Gi*. Otherwise it's Gi.
LOSH
and LOSH.mc
Hi
, local spatial
heteroscedasticity
moran.plot
wx
, the average of the value of each neighbor
of each location. Moran plot is best plotted as a scatter plot of wx
vs x
. See moranPlot
.
Other local methods not listed above return vectors as results. For instance,
localC
returns a vector by default, which is the local Geary's C
statistic.
A ggplot2
object if plotting one feature. A patchwork
object if plotting multiple features.
While this function shares internals with
plotSpatialFeature
, there are some important differences. In
plotSpatialFeature
, the annotGeometry
is indeed only
used for annotation and the protagonist is the colGeometry
, since
it's easy to directly use ggplot2
to plot the data in
annotGeometry
sf
data frames while overlaying
annotGeometry
and colGeometry
involves more complicated code.
In contrast, in this function, local results for annotGeometry
can
be plotted separately without anything related to colGeometry
. Note
that when annotGeometry
local results are plotted without
colGeometry
, the annot_*
arguments are ignored. Use the other
arguments for aesthetics as if it's for colGeometry
.
library(SpatialFeatureExperiment) library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- sfe[,sfe$in_tissue] colGraph(sfe, "visium") <- findVisiumGraph(sfe) feature_use <- rownames(sfe)[1] sfe <- logNormCounts(sfe) sfe <- runUnivariate(sfe, "localmoran", feature_use) # Which types of results are available? localResultNames(sfe) # Which features for localmoran? localResultFeatures(sfe, "localmoran") # Which columns does the localmoran results have? localResultAttrs(sfe, "localmoran", feature_use) plotLocalResult(sfe, "localmoran", feature_use, "Ii", colGeometryName = "spotPoly" ) # For annotGeometry # Make sure it's type POLYGON annotGeometry(sfe, "myofiber_simplified") <- sf::st_buffer(annotGeometry(sfe, "myofiber_simplified"), 0) annotGraph(sfe, "poly2nb_myo") <- findSpatialNeighbors(sfe, type = "myofiber_simplified", MARGIN = 3, method = "poly2nb", zero.policy = TRUE ) sfe <- annotGeometryUnivariate(sfe, "localmoran", features = "area", annotGraphName = "poly2nb_myo", annotGeometryName = "myofiber_simplified", zero.policy = TRUE ) plotLocalResult(sfe, "localmoran", "area", "Ii", annotGeometryName = "myofiber_simplified", size = 0.3, color = "cyan" ) plotLocalResult(sfe, "localmoran", "area", "Z.Ii", annotGeometryName = "myofiber_simplified" ) # don't use annot_* arguments when annotGeometry is plotted without colGeometry
library(SpatialFeatureExperiment) library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- sfe[,sfe$in_tissue] colGraph(sfe, "visium") <- findVisiumGraph(sfe) feature_use <- rownames(sfe)[1] sfe <- logNormCounts(sfe) sfe <- runUnivariate(sfe, "localmoran", feature_use) # Which types of results are available? localResultNames(sfe) # Which features for localmoran? localResultFeatures(sfe, "localmoran") # Which columns does the localmoran results have? localResultAttrs(sfe, "localmoran", feature_use) plotLocalResult(sfe, "localmoran", feature_use, "Ii", colGeometryName = "spotPoly" ) # For annotGeometry # Make sure it's type POLYGON annotGeometry(sfe, "myofiber_simplified") <- sf::st_buffer(annotGeometry(sfe, "myofiber_simplified"), 0) annotGraph(sfe, "poly2nb_myo") <- findSpatialNeighbors(sfe, type = "myofiber_simplified", MARGIN = 3, method = "poly2nb", zero.policy = TRUE ) sfe <- annotGeometryUnivariate(sfe, "localmoran", features = "area", annotGraphName = "poly2nb_myo", annotGeometryName = "myofiber_simplified", zero.policy = TRUE ) plotLocalResult(sfe, "localmoran", "area", "Ii", annotGeometryName = "myofiber_simplified", size = 0.3, color = "cyan" ) plotLocalResult(sfe, "localmoran", "area", "Z.Ii", annotGeometryName = "myofiber_simplified" ) # don't use annot_* arguments when annotGeometry is plotted without colGeometry
Plot the simulations as a density plot or histogram compared to the observed
Moran's I or Geary's C, with ggplot2 so it looks nicer. Unlike the plotting
function in spdep
, this function can also plot the same feature in
different samples as facets or plot different features or samples together
for comparison.
plotMoranMC( sfe, features, sample_id = "all", facet_by = c("sample_id", "features"), ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, ptype = c("density", "histogram", "freqpoly"), swap_rownames = NULL, name = "moran.mc", ... )
plotMoranMC( sfe, features, sample_id = "all", facet_by = c("sample_id", "features"), ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, ptype = c("density", "histogram", "freqpoly"), swap_rownames = NULL, name = "moran.mc", ... )
sfe |
A |
features |
Features to plot, must be in rownames of the gene count
matrix, colnames of colData or a colGeometry, colnames of cell embeddings
in |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
facet_by |
Whether to facet by sample_id (default) or features. If facetting by sample_id, then different features will be plotted in the same facet for comparison. If facetting by features, then different samples will be compared for each feature. Ignored if only one sample is specified. |
ncol |
Number of columns if facetting. |
colGeometryName |
Name of a |
annotGeometryName |
Name of a |
reducedDimName |
Name of a dimension reduction, can be seen in
|
ptype |
Plot type, one of "density", "histogram", or "freqpoly". |
swap_rownames |
Column name of |
name |
Name under which the Monte Carlo results are stored, which defaults to "moran.mc". For Geary's C Monte Carlo, the default is "geary.mc". |
... |
Other arguments passed to |
A ggplot2
object.
library(SpatialFeatureExperiment) library(SFEData) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) sfe <- colDataUnivariate(sfe, type = "moran.mc", "nCounts", nsim = 100) plotMoranMC(sfe, "nCounts")
library(SpatialFeatureExperiment) library(SFEData) sfe <- McKellarMuscleData("small") colGraph(sfe, "visium") <- findVisiumGraph(sfe) sfe <- colDataUnivariate(sfe, type = "moran.mc", "nCounts", nsim = 100) plotMoranMC(sfe, "nCounts")
Unlike Seurat
and ggspavis
, plotting functions in this package
uses geom_sf
whenever applicable.
plotSpatialFeature( sfe, features, colGeometryName = 1L, sample_id = "all", ncol = NULL, ncol_sample = NULL, annotGeometryName = NULL, rowGeometryName = NULL, rowGeometryFeatures = NULL, annot_aes = list(), annot_fixed = list(), tx_fixed = list(), exprs_values = "logcounts", bbox = NULL, tx_file = NULL, image_id = NULL, channel = NULL, maxcell = 5e+05, aes_use = c("fill", "color", "shape", "linetype"), divergent = FALSE, diverge_center = NA, annot_divergent = FALSE, annot_diverge_center = NA, size = 0.5, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = "black", fill = "gray80", swap_rownames = NULL, scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE, ... )
plotSpatialFeature( sfe, features, colGeometryName = 1L, sample_id = "all", ncol = NULL, ncol_sample = NULL, annotGeometryName = NULL, rowGeometryName = NULL, rowGeometryFeatures = NULL, annot_aes = list(), annot_fixed = list(), tx_fixed = list(), exprs_values = "logcounts", bbox = NULL, tx_file = NULL, image_id = NULL, channel = NULL, maxcell = 5e+05, aes_use = c("fill", "color", "shape", "linetype"), divergent = FALSE, diverge_center = NA, annot_divergent = FALSE, annot_diverge_center = NA, size = 0.5, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = "black", fill = "gray80", swap_rownames = NULL, scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE, ... )
sfe |
A |
features |
Features to plot, must be in rownames of the gene count matrix, colnames of colData or a colGeometry. |
colGeometryName |
Name of a |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
ncol |
Number of columns if plotting multiple features. Defaults to
|
ncol_sample |
If plotting multiple samples as facets, how many columns
of such facets. This is distinct from |
annotGeometryName |
Name of a |
rowGeometryName |
Name of a |
rowGeometryFeatures |
Which features from |
annot_aes |
A named list of plotting parameters for the annotation sf data frame. The names are which geom (as in ggplot2, such as color and fill), and the values are column names in the annotation sf data frame. Tidyeval is NOT supported. |
annot_fixed |
Similar to |
tx_fixed |
Similar to |
exprs_values |
Integer scalar or string indicating which assay of x contains the expression values. |
bbox |
A bounding box to specify a smaller region to plot, useful when
the dataset is large. Can be a named numeric vector with names "xmin",
"xmax", "ymin", and "ymax", in any order. If plotting multiple samples, it
should be a matrix with sample IDs as column names and "xmin", "ymin",
"xmax", and "ymax" as row names. If multiple samples are plotted but
|
tx_file |
File path to GeoParquet file of the transcript spots if you
don't wish to load all transcript spots into the SFE object. See
|
image_id |
ID of the image to plot behind the geometries. If
|
channel |
Numeric vector indicating which channels in a multi-channel
image to plot. If |
maxcell |
Maximum number of pixels to plot in the image. If the image is larger, it will be resampled so it have less than this number of pixels to save memory and for faster plotting. We recommend reducing this number when plotting multiple facets. |
aes_use |
Aesthetic to use for discrete variables. For continuous variables, it's always "fill" for polygons and point shapes 21-25. For discrete variables, it can be fill, color, shape, or linetype, whenever applicable. The specified value will be changed to the applicable equivalent. For example, if the geometry is point but "linetype" is specified, then "shaped" will be used instead. |
divergent |
Logical, whether a divergent palette should be used. |
diverge_center |
If |
annot_divergent |
Just as |
annot_diverge_center |
Just as |
size |
Fixed size of points. For points defaults to 0.5. Ignored if
|
shape |
Fixed shape of points, ignored if |
linewidth |
Width of lines, including outlines of polygons. For polygons, this defaults to 0, meaning no outlines. |
linetype |
Fixed line type, ignored if |
alpha |
Transparency. |
color |
Fixed color for |
fill |
Similar to |
swap_rownames |
Column name of |
scattermore |
Logical, whether to use the |
pointsize |
Radius of rasterized point in |
bins |
If binning the |
summary_fun |
Function to summarize the feature value when the
|
hex |
Logical, whether to use |
show_axes |
Logical, whether to show axes. |
dark |
Logical, whether to use dark theme. When using dark theme, the palette will have lighter color represent higher values as if glowing in the dark. This is intended for plotting gene expression on top of fluorescent images. |
palette |
Vector of colors to use to color grayscale images. |
normalize_channels |
Logical, whether to normalize each channel of the
image individually. Should be |
... |
Other arguments passed to |
In the documentation of this function, a "feature" can be a gene (or whatever
entity that corresponds to rows of the gene count matrix), a column in
colData
, or a column in the colGeometry
sf
data frame
specified in the colGeometryName
argument.
In the light theme, for continuous variables, the Blues palette from
colorbrewer is used if divergent = FALSE
, and the roma palette from
the scico
package if divergent = TRUE
. In the dark theme, the
nuuk palette from scico
is used if divergent = FALSE
, and the
berlin palette from scico
is used if divergent = TRUE
. For
discrete variables, the dittoSeq
palette is used.
For annotation, the YlOrRd colorbrewer palette is used for continuous
variables in the light theme. In the dark theme, the acton palette from
scico
is used when divergent = FALSE
and the vanimo palette
from scico
is used when divergent = FALSE
. The other end of the
dittoSeq
palette is used for discrete variables.
Each individual palette should be colorblind friendly, but when plotting
continuous variables coloring a colGeometry
and a annotGeometry
simultaneously, the combination of the two palettes is not guaranteed to be
colorblind friendly.
In addition, when plotting an image behind the geometries, the colors of the image may distort color perception of the values of the geometries.
theme_void
is used for all spatial plots in this package, because the
units in the spatial coordinates are often arbitrary. This can be overriden
to show the axes by using a different theme as normally done in
ggplot2
.
A ggplot2
object if plotting one feature. A patchwork
object if plotting multiple features.
library(SFEData) library(sf) sfe <- McKellarMuscleData("small") # features can be genes or colData or colGeometry columns plotSpatialFeature(sfe, c("nCounts", rownames(sfe)[1]), exprs_values = "counts", colGeometryName = "spotPoly", annotGeometryName = "tissueBoundary" ) # Change fixed aesthetics plotSpatialFeature(sfe, "nCounts", colGeometryName = "spotPoly", annotGeometryName = "tissueBoundary", annot_fixed = list(color = "blue", size = 0.3, fill = NA), alpha = 0.7 ) # Make the myofiber segmentations a valid POLYGON geometry ag <- annotGeometry(sfe, "myofiber_simplified") ag <- st_buffer(ag, 0) ag <- ag[!st_is_empty(ag), ] annotGeometry(sfe, "myofiber_simplified") <- ag # Also plot an annotGeometry variable plotSpatialFeature(sfe, "nCounts", colGeometryName = "spotPoly", annotGeometryName = "myofiber_simplified", annot_aes = list(fill = "area") ) # Use a bounding box to zoom in bbox <- c(xmin = 5500, ymin = 13500, xmax = 6000, ymax = 14000) plotSpatialFeature(sfe, "nCounts", colGeometryName = "spotPoly", annotGeometry = "myofiber_simplified", bbox = bbox, annot_fixed = list(linewidth = 0.3))
library(SFEData) library(sf) sfe <- McKellarMuscleData("small") # features can be genes or colData or colGeometry columns plotSpatialFeature(sfe, c("nCounts", rownames(sfe)[1]), exprs_values = "counts", colGeometryName = "spotPoly", annotGeometryName = "tissueBoundary" ) # Change fixed aesthetics plotSpatialFeature(sfe, "nCounts", colGeometryName = "spotPoly", annotGeometryName = "tissueBoundary", annot_fixed = list(color = "blue", size = 0.3, fill = NA), alpha = 0.7 ) # Make the myofiber segmentations a valid POLYGON geometry ag <- annotGeometry(sfe, "myofiber_simplified") ag <- st_buffer(ag, 0) ag <- ag[!st_is_empty(ag), ] annotGeometry(sfe, "myofiber_simplified") <- ag # Also plot an annotGeometry variable plotSpatialFeature(sfe, "nCounts", colGeometryName = "spotPoly", annotGeometryName = "myofiber_simplified", annot_aes = list(fill = "area") ) # Use a bounding box to zoom in bbox <- c(xmin = 5500, ymin = 13500, xmax = 6000, ymax = 14000) plotSpatialFeature(sfe, "nCounts", colGeometryName = "spotPoly", annotGeometry = "myofiber_simplified", bbox = bbox, annot_fixed = list(linewidth = 0.3))
Plot transcript spot density as 2D histogram
plotTxBin2D( sfe = NULL, data_dir = NULL, tech = c("Vizgen", "Xenium", "CosMX"), file = NULL, sample_id = "all", bins = 200, binwidth = NULL, hex = FALSE, ncol = NULL, bbox = NULL, gene = "all", spatialCoordsNames = c("X", "Y"), gene_col = "gene", rowGeometryName = "txSpots", flip = FALSE, tx_file = NULL )
plotTxBin2D( sfe = NULL, data_dir = NULL, tech = c("Vizgen", "Xenium", "CosMX"), file = NULL, sample_id = "all", bins = 200, binwidth = NULL, hex = FALSE, ncol = NULL, bbox = NULL, gene = "all", spatialCoordsNames = c("X", "Y"), gene_col = "gene", rowGeometryName = "txSpots", flip = FALSE, tx_file = NULL )
sfe |
A |
data_dir |
Top level directory of the output files. This can be
specified in place of |
tech |
Name of the commercial technology, must be one of Vizgen, Xenium, and CosMX. |
file |
File (not GeoParquet) with numeric columns for xy coordinates of
the transcript spots. Ignored if |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
bins |
Number of bins. Can be a vector of length 2 to specify for x and y axes separately. |
binwidth |
Width of bins, passed to |
hex |
Logical, whether to use hexagonal bins. |
ncol |
Number of columns if plotting multiple features. Defaults to
|
bbox |
A bounding box to specify a smaller region to plot, useful when
the dataset is large. Can be a named numeric vector with names "xmin",
"xmax", "ymin", and "ymax", in any order. If plotting multiple samples, it
should be a matrix with sample IDs as column names and "xmin", "ymin",
"xmax", and "ymax" as row names. If multiple samples are plotted but
|
gene |
Character vector of names of genes to plot. If "all" then transcript spots of all genes are plotted. |
spatialCoordsNames |
Column names for the x, y, and optionally z coordinates of the spots. The defaults are for Vizgen. |
gene_col |
Column name for genes. |
rowGeometryName |
Name of a |
flip |
Logical, whether to flip the y axis when plotting data from file. |
tx_file |
File path to GeoParquet file of the transcript spots if you
don't wish to load all transcript spots into the SFE object. See
|
A ggplot object, facetting by sample.
This function plots the variogram of a feature and its fitted variogram
models, showing the nugget, range, and sill of the model. Unlike the plotting
functions in package automap
that uses lattice
, this function
uses ggplot2
to make prettier and more customizable plots.
plotVariogram( sfe, features, sample_id = "all", color_by = NULL, group = c("none", "sample_id", "features", "angles"), use_lty = TRUE, show_np = TRUE, ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, name = "variogram" )
plotVariogram( sfe, features, sample_id = "all", color_by = NULL, group = c("none", "sample_id", "features", "angles"), use_lty = TRUE, show_np = TRUE, ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, name = "variogram" )
sfe |
A |
features |
Features to plot, must be in rownames of the gene count
matrix, colnames of colData or a colGeometry, colnames of cell embeddings
in |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
color_by |
Name of a column in |
group |
Which of samples, features, and angles to show in the same facet for comparison when there are multiple. Default to "none", meaning each facet will contain one variogram. When grouping multiple variograms in the same facet, the text with model, nugget, sill, and range of the variograms will not be shown. |
use_lty |
Logical, whether to use linetype or point shape to distinguish
between the different features or samples in the same facet. If
|
show_np |
Logical, whether to show number of pairs of cells at each distance bin. |
ncol |
Number of columns if facetting. |
colGeometryName |
Name of a |
annotGeometryName |
Name of a |
reducedDimName |
Name of a dimension reduction, can be seen in
|
divergent |
Logical, whether a divergent palette should be used. |
diverge_center |
If |
swap_rownames |
Column name of |
name |
Name under which the correlogram results are stored, which is by default "sp.correlogram". |
A ggplot
object. The empirical variogram at each distance bin
is plotted as points, and the fitted variogram model is plotted as a line
for each feature. The number next to each point is the number of pairs of
cells in that distance bin.
plotVariogramMap
library(SFEData) sfe <- McKellarMuscleData() sfe <- colDataUnivariate(sfe, "variogram", features = "nCounts", model = "Sph") plotVariogram(sfe, "nCounts") # Anisotropy, will get a message sfe <- colDataUnivariate(sfe, "variogram", features = "nCounts", model = "Sph", alpha = c(30, 90, 150), name = "variogram_anis") # Facet by angles by default plotVariogram(sfe, "nCounts", name = "variogram_anis") # Plot angles with different colors plotVariogram(sfe, "nCounts", group = "angles", name = "variogram_anis")
library(SFEData) sfe <- McKellarMuscleData() sfe <- colDataUnivariate(sfe, "variogram", features = "nCounts", model = "Sph") plotVariogram(sfe, "nCounts") # Anisotropy, will get a message sfe <- colDataUnivariate(sfe, "variogram", features = "nCounts", model = "Sph", alpha = c(30, 90, 150), name = "variogram_anis") # Facet by angles by default plotVariogram(sfe, "nCounts", name = "variogram_anis") # Plot angles with different colors plotVariogram(sfe, "nCounts", group = "angles", name = "variogram_anis")
Plot variogram maps that show the variogram in all directions in a grid of distances in x and y coordinates.
plotVariogramMap( sfe, features, sample_id = "all", plot_np = FALSE, ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = "variogram_map" )
plotVariogramMap( sfe, features, sample_id = "all", plot_np = FALSE, ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = "variogram_map" )
sfe |
A |
features |
Features to plot, must be in rownames of the gene count
matrix, colnames of colData or a colGeometry, colnames of cell embeddings
in |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
plot_np |
Logical, whether to plot the number of pairs in each distance bin instead of the variance. |
ncol |
Number of columns if facetting. |
colGeometryName |
Name of a |
annotGeometryName |
Name of a |
reducedDimName |
Name of a dimension reduction, can be seen in
|
swap_rownames |
Column name of |
name |
Name under which the correlogram results are stored, which is by default "sp.correlogram". |
A ggplot object.
plotVariogram
library(SFEData) sfe <- McKellarMuscleData() sfe <- colDataUnivariate(sfe, "variogram_map", features = "nCounts", width = 500, cutoff = 5000) plotVariogramMap(sfe, "nCounts")
library(SFEData) sfe <- McKellarMuscleData() sfe <- colDataUnivariate(sfe, "variogram_map", features = "nCounts", width = 500, cutoff = 5000) plotVariogramMap(sfe, "nCounts")
This S4 class is used to wrap spatial analysis methods, taking inspiration
from the caret
and tidymodels
packages.
SFEMethod( name, fun, reorganize_fun, package, variate = c("uni", "bi", "multi"), scope = c("global", "local"), title = NULL, default_attr = NA, args_not_check = NA, joint = FALSE, use_graph = TRUE, use_matrix = FALSE, dest = c("reducedDim", "colData") ) ## S4 method for signature 'SFEMethod' info(x, type) ## S4 method for signature 'SFEMethod' is_local(x) ## S4 method for signature 'SFEMethod' fun(x) ## S4 method for signature 'SFEMethod' reorganize_fun(x) ## S4 method for signature 'SFEMethod' args_not_check(x) ## S4 method for signature 'SFEMethod' is_joint(x) ## S4 method for signature 'SFEMethod' use_graph(x) ## S4 method for signature 'SFEMethod' use_matrix(x)
SFEMethod( name, fun, reorganize_fun, package, variate = c("uni", "bi", "multi"), scope = c("global", "local"), title = NULL, default_attr = NA, args_not_check = NA, joint = FALSE, use_graph = TRUE, use_matrix = FALSE, dest = c("reducedDim", "colData") ) ## S4 method for signature 'SFEMethod' info(x, type) ## S4 method for signature 'SFEMethod' is_local(x) ## S4 method for signature 'SFEMethod' fun(x) ## S4 method for signature 'SFEMethod' reorganize_fun(x) ## S4 method for signature 'SFEMethod' args_not_check(x) ## S4 method for signature 'SFEMethod' is_joint(x) ## S4 method for signature 'SFEMethod' use_graph(x) ## S4 method for signature 'SFEMethod' use_matrix(x)
name |
Name of the method, used by user-facing functions to specify the method to use, such as "moran" for Moran's I. |
fun |
Function to run the method. See Details. |
reorganize_fun |
Function to reorganize results to add to the SFE object. See Details. |
package |
Name of the package whose implementation of the method is used here, used to check if the package is installed. |
variate |
How many variables this method works with, must be one of "uni" for univariate, "bi" for bivariate, or "multi" for multivariate. |
scope |
Either "global", returning one result for the entire dataset, or "local", returning one result for each spatial location. For multivariate methods, this is irrelevant. |
title |
Descriptive title to show when plotting the results. |
default_attr |
For local methods that return multiple fields, such as local Moran values and their p-values, the default field to use when plotting. |
args_not_check |
A character vector indicating which argument are not to be checked when comparing parameters in with those of a previous run. |
joint |
Logical, whether it makes sense to run this method to multiple
samples jointly. If |
use_graph |
Logical, to indicate whether the method uses a spatial neighborhood graph because unifying the user facing functions have an argument asking for the graph as most though not all methods require the graph. |
use_matrix |
Logical, whether the function in slot |
dest |
Whether the results are more appropriate for |
x |
A |
type |
One of the names of the |
The fun
slot should be specified as such:
For all methods, there must be arguments x
for a vector, listw
for a listw
object specifying the spatial neighborhood graph,
zero.policy
specifying what to do with cells without neighbors
(default NULL, use global option value; if TRUE assign zero to the lagged
value of zones without neighbours, if FALSE assign NA), and optionally other
method specific arguments and ...
to pass to the underlying imported
function. If the original function implementing the method in the package has
different argument names or orders, write a thin wrapper to rearrange and/or
rename the arguments.
For univariate methods that use a spatial neighborhood graph, the first two
arguments must be x
and listw
. For univariate methods that
don't use a spatial neighborhood graph, such as the variogram, the first two
arguments must be x
for a numeric vector and coords_df
for a
sf
data frame with cell locations and optionally other regressors. The
formula
argument is optional and can have defaults specifying
regressors to use.
For bivariate methods, the first three arguments must be x
, y
,
and listw
.
For multivariate methods, the argument x
is mandatory, for the matrix
input. These arguments must be present but can be optional by having
defaults: listw
and ncomponents
to set the number of dimentions
in the output.
The reorganize_fun
slot should be specified as such:
Univariate methods are meant to be run separately for each gene, so the input
to reorganize_fun
in the argument out
should be a list of
outputs; each element of the list corresponds to the output of a gene.
For univariate global methods, different fields of the result should be
columns of a data frame with one row so results for multiple features will be
a data frame. The arguments should be out
, and name
to rename
the primary field if a more informative name is needed, and ...
for
other arguments specific to methods. The output of reorganize_fun
should be a DataFrame
whose rows correspond to the genes and columns
correspond to fields in the output.
For univariate local methods, the arguments should be out
, nb
for a neighborhood list used for multiple testing correction, and
p.adjust.method
for a method to correct for multiple testing as in
p.adjust
, and ...
. The output of reorganize_fun
should be a list of reorganized output. Each element of the list corresponds
to a gene, and the reorganized content of the element can be a vector,
matrix, or data frame, but they must all have the same dimensions for all
genes. Each element of the vector, or each row of the matrix or data frame
corresponds to a cell.
For multivariate methods whose results go into reducedDim
,
reorganize_fun
should have one argument out
for the raw output.
The output of reorganize_fun
should be the cell embedding matrix ready
to be added to reducedDim
. Other relevant information such as gene
loadings and eigenvalues should be added to the attributes of the cell
embedding matrix.
For multivariate methods whose results can go into colData
, the
arguments should be out
, nb
, and p.adjust.method
. Unlike
the univariate local counterpart, out
takes the raw output instead of
a list of outputs. The output of reorganize_fun
is a vector or a data
frame ready to be added to colData
.
The constructor returns a SFEMethod
object. The getters return
the content of the corresponding slots.
info
A named character vector specifying information about the method.
fun
The function implementing the method. See Details.
reorganize_fun
Function to convert output from fun
into a format
to store in the SFE object. See Details.
misc
Miscellaneous information on how the method interacts with the rest of the package. This should be a named list.
moran <- SFEMethod( name = "moran", title = "Moran's I", package = "spdep", variate = "uni", scope = "global", fun = function(x, listw, zero.policy = NULL) spdep::moran(x, listw, n = length(listw$neighbours), S0 = spdep::Szero(listw), zero.policy = zero.policy), reorganize_fun = Voyager:::.moran2df )
moran <- SFEMethod( name = "moran", title = "Moran's I", package = "spdep", variate = "uni", scope = "global", fun = function(x, listw, zero.policy = NULL) spdep::moran(x, listw, n = length(listw$neighbours), S0 = spdep::Szero(listw), zero.policy = zero.policy), reorganize_fun = Voyager:::.moran2df )
Such as plotting the value of projection of gene expression of each cell to a principal component in space. At present, this function does not work for the 3D array of geographically weighted PCA (GWPCA), but a future version will deal with GWPCA results.
spatialReducedDim( sfe, dimred, ncomponents = NULL, components = ncomponents, colGeometryName = 1L, sample_id = "all", ncol = NULL, ncol_sample = NULL, annotGeometryName = NULL, rowGeometryName = NULL, rowGeometryFeatures = NULL, annot_aes = list(), annot_fixed = list(), tx_fixed = list(), exprs_values = "logcounts", bbox = NULL, tx_file = NULL, image_id = NULL, channel = NULL, maxcell = 5e+05, aes_use = c("fill", "color", "shape", "linetype"), divergent = FALSE, diverge_center = NULL, annot_divergent = FALSE, annot_diverge_center = NULL, size = 0, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = NA, fill = "gray80", scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE, ... )
spatialReducedDim( sfe, dimred, ncomponents = NULL, components = ncomponents, colGeometryName = 1L, sample_id = "all", ncol = NULL, ncol_sample = NULL, annotGeometryName = NULL, rowGeometryName = NULL, rowGeometryFeatures = NULL, annot_aes = list(), annot_fixed = list(), tx_fixed = list(), exprs_values = "logcounts", bbox = NULL, tx_file = NULL, image_id = NULL, channel = NULL, maxcell = 5e+05, aes_use = c("fill", "color", "shape", "linetype"), divergent = FALSE, diverge_center = NULL, annot_divergent = FALSE, annot_diverge_center = NULL, size = 0, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = NA, fill = "gray80", scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, show_axes = FALSE, dark = FALSE, palette = colorRampPalette(c("black", "white"))(255), normalize_channels = FALSE, ... )
sfe |
A |
dimred |
A string or integer scalar indicating the reduced dimension
result in |
ncomponents |
A numeric scalar indicating the number of dimensions to plot, starting from the first dimension. Alternatively, a numeric vector specifying the dimensions to be plotted. |
components |
A numeric scalar or vector specifying which dimensions to
be plotted. Use this instead of |
colGeometryName |
Name of a |
sample_id |
Sample(s) in the SFE object whose cells/spots to use. Can be "all" to compute metric for all samples; the metric is computed separately for each sample. |
ncol |
Number of columns if plotting multiple features. Defaults to
|
ncol_sample |
If plotting multiple samples as facets, how many columns
of such facets. This is distinct from |
annotGeometryName |
Name of a |
rowGeometryName |
Name of a |
rowGeometryFeatures |
Which features from |
annot_aes |
A named list of plotting parameters for the annotation sf data frame. The names are which geom (as in ggplot2, such as color and fill), and the values are column names in the annotation sf data frame. Tidyeval is NOT supported. |
annot_fixed |
Similar to |
tx_fixed |
Similar to |
exprs_values |
Integer scalar or string indicating which assay of x contains the expression values. |
bbox |
A bounding box to specify a smaller region to plot, useful when
the dataset is large. Can be a named numeric vector with names "xmin",
"xmax", "ymin", and "ymax", in any order. If plotting multiple samples, it
should be a matrix with sample IDs as column names and "xmin", "ymin",
"xmax", and "ymax" as row names. If multiple samples are plotted but
|
tx_file |
File path to GeoParquet file of the transcript spots if you
don't wish to load all transcript spots into the SFE object. See
|
image_id |
ID of the image to plot behind the geometries. If
|
channel |
Numeric vector indicating which channels in a multi-channel
image to plot. If |
maxcell |
Maximum number of pixels to plot in the image. If the image is larger, it will be resampled so it have less than this number of pixels to save memory and for faster plotting. We recommend reducing this number when plotting multiple facets. |
aes_use |
Aesthetic to use for discrete variables. For continuous variables, it's always "fill" for polygons and point shapes 21-25. For discrete variables, it can be fill, color, shape, or linetype, whenever applicable. The specified value will be changed to the applicable equivalent. For example, if the geometry is point but "linetype" is specified, then "shaped" will be used instead. |
divergent |
Logical, whether a divergent palette should be used. |
diverge_center |
If |
annot_divergent |
Just as |
annot_diverge_center |
Just as |
size |
Fixed size of points. For points defaults to 0.5. Ignored if
|
shape |
Fixed shape of points, ignored if |
linewidth |
Width of lines, including outlines of polygons. For polygons, this defaults to 0, meaning no outlines. |
linetype |
Fixed line type, ignored if |
alpha |
Transparency. |
color |
Fixed color for |
fill |
Similar to |
scattermore |
Logical, whether to use the |
pointsize |
Radius of rasterized point in |
bins |
If binning the |
summary_fun |
Function to summarize the feature value when the
|
hex |
Logical, whether to use |
show_axes |
Logical, whether to show axes. |
dark |
Logical, whether to use dark theme. When using dark theme, the palette will have lighter color represent higher values as if glowing in the dark. This is intended for plotting gene expression on top of fluorescent images. |
palette |
Vector of colors to use to color grayscale images. |
normalize_channels |
Logical, whether to normalize each channel of the
image individually. Should be |
... |
Other arguments passed to |
Same as in plotSpatialFeature
. A ggplot2
object
if plotting one component. A patchwork
object if plotting multiple
components.
scater::plotReducedDim
library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- logNormCounts(sfe) sfe <- runPCA(sfe, ncomponents = 2) spatialReducedDim(sfe, "PCA", ncomponents = 2, "spotPoly", annotGeometryName = "tissueBoundary", divergent = TRUE, diverge_center = 0 ) # Basically PC1 separates spots not on tissue from those on tissue.
library(SFEData) library(scater) sfe <- McKellarMuscleData("small") sfe <- logNormCounts(sfe) sfe <- runPCA(sfe, ncomponents = 2) spatialReducedDim(sfe, "PCA", ncomponents = 2, "spotPoly", annotGeometryName = "tissueBoundary", divergent = TRUE, diverge_center = 0 ) # Basically PC1 separates spots not on tissue from those on tissue.
Wrapper of automap::autofitVariogram
to facilitate computing
variograms for multiple genes in SFE objects as an EDA tool. These functions
are written to conform to a uniform format for univariate methods to be
called internally. These functions are not exported, but the documentation is
written to show users the extra arguments to use when alling
calculateUnivariate
or runUnivariate
.
.variogram(x, coords_df, formula = x ~ 1, scale = TRUE, ...) .variogram_bv(x, y, coords_df, scale = TRUE, map = FALSE, ...) .cross_variogram(x, y, coords_df, scale = TRUE, ...) .cross_variogram_map(x, y, coords_df, width, cutoff, scale = TRUE, ...) .variogram_map(x, coords_df, formula = x ~ 1, width, cutoff, scale = TRUE, ...)
.variogram(x, coords_df, formula = x ~ 1, scale = TRUE, ...) .variogram_bv(x, y, coords_df, scale = TRUE, map = FALSE, ...) .cross_variogram(x, y, coords_df, scale = TRUE, ...) .cross_variogram_map(x, y, coords_df, width, cutoff, scale = TRUE, ...) .variogram_map(x, coords_df, formula = x ~ 1, width, cutoff, scale = TRUE, ...)
x |
A numeric vector whose variogram is computed. |
coords_df |
A |
formula |
A formula defining the response vector and (possible) regressors, in case of absence of regressors, use x ~ 1. |
scale |
Logical, whether to scale |
... |
Other arguments passed to |
y |
For bivariate, another numeric vector whose variogram is computed. |
map |
logical; if TRUE, and |
width |
the width of subsequent distance intervals into which data point pairs are grouped for semivariance estimates |
cutoff |
spatial separation distance up to which point pairs are included in semivariance estimates; as a default, the length of the diagonal of the box spanning the data is divided by three. |
An autofitVariogram
object.