As including a more detailed vignette inside the package makes the package exceed the tarball size, more detailed vignettes are hosted on an external website. This is a simplified vignette.
This package can be installed from Bioconductor:
In non-spatial scRNA-seq, the SingleCellExperiment
(SCE) package implements a data structure and other packages such as scater
implement methods for quality control (QC), basic exploratory data
analysis (EDA), and plotting functions, using SCE to organize the data
and results. Voyager
to SpatialFeatureExperiment
(SFE) aims to be analogous scater
to SFE, implementing
basic exploratory spatial data analysis (ESDA) and plotting.
SFE inherits from SCE and SpatialExperiment
(SPE), so all methods written for SCE and SPE can be used for SFE as
well.
In this first version, ESDA is based on the classic geospatial
package spdep
,
but future versions will incorporate methods from GWmodel
,
adespatial
,
and etc.
These are the main functionalities of the Voyager
at
present:
colData
along with
annotation geometries, with colorblind friendly default palettes. The
actual geometries are plotted, not just centroids as in
Seurat
. The tissue image can be plotted behind the
geometries.Future versions may add user friendly wrappers of some successful
spatial transcriptomics data analysis packages for spatially variable
genes, cell type deconvolution, and spatial regions on CRAN,
Bioconductor, pip, and conda, to provide a uniform syntax and avoid
object conversion, as is done in Seurat
for some
non-spatial scRNA-seq methods.
Here we use a mouse skeletal muscle Visium dataset from Large-scale
integration of single-cell transcriptomic data captures transitional
progenitor states in mouse skeletal muscle regeneration. It’s in the
SFEData
package, as an SFE object, which contains Visium
spot polygons, myofiber and nuclei segmentations, and myofiber and
nuclei morphological metrics.
library(SFEData)
library(SpatialFeatureExperiment)
library(SpatialExperiment)
library(ggplot2)
library(Voyager)
library(scater)
library(scran)
library(pheatmap)
This is the H&E image:
if (!file.exists("tissue_lowres_5a.jpeg")) {
download.file("https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg",
destfile = "tissue_lowres_5a.jpeg")
}
sfe <- McKellarMuscleData()
#> see ?SFEData and browseVignettes('SFEData') for documentation
#> downloading 1 resources
#> retrieving 1 resource
#> loading from cache
This dataset was not originally in the standard Space Ranger output
format, so we can’t use read10xVisiumSFE()
. But the image
can be added later for plotting.
sfe <- addImg(sfe, imageSource = "tissue_lowres_5a.jpeg", sample_id = "Vis5A",
image_id = "lowres", scale_fct = 1024/22208)
The image needs to be flipped to match the spots, because the origin of the image is at the top left corner while the origin of the spots is at the bottom left.
# Only use spots in tissue here
sfe <- sfe[,colData(sfe)$in_tissue]
sfe <- logNormCounts(sfe)
sfe
#> class: SpatialFeatureExperiment
#> dim: 15123 932
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ...
#> ENSMUSG00000064368 ENSMUSG00000064370
#> rowData names(6): Ensembl symbol ... vars cv2
#> colnames(932): AAACATTTCCCGGATT AAACCTAAGCAGCCGG ... TTGTGTTTCCCGAAAG
#> TTGTTGTGTGTCAAGA
#> colData names(13): barcode col ... in_tissue sizeFactor
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : imageX imageY
#> imgData names(4): sample_id image_id data scaleFactor
#>
#> unit: full_res_image_pixels
#> Geometries:
#> colGeometries: spotPoly (POLYGON)
#> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT)
#>
#> Graphs:
#> Vis5A:
A spatial neighborhood graph is required for all spdep
analyses.
All of the numerous univariate methods can be used with
runUnivariate()
, which stores global results in
rowData(sfe)
and local results in
localResults(sfe)
. Here we compute Moran’s I for one gene.
While Ensembl IDs are used internally, the user can specify more human
readable gene symbols. A warning will be given if the gene symbol
matches multiple Ensembl IDs.
sfe <- runUnivariate(sfe, type = "moran", features = features_use,
colGraphName = "visium", swap_rownames = "symbol")
# Look at the results
rowData(sfe)[rowData(sfe)$symbol %in% features_use,]
#> DataFrame with 2 rows and 8 columns
#> Ensembl symbol type means
#> <character> <character> <character> <numeric>
#> ENSMUSG00000033196 ENSMUSG00000033196 Myh2 Gene Expression 0.97476
#> ENSMUSG00000056328 ENSMUSG00000056328 Myh1 Gene Expression 4.82572
#> vars cv2 moran_Vis5A K_Vis5A
#> <numeric> <numeric> <numeric> <numeric>
#> ENSMUSG00000033196 24.0374 25.2984 0.625500 2.21641
#> ENSMUSG00000056328 302.2385 12.9785 0.635718 2.68736
Since Moran’s I is very commonly used, one can call
runMoransI
rather than runUnivariate
.
Compute a local spatial statistic, Getis-Ord Gi*, which is commonly
used to detect hotspots and coldspots. The include_self
argument is only for Getis-Ord Gi*; when set to TRUE
Gi* is
computed as the spatial graph includes self-directing edges, and
otherwise Gi is computed.
sfe <- runUnivariate(sfe, type = "localG", features = features_use,
colGraphName = "visium", include_self = TRUE,
swap_rownames = "symbol")
# Look at the results
head(localResults(sfe, name = "localG")[[1]])
#> localG G*i E(G*i) V(G*i) Z(G*i) Pr(z != E(G*i))
#> 1 0.9539649 0.0013011969 0.001072961 5.72403e-08 0.9539649 0.3401014102
#> 2 2.2196433 0.0014737468 0.001072961 3.26030e-08 2.2196433 0.0264429927
#> 3 -1.4500290 0.0008111398 0.001072961 3.26030e-08 -1.4500290 0.1470504457
#> 4 3.5457397 0.0017131908 0.001072961 3.26030e-08 3.5457397 0.0003915127
#> 5 1.1947382 0.0012886869 0.001072961 3.26030e-08 1.1947382 0.2321893509
#> 6 -1.4750237 0.0008066266 0.001072961 3.26030e-08 -1.4750237 0.1402061682
#> -log10p -log10p_adj cluster
#> 1 0.4683916 0.0000000 High
#> 2 1.5776894 0.6745994 High
#> 3 0.8325337 0.0000000 High
#> 4 3.4072541 2.5041641 High
#> 5 0.6341577 0.0000000 High
#> 6 0.8532329 0.0000000 Low
Spatial statistics can also be computed for numeric columns of
colData(sfe)
, with colDataUnivariate()
, and
for numeric attributes of the geometries with
colGeometryUnivariate()
and
annotGeometryUnivariate()
, all with very similar
arguments.
Akin to runUnivariate()
and
calculateUnivariate()
, the uniform user interface for
bivariate spatial statistics is runBivariate()
and
calculateBivariate()
. Here we find top highly variable
genes (HVGs) and compute a spatially informed correlation matrix, with
Lee’s L. Note that global bivariate results can’t be stored in the SFE
object in this version of Voyager
.
pheatmap(res, color = scales::viridis_pal()(256), cellwidth = 1, cellheight = 1,
show_rownames = FALSE, show_colnames = FALSE)
Here we see groups of genes correlated to each other, taking spatial autocorrelation into account. This matrix can be used in WGCNA to find gene coexpression modules. Note that Lee’s L of a gene with itself is not 1, because of a spatial smoothing factor.
Multivariate spatial statistics also have a uniform user interface,
runMultivariate()
. The matrix results will go to
reducedDims
, while vector and data frame results can go
into colData
. Here we perform a form of spatially informed
PCA, which maximizes the product of Moran’s I and variance explained by
each principal component. This method is called MULTISPATI, which is
originally implemented in the adespatial
package, but a
much faster albeit less flexible implementation is used in
Voyager
. Because of the Moran’s I, MULTISPATI can give
negative eigenvalues, signifying negative spatial autocorrelation. The
number of the most negative components to compute is specified in the
nfnega
argument.
Plot gene expression and colData(sfe)
together with
annotation geometry. Here nCounts
is the total UMI counts
per spot, which is in colData
.
plotSpatialFeature(sfe, c("nCounts", "Myh1"), colGeometryName = "spotPoly",
annotGeometryName = "myofiber_simplified",
aes_use = "color", linewidth = 0.5, fill = NA,
annot_aes = list(fill = "area"), swap_rownames = "symbol")
Plot local results, with the image. The maxcell
argument
is the maximum number of pixels to plot from the image. If the image has
more pixels than that, then it will be downsampled to speed up
plotting.
plotLocalResult(sfe, "localG", features = features_use,
colGeometryName = "spotPoly", divergent = TRUE,
diverge_center = 0, swap_rownames = "symbol",
image_id = "lowres", maxcell = 5e+4)
Plot the eigenvalues:
Plot dimension reduction (projection of each cell’s gene expression profile on each dimension) in space:
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] pheatmap_1.0.12 scran_1.35.0
#> [3] scater_1.35.0 scuttle_1.17.0
#> [5] Voyager_1.9.2 ggplot2_3.5.1
#> [7] SpatialExperiment_1.17.0 SingleCellExperiment_1.29.1
#> [9] SummarizedExperiment_1.37.0 Biobase_2.67.0
#> [11] GenomicRanges_1.59.1 GenomeInfoDb_1.43.2
#> [13] IRanges_2.41.2 S4Vectors_0.45.2
#> [15] BiocGenerics_0.53.3 generics_0.1.3
#> [17] MatrixGenerics_1.19.0 matrixStats_1.4.1
#> [19] SpatialFeatureExperiment_1.9.6 SFEData_1.8.1
#> [21] BiocStyle_2.35.0
#>
#> loaded via a namespace (and not attached):
#> [1] splines_4.4.2 bitops_1.0-9
#> [3] filelock_1.0.3 tibble_3.2.1
#> [5] R.oo_1.27.0 lifecycle_1.0.4
#> [7] sf_1.0-19 edgeR_4.5.1
#> [9] lattice_0.22-6 MASS_7.3-61
#> [11] magrittr_2.0.3 limma_3.63.2
#> [13] sass_0.4.9 rmarkdown_2.29
#> [15] jquerylib_0.1.4 yaml_2.3.10
#> [17] metapod_1.15.0 sp_2.1-4
#> [19] RColorBrewer_1.1-3 DBI_1.2.3
#> [21] buildtools_1.0.0 multcomp_1.4-26
#> [23] abind_1.4-8 spatialreg_1.3-6
#> [25] purrr_1.0.2 R.utils_2.12.3
#> [27] RCurl_1.98-1.16 TH.data_1.1-2
#> [29] rappdirs_0.3.3 sandwich_3.1-1
#> [31] GenomeInfoDbData_1.2.13 ggrepel_0.9.6
#> [33] irlba_2.3.5.1 maketools_1.3.1
#> [35] terra_1.8-5 units_0.8-5
#> [37] RSpectra_0.16-2 dqrng_0.4.1
#> [39] DelayedMatrixStats_1.29.0 codetools_0.2-20
#> [41] DropletUtils_1.27.2 DelayedArray_0.33.3
#> [43] tidyselect_1.2.1 UCSC.utils_1.3.0
#> [45] memuse_4.2-3 farver_2.1.2
#> [47] viridis_0.6.5 ScaledMatrix_1.15.0
#> [49] BiocFileCache_2.15.0 jsonlite_1.8.9
#> [51] BiocNeighbors_2.1.2 e1071_1.7-16
#> [53] survival_3.8-3 tools_4.4.2
#> [55] ggnewscale_0.5.0 Rcpp_1.0.13-1
#> [57] glue_1.8.0 gridExtra_2.3
#> [59] SparseArray_1.7.2 xfun_0.49
#> [61] EBImage_4.49.0 dplyr_1.1.4
#> [63] HDF5Array_1.35.2 withr_3.0.2
#> [65] BiocManager_1.30.25 fastmap_1.2.0
#> [67] boot_1.3-31 rhdf5filters_1.19.0
#> [69] bluster_1.17.0 spData_2.3.3
#> [71] digest_0.6.37 rsvd_1.0.5
#> [73] mime_0.12 R6_2.5.1
#> [75] colorspace_2.1-1 wk_0.9.4
#> [77] LearnBayes_2.15.1 jpeg_0.1-10
#> [79] RSQLite_2.3.9 R.methodsS3_1.8.2
#> [81] data.table_1.16.4 class_7.3-22
#> [83] httr_1.4.7 htmlwidgets_1.6.4
#> [85] S4Arrays_1.7.1 spdep_1.3-8
#> [87] pkgconfig_2.0.3 scico_1.5.0
#> [89] gtable_0.3.6 blob_1.2.4
#> [91] XVector_0.47.1 sys_3.4.3
#> [93] htmltools_0.5.8.1 fftwtools_0.9-11
#> [95] scales_1.3.0 png_0.1-8
#> [97] knitr_1.49 rjson_0.2.23
#> [99] coda_0.19-4.1 nlme_3.1-166
#> [101] curl_6.0.1 proxy_0.4-27
#> [103] cachem_1.1.0 zoo_1.8-12
#> [105] rhdf5_2.51.1 BiocVersion_3.21.1
#> [107] KernSmooth_2.23-24 vipor_0.4.7
#> [109] parallel_4.4.2 AnnotationDbi_1.69.0
#> [111] s2_1.1.7 pillar_1.10.0
#> [113] grid_4.4.2 vctrs_0.6.5
#> [115] BiocSingular_1.23.0 dbplyr_2.5.0
#> [117] beachmat_2.23.5 sfheaders_0.4.4
#> [119] cluster_2.1.8 beeswarm_0.4.0
#> [121] evaluate_1.0.1 zeallot_0.1.0
#> [123] magick_2.8.5 mvtnorm_1.3-2
#> [125] cli_3.6.3 locfit_1.5-9.10
#> [127] compiler_4.4.2 rlang_1.1.4
#> [129] crayon_1.5.3 labeling_0.4.3
#> [131] classInt_0.4-10 ggbeeswarm_0.7.2
#> [133] viridisLite_0.4.2 deldir_2.0-4
#> [135] BiocParallel_1.41.0 munsell_0.5.1
#> [137] Biostrings_2.75.3 tiff_0.1-12
#> [139] Matrix_1.7-1 ExperimentHub_2.15.0
#> [141] patchwork_1.3.0 sparseMatrixStats_1.19.0
#> [143] bit64_4.5.2 Rhdf5lib_1.29.0
#> [145] KEGGREST_1.47.0 statmod_1.5.0
#> [147] AnnotationHub_3.15.0 igraph_2.1.2
#> [149] memoise_2.0.1 bslib_0.8.0
#> [151] bit_4.5.0.1