Save/load spatial omics data to/from file

Overview

The SpatialExperiment class (from the SpatialExperiment package) provides a representation of spatial transcriptomics data that is compatible with Bioconductor’s SummarizedExperiment ecosystem. The alabaster.spatial package contains methods to save SpatialExperiment objects to file and load them back. See the alabaster.base package for more details on the motivation and concepts of the alabaster framework.

Quick start

To demonstrate, we’ll use the example dataset provided in the SpatialExperiment package:

library(SpatialExperiment)

# Copying the example from ?read10xVisium.
dir <- system.file("extdata", "10xVisium", package = "SpatialExperiment")
sample_ids <- c("section1", "section2")
samples <- file.path(dir, sample_ids, "outs")
spe <- read10xVisium(
   samples,
   sample_ids,
   type = "sparse",
   data = "raw",
   images = "lowres", 
   load = FALSE
)
colnames(spe) <- make.unique(colnames(spe)) # Making the column names unique.

spe
## class: SpatialExperiment 
## dim: 50 99 
## metadata(0):
## assays(1): counts
## rownames(50): ENSMUSG00000051951 ENSMUSG00000089699 ...
##   ENSMUSG00000005886 ENSMUSG00000101476
## rowData names(1): symbol
## colnames(99): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
##   AAAGTCGACCCTCAGT-1.1 AAAGTGCCATCAATTA-1.1
## colData names(4): in_tissue array_row array_col sample_id
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor

We call the usual saveObject() function to save its contents to file:

library(alabaster.spatial)
tmp <- tempfile()
saveObject(spe, tmp)
list.files(tmp, recursive=TRUE)
##  [1] "OBJECT"                       "assays/0/OBJECT"             
##  [3] "assays/0/matrix.h5"           "assays/names.json"           
##  [5] "column_data/OBJECT"           "column_data/basic_columns.h5"
##  [7] "coordinates/OBJECT"           "coordinates/array.h5"        
##  [9] "images/0.png"                 "images/1.png"                
## [11] "images/mapping.h5"            "row_data/OBJECT"             
## [13] "row_data/basic_columns.h5"

This follows the usual saving process for SingleCellExperiment objects, with an additional step for the spatial data (see the coordinates/ and images/ subdirectories). We can then load the object back in using the readObject() function:

roundtrip <- readObject(tmp)
plot(imgRaster(getImg(roundtrip, "section1")))

More details on the metadata and on-disk layout are provided in the schema.
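
To get a quick view of the on-disk layout without reading the schema, one can walk the saved directory and report what each OBJECT file declares. The sketch below uses a hypothetical helper, describe_saved_object(), which is not part of any alabaster package. It assumes that each OBJECT file is a small JSON document with a "type" field, and it reports NA where that field is absent.

```r
library(jsonlite)

# Hypothetical helper (not part of alabaster): find every OBJECT file under
# a saved directory and report the "type" field that it declares.
# Assumes each OBJECT file is JSON; entries without a "type" field give NA.
describe_saved_object <- function(dir) {
    paths <- list.files(dir, pattern = "^OBJECT$", recursive = TRUE)
    types <- vapply(paths, function(p) {
        meta <- fromJSON(file.path(dir, p))
        if (is.null(meta$type)) NA_character_ else as.character(meta$type)[1]
    }, character(1))
    data.frame(path = paths, type = unname(types), stringsAsFactors = FALSE)
}
```

Calling describe_saved_object(tmp) on the directory written by saveObject() above should return one row per OBJECT file in the listing, including the top-level object and its assays/0, column_data, coordinates and row_data components.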

Session info

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] alabaster.spatial_1.7.1     alabaster.base_1.7.0       
##  [3] SpatialExperiment_1.17.0    SingleCellExperiment_1.29.1
##  [5] SummarizedExperiment_1.37.0 Biobase_2.67.0             
##  [7] GenomicRanges_1.59.0        GenomeInfoDb_1.43.0        
##  [9] IRanges_2.41.0              S4Vectors_0.45.0           
## [11] BiocGenerics_0.53.1         generics_0.1.3             
## [13] MatrixGenerics_1.19.0       matrixStats_1.4.1          
## [15] BiocStyle_2.35.0           
## 
## loaded via a namespace (and not attached):
##  [1] rjson_0.2.23              xfun_0.49                
##  [3] bslib_0.8.0               rhdf5_2.51.0             
##  [5] lattice_0.22-6            rhdf5filters_1.19.0      
##  [7] tools_4.4.2               parallel_4.4.2           
##  [9] R.oo_1.27.0               Matrix_1.7-1             
## [11] sparseMatrixStats_1.19.0  dqrng_0.4.1              
## [13] lifecycle_1.0.4           GenomeInfoDbData_1.2.13  
## [15] compiler_4.4.2            statmod_1.5.0            
## [17] alabaster.se_1.7.0        codetools_0.2-20         
## [19] htmltools_0.5.8.1         sys_3.4.3                
## [21] buildtools_1.0.0          sass_0.4.9               
## [23] alabaster.matrix_1.7.0    yaml_2.3.10              
## [25] crayon_1.5.3              jquerylib_0.1.4          
## [27] R.utils_2.12.3            BiocParallel_1.41.0      
## [29] limma_3.63.1              DelayedArray_0.33.1      
## [31] cachem_1.1.0              magick_2.8.5             
## [33] abind_1.4-8               locfit_1.5-9.10          
## [35] digest_0.6.37             maketools_1.3.1          
## [37] fastmap_1.2.0             grid_4.4.2               
## [39] cli_3.6.3                 SparseArray_1.7.1        
## [41] magrittr_2.0.3            S4Arrays_1.7.1           
## [43] edgeR_4.5.0               DelayedMatrixStats_1.29.0
## [45] UCSC.utils_1.3.0          rmarkdown_2.29           
## [47] XVector_0.47.0            httr_1.4.7               
## [49] DropletUtils_1.27.0       R.methodsS3_1.8.2        
## [51] beachmat_2.23.0           alabaster.sce_1.7.0      
## [53] HDF5Array_1.35.1          evaluate_1.0.1           
## [55] knitr_1.49                rlang_1.1.4              
## [57] Rcpp_1.0.13-1             scuttle_1.17.0           
## [59] BiocManager_1.30.25       alabaster.ranges_1.7.0   
## [61] alabaster.schemas_1.7.0   jsonlite_1.8.9           
## [63] R6_2.5.1                  Rhdf5lib_1.29.0          
## [65] zlibbioc_1.52.0