Saving SingleCellExperiments to artifacts and back again

Overview

The alabaster.sce package implements methods to save SingleCellExperiment objects to file artifacts and load them back into R. See the alabaster.base package for more details on the motivation and concepts of the alabaster framework.

Quick start

Given a SingleCellExperiment, we can use saveObject() to save it inside a staging directory:

library(SingleCellExperiment)
mat <- matrix(rpois(10000, 10), ncol=10)
colnames(mat) <- letters[1:10]
rownames(mat) <- sprintf("GENE_%i", seq_len(nrow(mat)))

sce <- SingleCellExperiment(list(counts=mat))
sce$stuff <- LETTERS[1:10]
sce$blah <- runif(10)
reducedDims(sce) <- list(
 PCA=matrix(rnorm(ncol(sce)*10), ncol=10),
 TSNE=matrix(rnorm(ncol(sce)*2), ncol=2)
)
altExps(sce) <- list(spikes=SummarizedExperiment(list(counts=mat[1:2,])))
sce
## class: SingleCellExperiment 
## dim: 1000 10 
## metadata(0):
## assays(1): counts
## rownames(1000): GENE_1 GENE_2 ... GENE_999 GENE_1000
## rowData names(0):
## colnames(10): a b ... i j
## colData names(2): stuff blah
## reducedDimNames(2): PCA TSNE
## mainExpName: NULL
## altExpNames(1): spikes
library(alabaster.sce)
tmp <- tempfile()
saveObject(sce, tmp)

list.files(tmp, recursive=TRUE)
##  [1] "OBJECT"                                                
##  [2] "alternative_experiments/0/OBJECT"                      
##  [3] "alternative_experiments/0/assays/0/OBJECT"             
##  [4] "alternative_experiments/0/assays/0/array.h5"           
##  [5] "alternative_experiments/0/assays/names.json"           
##  [6] "alternative_experiments/0/column_data/OBJECT"          
##  [7] "alternative_experiments/0/column_data/basic_columns.h5"
##  [8] "alternative_experiments/0/row_data/OBJECT"             
##  [9] "alternative_experiments/0/row_data/basic_columns.h5"   
## [10] "alternative_experiments/names.json"                    
## [11] "assays/0/OBJECT"                                       
## [12] "assays/0/array.h5"                                     
## [13] "assays/names.json"                                     
## [14] "column_data/OBJECT"                                    
## [15] "column_data/basic_columns.h5"                          
## [16] "reduced_dimensions/0/OBJECT"                           
## [17] "reduced_dimensions/0/array.h5"                         
## [18] "reduced_dimensions/1/OBJECT"                           
## [19] "reduced_dimensions/1/array.h5"                         
## [20] "reduced_dimensions/names.json"                         
## [21] "row_data/OBJECT"                                       
## [22] "row_data/basic_columns.h5"
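
The listing suggests that each component lives in its own subdirectory, with an OBJECT file describing its type and a names.json file recording the names of numbered entries. The following is a minimal sketch for peeking at those files, not a description of the full on-disk specification. It assumes that readObjectFile() from alabaster.base returns a list with a type field, and that each names.json is a plain JSON array of strings readable with jsonlite. The variables demo and demo_dir are illustrative names chosen to avoid clashing with the objects above.

```r
library(SingleCellExperiment)
library(alabaster.sce)

# Small illustrative object with two reduced dimensions (hypothetical data).
demo_mat <- matrix(rpois(200, 10), ncol=10,
    dimnames=list(sprintf("GENE_%i", 1:20), letters[1:10]))
demo <- SingleCellExperiment(list(counts=demo_mat))
reducedDims(demo) <- list(
    PCA=matrix(rnorm(ncol(demo)*5), ncol=5),
    TSNE=matrix(rnorm(ncol(demo)*2), ncol=2)
)

demo_dir <- tempfile()
saveObject(demo, demo_dir)

# Assumption: the top-level OBJECT file is metadata that records the object type.
obj_info <- alabaster.base::readObjectFile(demo_dir)
obj_info$type

# Assumption: names.json holds a JSON array of names, in the same order
# as the numbered subdirectories (0, 1, ...).
rd_names <- jsonlite::fromJSON(file.path(demo_dir, "reduced_dimensions", "names.json"))
assay_names <- jsonlite::fromJSON(file.path(demo_dir, "assays", "names.json"))
rd_names
assay_names
```

If these assumptions hold, reduced_dimensions/0 corresponds to the first entry of rd_names and reduced_dimensions/1 to the second, which makes it possible to locate a specific component on disk without loading the whole object.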

We can then load it back into the session with readObject().

roundtrip <- readObject(tmp)
class(roundtrip)
## [1] "SingleCellExperiment"
## attr(,"package")
## [1] "SingleCellExperiment"
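
Checking the class alone does not confirm that the contents survived the round trip. The sketch below compares a few components of a saved and reloaded object. It assumes that the reloaded assay may be a file-backed matrix rather than an ordinary one, so values are compared after as.matrix() instead of with identical(). The names demo2 and demo2_dir are illustrative.

```r
library(SingleCellExperiment)
library(alabaster.sce)

# Small illustrative object, built the same way as in the quick start.
demo2_mat <- matrix(rpois(200, 10), ncol=10)
colnames(demo2_mat) <- letters[1:10]
rownames(demo2_mat) <- sprintf("GENE_%i", seq_len(nrow(demo2_mat)))
demo2 <- SingleCellExperiment(list(counts=demo2_mat))
demo2$stuff <- LETTERS[1:10]
reducedDims(demo2) <- list(PCA=matrix(rnorm(ncol(demo2)*5), ncol=5))
altExps(demo2) <- list(spikes=SummarizedExperiment(list(counts=demo2_mat[1:2,])))

demo2_dir <- tempfile()
saveObject(demo2, demo2_dir)
reloaded <- readObject(demo2_dir)

# Structural checks: dimensions and names should match the original.
stopifnot(identical(dim(reloaded), dim(demo2)))
stopifnot(identical(rownames(reloaded), rownames(demo2)))
stopifnot(identical(colnames(reloaded), colnames(demo2)))
stopifnot(identical(reducedDimNames(reloaded), reducedDimNames(demo2)))
stopifnot(identical(altExpNames(reloaded), altExpNames(demo2)))

# Value checks: coerce to an ordinary matrix first, since the reloaded
# assay is not guaranteed to be a base R matrix.
stopifnot(all(as.matrix(assay(reloaded, "counts")) == demo2_mat))
stopifnot(all(reloaded$stuff == demo2$stuff))
```

Comparing values rather than calling identical() on the whole object is a deliberate choice here: the in-memory representation of some components can differ after loading even when the data are the same.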

Session information

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] alabaster.sce_1.7.0         alabaster.base_1.7.2       
##  [3] SingleCellExperiment_1.29.1 SummarizedExperiment_1.37.0
##  [5] Biobase_2.67.0              GenomicRanges_1.59.1       
##  [7] GenomeInfoDb_1.43.2         IRanges_2.41.1             
##  [9] S4Vectors_0.45.2            BiocGenerics_0.53.3        
## [11] generics_0.1.3              MatrixGenerics_1.19.0      
## [13] matrixStats_1.4.1           BiocStyle_2.35.0           
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.9              SparseArray_1.7.2       lattice_0.22-6         
##  [4] alabaster.se_1.7.0      digest_0.6.37           evaluate_1.0.1         
##  [7] grid_4.4.2              fastmap_1.2.0           jsonlite_1.8.9         
## [10] Matrix_1.7-1            alabaster.schemas_1.7.0 BiocManager_1.30.25    
## [13] httr_1.4.7              UCSC.utils_1.3.0        HDF5Array_1.35.1       
## [16] jquerylib_0.1.4         abind_1.4-8             cli_3.6.3              
## [19] rlang_1.1.4             crayon_1.5.3            XVector_0.47.0         
## [22] cachem_1.1.0            DelayedArray_0.33.2     yaml_2.3.10            
## [25] S4Arrays_1.7.1          tools_4.4.2             Rhdf5lib_1.29.0        
## [28] GenomeInfoDbData_1.2.13 alabaster.ranges_1.7.0  alabaster.matrix_1.7.2 
## [31] buildtools_1.0.0        R6_2.5.1                lifecycle_1.0.4        
## [34] rhdf5_2.51.0            zlibbioc_1.52.0         bslib_0.8.0            
## [37] Rcpp_1.0.13-1           xfun_0.49               sys_3.4.3              
## [40] knitr_1.49              rhdf5filters_1.19.0     htmltools_0.5.8.1      
## [43] rmarkdown_2.29          maketools_1.3.1         compiler_4.4.2