This package provides a lightweight interface between the Bioconductor
SingleCellExperiment data structure and the Python AnnData-based
single-cell analysis environment. The idea is to enable users and
developers to easily move data between these frameworks to construct a
multi-language analysis pipeline across R/Bioconductor and Python.
The readH5AD() function can be used to read a SingleCellExperiment from
an H5AD file. The resulting object can be manipulated in the usual way,
as described in the SingleCellExperiment documentation.
library(zellkonverter)
# Obtaining an example H5AD file.
example_h5ad <- system.file(
    "extdata", "krumsiek11.h5ad",
    package = "zellkonverter"
)
readH5AD(example_h5ad)
## class: SingleCellExperiment
## dim: 11 640
## metadata(2): highlights iroot
## assays(1): X
## rownames(11): Gata2 Gata1 ... EgrNab Gfi1
## rowData names(0):
## colnames(640): 0 1 ... 158-3 159-3
## colData names(1): cell_type
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
We can also write a SingleCellExperiment to an H5AD file with the
writeH5AD() function. This is demonstrated below on the classic Zeisel
mouse brain dataset from the scRNAseq package. The resulting file can
then be used directly in compatible Python-based analysis frameworks.
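The code for this step does not appear in the text above, so the following is a minimal sketch of what such a call might look like. The object name `sce_zeisel` and the temporary path `out_path` are illustrative choices, and running it assumes that scRNAseq can download the dataset.

```r
library(zellkonverter)
library(scRNAseq)

# Load the Zeisel mouse brain dataset as a SingleCellExperiment.
sce_zeisel <- ZeiselBrainData()

# Write it to a temporary H5AD file ('out_path' is an illustrative name).
out_path <- tempfile(fileext = ".h5ad")
writeH5AD(sce_zeisel, file = out_path)
```

The file at `out_path` could then be opened from Python, for example with `anndata.read_h5ad()`.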
SingleCellExperiment and AnnData objects
Developers and power users who control their Python environments can
directly convert between SingleCellExperiment and AnnData objects using
the SCE2AnnData() and AnnData2SCE() utilities. These functions expect
that reticulate has already been loaded along with an appropriate
version of the anndata package. We suggest using the basilisk package to
set up the Python environment before using these functions.
library(basilisk)
library(scRNAseq)
seger <- SegerstolpePancreasData()
roundtrip <- basiliskRun(fun = function(sce) {
    # Convert SCE to AnnData:
    adata <- SCE2AnnData(sce)
    # Maybe do some work in Python on 'adata':
    # BLAH BLAH BLAH
    # Convert back to an SCE:
    AnnData2SCE(adata)
}, env = zellkonverterAnnDataEnv(), sce = seger)
Package developers can guarantee that they are using the same versions
of Python packages as zellkonverter by using the AnnDataDependencies()
function to set up their Python environments.
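The call that produces the listing below is not shown in the text. A minimal sketch, assuming the function is called with its default arguments and that the variable name `deps` is only illustrative:

```r
library(zellkonverter)

# Retrieve the pinned Python dependencies used by zellkonverter
# (default arguments; 'deps' is an illustrative name).
deps <- AnnDataDependencies()
print(deps)
```

The result is a character vector of `package==version` strings, which could be passed on when defining a basilisk environment.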
## [1] "anndata==0.10.9" "h5py==3.12.1" "hdf5==1.14.3" "natsort==8.4.0"
## [5] "numpy==2.1.2" "packaging==24.1" "pandas==2.2.3" "python==3.12.7"
## [9] "scipy==1.14.1"
This function can also be used to return dependencies for environments using older versions of anndata.
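As a hedged sketch of how an older environment might be requested, the example below assumes the anndata version is selected through a `version` argument and that "0.7.6" (the version in the listing that follows) is among the supported values. The name `old_deps` is illustrative.

```r
library(zellkonverter)

# Request the dependencies matching an older anndata release.
# Assumption: the release is chosen via the 'version' argument.
old_deps <- AnnDataDependencies(version = "0.7.6")
print(old_deps)
```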
## [1] "anndata==0.7.6" "h5py==3.2.1" "hdf5==1.10.6" "natsort==7.1.1"
## [5] "numpy==1.20.2" "packaging==20.9" "pandas==1.2.4" "python==3.7.10"
## [9] "scipy==1.6.3" "sqlite==3.35.5"
By default, the functions in zellkonverter do not display any
information about their progress, but this can be turned on by setting
the verbose = TRUE argument.
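For illustration, a minimal sketch of a verbose read on the example H5AD file used earlier. The variable name `sce_verbose` is an assumption made for this example, and the exact progress messages are not reproduced here.

```r
library(zellkonverter)

# Locate the example H5AD file shipped with the package.
example_h5ad <- system.file(
    "extdata", "krumsiek11.h5ad",
    package = "zellkonverter"
)

# Read the file with progress messages enabled for this call only.
sce_verbose <- readH5AD(example_h5ad, verbose = TRUE)
sce_verbose
```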
## class: SingleCellExperiment
## dim: 11 640
## metadata(2): highlights iroot
## assays(1): X
## rownames(11): Gata2 Gata1 ... EgrNab Gfi1
## rowData names(0):
## colnames(640): 0 1 ... 158-3 159-3
## colData names(1): cell_type
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
If you would like to see progress messages for all functions by default,
you can turn this on using the setZellkonverterVerbose() function.
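A minimal sketch of how this might be used, assuming that passing TRUE enables the messages for later calls in the session. The name `sce_default` is illustrative.

```r
library(zellkonverter)

# Turn on progress messages for subsequent zellkonverter calls.
# Assumption: TRUE enables messages, and FALSE would disable them again.
setZellkonverterVerbose(TRUE)

# This call should now report progress without an explicit 'verbose' argument.
example_h5ad <- system.file(
    "extdata", "krumsiek11.h5ad",
    package = "zellkonverter"
)
sce_default <- readH5AD(example_h5ad)
```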
Session information
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] basilisk_1.19.0 reticulate_1.40.0
## [3] scRNAseq_2.20.0 SingleCellExperiment_1.29.1
## [5] SummarizedExperiment_1.37.0 Biobase_2.67.0
## [7] GenomicRanges_1.59.1 GenomeInfoDb_1.43.1
## [9] IRanges_2.41.1 S4Vectors_0.45.2
## [11] BiocGenerics_0.53.3 generics_0.1.3
## [13] MatrixGenerics_1.19.0 matrixStats_1.4.1
## [15] zellkonverter_1.17.0 knitr_1.49
## [17] BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] DBI_1.2.3 bitops_1.0-9 httr2_1.0.6
## [4] rlang_1.1.4 magrittr_2.0.3 gypsum_1.3.0
## [7] compiler_4.4.2 RSQLite_2.3.8 GenomicFeatures_1.59.1
## [10] dir.expiry_1.15.0 png_0.1-8 vctrs_0.6.5
## [13] ProtGenerics_1.39.0 pkgconfig_2.0.3 crayon_1.5.3
## [16] fastmap_1.2.0 dbplyr_2.5.0 XVector_0.47.0
## [19] utf8_1.2.4 Rsamtools_2.23.0 rmarkdown_2.29
## [22] UCSC.utils_1.3.0 bit_4.5.0 xfun_0.49
## [25] zlibbioc_1.52.0 cachem_1.1.0 jsonlite_1.8.9
## [28] blob_1.2.4 rhdf5filters_1.19.0 DelayedArray_0.33.2
## [31] Rhdf5lib_1.29.0 BiocParallel_1.41.0 parallel_4.4.2
## [34] R6_2.5.1 bslib_0.8.0 rtracklayer_1.67.0
## [37] jquerylib_0.1.4 Rcpp_1.0.13-1 Matrix_1.7-1
## [40] tidyselect_1.2.1 abind_1.4-8 yaml_2.3.10
## [43] codetools_0.2-20 curl_6.0.1 alabaster.sce_1.7.0
## [46] lattice_0.22-6 tibble_3.2.1 basilisk.utils_1.19.0
## [49] withr_3.0.2 KEGGREST_1.47.0 evaluate_1.0.1
## [52] BiocFileCache_2.15.0 alabaster.schemas_1.7.0 ExperimentHub_2.15.0
## [55] Biostrings_2.75.1 pillar_1.9.0 BiocManager_1.30.25
## [58] filelock_1.0.3 RCurl_1.98-1.16 ensembldb_2.31.0
## [61] BiocVersion_3.21.1 alabaster.base_1.7.2 alabaster.ranges_1.7.0
## [64] glue_1.8.0 lazyeval_0.2.2 alabaster.matrix_1.7.2
## [67] maketools_1.3.1 tools_4.4.2 AnnotationHub_3.15.0
## [70] BiocIO_1.17.1 sys_3.4.3 GenomicAlignments_1.43.0
## [73] buildtools_1.0.0 XML_3.99-0.17 rhdf5_2.51.0
## [76] grid_4.4.2 AnnotationDbi_1.69.0 GenomeInfoDbData_1.2.13
## [79] HDF5Array_1.35.1 restfulr_0.0.15 cli_3.6.3
## [82] rappdirs_0.3.3 fansi_1.0.6 S4Arrays_1.7.1
## [85] dplyr_1.1.4 AnnotationFilter_1.31.0 alabaster.se_1.7.0
## [88] sass_0.4.9 digest_0.6.37 SparseArray_1.7.2
## [91] rjson_0.2.23 memoise_2.0.1 htmltools_0.5.8.1
## [94] lifecycle_1.0.4 httr_1.4.7 bit64_4.5.2