Using TileDB-backed matrices with beachmat

Overview

beachmat.tiledb provides a C++ API to extract numeric data from the TileDB-backed matrices of the TileDBArray package. It extends the beachmat package to the matrix representations in the tatami_tiledb library. By loading this package, users and developers enable tatami-compatible C++ code to operate natively on file-backed data via the TileDB C library.

For users

Users can simply load the package in their R session:

library(beachmat.tiledb)

This will automatically extend beachmat’s functionality to TileDBArray matrices. Any package code based on beachmat will now be able to access TileDB data natively without any further work.
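As a minimal sketch of what this looks like in practice, the code below writes a small matrix to a temporary TileDB array and passes it to beachmat's initializeCpp() generic. The object names (mat, tdb, ptr) are illustrative, and the example assumes that TileDBArray is installed and that the as() coercion writes to a temporary location, as described in the TileDBArray documentation.

```r
library(TileDBArray)
library(beachmat.tiledb)

# Illustrative data: a small dense matrix written to a temporary TileDB array.
mat <- matrix(rnorm(200), nrow = 20, ncol = 10)
tdb <- as(mat, "TileDBArray")

# With beachmat.tiledb loaded, initializeCpp() should dispatch to the
# TileDB-specific method and return a handle to a C++ matrix representation.
ptr <- beachmat::initializeCpp(tdb)
```

In normal use, a handle like ptr is created and consumed inside beachmat-based package functions, so users rarely need to call initializeCpp() themselves.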

For developers

Developers should read the beachmat developer guide if they have not done so already.

Developers can import beachmat.tiledb in their packages to guarantee native support for TileDBArray classes. This registers additional initializeCpp() methods that initialize the appropriate C++ representations for these classes. Importing the package does add dependencies, which may or may not be acceptable; some developers may prefer to leave this choice to the user or hide it behind an optional parameter to reduce the installation burden (e.g., if TileDB-backed matrices are not expected to be a common input in the package workflow).
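One way to keep the dependency optional is sketched below. The helper name enable_tiledb_support() and its use.tiledb argument are hypothetical and not part of beachmat or beachmat.tiledb; the sketch assumes that beachmat.tiledb is listed under Suggests rather than Imports, and that loading its namespace is enough to register its initializeCpp() methods.

```r
# Hypothetical helper for a downstream package: load beachmat.tiledb only
# when the caller asks for it and the package is actually installed.
enable_tiledb_support <- function(use.tiledb = TRUE) {
    if (!use.tiledb) {
        return(FALSE)
    }
    # requireNamespace() returns TRUE if the namespace could be loaded and
    # FALSE otherwise; it does not attach the package to the search path.
    isTRUE(requireNamespace("beachmat.tiledb", quietly = TRUE))
}

# Returns TRUE if native TileDB support is now available, FALSE otherwise.
native <- enable_tiledb_support(use.tiledb = TRUE)
```

If the helper returns FALSE, beachmat-based code can still process TileDBMatrix objects through its generic R-based path, only without native access.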

It’s worth noting that beachmat by itself will already work with TileDBMatrix objects even without loading beachmat.tiledb. However, this is less efficient, because any package C++ code needs to go back into R to extract the matrix data via DelayedArray::extract_array() and friends. Importing beachmat.tiledb provides native support without the need for calls to R functions.
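To give a sense of the R-level extraction used in the fallback path, the sketch below pulls a block of rows out of a TileDB-backed matrix with extract_array(). It only illustrates the kind of call involved and is not a reproduction of beachmat's internal code; the object names are illustrative, and it assumes that attaching TileDBArray also makes extract_array() and seed() available through DelayedArray.

```r
library(TileDBArray)

# Illustrative data: a small dense matrix written to a temporary TileDB array.
mat <- matrix(runif(200), nrow = 20, ncol = 10)
tdb <- as(mat, "TileDBArray")

# Extract the first five rows and all columns as an ordinary in-memory array.
# A NULL entry in the index list means "take everything along that dimension".
block <- extract_array(seed(tdb), list(1:5, NULL))
dim(block)
```

Each such call crosses from C++ back into R, which is the overhead that the native tatami_tiledb representations avoid.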

Session information

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] beachmat.tiledb_0.99.1 knitr_1.49             BiocStyle_2.35.0      
## 
## loaded via a namespace (and not attached):
##  [1] Matrix_1.7-1          bit_4.5.0.1           jsonlite_1.8.9       
##  [4] crayon_1.5.3          compiler_4.4.2        BiocManager_1.30.25  
##  [7] Rcpp_1.0.13-1         nanoarrow_0.6.0       jquerylib_0.1.4      
## [10] IRanges_2.41.2        yaml_2.3.10           fastmap_1.2.0        
## [13] lattice_0.22-6        XVector_0.47.1        R6_2.5.1             
## [16] RcppCCTZ_0.2.13       S4Arrays_1.7.1        generics_0.1.3       
## [19] tiledb_0.30.2         BiocGenerics_0.53.3   RcppSpdlog_0.0.19    
## [22] DelayedArray_0.33.3   MatrixGenerics_1.19.0 maketools_1.3.1      
## [25] TileDBArray_1.17.0    bslib_0.8.0           rlang_1.1.4          
## [28] cachem_1.1.0          xfun_0.49             sass_0.4.9           
## [31] sys_3.4.3             bit64_4.5.2           SparseArray_1.7.2    
## [34] cli_3.6.3             spdl_0.0.5            digest_0.6.37        
## [37] grid_4.4.2            lifecycle_1.0.4       S4Vectors_0.45.2     
## [40] evaluate_1.0.1        nanotime_0.3.10       buildtools_1.0.0     
## [43] zoo_1.8-12            beachmat_2.23.5       abind_1.4-8          
## [46] stats4_4.4.2          rmarkdown_2.29        matrixStats_1.4.1    
## [49] tools_4.4.2           htmltools_0.5.8.1