A quick overview of the S4Arrays package

Introduction

S4Arrays is an infrastructure package that provides a framework to facilitate the implementation of array-like containers in other Bioconductor packages. Array-like containers are S4 objects that mimic the behavior of ordinary matrices or arrays in R. Please note that the package is not intended to be used directly by the end user.

Installation

Like any other Bioconductor package, S4Arrays should always be installed with BiocManager::install():

if (!require("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("S4Arrays")

However, note that S4Arrays will typically get automatically installed as a dependency of other Bioconductor packages, so explicit installation of the package is usually not needed.

The Array virtual class

At the center of the framework provided by the S4Arrays package is the Array virtual class, whose only purpose is to be extended by other S4 classes that wish to implement a container with array-like semantics. Examples of such classes are:

  • The SparseArray class defined in the SparseArray package.

  • The DelayedArray class defined in the DelayedArray package.

  • The ArrayGrid and ArrayViewport classes defined in the S4Arrays package itself.

Note that Array is a virtual class with no slots:

library(S4Arrays)

showClass("Array")
## Virtual Class "Array" [package "S4Arrays"]
## 
## No Slots, prototype of class "S4"
## 
## Known Subclasses: 
## Class "ArrayViewport", directly
## Class "ArrayGrid", directly
## Class "DummyArrayViewport", by class "ArrayViewport", distance 2
## Class "SafeArrayViewport", by class "ArrayViewport", distance 2
## Class "DummyArrayGrid", by class "ArrayGrid", distance 2
## Class "ArbitraryArrayGrid", by class "ArrayGrid", distance 2
## Class "RegularArrayGrid", by class "ArrayGrid", distance 2
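
As a minimal sketch of what extending Array can look like, the code below defines a class that wraps an ordinary array. The MyArray class and its data slot are hypothetical names made up for this illustration; they are not part of S4Arrays or of any other package.

```r
library(S4Arrays)

## MyArray and its 'data' slot are hypothetical, made up for this example.
setClass("MyArray", contains="Array", slots=c(data="array"))

## Array-like containers are expected to report their dimensions.
setMethod("dim", "MyArray", function(x) dim(x@data))

x <- new("MyArray", data=array(1:24, dim=2:4))
is(x, "Array")  # MyArray objects are Array objects
dim(x)
```

A real extension would typically store its data in a more specialized form (sparse, delayed, on disk) rather than in an ordinary array.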

The extract_array() generic function

The S4Arrays package also introduces the extract_array() S4 generic function that Array subclasses (a.k.a. Array extensions or Array derivatives) are expected to support via specific methods. This allows some basic operations like type(), as.array() or as.matrix() to work out-of-the-box on instances of these subclasses. It also allows them to be used as the seed of a DelayedArray object.

Please refer to the man page of the extract_array() function for more information: ?extract_array
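
The following sketch shows one possible extract_array() method for a toy Array subclass. The MyArray class and its data slot are hypothetical names used for illustration only. The method assumes the contract described in ?extract_array, where index is a list with one subscript per dimension and NULL stands for "everything along that dimension".

```r
library(S4Arrays)

## MyArray and its 'data' slot are hypothetical, made up for this example.
setClass("MyArray", contains="Array", slots=c(data="array"))
setMethod("dim", "MyArray", function(x) dim(x@data))

## 'index' is a list with one subscript per dimension, where NULL
## means "select everything along that dimension".
setMethod("extract_array", "MyArray", function(x, index) {
    subscripts <- lapply(seq_along(index), function(i) {
        if (is.null(index[[i]])) seq_len(dim(x)[[i]]) else index[[i]]
    })
    ## Always return an ordinary array with no dimension dropped.
    do.call(`[`, c(list(x@data), subscripts, list(drop=FALSE)))
})

x <- new("MyArray", data=array(1:24, dim=2:4))
sub <- extract_array(x, list(NULL, 2:3, 1L))  # ordinary 2 x 2 x 1 array
a <- as.array(x)  # relies on the extract_array() method above
type(x)           # also relies on the extract_array() method above
```

With only dim() and extract_array() methods in place, as.array() and type() need no dedicated methods of their own for MyArray.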

Block processing of array-like objects

The S4Arrays package provides a framework that facilitates block processing of array-like objects. Note that block processing is typically used on on-disk objects, that is, on objects where the array data is stored on disk. This framework consists of:

  • ArrayGrid and ArrayViewport objects for representing grids and viewports on array-like objects. See ?ArrayGrid for more information.

  • The read_block() and write_block() functions to read and write array blocks. See ?read_block and ?write_block for more information.

  • The mapToGrid() and mapToRef() functions to map a set of reference array positions to grid positions and vice versa. See ?mapToGrid for more information.
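
The pieces listed above can be combined as in the following sketch, which walks over a small matrix one block at a time. An ordinary in-memory matrix is used here as a stand-in for an on-disk object, and the block size of 2 x 3 is an arbitrary choice made for the example.

```r
library(S4Arrays)

m <- matrix(1:24, nrow=4)  # 4 x 6 in-memory stand-in for an on-disk object

## Cover 'm' with a regular grid of 2 x 3 blocks.
grid <- RegularArrayGrid(dim(m), spacings=c(2L, 3L))
length(grid)  # number of blocks in the grid

## Load one block at a time and reduce it to a single number.
block_sums <- vapply(seq_along(grid), function(b) {
    block <- read_block(m, grid[[b]])  # grid[[b]] is an ArrayViewport
    as.numeric(sum(block))
}, numeric(1))

## The per-block results add up to the result on the full matrix.
sum(block_sums) == sum(m)
```

Only one block needs to be in memory at any given time, which is what makes this approach suitable for objects too big to be loaded as a whole.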

Other functionalities

In addition to the above, the S4Arrays package provides the following functionalities:

  • The is_sparse() generic function for determining whether an array-like object uses a sparse representation or not. See ?is_sparse for more information.

  • Low-level utilities for manipulating array selections. See ?Lindex2Mindex for more information.

  • aperm2(): an extension of base::aperm() that allows dropping and/or adding ineffective dimensions. See ?aperm2 for more information.

  • The abind() generic function for binding multidimensional array-like objects along any dimension. See ?abind for more information.
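
A few of these utilities are sketched below on ordinary arrays. The input values are arbitrary and chosen only for illustration; see the man pages mentioned above for the exact behavior of each function.

```r
library(S4Arrays)

## Ordinary matrices do not use a sparse representation.
is_sparse(matrix(0, nrow=2, ncol=2))

## Convert linear indices into a matrix of array coordinates (one row
## per position) for an array of dimensions 2 x 3 x 4.
Mindex <- Lindex2Mindex(c(1, 5, 24), dim=2:4)
Mindex

## Bind two 3D arrays along their 2nd dimension.
a1 <- array(1:24, dim=c(2, 3, 4))
a2 <- array(101:140, dim=c(2, 5, 4))
a12 <- abind(a1, a2, along=2)
dim(a12)
```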

Session information

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] S4Arrays_1.7.1      IRanges_2.41.1      S4Vectors_0.45.2   
## [4] BiocGenerics_0.53.3 generics_0.1.3      abind_1.4-8        
## [7] Matrix_1.7-1        BiocStyle_2.35.0   
## 
## loaded via a namespace (and not attached):
##  [1] crayon_1.5.3        cli_3.6.3           knitr_1.49         
##  [4] rlang_1.1.4         xfun_0.49           jsonlite_1.8.9     
##  [7] buildtools_1.0.0    htmltools_0.5.8.1   maketools_1.3.1    
## [10] sys_3.4.3           sass_0.4.9          rmarkdown_2.29     
## [13] grid_4.4.2          evaluate_1.0.1      jquerylib_0.1.4    
## [16] fastmap_1.2.0       yaml_2.3.10         lifecycle_1.0.4    
## [19] BiocManager_1.30.25 compiler_4.4.2      lattice_0.22-6     
## [22] digest_0.6.37       R6_2.5.1            bslib_0.8.0        
## [25] tools_4.4.2         cachem_1.1.0