The BiocIO package is intended primarily for developers, who can use its abstract classes and generics to develop their own related classes and methods.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("BiocIO")
The functions import and export load and save objects from and to particular file formats. This package contains the following generics for the import and export methods used throughout the Bioconductor package suite.
## standardGeneric for "import" defined from package "BiocIO"
##
## function (con, format, text, ...)
## standardGeneric("import")
## <bytecode: 0x556830689eb8>
## <environment: 0x55683069c398>
## Methods may be defined for arguments: con, format, text
## Use showMethods(import) for currently available ones.
## standardGeneric for "export" defined from package "BiocIO"
##
## function (object, con, format, ...)
## standardGeneric("export")
## <bytecode: 0x556830777cb8>
## <environment: 0x556830788898>
## Methods may be defined for arguments: object, con, format
## Use showMethods(export) for currently available ones.
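As the printed output suggests, the methods currently registered for each generic can be listed with showMethods(). The following is a minimal sketch, assuming BiocIO is installed and attached; the methods listed will depend on which other packages are loaded in the session.

```r
library(BiocIO)
library(methods)

# Confirm that both names refer to S4 generics.
isGeneric("import")
isGeneric("export")

# List the methods currently available for each generic.
showMethods("import")
showMethods("export")
```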
BiocFile is a base class for high-level file abstractions, where subclasses are associated with a particular file format/type. It wraps a low-level representation of a file, currently either a path/URL or connection.
CompressedFile is a base class that extends the BiocFile class and offers high-level file abstractions for compressed file formats. As with the BiocFile class, it takes either a path/URL or a connection as an argument. This package also includes other File classes that extend CompressedFile: BZ2File, XZFile, GZFile, and BGZFile, which extends the GZFile class.
In previous releases, the rtracklayer package’s RTLFile, RTLList, and CompressedFile classes threw errors when a class that extended them was initialized. The error could be seen with the LoomFile class from LoomExperiment.
file <- tempfile(fileext = ".loom")
LoomFile(file)
### LoomFile object
### resource: file.loom
### Warning messages:
### 1: This class is extending the deprecated RTLFile class from
### rtracklayer. Use BiocFile from BiocIO in place of RTLFile.
### 2: Use BiocIO::resource()
The first warning indicated that the RTLFile class from rtracklayer was deprecated for future releases. The second warning indicated that the resource method from rtracklayer was moved to BiocIO.
To resolve this issue, developers should simply replace the contains="RTLFile" argument in setClass with contains="BiocFile".
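The change can be sketched as follows. The class name MyFormatFile and its constructor are hypothetical and used only for illustration; the sketch assumes BiocIO is installed.

```r
library(BiocIO)
library(methods)

# Before (deprecated): the class extended RTLFile from rtracklayer.
# setClass("MyFormatFile", contains = "RTLFile")

# After: extend BiocFile from BiocIO instead.
# "MyFormatFile" is a hypothetical class name used for illustration.
.MyFormatFile <- setClass("MyFormatFile", contains = "BiocFile")

# Hypothetical constructor that takes a path/URL or a connection.
MyFormatFile <- function(resource) .MyFormatFile(resource = resource)

# The new class now inherits from BiocFile.
extends("MyFormatFile", "BiocFile")
```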
The primary purpose of this package is to provide high-level classes and generics to facilitate file IO within the Bioconductor package suite. The remainder of this vignette will detail how to create File classes that extend the BiocFile class and create methods for these classes. This section will also detail using the filter and select methods from the tidyverse dplyr package to facilitate lazy operations on files.
The CSVFile class defined in this package will be used as an example. The purpose of the CSVFile class is to represent a CSV file so that IO operations can be performed on the file. The following code defines the CSVFile class, which extends the BiocFile class through the contains argument. The CSVFile function serves as a constructor that requires only the argument resource (either a character or a connection).
.CSVFile <- setClass("CSVFile", contains = "BiocFile")
CSVFile <- function(resource) .CSVFile(resource = resource)
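As a quick check of the definition above, the following sketch constructs a CSVFile from a temporary path and reads the wrapped resource back with resource(). It repeats the class definition so that it runs on its own; the temporary path is an arbitrary choice for illustration, and BiocIO is assumed to be installed.

```r
library(BiocIO)
library(methods)

# Class and constructor as defined in the text.
.CSVFile <- setClass("CSVFile", contains = "BiocFile")
CSVFile <- function(resource) .CSVFile(resource = resource)

# Wrap a temporary path (illustrative only; the file need not exist yet).
path <- tempfile(fileext = ".csv")
csv <- CSVFile(path)

# The object is a BiocFile, and resource() returns the wrapped path.
is(csv, "BiocFile")
identical(resource(csv), path)
```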
Next, the import and export methods are defined. These methods are meant to import the data into R in a usable format (a data.frame or another user-friendly R class), then export that R object into a file. For the CSVFile example, the base read.csv() and write.csv() functions are used as the body of our methods.
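# import: read the CSV file behind the CSVFile object into a data.frame
setMethod("import", "CSVFile", function(con, format, text, ...) {
    read.csv(resource(con), ...)
})
# export: write a data.frame to the file behind the CSVFile object
setMethod("export", c("data.frame", "CSVFile"),
    function(object, con, format, ...) {
        write.csv(object, resource(con), ...)
    }
)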
Finally, a demonstration of the CSVFile class and import/export methods in action.
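A minimal round trip, using the class and methods described above, might look like the following sketch. It repeats the definitions so that it runs on its own. The temporary file and the built-in mtcars data set are choices made for illustration, and BiocIO is assumed to be installed.

```r
library(BiocIO)
library(methods)

# Class, constructor, and methods as defined in the text.
.CSVFile <- setClass("CSVFile", contains = "BiocFile")
CSVFile <- function(resource) .CSVFile(resource = resource)

setMethod("import", "CSVFile", function(con, format, text, ...) {
    read.csv(resource(con), ...)
})
setMethod("export", c("data.frame", "CSVFile"),
    function(object, con, format, ...) {
        write.csv(object, resource(con), ...)
    }
)

# Write a data.frame to a temporary CSV file, then read it back.
temp <- tempfile(fileext = ".csv")
csv <- CSVFile(temp)
export(mtcars, csv)
df <- import(csv)
head(df)
```

Because write.csv() writes row names by default, the imported data.frame carries them as an extra first column; passing row.names = FALSE through the dots of export() would omit it.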
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] BiocIO_1.17.1 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.37 R6_2.5.1 fastmap_1.2.0
## [4] xfun_0.49 maketools_1.3.1 cachem_1.1.0
## [7] knitr_1.49 BiocGenerics_0.53.3 htmltools_0.5.8.1
## [10] generics_0.1.3 rmarkdown_2.29 buildtools_1.0.0
## [13] stats4_4.4.2 lifecycle_1.0.4 cli_3.6.3
## [16] sass_0.4.9 jquerylib_0.1.4 compiler_4.4.2
## [19] sys_3.4.3 tools_4.4.2 evaluate_1.0.1
## [22] bslib_0.8.0 yaml_2.3.10 BiocManager_1.30.25
## [25] S4Vectors_0.45.2 jsonlite_1.8.9 rlang_1.1.4