| Title: | Representing GDS files as array-like objects |
|---|---|
| Description: | GDS files are widely used to represent genotyping or sequence data. The GDSArray package implements the `GDSArray` class to represent nodes in GDS files in a matrix-like representation that allows easy manipulation (e.g., subsetting, mathematical transformation) in _R_. The data remains on disk until needed, so that very large files can be processed. |
| Authors: | Qian Liu [aut], Martin Morgan [aut], Hervé Pagès [aut], Xiuwen Zheng [aut, cre] |
| Maintainer: | Xiuwen Zheng <[email protected]> |
| License: | GPL-3 |
| Version: | 1.33.0 |
| Built: | 2026-05-30 08:17:18 UTC |
| Source: | https://github.com/bioc/GDSArray |
Acquire a (possibly cached) gds.class object given its path.
```r
acquireGDS(path, type = NULL, ...)

releaseGDS(path, type = NULL, ...)
```
| Argument | Description |
|---|---|
| `path` | String containing a path to a GDS file. |
| `type` | String containing the GDS file type. Case insensitive. Can be "seqgds" for a GDS file with sequencing data, or "snpgds" for a GDS file with SNP data. This argument was added for the |
| `...` | arguments to be passed to |
acquireGDS caches the gds.class object in the current R session
to avoid repeated initialization, which makes repeated calls
more efficient. The cached gds.class object for a given path
can be deleted by calling releaseGDS on the same path.
acquireGDS returns, by default, a regular gds.class object
identical to the one returned by gdsfmt::openfn.gds(path).
If type is not NULL, it returns either a SeqVarGDSClass object
identical to the one returned by SeqArray::seqOpen(path), or a
SNPGDSFileClass object identical to the one returned by
SNPRelate::snpgdsOpen(path). Both classes inherit from
gds.class but add checks and methods.
For releaseGDS, any existing gds.class object for the
path is disconnected and cleared from the cache, and NULL
is invisibly returned. This is equivalent to calling
gdsfmt::closefn.gds(), except that it takes a path as
input. If path = NULL, all cached connections are removed.
Qian Liu
```r
fn <- gdsExampleFileName()
gdscon <- acquireGDS(fn)
acquireGDS(fn) ## just re-uses the cache
acquireGDS(fn, type = "seqgds") ## construct a new GDS connection
releaseGDS(fn) ## clears the cache
```
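The `path = NULL` behaviour described above can be sketched as follows. This is a minimal illustration that assumes the GDSArray package and its bundled example files are installed. The variable names are chosen only for the example.

```r
library(GDSArray)

# Open (and cache) connections to both bundled example files.
seq_fn <- gdsExampleFileName("seqgds")
snp_fn <- gdsExampleFileName("snpgds")
seq_con <- acquireGDS(seq_fn)
snp_con <- acquireGDS(snp_fn)

# With type = NULL, both connections should be plain gds.class objects.
is_gds <- inherits(seq_con, "gds.class") && inherits(snp_con, "gds.class")

# Per the documentation above, path = NULL clears every cached
# connection and the call returns NULL invisibly.
res <- releaseGDS(NULL)
```

Releasing everything at once can be convenient at the end of a script, when it is no longer clear which paths have been acquired.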
extract_array: extracts data from a GDS file, taking a
GDSArraySeed as input. DelayedArray requires this function
as part of its seed contract.
GDSArray: converts a node in a gds file
into the GDSArray data structure.
GDSArray example data
```r
## S4 method for signature 'GDSArraySeed'
extract_array(x, index)

GDSArray(gdsfile, varname)

gdsExampleFileName(type = c("seqgds", "snpgds"))
```
| Argument | Description |
|---|---|
| `x` | the GDSArraySeed object |
| `index` | An unnamed list of subscripts as positive integer vectors, one vector per dimension in |
| `gdsfile` | Can be a GDSArraySeed, a character string of gds file name, or a "gds.class" R object. |
| `varname` | A character string specifying the gds array node to be read into GDSArray. |
| `type` | the type of gds file, available are "seqgds" for |
GDSArray class object.
```r
fn <- gdsExampleFileName("snpgds")
allnodes <- gdsnodes(fn) ## print all available gds nodes in fn.
allnodes
GDSArray(fn, "genotype")
GDSArray(fn, "sample.annot/pop.group")

fn1 <- gdsExampleFileName("seqgds")
allnodes1 <- gdsnodes(fn1) ## print all available gds nodes in fn1.
allnodes1
## GDSArray(fn1, "genotype/data")
GDSArray(fn1, "variant.id")
GDSArray(fn1, "sample.annotation/family")
GDSArray(fn1, "annotation/format/DP/data")
GDSArray(fn1, "annotation/info/DP")

gdsExampleFileName("snpgds")
gdsExampleFileName("seqgds")
```
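To illustrate the `index` argument of extract_array, the sketch below reads the same small block of the "genotype" node in two ways: by subsetting a GDSArray and realizing it, and by calling extract_array on its seed with an unnamed list of subscripts. It assumes the GDSArray package is installed and that the "genotype" node of the bundled "snpgds" example file is two-dimensional. The variable names are illustrative.

```r
library(GDSArray)  # attaches DelayedArray, which supplies seed() and extract_array()

fn <- gdsExampleFileName("snpgds")
ga <- GDSArray(fn, "genotype")

# Subsetting is delayed: no genotype data is read from disk yet.
sub <- ga[1:5, 1:3]

# Realize the small subset as an ordinary in-memory array.
block <- as.array(sub)

# The same block requested directly from the seed. `index` is an
# unnamed list with one positive integer vector per dimension.
block2 <- extract_array(seed(ga), list(1:5, 1:3))

dim(block)
dim(block2)
```

In everyday use the subsetting form is usually enough. Calling extract_array directly is mainly useful for understanding or testing the seed contract.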
GDSFile: GDSFile is a light-weight class
that represents a GDS file. Its '$' method completes
any available gds node name. If the 'current_path' slot
of the 'GDSFile' object represents a valid gds node, '$' returns
the 'GDSArray' of that node directly. Otherwise, it returns
the 'GDSFile' object with an updated 'current_path'.
GDSFile: the GDSFile class constructor.
gdsfile: filename slot getter for
GDSFile object.
gdsfile<-: filename slot setter for
GDSFile object.
gdsnodes: to get the available gds nodes from a
gds file name or a GDSFile object.
```r
GDSFile(file, current_path = "")

## S4 method for signature 'GDSFile'
gdsfile(object)

gdsfile(object) <- value

## S4 method for signature 'GDSFile'
x$name

## S4 method for signature 'ANY'
gdsnodes(x, node)
```
| Argument | Description |
|---|---|
| `file` | the GDS file path. |
| `current_path` | the current path to the closest gds node. |
| `object` | |
| `value` | the new gds file path |
| `x` | a character string for the GDS file name or a |
| `name` | the name of gds node |
| `node` | the node name of a gds file or |
gdsfile: the file path of the corresponding
GDSFile object.
$: a GDSFile with an updated @current_path, or a
GDSArray object if the current_path is a valid
gds node.
gdsnodes: a character vector of all available gds
nodes within the related GDS file and the specified node.
```r
fn <- gdsExampleFileName("seqgds")
gf <- GDSFile(fn)
gdsfile(gf)

fn <- gdsExampleFileName("seqgds")
gdsnodes(fn)
gdsnodes(fn, "annotation/info")
fn1 <- gdsExampleFileName("snpgds")
gdsnodes(fn1)
gdsnodes(fn1, "sample.annot")
gf <- GDSFile(fn)
gdsnodes(gf)
gdsnodes(gf, "genotype")
gdsfile(gf)
```
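The two possible results of the '$' method can be sketched as below. The example assumes the GDSArray package is installed and relies on the "sample.annotation/family" node of the bundled "seqgds" example file, which the GDSArray examples above also read. The variable names are illustrative.

```r
library(GDSArray)

fn <- gdsExampleFileName("seqgds")
gf <- GDSFile(fn)

# "sample.annotation" is a folder rather than an array node, so `$`
# should return a GDSFile whose current_path has been updated.
folder <- gf$sample.annotation
is(folder, "GDSFile")

# "sample.annotation/family" is an array node, so chaining `$`
# should return the GDSArray for that node.
fam <- gf$sample.annotation$family
is(fam, "GDSArray")
```

Chaining '$' in this way allows tab completion to walk the node hierarchy interactively instead of typing full node paths.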
dim, dimnames: dimension and dimnames of
object contained in the GDS file.
seed: the GDSArraySeed getter for
GDSArray object.
seed<-: the GDSArraySeed setter for
GDSArray object.
gdsfile: on-disk location of GDS file
represented by this object.
```r
## S4 method for signature 'GDSArray'
seed(x)

## S4 replacement method for signature 'GDSArray'
seed(x) <- value

gdsfile(object)

## S4 method for signature 'GDSArraySeed'
gdsfile(object)

## S4 method for signature 'GDSArray'
gdsfile(object)

## S4 method for signature 'DelayedArray'
gdsfile(object)
```
| Argument | Description |
|---|---|
| `x` | the |
| `value` | the new |
| `object` | GDSArray, GDSMatrix, GDSArraySeed, GDSFile or SummarizedExperiment object. |
dim: the integer vector of dimensions for
GDSArray or GDSArraySeed objects.
dimnames: the unnamed list of dimension names for
GDSArray and GDSArraySeed objects.
seed: the GDSArraySeed of GDSArray
object.
gdsfile: the character string for the gds file path.
```r
fn <- gdsExampleFileName("snpgds")
ga <- GDSArray(fn, "sample.annot/pop.group")
dim(ga)
dimnames(ga)
type(ga)
seed(ga)
dim(seed(ga))
gdsfile(ga)
```