Title: | Access HDF5 content from HDF Scalable Data Service |
---|---|
Description: | This package provides functionality for reading data from HDF Scalable Data Service from within R. The HSDSArray function bridges from HSDS to the user via the DelayedArray interface. Bioconductor manages an open HSDS instance graciously provided by John Readey of the HDF Group. |
Authors: | Samuela Pollack [aut], Shweta Gopaulakrishnan [aut], BJ Stubbs [aut], Alexey Sergushichev [aut], Vincent Carey [cre, aut] |
Maintainer: | Vincent Carey <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.29.0 |
Built: | 2024-12-15 05:01:53 UTC |
Source: | https://github.com/bioc/rhdf5client |
bracket method for 1d request from HSDSDataset
## S4 method for signature 'HSDSDataset,numeric,ANY,ANY'
x[i, j, ..., drop = TRUE]
x |
object of type HSDSDataset |
i |
vector of indices (first dimension) |
j |
not used |
... |
not used |
drop |
logical(1); if TRUE, the returned value has no array character |
an array with the elements requested from the HSDSDataset
bracket method for 2d request from HSDSDataset
## S4 method for signature 'HSDSDataset,numeric,numeric,ANY'
x[i, j, ..., drop = TRUE]
x |
object of type HSDSDataset |
i |
vector of indices (first dimension) |
j |
vector of indices (second dimension) |
... |
not used |
drop |
logical(1); if TRUE, the returned value has no array character |
an array with the elements requested from the HSDSDataset
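The `drop` argument of both bracket methods can be illustrated with base R subsetting. The following is a minimal sketch on an ordinary in-memory matrix (not an HSDSDataset; the values are made up), assuming the methods follow the usual base R meaning of `drop`:

```r
# Ordinary in-memory matrix standing in for fetched data (illustrative values only).
m <- matrix(1:12, nrow = 3)

kept    <- m[1:2, 1, drop = FALSE]  # stays a 2 x 1 matrix
dropped <- m[1:2, 1, drop = TRUE]   # collapses to a plain vector

dim(kept)     # 2 1
dim(dropped)  # NULL
```

With `drop = FALSE` the result keeps its `dim` attribute, which is what "array character" refers to.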
Coercion method from HSDSMatrix to its superclass HSDSArray
Other HSDSArray: HSDSArray, HSDSMatrix
a test request
check_hsds()
logical, TRUE if hsds behaving as expected
check_hsds()
(required by DelayedArray seed contract) HDF server content is assumed transposed relative to R matrix layout. This anticipates H5 datasets on the server with rows for experimental samples and columns for *-omic features. The Bioconductor SummarizedExperiment requires *-omic features in rows and samples in columns.
## S4 method for signature 'HSDSArraySeed'
dim(x)
x |
An object of type HSDSArraySeed |
A numeric vector of the dimensions
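The transposition described above can be sketched in base R. This is an illustration only, not the package's implementation; the shape `c(4, 6)` and the values are hypothetical, and the server-side values are assumed to arrive in row-major order:

```r
# Hypothetical shape reported by the server: 4 samples (rows) x 6 features (columns).
server_shape <- c(4L, 6L)

# Dimensions as seen from R under the transposition assumption: features x samples.
r_dim <- rev(server_shape)

# Server-side values in row-major order, reshaped and then transposed into the R layout.
server_values <- seq_len(prod(server_shape))
server_matrix <- matrix(server_values, nrow = server_shape[1], byrow = TRUE)
r_matrix <- t(server_matrix)

identical(dim(r_matrix), r_dim)  # TRUE
```

The result has features in rows and samples in columns, the orientation that SummarizedExperiment expects.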
(required by DelayedArray seed contract, returns NULL list)
## S4 method for signature 'HSDSArraySeed'
dimnames(x)
x |
An object of type HSDSArraySeed |
A NULL list of length equal to the array dimensionality
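A "NULL list" here presumably means a list with one NULL entry per dimension. A small base R sketch of such a value, using hypothetical dimensions:

```r
# Hypothetical dimensions of a two-dimensional seed.
seed_dim <- c(6L, 4L)

# One NULL entry per dimension, i.e. no row names and no column names.
null_dimnames <- vector("list", length(seed_dim))

length(null_dimnames)        # 2
is.null(null_dimnames[[1]])  # TRUE
```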
Access dataset backed by an HSDSArraySeed
## S4 method for signature 'HSDSArraySeed'
extract_array(x, index)
x |
An object of type HSDSArraySeed |
index |
A list of numeric vectors to be accessed, one vector for each dimension of the array object. A NULL vector indicates the entire range of indices in that dimension. A zero-length vector indicates no indices in the relevant dimension. (Accordingly, any zero-length vector of indices will result in an empty array being returned.) |
An array containing the data elements corresponding to the indices requested
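The meaning of NULL and zero-length entries in `index` can be sketched on an ordinary in-memory array. The helper `extract_sketch` below is hypothetical and only mimics the index semantics described above; it performs no server access:

```r
# In-memory array standing in for remote data (illustrative values only).
arr <- array(seq_len(24), dim = c(4, 6))

# Hypothetical helper: apply an extract_array-style index list to an ordinary array.
extract_sketch <- function(x, index) {
  stopifnot(length(index) == length(dim(x)))
  # Replace each NULL entry with the full range of that dimension.
  full <- Map(function(i, n) if (is.null(i)) seq_len(n) else i, index, dim(x))
  do.call(`[`, c(list(x), full, list(drop = FALSE)))
}

whole_rows <- extract_sketch(arr, list(c(1, 3), NULL))     # rows 1 and 3, all columns
empty      <- extract_sketch(arr, list(integer(0), NULL))  # no rows selected

dim(whole_rows)  # 2 6
dim(empty)       # 0 6
```

A zero-length vector in any dimension yields an array with extent 0 in that dimension, hence an empty result.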
compound operation
extractCompoundJSON(type, value)
type |
type |
value |
value |
Fetch data from a remote dataset
getData(dataset, indices, transfermode)
## S4 method for signature 'HSDSDataset,character,character'
getData(dataset, indices, transfermode)
## S4 method for signature 'HSDSDataset,character,missing'
getData(dataset, indices)
## S4 method for signature 'HSDSDataset,list,character'
getData(dataset, indices, transfermode)
## S4 method for signature 'HSDSDataset,list,missing'
getData(dataset, indices)
dataset |
An object of type HSDSDataset, the dataset to access. |
indices |
The indices of the data to fetch |
transfermode |
Either 'JSON' or 'binary' (default) |
The servers require data to be fetched in slices, i.e., in sets of elements for which the indices in each dimension are of the form start:stop:step. More complex sets of indices are split into slices and fetched in multiple requests. The splitting is hidden from the user, but it can matter when considering data access patterns, e.g., for performance tuning.
an Array containing the data fetched from the server
if (check_hsds()) {
  s <- HSDSSource(URL_hsds())
  f <- HSDSFile(s, '/shared/bioconductor/patelGBMSC.h5')
  d <- HSDSDataset(f, '/assay001')
  x <- getData(d, c('1:4', '1:27998'), transfermode='JSON')
  xb <- getData(d, c('1:4', '1:27998'), transfermode='binary')
  # x <- getData(d, c(1:4, 1:27998), transfermode='JSON') # method missing?
  x
  xb
}
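To see why irregular index sets cost more than one request, it may help to sketch how an index vector can be broken into runs of constant stride, each of which fits a single start:stop:step slice. The helper `split_into_slices` is hypothetical and greedy; it is not the package's own routine, which may group indices differently:

```r
# Hypothetical helper: split an index vector into runs with constant stride,
# so that each run can be described by a single start:stop:step slice.
split_into_slices <- function(idx) {
  runs <- list()
  start <- 1L
  n <- length(idx)
  while (start <= n) {
    end <- start
    if (start < n) {
      step <- idx[start + 1L] - idx[start]
      end <- start + 1L
      # Extend the run while the stride stays the same.
      while (end < n && idx[end + 1L] - idx[end] == step) end <- end + 1L
    }
    runs[[length(runs) + 1L]] <- idx[start:end]
    start <- end + 1L
  }
  runs
}

slices <- split_into_slices(c(1:10, seq(25, 50, 2), seq(200, 150, -2)))
length(slices)  # 3
```

Under this sketch, the example vector needs three slices, so it would translate into three requests rather than one.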
A DelayedArray backend for accessing a remote HDF5 server.
Construct an object of type HSDSArray directly from the data members of its seed
HSDSArray(endpoint, svrtype, domain, dsetname)
endpoint |
URL of remote server |
svrtype |
type of server, must be either 'hsds' or 'h5serv' |
domain |
HDF5 domain of H5 file on server |
dsetname |
complete internal path to dataset in H5 file |
An initialized object of type HSDSArray
Other HSDSArray: HSDSMatrix, as()
if (check_hsds()) {
  HSDSArray(URL_hsds(), "hsds", "/shared/bioconductor/darmgcls.h5", "/assay001")
}
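Because HSDSArray is a DelayedArray backend, subsetting should stay delayed until the result is realized. The sketch below assumes that the package is installed, that the public server is reachable, and that the dataset is two-dimensional with at least 3 rows and 5 columns; none of these is guaranteed, so the whole block is guarded:

```r
# Run only if rhdf5client is installed and the server responds as expected.
ok <- requireNamespace("rhdf5client", quietly = TRUE) &&
  isTRUE(tryCatch(rhdf5client::check_hsds(), error = function(e) FALSE))

if (ok) {
  library(rhdf5client)
  arr <- HSDSArray(URL_hsds(), "hsds",
                   "/shared/bioconductor/darmgcls.h5", "/assay001")
  dim(arr)                # dimensions reported by the seed
  block <- arr[1:3, 1:5]  # delayed subset (assumes at least 3 x 5 elements)
  as.matrix(block)        # realization: the values are fetched here
}
```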
HSDSArraySeed for HSDSArray backend to DelayedArray
Construct an object of type HSDSArraySeed
HSDSArraySeed(endpoint, svrtype, domain, dsetname)
endpoint |
URL of remote server |
svrtype |
type of server, must be either 'hsds' or 'h5serv' |
domain |
HDF5 domain of H5 file on server |
dsetname |
complete internal path to dataset in H5 file |
An initialized object of type HSDSArraySeed
endpoint
URL of remote server
svrtype
type of server, must be either 'hsds' or 'h5serv'
domain
HDF5 domain of H5 file on server
dsetname
complete internal path to dataset in H5 file
dataset
object of type HSDSDataset for access to the H5 dataset
Construct an object of type HSDSDataset. An HSDSDataset is a representation of a dataset in an HDF5 file.
HSDSDataset(file, path)
file |
An object of type HSDSFile which hosts the dataset |
path |
The complete intrafile path to the dataset |
An initialized object of type HSDSDataset
if (check_hsds()) {
  src <- HSDSSource(URL_hsds())
  f <- HSDSFile(src, '/shared/bioconductor/patelGBMSC.h5')
  d <- HSDSDataset(f, '/assay001')
}
An S4 class to represent a dataset in an HDF5 file.
file
An object of type HSDSFile; the file in which the dataset is resident.
path
The dataset's path in the internal HDF5 hierarchy.
uuid
The unique identifier (UUID) by which the dataset is accessed in the server database system.
shape
The dimensions of the dataset
type
The dataset's HDF5 datatype
An HSDSFile is a representation of an HDF5 file whose contents are exposed by an HDF5 server.
HSDSFile(src, domain)
src |
an object of type HSDSSource, the server which exposes the file |
domain |
the domain string; the file's location on the server's file system. |
This function is deprecated and will be defunct in the next release.
an initialized object of type HSDSFile
if (check_hsds()) {
  src <- HSDSSource(URL_hsds())
  f10x <- HSDSFile(src, '/shared/bioconductor/patelGBMSC.h5')
}
An S4 class to represent an HDF5 file accessible from a server.
HSDSSource
an object of type HSDSSource
domain
the file's domain on the server; more or less, an alias for its location in the external server file system
dsetdf
a data.frame that caches often-used information about the file
DelayedMatrix subclass for a two-dimensional HSDSArray
Other HSDSArray: HSDSArray, as()
An HSDSSource is a representation of a URL which provides access to an HDF5 server (either h5serv or hsds).
HSDSSource(endpoint, type = "hsds")
endpoint |
URL for server |
type |
Type of server software at the source; must be either 'h5serv' or (default) 'hsds' |
This function is deprecated and will be defunct in the next release.
An object of type HSDSSource
if (check_hsds()) {
  src.hsds <- HSDSSource(URL_hsds())
}
This class is deprecated and will be defunct in the next release.
endpoint
URL for server
type
Type of server software at the source; must be either 'h5serv' or (default) 'hsds'
isplit converts a numeric vector into a list of sequences for compact reexpression
isplit(x)
sproc(spl)
x |
a numeric vector (should be integers) |
spl |
output of isplit |
list of vectors of integers which can be expressed as initial/final/stride triplets
list of colon-delimited strings each with initial/final/stride triplet
inds = c(1:10, seq(25,50,2), seq(200,150,-2))
sproc(isplit(inds))
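The idea of re-expressing a run of indices as an initial/final/stride triplet can be sketched in base R. The formatter `as_triplet` is hypothetical; the exact conventions used by `sproc` (for example, index base or whether the final value is inclusive) are not assumed here:

```r
# Runs with constant stride, written out by hand for illustration.
runs <- list(1:10, seq(25, 49, 2), seq(200, 150, -2))

# Hypothetical formatter: one "initial:final:stride" string per run.
as_triplet <- function(run) {
  stride <- if (length(run) > 1L) run[2] - run[1] else 1
  paste(run[1], run[length(run)], stride, sep = ":")
}

triplets <- vapply(runs, as_triplet, character(1))
triplets  # "1:10:1" "25:49:2" "200:150:-2"
```

Each string describes a whole run compactly, which is what makes long regular index vectors cheap to transmit.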
The datasets in an HDF5 file are organized internally by groups. This routine traverses the internal group hierarchy, locates all datasets and prints a list of them. Note that if the file's group hierarchy is complex, this could be time-consuming.
listDatasets(file)
file |
an object of type HSDSFile to be searched |
This function is deprecated and will be defunct in the next release.
a list of inner-paths
if (check_hsds()) {
  src <- HSDSSource(URL_hsds())
  f <- HSDSFile(src, '/shared/bioconductor/patelGBMSC.h5')
  listDatasets(f)
}
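The traversal of a group hierarchy can be sketched on an in-memory stand-in. The nested list `tree` and the helper `collect_datasets` are hypothetical and involve no server access; the real routine presumably has to query the server as it descends, which would explain why complex hierarchies are slow:

```r
# Hypothetical in-memory stand-in for an HDF5 group tree:
# a nested list is a group, the string "dataset" marks a dataset.
tree <- list(
  assay001 = "dataset",
  meta = list(
    samples  = "dataset",
    features = list(ids = "dataset")
  )
)

# Depth-first traversal that collects the full path of every dataset.
collect_datasets <- function(node, prefix = "") {
  paths <- character(0)
  for (name in names(node)) {
    path <- paste0(prefix, "/", name)
    child <- node[[name]]
    if (is.list(child)) {
      paths <- c(paths, collect_datasets(child, path))  # descend into subgroup
    } else {
      paths <- c(paths, path)                           # leaf: a dataset
    }
  }
  paths
}

collect_datasets(tree)  # "/assay001" "/meta/samples" "/meta/features/ids"
```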
The user must supply the domain in which to start. The search is non-recursive: for example, output for domain '/home/jreadey/' will not include the files in '/home/jreadey/HDFLabTutorial/'.
listDomains(object, rootdir)
## S4 method for signature 'HSDSSource,character'
listDomains(object, rootdir)
## S4 method for signature 'HSDSSource,missing'
listDomains(object)
object |
An object of type HSDSSource |
rootdir |
A slash-separated directory in the HSDSSource file system. |
This function is deprecated and will be defunct in the next release.
a vector of domains in the rootdir
src.hsds <- HSDSSource(URL_hsds())
listDomains(src.hsds, '/shared')
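What "non-recursive" means can be sketched with plain strings. The domain names below are invented for illustration (apart from the two directories named in the description), and the helper `direct_children` is hypothetical; whether `listDomains` reports subdirectories in exactly this form is an assumption:

```r
# Hypothetical set of domains known to a server.
all_domains <- c(
  "/home/jreadey/example.h5",
  "/home/jreadey/HDFLabTutorial/",
  "/home/jreadey/HDFLabTutorial/lesson1.h5"
)

# Keep only direct children of rootdir: after the root, a path may contain
# at most a trailing slash, never a slash followed by further characters.
direct_children <- function(domains, rootdir) {
  rest <- substring(domains[startsWith(domains, rootdir)], nchar(rootdir) + 1L)
  keep <- nzchar(rest) & !grepl("/.", rest)
  paste0(rootdir, rest[keep])
}

direct_children(all_domains, "/home/jreadey/")
# "/home/jreadey/example.h5" "/home/jreadey/HDFLabTutorial/"
```

To see the contents of the subdirectory, the listing would have to be repeated with that subdirectory as the starting domain.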
The rhdf5client package provides read-only access to HDF5 files maintained on a server. The HDF Group provides two servers, an obsolescent one called 'h5serv' and the newer prototype called 'hsds'.
These functions are provided for compatibility with older versions of ‘rhdf5client’ only, and will be defunct at the next release.
The following functions are deprecated and will be made defunct in the next release:
URL_h5serv
URL_hsds
dsmeta
getReq
groups
setPath
links
transfermode
dataset
internalDim
hsdsInfo
domains
getDatasetUUIDs
getDatasetAttrs
getDims
getHRDF
H5S_dataset2
getDatasetSlice
fetchDatasets
isplit
sproc
listDomains
listDatasets
getData
The following classes are deprecated and will be made defunct in the next release:
H5S_source
H5S_dataset
H5S_Array
H5S_Matrix
HSDSSource
HSDSFile
HSDSDataset
manage hsds URL
URL_hsds()
URL of hsds server
URL_hsds()