Title: | R interface to EBI HoloFood resource |
---|---|
Description: | Utility package to facilitate integration and analysis of EBI HoloFood data in R. This package streamlines access to the resource, allowing for direct loading of data into formats optimized for downstream analytics. |
Authors: | Tuomas Borman [aut, cre] , Leo Lahti [aut] |
Maintainer: | Tuomas Borman <[email protected]> |
License: | Artistic-2.0 \| file LICENSE |
Version: | 1.1.0 |
Built: | 2024-11-01 03:35:31 UTC |
Source: | https://github.com/bioc/HoloFoodR |
Add results from MGnifyR to HoloFoodR results
    addMGnify(x, y, ...)

    ## S4 method for signature 'SummarizedExperiment,MultiAssayExperiment'
    addMGnify(
      x,
      y,
      exp.name1 = "metagenomic",
      exp.name2 = "metagenomic_amplicon",
      replace = TRUE,
      ...
    )

    ## S4 method for signature 'SummarizedExperiment,SummarizedExperiment'
    addMGnify(x, y, ...)

    ## S4 method for signature 'SummarizedExperiment,ANY'
    addMGnify(x, y, id.col1 = "sample_biosample", id.col2 = "accession", ...)
x |
|
y |
|
... |
optional arguments not used currently. |
exp.name1 |
|
exp.name2 |
|
replace |
|
id.col1 |
|
id.col2 |
|
Metagenomic data is stored in MGnify rather than in the HoloFood database, and
the two databases use different sample identifiers. However, MGnify's sample
metadata references the identifiers used in the HoloFood database, so sample
IDs can be converted to align with HoloFood data. HoloFood, in turn, contains
additional metadata that is not available in MGnify. Moreover, integrating
data into a MultiAssayExperiment while maintaining accurate sample and system
matches can be challenging.

This function simplifies these tasks by integrating MGnify data with HoloFood
data after retrieval from the databases. You need only supply the data
returned by the MGnifyR::getResult() and HoloFoodR::getResult() functions.
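The identifier alignment described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the two data frames are toy stand-ins with made-up values, and the assumption is that the column named by id.col1 ("sample_biosample") lives in the MGnify sample metadata while the column named by id.col2 ("accession") lives in the HoloFood sample metadata.

```r
# Toy stand-in for MGnify sample metadata (hypothetical values).
mgnify_samples <- data.frame(
  mgnify_sample = c("MG_1", "MG_2", "MG_3"),
  sample_biosample = c("SAMEA0002", "SAMEA0003", "SAMEA9999"),
  stringsAsFactors = FALSE
)

# Toy stand-in for HoloFood sample metadata (hypothetical values).
holofood_samples <- data.frame(
  accession = c("SAMEA0001", "SAMEA0002", "SAMEA0003"),
  animal = c("animal_A", "animal_A", "animal_B"),
  stringsAsFactors = FALSE
)

# For every MGnify sample, find the row of the HoloFood sample that shares
# its identifier. Samples without a counterpart get NA.
idx <- match(
  mgnify_samples[["sample_biosample"]],
  holofood_samples[["accession"]]
)

# Carry HoloFood-only metadata over to the MGnify samples.
mgnify_samples[["animal"]] <- holofood_samples[["animal"]][idx]

# Keep only the samples that could be aligned.
matched <- mgnify_samples[!is.na(idx), , drop = FALSE]
print(matched)
```

In this toy data the third MGnify sample has no HoloFood counterpart and is dropped, which is the kind of bookkeeping addMGnify is meant to spare the user.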
MultiAssayExperiment
    ## Not run:
    # Get data from HoloFood database
    mae <- HoloFoodR::getResult(
        salmon_sample_ids,
        use.cache = TRUE
    )

    # Get data from MGnify database
    mg <- MgnifyClient(
        useCache = TRUE,
        cacheDir = ".MGnifyR_cache"
    )
    tse <- MGnifyR::getResult(
        mg,
        accession = mgnify_analyses_ids,
        get.func = FALSE
    )

    # Add MGnify data to HoloFood data
    mae <- addMGnify(tse, mae)

    ## End(Not run)
Search HoloFood database for animals, genome catalogues, samples, or viral catalogues
doQuery(type, flatten = TRUE, ...)
type |
|
flatten |
|
... |
optional arguments:
|
doQuery is a flexible query function that can be used to search the
available animals, genome catalogues, samples, or viral catalogues. Search
results can be filtered; for example, animals can be filtered based on
available samples. See the [API browser](https://www.holofooddata.org/api/docs)
for information on filters. For help on customizing queries, see
[here](https://emg-docs.readthedocs.io/en/latest/api.html#customising-queries).
data.frame
    # Find animals. The maximum number of results is 100. Use a filter
    # so that only chicken is searched.
    res <- doQuery("animals", max.hits = 100, system = "chicken")
    head(res)
Get data from HoloFood database
    getData(
      type = NULL,
      accession.type = NULL,
      accession = NULL,
      flatten = FALSE,
      ...
    )
type |
|
accession.type |
|
accession |
|
flatten |
|
... |
optional arguments:
|
With getData, you can fetch data from the database. Compared to getResult,
this function is more flexible because it can fetch any kind of data from the
database. However, it returns the data without further wrangling, as a list
or data.frame, neither of which is an optimized format for working with data
on samples.
Search results can be filtered; for example, animals can be filtered based on available samples. See the [API browser](https://www.holofooddata.org/api/docs) for information on filters. For help on customizing queries, see [here](https://emg-docs.readthedocs.io/en/latest/api.html#customising-queries).
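The difference between a nested list and a flat data.frame can be sketched in base R. This is only an illustration of the general idea, not the code getData runs: the records, the helper flatten_record, and the assumption that flattening joins nested field names with a dot are all hypothetical.

```r
# Two toy records, nested the way a JSON API response might be parsed
# (hypothetical values).
records <- list(
  list(
    accession = "genome_1",
    metadata = list(Sample_accession = "sample_A", Completeness = 98.5)
  ),
  list(
    accession = "genome_2",
    metadata = list(Sample_accession = "sample_B", Completeness = 91.2)
  )
)

# Hypothetical helper: collapse one nested record into a one-row data.frame.
# unlist() joins nested names with ".", e.g. "metadata.Completeness".
flatten_record <- function(rec) {
  flat <- unlist(rec)
  as.data.frame(as.list(flat), stringsAsFactors = FALSE)
}

# Stack the one-row data.frames into a single table.
flat_df <- do.call(rbind, lapply(records, flatten_record))

# unlist() coerced every value to character, so restore numeric columns.
flat_df[["metadata.Completeness"]] <-
  as.numeric(flat_df[["metadata.Completeness"]])

print(flat_df)
```

A flat table like this can be subset by column name, whereas the nested form keeps each record's structure intact at the cost of needing list indexing.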
list
or data.frame
    # Find genome catalogues
    catalogues <- getData(type = "genome-catalogues")
    head(catalogues)

    # Find genomes based on a certain genome catalogue ID
    res <- getData(
        type = "genomes", accession.type = "genome-catalogues",
        accession = catalogues[1, "id"], max.hits = 100)

    # See the data.
    head(res)

    # It includes, for instance, a summary of the CAZy
    # (Carbohydrate-Active enZymes) annotations as counts per category
    cazy <- res[ , grepl("annotations.cazy", colnames(res)), drop = FALSE]
    head(cazy)

    # Moreover, it includes a sample list. This sample list represents a
    # collection of samples where the MAG was identified. The data also
    # includes the completeness of the MAG in a sample.
    head(res[ c("metadata.Sample_accession", "metadata.Completeness")])
Get metabolomic data from MetaboLights database
    getMetaboLights(study.id, ...)

    getMetaboLightsFile(study.id, file, ...)
study.id |
|
... |
optional arguments:
|
file |
|
The HoloFood database primarily comprises targeted metabolomic data and omits
non-targeted metabolomic data. It does, however, include URLs that link to
studies in the MetaboLights database, and these functions use those links to
give access to the non-targeted metabolomic data. The getMetaboLights
function returns a structured list that contains processed data in data.frame
format for study metadata, assay metadata, and assay.

The metadata includes the file names of spectra data. Those files can be
loaded with getMetaboLightsFile. Alternatively, once you have identified the
study and the files to fetch, you can refer to this
[vignette](https://rformassspectrometry.github.io/MsIO/articles/MsIO.html#loading-data-from-metabolights)
for instructions on loading the data directly into an MsExperiment object,
which is specifically designed for metabolomics spectra data.
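Before fetching spectra files, it can help to reduce the file names in the assay metadata to a unique, non-empty set. The following base R sketch assumes an assay metadata table with a "Raw Spectral Data File" column, as used in the example for this function; the table itself and its values are made up for illustration.

```r
# Toy stand-in for the assay metadata returned by getMetaboLights()
# (hypothetical values; check.names = FALSE keeps spaces in column names).
assay_meta <- data.frame(
  "Sample Name" = c("s1", "s2", "s3", "s4"),
  "Raw Spectral Data File" = c(
    "FILES/a.mzML", "FILES/b.mzML", "FILES/a.mzML", ""
  ),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Collect the file names, dropping duplicates, missing values, and
# empty strings so that each file is requested only once.
files <- unique(assay_meta[["Raw Spectral Data File"]])
files <- files[!is.na(files) & nzchar(files)]

print(files)
```

The resulting character vector is the kind of input that could then be passed as the file argument of getMetaboLightsFile.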
list
    # This example is not run, because the server sometimes fails to respond.
    if( FALSE ){
        res <- getMetaboLights("MTBLS4381")
        file_paths <- getMetaboLightsFile(
            study.id = "MTBLS4381",
            file = res[["assay_meta"]][["Raw Spectral Data File"]]
        )
    }
Get data on samples from HoloFood database
getResult(accession, ...)
accession |
|
... |
optional arguments:
|
With getResult, you can fetch data on samples from the HoloFood database.
Compared to getData, this function is more convenient for fetching sample
data because it converts the data to a MultiAssayExperiment in which the
different omics are stored as TreeSummarizedExperiment objects, which are
optimized for downstream analytics. The columns of the returned
MultiAssayExperiment are individual animals. These columns are linked with
the individual samples that are stored in the TreeSummarizedExperiment
objects.
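The link between animals and samples can be pictured as a mapping table. In a MultiAssayExperiment this role is played by the sample map, which has the columns assay, primary, and colname. The sketch below builds such a table in plain base R with made-up experiment names, animals, and samples, so it illustrates the structure rather than the object getResult returns.

```r
# Toy sample map (hypothetical values). "primary" identifies the animal,
# "colname" the sample stored in the experiment named by "assay".
sample_map <- data.frame(
  assay = c("metagenomic", "metagenomic", "metabolomic"),
  primary = c("animal_A", "animal_B", "animal_A"),
  colname = c("sample_1", "sample_2", "sample_3"),
  stringsAsFactors = FALSE
)

# List the samples that belong to each animal, across all experiments.
samples_by_animal <- split(sample_map[["colname"]], sample_map[["primary"]])
print(samples_by_animal)

# Count how many samples each animal has in each experiment.
counts <- table(sample_map[["primary"]], sample_map[["assay"]])
print(counts)
```

Here animal_A has one sample in each of the two experiments, while animal_B has a metagenomic sample only, which shows why one animal can map to several samples.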
The HoloFood database lacks non-targeted metabolomic data, but such data can
be fetched from the MetaboLights resource. Certain datasets include processed
features. Those datasets can be retrieved with the function getResult, which
integrates metabolomic data with other datasets from HoloFood.
Furthermore, while the HoloFood database does not include metagenomic
assembly data, users can access such data from the MGnify database. The
MGnifyR package provides a convenient interface for accessing this database.
With MGnifyR::getResult(), users can obtain data formatted as a
MultiAssayExperiment object that contains multiple TreeSummarizedExperiment
objects. Consequently, data from the HoloFood and MGnify databases are
compatible for downstream analysis.
MultiAssayExperiment
getData
TreeSummarizedExperiment
MultiAssayExperiment
MGnifyR::getResult
    # Find samples on certain animal
    samples <- doQuery("samples", animal_accession = "SAMEA112904746")

    # Get the data
    mae <- getResult(samples[["accession"]])
    mae
HoloFoodR package
HoloFoodR implements an interface to the EBI HoloFood database.
See the vignette for a general introduction to this package,
[about HoloFood](https://www.holofood.eu/) for general HoloFood
information, and
[API documentation](https://docs.holofooddata.org/api.html) for
details on the JSONAPI implementation.
Maintainer: Tuomas Borman [email protected] (ORCID)
Authors:
Leo Lahti [email protected] (ORCID)
TreeSummarizedExperiment MultiAssayExperiment