Title: | R-side access to published microbial signatures from BugSigDB |
---|---|
Description: | The bugsigdbr package implements convenient access to bugsigdb.org from within R/Bioconductor. The goal of the package is to facilitate import of BugSigDB data into R/Bioconductor, provide utilities for extracting microbe signatures, and enable export of the extracted signatures to plain text files in standard file formats such as GMT. |
Authors: | Ludwig Geistlinger [aut, cre], Jennifer Wokaty [aut], Levi Waldron [aut] |
Maintainer: | Ludwig Geistlinger <[email protected]> |
License: | GPL-3 |
Version: | 1.13.0 |
Built: | 2024-10-30 04:27:46 UTC |
Source: | https://github.com/bioc/bugsigdbr |
Functionality for programmatically displaying microbe signatures on BugSigDB signature pages.
browseSignature(sname)
sname |
character. Signature name. Expected to start with a prefix
of the form |
The URL of the selected BugSigDB signature page. If interactive, opens the URL in the default web browser.
BugSigDB: https://bugsigdb.org
sname <- "bsdb:215/1/1_eczema:infant-with-eczema_vs_healthy-control_UP"
browseSignature(sname)
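The signature name in the example above carries a numeric prefix. As a minimal sketch, the prefix could be split into its parts with base R. The helper `parseSignatureName` is hypothetical and not part of bugsigdbr, and reading the three numbers as study, experiment, and signature IDs is an assumption based on the example name.

```r
# Hypothetical helper, not part of bugsigdbr: split a signature name into the
# numeric IDs found in its "bsdb:" prefix.
parseSignatureName <- function(sname) {
    # Capture the three slash-separated numbers that follow "bsdb:"
    pattern <- "^bsdb:([0-9]+)/([0-9]+)/([0-9]+)_"
    stopifnot(is.character(sname), length(sname) == 1L, grepl(pattern, sname))
    m <- regmatches(sname, regexec(pattern, sname))[[1]]
    ids <- as.integer(m[2:4])
    # Assumed meaning of the three positions
    names(ids) <- c("study", "experiment", "signature")
    ids
}

sname <- "bsdb:215/1/1_eczema:infant-with-eczema_vs_healthy-control_UP"
ids <- parseSignatureName(sname)
print(ids)
```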
Functionality for programmatically displaying BugSigDB taxon pages.
browseTaxon(tax.id)
tax.id |
character. NCBI taxonomy ID. |
The URL of the selected BugSigDB taxon page. If interactive, opens the URL in the default web browser.
BugSigDB: https://bugsigdb.org
# BugSigDB taxon page for Escherichia coli
browseTaxon("562")
Functionality for extracting specific taxonomic levels (such as genus and species) from a microbe signature containing taxonomic clades in MetaPhlAn format.
extractTaxLevel( sig, tax.id.type = c("metaphlan", "taxname"), tax.level = "mixed", exact.tax.level = TRUE )
sig |
character. Microbe signature containing taxonomic clades in MetaPhlAn format. |
tax.id.type |
character. Taxonomic ID type of the returned microbe sets.
Either |
tax.level |
character. Either |
exact.tax.level |
logical. Should only the exact taxonomic level
specified by |
a character vector storing taxonomic clades restricted to chosen taxonomic level(s).
BugSigDB: https://bugsigdb.org
ord <- "k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales"
sig <- c("f__Lactobacillaceae|g__Lactobacillus",
         "f__Aerococcaceae|g__Abiotrophia|s__Abiotrophia defectiva",
         "f__Lactobacillaceae|g__Limosilactobacillus|s__Limosilactobacillus mucosae")
sig <- paste(ord, sig, sep = "|")
sig <- extractTaxLevel(sig, tax.level = "genus")
sig <- extractTaxLevel(sig, tax.level = "genus", exact.tax.level = FALSE)
sig <- extractTaxLevel(sig, tax.id.type = "taxname", tax.level = "genus", exact.tax.level = FALSE)
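To illustrate what restricting MetaPhlAn-style clades to the genus level can look like, here is a base-R sketch. The helper `extractGenus` is hypothetical and is not how bugsigdbr implements `extractTaxLevel`. Its `exact` argument reflects one reading of `exact.tax.level`: keep only clades given exactly at genus level when `TRUE`, and truncate more specific clades to genus level when `FALSE`.

```r
# Hypothetical base-R helper, not the bugsigdbr implementation
extractGenus <- function(sig, exact = TRUE) {
    # Split each clade string into its "|"-separated ranks
    ranks <- strsplit(sig, "|", fixed = TRUE)
    res <- vapply(ranks, function(r) {
        pos <- which(startsWith(r, "g__"))
        # No genus rank present: nothing to extract
        if (length(pos) == 0L) return(NA_character_)
        # exact = TRUE: the clade must end at the genus rank
        if (exact && pos[1] != length(r)) return(NA_character_)
        # Truncate the clade at the genus rank
        paste(r[seq_len(pos[1])], collapse = "|")
    }, character(1))
    unique(res[!is.na(res)])
}

ord <- "k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales"
sig <- c("f__Lactobacillaceae|g__Lactobacillus",
         "f__Aerococcaceae|g__Abiotrophia|s__Abiotrophia defectiva",
         "f__Lactobacillaceae|g__Limosilactobacillus|s__Limosilactobacillus mucosae")
sig <- paste(ord, sig, sep = "|")

genus.exact <- extractGenus(sig, exact = TRUE)
genus.all <- extractGenus(sig, exact = FALSE)
```

Under these assumptions, `genus.exact` holds only the clade that ends at a genus, while `genus.all` also holds the two species-level clades cut back to their genus.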
Functionality for obtaining meta-signatures for a column of interest
getMetaSignatures( df, column, direction = c("BOTH", "UP", "DOWN"), min.studies = 2, min.taxa = 5, comb.fun = sum, ... )
df |
|
column |
character. Column of interest. Needs to be a valid column name
of |
direction |
character. Indicates direction of abundance change for signatures
to be included in the computation of meta-signatures. Use |
min.studies |
integer. Minimum number of studies for a category in |
min.taxa |
integer. Minimum size for meta-signatures. Defaults to 5, which will then only include meta-signatures containing at least 5 taxa. |
comb.fun |
function. Function for combining sample size of the exposed group
and sample size of the unexposed group into an overall study sample size. Defaults
to |
... |
additional arguments passed on to |
A list of meta-signatures, each meta-signature being a named numeric vector. Names are the taxa of the meta-signature, numeric values correspond to sample size weights associated with each taxon.
getSignatures
df <- importBugSigDB()

# Body-site specific meta-signatures composed from signatures reported as both
# increased or decreased across all conditions of study:
bs.meta.sigs <- getMetaSignatures(df, column = "Body site")

# Condition-specific meta-signatures from fecal samples, increased
# in conditions of study. Use taxonomic names instead of the default NCBI IDs:
df.feces <- df[df$`Body site` == "Feces", ]
cond.meta.sigs <- getMetaSignatures(df.feces, column = "Condition", direction = "UP", tax.id.type = "taxname")

# Inspect the results
names(cond.meta.sigs)
cond.meta.sigs["Bipolar disorder"]
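The returned value is described as a named numeric vector of sample size weights per taxon. The toy sketch below shows one way such weights could be derived in base R. All data are invented, and the weighting rule (share of the total sample size contributed by the studies reporting a taxon) is an assumption for illustration; it is not taken from the bugsigdbr source. Only the use of `sum` to combine exposed and unexposed group sizes mirrors the documented `comb.fun` default.

```r
# Toy signatures for one category (e.g. one body site); all values are invented
sigs <- list(
    study1 = c("Lactobacillus", "Bifidobacterium"),
    study2 = c("Lactobacillus", "Prevotella"),
    study3 = c("Lactobacillus", "Bifidobacterium", "Prevotella")
)

# Invented group sizes per study
n.exposed <- c(study1 = 20, study2 = 50, study3 = 10)
n.unexposed <- c(study1 = 20, study2 = 50, study3 = 10)

# Combine both group sizes into an overall study size, analogous to comb.fun = sum
comb.fun <- sum
n.study <- mapply(function(e, u) comb.fun(e, u), n.exposed, n.unexposed)

# Assumed weighting rule: share of the total sample size contributed by the
# studies that report each taxon
taxa <- unique(unlist(sigs))
meta.sig <- vapply(taxa, function(tx) {
    reporting <- vapply(sigs, function(s) tx %in% s, logical(1))
    sum(n.study[reporting]) / sum(n.study)
}, numeric(1))
meta.sig <- sort(meta.sig, decreasing = TRUE)
print(meta.sig)
```

Passing a different `comb.fun`, such as `min` or `mean`, would change how each study contributes to the weights in this sketch.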
Lightweight wrapper around ontologyIndex::get_ontology to parse the Experimental Factor Ontology (EFO) or the Uber-anatomy ontology (UBERON) from OBO format into an R object.
getOntology(onto = c("efo", "uberon"), cache = TRUE)
onto |
character. Ontology to obtain. Should be either |
cache |
logical. Should a locally cached version be used if available?
Defaults to |
An object of class ontology_index as defined in the ontologyIndex package.
EFO: https://www.ebi.ac.uk/ols/ontologies/efo
UBERON: https://www.ebi.ac.uk/ols/ontologies/uberon
get_ontology from the ontologyIndex package.
uberon <- getOntology("uberon")
Functionality for obtaining microbe signatures from BugSigDB
getSignatures( df, tax.id.type = c("ncbi", "metaphlan", "taxname"), tax.level = "mixed", exact.tax.level = TRUE, min.size = 1 )
df |
|
tax.id.type |
character. Taxonomic ID type of the returned microbe sets.
Either |
tax.level |
character. Either |
exact.tax.level |
logical. Should only the exact taxonomic level
specified by |
min.size |
integer. Minimum signature size. Defaults to 1, which will
filter out empty signatures. Use |
a list of microbe signatures. Each signature is a character vector of taxonomic IDs depending on the chosen tax.id.type.
BugSigDB: https://bugsigdb.org
importBugSigDB
df <- importBugSigDB()
sigs <- getSignatures(df)
Obtain published microbial signatures from bugsigdb.org
importBugSigDB(version = "10.5281/zenodo.13997429", cache = TRUE)
version |
character. A Zenodo DOI, git commit hash, or "devel". Defaults to the most recent stable release on Zenodo, which includes complete and reviewed content from BugSigDB. See details. |
cache |
logical. Should a locally cached version be used if available?
Defaults to |
There are three different options to obtain data from BugSigDB, as determined by the version argument.
a Zenodo DOI: use this option if you would like to obtain one of the stable release versions of BugSigDB on Zenodo. These stable release versions have been automatically checked and manually reviewed, and provide the highest data quality. Select this option if you would like to incorporate BugSigDB into analysis and published research. Unless specified otherwise, the importBugSigDB function obtains the most recent stable release from Zenodo.
"devel": use this option to obtain the latest version ("bleeding edge") of BugSigDB from the BugSigDBExports GitHub repo (see references). Note that this will also include incomplete and unreviewed content, which should be filtered out prior to an analysis. Select this option if you are a curator who actively contributes to BugSigDB and would like to access data that you and other curators have recently contributed and that has not yet been included in a stable release.
a git commit hash: it may occasionally be of interest to obtain a specific snapshot of the BugSigDBExports GitHub repo, e.g. for debugging and troubleshooting. This can be done by providing the short 7-character git commit hash (SHA) or the full SHA of the export of choice. To provide the full SHA, go to the BugSigDBExports commits page (see references) and use the copy symbol to the left of the 7-character codes to copy the full SHA code of the export version you want to use.
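The three kinds of version string described above can be told apart by their shape. The following base-R sketch shows one way to do so. The helper `classifyVersion` and its return labels are hypothetical and not part of bugsigdbr, and the patterns (a Zenodo DOI with the `10.5281/zenodo.` prefix, a 7- or 40-character hexadecimal hash) are assumptions drawn from the description and the release list.

```r
# Hypothetical helper, not part of bugsigdbr: classify a version string
classifyVersion <- function(version) {
    stopifnot(is.character(version), length(version) == 1L)
    if (identical(version, "devel")) {
        # Latest, possibly unreviewed export
        "devel"
    } else if (grepl("^10\\.5281/zenodo\\.[0-9]+$", version)) {
        # Stable release identified by its Zenodo DOI
        "zenodo-doi"
    } else if (grepl("^[0-9a-f]{7}$|^[0-9a-f]{40}$", version)) {
        # Short (7-character) or full (40-character) git commit SHA
        "git-commit"
    } else {
        stop("unrecognized version: ", version)
    }
}

classifyVersion("10.5281/zenodo.13997429")
classifyVersion("devel")
# "30383a9" is a made-up hash used only to exercise the pattern
classifyVersion("30383a9")
```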
a data.frame.
BugSigDB: https://bugsigdb.org
Stable release: https://doi.org/10.5281/zenodo.10627578
Latest version (incl. not reviewed content): https://github.com/waldronlab/BugSigDBExports
Release v1.2.2: https://zenodo.org/records/13997429
Release v1.2.1: https://zenodo.org/records/10627578
Release v1.2.0: https://zenodo.org/records/10407666
Release v1.1.0: https://zenodo.org/records/6468009
Release v1.0.2: https://zenodo.org/records/5904281
Release v1.0.1: https://zenodo.org/records/5819260
Release v1.0.0: https://zenodo.org/records/5606166
BugSigDBExports commits page: https://github.com/waldronlab/BugSigDBExports/commits/devel
df <- importBugSigDB()
Functionality for restricting microbe signatures to specific taxonomic levels such as genus and species.
restrictTaxLevel(df, tax.level = "mixed", exact.tax.level = TRUE, min.size = 1)
df |
|
tax.level |
character. Either |
exact.tax.level |
logical. Should only the exact taxonomic level
specified by |
min.size |
integer. Minimum signature size. Defaults to 1, which will
filter out empty signatures. Use |
a data.frame with microbe signature columns restricted to chosen taxonomic level(s).
BugSigDB: https://bugsigdb.org
importBugSigDB
df <- importBugSigDB()
df <- restrictTaxLevel(df, tax.level = "genus")
This function facilitates ontology-based queries for experimental factors and body sites.
subsetByOntology(df, column = c("Body site", "Condition"), term, ontology)
df |
|
column |
character. Column of |
term |
character. A valid term of |
ontology |
an object of class |
a data.frame with the chosen column restricted to descendants of the chosen term in the chosen ontology.
EFO: https://www.ebi.ac.uk/ols/ontologies/efo
UBERON: https://www.ebi.ac.uk/ols/ontologies/uberon
importBugSigDB, getOntology
# (1) Obtain BugSigDB data
df <- importBugSigDB()

# (2) Obtain ontology of interest as an R object
uberon <- getOntology("uberon")

# (3) High-level query on body site
sdf <- subsetByOntology(df, column = "Body site", term = "digestive system element", ontology = uberon)
table(sdf[,"Body site"])
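The idea behind an ontology-based query can be sketched with a toy hierarchy in base R: collect all descendants of a term, then keep the rows whose value is among them. The hierarchy and data frame below are invented and much smaller than UBERON, the helper `getDescendants` is hypothetical, and whether the query term itself is kept alongside its descendants is an assumption of this sketch.

```r
# Toy parent -> children map; the hierarchy is illustrative, not real UBERON content
children <- list(
    "digestive system element" = c("intestine", "stomach"),
    "intestine" = c("colon", "small intestine")
)

# Hypothetical helper: recursively collect all descendants of a term
getDescendants <- function(term, children) {
    direct <- children[[term]]
    # Terms without an entry in the map are leaves
    if (is.null(direct)) return(character(0))
    unique(c(direct, unlist(lapply(direct, getDescendants, children = children))))
}

# Invented stand-in for a BugSigDB data.frame
df <- data.frame(Study = paste0("S", 1:4),
                 `Body site` = c("colon", "skin", "stomach", "small intestine"),
                 check.names = FALSE, stringsAsFactors = FALSE)

# Keep rows annotated to the term itself or to any of its descendants
term <- "digestive system element"
keep <- c(term, getDescendants(term, children))
sdf <- df[df[["Body site"]] %in% keep, ]
table(sdf[, "Body site"])
```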
Functionality for writing microbe signatures to file in GMT format.
writeGMT(sigs, gmt.file, ...)
sigs |
A list of microbe signatures (character vectors of taxonomic IDs). |
gmt.file |
character. Path to output file in GMT format. |
... |
Arguments passed on to cat() |
none, writes to file.
GMT file format: http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats
bsdb <- importBugSigDB()
sigs <- getSignatures(bsdb)
writeGMT(sigs, gmt.file = "signatures.gmt")
file.remove("signatures.gmt")
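For readers unfamiliar with GMT, each line holds a set name, a description field, and the set members, all separated by tabs. The base-R sketch below writes a small invented list of signatures in that layout. The helper `toGmtLines` is hypothetical, and the `"NA"` placeholder in the description field is an assumption; what writeGMT puts in that field is not stated here.

```r
# Hypothetical helper, not part of bugsigdbr: format signatures as GMT lines
toGmtLines <- function(sigs, description = "NA") {
    stopifnot(is.list(sigs), !is.null(names(sigs)))
    vapply(names(sigs), function(n) {
        # name <TAB> description <TAB> member1 <TAB> member2 ...
        paste(c(n, description, sigs[[n]]), collapse = "\t")
    }, character(1), USE.NAMES = FALSE)
}

# Invented signatures using NCBI taxonomy IDs as members
sigs <- list(sigA = c("562", "1301"), sigB = "1578")

# Write to a temporary file, read it back, and clean up
gmt.file <- tempfile(fileext = ".gmt")
writeLines(toGmtLines(sigs), gmt.file)
gmt.lines <- readLines(gmt.file)
file.remove(gmt.file)
```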