Authors: Johannes Rainer [cre] (https://orcid.org/0000-0002-6977-7147)
Compiled: Mon Nov 18 03:11:37 2024
MassBank is an open access, community maintained annotation database
for small compounds. Annotations provided by this database comprise
names, chemical formulas, exact masses and other chemical properties for
small compounds (including metabolites, medical treatment agents and
others). In addition, fragment spectra are available, which are crucial
for the annotation of untargeted mass spectrometry data. The CompoundDb
Bioconductor package supports conversion of MassBank data into the
CompDb (SQLite) format, which simplifies distribution of the resource
and its integration into Bioconductor-based annotation workflows.
CompDb Databases from AnnotationHub
The AHMassBank package provides the metadata for all CompDb SQLite
databases with MassBank annotations in AnnotationHub. To get and use
MassBank annotations, we first load/update the AnnotationHub resource.
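The loading step can be sketched as follows. This is a minimal example, not necessarily the exact code used to produce the output below; the variable name `ah` is chosen here for illustration, and the call needs network access the first time it runs.

```r
library(AnnotationHub)

# Create the hub object. This loads the locally cached AnnotationHub
# metadata and updates it if a newer snapshot is available.
# `ah` is an illustrative variable name.
ah <- AnnotationHub()
```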
## Warning: multiple methods tables found for 'intersect'
## Warning: multiple methods tables found for 'intersect'
Next we list all MassBank entries from AnnotationHub.
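A minimal sketch of this listing step, assuming the search term "MassBank" is enough to select the relevant records (the variable names `ah` and `mb` are illustrative):

```r
library(AnnotationHub)

ah <- AnnotationHub()

# Restrict the hub to records whose metadata match "MassBank".
mb <- query(ah, "MassBank")

# Printing the result shows the record identifiers and titles.
mb
```

The number of records returned depends on the AnnotationHub snapshot in use, so it can differ from the listing shown here.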
## AnnotationHub with 6 records
## # snapshotDate(): 2024-10-28
## # $dataprovider: MassBank
## # $species: NA
## # $rdataclass: CompDb
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["AH107048"]]'
##
## title
## AH107048 | MassBank CompDb for release 2021.03
## AH107049 | MassBank CompDb for release 2022.06
## AH111334 | MassBank CompDb for release 2022.12.1
## AH116164 | MassBank CompDb for release 2023.06
## AH116165 | MassBank CompDb for release 2023.09
## AH116166 | MassBank CompDb for release 2023.11
We fetch the CompDb with MassBank annotations for release 2021.03.
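One way to fetch this resource is by its record identifier, as suggested in the listing above (`AH107048` for release 2021.03). The sketch below assumes that identifier is still present in the snapshot in use; `ah` and `cdb` are illustrative variable names.

```r
library(AnnotationHub)

ah <- AnnotationHub()

# Double-bracket subsetting downloads the resource on first use and
# loads it from the local cache afterwards. The result is a CompDb
# object, so the CompoundDb package gets loaded as a side effect.
cdb <- ah[["AH107048"]]
cdb
```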
## downloading 1 resources
## retrieving 1 resource
## loading from cache
## require("CompoundDb")
## Warning: multiple methods tables found for 'union'
## Warning: multiple methods tables found for 'intersect'
## Warning: multiple methods tables found for 'setdiff'
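Once loaded, the database can be queried with functions from the CompoundDb package. The following sketch is an illustration under stated assumptions rather than part of the original workflow: it assumes record `AH107048` is available and that the Spectra package is installed (it is a dependency of CompoundDb).

```r
library(AnnotationHub)
library(CompoundDb)

ah <- AnnotationHub()
cdb <- ah[["AH107048"]]

# List the columns available for compound annotations.
compoundVariables(cdb)

# Extract compound annotations as a data.frame and show the first rows.
cmps <- compounds(cdb)
head(cmps)

# Extract the fragment spectra as a Spectra object.
sps <- Spectra::Spectra(cdb)
sps
```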
CompDb Databases from MassBank
MassBank provides its annotation database as a MySQL dump. To
simplify its usage (also for users not experienced with MySQL or with
the specific MassBank database layout), MassBank annotations can also be
converted into the (SQLite-based) CompDb format, which can be easily
used with the CompoundDb package. The steps to convert a MassBank MySQL
database to a CompDb SQLite database are described below.
First, the MySQL database dump needs to be downloaded from the
MassBank GitHub page. This dump then needs to be imported into a local
MySQL/MariaDB database server (using
mysql -h localhost -u <username> -p < MassBank.sql
with <username> being the name of a user with write access to the
database server).
To transfer the MassBank data into a CompDb database, a helper function
from the CompoundDb package can be used.
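The conversion could be sketched as below. Several details are assumptions not stated in the text above: that the helper is the function `massbank_to_compdb()` defined in a script `scripts/massbank_to_compdb.R` shipped with CompoundDb, that the imported database is named `MassBank`, and that credentials are supplied through the (hypothetical) environment variables `MASSBANK_USER` and `MASSBANK_PASSWORD`. The wrapper `convert_massbank()` is likewise a hypothetical name introduced here.

```r
library(DBI)

# Hypothetical wrapper around the conversion; nothing runs until it is
# called explicitly.
convert_massbank <- function(user = Sys.getenv("MASSBANK_USER"),
                             password = Sys.getenv("MASSBANK_PASSWORD"),
                             dbname = "MassBank") {
    # Connect to the local MySQL/MariaDB server holding the imported dump.
    con <- dbConnect(RMariaDB::MariaDB(), host = "localhost",
                     username = user, password = password,
                     dbname = dbname)
    # Close the connection when the function exits, also on error.
    on.exit(dbDisconnect(con), add = TRUE)

    # Assumed location of the helper script within CompoundDb.
    script <- system.file("scripts", "massbank_to_compdb.R",
                          package = "CompoundDb")
    if (!nzchar(script))
        stop("Conversion script not found in the CompoundDb package")
    source(script, local = TRUE)

    # Assumed helper function defined by the script.
    massbank_to_compdb(con)
}
```

Calling `convert_massbank()` requires a running database server with the MassBank dump imported. Where the resulting SQLite file is written is determined by the helper script, so it is worth checking the script before running it.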
Session information
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] CompoundDb_1.11.0 S4Vectors_0.45.2 AnnotationFilter_1.31.0
## [4] AnnotationHub_3.15.0 BiocFileCache_2.15.0 dbplyr_2.5.0
## [7] BiocGenerics_0.53.3 generics_0.1.3 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 dplyr_1.1.4 blob_1.2.4
## [4] filelock_1.0.3 Biostrings_2.75.1 bitops_1.0-9
## [7] fastmap_1.2.0 lazyeval_0.2.2 RCurl_1.98-1.16
## [10] digest_0.6.37 mime_0.12 lifecycle_1.0.4
## [13] cluster_2.1.6 ProtGenerics_1.39.0 rsvg_2.6.1
## [16] KEGGREST_1.47.0 RSQLite_2.3.8 magrittr_2.0.3
## [19] compiler_4.4.2 rlang_1.1.4 sass_0.4.9
## [22] tools_4.4.2 utf8_1.2.4 yaml_2.3.10
## [25] knitr_1.49 htmlwidgets_1.6.4 bit_4.5.0
## [28] curl_6.0.1 xml2_1.3.6 BiocParallel_1.41.0
## [31] withr_3.0.2 purrr_1.0.2 sys_3.4.3
## [34] grid_4.4.2 fansi_1.0.6 colorspace_2.1-1
## [37] ggplot2_3.5.1 MASS_7.3-61 scales_1.3.0
## [40] cli_3.6.3 rmarkdown_2.29 crayon_1.5.3
## [43] httr_1.4.7 rjson_0.2.23 DBI_1.2.3
## [46] cachem_1.1.0 zlibbioc_1.52.0 parallel_4.4.2
## [49] AnnotationDbi_1.69.0 BiocManager_1.30.25 XVector_0.47.0
## [52] base64enc_0.1-3 vctrs_0.6.5 jsonlite_1.8.9
## [55] IRanges_2.41.1 bit64_4.5.2 clue_0.3-66
## [58] maketools_1.3.1 jquerylib_0.1.4 glue_1.8.0
## [61] codetools_0.2-20 DT_0.33 Spectra_1.17.0
## [64] gtable_0.3.6 BiocVersion_3.21.1 GenomeInfoDb_1.43.0
## [67] GenomicRanges_1.59.0 UCSC.utils_1.3.0 munsell_0.5.1
## [70] tibble_3.2.1 pillar_1.9.0 rappdirs_0.3.3
## [73] htmltools_0.5.8.1 GenomeInfoDbData_1.2.13 R6_2.5.1
## [76] evaluate_1.0.1 Biobase_2.67.0 png_0.1-8
## [79] memoise_2.0.1 bslib_0.8.0 MetaboCoreUtils_1.15.0
## [82] Rcpp_1.0.13-1 gridExtra_2.3 ChemmineR_3.59.0
## [85] xfun_0.49 fs_1.6.5 MsCoreUtils_1.19.0
## [88] buildtools_1.0.0 pkgconfig_2.0.3