This vignette provides instructions to construct and analyze a search index of DNAm array data. The index is made using the hnswlib Python library, and the basilisk and reticulate R/Bioconductor libraries are used to manage Python environments and functions. These methods should be widely useful for genomics and epigenomics analyses, especially for very large datasets.
The search index has a similar function to the index of a book. Rather than storing the full/uncompressed data, only the between-entity relations are stored, which enables rapid entity lookup and nearest neighbors analysis while keeping stored file sizes manageable. Many methods for search index construction are available. The Hierarchical Navigable Small World (HNSW) graph method, used below, is fairly new (Malkov and Yashunin 2018) and was among the overall top performing methods benchmarked in ANN benchmarks (Aumüller, Bernhardsson, and Faithfull 2018). HNSW is implemented by the hnswlib Python library, which also includes helpful docstrings and a README to apply the method in practice.
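To make the idea of a nearest neighbors lookup concrete, the following base R sketch performs an exact, brute-force search on a small simulated matrix. It is only a baseline for intuition and is not how hnswlib works internally: HNSW answers the same kind of query approximately by traversing a graph rather than scanning every record. The function name knn_brute, the toy data, and the choice of Euclidean distance are assumptions made for illustration.

```r
# Exact k-nearest-neighbor lookup by brute force (toy baseline, base R only).
# knn_brute is a hypothetical helper, not part of recountmethylation or hnswlib.
knn_brute <- function(index_mat, query, k) {
  stopifnot(is.matrix(index_mat), length(query) == ncol(index_mat),
            k >= 1, k <= nrow(index_mat))
  # squared Euclidean distance from the query to every indexed row
  d2 <- rowSums(sweep(index_mat, 2, query)^2)
  # IDs of the k closest rows, nearest first
  names(sort(d2))[seq_len(k)]
}

set.seed(1)
toy_index <- matrix(runif(6 * 4), nrow = 6,
                    dimnames = list(paste0("sample", 1:6), NULL))
# querying with an indexed row returns that row first (distance 0)
knn_brute(toy_index, toy_index["sample3", ], k = 3)
```

A brute-force scan like this grows linearly with the number of indexed records, which is the cost a search index is meant to avoid on large datasets.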
While prior work showed the utility of indexing several types of biomedical data for research, to our knowledge this is the first time support has been provided for R users to make and analyze search indexes of DNAm array data. This vignette walks through a small example using a handful of DNAm array samples from blood. Interested users can further access a comprehensive index of pre-compiled DNAm array data from blood samples on the recountmethylation server. These data were available in the Gene Expression Omnibus (GEO) by March 31, 2021, and they include 13,835 samples run on either the HM450K or EPIC platform.
Make a new search index using sample DNAm array data after performing dimensionality reduction on the data using feature hashing (a.k.a. “the hashing trick”, Weinberger et al. (2010)).
First, use the setup_sienv() function to set up a Python virtual environment named “dnam_si_vignette” which contains the required dependencies. Since this function and other related functions are not exported, use the ::: operator to call them.
For greater reproducibility, specify exact dependency versions, e.g. “numpy==1.20.1”. Install the hnswlib (v0.5.1) library to manage search index construction and access. Install pandas (v1.2.2) and numpy (v1.20.1) for data manipulations, and mmh3 (v3.0.0) for feature hashing. Note that certain packages, including hnswlib, may only be available in conda repositories for certain operating systems.
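As a small illustration of exact version pinning, the sketch below builds “package==version” strings for the four Python dependencies named above. The variable names py_deps and pins are assumptions, and how such strings are passed to setup_sienv() is not shown in this vignette, so no call to that function is made here.

```r
# Python dependencies and versions as listed in this vignette
py_deps <- c(hnswlib = "0.5.1", pandas = "1.2.2",
             numpy = "1.20.1", mmh3 = "3.0.0")
# build exact version pins of the form "name==version"
pins <- unname(paste0(names(py_deps), "==", py_deps))
print(pins)
```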
Search index efficiency pairs well with dimensionality reduction to enable very rapid queries on large datasets. This section shows how to reduce a dataset of DNAm fractions prior to indexing.
First, save the DNAm fractions (Beta-values) from a SummarizedExperiment object, ensuring that samples are in rows and probes are in columns. Begin by accessing the DNAm data locally from the object gr-noob_h5se_hm450k-epic-merge_0-0-3, which is available for download from recount.bio/data. Next, identify samples of interest from the sample metadata, which is accessed using colData(). After subsetting the samples, store the DNAm fractions (a.k.a. “Beta-values”) for 100 CpG probes, which we access using getBeta().
First, load the h5se object.
gr.fname <- "gr-noob_h5se_hm450k-epic-merge_0-0-3"
gr <- HDF5Array::loadHDF5SummarizedExperiment(gr.fname)
md <- as.data.frame(colData(gr)) # get sample metadata
Next, identify samples from study GSE67393 (Inoshita et al. 2015) using the sample metadata object md.
# identify samples from metadata
gseid <- "GSE67393"
gsmv <- md[md$gse == gseid,]$gsm # get study samples
gsmv <- gsmv[sample(length(gsmv), 10)]
For this vignette, select a random subset of whole blood and PBMC samples to analyze. Identify these using the “blood.subgroup” column in md.
# get random samples by group label
set.seed(1)
mdf <- md[md$blood.subgroup=="PBMC",]
gsmv <- c(gsmv, mdf[sample(nrow(mdf), 20),]$gsm)
mdf <- md[md$blood.subgroup=="whole_blood",]
gsmv <- c(gsmv, mdf[sample(nrow(mdf), 20),]$gsm)
For the specified samples of interest, extract DNAm Beta-value fractions for a subset of 100 probes.
# norm bvals for probe subset
num.cg <- 100
grf <- gr[,gr$gsm %in% gsmv]; dim(grf)
bval <- getBeta(grf[sample(nrow(grf), num.cg),])
bval <- t(bval) # get transpose
rownames(bval) <- gsub("\\..*", "", rownames(bval)) # format rownames
This produced a DNAm matrix of 50 samples by 100 probes, which we’ll save.
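The save step can be sketched as follows on a small simulated matrix. This assumes the table is written as a CSV with sample IDs as row names, and the file name used here is hypothetical. The object name bval.fpath matches the path variable used when calling get_fh() in this vignette.

```r
# toy stand-in for the samples-by-probes Beta-value matrix
set.seed(1)
toy_bval <- matrix(runif(3 * 4), nrow = 3,
                   dimnames = list(paste0("GSM", 1:3), paste0("cg", 1:4)))

# write the table with sample IDs as row names (file name is hypothetical)
bval.fpath <- file.path(tempdir(), "bval_toy.csv")
write.csv(toy_bval, file = bval.fpath)

# read it back to confirm samples are in rows and probes are in columns
bval_in <- as.matrix(read.csv(bval.fpath, row.names = 1, check.names = FALSE))
dim(bval_in)
```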
Call get_fh() to perform feature hashing on the DNAm data. Feature hashing is a dimensionality reduction technique, which here means that it reduces the dataset features/columns (or probes in this case) while preserving between-sample variances.
Specify the target dimensions for this step using the ndim argument. For this small example, reduce the DNAm matrix to about 10% of its original size by setting the target dimensions to 10 (i.e. use ndim = 10).
# get example table and labels
num.dim <- 10 # target reduced dimensions
fhtable.fpath <- "bval_100_fh10.csv"
# bval.fpath: path to the saved Beta-value table (samples in rows)
recountmethylation:::get_fh(csv_savepath = fhtable.fpath,
                            csv_openpath = bval.fpath,
                            ndim = num.dim)
If ndim is high, the data is less reduced/compressed but more closely resembles the original uncompressed data, while the opposite is true at lower ndim. The choice of target dimensions is left to the user. In practice, 10,000 dimensions yields a good tradeoff between compression and similarity to the uncompressed data for HM450K arrays.
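The general idea behind feature hashing can be sketched in base R as below. This is a generic, signed version of the hashing trick written for illustration and is not the implementation used by get_fh(): the toy string hash stands in for the murmurhash function provided by mmh3, and the names toy_hash and hash_features are hypothetical.

```r
# Toy deterministic string hash (a stand-in for murmurhash, for illustration only)
toy_hash <- function(key, seed = 0) {
  h <- seed
  for (code in utf8ToInt(key)) {
    h <- (h * 31 + code) %% 2147483647
  }
  h
}

# Hash each named column into one of ndim buckets, adding its signed values
hash_features <- function(x, ndim) {
  stopifnot(is.matrix(x), !is.null(colnames(x)), ndim >= 1)
  out <- matrix(0, nrow = nrow(x), ncol = ndim,
                dimnames = list(rownames(x), NULL))
  for (j in seq_len(ncol(x))) {
    key <- colnames(x)[j]
    bucket <- toy_hash(key) %% ndim + 1                      # target column
    sgn <- if (toy_hash(key, seed = 7) %% 2 == 0) 1 else -1  # sign from a second hash
    out[, bucket] <- out[, bucket] + sgn * x[, j]
  }
  out
}

set.seed(1)
toy_bval <- matrix(runif(5 * 100), nrow = 5,
                   dimnames = list(paste0("sample", 1:5),
                                   paste0("cg", sprintf("%08d", 1:100))))
toy_fh <- hash_features(toy_bval, ndim = 10)
dim(toy_fh)  # 5 samples by 10 hashed features
```

Because the bucket for each probe depends only on its name, the same probe always maps to the same hashed column, so new samples can be hashed later and compared against an existing table.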
Use make_si() to make a new search index. This function calls the hnswlib Python package to make a new search index and dictionary using the hashed features file generated above. The resulting search index file has the extension *.pickle, since the pickle Python library is used to serialize the search index binary.
# set file paths
# si.dpath: existing directory where the index files are written
si.fname.str <- "new_search_index"
si.fpath <- file.path(si.dpath, paste0(si.fname.str, ".pickle"))
sidict.fpath <- file.path(si.dpath, paste0(si.fname.str, "_dict.pickle"))
# make the new search index
recountmethylation:::make_si(fh_csv_fpath = fhtable.fpath,
                             si_fname = si.fpath,
                             si_dict_fname = sidict.fpath)
The tuning parameters space_val, efc_val, m_val, and ef_val were selected to work well for DNAm array data, and further details about these parameters can be found in the hnswlib package docstrings and README.
Analyze nearest neighbors returned from a series of queries varying the k number of nearest neighbors from 1 to 20.
Specify a vector of valid GSM IDs, which can be found in the hashed features table bval_100_fh10.csv as well as in the saved search index new_search_index.pickle and dictionary new_search_index_dict.pickle.
Specify the vector lkval containing the k numbers of nearest neighbors to return in each query.
Certain query constraints are determined by the search index properties. First, if each record in a search index includes 100 features (e.g. probes, hashed features, etc.), then queries should include exactly 100 dimensions per queried sample, in the same order as the search index. This is why it is convenient to use the previously compiled hashed features table for queries. Further, the k nearest neighbors to return cannot exceed the total indexed entities, or 50 in this example.
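These constraints can be checked before running a query, as in the sketch below. The values in lkval are an assumption inferred from the k = 1, 5, 10, and 20 result columns discussed in this vignette, and check_query is a hypothetical helper rather than a recountmethylation function. The example uses the 10 hashed features and 50 indexed samples from this walkthrough.

```r
# k values for the series of queries (assumed from the k=1, 5, 10, 20 columns)
lkval <- c(1, 5, 10, 20)

# hypothetical helper: stop early if a query cannot be satisfied by the index
check_query <- function(query_dim, index_dim, n_indexed, kv) {
  if (query_dim != index_dim)
    stop("query has ", query_dim, " dimensions but the index has ", index_dim)
  if (any(kv < 1 | kv > n_indexed))
    stop("each k must be between 1 and ", n_indexed)
  invisible(TRUE)
}

# valid: 10 hashed features per sample, 50 indexed samples, all k <= 50
check_query(query_dim = 10, index_dim = 10, n_indexed = 50, kv = lkval)

# invalid: k = 51 exceeds the number of indexed samples
msg <- tryCatch(check_query(10, 10, 50, kv = 51),
                error = function(e) conditionMessage(e))
print(msg)
```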
Use query_si() to run the query. The path to the hashed features table bval_100_fh10.csv is specified, which is where sample data are accessed for each query.
The query results were assigned to dfnn, which we can now inspect. First show its dimensions.
dfnn has 10 rows, corresponding to the 10 queried sample IDs, and 5 columns. The first column shows the IDs for the queried samples. Columns 2-5 show the results of individual queries, where column names designate the k value for a query as k=... .
Now consider the query results for the sample in the first row, called “GSM1646161.1607013450.hlink.GSM1646161_9611518054_R01C01”. Check the results of the first 3 queries for this sample (k = 1, 5, or 10).
When k = 1, the sample ID is returned. This is because the query uses the same hashed features data as was used to make the search index, and the search is for a subset of the indexed samples.
This shows that samples are returned in the order of descending similarity to the queried data. For k = 5, the first sample returned is the same as k = 1, followed by the next 4 nearest neighboring samples.
For k = 10, the first 5 neighbors are the same as for k = 5, followed by the next 5 nearest neighbors.
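The nesting of results across k values can be checked programmatically. The sketch below uses a toy table in the layout described above, with neighbor IDs stored as a single “;”-delimited string per query. The toy IDs, the first column name gsm, and the helper split_ids are assumptions for illustration. Since HNSW search is approximate, this prefix relationship is what one would typically expect rather than a strict guarantee.

```r
# toy result table: one row per queried sample, one "k=..." column per query
dfnn_toy <- data.frame(gsm = "GSM_A",
                       `k=1` = "GSM_A",
                       `k=5` = "GSM_A;GSM_B;GSM_C;GSM_D;GSM_E",
                       check.names = FALSE, stringsAsFactors = FALSE)

# split a ";"-delimited result string into a vector of sample IDs
split_ids <- function(x) unlist(strsplit(x, ";", fixed = TRUE))

nn1 <- split_ids(dfnn_toy[1, "k=1"])
nn5 <- split_ids(dfnn_toy[1, "k=5"])

# neighbors from the smaller query should lead the larger query's list
identical(nn1, nn5[seq_along(nn1)])
```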
This section shows some ways to visualize the results of nearest neighbors queries using the ggplot2 package.
Now analyze the type of samples among returned nearest neighbors. Use the md object to map labels to returned sample IDs for a single query, e.g. the first row where k = 5.
We see there are 4 whole blood samples, and 1 labeled other/NOS which corresponds to the label for the queried sample.
Now get the distribution of labels across queried samples for a single k value. We’ll show the distribution of samples with the label “whole_blood” from the variable “blood.subgroup”, focusing on nearest neighbors returned from the queries with k = 20.
dist.wb <- unlist(lapply(seq(nrow(dfnn)), function(ii){
gsmvi <- unlist(strsplit(dfnn[ii,"k=20"], ";"))
length(which(md[gsmvi,]$blood.subgroup=="whole_blood"))
}))
Now plot the results for whole blood after formatting the results variables. Use a composite violin plot and boxplot to show the results scaled as percentages on the y-axis, including important distribution features such as the median, interquartile range, and outliers.
dfp <- data.frame(num.samples = dist.wb)
dfp$perc.samples <- 100*dfp$num.samples/20
dfp$type <- "whole_blood"
ggplot(dfp, aes(x = type, y = perc.samples)) +
geom_violin() + geom_boxplot(width = 0.2) + theme_bw() +
ylab("Percent of neighbors") + theme(axis.title.x = element_blank()) +
scale_y_continuous(labels = scales::percent_format(scale = 1))
Repeat the above for all three mapped metadata labels. Define a function get_dfgrp() to calculate the number of neighbors having each metadata label. This function takes the metadata object md as the first argument, the vector of neighbor sample IDs gsmv as the second argument, and the vector of metadata labels ugroupv as the third argument.
For ugroupv, specify the three metadata labels of interest found in the blood.subgroup column of md (i.e. “whole_blood”, “PBMC”, “other/NOS”).
# function to get samples by label
get_dfgrp <- function(md, gsmv, ugroupv = c("whole_blood", "PBMC", "other/NOS")){
  do.call(rbind, lapply(ugroupv, function(ugroupi){
    # count neighbors in gsmv (row names of md) carrying the label ugroupi
    num.grp <- length(which(md[gsmv,]$blood.subgroup==ugroupi))
    data.frame(num.samples = num.grp, type = ugroupi)
  }))
}
For each queried sample, use get_dfgrp() to calculate frequencies for metadata labels specified in ugroupv. Assign the results to dfp.
# get samples by label across queries
dfp <- do.call(rbind, lapply(seq(nrow(dfnn)), function(ii){
get_dfgrp(md = md, unlist(strsplit(dfnn[ii,"k=20"], ";")))
}))
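The per-label tally for a single query can also be sketched with base R's table() on toy data, as below. Converting the labels to a factor with fixed levels keeps labels that have zero neighbors in the output. The objects md_toy, label_levels, neighbors, and dfgrp_toy are hypothetical and only mirror the structure assumed in this section (sample IDs as row names of the metadata, labels in blood.subgroup).

```r
# toy metadata with sample IDs as row names
md_toy <- data.frame(blood.subgroup = c("whole_blood", "whole_blood", "PBMC", "other/NOS"),
                     row.names = c("GSM_A", "GSM_B", "GSM_C", "GSM_D"),
                     stringsAsFactors = FALSE)
label_levels <- c("whole_blood", "PBMC", "other/NOS")

# neighbor IDs returned for one queried sample
neighbors <- c("GSM_A", "GSM_B", "GSM_D")

# fixed factor levels keep labels with zero neighbors in the tally
counts <- table(factor(md_toy[neighbors, "blood.subgroup"], levels = label_levels))
dfgrp_toy <- data.frame(num.samples = as.integer(counts), type = names(counts))
dfgrp_toy
```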
Format dfp’s variables for plotting, then make the composite violin and boxplots. This will show the three label distributions across queries. The metadata labels will be ordered according to their distribution medians, and the y-axis will reflect the percent of neighbors containing each label.
# format dfp variables for plots
dfp$perc.samples <- 100*dfp$num.samples/20
# reorder on medians
medianv <- unlist(lapply(c("whole_blood", "PBMC", "other/NOS"), function(groupi){
median(dfp[dfp$type==groupi,]$perc.samples)}))
# define legend groups
dfp$`Sample\ntype` <- factor(dfp$type, levels = c("whole_blood", "PBMC", "other/NOS")[order(medianv)])
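The median-based ordering of labels can be tried on toy data with tapply(), as sketched below. The percentages in dfp_toy are made up for illustration and are not results from this vignette.

```r
# toy percentages of neighbors per label across three queries
dfp_toy <- data.frame(type = rep(c("whole_blood", "PBMC", "other/NOS"), each = 3),
                      perc.samples = c(80, 90, 70, 10, 20, 15, 5, 0, 10))

# median percent of neighbors for each label
med <- tapply(dfp_toy$perc.samples, dfp_toy$type, median)

# order factor levels by ascending median so plots show labels in that order
dfp_toy$type <- factor(dfp_toy$type, levels = names(med)[order(med)])
levels(dfp_toy$type)
```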
Generate composite violin plots and boxplots for each metadata label.
# make new composite plot
ggplot(dfp, aes(x = `Sample\ntype`, y = perc.samples, fill = `Sample\ntype`)) +
geom_violin() + geom_boxplot(width = 0.2) + theme_bw() +
ylab("Percent of neighbors") + theme(axis.title.x = element_blank()) +
scale_y_continuous(labels = scales::percent_format(scale = 1))
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] basilisk_1.19.0
## [2] reticulate_1.40.0
## [3] limma_3.63.2
## [4] gridExtra_2.3
## [5] knitr_1.49
## [6] recountmethylation_1.17.0
## [7] HDF5Array_1.35.2
## [8] rhdf5_2.51.1
## [9] DelayedArray_0.33.3
## [10] SparseArray_1.7.2
## [11] S4Arrays_1.7.1
## [12] abind_1.4-8
## [13] Matrix_1.7-1
## [14] ggplot2_3.5.1
## [15] minfiDataEPIC_1.32.0
## [16] IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
## [17] IlluminaHumanMethylationEPICmanifest_0.3.0
## [18] minfiData_0.52.0
## [19] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1
## [20] IlluminaHumanMethylation450kmanifest_0.4.0
## [21] minfi_1.53.1
## [22] bumphunter_1.49.0
## [23] locfit_1.5-9.10
## [24] iterators_1.0.14
## [25] foreach_1.5.2
## [26] Biostrings_2.75.3
## [27] XVector_0.47.1
## [28] SummarizedExperiment_1.37.0
## [29] Biobase_2.67.0
## [30] MatrixGenerics_1.19.0
## [31] matrixStats_1.4.1
## [32] GenomicRanges_1.59.1
## [33] GenomeInfoDb_1.43.2
## [34] IRanges_2.41.2
## [35] S4Vectors_0.45.2
## [36] BiocGenerics_0.53.3
## [37] generics_0.1.3
## [38] BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 sys_3.4.3
## [3] jsonlite_1.8.9 magrittr_2.0.3
## [5] GenomicFeatures_1.59.1 farver_2.1.2
## [7] rmarkdown_2.29 BiocIO_1.17.1
## [9] zlibbioc_1.52.0 vctrs_0.6.5
## [11] multtest_2.63.0 memoise_2.0.1
## [13] Rsamtools_2.23.1 DelayedMatrixStats_1.29.0
## [15] RCurl_1.98-1.16 askpass_1.2.1
## [17] htmltools_0.5.8.1 curl_6.0.1
## [19] Rhdf5lib_1.29.0 sass_0.4.9
## [21] nor1mix_1.3-3 bslib_0.8.0
## [23] plyr_1.8.9 cachem_1.1.0
## [25] buildtools_1.0.0 GenomicAlignments_1.43.0
## [27] lifecycle_1.0.4 pkgconfig_2.0.3
## [29] R6_2.5.1 fastmap_1.2.0
## [31] GenomeInfoDbData_1.2.13 digest_0.6.37
## [33] colorspace_2.1-1 siggenes_1.81.0
## [35] reshape_0.8.9 AnnotationDbi_1.69.0
## [37] RSQLite_2.3.9 base64_2.0.2
## [39] filelock_1.0.3 labeling_0.4.3
## [41] mgcv_1.9-1 httr_1.4.7
## [43] compiler_4.4.2 beanplot_1.3.1
## [45] rngtools_1.5.2 withr_3.0.2
## [47] bit64_4.5.2 BiocParallel_1.41.0
## [49] DBI_1.2.3 MASS_7.3-61
## [51] openssl_2.3.0 rjson_0.2.23
## [53] tools_4.4.2 rentrez_1.2.3
## [55] glue_1.8.0 quadprog_1.5-8
## [57] restfulr_0.0.15 nlme_3.1-166
## [59] rhdf5filters_1.19.0 grid_4.4.2
## [61] gtable_0.3.6 tzdb_0.4.0
## [63] preprocessCore_1.69.0 tidyr_1.3.1
## [65] data.table_1.16.4 hms_1.1.3
## [67] xml2_1.3.6 pillar_1.10.0
## [69] genefilter_1.89.0 splines_4.4.2
## [71] dplyr_1.1.4 lattice_0.22-6
## [73] survival_3.8-3 rtracklayer_1.67.0
## [75] bit_4.5.0.1 GEOquery_2.75.0
## [77] annotate_1.85.0 tidyselect_1.2.1
## [79] maketools_1.3.1 xfun_0.49
## [81] scrime_1.3.5 statmod_1.5.0
## [83] UCSC.utils_1.3.0 yaml_2.3.10
## [85] evaluate_1.0.1 codetools_0.2-20
## [87] tibble_3.2.1 BiocManager_1.30.25
## [89] cli_3.6.3 xtable_1.8-4
## [91] munsell_0.5.1 jquerylib_0.1.4
## [93] Rcpp_1.0.13-1 dir.expiry_1.15.0
## [95] png_0.1-8 XML_3.99-0.17
## [97] readr_2.1.5 blob_1.2.4
## [99] basilisk.utils_1.19.0 mclust_6.1.1
## [101] doRNG_1.8.6 sparseMatrixStats_1.19.0
## [103] bitops_1.0-9 scales_1.3.0
## [105] illuminaio_0.49.0 purrr_1.0.2
## [107] crayon_1.5.3 rlang_1.1.4
## [109] KEGGREST_1.47.0