knowYourCG

A tool for functional analysis of DNA methylomes

Quick Start

knowYourCG is a tool for evaluating the enrichment of CpG probes in different methylation feature sets. These features can be categorical (e.g., CpGs located at tissue-specific transcription factors) or continuous (e.g., the local CpG density at a regulatory element). Additionally, the set of CpGs to which the test will be applied can be categorical or continuous as well.

The set of CpGs tested for enrichment is called the query set, and the curated target features are called the database sets. A query set might be, for example, the result of a differential methylation analysis or an epigenome-wide association study. We have curated a variety of database sets that represent different categorical and continuous methylation features, such as CpGs associated with chromatin states, technical artifacts, gene association and gene expression correlation, transcription factor binding sites, tissue-specific methylation, and CpG density.

The following commands prepare knowYourCG for use. Several database sets are retrieved and cached to enable faster access in later enrichment testing. Viewing and accessing the available database sets is discussed later on.

library(knowYourCG)
library(sesameData)
sesameDataCache(data_titles=c("genomeInfo.hg38","genomeInfo.mm10",
                  "KYCG.MM285.tissueSignature.20211211",
                  "MM285.tissueSignature","MM285.address",
                  "probeIDSignature","KYCG.MM285.designGroup.20210210",
                  "KYCG.MM285.chromHMM.20210210",
                  "KYCG.MM285.TFBSconsensus.20220116",
                  "KYCG.MM285.HMconsensus.20220116",
                  "KYCG.MM285.chromosome.mm10.20210630"
                  ))

The following example uses a query of CpGs methylated in mouse primordial germ cells (design group PGCMeth). First get the CG list using the following code.

query <- getDBs("MM285.designGroup")[["PGCMeth"]]
head(query)
## [1] "cg36615889_TC11" "cg36646136_BC21" "cg36647910_BC11" "cg36857173_TC21"
## [5] "cg36877289_BC21" "cg36899653_BC21"

Now test the enrichment. By default, KYCG will select all the categorical groups available but we can specify a subset of databases.

dbs <- c("KYCG.MM285.chromHMM.20210210",
         "KYCG.HM450.TFBSconsensus.20211013",
         "KYCG.MM285.HMconsensus.20220116",
         "KYCG.MM285.tissueSignature.20211211",
         "KYCG.MM285.chromosome.mm10.20210630",
         "KYCG.MM285.designGroup.20210210")
results_pgc <- testEnrichment(query,databases = dbs,platform="MM285")
head(results_pgc)
##        estimate       p.value log10.p.value     test  nQ    nD overlap
## 123 1024.000000  0.000000e+00   -1529.07602 Log2(OR) 474   474     474
## 54     7.622407  0.000000e+00    -528.31906 Log2(OR) 474 10603     415
## 6      5.943042 5.091536e-244    -243.29315 Log2(OR) 474  3575     197
## 48     3.942719 1.890431e-113    -112.72344 Log2(OR) 474  9641     160
## 37     1.809729  2.626803e-13     -12.58057 Log2(OR) 474 10089      52
## 9      1.794176  1.324557e-06      -5.87793 Log2(OR) 474  4113      22
##      cf_Jaccard cf_overlap   cf_NPMI cf_SorensenDice           FDR
## 123 1.000000000  1.0000000 1.0000000     1.000000000  0.000000e+00
## 54  0.038923279  0.8755274 0.4865289     0.074930035  0.000000e+00
## 6   0.051142264  0.4156118 0.4837397     0.097307977 2.342107e-242
## 48  0.016072325  0.3375527 0.3108444     0.031636184 6.521987e-112
## 37  0.004947198  0.1097046 0.1352113     0.009845688  7.249978e-12
## 9   0.004819277  0.0464135 0.1268791     0.009592326  3.046480e-05
##                               group   dbname n_min
## 123 KYCG.MM285.designGroup.20210210  PGCMeth    NA
## 54  KYCG.MM285.HMconsensus.20220116  H3K9me3    14
## 6      KYCG.MM285.chromHMM.20210210      Het    NA
## 48  KYCG.MM285.HMconsensus.20220116 H3K79me3     2
## 37  KYCG.MM285.HMconsensus.20220116 H3K36me3    30
## 9      KYCG.MM285.chromHMM.20210210   Quies3    NA

As expected, the PGCMeth group itself appears at the top of the list. One can also find histone H3K9me3, chromHMM Het, and transcription factor Trim28 binding enriched in this CG group (the latter is not among the six rows printed above).

Testing Scenarios

There are four testing scenarios, depending on the type of the query set and of the database sets. They are shown in the table below. testEnrichment and testEnrichmentSEA perform Fisher’s exact test and Set Enrichment Analysis, respectively.

Four knowYourCG Testing Scenarios

                    Continuous Database Set    Discrete Database Set
Continuous Query    Correlation-based          Set Enrichment Analysis
Discrete Query      Set Enrichment Analysis    Fisher’s Exact Test
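
The table can be read as a simple dispatch on the types of the two inputs. The sketch below is purely illustrative: pick_scenario() is a hypothetical helper, not a knowYourCG function, and it assumes that a continuous set is a named numeric vector and a discrete set is a character vector of probe IDs (the probe IDs and values are made up).

```r
# Hypothetical helper (not part of knowYourCG): map the types of the
# query and the database onto the four scenarios in the table above.
pick_scenario <- function(query, database) {
    query_continuous <- is.numeric(query)
    db_continuous <- is.numeric(database)
    if (query_continuous && db_continuous) {
        "Correlation-based"
    } else if (query_continuous || db_continuous) {
        "Set Enrichment Analysis"
    } else {
        "Fisher's Exact Test"
    }
}

# Made-up inputs: a character vector and a named numeric vector.
discrete_set <- c("cg0001", "cg0002")
continuous_set <- c(cg0001 = 0.12, cg0002 = 0.87)

pick_scenario(discrete_set, discrete_set)
pick_scenario(continuous_set, discrete_set)
pick_scenario(continuous_set, continuous_set)
```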

Enrichment Testing

The main workhorse for testing enrichment of a categorical query against categorical databases is the testEnrichment function. It performs Fisher’s exact test of the query against each database set (one-tailed by default, optionally two-tailed) and reports overlap and enrichment statistics.

Choice of universe set: The universe set is the set of all probes for a given platform. It can either be passed in through the universeSet argument, or the platform name can be passed through the platform argument. If neither is supplied, the universe set will be inferred from the probes in the query.
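
To make the role of the universe concrete, here is a minimal base-R sketch of a one-tailed Fisher's exact test on a toy universe. Everything in it is an assumption made for illustration: the probe IDs are made up, and the sample odds ratio shown is the textbook cross-product ratio, which may differ from how testEnrichment derives its estimate.

```r
# Toy universe of 1000 made-up probe IDs.
universe <- paste0("cg", seq_len(1000))
database <- universe[1:100]             # 100 probes carry the feature
query <- universe[c(1:20, 501:530)]     # 50 query probes, 20 of them in the database

# 2x2 contingency table: query membership vs. database membership.
in_query <- universe %in% query
in_db <- universe %in% database
tab <- table(in_query = in_query, in_db = in_db)

# One-tailed Fisher's exact test for over-representation.
ft <- fisher.test(tab, alternative = "greater")

# Overlap and the sample (cross-product) odds ratio on the log2 scale.
overlap <- sum(in_query & in_db)
sample_or <- (tab["TRUE", "TRUE"] * tab["FALSE", "FALSE"]) /
    (tab["TRUE", "FALSE"] * tab["FALSE", "TRUE"])
c(overlap = overlap, log2_OR = log2(sample_or), p.value = ft$p.value)
```

Changing the universe changes the "neither" cell of the table, and with it the odds ratio and the p-value.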

library(SummarizedExperiment)
library(magrittr)  # provides the %>% pipe used below

## prepare a query
df <- rowData(sesameDataGet('MM285.tissueSignature'))
query <- df$Probe_ID[df$branch == "fetal_brain" & df$type == "Hypo"]

results <- testEnrichment(query, "TFBS", platform="MM285")
results %>% dplyr::filter(overlap>10) %>% head
##   estimate      p.value log10.p.value     test  nQ    nD overlap  cf_Jaccard
## 1 3.058586 4.813228e-18    -17.317564 Log2(OR) 200  6645      32 0.004696903
## 2 3.245138 5.187778e-18    -17.285019 Log2(OR) 200  5228      29 0.005371365
## 3 2.959109 1.916663e-14    -13.717454 Log2(OR) 200  5604      26 0.004499827
## 4 2.244055 3.357158e-12    -11.474028 Log2(OR) 200 12296      34 0.002728294
## 5 2.536127 3.448999e-10     -9.462307 Log2(OR) 200  6195      22 0.003452063
## 6 1.876932 1.114057e-04     -3.953092 Log2(OR) 200  5509      13 0.002282303
##   cf_overlap   cf_NPMI cf_SorensenDice          FDR n_min
## 1      0.160 0.2150698     0.009349890 1.110185e-15     1
## 2      0.145 0.2280937     0.010685335 1.110185e-15     1
## 3      0.130 0.2063000     0.008959338 2.734439e-12     8
## 4      0.170 0.1553535     0.005441741 3.592159e-10     2
## 5      0.110 0.1745582     0.006880375 2.952344e-08     1
## 6      0.065 0.1246681     0.004554213 7.946942e-03     1
##                               group  dbname
## 1 KYCG.MM285.TFBSconsensus.20220116    LHX3
## 2 KYCG.MM285.TFBSconsensus.20220116  POU3F1
## 3 KYCG.MM285.TFBSconsensus.20220116    ISL1
## 4 KYCG.MM285.TFBSconsensus.20220116 ONECUT2
## 5 KYCG.MM285.TFBSconsensus.20220116    SOX3
## 6 KYCG.MM285.TFBSconsensus.20220116  NKX2-1
## prepare another query
query <- df$Probe_ID[df$branch == "fetal_liver" & df$type == "Hypo"]
results <- testEnrichment(query, "TFBS", platform="MM285")
results %>% dplyr::filter(overlap>10) %>%
    dplyr::select(dbname, estimate, test, FDR) %>% head
##   dbname estimate     test          FDR
## 1   TAL1 4.253039 Log2(OR) 8.749398e-42
## 2  GATA1 3.738643 Log2(OR) 9.060254e-30
## 3  SMAD1 3.162168 Log2(OR) 2.924401e-26
## 4   LDB1 2.084605 Log2(OR) 4.586066e-06
## 5    MYB 1.497470 Log2(OR) 8.785247e-04
## 6  GATA2 1.433997 Log2(OR) 7.724124e-03

The output of each test contains multiple variables, including the estimate (a log2 odds ratio here, as indicated by the test column), the p-value, overlap statistics, the type of test, and the names of the database set and the database group. By default, the results are sorted by -log10 of the p-value and by the estimate.

The nQ and nD columns give the sizes of the query set and the database set, respectively. It is often important to examine the extent of overlap between the two sets, so that count is reported as well in the overlap column.
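
The overlap-derived columns can be recomputed by hand from nQ, nD and overlap. The sketch below uses the textbook definitions of the Jaccard index, the overlap coefficient and the Sørensen-Dice coefficient; that these are the definitions behind the cf_ columns is an assumption, although the numbers agree with the first row (LHX3) of the fetal_brain output above.

```r
# Values copied from the first row (LHX3) of the fetal_brain output above.
nQ <- 200
nD <- 6645
overlap <- 32
p_value <- 4.813228e-18

# Textbook overlap coefficients (assumed to match the cf_ columns).
jaccard <- overlap / (nQ + nD - overlap)    # intersection / union
overlap_coef <- overlap / min(nQ, nD)       # intersection / smaller set
sorensen_dice <- 2 * overlap / (nQ + nD)    # 2 * intersection / sum of sizes
log10_p <- log10(p_value)

round(c(cf_Jaccard = jaccard, cf_overlap = overlap_coef,
        cf_SorensenDice = sorensen_dice, log10.p.value = log10_p), 6)
```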

Database Sets

The success of enrichment testing depends on the availability of biologically-relevant databases. To reflect the biological meaning of databases and facilitate selective testing, we have organized our database sets into different groups. Each group contains one or multiple databases. Here is how to find the names of pre-built database groups:

listDBGroups("MM285")
## # A tibble: 11 × 3
##    Title                               Description                         type 
##    <chr>                               <chr>                               <chr>
##  1 KYCG.MM285.chromHMM.20210210        chromHMM consensus from mouseENCODE cate…
##  2 KYCG.MM285.chromosome.mm10.20210630 CpG position by mm10 chromosomes f… cate…
##  3 KYCG.MM285.designGroup.20210210     MM285 probe design categories       cate…
##  4 KYCG.MM285.HMconsensus.20220116     CpGs associated with consensus his… cate…
##  5 KYCG.MM285.Mask.20220123            MM285 probe masking 20220123_MM285… cate…
##  6 KYCG.MM285.metagene.20220126        metagene coordinates with respect … cate…
##  7 KYCG.MM285.probeType.20210630       Probe type database sets (rs, cg, … cate…
##  8 KYCG.MM285.seqContext.20210630      Sequence context groups, e.g., CpG… cate…
##  9 KYCG.MM285.seqContextN.20210630     KYCG numerical database holding Se… nume…
## 10 KYCG.MM285.TFBSconsensus.20220116   CpGs associated with consensus TFB… cate…
## 11 KYCG.MM285.tissueSignature.20211211 MM285 probes associated with tissu… cate…

The listDBGroups() function returns a data frame describing these database groups. The Title column is the accession key needed by the testEnrichment function. Accessions can either be passed directly to testEnrichment or be used with the getDBs() function to retrieve the databases explicitly.

Caching these databases on the local machine is important for two reasons: it limits the number of requests sent to the Bioconductor server, and it spares the user from waiting while database sets are downloaded again. For this reason, one should run sesameDataCache() before loading any database sets. Downloading all of the database sets takes some time, but it only needs to be done once per installation.

During the analysis, database sets are identified by these accessions. knowYourCG also does some guessing when a unique substring is given. For example, the string “MM285.designGroup” retrieves the “KYCG.MM285.designGroup.20210210” database. Let’s look at the database group that we used as the query in our first example (query and database are reciprocal):

dbs <- getDBs("MM285.design")
## Selected the following database groups:
## 1. KYCG.MM285.designGroup.20210210

In total, 32 datasets have been loaded for this group. We can get the “PGCMeth” as an element of the list:

str(dbs[["PGCMeth"]])
##  chr [1:474] "cg36615889_TC11" "cg36646136_BC21" "cg36647910_BC11" ...
##  - attr(*, "group")= chr "KYCG.MM285.designGroup.20210210"
##  - attr(*, "dbname")= chr "PGCMeth"

On subsequent runs of the getDBs() function, the database loading can be faster thanks to the sesameData in-memory caching, if the corresponding database has been loaded.
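
The substring guessing described above can be pictured with a few lines of base R. This is only a sketch of the idea, not the package's implementation: resolve_title() is a hypothetical helper, and the titles are a subset of those printed by listDBGroups("MM285").

```r
# A few accession titles copied from the listDBGroups("MM285") output above.
titles <- c("KYCG.MM285.chromHMM.20210210",
            "KYCG.MM285.designGroup.20210210",
            "KYCG.MM285.TFBSconsensus.20220116")

# Hypothetical helper: accept a substring only if it matches exactly one title.
resolve_title <- function(pattern, titles) {
    hits <- grep(pattern, titles, fixed = TRUE, value = TRUE)
    if (length(hits) != 1) {
        stop("'", pattern, "' matches ", length(hits), " titles; ",
             "a unique substring is needed")
    }
    hits
}

resolve_title("MM285.designGroup", titles)

# An ambiguous substring is rejected rather than guessed.
ambiguous <- try(resolve_title("MM285", titles), silent = TRUE)
inherits(ambiguous, "try-error")
```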

Query Set(s)

A query set represents probes of interest. It may be either a character vector whose values are probe IDs or a named numeric vector whose names are probe IDs. The distinction between query and database is rather arbitrary: one can treat a database as a query and turn a query into a database, as in our first example. In real-world scenarios, a query can come from differential methylation testing, unsupervised clustering, correlation with a phenotypic trait, and many other analyses. For example, we could consider CpGs that show tissue-specific methylation as the query. Here we retrieve the B-cell-specific hypomethylation signature.

df <- rowData(sesameDataGet('MM285.tissueSignature'))
query <- df$Probe_ID[df$branch == "B_cell"]
head(query)
## [1] "cg32668003_TC11" "cg45118317_TC11" "cg37563895_TC11" "cg46105105_BC11"
## [5] "cg47206675_TC21" "cg38855216_TC21"

This query set represents hypomethylated probes in mouse B cells from the MM285 platform. This specific query set has 168 probes.
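
The two accepted query forms can be illustrated without any downloads. In the sketch below the probe IDs are taken from the output above, while the numeric values and the -0.2 cutoff are made up for illustration; thresholding a continuous query is one simple way to derive a categorical one.

```r
# Categorical query: a character vector of probe IDs.
query_discrete <- c("cg32668003_TC11", "cg45118317_TC11", "cg37563895_TC11")

# Continuous query: a named numeric vector (names are probe IDs).
# The values are made up, e.g. methylation differences between two groups.
query_continuous <- c(cg32668003_TC11 = -0.42,
                      cg45118317_TC11 = 0.05,
                      cg37563895_TC11 = -0.31)

# Deriving a categorical query from a continuous one with an
# arbitrary, illustrative cutoff.
hypo_query <- names(query_continuous)[query_continuous < -0.2]
hypo_query

is.character(query_discrete)
is.numeric(query_continuous) && !is.null(names(query_continuous))
```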

Gene Enrichment

A special case of set enrichment is testing whether CpGs are associated with specific genes. Testing every database set automatically is only practical when the number of database sets is small, and there are tens of thousands of genes on each platform. By testing only those genes that overlap with the query set, we can greatly reduce the number of tests. For this reason, gene enrichment analysis is handled as a special case of these enrichment tests. We can perform this analysis using the buildGeneDBs() function.

query <- names(sesameData_getProbesByGene("Dnmt3a", "MM285"))
results <- testEnrichment(query, 
    buildGeneDBs(query, max_distance=100000, platform="MM285"),
    platform="MM285")
main_stats <- c("dbname","estimate","gene_name","FDR", "nQ", "nD", "overlap")
results[,main_stats]
##                  dbname   estimate gene_name           FDR nQ nD overlap
## 5  ENSMUSG00000073242.4 1024.00000  Dnmt3aos 1.563399e-137 36 63      36
## 3 ENSMUSG00000020661.15 1024.00000    Dnmt3a 5.221704e-134 36 75      36
## 7  ENSMUSG00000112271.1   16.36791    Gm9088 3.532663e-118 36 60      32
## 2  ENSMUSG00000020660.6   15.45278      Pomc 6.421680e-108 36 63      30
## 1 ENSMUSG00000020658.10   13.96872     Efr3b  6.004251e-88 36 74      26
## 8  ENSMUSG00000112517.1   13.37524   Gm48001  3.814402e-70 36 60      21
## 4 ENSMUSG00000071454.13   12.51988      Dtnb  2.249011e-48 36 51      15
## 6  ENSMUSG00000092286.1   12.62688    Dtnbos  2.335881e-33 36 28      10

As expected, we recover our targeted gene (Dnmt3a).
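
The reduction in the number of tests can be sketched with a toy annotation. The gene names and probe IDs below are made up, and the filter is a plain set intersection; buildGeneDBs() additionally works with genomic distance (the max_distance argument), which this sketch does not model.

```r
# Made-up gene-level database sets: each gene maps to its associated probes.
gene_dbs <- list(
    GeneA = c("cg01", "cg02", "cg03"),
    GeneB = c("cg03", "cg04"),
    GeneC = c("cg05", "cg06"),
    GeneD = c("cg07")
)
query <- c("cg02", "cg03", "cg04")

# Count how many query probes fall in each gene set.
n_overlap <- vapply(gene_dbs, function(db) length(intersect(db, query)),
                    integer(1))

# Keep only genes that share at least one probe with the query;
# only these would need to be tested.
candidate_dbs <- gene_dbs[n_overlap > 0]
names(candidate_dbs)
length(candidate_dbs) / length(gene_dbs)   # fraction of gene sets still tested
```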

Gene enrichment testing can easily be combined with default or user-specified database sets by setting include_genes=TRUE:

query <- names(sesameData_getProbesByGene("Dnmt3a", "MM285"))
dbs <- c("KYCG.MM285.chromHMM.20210210","KYCG.HM450.TFBSconsensus.20211013",
         "KYCG.MM285.chromosome.mm10.20210630")
results <- testEnrichment(query,databases=dbs,
                          platform="MM285",include_genes=TRUE)
main_stats <- c("dbname","estimate","gene_name","FDR", "nQ", "nD", "overlap")
results[,main_stats] %>% 
    head()
##                   dbname    estimate gene_name           FDR nQ    nD overlap
## 41 ENSMUSG00000020661.15 1024.000000    Dnmt3a 6.204571e-153 36    37      36
## 22                 chr12 1024.000000      <NA>  5.564283e-51 36 10949      36
## 42  ENSMUSG00000073242.4 1024.000000  Dnmt3aos  2.893398e-31 36     8       8
## 2                   EnhG    4.008313      <NA>  5.126860e-05 36  3640       6
## 17                    Tx    2.588614      <NA>  3.081189e-04 36 17801      10
## 1                    Enh    2.800184      <NA>  3.142022e-03 36  8269       6

GO/Pathway Enrichment

One can get all the genes associated with a probe set and test the Gene Ontology of the probe-associated genes using the testGO() function, which internally uses the gprofiler2 package (an R interface to g:Profiler) for the enrichment analysis:

library(gprofiler2)
df <- rowData(sesameDataGet('MM285.tissueSignature'))
query <- df$Probe_ID[df$branch == "fetal_liver" & df$type == "Hypo"]
res <- testGO(query, platform="MM285",organism = "mmusculus")
head(res$result)
##      query significant      p_value term_size query_size intersection_size
## 6  query_1        TRUE 7.964257e-09     11561        128                92
## 7  query_1        TRUE 4.732933e-05     17412        128               109
## 25 query_1        TRUE 1.986984e-04      5296        130                59
## 12 query_1        TRUE 2.442191e-04       441        126                13
## 13 query_1        TRUE 2.442191e-04       441        126                13
## 26 query_1        TRUE 2.975598e-04      8023        130                77
##    precision      recall    term_id source
## 6  0.7187500 0.007957789 GO:0005737  GO:CC
## 7  0.8515625 0.006260051 GO:0005622  GO:CC
## 25 0.4538462 0.011140483  TF:M05599     TF
## 12 0.1031746 0.029478458 GO:0030695  GO:MF
## 13 0.1031746 0.029478458 GO:0060589  GO:MF
## 26 0.5923077 0.009597407  TF:M00189     TF
##                                       term_name effective_domain_size
## 6                                     cytoplasm                 26995
## 7            intracellular anatomical structure                 26995
## 25        Factor: WT1; motif: NGCGGGGGGGTSMMCYN                 21629
## 12                    GTPase regulator activity                 25063
## 13 nucleoside-triphosphatase regulator activity                 25063
## 26            Factor: AP-2; motif: MKCCCSCNGGCG                 21629
##    source_order      parents
## 6           309 GO:00056....
## 7           237   GO:0110165
## 25         3828    TF:M00000
## 12         3988   GO:0060589
## 13         8391   GO:0030234
## 26          118    TF:M00000

Genomic Proximity Testing

Sometimes it is of interest whether the probes in a query set lie close to one another in the genome. Co-localization may suggest co-regulation or co-occupancy in the same regulatory element. KYCG can test for genomic proximity using the testProbeProximity() function. It returns Poisson statistics: the expected number of co-localized hits for the given query size (lambda), the observed number of co-localized CpG pairs, and the p-value:

df <- rowData(sesameDataGet('MM285.tissueSignature'))
probes <- df$Probe_ID[df$branch == "fetal_liver" & df$type == "Hypo"]
res <- testProbeProximity(probeIDs=probes)
head(res)
## $Stats
##    nQ Hits Lambda        P.val
## 1 194    4   0.08 2.554721e-08
## 
## $Clusters
##   seqnames     start       end distance
## 1     chr1 165770666 165770667       11
## 2     chr1 165770677 165770678   377829
## 3     chr5  75601915  75601916       29
## 4     chr5  75601944  75601945 73617660
## 5     chr9 110235046 110235047       26
## 6     chr9 110235072 110235073       NA
## 7    chr11  32245638  32245639       95
## 8    chr11  32245733  32245734 63088309
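
The printed p-value is consistent with the upper tail of a Poisson distribution, P(X > Hits), evaluated at the printed Lambda. The sketch below reproduces it with base R. Whether testProbeProximity() computes the value in exactly this way is an assumption, Lambda is used as printed (possibly rounded), and the 100 bp cutoff for counting close pairs is a hypothetical value chosen only to illustrate the idea.

```r
# Values copied from the $Stats output above.
hits <- 4
lambda <- 0.08

# Upper-tail Poisson probability P(X > hits).
p_val <- ppois(hits, lambda, lower.tail = FALSE)
p_val

# Distances to the next probe, copied from the $Clusters output above.
distances <- c(11, 377829, 29, 73617660, 26, NA, 95, 63088309)

# Count pairs closer than a hypothetical 100 bp cutoff.
max_gap <- 100
n_close <- sum(distances <= max_gap, na.rm = TRUE)
n_close
```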

Set Enrichment Analysis

The query may be a named continuous vector. In that case, either a set enrichment score will be calculated (if the database is discrete) or a Spearman correlation will be calculated (if the database is continuous as well). Cases involving a continuous query or a continuous database are shown below using biologically relevant examples.

To display this functionality, let’s load two numeric database sets individually. One is a database set for CpG density and the other is a database set corresponding to the distance of the nearest transcriptional start site (TSS) to each probe.

query <- getDBs("KYCG.MM285.designGroup")[["TSS"]]
sesameDataCache(data_titles = c("KYCG.MM285.seqContextN.20210630"))
res <- testEnrichmentSEA(query, "MM285.seqContextN")
main_stats <- c("dbname", "test", "estimate", "FDR", "nQ", "nD", "overlap")
res[,main_stats]
##        dbname                 test   estimate        FDR    nQ     nD overlap
## 2   distToTSS Set Enrichment Score  0.7486501 0.00000000 69236 303421   69236
## 1 CpGDesity50 Set Enrichment Score -0.2626335 0.03625893 69236 297415   69236

The estimate here is the enrichment score.

NOTE: A negative enrichment score suggests enrichment of the categorical database among the higher values of the numerical database. A positive enrichment score represents enrichment among the smaller values. As expected, the designed TSS CpGs are significantly enriched for smaller TSS distance and higher CpG density.
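
The sign convention in the NOTE can be illustrated with a generic running-sum (Kolmogorov-Smirnov style) enrichment score. This is a textbook-style sketch written for this illustration, not the algorithm used by testEnrichmentSEA; toy_enrichment_score() and the probe IDs are hypothetical.

```r
# Hypothetical helper: walk through probes from the smallest to the largest
# value, stepping up for set members and down for non-members, and return
# the running-sum value with the largest absolute deviation from zero.
toy_enrichment_score <- function(values, members) {
    ordered_ids <- names(sort(values))        # ascending by value
    in_set <- ordered_ids %in% members
    step <- ifelse(in_set, 1 / sum(in_set), -1 / sum(!in_set))
    running <- cumsum(step)
    running[which.max(abs(running))]
}

# Ten made-up probes with values 1..10.
values <- setNames(1:10, paste0("cg", 1:10))

# Members concentrated at small values give a positive score ...
es_small <- toy_enrichment_score(values, c("cg1", "cg2", "cg3"))
# ... and members concentrated at large values give a negative score.
es_large <- toy_enrichment_score(values, c("cg8", "cg9", "cg10"))
c(small_values = es_small, large_values = es_large)
```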

Alternatively one can test the enrichment of a continuous query with discrete databases. Here we will use the methylation level from a sample as the query and test it against the chromHMM chromatin states.

library(sesame)
sesameDataCache(data_titles = c("MM285.1.SigDF"))
beta_values <- getBetas(sesameDataGet("MM285.1.SigDF"))
res <- testEnrichmentSEA(beta_values, "MM285.chromHMM")
main_stats <- c("dbname", "test", "estimate", "FDR", "nQ", "nD", "overlap")
res[,main_stats] 
##      dbname                 test   estimate           FDR    nQ     nD overlap
## 14      Tss Set Enrichment Score  0.8010037  0.000000e+00 41675 296070   41672
## 15   TssBiv Set Enrichment Score  0.6609816  0.000000e+00 12278 296070   12278
## 10   Quies4 Set Enrichment Score  0.3407788  0.000000e+00  6751 296070    6751
## 1       Enh Set Enrichment Score  0.3277562  0.000000e+00  8269 296070    8269
## 5     EnhPr Set Enrichment Score  0.2930447  0.000000e+00  5912 296070    5912
## 16  TssFlnk Set Enrichment Score  0.2873390  0.000000e+00  9462 296070    9461
## 12   ReprPC Set Enrichment Score  0.2365804  0.000000e+00  8858 296070    8858
## 3     EnhLo Set Enrichment Score  0.1898612 4.925381e-145  1808 296070    1808
## 17       Tx Set Enrichment Score -0.4111345  4.356632e-02 17801 296070   17801
## 13 ReprPCWk Set Enrichment Score -0.2297174  4.686427e-02  9806 296070    9805
## 11   QuiesG Set Enrichment Score -0.2460352  5.080536e-02 35428 296070   35423
## 18     TxWk Set Enrichment Score -0.3113600  6.100270e-02 14167 296070   14165
## 7     Quies Set Enrichment Score -0.3181956  6.100270e-02 96622 296070   96602
## 6       Het Set Enrichment Score -0.1748840  6.676702e-02  3575 296070    3575
## 2      EnhG Set Enrichment Score -0.1601814  6.597839e-01  3640 296070    3640
## 9    Quies3 Set Enrichment Score -0.1629812  1.000000e+00  4113 296070    4112
## 4   EnhPois Set Enrichment Score -0.1866245  1.000000e+00 12317 296070   12317
## 8    Quies2 Set Enrichment Score -0.1213437  1.000000e+00  2603 296070    2601

As expected, the chromatin states Tss and Enh have positive enrichment scores, meaning these databases are associated with small values of the query (DNA methylation level). In contrast, the Het and Quies states have negative enrichment scores, indicating association with high methylation levels, although their FDRs (about 0.06 to 0.07) are weaker in this sample.
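
The remaining scenario, a continuous query against a continuous database, rests on a Spearman correlation over the probes shared by the two vectors. A minimal base-R sketch with made-up probe IDs and values follows; how knowYourCG matches probes and reports the result is not shown here and may differ.

```r
# Made-up continuous query (e.g., methylation levels) and
# continuous database (e.g., a per-probe numeric annotation).
query <- c(cg1 = 0.10, cg2 = 0.40, cg3 = 0.35, cg4 = 0.80, cg5 = 0.90)
database <- c(cg2 = 10, cg3 = 20, cg4 = 30, cg5 = 40, cg6 = 50)

# Restrict both vectors to the probes they have in common.
shared <- intersect(names(query), names(database))

# Rank-based (Spearman) correlation over the shared probes.
rho <- cor(query[shared], database[shared], method = "spearman")
c(n_shared = length(shared), rho = rho)
```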

Session Info

sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] sesame_1.24.0               gprofiler2_0.2.3           
##  [3] SummarizedExperiment_1.36.0 Biobase_2.67.0             
##  [5] GenomicRanges_1.59.0        GenomeInfoDb_1.43.0        
##  [7] IRanges_2.41.0              S4Vectors_0.44.0           
##  [9] MatrixGenerics_1.19.0       matrixStats_1.4.1          
## [11] knitr_1.48                  sesameData_1.23.0          
## [13] ExperimentHub_2.15.0        AnnotationHub_3.15.0       
## [15] BiocFileCache_2.15.0        dbplyr_2.5.0               
## [17] BiocGenerics_0.53.0         knowYourCG_1.3.0           
## 
## loaded via a namespace (and not attached):
##  [1] DBI_1.2.3               bitops_1.0-9            rlang_1.1.4            
##  [4] magrittr_2.0.3          compiler_4.4.1          RSQLite_2.3.7          
##  [7] png_0.1-8               vctrs_0.6.5             reshape2_1.4.4         
## [10] stringr_1.5.1           pkgconfig_2.0.3         crayon_1.5.3           
## [13] fastmap_1.2.0           XVector_0.46.0          utf8_1.2.4             
## [16] rmarkdown_2.28          tzdb_0.4.0              preprocessCore_1.68.0  
## [19] UCSC.utils_1.2.0        purrr_1.0.2             bit_4.5.0              
## [22] xfun_0.48               zlibbioc_1.52.0         cachem_1.1.0           
## [25] jsonlite_1.8.9          blob_1.2.4              DelayedArray_0.33.1    
## [28] BiocParallel_1.41.0     parallel_4.4.1          R6_2.5.1               
## [31] RColorBrewer_1.1-3      bslib_0.8.0             stringi_1.8.4          
## [34] jquerylib_0.1.4         Rcpp_1.0.13             wheatmap_0.2.0         
## [37] readr_2.1.5             Matrix_1.7-1            tidyselect_1.2.1       
## [40] abind_1.4-8             yaml_2.3.10             codetools_0.2-20       
## [43] curl_5.2.3              lattice_0.22-6          tibble_3.2.1           
## [46] plyr_1.8.9              withr_3.0.2             KEGGREST_1.47.0        
## [49] evaluate_1.0.1          Biostrings_2.75.0       pillar_1.9.0           
## [52] BiocManager_1.30.25     filelock_1.0.3          plotly_4.10.4          
## [55] generics_0.1.3          RCurl_1.98-1.16         BiocVersion_3.21.1     
## [58] hms_1.1.3               ggplot2_3.5.1           munsell_0.5.1          
## [61] scales_1.3.0            glue_1.8.0              lazyeval_0.2.2         
## [64] maketools_1.3.1         tools_4.4.1             sys_3.4.3              
## [67] data.table_1.16.2       buildtools_1.0.0        grid_4.4.1             
## [70] tidyr_1.3.1             AnnotationDbi_1.69.0    colorspace_2.1-1       
## [73] GenomeInfoDbData_1.2.13 cli_3.6.3               rappdirs_0.3.3         
## [76] fansi_1.0.6             S4Arrays_1.6.0          viridisLite_0.4.2      
## [79] dplyr_1.1.4             gtable_0.3.6            sass_0.4.9             
## [82] digest_0.6.37           SparseArray_1.6.0       htmlwidgets_1.6.4      
## [85] memoise_2.0.1           htmltools_0.5.8.1       lifecycle_1.0.4        
## [88] httr_1.4.7              mime_0.12               bit64_4.5.2