Over the past years, RNA-seq data for several species have
accumulated in public repositories. Additionally, genome-wide
association studies (GWAS) have identified SNPs associated with
phenotypes of interest, such as agronomic traits in plants, production
traits in livestock, and complex human diseases. However, although GWAS
can identify SNPs, they cannot identify causative genes associated with
the studied phenotype. The goal of cageminer
is to
integrate GWAS-derived SNPs with transcriptomic data to mine candidate
genes and identify high-confidence genes associated with traits of
interest.
If you use cageminer
in your research, please cite us.
You can obtain citation information with
citation('cageminer')
, as demonstrated below:
print(citation('cageminer'), bibtex = TRUE)
#> To cite cageminer in publications use:
#>
#> Almeida-Silva, F., & Venancio, T. M. (2022). cageminer: an
#> R/Bioconductor package to prioritize candidate genes by integrating
#> genome-wide association studies and gene coexpression networks. in
#> silico Plants, 4(2), diac018.
#> https://doi.org/10.1093/insilicoplants/diac018
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Article{,
#> title = {cageminer: an R/Bioconductor package to prioritize candidate genes by integrating genome-wide association studies and gene coexpression networks},
#> author = {Fabricio Almeida-Silva and Thiago M. Venancio},
#> journal = {in silico Plants},
#> year = {2022},
#> volume = {4},
#> number = {2},
#> pages = {diac018},
#> url = {https://doi.org/10.1093/insilicoplants/diac018},
#> doi = {10.1093/insilicoplants/diac018},
#> }
For this vignette, we will use transcriptomic data on pepper (Capsicum annuum) response to Phytophthora root rot (Kim et al. 2018), and GWAS SNPs associated with resistance to Phytophthora root rot from Siddique et al. (2019). To ensure interoperability with other Bioconductor packages, expression data are stored as SummarizedExperiment objects, and gene/SNP positions are stored as GRanges objects.
# GRanges of SNP positions
data(snp_pos)
snp_pos
#> GRanges object with 116 ranges and 0 metadata columns:
#> seqnames ranges strand
#> <Rle> <IRanges> <Rle>
#> 2 Chr02 149068682 *
#> 3 Chr03 5274098 *
#> 4 Chr05 27703815 *
#> 5 Chr05 27761792 *
#> 6 Chr05 27807397 *
#> ... ... ... ...
#> 114 Chr12 230514706 *
#> 115 Chr12 230579178 *
#> 116 Chr12 230812962 *
#> 117 Chr12 230887290 *
#> 118 Chr12 231022812 *
#> -------
#> seqinfo: 8 sequences from an unspecified genome; no seqlengths
# GRanges of chromosome lengths
data(chr_length)
chr_length
#> GRanges object with 12 ranges and 0 metadata columns:
#> seqnames ranges strand
#> <Rle> <IRanges> <Rle>
#> [1] Chr01 1-272704604 *
#> [2] Chr02 1-171128871 *
#> [3] Chr03 1-257900543 *
#> [4] Chr04 1-222584275 *
#> [5] Chr05 1-233468049 *
#> ... ... ... ...
#> [8] Chr08 1-145103255 *
#> [9] Chr09 1-252779264 *
#> [10] Chr10 1-233593809 *
#> [11] Chr11 1-259726002 *
#> [12] Chr12 1-235688241 *
#> -------
#> seqinfo: 12 sequences from an unspecified genome; no seqlengths
# GRanges of gene coordinates
data(gene_ranges)
gene_ranges
#> GRanges object with 30242 ranges and 6 metadata columns:
#> seqnames ranges strand | source type score
#> <Rle> <IRanges> <Rle> | <factor> <factor> <numeric>
#> [1] Chr01 63209-63880 - | PGA1.55 gene NA
#> [2] Chr01 112298-112938 - | PGA1.55 gene NA
#> [3] Chr01 117979-118392 + | PGA1.55 gene NA
#> [4] Chr01 119464-119712 + | PGA1.55 gene NA
#> [5] Chr01 119892-120101 + | PGA1.55 gene NA
#> ... ... ... ... . ... ... ...
#> [30238] Chr12 235631138-235631467 - | PGA1.55 gene NA
#> [30239] Chr12 235642644-235645110 + | PGA1.55 gene NA
#> [30240] Chr12 235645483-235651927 - | PGA1.55 gene NA
#> [30241] Chr12 235652709-235655955 - | PGA1.55 gene NA
#> [30242] Chr12 235663655-235665276 - | PGA1.55 gene NA
#> phase ID Parent
#> <integer> <character> <CharacterList>
#> [1] <NA> CA01g00010
#> [2] <NA> CA01g00020
#> [3] <NA> CA01g00030
#> [4] <NA> CA01g00040
#> [5] <NA> CA01g00050
#> ... ... ... ...
#> [30238] <NA> CA12g22890
#> [30239] <NA> CA12g22900
#> [30240] <NA> CA12g22910
#> [30241] <NA> CA12g22920
#> [30242] <NA> CA12g22930
#> -------
#> seqinfo: 12 sequences from an unspecified genome; no seqlengths
# SummarizedExperiment of pepper response to Phytophthora root rot (RNA-seq)
data(pepper_se)
pepper_se
#> class: SummarizedExperiment
#> dim: 3892 45
#> metadata(0):
#> assays(1): ''
#> rownames(3892): CA02g23440 CA02g05510 ... CA03g35110 CA02g12750
#> rowData names(0):
#> colnames(45): PL1 PL2 ... TMV-P0-3D TMV-P0-Up
#> colData names(1): Condition
Before mining high-confidence candidates, you can visualize the SNP
distribution in the genome to explore possible patterns. First, let’s
see if SNPs are uniformly across chromosomes with
plot_snp_distribution()
.
As we can see, SNPs associated with resistance to Phytophthora root
rot tend to co-occur in chromosome 5. Now, we can see if they are close
to each other in the genome, and if they are located in gene-rich
regions. We can visualize it with plot_snp_circos
, which
displays a circos plot of SNPs across chromosomes.
There seems to be no clustering in gene-rich regions, but we can see that SNPs in the same chromosome tend to be physically close to each other.
If you have SNP positions for multiple traits, you need to store them in GRangesList or CompressedGRangesList objects, so each element will have SNP positions for a particular trait. Then, you can visualize their distribution as you would do for a single trait. Let’s simulate multiple traits to see how it works:
The cageminer
algorithm identifies high-confidence
candidate genes with 3 steps, which can be interpreted
as 3 sources of evidence:
These 3 steps can be executed individually (if users want more control on what happens after each step) or all at once.
To run the candidate mining step by step, you will need the functions
mine_step1()
, mine_step2
, and
mine_step3
.
The function mine_step1()
identifies genes based on step
1 and returns a GRanges object with all putative candidates and their
location in the genome. For that, you need to give 2 GRanges objects as
input, one with the gene coordinates1 and another with the SNP positions.
candidates1 <- mine_step1(gene_ranges, snp_pos)
candidates1
#> GRanges object with 1265 ranges and 6 metadata columns:
#> seqnames ranges strand | source type score
#> <Rle> <IRanges> <Rle> | <factor> <factor> <numeric>
#> [1] Chr02 147076830-147083477 + | PGA1.55 gene NA
#> [2] Chr02 147084450-147086637 - | PGA1.55 gene NA
#> [3] Chr02 147099482-147104002 - | PGA1.55 gene NA
#> [4] Chr02 147126373-147126537 + | PGA1.55 gene NA
#> [5] Chr02 147129897-147132335 - | PGA1.55 gene NA
#> ... ... ... ... . ... ... ...
#> [1261] Chr12 232989761-232990947 - | PGA1.55 gene NA
#> [1262] Chr12 232994658-232999784 + | PGA1.55 gene NA
#> [1263] Chr12 233001307-233004705 + | PGA1.55 gene NA
#> [1264] Chr12 233005539-233011740 - | PGA1.55 gene NA
#> [1265] Chr12 233018159-233022142 - | PGA1.55 gene NA
#> phase ID Parent
#> <integer> <character> <CharacterList>
#> [1] <NA> CA02g16550
#> [2] <NA> CA02g16560
#> [3] <NA> CA02g16570
#> [4] <NA> CA02g16580
#> [5] <NA> CA02g16590
#> ... ... ... ...
#> [1261] <NA> CA12g21190
#> [1262] <NA> CA12g21200
#> [1263] <NA> CA12g21210
#> [1264] <NA> CA12g21220
#> [1265] <NA> CA12g21230
#> -------
#> seqinfo: 12 sequences from an unspecified genome; no seqlengths
length(candidates1)
#> [1] 1265
The first step identified 1265 putative candidate genes. By default,
cageminer
uses a sliding window of 2 Mb to select putative
candidates2. If you want to visually inspect a
simulation of different sliding windows to choose a different one, you
can use simulate_windows()
.
The function mine_step2()
selects candidates in
coexpression modules enriched in guide genes. For that, users must infer
the GCN with the function exp2gcn() from the package BioNERO (Almeida-Silva and Venancio 2021). Guide genes
can be either a character vector of guide gene IDs or a data frame with
gene IDs in the first column and annotation in the second column (useful
if guides are divided in functional categories, for instance). Here,
pepper genes associated with defense-related MapMan bins were retrieved
from PLAZA 3.0 Dicots (Proost et al. 2015)
and used as guides.
The resulting object is a list of two elements:
# Load guide genes
data(guides)
head(guides)
#> Gene Description
#> 1 CA10g07770 response to stimulus
#> 2 CA10g07770 response to stress
#> 3 CA10g07770 cellular response to stimulus
#> 4 CA10g07770 cellular response to stress
#> 6 CA10g07770 regulation of cellular response to stress
#> 8 CA10g07770 regulation of response to stimulus
# Infer GCN
sft <- BioNERO::SFT_fit(pepper_se, net_type = "signed", cor_method = "pearson")
#> Warning: executing %dopar% sequentially: no parallel backend registered
#> Power SFT.R.sq slope truncated.R.sq mean.k. median.k. max.k.
#> 1 3 0.000902 0.0985 0.806 718.0 701.00 1060.0
#> 2 4 0.039500 -0.4680 0.833 470.0 451.00 807.0
#> 3 5 0.110000 -0.6540 0.851 322.0 301.00 639.0
#> 4 6 0.269000 -0.9120 0.891 229.0 209.00 520.0
#> 5 7 0.449000 -1.1200 0.920 168.0 149.00 432.0
#> 6 8 0.598000 -1.2900 0.945 126.0 109.00 364.0
#> 7 9 0.685000 -1.4300 0.949 96.8 81.00 311.0
#> 8 10 0.744000 -1.5000 0.961 75.7 61.30 268.0
#> 9 11 0.786000 -1.5800 0.964 60.2 47.00 233.0
#> 10 12 0.817000 -1.6100 0.969 48.5 36.50 204.0
#> 11 13 0.824000 -1.6600 0.966 39.5 28.80 180.0
#> 12 14 0.831000 -1.6900 0.965 32.5 23.00 159.0
#> 13 15 0.846000 -1.7000 0.972 27.1 18.30 142.0
#> 14 16 0.859000 -1.7100 0.976 22.7 14.70 127.0
#> 15 17 0.869000 -1.7200 0.981 19.2 11.90 115.0
#> 16 18 0.877000 -1.7200 0.984 16.3 9.76 103.0
#> 17 19 0.882000 -1.7200 0.986 14.0 7.97 93.7
#> 18 20 0.889000 -1.7100 0.988 12.0 6.63 85.2
gcn <- BioNERO::exp2gcn(pepper_se, net_type = "signed", cor_method = "pearson",
module_merging_threshold = 0.8, SFTpower = sft$power)
#> ..connectivity..
#> ..matrix multiplication (system BLAS)..
#> ..normalization..
#> ..done.
# Apply step 2
candidates2 <- mine_step2(pepper_se, gcn = gcn, guides = guides$Gene,
candidates = candidates1$ID)
#> Enrichment analysis for module black...
#> Enrichment analysis for module brown...
#> Enrichment analysis for module darkgreen...
#> Enrichment analysis for module darkgrey...
#> Enrichment analysis for module darkmagenta...
#> Enrichment analysis for module darkolivegreen...
#> Enrichment analysis for module darkorange...
#> Enrichment analysis for module darkorange2...
#> Enrichment analysis for module darkred...
#> Enrichment analysis for module darkturquoise...
#> Enrichment analysis for module green...
#> Enrichment analysis for module grey60...
#> Enrichment analysis for module ivory...
#> Enrichment analysis for module lightcyan...
#> Enrichment analysis for module midnightblue...
#> Enrichment analysis for module orangered4...
#> Enrichment analysis for module paleturquoise...
#> Enrichment analysis for module pink...
#> Enrichment analysis for module red...
#> Enrichment analysis for module royalblue...
#> Enrichment analysis for module salmon...
#> Enrichment analysis for module steelblue...
#> Enrichment analysis for module violet...
candidates2$candidates
#> [1] "CA10g08490" "CA03g01790" "CA10g12640" "CA12g21230" "CA10g02810"
#> [6] "CA03g01800" "CA02g17460" "CA10g02800" "CA03g03320" "CA05g14230"
#> [11] "CA07g04010" "CA05g06480" "CA03g02720" "CA10g02630" "CA12g18010"
#> [16] "CA07g04000" "CA02g16570" "CA10g02570" "CA05g15120" "CA02g16830"
#> [21] "CA12g18440" "CA12g18400" "CA10g02780" "CA07g12720" "CA03g01900"
#> [26] "CA12g07460" "CA03g02360" "CA02g16620" "CA10g08420" "CA03g02960"
#> [31] "CA03g03010" "CA05g15110" "CA02g16550" "CA05g14730" "CA02g16900"
#> [36] "CA03g03310" "CA02g17030"
candidates2$enrichment
#> term genes all pval padj category module
#> 2 guide 323 1303 2.575418e-05 5.150837e-05 Class black
After the step 2, we got 37 candidates.
The function mine_step3()
identifies candidate genes
whose expression levels significantly increase or decrease in a
particular condition. For that, you need to specify what level from the
sample metadata corresponds to this condition. The resulting object from
mine_step3()
is a data frame with mined candidates and
their correlation to the condition of interest.
# See the levels from the sample metadata
unique(pepper_se$Condition)
#> [1] "Placenta" "Pericarp" "PRR_control" "PRR_stress"
#> [5] "virus_control" "PepMov_stress" "TMV"
# Apply step 3 using "PRR_stress" as the condition of interest
candidates3 <- mine_step3(pepper_se, candidates = candidates2$candidates,
sample_group = "PRR_stress")
candidates3
#> gene trait cor pvalue group
#> 156 CA07g04010 PRR_stress 0.6441044 1.806861e-06 Condition
#> 149 CA07g04000 PRR_stress 0.6304891 3.453184e-06 Condition
#> 51 CA03g01790 PRR_stress 0.6270820 4.041221e-06 Condition
#> 191 CA10g02800 PRR_stress 0.5944358 1.666104e-05 Condition
#> 23 CA02g16830 PRR_stress 0.5896431 2.025006e-05 Condition
#> 254 CA12g21230 PRR_stress 0.5820251 2.743821e-05 Condition
#> 79 CA03g02720 PRR_stress 0.5817119 2.777850e-05 Condition
#> 128 CA05g14730 PRR_stress 0.5660078 5.072546e-05 Condition
#> 30 CA02g16900 PRR_stress 0.5633723 5.595313e-05 Condition
#> 233 CA12g18010 PRR_stress 0.5532646 8.088901e-05 Condition
#> 100 CA03g03310 PRR_stress 0.5520272 8.455364e-05 Condition
#> 16 CA02g16620 PRR_stress 0.5426992 1.174310e-04 Condition
#> 44 CA02g17460 PRR_stress 0.5415951 1.220086e-04 Condition
#> 198 CA10g02810 PRR_stress 0.5368391 1.436389e-04 Condition
#> 170 CA10g02570 PRR_stress 0.5363652 1.459748e-04 Condition
#> 58 CA03g01800 PRR_stress 0.5317864 1.703832e-04 Condition
#> 184 CA10g02780 PRR_stress 0.5255675 2.094609e-04 Condition
#> 247 CA12g18440 PRR_stress 0.5135431 3.087865e-04 Condition
#> 2 CA02g16550 PRR_stress 0.5099801 3.454712e-04 Condition
#> 72 CA03g02360 PRR_stress 0.4245213 3.655030e-03 Condition
#> 86 CA03g02960 PRR_stress 0.4132336 4.782097e-03 Condition
#> 205 CA10g08420 PRR_stress 0.4123162 4.885782e-03 Condition
#> 93 CA03g03010 PRR_stress 0.4074773 5.465748e-03 Condition
#> 212 CA10g08490 PRR_stress 0.3866456 8.700127e-03 Condition
#> 163 CA07g12720 PRR_stress 0.3829689 9.416098e-03 Condition
#> 142 CA05g15120 PRR_stress 0.3701199 1.233022e-02 Condition
#> 219 CA10g12640 PRR_stress 0.3264978 2.859993e-02 Condition
#> 9 CA02g16570 PRR_stress 0.3207866 3.167524e-02 Condition
#> 226 CA12g07460 PRR_stress 0.3076388 3.980351e-02 Condition
#> 107 CA03g03320 PRR_stress 0.3045042 4.197434e-02 Condition
Finally, we got 30 high-confidence candidate genes associated with resistance to Phytophthora root rot. Genes with negative correlation coefficients to the condition can be interpreted as having significantly reduced expression in this condition, while genes with positive correlation coefficients have significantly increased expression in this condition.
Alternatively, if you are not interested in inspecting the results
after each step, you can get to the same results from the previous
section with a single step by using the function
mine_candidates()
. This function is a wrapper that calls
mine_step1()
, sends the results to
mine_step2()
, and then it sends the results from
mine_step2()
to mine_step3()
.
candidates <- mine_candidates(gene_ranges = gene_ranges,
marker_ranges = snp_pos,
exp = pepper_se,
gcn = gcn, guides = guides$Gene,
sample_group = "PRR_stress")
#> Enrichment analysis for module black...
#> Enrichment analysis for module brown...
#> Enrichment analysis for module darkgreen...
#> Enrichment analysis for module darkgrey...
#> Enrichment analysis for module darkmagenta...
#> Enrichment analysis for module darkolivegreen...
#> Enrichment analysis for module darkorange...
#> Enrichment analysis for module darkorange2...
#> Enrichment analysis for module darkred...
#> Enrichment analysis for module darkturquoise...
#> Enrichment analysis for module green...
#> Enrichment analysis for module grey60...
#> Enrichment analysis for module ivory...
#> Enrichment analysis for module lightcyan...
#> Enrichment analysis for module midnightblue...
#> Enrichment analysis for module orangered4...
#> Enrichment analysis for module paleturquoise...
#> Enrichment analysis for module pink...
#> Enrichment analysis for module red...
#> Enrichment analysis for module royalblue...
#> Enrichment analysis for module salmon...
#> Enrichment analysis for module steelblue...
#> Enrichment analysis for module violet...
candidates
#> gene trait cor pvalue group
#> 156 CA07g04010 PRR_stress 0.6441044 1.806861e-06 Condition
#> 149 CA07g04000 PRR_stress 0.6304891 3.453184e-06 Condition
#> 51 CA03g01790 PRR_stress 0.6270820 4.041221e-06 Condition
#> 191 CA10g02800 PRR_stress 0.5944358 1.666104e-05 Condition
#> 23 CA02g16830 PRR_stress 0.5896431 2.025006e-05 Condition
#> 254 CA12g21230 PRR_stress 0.5820251 2.743821e-05 Condition
#> 79 CA03g02720 PRR_stress 0.5817119 2.777850e-05 Condition
#> 128 CA05g14730 PRR_stress 0.5660078 5.072546e-05 Condition
#> 30 CA02g16900 PRR_stress 0.5633723 5.595313e-05 Condition
#> 233 CA12g18010 PRR_stress 0.5532646 8.088901e-05 Condition
#> 100 CA03g03310 PRR_stress 0.5520272 8.455364e-05 Condition
#> 16 CA02g16620 PRR_stress 0.5426992 1.174310e-04 Condition
#> 44 CA02g17460 PRR_stress 0.5415951 1.220086e-04 Condition
#> 198 CA10g02810 PRR_stress 0.5368391 1.436389e-04 Condition
#> 170 CA10g02570 PRR_stress 0.5363652 1.459748e-04 Condition
#> 58 CA03g01800 PRR_stress 0.5317864 1.703832e-04 Condition
#> 184 CA10g02780 PRR_stress 0.5255675 2.094609e-04 Condition
#> 247 CA12g18440 PRR_stress 0.5135431 3.087865e-04 Condition
#> 2 CA02g16550 PRR_stress 0.5099801 3.454712e-04 Condition
#> 72 CA03g02360 PRR_stress 0.4245213 3.655030e-03 Condition
#> 86 CA03g02960 PRR_stress 0.4132336 4.782097e-03 Condition
#> 205 CA10g08420 PRR_stress 0.4123162 4.885782e-03 Condition
#> 93 CA03g03010 PRR_stress 0.4074773 5.465748e-03 Condition
#> 212 CA10g08490 PRR_stress 0.3866456 8.700127e-03 Condition
#> 163 CA07g12720 PRR_stress 0.3829689 9.416098e-03 Condition
#> 142 CA05g15120 PRR_stress 0.3701199 1.233022e-02 Condition
#> 219 CA10g12640 PRR_stress 0.3264978 2.859993e-02 Condition
#> 9 CA02g16570 PRR_stress 0.3207866 3.167524e-02 Condition
#> 226 CA12g07460 PRR_stress 0.3076388 3.980351e-02 Condition
#> 107 CA03g03320 PRR_stress 0.3045042 4.197434e-02 Condition
In some cases, you might have more high-confidence candidates than
you expected, and you want to pick only the top n genes for
validation, for instance. In this scenario, you need to assign scores to
your mined candidates to pick the top n genes with the highest
scores. The function score_genes()
does that by using the
formula below:
Si = rpbκ
where:
κ = 2 if the gene encodes a transcription factor
κ = 2 if the gene is a hub
κ = 3 if the gene encodes a hub transcription factor
κ = 1 if none of the conditions above are true
By default, score_genes
picks the top 10 candidates. If
there are less than 10 candidates, it will return all candidates sorted
by scores. Here, TFs were obtained from PlantTFDB 4.0 (Jin et al. 2017). Hub genes can be identified
with the function get_hubs_gcn()
from the package
BioNERO.
# Load TFs
data(tfs)
head(tfs)
#> Gene_ID Family
#> 1 CA12g20650 RAV
#> 2 CA00g00130 WRKY
#> 3 CA00g00230 WRKY
#> 4 CA00g00390 LBD
#> 5 CA00g03050 NAC
#> 6 CA00g07140 E2F/DP
# Get GCN hubs
hubs <- BioNERO::get_hubs_gcn(pepper_se, gcn)
head(hubs)
#> Gene Module kWithin
#> 1 CA12g18900 black 161.1001
#> 2 CA07g15040 black 159.9350
#> 3 CA07g15050 black 157.9800
#> 4 CA05g00150 black 153.9266
#> 5 CA03g08130 black 152.7083
#> 6 CA10g03980 black 148.5663
# Score candidates
scored <- score_genes(candidates, hubs$Gene, tfs$Gene_ID)
scored
#> gene trait cor pvalue group score
#> 191 CA10g02800 PRR_stress 0.5944358 1.666104e-05 Condition 1.1888715
#> 23 CA02g16830 PRR_stress 0.5896431 2.025006e-05 Condition 1.1792862
#> 44 CA02g17460 PRR_stress 0.5415951 1.220086e-04 Condition 1.0831901
#> 58 CA03g01800 PRR_stress 0.5317864 1.703832e-04 Condition 1.0635728
#> 72 CA03g02360 PRR_stress 0.4245213 3.655030e-03 Condition 0.8490427
#> 156 CA07g04010 PRR_stress 0.6441044 1.806861e-06 Condition 0.6441044
#> 149 CA07g04000 PRR_stress 0.6304891 3.453184e-06 Condition 0.6304891
#> 51 CA03g01790 PRR_stress 0.6270820 4.041221e-06 Condition 0.6270820
#> 254 CA12g21230 PRR_stress 0.5820251 2.743821e-05 Condition 0.5820251
#> 79 CA03g02720 PRR_stress 0.5817119 2.777850e-05 Condition 0.5817119
As none of the mined candidates are hubs or encode transcription factors, their scores are simply their correlation coefficients with the condition of interest.
This document was created under the following conditions:
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.4.2 (2024-10-31)
#> os Ubuntu 24.04.1 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate C
#> ctype en_US.UTF-8
#> tz Etc/UTC
#> date 2024-11-19
#> pandoc 3.2.1 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> abind 1.4-8 2024-09-12 [2] RSPM (R 4.4.0)
#> annotate 1.85.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> AnnotationDbi 1.69.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> AnnotationFilter 1.31.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> backports 1.5.0 2024-05-23 [2] RSPM (R 4.4.0)
#> base64enc 0.1-3 2015-07-28 [2] RSPM (R 4.4.0)
#> Biobase 2.67.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> BiocFileCache 2.15.0 2024-11-19 [2] https://bioc.r-universe.dev (R 4.4.2)
#> BiocGenerics 0.53.3 2024-11-15 [2] https://bioc.r-universe.dev (R 4.4.2)
#> BiocIO 1.17.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> BiocManager 1.30.25 2024-08-28 [2] RSPM (R 4.4.0)
#> BiocParallel 1.41.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> BiocStyle * 2.35.0 2024-11-19 [2] https://bioc.r-universe.dev (R 4.4.2)
#> biomaRt 2.63.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> BioNERO 1.15.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> Biostrings 2.75.1 2024-11-07 [2] https://bioc.r-universe.dev (R 4.4.2)
#> biovizBase 1.55.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> bit 4.5.0 2024-09-20 [2] RSPM (R 4.4.0)
#> bit64 4.5.2 2024-09-22 [2] RSPM (R 4.4.0)
#> bitops 1.0-9 2024-10-03 [2] RSPM (R 4.4.0)
#> blob 1.2.4 2023-03-17 [2] RSPM (R 4.4.0)
#> BSgenome 1.75.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> bslib 0.8.0 2024-07-29 [2] RSPM (R 4.4.0)
#> buildtools 1.0.0 2024-11-18 [3] local (/pkg)
#> cachem 1.1.0 2024-05-16 [2] RSPM (R 4.4.0)
#> cageminer * 1.13.0 2024-11-19 [1] https://bioc.r-universe.dev (R 4.4.2)
#> checkmate 2.3.2 2024-07-29 [2] RSPM (R 4.4.0)
#> circlize 0.4.16 2024-02-20 [2] RSPM (R 4.4.0)
#> cli 3.6.3 2024-06-21 [2] RSPM (R 4.4.0)
#> clue 0.3-66 2024-11-13 [2] RSPM (R 4.4.0)
#> cluster 2.1.6 2023-12-01 [2] RSPM (R 4.4.0)
#> coda 0.19-4.1 2024-01-31 [2] RSPM (R 4.4.0)
#> codetools 0.2-20 2024-03-31 [2] RSPM (R 4.4.0)
#> colorspace 2.1-1 2024-07-26 [2] RSPM (R 4.4.0)
#> commonmark 1.9.2 2024-10-04 [2] RSPM (R 4.4.0)
#> ComplexHeatmap 2.23.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> crayon 1.5.3 2024-06-20 [2] RSPM (R 4.4.0)
#> curl 6.0.1 2024-11-14 [2] RSPM (R 4.4.0)
#> data.table 1.16.2 2024-10-10 [2] RSPM (R 4.4.0)
#> DBI 1.2.3 2024-06-02 [2] RSPM (R 4.4.0)
#> dbplyr 2.5.0 2024-03-19 [2] RSPM (R 4.4.0)
#> DelayedArray 0.33.2 2024-11-15 [2] https://bioc.r-universe.dev (R 4.4.2)
#> dichromat 2.0-0.1 2022-05-02 [2] RSPM (R 4.4.0)
#> digest 0.6.37 2024-08-19 [2] RSPM (R 4.4.0)
#> doParallel 1.0.17 2022-02-07 [2] RSPM (R 4.4.0)
#> dplyr 1.1.4 2023-11-17 [2] RSPM (R 4.4.0)
#> dynamicTreeCut 1.63-1 2016-03-11 [2] RSPM (R 4.4.0)
#> edgeR 4.5.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> ensembldb 2.31.0 2024-11-01 [2] https://bioc.r-universe.dev (R 4.4.1)
#> evaluate 1.0.1 2024-10-10 [2] RSPM (R 4.4.0)
#> fansi 1.0.6 2023-12-08 [2] RSPM (R 4.4.0)
#> farver 2.1.2 2024-05-13 [2] RSPM (R 4.4.0)
#> fastcluster 1.2.6 2024-01-12 [2] RSPM (R 4.4.0)
#> fastmap 1.2.0 2024-05-15 [2] RSPM (R 4.4.0)
#> filelock 1.0.3 2023-12-11 [2] RSPM (R 4.4.0)
#> foreach 1.5.2 2022-02-02 [2] RSPM (R 4.4.0)
#> foreign 0.8-87 2024-06-26 [2] RSPM (R 4.4.0)
#> Formula 1.2-5 2023-02-24 [2] RSPM (R 4.4.0)
#> genefilter 1.89.0 2024-11-18 [2] https://bioc.r-universe.dev (R 4.4.2)
#> generics 0.1.3 2022-07-05 [2] RSPM (R 4.4.0)
#> GENIE3 1.29.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> GenomeInfoDb 1.43.1 2024-11-18 [2] https://bioc.r-universe.dev (R 4.4.2)
#> GenomeInfoDbData 1.2.13 2024-11-19 [2] Bioconductor
#> GenomicAlignments 1.43.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> GenomicFeatures 1.59.1 2024-11-08 [2] https://bioc.r-universe.dev (R 4.4.2)
#> GenomicRanges 1.59.1 2024-11-19 [2] https://bioc.r-universe.dev (R 4.4.2)
#> GetoptLong 1.0.5 2020-12-15 [2] RSPM (R 4.4.0)
#> GGally 2.2.1 2024-02-14 [2] RSPM (R 4.4.0)
#> ggbio 1.55.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> ggdendro 0.2.0 2024-02-23 [2] RSPM (R 4.4.0)
#> ggnetwork 0.5.13 2024-02-14 [2] RSPM (R 4.4.0)
#> ggplot2 3.5.1 2024-04-23 [2] RSPM (R 4.4.0)
#> ggrepel 0.9.6 2024-09-07 [2] RSPM (R 4.4.0)
#> ggstats 0.7.0 2024-09-22 [2] RSPM (R 4.4.0)
#> ggtext 0.1.2 2022-09-16 [2] RSPM (R 4.4.0)
#> GlobalOptions 0.1.2 2020-06-10 [2] RSPM (R 4.4.0)
#> glue 1.8.0 2024-09-30 [2] RSPM (R 4.4.0)
#> GO.db 3.20.0 2024-11-19 [2] Bioconductor
#> graph 1.85.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> gridExtra 2.3 2017-09-09 [2] RSPM (R 4.4.0)
#> gridtext 0.1.5 2022-09-16 [2] RSPM (R 4.4.0)
#> gtable 0.3.6 2024-10-25 [2] RSPM (R 4.4.0)
#> Hmisc 5.2-0 2024-10-28 [2] RSPM (R 4.4.0)
#> hms 1.1.3 2023-03-21 [2] RSPM (R 4.4.0)
#> htmlTable 2.4.3 2024-07-21 [2] RSPM (R 4.4.0)
#> htmltools 0.5.8.1 2024-04-04 [2] RSPM (R 4.4.0)
#> htmlwidgets 1.6.4 2023-12-06 [2] RSPM (R 4.4.0)
#> httr 1.4.7 2023-08-15 [2] RSPM (R 4.4.0)
#> httr2 1.0.6 2024-11-04 [2] RSPM (R 4.4.0)
#> igraph 2.1.1 2024-10-19 [2] RSPM (R 4.4.0)
#> impute 1.81.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> intergraph 2.0-4 2024-02-01 [2] RSPM (R 4.4.0)
#> IRanges 2.41.1 2024-11-17 [2] https://bioc.r-universe.dev (R 4.4.2)
#> iterators 1.0.14 2022-02-05 [2] RSPM (R 4.4.0)
#> jquerylib 0.1.4 2021-04-26 [2] RSPM (R 4.4.0)
#> jsonlite 1.8.9 2024-09-20 [2] RSPM (R 4.4.0)
#> KEGGREST 1.47.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> knitr 1.49 2024-11-08 [2] RSPM (R 4.4.0)
#> labeling 0.4.3 2023-08-29 [2] RSPM (R 4.4.0)
#> lattice 0.22-6 2024-03-20 [2] RSPM (R 4.4.0)
#> lazyeval 0.2.2 2019-03-15 [2] RSPM (R 4.4.0)
#> lifecycle 1.0.4 2023-11-07 [2] RSPM (R 4.4.0)
#> limma 3.63.2 2024-11-11 [2] https://bioc.r-universe.dev (R 4.4.2)
#> locfit 1.5-9.10 2024-06-24 [2] RSPM (R 4.4.0)
#> magrittr 2.0.3 2022-03-30 [2] RSPM (R 4.4.0)
#> maketools 1.3.1 2024-10-04 [3] RSPM (R 4.4.0)
#> markdown 1.13 2024-06-04 [2] RSPM (R 4.4.0)
#> MASS 7.3-61 2024-06-13 [2] RSPM (R 4.4.0)
#> Matrix 1.7-1 2024-10-18 [2] RSPM (R 4.4.0)
#> MatrixGenerics 1.19.0 2024-11-06 [2] https://bioc.r-universe.dev (R 4.4.2)
#> matrixStats 1.4.1 2024-09-08 [2] RSPM (R 4.4.0)
#> memoise 2.0.1 2021-11-26 [2] RSPM (R 4.4.0)
#> mgcv 1.9-1 2023-12-21 [2] RSPM (R 4.4.0)
#> minet 3.65.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> munsell 0.5.1 2024-04-01 [2] RSPM (R 4.4.0)
#> NetRep 1.2.7 2023-08-19 [2] RSPM (R 4.4.0)
#> network 1.18.2 2023-12-05 [2] RSPM (R 4.4.0)
#> nlme 3.1-166 2024-08-14 [2] RSPM (R 4.4.0)
#> nnet 7.3-19 2023-05-03 [2] RSPM (R 4.4.0)
#> OrganismDbi 1.49.0 2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#> patchwork 1.3.0 2024-09-16 [2] RSPM (R 4.4.0)
#> pillar 1.9.0 2023-03-22 [2] RSPM (R 4.4.0)
#> pkgconfig 2.0.3 2019-09-22 [2] RSPM (R 4.4.0)
#> plyr 1.8.9 2023-10-02 [2] RSPM (R 4.4.0)
#> png 0.1-8 2022-11-29 [2] RSPM (R 4.4.0)
#> preprocessCore 1.69.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> prettyunits 1.2.0 2023-09-24 [2] RSPM (R 4.4.0)
#> progress 1.2.3 2023-12-06 [2] RSPM (R 4.4.0)
#> ProtGenerics 1.39.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> purrr 1.0.2 2023-08-10 [2] RSPM (R 4.4.0)
#> R6 2.5.1 2021-08-19 [2] RSPM (R 4.4.0)
#> rappdirs 0.3.3 2021-01-31 [2] RSPM (R 4.4.0)
#> RBGL 1.83.0 2024-11-03 [2] https://bioc.r-universe.dev (R 4.4.1)
#> RColorBrewer 1.1-3 2022-04-03 [2] RSPM (R 4.4.0)
#> Rcpp 1.0.13-1 2024-11-02 [2] RSPM (R 4.4.0)
#> RCurl 1.98-1.16 2024-07-11 [2] RSPM (R 4.4.0)
#> reshape2 1.4.4 2020-04-09 [2] RSPM (R 4.4.0)
#> restfulr 0.0.15 2022-06-16 [2] RSPM (R 4.4.2)
#> RhpcBLASctl 0.23-42 2023-02-11 [2] RSPM (R 4.4.0)
#> rjson 0.2.23 2024-09-16 [2] RSPM (R 4.4.0)
#> rlang 1.1.4 2024-06-04 [2] RSPM (R 4.4.0)
#> rmarkdown 2.29 2024-11-04 [2] RSPM (R 4.4.0)
#> rpart 4.1.23 2023-12-05 [2] RSPM (R 4.4.0)
#> Rsamtools 2.23.0 2024-11-18 [2] https://bioc.r-universe.dev (R 4.4.2)
#> RSQLite 2.3.8 2024-11-17 [2] RSPM (R 4.4.0)
#> rstudioapi 0.17.1 2024-10-22 [2] RSPM (R 4.4.0)
#> rtracklayer 1.67.0 2024-11-06 [2] https://bioc.r-universe.dev (R 4.4.2)
#> S4Arrays 1.7.1 2024-11-18 [2] https://bioc.r-universe.dev (R 4.4.2)
#> S4Vectors 0.45.2 2024-11-16 [2] https://bioc.r-universe.dev (R 4.4.2)
#> sass 0.4.9 2024-03-15 [2] RSPM (R 4.4.0)
#> scales 1.3.0 2023-11-28 [2] RSPM (R 4.4.0)
#> sessioninfo 1.2.2 2021-12-06 [2] RSPM (R 4.4.0)
#> shape 1.4.6.1 2024-02-23 [2] RSPM (R 4.4.0)
#> SparseArray 1.7.2 2024-11-15 [2] https://bioc.r-universe.dev (R 4.4.2)
#> statmod 1.5.0 2023-01-06 [2] RSPM (R 4.4.0)
#> statnet.common 4.10.0 2024-10-06 [2] RSPM (R 4.4.0)
#> stringi 1.8.4 2024-05-06 [2] RSPM (R 4.4.0)
#> stringr 1.5.1 2023-11-14 [2] RSPM (R 4.4.0)
#> SummarizedExperiment 1.37.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> survival 3.7-0 2024-06-05 [2] RSPM (R 4.4.0)
#> sva 3.55.0 2024-11-18 [2] https://bioc.r-universe.dev (R 4.4.2)
#> sys 3.4.3 2024-10-04 [2] RSPM (R 4.4.0)
#> tibble 3.2.1 2023-03-20 [2] RSPM (R 4.4.0)
#> tidyr 1.3.1 2024-01-24 [2] RSPM (R 4.4.0)
#> tidyselect 1.2.1 2024-03-11 [2] RSPM (R 4.4.0)
#> txdbmaker 1.3.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> UCSC.utils 1.3.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> utf8 1.2.4 2023-10-22 [2] RSPM (R 4.4.0)
#> VariantAnnotation 1.53.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> vctrs 0.6.5 2023-12-01 [2] RSPM (R 4.4.0)
#> WGCNA 1.73 2024-09-18 [2] RSPM (R 4.4.2)
#> withr 3.0.2 2024-10-28 [2] RSPM (R 4.4.0)
#> xfun 0.49 2024-10-31 [2] RSPM (R 4.4.0)
#> XML 3.99-0.17 2024-06-25 [2] RSPM (R 4.4.0)
#> xml2 1.3.6 2023-12-04 [2] RSPM (R 4.4.0)
#> xtable 1.8-4 2019-04-21 [2] RSPM (R 4.4.0)
#> XVector 0.47.0 2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#> yaml 2.3.10 2024-07-26 [2] RSPM (R 4.4.0)
#> zlibbioc 1.52.0 2024-10-29 [2] Bioconductor 3.20 (R 4.4.2)
#>
#> [1] /tmp/RtmpvcFbrv/Rinst29ce3a66c427
#> [2] /github/workspace/pkglib
#> [3] /usr/local/lib/R/site-library
#> [4] /usr/lib/R/site-library
#> [5] /usr/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
Tip: to create GRanges objects from
genomic coordinates in GFF/GTF files, you can use the
import()
function from the Bioconductor package rtracklayer
(Lawrence, Gentleman, and Carey 2009).↩︎
Note: By default, SNPs coordinates will
be expanded upstream and downstream according to the input window size.
However, you may have previously determined genomic intervals for each
SNP (e.g., calculated based on linkage disequilibrium) for which you
want to extract genes. To avoid expanding a sliding window in such
cases, set expand_intervals = FALSE
. This will ensure that
only SNPs are expanded, but not intervals (width >1).↩︎