library(TDbasedUFE)
library(TDbasedUFEadv)
#>
library(DOSE)
#> DOSE v4.1.0 Learn more at https://yulab-smu.top/contribution-knowledge-mining/
#>
#> Please cite:
#>
#> Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He. DOSE: an
#> R/Bioconductor package for Disease Ontology Semantic and Enrichment
#> analysis. Bioinformatics. 2015, 31(4):608-609
library(enrichplot)
#> enrichplot v1.27.1 Learn more at https://yulab-smu.top/contribution-knowledge-mining/
#>
#> Please cite:
#>
#> Guangchuang Yu, Li-Gen Wang, and Qing-Yu He. ChIPseeker: an
#> R/Bioconductor package for ChIP peak annotation, comparison and
#> visualization. Bioinformatics. 2015, 31(14):2382-2383
library(RTCGA.rnaseq)
#> Loading required package: RTCGA
#> Welcome to the RTCGA (version: 1.37.0). Read more about the project under https://rtcga.github.io/RTCGA/
library(RTCGA.clinical)
library(enrichR)
#> Welcome to enrichR
#> Checking connection ...
#> Enrichr ... Connection is Live!
#> FlyEnrichr ... Connection is Live!
#> WormEnrichr ... Connection is Live!
#> YeastEnrichr ... Connection is Live!
#> FishEnrichr ... Connection is Live!
#> OxEnrichr ... Connection is Live!
library(STRINGdb)
It can be helpful to see how selected genes are evaluated by enrichment analysis. Here, we apply several useful tools to the output of TDbasedUFEadv. To do so, we first reproduce one example from “How to use TDbasedUFEadv”, as follows.
Multi <- list(
BLCA.rnaseq[seq_len(100), 1 + seq_len(1000)],
BRCA.rnaseq[seq_len(100), 1 + seq_len(1000)],
CESC.rnaseq[seq_len(100), 1 + seq_len(1000)],
COAD.rnaseq[seq_len(100), 1 + seq_len(1000)]
)
Z <- prepareTensorfromList(Multi, 10L)
Z <- aperm(Z, c(2, 1, 3))
Clinical <- list(BLCA.clinical, BRCA.clinical, CESC.clinical, COAD.clinical)
Multi_sample <- list(
BLCA.rnaseq[seq_len(100), 1, drop = FALSE],
BRCA.rnaseq[seq_len(100), 1, drop = FALSE],
CESC.rnaseq[seq_len(100), 1, drop = FALSE],
COAD.rnaseq[seq_len(100), 1, drop = FALSE]
)
# patient.stage_event.tnm_categories.pathologic_categories.pathologic_m
ID_column_of_Multi_sample <- c(770, 1482, 773, 791)
# patient.bcr_patient_barcode
ID_column_of_Clinical <- c(20, 20, 12, 14)
Z <- PrepareSummarizedExperimentTensor(
feature = colnames(ACC.rnaseq)[1 + seq_len(1000)],
sample = array("", 1), value = Z,
sampleData = prepareCondTCGA(
Multi_sample, Clinical,
ID_column_of_Multi_sample, ID_column_of_Clinical
)
)
HOSVD <- computeHosvd(Z)
#> |======================================================================| 100%
cond <- attr(Z, "sampleData")
index <- selectFeatureProj(HOSVD, Multi, cond, de = 1e-3, input_all = 3) # Batch mode
head(tableFeatures(Z, index))
#> Feature p value adjusted p value
#> 10 ACTB|60 0.000000e+00 0.000000e+00
#> 11 ACTG1|71 0.000000e+00 0.000000e+00
#> 37 ALDOA|226 0.000000e+00 0.000000e+00
#> 19 ADAM6|8755 5.698305e-299 1.424576e-296
#> 22 AEBP1|165 1.057392e-218 2.114785e-216
#> 9 ACTA2|59 7.862975e-198 1.310496e-195
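# Each feature label has the form "SYMBOL|ENTREZ_ID". Split it once, then
# take the first field as the gene symbol and the second as the Entrez ID.
feature_parts <- strsplit(tableFeatures(Z, index)[, 1], "|", fixed = TRUE)
genes <- vapply(feature_parts, "[", character(1), 1)
entrez <- vapply(feature_parts, "[", character(1), 2)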
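As a small illustration of the splitting step above, the following sketch applies the same `strsplit()` call to three of the feature labels shown in the table. The object names `features`, `parts`, `symbols` and `entrez_ids`, and the use of `vapply()`, are choices made for this illustration only; they are not part of TDbasedUFEadv.

```r
# Three feature labels copied from the table above ("SYMBOL|ENTREZ_ID").
features <- c("ACTB|60", "ACTG1|71", "ALDOA|226")

# fixed = TRUE treats "|" as a literal separator, not as regex alternation.
parts <- strsplit(features, "|", fixed = TRUE)

# The first field is the gene symbol, the second field is the Entrez ID.
# Both are returned as character vectors.
symbols <- vapply(parts, "[", character(1), 1)
entrez_ids <- vapply(parts, "[", character(1), 2)

print(symbols)
print(entrez_ids)
```

The gene symbols are what Enrichr and STRING receive in the rest of this vignette; the Entrez IDs are kept for tools that expect that kind of identifier instead.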
Enrichr (Kuleshov et al. 2016) is one of the tools that often gives significant results for genes selected by TDbasedUFE and TDbasedUFEadv.
setEnrichrSite("Enrichr")
#> Connection changed to https://maayanlab.cloud/Enrichr/
#> Connection is Live!
websiteLive <- TRUE
dbs <- c(
"GO_Molecular_Function_2015", "GO_Cellular_Component_2015",
"GO_Biological_Process_2015"
)
enriched <- enrichr(genes, dbs)
#> Uploading data to Enrichr... Done.
#> Querying GO_Molecular_Function_2015... Done.
#> Querying GO_Cellular_Component_2015... Done.
#> Querying GO_Biological_Process_2015... Done.
#> Parsing results... Done.
if (websiteLive) {
plotEnrich(enriched$GO_Biological_Process_2015,
showTerms = 20, numChar = 40, y = "Count",
orderBy = "P.value"
)
}
Enrichr offers a very large number of enrichment analyses, and in our experience many of them work well with the genes selected by TDbasedUFE and TDbasedUFEadv. Please check the Enrichr web site to see which kinds of enrichment analysis are available.
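Since `enrichr()` returns a list with one data frame per queried database, a common next step is to keep only the terms that pass a significance cutoff. The sketch below assumes each data frame has `Term` and `Adjusted.P.value` columns, as the enrichR result tables appear to have. The helper `keep_significant()`, the cutoff of 0.05, and the toy list (made-up placeholder rows, not real Enrichr output) are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: keep rows whose adjusted p-value is below `alpha`,
# ordered from most to least significant.
keep_significant <- function(tbl, alpha = 0.05) {
  keep <- !is.na(tbl$Adjusted.P.value) & tbl$Adjusted.P.value < alpha
  hits <- tbl[keep, , drop = FALSE]
  hits[order(hits$Adjusted.P.value), , drop = FALSE]
}

# Toy stand-in for the list returned by enrichr(); the terms and values
# are placeholders and carry no biological meaning.
toy <- list(
  db_one = data.frame(
    Term = c("term_A", "term_B", "term_C"),
    Adjusted.P.value = c(0.20, 0.001, 0.03),
    stringsAsFactors = FALSE
  ),
  db_two = data.frame(
    Term = c("term_D", "term_E"),
    Adjusted.P.value = c(0.5, 0.9),
    stringsAsFactors = FALSE
  )
)

# Apply the same filter to every database in the list.
significant <- lapply(toy, keep_significant)
print(significant$db_one$Term)
print(nrow(significant$db_two))
```

With the objects from this vignette, the equivalent call would be `lapply(enriched, keep_significant)`, which makes it easy to see which of the queried databases contain any significant terms at all.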
STRING (Szklarczyk et al. 2018) performs enrichment analysis based on protein-protein interactions, and it is known to often give significant results for genes selected by TDbasedUFE and TDbasedUFEadv.
options(timeout = 200)
string_db <- STRINGdb$new(
version = "11.5",
species = 9606, score_threshold = 200,
network_type = "full", input_directory = ""
)
example1_mapped <- string_db$map(data.frame(genes = genes),
"genes",
removeUnmappedRows = TRUE
)
#> Warning: we couldn't map to STRING 1% of your identifiers
hits <- example1_mapped$STRING_id
string_db$plot_network(hits)
Although the tools above provide enough information to evaluate the genes selected by TDbasedUFE and TDbasedUFEadv, one might want an all-in-one package when it is not clear which category should be evaluated in enrichment analysis.
In this case, we would recommend Metascape (Zhou et al. 2019), which unfortunately
cannot be accessed from R. Thus, we recommend RITAN as an
alternative. It can list the significant terms across multiple categories.
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] STRINGdb_2.19.0 enrichR_3.2
#> [3] RTCGA.clinical_20151101.36.0 RTCGA.rnaseq_20151101.36.0
#> [5] RTCGA_1.37.0 enrichplot_1.27.1
#> [7] DOSE_4.1.0 TDbasedUFEadv_1.7.0
#> [9] TDbasedUFE_1.7.0 BiocStyle_2.35.0
#>
#> loaded via a namespace (and not attached):
#> [1] rTensor_1.4.8 splines_4.4.2 later_1.3.2
#> [4] bitops_1.0-9 ggplotify_0.1.2 tibble_3.2.1
#> [7] R.oo_1.27.0 XML_3.99-0.17 lifecycle_1.0.4
#> [10] rstatix_0.7.2 lattice_0.22-6 backports_1.5.0
#> [13] magrittr_2.0.3 sass_0.4.9 rmarkdown_2.29
#> [16] jquerylib_0.1.4 yaml_2.3.10 plotrix_3.8-4
#> [19] httpuv_1.6.15 ggtangle_0.0.4 cowplot_1.1.3
#> [22] DBI_1.2.3 buildtools_1.0.0 RColorBrewer_1.1-3
#> [25] abind_1.4-8 MOFAdata_1.22.0 zlibbioc_1.52.0
#> [28] rvest_1.0.4 GenomicRanges_1.59.1 purrr_1.0.2
#> [31] R.utils_2.12.3 BiocGenerics_0.53.3 RCurl_1.98-1.16
#> [34] hash_2.2.6.3 yulab.utils_0.1.8 WriteXLS_6.7.0
#> [37] GenomeInfoDbData_1.2.13 IRanges_2.41.1 KMsurv_0.1-5
#> [40] S4Vectors_0.45.2 ggrepel_0.9.6 tidytree_0.4.6
#> [43] maketools_1.3.1 proto_1.0.0 codetools_0.2-20
#> [46] xml2_1.3.6 tximportData_1.34.0 tidyselect_1.2.1
#> [49] aplot_0.2.3 UCSC.utils_1.3.0 farver_2.1.2
#> [52] viridis_0.6.5 stats4_4.4.2 jsonlite_1.8.9
#> [55] Formula_1.2-5 survival_3.7-0 tools_4.4.2
#> [58] chron_2.3-61 treeio_1.31.0 Rcpp_1.0.13-1
#> [61] glue_1.8.0 gridExtra_2.3 xfun_0.49
#> [64] qvalue_2.39.0 ggthemes_5.1.0 GenomeInfoDb_1.43.1
#> [67] dplyr_1.1.4 withr_3.0.2 BiocManager_1.30.25
#> [70] fastmap_1.2.0 fansi_1.0.6 caTools_1.18.3
#> [73] digest_0.6.37 R6_2.5.1 mime_0.12
#> [76] gridGraphics_0.5-1 colorspace_2.1-1 GO.db_3.20.0
#> [79] gtools_3.9.5 RSQLite_2.3.8 R.methodsS3_1.8.2
#> [82] utf8_1.2.4 tidyr_1.3.1 generics_0.1.3
#> [85] data.table_1.16.2 httr_1.4.7 sqldf_0.4-11
#> [88] pkgconfig_2.0.3 gtable_0.3.6 blob_1.2.4
#> [91] XVector_0.47.0 sys_3.4.3 survMisc_0.5.6
#> [94] htmltools_0.5.8.1 carData_3.0-5 fgsea_1.33.0
#> [97] scales_1.3.0 Biobase_2.67.0 png_0.1-8
#> [100] ggfun_0.1.7 knitr_1.49 km.ci_0.5-6
#> [103] tzdb_0.4.0 reshape2_1.4.4 rjson_0.2.23
#> [106] nlme_3.1-166 curl_6.0.1 cachem_1.1.0
#> [109] zoo_1.8-12 stringr_1.5.1 KernSmooth_2.23-24
#> [112] parallel_4.4.2 AnnotationDbi_1.69.0 pillar_1.9.0
#> [115] grid_4.4.2 vctrs_0.6.5 gplots_3.2.0
#> [118] promises_1.3.0 ggpubr_0.6.0 car_3.1-3
#> [121] xtable_1.8-4 tximport_1.35.0 evaluate_1.0.1
#> [124] readr_2.1.5 gsubfn_0.7 cli_3.6.3
#> [127] compiler_4.4.2 rlang_1.1.4 crayon_1.5.3
#> [130] ggsignif_0.6.4 labeling_0.4.3 survminer_0.5.0
#> [133] plyr_1.8.9 fs_1.6.5 stringi_1.8.4
#> [136] viridisLite_0.4.2 BiocParallel_1.41.0 assertthat_0.2.1
#> [139] munsell_0.5.1 Biostrings_2.75.1 lazyeval_0.2.2
#> [142] GOSemSim_2.33.0 Matrix_1.7-1 hms_1.1.3
#> [145] patchwork_1.3.0 bit64_4.5.2 ggplot2_3.5.1
#> [148] KEGGREST_1.47.0 shiny_1.9.1 igraph_2.1.1
#> [151] broom_1.0.7 memoise_2.0.1 bslib_0.8.0
#> [154] ggtree_3.15.0 fastmatch_1.1-4 bit_4.5.0
#> [157] ape_5.8