Working with the Gene Ontology

Scenario

In this vignette, we demonstrate how to use the GO.db package to dynamically display additional information about the selected pathway in the interactive user interface.

Demonstration

Example data

First, we generate pathway analysis results for simulated data using fgsea.

In particular, we use the package org.Hs.eg.db to fetch real gene sets from the Biological Process (BP) ontology. To reduce memory footprint, we retain only the gene sets associated with more than 15 and fewer than 500 genes.
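The size filter can be illustrated on a toy annotation table that runs without any annotation package. This is only a sketch: the gene and set identifiers, as well as the thresholds `min_size` and `max_size`, are invented for illustration. The bounds are exclusive, mirroring the `>` and `<` comparisons used in the vignette code.

```r
# Toy gene-to-set annotation table; all identifiers are made up for illustration.
annotation <- data.frame(
  gene = c("g1", "g2", "g3", "g1", "g4", "g5", "g2"),
  set  = c("setA", "setA", "setA", "setB", "setC", "setC", "setC"),
  stringsAsFactors = FALSE
)

# Drop duplicated gene/set pairs, then build a named list
# holding one character vector of genes per set.
annotation <- unique(annotation)
gene_sets <- split(annotation$gene, annotation$set)

# Keep only the sets whose size lies strictly between the two (hypothetical) bounds.
min_size <- 1
max_size <- 4
set_sizes <- lengths(gene_sets)
kept_sets <- gene_sets[set_sizes > min_size & set_sizes < max_size]

names(kept_sets)
```

Here `setB` contains a single gene and is therefore dropped, while the two sets of three genes are retained.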

Then, we simulate a score for each gene present in any of the remaining gene sets. In practice, that score could be the log2 fold-change of the gene in a differential expression analysis (among other possibilities).

Finally, we run fgsea on the simulated data.

library("org.Hs.eg.db")
library("fgsea")

# Example data ----

## Pathways
pathways <- select(org.Hs.eg.db, keys(org.Hs.eg.db, "SYMBOL"), c("GOALL"), keytype = "SYMBOL")
pathways <- subset(pathways, ONTOLOGYALL == "BP")
pathways <- unique(pathways[, c("SYMBOL", "GOALL")])
pathways <- split(pathways$SYMBOL, pathways$GOALL)
len_pathways <- lengths(pathways)
pathways <- pathways[len_pathways > 15 & len_pathways < 500]

## Features
set.seed(1)
# simulate a score for all genes found across all pathways
feature_stats <- rnorm(length(unique(unlist(pathways))))
names(feature_stats) <- unique(unlist(pathways))
# arbitrarily select a pathway to simulate enrichment
pathway_id <- "GO:0046324"
pathway_genes <- pathways[[pathway_id]]
# increase score of genes in the selected pathway to simulate enrichment
feature_stats[pathway_genes] <- feature_stats[pathway_genes] + 1

# fgsea ----

set.seed(42)
fgseaRes <- fgsea(pathways = pathways, 
                  stats    = feature_stats,
                  minSize  = 15,
                  maxSize  = 500)
head(fgseaRes[order(pval), ])
#>       pathway         pval         padj   log2err        ES      NES  size
#>        <char>        <num>        <num>     <num>     <num>    <num> <int>
#> 1: GO:0046324 5.580596e-10 2.747886e-06 0.8012156 0.6285002 2.579677    60
#> 2: GO:0046326 3.235498e-08 7.965797e-05 0.7195128 0.6752985 2.492665    37
#> 3: GO:0010827 5.741078e-08 9.423023e-05 0.7195128 0.5326197 2.321216    78
#> 4: GO:0010828 3.204527e-07 3.944773e-04 0.6749629 0.6033620 2.334222    44
#> 5: GO:0046323 5.839380e-06 5.750622e-03 0.6105269 0.4855652 2.116329    79
#> 6: GO:0008645 4.383171e-05 3.597122e-02 0.5573322 0.3990657 1.864469   121
#>     leadingEdge
#>          <list>
#> 1: TNF, KLF....
#> 2: KLF15, F....
#> 3: TNF, KLF....
#> 4: KLF15, F....
#> 5: TNF, KLF....
#> 6: TNF, KLF....
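To give some intuition for the ES column, the sketch below computes a simplified GSEA-style running-sum enrichment score in base R. It is an illustration under stated assumptions, not the algorithm implemented in fgsea: the helper `enrichment_score()` and the toy gene names are hypothetical, hits are weighted by the absolute value of their score, and no permutation, normalisation, or p-value is computed.

```r
# Simplified GSEA-style enrichment score (illustration only, not fgsea's implementation).
# `stats` is a named numeric vector of gene scores; `gene_set` is a character vector of gene names.
enrichment_score <- function(stats, gene_set) {
  ranked <- sort(stats, decreasing = TRUE)
  in_set <- names(ranked) %in% gene_set
  n_miss <- sum(!in_set)
  stopifnot(any(in_set), n_miss > 0)
  # Walking down the ranked list, genes in the set step the sum up
  # in proportion to |score|; all other genes step it down by a constant.
  hit_step <- ifelse(in_set, abs(ranked), 0)
  hit_step <- hit_step / sum(hit_step)
  miss_step <- ifelse(in_set, 0, 1 / n_miss)
  running <- cumsum(hit_step - miss_step)
  # The score is the deviation from zero with the largest magnitude.
  unname(running[which.max(abs(running))])
}

# Toy data: simulate scores, then shift one (hypothetical) gene set upwards.
set.seed(1)
toy_stats <- rnorm(200)
names(toy_stats) <- paste0("gene", seq_along(toy_stats))
toy_set <- paste0("gene", 1:20)

es_before <- enrichment_score(toy_stats, toy_set)
toy_stats[toy_set] <- toy_stats[toy_set] + 1
es_after <- enrichment_score(toy_stats, toy_set)
c(before = es_before, after = es_after)
```

By construction the score lies between -1 and 1. Shifting the scores of a gene set upwards moves its members towards the top of the ranking, which one would expect to push the score towards 1.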

Then, we embed the fgsea results in a SummarizedExperiment object.

In this case, we create an empty SummarizedExperiment object, without any count data or metadata, as none of those are used in this example.

Before embedding the pathway analysis results in the newly created object, we reorder them by increasing p-value. Although not essential, this implicitly defines the default ordering of the table in the live app.

library("SummarizedExperiment")
library("iSEEpathways")
se <- SummarizedExperiment()
fgseaRes <- fgseaRes[order(pval), ]
se <- embedPathwaysResults(fgseaRes, se, name = "fgsea", class = "fgsea", pathwayType = "GO")

Pathway information

In this example, we configure the app option PathwaysTable.select.details to define a function that, given the identifier of the GO term currently selected in a panel, displays information about that GO term.

Although not essential, this is a user-friendly and immediate way to ‘translate’ machine-friendly database identifiers into human-friendly descriptions.

library("iSEE")
library("GO.db")
library("shiny")
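# Build the HTML describing the GO term whose identifier is given in `x`.
go_details <- function(x) {
    # Look up the name, ontology, and definition of the selected GO term.
    info <- select(GO.db, x, c("TERM", "ONTOLOGY", "DEFINITION"), "GOID")
    html <- list(p(strong(info$GOID), ":", info$TERM, paste0("(", info$ONTOLOGY, ")")))
    # Append the definition only when one is available.
    if (!is.na(info$DEFINITION)) {
        html <- append(html, list(p(info$DEFINITION)))
    }
    tagList(html)
}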
se <- registerAppOptions(se, PathwaysTable.select.details = go_details)
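A detail function may also receive an identifier that is absent from the annotation source, for instance an obsolete term. The sketch below illustrates one way to guard against that case. To remain runnable without annotation packages, it makes several assumptions: a small hand-made table `term_table` stands in for GO.db, the function `describe_term()` is hypothetical, all identifiers and descriptions are invented, and plain character vectors are returned instead of shiny tags.

```r
# Hypothetical annotation table standing in for GO.db; identifiers and text are invented.
term_table <- data.frame(
  id = c("SET:0001", "SET:0002"),
  term = c("toy term one", "toy term two"),
  definition = c("An invented definition.", NA),
  stringsAsFactors = FALSE
)

# Return the display lines for a single identifier.
# Unknown identifiers yield a short message rather than an error.
describe_term <- function(x, table = term_table) {
  info <- table[table$id == x, , drop = FALSE]
  if (nrow(info) == 0L) {
    return(paste0("No information available for ", x))
  }
  lines <- paste0(info$id, ": ", info$term)
  # Append the definition only when one is recorded.
  if (!is.na(info$definition)) {
    lines <- c(lines, info$definition)
  }
  lines
}

describe_term("SET:0001")
describe_term("SET:9999")
```

In a real app, the same guard could be placed around the database lookup before the HTML is assembled, so that selecting an unexpected identifier displays a message instead of interrupting the panel.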

Live app

Finally, we configure the initial state of the app and launch it.

app <- iSEE(se, initial = list(
  PathwaysTable(ResultName = "fgsea", Selected = "GO:0046324", PanelWidth = 12L)
))

if (interactive()) {
  shiny::runApp(app)
}

Reproducibility

The iSEEpathways package (Rue-Albrecht and Soneson, 2024) was made possible thanks to:

  • R (R Core Team, 2024)
  • BiocStyle (Oleś, 2024)
  • knitr (Xie, 2024)
  • RefManageR (McLean, 2017)
  • rmarkdown (Allaire, Xie, Dervieux, McPherson, Luraschi, Ushey, Atkins, Wickham, Cheng, Chang, and Iannone, 2024)
  • sessioninfo (Wickham, Chang, Flight, Müller, and Hester, 2021)
  • testthat (Wickham, 2011)

This package was developed using biocthis.

Code for creating the vignette

## Create the vignette
library("rmarkdown")
system.time(render("gene-ontology.Rmd", "BiocStyle::html_document"))

## Extract the R code
library("knitr")
knit("gene-ontology.Rmd", tangle = TRUE)

Date the vignette was generated.

#> [1] "2024-11-29 07:29:40 UTC"

Wallclock time spent generating the vignette.

#> Time difference of 33.114 secs

R session information.

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.2 (2024-10-31)
#>  os       Ubuntu 24.04.1 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  C
#>  ctype    en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2024-11-29
#>  pandoc   3.2.1 @ /usr/local/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package              * version  date (UTC) lib source
#>  abind                  1.4-8    2024-09-12 [2] RSPM (R 4.4.0)
#>  AnnotationDbi        * 1.69.0   2024-11-29 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  backports              1.5.0    2024-05-23 [2] RSPM (R 4.4.0)
#>  bibtex                 0.5.1    2023-01-26 [2] RSPM (R 4.4.0)
#>  Biobase              * 2.67.0   2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  BiocGenerics         * 0.53.3   2024-11-15 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  BiocManager            1.30.25  2024-08-28 [2] RSPM (R 4.4.0)
#>  BiocParallel           1.41.0   2024-11-29 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  BiocStyle            * 2.35.0   2024-11-19 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  Biostrings             2.75.1   2024-11-07 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  bit                    4.5.0    2024-09-20 [2] RSPM (R 4.4.0)
#>  bit64                  4.5.2    2024-09-22 [2] RSPM (R 4.4.0)
#>  blob                   1.2.4    2023-03-17 [2] RSPM (R 4.4.0)
#>  bslib                  0.8.0    2024-07-29 [2] RSPM (R 4.4.0)
#>  buildtools             1.0.0    2024-11-24 [3] local (/pkg)
#>  cachem                 1.1.0    2024-05-16 [2] RSPM (R 4.4.0)
#>  circlize               0.4.16   2024-02-20 [2] RSPM (R 4.4.0)
#>  cli                    3.6.3    2024-06-21 [2] RSPM (R 4.4.0)
#>  clue                   0.3-66   2024-11-13 [2] RSPM (R 4.4.0)
#>  cluster                2.1.6    2023-12-01 [2] RSPM (R 4.4.0)
#>  codetools              0.2-20   2024-03-31 [2] RSPM (R 4.4.0)
#>  colorspace             2.1-1    2024-07-26 [2] RSPM (R 4.4.0)
#>  colourpicker           1.3.0    2023-08-21 [2] RSPM (R 4.4.0)
#>  ComplexHeatmap         2.23.0   2024-11-29 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  cowplot                1.1.3    2024-01-22 [2] RSPM (R 4.4.0)
#>  crayon                 1.5.3    2024-06-20 [2] RSPM (R 4.4.0)
#>  data.table             1.16.2   2024-10-10 [2] RSPM (R 4.4.0)
#>  DBI                    1.2.3    2024-06-02 [2] RSPM (R 4.4.0)
#>  DelayedArray           0.33.2   2024-11-15 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  digest                 0.6.37   2024-08-19 [2] RSPM (R 4.4.0)
#>  doParallel             1.0.17   2022-02-07 [2] RSPM (R 4.4.0)
#>  DT                     0.33     2024-04-04 [2] RSPM (R 4.4.0)
#>  evaluate               1.0.1    2024-10-10 [2] RSPM (R 4.4.0)
#>  fansi                  1.0.6    2023-12-08 [2] RSPM (R 4.4.0)
#>  fastmap                1.2.0    2024-05-15 [2] RSPM (R 4.4.0)
#>  fastmatch              1.1-4    2023-08-18 [2] RSPM (R 4.4.0)
#>  fgsea                * 1.33.0   2024-11-19 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  fontawesome            0.5.3    2024-11-16 [2] RSPM (R 4.4.0)
#>  foreach                1.5.2    2022-02-02 [2] RSPM (R 4.4.0)
#>  generics             * 0.1.3    2022-07-05 [2] RSPM (R 4.4.0)
#>  GenomeInfoDb         * 1.43.2   2024-11-28 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  GenomeInfoDbData       1.2.13   2024-11-29 [2] Bioconductor
#>  GenomicRanges        * 1.59.1   2024-11-19 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  GetoptLong             1.0.5    2020-12-15 [2] RSPM (R 4.4.0)
#>  ggplot2                3.5.1    2024-04-23 [2] RSPM (R 4.4.0)
#>  ggrepel                0.9.6    2024-09-07 [2] RSPM (R 4.4.0)
#>  GlobalOptions          0.1.2    2020-06-10 [2] RSPM (R 4.4.0)
#>  glue                   1.8.0    2024-09-30 [2] RSPM (R 4.4.0)
#>  GO.db                * 3.20.0   2024-11-29 [2] Bioconductor
#>  gtable                 0.3.6    2024-10-25 [2] RSPM (R 4.4.0)
#>  htmltools              0.5.8.1  2024-04-04 [2] RSPM (R 4.4.0)
#>  htmlwidgets            1.6.4    2023-12-06 [2] RSPM (R 4.4.0)
#>  httpuv                 1.6.15   2024-03-26 [2] RSPM (R 4.4.0)
#>  httr                   1.4.7    2023-08-15 [2] RSPM (R 4.4.0)
#>  igraph                 2.1.1    2024-10-19 [2] RSPM (R 4.4.0)
#>  IRanges              * 2.41.1   2024-11-17 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  iSEE                 * 2.19.0   2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  iSEEpathways         * 1.5.0    2024-11-29 [1] https://bioc.r-universe.dev (R 4.4.2)
#>  iterators              1.0.14   2022-02-05 [2] RSPM (R 4.4.0)
#>  jquerylib              0.1.4    2021-04-26 [2] RSPM (R 4.4.0)
#>  jsonlite               1.8.9    2024-09-20 [2] RSPM (R 4.4.0)
#>  KEGGREST               1.47.0   2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  knitr                  1.49     2024-11-08 [2] RSPM (R 4.4.0)
#>  later                  1.4.1    2024-11-27 [2] RSPM (R 4.4.0)
#>  lattice                0.22-6   2024-03-20 [2] RSPM (R 4.4.0)
#>  lifecycle              1.0.4    2023-11-07 [2] RSPM (R 4.4.0)
#>  listviewer             4.0.0    2023-09-30 [2] RSPM (R 4.4.0)
#>  lubridate              1.9.3    2023-09-27 [2] RSPM (R 4.4.0)
#>  magrittr               2.0.3    2022-03-30 [2] RSPM (R 4.4.0)
#>  maketools              1.3.1    2024-10-04 [3] RSPM (R 4.4.0)
#>  Matrix                 1.7-1    2024-10-18 [2] RSPM (R 4.4.0)
#>  MatrixGenerics       * 1.19.0   2024-11-06 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  matrixStats          * 1.4.1    2024-09-08 [2] RSPM (R 4.4.0)
#>  memoise                2.0.1    2021-11-26 [2] RSPM (R 4.4.0)
#>  mgcv                   1.9-1    2023-12-21 [2] RSPM (R 4.4.0)
#>  mime                   0.12     2021-09-28 [2] RSPM (R 4.4.0)
#>  miniUI                 0.1.1.1  2018-05-18 [2] RSPM (R 4.4.0)
#>  munsell                0.5.1    2024-04-01 [2] RSPM (R 4.4.0)
#>  nlme                   3.1-166  2024-08-14 [2] RSPM (R 4.4.0)
#>  org.Hs.eg.db         * 3.20.0   2024-11-29 [2] Bioconductor
#>  pillar                 1.9.0    2023-03-22 [2] RSPM (R 4.4.0)
#>  pkgconfig              2.0.3    2019-09-22 [2] RSPM (R 4.4.0)
#>  plyr                   1.8.9    2023-10-02 [2] RSPM (R 4.4.0)
#>  png                    0.1-8    2022-11-29 [2] RSPM (R 4.4.0)
#>  promises               1.3.2    2024-11-28 [2] RSPM (R 4.4.0)
#>  R6                     2.5.1    2021-08-19 [2] RSPM (R 4.4.0)
#>  RColorBrewer           1.1-3    2022-04-03 [2] RSPM (R 4.4.0)
#>  Rcpp                   1.0.13-1 2024-11-02 [2] RSPM (R 4.4.0)
#>  RefManageR           * 1.4.0    2022-09-30 [2] RSPM (R 4.4.0)
#>  rintrojs               0.3.4    2024-01-11 [2] RSPM (R 4.4.0)
#>  rjson                  0.2.23   2024-09-16 [2] RSPM (R 4.4.0)
#>  rlang                  1.1.4    2024-06-04 [2] RSPM (R 4.4.0)
#>  rmarkdown              2.29     2024-11-04 [2] RSPM (R 4.4.0)
#>  RSQLite                2.3.8    2024-11-17 [2] RSPM (R 4.4.0)
#>  S4Arrays               1.7.1    2024-11-18 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  S4Vectors            * 0.45.2   2024-11-16 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  sass                   0.4.9    2024-03-15 [2] RSPM (R 4.4.0)
#>  scales                 1.3.0    2023-11-28 [2] RSPM (R 4.4.0)
#>  sessioninfo          * 1.2.2    2021-12-06 [2] RSPM (R 4.4.0)
#>  shape                  1.4.6.1  2024-02-23 [2] RSPM (R 4.4.0)
#>  shiny                * 1.9.1    2024-08-01 [2] RSPM (R 4.4.0)
#>  shinyAce               0.4.3    2024-10-19 [2] RSPM (R 4.4.0)
#>  shinydashboard         0.7.2    2021-09-30 [2] RSPM (R 4.4.0)
#>  shinyjs                2.1.0    2021-12-23 [2] RSPM (R 4.4.0)
#>  shinyWidgets           0.8.7    2024-09-23 [2] RSPM (R 4.4.0)
#>  SingleCellExperiment * 1.29.1   2024-11-09 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  SparseArray            1.7.2    2024-11-15 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  stringi                1.8.4    2024-05-06 [2] RSPM (R 4.4.0)
#>  stringr                1.5.1    2023-11-14 [2] RSPM (R 4.4.0)
#>  SummarizedExperiment * 1.37.0   2024-11-21 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  sys                    3.4.3    2024-10-04 [2] RSPM (R 4.4.0)
#>  tibble                 3.2.1    2023-03-20 [2] RSPM (R 4.4.0)
#>  timechange             0.3.0    2024-01-18 [2] RSPM (R 4.4.0)
#>  UCSC.utils             1.3.0    2024-10-31 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  utf8                   1.2.4    2023-10-22 [2] RSPM (R 4.4.0)
#>  vctrs                  0.6.5    2023-12-01 [2] RSPM (R 4.4.0)
#>  vipor                  0.4.7    2023-12-18 [2] RSPM (R 4.4.0)
#>  viridisLite            0.4.2    2023-05-02 [2] RSPM (R 4.4.0)
#>  xfun                   0.49     2024-10-31 [2] RSPM (R 4.4.0)
#>  xml2                   1.3.6    2023-12-04 [2] RSPM (R 4.4.0)
#>  xtable                 1.8-4    2019-04-21 [2] RSPM (R 4.4.0)
#>  XVector                0.47.0   2024-11-21 [2] https://bioc.r-universe.dev (R 4.4.2)
#>  yaml                   2.3.10   2024-07-26 [2] RSPM (R 4.4.0)
#>  zlibbioc               1.52.0   2024-10-29 [2] Bioconductor 3.20 (R 4.4.2)
#> 
#>  [1] /tmp/Rtmp4t7Vq1/Rinst23742b0fce94
#>  [2] /github/workspace/pkglib
#>  [3] /usr/local/lib/R/site-library
#>  [4] /usr/lib/R/site-library
#>  [5] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Bibliography

This vignette was generated using BiocStyle (Oleś, 2024) with knitr (Xie, 2024) and rmarkdown (Allaire, Xie, Dervieux et al., 2024) running behind the scenes.

Citations made with RefManageR (McLean, 2017).

[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.29. 2024. URL: https://github.com/rstudio/rmarkdown.

[2] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.

[3] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.35.0. 2024. URL: https://github.com/Bioconductor/BiocStyle.

[4] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2024. URL: https://www.R-project.org/.

[5] K. Rue-Albrecht and C. Soneson. iSEEpathways: iSEE extension for panels related to pathway analysis. R package version 1.5.0. 2024. URL: https://github.com/iSEE/iSEEpathways.

[6] H. Wickham. “testthat: Get Started with Testing”. In: The R Journal 3 (2011), pp. 5–10. URL: https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.

[7] H. Wickham, W. Chang, R. Flight, et al. sessioninfo: R Session Information. R package version 1.2.2, https://r-lib.github.io/sessioninfo/. 2021. URL: https://github.com/r-lib/sessioninfo#readme.

[8] Y. Xie. knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.49. 2024. URL: https://yihui.org/knitr/.