--- title: "Using annotations to facilitate interactive exploration" author: - name: Kevin Rue-Albrecht affiliation: - University of Oxford email: kevin.rue-albrecht@imm.ox.ac.uk output: BiocStyle::html_document: self_contained: yes toc: true toc_float: true toc_depth: 2 code_folding: show date: "`r doc_date()`" package: "`r pkg_ver('iSEEde')`" vignette: > %\VignetteIndexEntry{Using annotations to facilitate interactive exploration} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", crop = NULL ## Related to https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016656.html ) options(width = 100) ``` ```{r, eval=!exists("SCREENSHOT"), include=FALSE} SCREENSHOT <- function(x, ...) knitr::include_graphics(x) ``` ```{r vignetteSetup, echo=FALSE, message=FALSE, warning = FALSE} ## Track time spent on making the vignette startTime <- Sys.time() ## Bib setup library("RefManageR") ## Write bibliography information bib <- c( R = citation(), BiocStyle = citation("BiocStyle")[1], knitr = citation("knitr")[1], RefManageR = citation("RefManageR")[1], rmarkdown = citation("rmarkdown")[1], sessioninfo = citation("sessioninfo")[1], testthat = citation("testthat")[1], iSEEde = citation("iSEEde")[1] ) ``` # Example data In this example, we use the `?airway` data set. ```{r, message=FALSE, warning=FALSE} library("airway") data("airway") ``` # Annotating data This section demonstrates one of many possible workflows for adding annotations to the data set. Those annotations are meant to make it easier for users to identify genes of interest, e.g. by displaying both gene symbols and ENSEMBL gene identifiers as tooltips in the interactive browser. First, we make a copy of the Ensembl identifiers -- currently stored in the `rownames()` -- to a column in the `rowData()` component. ```{r} rowData(airway)[["ENSEMBL"]] <- rownames(airway) ``` Then, we use the `r BiocStyle::Biocpkg("org.Hs.eg.db")` package to map the Ensembl identifiers to gene symbols. We store those gene symbols as an additional column of the `rowData()` component. ```{r, message=FALSE, warning=FALSE} library("org.Hs.eg.db") rowData(airway)[["SYMBOL"]] <- mapIds( org.Hs.eg.db, rownames(airway), "SYMBOL", "ENSEMBL" ) ``` Next, we use the `uniquifyFeatureNames()` function of the `r BiocStyle::Biocpkg("scuttle")` package to replace the `rownames()` by a unique identifier that is generated as follows: - The gene symbol if it is unique. - A concatenate of the gene symbol and Ensembl gene identifier if the gene symbol is not unique. - The Ensembl identifier if the gene symbol is not available. ```{r, message=FALSE, warning=FALSE} library("scuttle") rownames(airway) <- uniquifyFeatureNames( ID = rowData(airway)[["ENSEMBL"]], names = rowData(airway)[["SYMBOL"]] ) airway ``` # Differential expression To generate some example results, we first use `edgeR::filterByExpr()` to remove genes whose counts are too low to support a rigorous differential expression analysis. Then we run a standard Limma-Voom analysis using `edgeR::voomLmFit()`, `limma::makeContrasts()`, and `limma::eBayes()`; alternatively, we could have used `limma::treat()` instead of `limma::eBayes()`. The linear model includes the `dex` and `cell` covariates, indicating the treatment conditions and cell types, respectively. Here, we are interested in differences between treatments, adjusted by cell type, and define this comparison as the `dextrt - dexuntrt` contrast. The final differential expression results are fetched using `limma::topTable()`. ```{r, message=FALSE, warning=FALSE} library("edgeR") design <- model.matrix(~ 0 + dex + cell, data = colData(airway)) keep <- filterByExpr(airway, design) fit <- voomLmFit(airway[keep, ], design, plot = FALSE) contr <- makeContrasts("dextrt - dexuntrt", levels = design) fit <- contrasts.fit(fit, contr) fit <- eBayes(fit) res_limma <- topTable(fit, sort.by = "P", n = Inf) head(res_limma) ``` Then, we embed this set of differential expression results in the `airway` object using the `embedContrastResults()` method and we use the function `contrastResults()` to display the contrast results stored in the `airway` object. ```{r} library(iSEEde) airway <- embedContrastResults(res_limma, airway, name = "dextrt - dexuntrt", class = "limma" ) contrastResults(airway) contrastResults(airway, "dextrt - dexuntrt") ``` # Live app In this example, we use `iSEE::panelDefaults()` to specify `rowData()` fields to show in the tooltip that is displayed when hovering a data point. The application is then configured to display the volcano plot and MA plot for the same contrast. Finally, the configured app is launched. ```{r, message=FALSE, warning=FALSE} library(iSEE) panelDefaults( TooltipRowData = c("SYMBOL", "ENSEMBL") ) app <- iSEE(airway, initial = list( VolcanoPlot(ContrastName = "dextrt - dexuntrt", PanelWidth = 6L), MAPlot(ContrastName = "dextrt - dexuntrt", PanelWidth = 6L) )) if (interactive()) { shiny::runApp(app) } ``` ```{r, echo=FALSE, out.width="100%"} SCREENSHOT("screenshots/using_annotations.png", delay = 20) ``` # Reproducibility The `r Biocpkg("iSEEde")` package `r Citep(bib[["iSEEde"]])` was made possible thanks to: * R `r Citep(bib[["R"]])` * `r Biocpkg("BiocStyle")` `r Citep(bib[["BiocStyle"]])` * `r CRANpkg("knitr")` `r Citep(bib[["knitr"]])` * `r CRANpkg("RefManageR")` `r Citep(bib[["RefManageR"]])` * `r CRANpkg("rmarkdown")` `r Citep(bib[["rmarkdown"]])` * `r CRANpkg("sessioninfo")` `r Citep(bib[["sessioninfo"]])` * `r CRANpkg("testthat")` `r Citep(bib[["testthat"]])` This package was developed using `r BiocStyle::Biocpkg("biocthis")`. Code for creating the vignette ```{r createVignette, eval=FALSE} ## Create the vignette library("rmarkdown") system.time(render("annotations.Rmd", "BiocStyle::html_document")) ## Extract the R code library("knitr") knit("annotations.Rmd", tangle = TRUE) ``` Date the vignette was generated. ```{r reproduce1, echo=FALSE} ## Date the vignette was generated Sys.time() ``` Wallclock time spent generating the vignette. ```{r reproduce2, echo=FALSE} ## Processing time in seconds totalTime <- diff(c(startTime, Sys.time())) round(totalTime, digits = 3) ``` `R` session information. ```{r reproduce3, echo=FALSE} ## Session info library("sessioninfo") options(width = 120) session_info() ``` # Bibliography This vignette was generated using `r Biocpkg("BiocStyle")` `r Citep(bib[["BiocStyle"]])` with `r CRANpkg("knitr")` `r Citep(bib[["knitr"]])` and `r CRANpkg("rmarkdown")` `r Citep(bib[["rmarkdown"]])` running behind the scenes. Citations made with `r CRANpkg("RefManageR")` `r Citep(bib[["RefManageR"]])`. ```{r vignetteBiblio, results = "asis", echo = FALSE, warning = FALSE, message = FALSE} ## Print bibliography PrintBibliography(bib, .opts = list(hyperlink = "to.doc", style = "html")) ``` [scheme-wikipedia]: https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Syntax [iana-uri]: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml