--- title: "Gene Expression displaY of SummarizedExperiment in R" output: BiocStyle::html_document author: - name: David McGaughey affiliation: National Eye Institute, National Institutes of Health date: "`r Sys.Date() `" vignette: > %\VignetteIndexEntry{Gene_Expression_Plotting_GUI} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` # Background Several tools exist to display gene expression data in a click-and-plot system - most notably [iSEE](https://bioconductor.org/packages/release/bioc/html/iSEE.html). More packages are listed in [Related Tools]. Geyser is unique from iSEE and others in that it does not do much (aside from showing expression data). The advantage of not doing much is that the number of dials and options is substantially reduced and it is, I hope, very simple to operate. # Install Commented out for now as it is not on Bioconductor ```{r install, eval = FALSE} if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") BiocManager::install("geyser") ``` # Development Version The latest version can be installed via [github](www.github.com/davemcg/geyser) ```{r install github, eval = FALSE} if (!requireNamespace("remotes", quietly=TRUE)) install.packages("remotes") remotes::install_github("davemcg/geyser") ``` # Load Test Data ```{r setup} library(geyser) load(system.file('extdata/tiny_rse.Rdata', package = 'geyser')) ``` # Run Running geyser is as simple as giving the the [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) object to the `geyser` function. ```{r faux_run} if (interactive()){ geyser(tiny_rse) } ``` # Screenshots of some core views The Shiny-based GUI first shows you the metadata (`colData` slot) of the SummarizedExperiment (SE) object in a reactive `DT` data table. The idea is that this helps you ID which are the relevant fields to plot against (tissue and disease).
You can then click over to the Plotting section of the app and start typing in those fields (tissue and disease) in the "Sample Grouping(s)" box. The text will auto-complete as you type.
After that you can type in the genes you are interested in. Again, the genes will auto-complete as you type.
When you click the orange "Draw Box Plot" button the plot will be made
You can *custom filter* which samples are shown by clicking on the "triangle" next to "Sample Filtering" and then selecting the samples you *want* to display. Here we select only the normal samples and then use the Heatmap visualization (you can swap between Box Plot and Heatmap by clicking between the tabs).
If you want to reset the custom sample filtering, just click the "Clear Rows" button
The plots can be "outputted" by either right-clicking or by # How to use [recount3](https://rna.recount.bio) to display a pre-processed dataset We also do some light tweaking of the metadata to make human useable splits ```{r} # If needed: BiocManager::install("recount3") if (interactive()){ library(recount3) library(geyser) human_projects <- available_projects() proj_info <- subset( human_projects, project == "SRP107937" & project_type == "data_sources" ) rse_SRP107937 <- create_rse(proj_info) assay(rse_SRP107937, "counts") <- transform_counts(rse_SRP107937) # first tweak that glues the gene name onto the gene id in the row names rownames(rse_SRP107937) <- paste0(rowData(rse_SRP107937)$gene_name, ' (', row.names(rse_SRP107937), ')') # creates two new metadata fields colData(rse_SRP107937)$tissue <- colData(rse_SRP107937)$sra.sample_title %>% stringr::str_extract(.,'PRC|PR') colData(rse_SRP107937)$disease <- colData(rse_SRP107937)$sra.sample_title %>% stringr::str_extract(.,'AMD|Normal') geyser(rse_SRP107937, " geyser: SRP107937") } ``` # How to turn a count matrix into a SummarizedExperiment The key step is to get matched metadata (where each row corresponds to each column of the count matrix). ```{r count to se} library(SummarizedExperiment) counts <- matrix(runif(10 * 6, 1, 1e4), 10) row.names(counts) <- paste0('gene', seq(1,10)) colnames(counts) <- LETTERS[1:6] sample_info <- data.frame(Condition = c(rep("Unicorn", 3), rep("Horse", 3)), row.names = LETTERS[1:6]) se_object <- SummarizedExperiment(assays=list(counts = counts), colData = sample_info) if (interactive()){ geyser(se_object, "Magical Creatures") } ``` # How to turn a DESeqDataSet into a SummarizedExperiment This example is taken from the [DESeq2 guide](https://bioconductor.org/packages/release/bioc/html/DESeq2.html). A `DESeqDataSet` is highly similar to the `SummarizedExperiment` class. ```{r dds to se} library(DESeq2) library(airway) data(airway) ddsSE <- DESeqDataSet(airway, design = ~ cell + dex) if (interactive()){ geyser(ddsSE, "DESeq Airway Example") } ``` # Related Tools The theme between these tools is that they do A LOT OF STUFF. `geyser` just does one thing - shows gene expression across your samples. Which, effectively, means less energy spent trying to figure out how to get started. - [iSEE](https://bioconductor.org/packages/release/bioc/html/iSEE.html) - [pcaExplorer](https://www.bioconductor.org/packages/release/bioc/html/pcaExplorer.html) - [DEVis](https://cran.r-hub.io/web/packages/DEVis/vignettes/DEVis_vignette.html) - [omicsViewer](https://bioconductor.org/packages/devel/bioc/vignettes/omicsViewer/inst/doc/quickStart.html#1_Introduction) # Session Info ```{r} sessionInfo() ```