This is a Shiny app written in R that creates an interactive visualization using the shiny and shiny.gosling packages. The visualization displays genomic data from the Cistrome database in a track-based layout.
This package is an R Shiny implementation of the Gosling.js library. Gosling.js is a highly expressive library based on a grammar for scalable and interactive genomics data visualization. shiny.gosling is built on the React wrapper of Gosling.js, which is made available to Shiny through shiny.react. As a result, most Gosling.js charts can be built directly with this package.
Let’s start with an example. For that, we use a dataset of genomic data with chromosome start and stop positions, taken directly from gosling-lang.org.
Multivec is a file format introduced by HiGlass for representing and visualizing multi-dimensional numerical data across genomic coordinates. It is commonly used for data such as ChIP-seq, ATAC-seq, Hi-C and other genomic experiments in which signals or measurements are collected at many genomic positions.
Multivec data is essentially a matrix in which rows correspond to genomic positions or regions and columns correspond to samples or experiments. Each entry is the value measured at a specific genomic position in a specific sample. The positions along the rows are usually given as chromosomal coordinates (chromosome name and base pair position), which lets the data be aligned with the genome for accurate visualization and analysis. Several tools and file formats support this kind of data. The bigWig and bedGraph formats are commonly used for per-sample genomic signal, and visualization tools such as the UCSC Genome Browser, IGV (Integrative Genomics Viewer) and shiny.gosling can render it.
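The matrix layout described above can be sketched in base R. This is only an illustration: the bin width, the three samples and all signal values are made up, and the long-form column names (start, end, sample, peak) are chosen to mirror the field names used by the track definition in this document, not taken from any real file.

```r
# Toy multivec-style matrix: rows are genomic bins, columns are samples.
# All values below are invented for illustration.
bin_size <- 1000L                                  # bin width in base pairs
starts <- seq(1L, by = bin_size, length.out = 6L)  # bin start positions

signal <- matrix(
  c(0.1, 0.4, 0.9, 0.3, 0.0, 0.2,   # sample 1 (matrix() fills column-wise)
    0.0, 0.5, 1.2, 0.6, 0.1, 0.0,   # sample 2
    0.2, 0.2, 0.7, 0.8, 0.3, 0.1),  # sample 3
  nrow = 6L,
  dimnames = list(NULL, c("sample 1", "sample 2", "sample 3"))
)

# Wide form: attach chromosomal coordinates to every row of the matrix.
multivec_like <- data.frame(
  chromosome = "chr1",
  start = starts,
  end = starts + bin_size - 1L,
  signal,
  check.names = FALSE
)

# Long form: one row per (position, sample) pair.
long <- data.frame(
  chromosome = "chr1",
  start = rep(starts, times = ncol(signal)),
  end = rep(starts + bin_size - 1L, times = ncol(signal)),
  sample = rep(colnames(signal), each = nrow(signal)),
  peak = as.vector(signal)
)

print(head(long, 3))
```

The wide form is convenient for inspection, while the long form is closer to how a track addresses the data through named fields.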
Here are some resources and links where you can learn more about multivec data and how it’s used in genomics research:
UCSC Genome Browser: a widely used tool for visualizing genomic data, including multivec data. See the tutorial on visualizing multivec data in the UCSC Genome Browser.
IGV (Integrative Genomics Viewer): another popular genome visualization tool that supports multivec data. See the tutorial on loading and visualizing multivec data in IGV.
BedGraph and BigWig formats: common file formats used for representing multivec data. See the explanation of the BedGraph format and the explanation of the BigWig format.
In shiny.gosling we create tracks from data and then create views from tracks. To understand how to build a plot, let’s look at three basic building blocks of Gosling.js:
Track: contains the data, layout, height, width and all aesthetics.
View: composed of one or more tracks.
Plot: composed of one or more views.
This is how a plot is created in Gosling. Let’s go through these one by one.
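As a rough mental model, the Track, View and Plot nesting can be mimicked with plain R lists. The helpers new_track, new_view and new_plot below are hypothetical and are not shiny.gosling functions; the sketch only assumes that a plot holds views and a view holds tracks.

```r
# Hypothetical helpers (NOT part of shiny.gosling): they only mimic the
# Track -> View -> Plot nesting with plain lists.
new_track <- function(id, mark, width, height) {
  list(id = id, mark = mark, width = width, height = height)
}
new_view <- function(...) list(tracks = list(...))
new_plot <- function(title, ...) list(title = title, views = list(...))

plot_spec <- new_plot(
  "Toy plot",
  new_view(
    new_track("track1", "rect", 600, 130),
    new_track("track2", "bar", 600, 60)
  )
)

# Walk the hierarchy: one plot, one view, two tracks.
track_ids <- vapply(
  plot_spec$views[[1]]$tracks,
  function(track) track$id,
  character(1)
)
print(track_ids)
```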
Let’s start by creating a track. We define the first track and then add more properties to it. With shiny.gosling you can be specific about the colors, ranges and channels.
Let’s build the layers for the plot. We can build multiple tracks to represent the genome. The add_single_track function constructs a single track from its inputs.
The track_data function is used to define the data source for the track. It specifies the URL of the dataset, the type of data (“multivec”), and various data-related parameters such as rows, columns, values, categories, and bin size.
The visualization of the track is specified using various visual channels such as x, xe, row, color, and tooltip.
# `cistrome_data` is assumed to already hold the URL of the Cistrome
# multivec dataset; it is not defined in this excerpt.
single_track <- add_single_track(
  id = "track1",
  # Data source: a multivec tileset with one row per sample
  data = track_data(
    url = cistrome_data,
    type = "multivec",
    row = "sample",
    column = "position",
    value = "peak",
    categories = c("sample 1", "sample 2", "sample 3", "sample 4"),
    binSize = 4
  ),
  mark = "rect",
  # Genomic start and end positions on the x axis
  x = visual_channel_x(field = "start", type = "genomic", axis = "top"),
  xe = visual_channel_x(field = "end", type = "genomic"),
  # One row per sample
  row = visual_channel_row(
    field = "sample",
    type = "nominal",
    legend = TRUE
  ),
  # Color encodes the peak value
  color = visual_channel_color(
    field = "peak",
    type = "quantitative",
    legend = TRUE
  ),
  # Tooltip shown on hover
  tooltip = visual_channel_tooltips(
    visual_channel_tooltip(field = "start", type = "genomic", alt = "Start Position"),
    visual_channel_tooltip(field = "end", type = "genomic", alt = "End Position"),
    visual_channel_tooltip(
      field = "peak",
      type = "quantitative",
      alt = "Value",
      format = "0.2"
    )
  ),
  width = 600,
  height = 130
)
The compose_view function composes the single track into a view, and the arrange_views function then places that view in a circular layout. The x-domain (genomic interval) is set to chromosome 1, interval [1, 3000500].
# Compose the track into a view
single_composed_track <- compose_view(
  tracks = single_track
)

# Arrange the view into the final plot
single_composed_views <- arrange_views(
  title = "Single Track",
  subtitle = "This is the simplest single track visualization with a circular layout",
  layout = "circular",
  views = single_composed_track,
  xDomain = list(
    chromosome = "chr1",
    interval = c(1, 3000500)
  )
)
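The xDomain argument is a plain list with a chromosome name and a two-element interval. If several views share the same window, a small helper can build and check that list. The function x_domain below is hypothetical and is not provided by shiny.gosling; it is only a base R sketch of the structure shown above.

```r
# Hypothetical helper (NOT part of shiny.gosling): builds the list that is
# passed as xDomain and validates the interval before returning it.
x_domain <- function(chromosome, start, end) {
  stopifnot(
    is.character(chromosome), length(chromosome) == 1L,
    is.numeric(start), is.numeric(end),
    start >= 1, end > start
  )
  list(chromosome = chromosome, interval = c(start, end))
}

chr1_window <- x_domain("chr1", 1, 3000500)
str(chr1_window)
```

The result can be passed as `xDomain = chr1_window`, and an inverted interval such as `x_domain("chr1", 10, 5)` fails early instead of producing an empty plot.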
We can then add more tracks to the view. Let’s create a few more tracks to make a richer plot.
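A minimal sketch of a two-track view might look as follows. Several things here are assumptions rather than part of the original example: the dataset URL (the Cistrome multivec example used in the Gosling documentation), the second track with a "bar" mark, the track IDs and sizes, and the use of add_multi_tracks together with compose_view(multi = TRUE) to combine the tracks.

```r
library(shiny.gosling)

# Assumption: URL of the Cistrome multivec example from the Gosling docs.
cistrome_data <-
  "https://server.gosling-lang.org/api/v1/tileset_info/?d=cistrome-multivec"

# Shared data definition so both tracks read the same source.
cistrome_track_data <- track_data(
  url = cistrome_data,
  type = "multivec",
  row = "sample",
  column = "position",
  value = "peak",
  categories = c("sample 1", "sample 2", "sample 3", "sample 4"),
  binSize = 4
)

# Track 1: heatmap-style rectangles, color encodes the peak value.
heatmap_track <- add_single_track(
  id = "heatmap_track",
  data = cistrome_track_data,
  mark = "rect",
  x = visual_channel_x(field = "start", type = "genomic", axis = "top"),
  xe = visual_channel_x(field = "end", type = "genomic"),
  row = visual_channel_row(field = "sample", type = "nominal"),
  color = visual_channel_color(field = "peak", type = "quantitative"),
  width = 600,
  height = 130
)

# Track 2 (illustrative): bars per sample, height encodes the peak value.
bar_track <- add_single_track(
  id = "bar_track",
  data = cistrome_track_data,
  mark = "bar",
  x = visual_channel_x(field = "start", type = "genomic"),
  xe = visual_channel_x(field = "end", type = "genomic"),
  y = visual_channel_y(field = "peak", type = "quantitative"),
  row = visual_channel_row(field = "sample", type = "nominal"),
  color = visual_channel_color(field = "sample", type = "nominal"),
  width = 600,
  height = 130
)

# Combine both tracks into a single view.
multi_track_view <- compose_view(
  multi = TRUE,
  tracks = add_multi_tracks(heatmap_track, bar_track)
)
```

The resulting view can be passed to arrange_views in the same way as a single-track view.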
The use_gosling function is used to incorporate the shiny.gosling package for rendering the visualization.
Hovering on the plot will show the start position, the end position and the value.
Scrolling on the plot zooms the view in or out.
Users can click the PDF button to download a PDF of the current state of the plot.
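The pieces described above can be wired into a Shiny app roughly as follows. This is a sketch under assumptions, not the original app code: the wrapper build_gosling_app, the output ID "gosling_plot", the button ID "download_pdf" and the component ID "component_1" are all illustrative names, and the sketch assumes that gosling, goslingOutput, renderGosling and export_pdf are used as shown.

```r
library(shiny)
library(shiny.gosling)

# Hypothetical wrapper: takes the object returned by arrange_views() and
# returns a Shiny app object. All IDs used here are illustrative.
build_gosling_app <- function(composed_views, component_id = "component_1") {
  ui <- fluidPage(
    use_gosling(),                        # load the Gosling.js dependencies
    actionButton("download_pdf", "PDF"),  # button wired to export_pdf() below
    goslingOutput("gosling_plot")         # placeholder for the rendered plot
  )

  server <- function(input, output, session) {
    # Render the arranged views as a Gosling component.
    output$gosling_plot <- renderGosling({
      gosling(component_id = component_id, composed_views)
    })

    # Export the current state of the plot when the button is clicked.
    observeEvent(input$download_pdf, {
      export_pdf(component_id = component_id)
    })
  }

  shinyApp(ui, server)
}
```

Calling `build_gosling_app(single_composed_views)` in an interactive session should then launch the app with the single-track plot.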
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] sessioninfo_1.2.2 ggbio_1.55.0
#> [3] ggplot2_3.5.1 StructuralVariantAnnotation_1.23.0
#> [5] VariantAnnotation_1.53.0 Rsamtools_2.23.1
#> [7] Biostrings_2.75.3 XVector_0.47.0
#> [9] SummarizedExperiment_1.37.0 Biobase_2.67.0
#> [11] MatrixGenerics_1.19.0 matrixStats_1.4.1
#> [13] rtracklayer_1.67.0 GenomicRanges_1.59.1
#> [15] GenomeInfoDb_1.43.2 IRanges_2.41.2
#> [17] S4Vectors_0.45.2 BiocGenerics_0.53.3
#> [19] generics_0.1.3 shiny_1.10.0
#> [21] shiny.gosling_1.3.0 rmarkdown_2.29
#>
#> loaded via a namespace (and not attached):
#> [1] RColorBrewer_1.1-3 sys_3.4.3 rstudioapi_0.17.1
#> [4] jsonlite_1.8.9 magrittr_2.0.3 GenomicFeatures_1.59.1
#> [7] fs_1.6.5 BiocIO_1.17.1 zlibbioc_1.52.0
#> [10] vctrs_0.6.5 memoise_2.0.1 RCurl_1.98-1.16
#> [13] base64enc_0.1-3 progress_1.2.3 htmltools_0.5.8.1
#> [16] S4Arrays_1.7.1 curl_6.0.1 SparseArray_1.7.2
#> [19] Formula_1.2-5 sass_0.4.9 bslib_0.8.0
#> [22] fontawesome_0.5.3 htmlwidgets_1.6.4 httr2_1.0.7
#> [25] plyr_1.8.9 cachem_1.1.0 buildtools_1.0.0
#> [28] GenomicAlignments_1.43.0 shiny.react_0.4.0 mime_0.12
#> [31] lifecycle_1.0.4 pkgconfig_2.0.3 Matrix_1.7-1
#> [34] R6_2.5.1 fastmap_1.2.0 GenomeInfoDbData_1.2.13
#> [37] digest_0.6.37 colorspace_2.1-1 GGally_2.2.1
#> [40] AnnotationDbi_1.69.0 OrganismDbi_1.49.0 Hmisc_5.2-1
#> [43] RSQLite_2.3.9 filelock_1.0.3 httr_1.4.7
#> [46] abind_1.4-8 compiler_4.4.2 bit64_4.5.2
#> [49] withr_3.0.2 htmlTable_2.4.3 backports_1.5.0
#> [52] BiocParallel_1.41.0 DBI_1.2.3 ggstats_0.7.0
#> [55] biomaRt_2.63.0 rappdirs_0.3.3 DelayedArray_0.33.3
#> [58] rjson_0.2.23 tools_4.4.2 foreign_0.8-87
#> [61] httpuv_1.6.15 nnet_7.3-19 glue_1.8.0
#> [64] restfulr_0.0.15 promises_1.3.2 grid_4.4.2
#> [67] checkmate_2.3.2 cluster_2.1.8 reshape2_1.4.4
#> [70] gtable_0.3.6 BSgenome_1.75.0 tidyr_1.3.1
#> [73] ensembldb_2.31.0 hms_1.1.3 data.table_1.16.4
#> [76] xml2_1.3.6 pillar_1.10.0 stringr_1.5.1
#> [79] later_1.4.1 dplyr_1.1.4 BiocFileCache_2.15.0
#> [82] lattice_0.22-6 bit_4.5.0.1 biovizBase_1.55.0
#> [85] RBGL_1.83.0 tidyselect_1.2.1 maketools_1.3.1
#> [88] knitr_1.49 gridExtra_2.3 ProtGenerics_1.39.1
#> [91] xfun_0.49 stringi_1.8.4 UCSC.utils_1.3.0
#> [94] lazyeval_0.2.2 yaml_2.3.10 evaluate_1.0.1
#> [97] codetools_0.2-20 tibble_3.2.1 graph_1.85.0
#> [100] BiocManager_1.30.25 cli_3.6.3 rpart_4.1.23
#> [103] xtable_1.8-4 munsell_0.5.1 jquerylib_0.1.4
#> [106] dichromat_2.0-0.1 Rcpp_1.0.13-1 dbplyr_2.5.0
#> [109] png_0.1-8 XML_3.99-0.17 parallel_4.4.2
#> [112] assertthat_0.2.1 blob_1.2.4 prettyunits_1.2.0
#> [115] AnnotationFilter_1.31.0 bitops_1.0-9 pwalign_1.3.1
#> [118] txdbmaker_1.3.1 scales_1.3.0 purrr_1.0.2
#> [121] crayon_1.5.3 rlang_1.1.4 KEGGREST_1.47.0