iSEEtree: interactive exploration of microbiome data

Introduction

Motivation

iSEEtree is a Bioconductor package for the interactive visualisation of microbiome data stored in a TreeSummarizedExperiment (TreeSE) container. On the one hand, it leverages and extends the graphics of the iSEE package, which is designed for the generic SummarizedExperiment class. On the other hand, it employs the statistical and visual tools for microbiome data science provided by the mia family of packages. Thus, iSEE and mia are the two building blocks of iSEEtree. Detailed introductory material on these two frameworks is available on the iSEE-verse website and in the OMA Bioconductor book, respectively.

iSEEtree is meant for new and experienced users alike who want to create and interact with several graphics for microbiome data without needing in-depth knowledge of the underlying mia functionality. Current microbiome-specific panels include phylogenetic trees, ordination plots and compositional plots, which are described further below in this article. Other, more generic panels are reused from the iSEE package and can also be tried out in this article.

Panels

iSEEtree derives its microbiome-related visualisation methods from the miaViz package, which is code-based and requires basic knowledge of R programming and microbiome data structures. The following panels represent an easy-to-use interactive version of the miaViz plotting functions:

  • AbundanceDensityPlot: a density plot of the top features, where every line is a feature and the x axis shows its abundance across samples. Its interpretation is explained in the OMA chapter on Exploration.
  • AbundancePlot: a compositional barplot of the samples, where every bar is a sample composed of different features in different colours. Its interpretation is explained in the OMA chapter on Community Composition.
  • ColumnTreePlot: the hierarchical organisation of the samples, where every tip is a sample and tips that sit closer together are more closely related.
  • LoadingPlot: a heatmap or barplot of the loadings or contributions of each feature to the reduced dimensions of PCA, PCoA or another ordination method.
  • RDAPlot: a supervised ordination plot of the samples, where every dot is a sample in a reduced dimensional space and every arrow reflects the contribution of a sample variable. Its interpretation is explained in the OMA chapter on Community Similarity.
  • RowTreePlot: a phylogenetic tree of the features, where every tip is a feature and two neighbouring tips are more closely related than two tips that are far apart.
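As a minimal sketch of how these panels can be requested explicitly, the example below builds a custom initial layout from three of the panel constructors listed above and passes it to iSEE() through its initial argument. The choice of panels and the PanelWidth values are illustrative assumptions; all other panel settings are left at their defaults.

```r
library(iSEEtree)
library(mia)

# Load the example TreeSE shipped with mia
data("Tengeler2020", package = "mia")
tse <- Tengeler2020

# Pick a subset of the microbiome-specific panels (illustrative choice);
# PanelWidth is expressed in the 12-column grid used by iSEE
initial <- list(
  RowTreePlot(PanelWidth = 6L),
  AbundancePlot(PanelWidth = 6L),
  AbundanceDensityPlot(PanelWidth = 12L)
)

# Launch the app only in an interactive session
if (interactive()) {
  iSEE(tse, initial = initial)
}
```

Each constructor returns a panel object whose slots can be inspected with slotNames() before the app is launched.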

By default, the iSEEtree layout also includes several panels inherited from iSEE.

The ColumnDataPlot could also prove useful for the visualisation of column variables such as alpha diversity indices. Its interpretation is explained in the OMA chapter on Community Diversity.
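The sketch below illustrates one way to set this up: an alpha diversity index is added to the column data and then shown in a ColumnDataPlot. It assumes a mia version that provides addAlpha() and accepts the index name "shannon_diversity"; the grouping variable patient_status comes from the Tengeler2020 example dataset and is an illustrative choice.

```r
library(iSEEtree)
library(mia)

data("Tengeler2020", package = "mia")
tse <- Tengeler2020

# Compute Shannon diversity per sample and store it in colData(tse);
# the column name is set explicitly so that the panel can refer to it
tse <- addAlpha(
  tse,
  assay.type = "counts",
  index = "shannon_diversity",   # assumed index name, see lead-in
  name = "shannon_diversity"
)

# Show the index on the y axis, grouped by a sample variable on the x axis
panel <- ColumnDataPlot(
  YAxis = "shannon_diversity",
  XAxis = "Column data",
  XAxisColumnData = "patient_status"
)

if (interactive()) {
  iSEE(tse, initial = list(panel))
}
```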

Tutorial

Installation

R is an open-source statistical environment whose functionality can be easily extended via packages. iSEEtree is an R package available on Bioconductor. R can be installed on any operating system from CRAN, after which you can install iSEEtree by using the following commands in your R session:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("iSEEtree")
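To confirm that the installation succeeded, a quick check along the following lines may help. It is only a sketch that uses base R functions; the wording of the message is illustrative.

```r
# TRUE if the package can be loaded from the current library paths
installed <- requireNamespace("iSEEtree", quietly = TRUE)

if (installed) {
  # Report which version was installed
  print(packageVersion("iSEEtree"))
} else {
  message("iSEEtree is not installed; try BiocManager::install(\"iSEEtree\") again")
}
```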

Example

The panels described above can be generated for an example TreeSE object as follows:

library(iSEEtree)
library(mia)
library(scater)

# Import TreeSE
data("Tengeler2020", package = "mia")
tse <- Tengeler2020

# Add relabundance assay
tse <- transformAssay(tse, method = "relabundance")

# Add reduced dimensions
tse <- runMDS(tse, assay.type = "relabundance")

# Launch iSEE
if (interactive()) {
  iSEE(tse)
}
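Before launching the app, it can be useful to check that the object contains what the panels rely on. The following sketch repeats the preparation steps from the example and then lists the assays, reduced dimensions and row trees of the object; which panels need which component is described in the Panels section, and the accessor calls here are one possible way to inspect them.

```r
library(mia)
library(scater)

# Same preparation as in the example above
data("Tengeler2020", package = "mia")
tse <- Tengeler2020
tse <- transformAssay(tse, method = "relabundance")
tse <- runMDS(tse, assay.type = "relabundance")

# Assays available to the abundance panels
assayNames(tse)

# Reduced dimensions available to the ordination panels
reducedDimNames(tse)

# Trees available to RowTreePlot
rowTreeNames(tse)
```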

Resources

Citation

We hope that iSEEtree will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!

## Citation info
citation("iSEEtree")
#> Warning in citation("iSEEtree"): could not determine year for 'iSEEtree' from
#> package DESCRIPTION file
#> To cite package 'iSEEtree' in publications use:
#> 
#>   Benedetti G, Lahti L (????). _iSEEtree: Interactive visualisation for
#>   microbiome data_. R package version 1.1.0,
#>   <https://github.com/microbiome/iSEEtree>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {iSEEtree: Interactive visualisation for microbiome data},
#>     author = {Giulio Benedetti and Leo Lahti},
#>     note = {R package version 1.1.0},
#>     url = {https://github.com/microbiome/iSEEtree},
#>   }

Background Knowledge

iSEEtree builds on many other packages, in particular those that implement the infrastructure needed for dealing with omics data, microbiome data and interactive visualisation, such as SummarizedExperiment, TreeSummarizedExperiment, mia, iSEE and shiny.

If you are asking yourself the question “Where do I start using Bioconductor?”, you might be interested in this blog post.

Help

As package developers, we try to explain clearly how to use our packages and in which order to use the functions. However, R and Bioconductor have a steep learning curve, so it is critical to learn where to ask for help. The blog post mentioned above lists some options, but we would like to highlight the Bioconductor support site as the main resource for getting help: remember to use the iSEEtree tag and check the older posts. Other alternatives are available, such as creating GitHub issues and tweeting. Please note that if you want to receive help, you should adhere to the posting guidelines. It is particularly critical that you provide a small reproducible example and your session information so that package developers can track down the source of the error.
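The session information mentioned above can be collected with base R, as in this short sketch; the output can then be pasted into a support request together with the reproducible example.

```r
# Record the R version, platform and package versions of the current session
info <- sessionInfo()
print(info)
```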

Reproducibility

iSEEtree was made possible thanks to:

  • R (R Core Team, 2024)
  • BiocStyle (Oleś, 2024)
  • knitr (Xie, 2024)
  • RefManageR (McLean, 2017)
  • rmarkdown (Allaire, Xie, Dervieux, McPherson, Luraschi, Ushey, Atkins, Wickham, Cheng, Chang, and Iannone, 2024)
  • testthat (Wickham, 2011)

This package was developed using usethis.

R session information:

#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] scater_1.34.0                   ggplot2_3.5.1                   scuttle_1.16.0                 
#>  [4] mia_1.15.0                      TreeSummarizedExperiment_2.14.0 Biostrings_2.75.0              
#>  [7] XVector_0.46.0                  MultiAssayExperiment_1.33.0     iSEEtree_1.1.0                 
#> [10] iSEE_2.19.0                     SingleCellExperiment_1.28.0     SummarizedExperiment_1.36.0    
#> [13] Biobase_2.67.0                  GenomicRanges_1.59.0            GenomeInfoDb_1.43.0            
#> [16] IRanges_2.41.0                  S4Vectors_0.44.0                BiocGenerics_0.53.1            
#> [19] generics_0.1.3                  MatrixGenerics_1.19.0           matrixStats_1.4.1              
#> [22] RefManageR_1.4.0                BiocStyle_2.35.0               
#> 
#> loaded via a namespace (and not attached):
#>   [1] splines_4.4.1               later_1.3.2                 ggplotify_0.1.2             tibble_3.2.1               
#>   [5] polyclip_1.10-7             rpart_4.1.23                DirichletMultinomial_1.49.0 lifecycle_1.0.4            
#>   [9] doParallel_1.0.17           miaViz_1.15.0               lattice_0.22-6              MASS_7.3-61                
#>  [13] SnowballC_0.7.1             backports_1.5.0             magrittr_2.0.3              Hmisc_5.2-0                
#>  [17] sass_0.4.9                  rmarkdown_2.28              jquerylib_0.1.4             yaml_2.3.10                
#>  [21] httpuv_1.6.15               DBI_1.2.3                   buildtools_1.0.0            minqa_1.2.8                
#>  [25] RColorBrewer_1.1-3          lubridate_1.9.3             abind_1.4-8                 zlibbioc_1.52.0            
#>  [29] purrr_1.0.2                 ggraph_2.2.1                yulab.utils_0.1.7           nnet_7.3-19                
#>  [33] tweenr_2.0.3                sandwich_3.1-1              circlize_0.4.16             GenomeInfoDbData_1.2.13    
#>  [37] ggrepel_0.9.6               tokenizers_0.3.0            irlba_2.3.5.1               tidytree_0.4.6             
#>  [41] maketools_1.3.1             vegan_2.6-8                 rbiom_1.0.3                 permute_0.9-7              
#>  [45] DelayedMatrixStats_1.29.0   codetools_0.2-20            DelayedArray_0.33.1         ggforce_0.4.2              
#>  [49] DT_0.33                     xml2_1.3.6                  tidyselect_1.2.1            shape_1.4.6.1              
#>  [53] aplot_0.2.3                 farver_2.1.2                UCSC.utils_1.2.0            lme4_1.1-35.5              
#>  [57] ScaledMatrix_1.14.0         viridis_0.6.5               shinyWidgets_0.8.7          base64enc_0.1-3            
#>  [61] jsonlite_1.8.9              GetoptLong_1.0.5            BiocNeighbors_2.1.0         tidygraph_1.3.1            
#>  [65] decontam_1.27.0             Formula_1.2-5               iterators_1.0.14            foreach_1.5.2              
#>  [69] ggnewscale_0.5.0            tools_4.4.1                 treeio_1.30.0               Rcpp_1.0.13                
#>  [73] glue_1.8.0                  gridExtra_2.3               SparseArray_1.6.0           xfun_0.48                  
#>  [77] mgcv_1.9-1                  dplyr_1.1.4                 withr_3.0.2                 shinydashboard_0.7.2       
#>  [81] BiocManager_1.30.25         fastmap_1.2.0               boot_1.3-31                 bluster_1.17.0             
#>  [85] fansi_1.0.6                 shinyjs_2.1.0               digest_0.6.37               rsvd_1.0.5                 
#>  [89] gridGraphics_0.5-1          timechange_0.3.0            R6_2.5.1                    mime_0.12                  
#>  [93] colorspace_2.1-1            listviewer_4.0.0            lpSolve_5.6.21              utf8_1.2.4                 
#>  [97] tidyr_1.3.1                 data.table_1.16.2           DECIPHER_3.3.0              graphlayouts_1.2.0         
#> [101] httr_1.4.7                  htmlwidgets_1.6.4           S4Arrays_1.6.0              pkgconfig_2.0.3            
#> [105] gtable_0.3.6                ComplexHeatmap_2.23.0       sys_3.4.3                   janeaustenr_1.0.0          
#> [109] htmltools_0.5.8.1           rintrojs_0.3.4              clue_0.3-65                 scales_1.3.0               
#> [113] png_0.1-8                   ggfun_0.1.7                 knitr_1.48                  rstudioapi_0.17.1          
#> [117] reshape2_1.4.4              rjson_0.2.23                checkmate_2.3.2             nlme_3.1-166               
#> [121] nloptr_2.1.1                shinyAce_0.4.3              zoo_1.8-12                  cachem_1.1.0               
#> [125] GlobalOptions_0.1.2         stringr_1.5.1               parallel_4.4.1              miniUI_0.1.1.1             
#> [129] vipor_0.4.7                 foreign_0.8-87              pillar_1.9.0                grid_4.4.1                 
#> [133] vctrs_0.6.5                 slam_0.1-54                 promises_1.3.0              BiocSingular_1.23.0        
#> [137] beachmat_2.23.0             xtable_1.8-4                cluster_2.1.6               beeswarm_0.4.0             
#> [141] htmlTable_2.4.3             evaluate_1.0.1              mvtnorm_1.3-1               cli_3.6.3                  
#> [145] compiler_4.4.1              rlang_1.1.4                 crayon_1.5.3                tidytext_0.4.2             
#> [149] mediation_4.5.0             plyr_1.8.9                  fs_1.6.5                    ggbeeswarm_0.7.2           
#> [153] stringi_1.8.4               viridisLite_0.4.2           BiocParallel_1.41.0         munsell_0.5.1              
#> [157] lazyeval_0.2.2              colourpicker_1.3.0          Matrix_1.7-1                patchwork_1.3.0            
#> [161] sparseMatrixStats_1.18.0    shiny_1.9.1                 highr_0.11                  memoise_2.0.1              
#> [165] igraph_2.1.1                RcppParallel_5.1.9          bslib_0.8.0                 ggtree_3.15.0              
#> [169] bibtex_0.5.1                ape_5.8

References

This vignette was generated using BiocStyle (Oleś, 2024) with knitr (Xie, 2024) and rmarkdown (Allaire, Xie, Dervieux et al., 2024) running behind the scenes. Citations were generated with RefManageR (McLean, 2017).

[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.28. 2024. URL: https://github.com/rstudio/rmarkdown.

[2] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.

[3] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.35.0. 2024. URL: https://github.com/Bioconductor/BiocStyle.

[4] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2024. URL: https://www.R-project.org/.

[5] H. Wickham. “testthat: Get Started with Testing”. In: The R Journal 3 (2011), pp. 5–10. URL: https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.

[6] Y. Xie. knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.48. 2024. URL: https://yihui.org/knitr/.