library(ggtree)
library(ggtreeDendro)
library(aplot)
scale_color_subtree <- ggtreeDendro::scale_color_subtree
Clustering is an important method for classifying items into categories and for inferring function, since similar objects tend to behave similarly. More than 200 Bioconductor packages implement clustering algorithms or employ clustering methods for omics data analysis.
Although these methods are important for data analysis, support for visualizing their results is quite limited. Most of the packages can only visualize the hierarchical tree structure using stats:::plot.hclust(). This package is designed to visualize hierarchical tree structures together with associated data (e.g., clinical information collected with the samples) using the powerful in-house developed ggtree package.
This package implements a set of autoplot() methods to display tree structures. We will implement more autoplot() methods to support more objects. The output of these autoplot() methods is a ggtree object, which can be further annotated by adding layers using ggplot2 syntax. Integrating associated data to annotate the tree is also supported by the ggtreeExtra package.
Here are some demonstrations of using autoplot() methods to visualize common hierarchical clustering tree objects.
hclust and dendrogram objects

These two classes are defined in the stats package.
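
As an illustration, the sketch below clusters a built-in dataset and passes both an hclust and a dendrogram object to autoplot(). The USArrests data, the average linkage and the object names are assumptions chosen for the example; they are not prescribed by the text above.

```r
library(ggtree)
library(ggtreeDendro)
library(aplot)

# Assumption: USArrests is used only as a convenient built-in example dataset.
d <- dist(USArrests)
hc <- hclust(d, method = "average")   # object of class "hclust"
den <- as.dendrogram(hc)              # same tree as a "dendrogram" object

# autoplot() returns a ggtree object, so ggtree layers can be added with `+`.
p1 <- autoplot(hc) + geom_tiplab()
p2 <- autoplot(den)

# Arrange the two trees side by side with aplot::plot_list().
combined <- plot_list(p1, p2, ncol = 2)
combined
```

Because both results are ggtree objects, any further ggplot2 or ggtree layer can be appended in the same way.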
linkage objects

The linkage class is defined in the mdendro package.
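
A minimal sketch of plotting a linkage object follows. The dataset, the complete-linkage method, the circular layout and the choice of four coloured subtrees are all assumptions made for illustration.

```r
library(ggtree)
library(ggtreeDendro)
library(mdendro)

# Resolve the name explicitly, since more than one attached package may export it.
scale_color_subtree <- ggtreeDendro::scale_color_subtree

# Assumption: USArrests is used only as a convenient built-in example dataset.
d <- dist(USArrests)
lnk <- linkage(d, digits = 1, method = "complete")   # object of class "linkage"

# Draw the tree in a circular layout and colour branches by 4 subtrees.
p <- autoplot(lnk, layout = "circular") +
  geom_tiplab() +
  scale_color_subtree(4) +
  theme_tree()
p
```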
agnes, diana and twins objects

These classes are defined in the cluster package.
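
The following sketch shows agglomerative (agnes) and divisive (diana) clusterings plotted with the same autoplot() call. The mtcars dataset and the default clustering settings are assumptions chosen for the example.

```r
library(ggtree)
library(ggtreeDendro)
library(aplot)
library(cluster)

# Assumption: mtcars is used only as a convenient built-in example dataset.
x1 <- agnes(mtcars)   # agglomerative nesting; class c("agnes", "twins")
x2 <- diana(mtcars)   # divisive analysis; class c("diana", "twins")

# Both objects go through autoplot() and yield ggtree objects.
p1 <- autoplot(x1) + geom_tiplab()
p2 <- autoplot(x2) + geom_tiplab()

combined <- plot_list(p1, p2, ncol = 2)
combined
```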
pvclust objects

The pvclust class is defined in the pvclust package.
library(pvclust)
data(Boston, package = "MASS")
set.seed(123)
result <- pvclust(Boston, method.dist="cor", method.hclust="average", nboot=1000, parallel=TRUE)
## Creating a temporary cluster...done:
## socket cluster with 3 nodes on host 'localhost'
## Multiscale bootstrap... Done.
The pvclust object contains two types of p-values: the AU (Approximately Unbiased) p-value and the BP (Bootstrap Probability) value. These values are automatically labelled on the tree.
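
A sketch of drawing such a tree is given below. To keep it self-contained and quick to run, it refits the clustering sequentially with only 100 bootstrap replicates under a hypothetical name, pv_small; the object fitted with nboot=1000 can be substituted. The label_edge and pvrect arguments are assumed from the ggtreeDendro interface and should be checked against the package documentation.

```r
library(ggtree)
library(ggtreeDendro)
library(pvclust)

data(Boston, package = "MASS")
set.seed(123)

# Assumption: a reduced, sequential bootstrap (nboot = 100) purely for speed.
pv_small <- pvclust(Boston, method.dist = "cor",
                    method.hclust = "average", nboot = 100)

# label_edge = TRUE is assumed to print the AU/BP values along the edges;
# pvrect = TRUE is assumed to box clusters with high AU support.
p <- autoplot(pv_small, label_edge = TRUE, pvrect = TRUE) + geom_tiplab()
p
```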
Here is the output of sessionInfo() on the system on which this document was compiled:
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] pvclust_2.2-0 cluster_2.1.6 mdendro_2.2.1 aplot_0.2.3
## [5] ggtreeDendro_1.9.0 ggtree_3.13.2 yulab.utils_0.1.7 prettydoc_0.4.1
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.9 utf8_1.2.4 generics_0.1.3 tidyr_1.3.1
## [5] ggplotify_0.1.2 lattice_0.22-6 digest_0.6.37 magrittr_2.0.3
## [9] evaluate_1.0.1 grid_4.4.1 fastmap_1.2.0 jsonlite_1.8.9
## [13] ape_5.8 purrr_1.0.2 fansi_1.0.6 scales_1.3.0
## [17] lazyeval_0.2.2 jquerylib_0.1.4 cli_3.6.3 rlang_1.1.4
## [21] munsell_0.5.1 tidytree_0.4.6 withr_3.0.2 cachem_1.1.0
## [25] yaml_2.3.10 tools_4.4.1 parallel_4.4.1 dplyr_1.1.4
## [29] colorspace_2.1-1 ggplot2_3.5.1 buildtools_1.0.0 vctrs_0.6.5
## [33] R6_2.5.1 gridGraphics_0.5-1 lifecycle_1.0.4 fs_1.6.4
## [37] ggfun_0.1.7 treeio_1.29.2 pkgconfig_2.0.3 pillar_1.9.0
## [41] bslib_0.8.0 gtable_0.3.6 glue_1.8.0 Rcpp_1.0.13
## [45] highr_0.11 xfun_0.48 tibble_3.2.1 tidyselect_1.2.1
## [49] sys_3.4.3 knitr_1.48 farver_2.1.2 htmltools_0.5.8.1
## [53] nlme_3.1-166 patchwork_1.3.0 labeling_0.4.3 rmarkdown_2.28
## [57] maketools_1.3.1 compiler_4.4.1