The correlation structure between
samples in complex study designs can be decomposed into the contributions
of multiple dimensions of variation. variancePartition
provides a statistical and visualization framework to interpret sources
of variation. Here I describe a visualization of the correlation
structure between samples for a single gene.
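As a rough illustration of this decomposition, the base R sketch below builds the correlation matrix implied by two grouping variables in a random-intercept model. The design, the labels, and the variance fractions in `frac` are all made up for illustration and are not estimated from any data; the helper `same_level()` is hypothetical and this is not a reproduction of variancePartition's internals. It relies only on the standard result that, when each sample belongs to exactly one level of each grouping variable, two samples sharing a level of a random effect are correlated by the fraction of variance that effect explains.

```r
# Toy design: 3 individuals x 2 tissues (hypothetical labels)
design <- expand.grid(
  Individual = paste0("ind", 1:3),
  Tissue = c("A", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical variance fractions (illustrative only, not estimated)
frac <- c(Individual = 0.5, Tissue = 0.2)

# Indicator matrix: 1 where samples i and j share a level of g, else 0
same_level <- function(g) outer(g, g, "==") * 1

# Contribution of each grouping variable to the correlation matrix
C_ind <- frac[["Individual"]] * same_level(design$Individual)
C_tis <- frac[["Tissue"]] * same_level(design$Tissue)

# Joint structure: sum of both contributions, with the residual
# fraction on the diagonal so that every sample has variance 1
C_total <- C_ind + C_tis + (1 - sum(frac)) * diag(nrow(design))

print(round(C_total, 2))
```

In this toy design, two samples from the same individual but different tissues have correlation 0.5, two samples from the same tissue but different individuals have correlation 0.2, and samples sharing neither are uncorrelated.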
In the example dataset described in the main vignette, samples are
correlated because they can come from the same individual or the same
tissue. The function plotCorrStructure() shows the
correlation structure caused by each variable as well as the joint
correlation structure. Figure 1 shows the correlation between samples
from the same individual, where (a) shows the samples sorted based on
clustering of the correlation matrix and (b) shows the original order.
Figure 1 (c) and (d) show the same type of plot for the
effect of tissue. The total correlation structure from summing the
individual and tissue correlation matrices is shown in Figure 2. The
code to generate these plots is shown below.
# Load the package that provides fitVarPartModel() and plotCorrStructure()
library(variancePartition)

# Fit linear mixed model and examine correlation structure
# for one gene
data(varPartData)
form <- ~ Age + (1 | Individual) + (1 | Tissue)
fitList <- fitVarPartModel(geneExpr[1:2, ], form, info)
# focus on one gene
fit <- fitList[[1]]
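The plotting calls themselves do not appear in this text, so the following is a sketch of how the panels described above could be produced. It assumes that the second argument of plotCorrStructure() selects the variable and that `reorder` switches between the clustered and original sample order; the mapping of calls to figure panels is inferred from the figure descriptions rather than taken from the original code. The model is refit at the top only so that the block runs on its own.

```r
library(variancePartition)

# Refit the example model so this sketch is self-contained
data(varPartData)
form <- ~ Age + (1 | Individual) + (1 | Tissue)
fit <- fitVarPartModel(geneExpr[1:2, ], form, info)[[1]]

# Figure 1a (assumed mapping): similarity within Individual,
# samples reordered by clustering of the correlation matrix
plotCorrStructure(fit, "Individual")

# Figure 1b (assumed mapping): same, samples in their original order
plotCorrStructure(fit, "Individual", reorder = FALSE)

# Figure 1c (assumed mapping): similarity within Tissue, clustered order
plotCorrStructure(fit, "Tissue")

# Figure 1d (assumed mapping): similarity within Tissue, original order
plotCorrStructure(fit, "Tissue", reorder = FALSE)

# Figure 2 (assumed mapping): joint structure from Individual and Tissue
plotCorrStructure(fit)
```

Leaving out the variable name asks for the combined structure, which is the sum of the per-variable matrices described above.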
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
## [4] LC_COLLATE=C LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] variancePartition_1.37.1 BiocParallel_1.41.0 limma_3.63.2
## [4] ggplot2_3.5.1 knitr_1.49 rmarkdown_2.29
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 xfun_0.49 bslib_0.8.0 caTools_1.18.3
## [5] Biobase_2.67.0 lattice_0.22-6 numDeriv_2016.8-1.1 vctrs_0.6.5
## [9] tools_4.4.2 Rdpack_2.6.2 bitops_1.0-9 generics_0.1.3
## [13] pbkrtest_0.5.3 parallel_4.4.2 tibble_3.2.1 fansi_1.0.6
## [17] pkgconfig_2.0.3 Matrix_1.7-1 KernSmooth_2.23-24 lifecycle_1.0.4
## [21] stringr_1.5.1 compiler_4.4.2 gplots_3.2.0 statmod_1.5.0
## [25] munsell_0.5.1 RhpcBLASctl_0.23-42 codetools_0.2-20 lmerTest_3.1-3
## [29] htmltools_0.5.8.1 sys_3.4.3 buildtools_1.0.0 sass_0.4.9
## [33] yaml_2.3.10 tidyr_1.3.1 pillar_1.9.0 nloptr_2.1.1
## [37] jquerylib_0.1.4 MASS_7.3-61 aod_1.3.3 cachem_1.1.0
## [41] iterators_1.0.14 boot_1.3-31 nlme_3.1-166 gtools_3.9.5
## [45] tidyselect_1.2.1 digest_0.6.37 stringi_1.8.4 mvtnorm_1.3-2
## [49] fANCOVA_0.6-1 reshape2_1.4.4 purrr_1.0.2 dplyr_1.1.4
## [53] maketools_1.3.1 splines_4.4.2 fastmap_1.2.0 grid_4.4.2
## [57] colorspace_2.1-1 cli_3.6.3 magrittr_2.0.3 utf8_1.2.4
## [61] broom_1.0.7 corpcor_1.6.10 withr_3.0.2 backports_1.5.0
## [65] scales_1.3.0 remaCor_0.0.18 matrixStats_1.4.1 lme4_1.1-35.5
## [69] evaluate_1.0.1 EnvStats_3.0.0 rbibutils_2.3 rlang_1.1.4
## [73] Rcpp_1.0.13-1 glue_1.8.0 BiocGenerics_0.53.3 minqa_1.2.8
## [77] jsonlite_1.8.9 plyr_1.8.9 R6_2.5.1