Introduction to microSTASIS

Introduction

µSTASIS, or microSTASIS, was firstly develop for assessing the stability of microbiota over time. The initial idea behind it was to warrant the proper processing of high-dimensional, sparse, undetermined, right skewed, overdispersed and compositional data. Specifically, it aims to fill the gap of temporal stability metrics in the compositional analysis of microbiome data. However, it can also work if the data are not transformed for belonging to the Aitchison simplex.

On one side, the assumptions in which µSTASIS is based are two: the individual-specific composition and the within-individual variability over time. On the other side, the output, which we called mS score, is easy to interpret and provides a contextualized and intuitive metric to estimate temporal microbiota stability. Also, the package incorporates leave-p-out (or k) cross-validation routines, that compute the mean absolute error, and multiple functions to visualize the result.

Therefore, sort of the questions that could be answered are related with robust time-resolved definitions of stability to assess microbiota behaviour against perturbations or to search for specific compositions that present a dynamic equilibrium around a central attractor state rather than overcoming a threshold of no return.

Algorithm

Firstly, two samples from the same individual has to be paired. For that, one can merge the sequential paired times (t1 with t2, t2 with t3…) or use a non-sequential order (t1 with t3). Therefore, from a single large data matrix, we would generate more that are smaller, that is, with fewer observations (samples).

Once we have the paired samples, it is time for the main algorithm: iterative clustering. Concretely, Hartigan-Wong k-means algorithm is used as many times as possible for stressing out paired samples from the same individuals to test if they remain together for multiple numbers of clusters over a whole data set of individuals. Also, this is corrected in those cases where the distance between samples from the same individuals is lower than the distances to their respective centroids.

Why Bioconductor?

Although the package was initially released at CRAN, we feel that releasing it at Bioconductor could add better experience to the user. Some functions were internally modified leveraging on other Bioconductor packages as well as we added interoperability of our workflow with TreeSummarizedExperiment.

  • microSTASIS (Sanchez-Sanchez, Santonja, and Benitez-Paez, 2022)

Basics

Install microSTASIS

microSTASIS is a R package available via the Bioconductor repository (formerly at CRAN) which can be installed by using the following commands in your R session:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
  }
BiocManager::install("microSTASIS")
## Check that you have a valid Bioconductor installation
BiocManager::valid()

Citing microSTASIS

We hope that microSTASIS will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!

## Citation info
citation("microSTASIS")
#> To cite microSTASIS in publications, please use:
#> 
#>   Sanchez-Sanchez P, Santonja FJ, Benitez-Paez A (2022). "Assessment of
#>   human microbiota stability across longitudinal samples using
#>   iteratively growing-partitioned clustering." _Briefings in
#>   Bioinformatics_, *23*(2). ISSN -2577, doi:10.1093/bib/bbac055
#>   <https://doi.org/10.1093/bib/bbac055>, bbac055,
#>   <https://doi.org/10.1093/bib/bbac055>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Article{,
#>     title = {Assessment of human microbiota stability across longitudinal 
#>           samples using iteratively growing-partitioned clustering},
#>     author = {Pedro Sanchez-Sanchez and Francisco J. Santonja and Alfonso Benitez-Paez},
#>     journal = {Briefings in Bioinformatics},
#>     volume = {23},
#>     number = {2},
#>     year = {2022},
#>     month = {02},
#>     issn = {-2577},
#>     doi = {10.1093/bib/bbac055},
#>     url = {https://doi.org/10.1093/bib/bbac055},
#>     note = {bbac055},
#>   }

Quick start to using microSTASIS

## Load the package
library("microSTASIS")

The input data can be a matrix where the rownames include ID, common pattern and sampling time and columns are the microbial features i.e. amplicon sequence variants (ASVs) or operational taxonomic units (OTUs). Alternatively, it can be a TreeSummarizedExperiment object containing the matrix in an assay slot or within an altExps (alternative experiment) slot. Then, the ID and sampling time info should be in the colData slot.

## Show the input data
data(clr)
clr[1:8, 1:6]
#>                 1        2          3          4          6          7
#> 003_0_1  2.891583 2.996944 -1.1114255 -1.1114255 -1.8606574  1.3875060
#> 003_0_25 7.430753 7.365052  4.4745478  4.7213331  0.9239730 -0.8677865
#> 003_0_26 6.359208 6.314002  1.3350641  1.7537744  0.1956298 -0.2743738
#> 003_0_27 7.364809 7.372548  4.5086757  4.6234512 -2.7098118 -2.7098118
#> 004_0_1  4.033265 4.028176 -2.3723362 -2.3723362  1.8855559 -2.3723362
#> 004_0_25 4.666157 4.539863 -1.2180974 -1.2180974 -2.0048403  0.5886198
#> 004_0_26 5.005840 5.082213 -0.2471546 -0.2471546  7.2118445 -1.5034582
#> 004_0_27 4.492273 4.449713  3.5114435 -0.9009753  8.4833123 -0.9009753

The first step is to subset an initial matrix of microbiome data with multiple sampling points. ThepairedTimes() function already do it for every possible paired times in a sequential = TRUE way or for specific times points, for example: sequential = FALSE, specifiedTimePoints = c("1", "3"). The output is a list with length equal to the number of paired times.

## Subseting the initial data to a list of multiple paired times
times <- pairedTimes(data = clr, sequential = TRUE, common = "_0_")

Alternatively, the input can also be a TreeSummarizedExperiment, like below.

## Loading the TSE package
suppressPackageStartupMessages(library(TreeSummarizedExperiment))
## Creating a TreeSummarizedExperiment object
ID_common_time <- rownames(clr)
samples <- 1:dim(clr)[1]
metadata <- data.frame(samples, stringr::str_split(ID_common_time, "_", 
                                                   simplify = TRUE))
colnames(metadata)[2:4] <- c("ID", "common", "time_point")
both_tse <- TreeSummarizedExperiment(assays = list(counts = t(clr)), 
                                     colData = metadata)
altExp(both_tse) <- SummarizedExperiment(assays = SimpleList(data = t(clr)), 
                                         colData = metadata)
assayNames(altExp(both_tse)) <- "data"
## Subseting the data as two different TSE objects to two list of multiple paired times
TSE_times <- pairedTimes(data = both_tse, sequential = TRUE, assay = "counts",
                         alternativeExp = NULL,
                         ID = "ID", timePoints = "time_point")
TSE_altExp_times <- pairedTimes(data = both_tse, sequential = TRUE, 
                                assay = "data", alternativeExp = "unnamed1",
                                 ID = "ID", timePoints = "time_point")

It is pretty important not to forget sequential = FALSE when the user wants to get specifiedTimePoints.

Main algorithm

Then, it is time for iterativeClustering(), which can be done in parallel and runs for every item in the list.

## Main algorithm applied to the matrix-derived object
mS <- iterativeClustering(pairedTimes = times, common = "_")

Note that the downstream functions work equally once we have the pairedTimes()output. Also, BPPARAM can be specified as following BiocParallel guides.

## Main algorithm applied to the TSE-derived object
mS_TSE <- iterativeClustering(pairedTimes = TSE_times, common = "_")

Visualization

There are two main functions for visualizing the results: a scatter plot with boxplots on the side and a heatmap.

## Prepare the result for the later visualization functions
results <- mSpreviz(results = mS, times = times)
## Scatter plot of the stability scores per patient and time
plotmSscatter(results = results, order = "median", times = c("t1_t25", "t25_t26"), 
          gridLines = TRUE, sideScale = 0.2)
#> Registered S3 method overwritten by 'ggside':
#>   method from   
#>   +.gg   ggplot2
#> Warning: Removed 16 rows containing non-finite outside the scale range
#> (`stat_xsideboxplot()`).
#> Warning: Removed 16 rows containing missing values or values outside the scale range
#> (`geom_point()`).

## Heatmap of the stability scores per patient and time
plotmSheatmap(results = results, order = "mean", times = c("t1_t25", "t25_t26"), 
          label = TRUE)
#> Warning: Removed 16 rows containing missing values or values outside the scale range
#> (`geom_text()`).

Both functions can be sorted by the stability score by the argument order.

The interpretation of both visualizations is similar. One the one side we can see how the stability scores differ a little bit between the sampling points, although the boxplots show that the values range from 0 to 1. Equally, the heatmap allows to see what are the most stable patients over sampling points.

Cross-validation

The stability results can be validated by cross validation (CV) in two ways: leave-one-out or leave-k-out. k is the number of individuals removed in each time that iterativeClustering() is run internally; it is equal to 1 for the LOOCV.

## Cross validation in a parallelized way with different k values
cv_klist_k2 <- BiocParallel::bpmapply(iterativeClusteringCV, name = names(times), 
                                      k = rep(2L, 3), 
                                      MoreArgs = list(pairedTimes = times, 
                                                      results = mS, 
                                                      common = "_0_"))

Then, the result can be displayed in two ways: as the mean absolute error or as a visual representation of how the mS score change.

## Calculate the mean absolute error after computing the cross validation
MAE_t1_t25 <- mSerrorCV(pairedTime = times$t1_t25, CVklist = cv_klist_k2[[1]],  
                        k = 2L)

Remember that for both mSerrorCV() and plotmSlinesCV(), k argument has to be the same previously used for extracting the input for CVklist. For that, we recommend to save it in another local variable and then use it for the arguments of the different functions.

## Prepare the result for the later visualization functions
MAE <- mSpreviz(results = list(MAE_t1_t25), times = list(t1_t25 = times$t1_t25))
## Heatmap of the mean absolute error values per patient and time
plotmSheatmap(results = MAE, times = c("t1_t25", "t25_t26"), label = TRUE,
          high = 'red2',  low = 'forestgreen', midpoint = 5)

The visualizations allows to look at the reliability of the stability metric for our data set and to detect some cases where the score should not be taken doubtlessly.

## Line plot of the stability scores per patient at a given paired times
## Both the proper score after iterative clustering and the cross validation ones
plotmSlinesCV(pairedTime = times$t1_t25, CVklist = cv_klist_k2[[1]], k = 2L)
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(X[[i]], ...): argument is not numeric or logical:
#> returning NA

The lines visualizations should be seen as a way to track how the stability score for an individual in a given paired time changes in dependence to the data set i.e. when removing other individuals. Variability is expected, but heavy drops or increases not.

Metadata integration

The analysis can also be integrated with metadata of the individuals in the following way.

## Create a metadata data frame
metadata <- data.frame(Sample = rownames(clr), age = c(rep("youth", 65), 
                                                       rep("old", 131-65)))
## Modify the metadata to the proper format that match with results
group <- mSmetadataGroups(metadata = metadata, samples = metadata$Sample, 
                          common = "_0_", individuals = results$individual, 
                          variable = "age")
## Boxplot with individual stability scores split by groups
plotmSdynamics(results, groups = group, points = TRUE, linetype = 0)
#> Warning: Removed 32 rows containing non-finite outside the scale range
#> (`stat_summary()`).
#> Warning: Removed 32 rows containing non-finite outside the scale range
#> (`stat_boxplot()`).
#> Warning: Removed 32 rows containing missing values or values outside the scale range
#> (`geom_point()`).

This plot allow to track general trends of stability in different groups across time.

Reproducibility

The microSTASIS package (Sanchez-Sanchez, Santonja, and Benitez-Paez, 2022) was made possible thanks to:

  • R (R Core Team, 2024)
  • BiocStyle (Oleś, 2024)
  • knitr (Xie, 2024)
  • RefManageR (McLean, 2017)
  • rmarkdown (Allaire, Xie, Dervieux, McPherson, Luraschi, Ushey, Atkins, Wickham, Cheng, Chang, and Iannone, 2024)
  • sessioninfo (Wickham, Chang, Flight, Müller, and Hester, 2021)
  • testthat (Wickham, 2011)

This package was developed using biocthis.

Code for creating the vignette.

## Create the vignette
library("rmarkdown")
system.time(render("microSTASIS.Rmd", "BiocStyle::html_document"))
## Extract the R code
library("knitr")
knit("microSTASIS.Rmd", tangle = TRUE)

Date the vignette was generated.

#> [1] "2024-10-30 08:51:28 UTC"

Wallclock time spent generating the vignette.

#> Time difference of 1.211 mins

R session information

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       Ubuntu 24.04.1 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  C
#>  ctype    en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2024-10-30
#>  pandoc   3.2.1 @ /usr/local/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package                  * version date (UTC) lib source
#>  abind                      1.4-8   2024-09-12 [2] RSPM (R 4.4.0)
#>  ape                        5.8     2024-04-11 [2] RSPM (R 4.4.0)
#>  backports                  1.5.0   2024-05-23 [2] RSPM (R 4.4.0)
#>  bibtex                     0.5.1   2023-01-26 [2] RSPM (R 4.4.0)
#>  Biobase                  * 2.67.0  2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  BiocGenerics             * 0.53.0  2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  BiocManager                1.30.25 2024-08-28 [2] RSPM (R 4.4.0)
#>  BiocParallel               1.41.0  2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  BiocStyle                * 2.35.0  2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  Biostrings               * 2.75.0  2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  bslib                      0.8.0   2024-07-29 [2] RSPM (R 4.4.0)
#>  buildtools                 1.0.0   2024-10-28 [3] local (/pkg)
#>  cachem                     1.1.0   2024-05-16 [2] RSPM (R 4.4.0)
#>  cli                        3.6.3   2024-06-21 [2] RSPM (R 4.4.0)
#>  codetools                  0.2-20  2024-03-31 [2] RSPM (R 4.4.0)
#>  colorspace                 2.1-1   2024-07-26 [2] RSPM (R 4.4.0)
#>  crayon                     1.5.3   2024-06-20 [2] RSPM (R 4.4.0)
#>  DelayedArray               0.33.1  2024-10-30 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  digest                     0.6.37  2024-08-19 [2] RSPM (R 4.4.0)
#>  dplyr                      1.1.4   2023-11-17 [2] RSPM (R 4.4.0)
#>  evaluate                   1.0.1   2024-10-10 [2] RSPM (R 4.4.0)
#>  fansi                      1.0.6   2023-12-08 [2] RSPM (R 4.4.0)
#>  farver                     2.1.2   2024-05-13 [2] RSPM (R 4.4.0)
#>  fastmap                    1.2.0   2024-05-15 [2] RSPM (R 4.4.0)
#>  fs                         1.6.4   2024-04-25 [2] RSPM (R 4.4.0)
#>  generics                   0.1.3   2022-07-05 [2] RSPM (R 4.4.0)
#>  GenomeInfoDb             * 1.41.2  2024-10-02 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  GenomeInfoDbData           1.2.13  2024-10-30 [2] Bioconductor
#>  GenomicRanges            * 1.57.2  2024-10-29 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  ggplot2                    3.5.1   2024-04-23 [2] RSPM (R 4.4.0)
#>  ggside                     0.3.1   2024-03-01 [2] RSPM (R 4.4.0)
#>  glue                       1.8.0   2024-09-30 [2] RSPM (R 4.4.0)
#>  gtable                     0.3.6   2024-10-25 [2] RSPM (R 4.4.0)
#>  highr                      0.11    2024-05-26 [2] RSPM (R 4.4.0)
#>  htmltools                  0.5.8.1 2024-04-04 [2] RSPM (R 4.4.0)
#>  httr                       1.4.7   2023-08-15 [2] RSPM (R 4.4.0)
#>  IRanges                  * 2.39.2  2024-10-25 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  jquerylib                  0.1.4   2021-04-26 [2] RSPM (R 4.4.0)
#>  jsonlite                   1.8.9   2024-09-20 [2] RSPM (R 4.4.0)
#>  knitr                      1.48    2024-07-07 [2] RSPM (R 4.4.0)
#>  labeling                   0.4.3   2023-08-29 [2] RSPM (R 4.4.0)
#>  lattice                    0.22-6  2024-03-20 [2] RSPM (R 4.4.0)
#>  lazyeval                   0.2.2   2019-03-15 [2] RSPM (R 4.4.0)
#>  lifecycle                  1.0.4   2023-11-07 [2] RSPM (R 4.4.0)
#>  lubridate                  1.9.3   2023-09-27 [2] RSPM (R 4.4.0)
#>  magrittr                   2.0.3   2022-03-30 [2] RSPM (R 4.4.0)
#>  maketools                  1.3.1   2024-10-28 [3] Github (jeroen/maketools@d46f92c)
#>  Matrix                     1.7-1   2024-10-18 [2] RSPM (R 4.4.0)
#>  MatrixGenerics           * 1.17.1  2024-10-23 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  matrixStats              * 1.4.1   2024-09-08 [2] RSPM (R 4.4.0)
#>  microSTASIS              * 1.7.0   2024-10-30 [1] https://bioc.r-universe.dev (R 4.4.1)
#>  munsell                    0.5.1   2024-04-01 [2] RSPM (R 4.4.0)
#>  nlme                       3.1-166 2024-08-14 [2] RSPM (R 4.4.0)
#>  pillar                     1.9.0   2023-03-22 [2] RSPM (R 4.4.0)
#>  pkgconfig                  2.0.3   2019-09-22 [2] RSPM (R 4.4.0)
#>  plyr                       1.8.9   2023-10-02 [2] RSPM (R 4.4.0)
#>  purrr                      1.0.2   2023-08-10 [2] RSPM (R 4.4.0)
#>  R6                         2.5.1   2021-08-19 [2] RSPM (R 4.4.0)
#>  Rcpp                       1.0.13  2024-07-17 [2] RSPM (R 4.4.0)
#>  RefManageR               * 1.4.0   2022-09-30 [2] RSPM (R 4.4.0)
#>  rlang                      1.1.4   2024-06-04 [2] RSPM (R 4.4.0)
#>  rmarkdown                  2.28    2024-08-17 [2] RSPM (R 4.4.0)
#>  S4Arrays                   1.5.11  2024-10-29 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  S4Vectors                * 0.43.2  2024-10-17 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  sass                       0.4.9   2024-03-15 [2] RSPM (R 4.4.0)
#>  scales                     1.3.0   2023-11-28 [2] RSPM (R 4.4.0)
#>  sessioninfo              * 1.2.2   2021-12-06 [2] RSPM (R 4.4.0)
#>  SingleCellExperiment     * 1.27.2  2024-10-26 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  SparseArray                1.5.45  2024-10-29 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  stringi                    1.8.4   2024-05-06 [2] RSPM (R 4.4.0)
#>  stringr                    1.5.1   2023-11-14 [2] RSPM (R 4.4.0)
#>  SummarizedExperiment     * 1.35.5  2024-10-29 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  sys                        3.4.3   2024-10-04 [2] RSPM (R 4.4.0)
#>  tibble                     3.2.1   2023-03-20 [2] RSPM (R 4.4.0)
#>  tidyr                      1.3.1   2024-01-24 [2] RSPM (R 4.4.0)
#>  tidyselect                 1.2.1   2024-03-11 [2] RSPM (R 4.4.0)
#>  tidytree                   0.4.6   2023-12-12 [2] RSPM (R 4.4.0)
#>  timechange                 0.3.0   2024-01-18 [2] RSPM (R 4.4.0)
#>  treeio                     1.29.2  2024-10-28 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  TreeSummarizedExperiment * 2.13.0  2024-10-28 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  UCSC.utils                 1.1.0   2024-10-29 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  utf8                       1.2.4   2023-10-22 [2] RSPM (R 4.4.0)
#>  vctrs                      0.6.5   2023-12-01 [2] RSPM (R 4.4.0)
#>  withr                      3.0.2   2024-10-28 [2] RSPM (R 4.4.0)
#>  xfun                       0.48    2024-10-03 [2] RSPM (R 4.4.0)
#>  xml2                       1.3.6   2023-12-04 [2] RSPM (R 4.4.0)
#>  XVector                  * 0.45.0  2024-10-02 [2] https://bioc.r-universe.dev (R 4.4.1)
#>  yaml                       2.3.10  2024-07-26 [2] RSPM (R 4.4.0)
#>  yulab.utils                0.1.7   2024-08-26 [2] RSPM (R 4.4.0)
#>  zlibbioc                   1.51.2  2024-10-21 [2] Bioconductor 3.20 (R 4.4.1)
#> 
#>  [1] /tmp/Rtmpxm2X9R/Rinst1771614f6d30
#>  [2] /github/workspace/pkglib
#>  [3] /usr/local/lib/R/site-library
#>  [4] /usr/lib/R/site-library
#>  [5] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Bibliography

This vignette was generated using BiocStyle (Oleś, 2024) with knitr (Xie, 2024) and rmarkdown (Allaire, Xie, Dervieux et al., 2024) running behind the scenes.

Citations made with RefManageR (McLean, 2017).

[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.28. 2024. URL: https://github.com/rstudio/rmarkdown.

[2] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.

[3] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.35.0. 2024. URL: https://github.com/Bioconductor/BiocStyle.

[4] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2024. URL: https://www.R-project.org/.

[5] P. Sanchez-Sanchez, F. J. Santonja, and A. Benitez-Paez. “Assessment of human microbiota stability across longitudinal samples using iteratively growing-partitioned clustering”. In: Briefings in Bioinformatics 23.2 (Feb. 2022). bbac055. ISSN: -2577. DOI: 10.1093/bib/bbac055. URL: https://doi.org/10.1093/bib/bbac055.

[6] H. Wickham. “testthat: Get Started with Testing”. In: The R Journal 3 (2011), pp. 5–10. URL: https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.

[7] H. Wickham, W. Chang, R. Flight, et al. sessioninfo: R Session Information. R package version 1.2.2, https://r-lib.github.io/sessioninfo/. 2021. URL: https://github.com/r-lib/sessioninfo#readme.

[8] Y. Xie. knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.48. 2024. URL: https://yihui.org/knitr/.