| Title: | Microbiome Time Series Analysis |
|---|---|
| Description: | The `miaTime` package provides tools for microbiome time series analysis based on (Tree)SummarizedExperiment infrastructure. |
| Authors: | Leo Lahti [aut] (ORCID: <https://orcid.org/0000-0001-5537-637X>), Tuomas Borman [aut, cre] (ORCID: <https://orcid.org/0000-0002-8563-8884>), Yagmur Simsek [aut], Sudarshan Shetty [ctb], Chouaib Benchraka [ctb], Muluh Muluh [ctb], Ali Hajj [ctb] |
| Maintainer: | Tuomas Borman <[email protected]> |
| License: | Artistic-2.0 | file LICENSE |
| Version: | 1.3.0 |
| Built: | 2026-06-01 09:38:59 UTC |
| Source: | https://github.com/bioc/miaTime |
miaTime Package.miaTime implements time series methods in mia ecosystem.
Maintainer: Tuomas Borman [email protected] (ORCID)
Authors:
Leo Lahti [email protected] (ORCID)
Yagmur Simsek [email protected]
Other contributors:
Sudarshan Shetty [email protected] [contributor]
Chouaib Benchraka [contributor]
Muluh Muluh [contributor]
Ali Hajj [contributor]
TreeSummarizedExperiment class
Calculates short term changes in abundance of taxa using temporal abundance data.
addShortTermChange(x, ...) getShortTermChange(x, ...) ## S4 method for signature 'SummarizedExperiment' addShortTermChange(x, name = "short_term_change", ...) ## S4 method for signature 'SummarizedExperiment' getShortTermChange(x, time.col, assay.type = "counts", group = NULL, ...)addShortTermChange(x, ...) getShortTermChange(x, ...) ## S4 method for signature 'SummarizedExperiment' addShortTermChange(x, name = "short_term_change", ...) ## S4 method for signature 'SummarizedExperiment' getShortTermChange(x, time.col, assay.type = "counts", group = NULL, ...)
x |
A
|
... |
additional arguments.
|
name |
|
time.col |
|
assay.type |
|
group |
|
These functions can be utilized to calculate growth metrics for short term change. In specific, the functions calculate the metrics with the following equations:
getShortTermChange returns DataFrame object containing
the short term change in abundance over time for a microbe.
addShortTermChange, on the other hand, returns a
SummarizedExperiment
object with these results in its metadata.
Ji, B.W., et al. (2020) Macroecological dynamics of gut microbiota. Nat Microbiol 5, 768–775 . doi: https://doi.org/10.1038/s41564-020-0685-1
library(miaTime) # Load time series data data(minimalgut) tse <- minimalgut # Get relative abundances tse <- transformAssay(tse, method = "relabundance") # Calculate short term changes df <- getShortTermChange( tse, assay.type = "relabundance", time.col = "Time.hr", group = "StudyIdentifier") # Calculate the logarithm of the ratio described in Ji, B.W., et al. (2020) tse <- transformAssay( tse, assay.type = "relabundance", method = "log10", pseudocount = TRUE) df <- getShortTermChange( tse, assay.type = "log10", time.col = "Time.hr", group = "StudyIdentifier") # Select certain bacteria that have highest growth rate select <- df[["growth_rate"]] |> abs() |> order(decreasing = FALSE) select <- df[select, "FeatureID"] |> unique() |> head(3) df <- df[ which(df[["FeatureID"]] %in% select), ] # Plot results library(ggplot2) p <- ggplot(df, aes(x = Time.hr, y = growth_rate, colour = FeatureID)) + geom_smooth(level = 0.5) + facet_grid(. ~ StudyIdentifier, scales = "free") + scale_y_log10() plibrary(miaTime) # Load time series data data(minimalgut) tse <- minimalgut # Get relative abundances tse <- transformAssay(tse, method = "relabundance") # Calculate short term changes df <- getShortTermChange( tse, assay.type = "relabundance", time.col = "Time.hr", group = "StudyIdentifier") # Calculate the logarithm of the ratio described in Ji, B.W., et al. (2020) tse <- transformAssay( tse, assay.type = "relabundance", method = "log10", pseudocount = TRUE) df <- getShortTermChange( tse, assay.type = "log10", time.col = "Time.hr", group = "StudyIdentifier") # Select certain bacteria that have highest growth rate select <- df[["growth_rate"]] |> abs() |> order(decreasing = FALSE) select <- df[select, "FeatureID"] |> unique() |> head(3) df <- df[ which(df[["FeatureID"]] %in% select), ] # Plot results library(ggplot2) p <- ggplot(df, aes(x = Time.hr, y = growth_rate, colour = FeatureID)) + geom_smooth(level = 0.5) + facet_grid(. ~ StudyIdentifier, scales = "free") + scale_y_log10() p
Simulated dataset based on a Crohn's disease microbiome study. The dataset is right-censored and includes time-to-event data, representing either the occurrence of disease or censoring. It contains 150 individuals and 48 taxa. The dataset originates from the coda4microbiome package under the name "data_survival".
data(crohn_survival)data(crohn_survival)
The dataset is in the
TreeSummarizedExperiment
format.
Sample metadata includes the following information:
diagnosis: Indicates whether an individual has Crohn’s disease (CD) or is a control.
event: Binary variable indicating disease status (1 = Crohn’s Disease, 0 = Control).
event_time: The time of event occurrence or censoring (unitless time).
Taxa data do not include additional taxonomy information.
Loads the dataset in R.
Calle ML et al. (2023) coda4microbiome: compositional data analysis for microbiome cross-sectional and longitudinal studies. BMC Bioinformatics, 24(82). https://doi.org/10.1186/s12859-023-05205-3
Gevers D et al. (2018) The Treatment-Naive Microbiome in New-Onset Crohn’s Disease. Cell Host & Microbe, 15(4). #' https://doi.org/10.1016/j.chom.2014.02.005
Calculates sample dissimilarity between the given baseline and other time points, optionally within a group (subject, reaction chamber, or similar). The corresponding time difference is returned as well.
getBaselineDivergence(x, ...) addBaselineDivergence(x, ...) ## S4 method for signature 'SummarizedExperiment' getBaselineDivergence( x, time.col, assay.type = "counts", reference = NULL, group = NULL, method = "bray", ... ) ## S4 method for signature 'SummarizedExperiment' addBaselineDivergence( x, name = c("divergence", "time_diff", "ref_samples"), ... )getBaselineDivergence(x, ...) addBaselineDivergence(x, ...) ## S4 method for signature 'SummarizedExperiment' getBaselineDivergence( x, time.col, assay.type = "counts", reference = NULL, group = NULL, method = "bray", ... ) ## S4 method for signature 'SummarizedExperiment' addBaselineDivergence( x, name = c("divergence", "time_diff", "ref_samples"), ... )
x |
A
|
... |
Optional arguments passed into
|
time.col |
|
assay.type |
|
reference |
|
group |
|
method |
|
name |
|
The group argument allows calculating divergence per group. If given, the divergence is calculated per group. e.g. subject, chamber, group etc. Otherwise, this is done across all samples at once.
The baseline sample(s) always need to belong to the data object i.e. they can be merged into it before applying this function. The reason is that they need to have comparable sample data, at least some time point information for calculating time differences w.r.t. baseline sample.
The baseline time point is by default defined as the smallest time point (per group). Alternatively, the user can provide the baseline vector, or a list of baseline vectors per group (named list per group).
getBaselineDivergence returns DataFrame object
containing the sample dissimilarity and corresponding time difference between
samples. addBaselineDivergence, on the other hand, returns a
SummarizedExperiment
object with these results in its colData.
library(miaTime) data(hitchip1006) tse <- transformAssay(hitchip1006, method = "relabundance") # By default, reference samples are the samples from the first timepoint tse <- addBaselineDivergence( tse, group = "subject", time.col = "time", assay.type = "relabundance", method = "bray") # Add reference samples to colData, if you want to specify reference # samples manually colData(tse)[["reference"]] <- "Sample-875" tse <- addBaselineDivergence( tse, reference = "reference", group = "subject", time.col = "time", name = c("divergence_from_baseline", "time_from_baseline", "reference_samples"), assay.type = "relabundance", method = "bray")library(miaTime) data(hitchip1006) tse <- transformAssay(hitchip1006, method = "relabundance") # By default, reference samples are the samples from the first timepoint tse <- addBaselineDivergence( tse, group = "subject", time.col = "time", assay.type = "relabundance", method = "bray") # Add reference samples to colData, if you want to specify reference # samples manually colData(tse)[["reference"]] <- "Sample-875" tse <- addBaselineDivergence( tse, reference = "reference", group = "subject", time.col = "time", name = c("divergence_from_baseline", "time_from_baseline", "reference_samples"), assay.type = "relabundance", method = "bray")
This function calculates coefficient of bimodality for each taxa.
getBimodality(x, ...) addBimodality(x, ...) ## S4 method for signature 'SummarizedExperiment' addBimodality(x, name = "bimodality", ...) ## S4 method for signature 'SummarizedExperiment' getBimodality(x, assay.type = "counts", ...)getBimodality(x, ...) addBimodality(x, ...) ## S4 method for signature 'SummarizedExperiment' addBimodality(x, name = "bimodality", ...) ## S4 method for signature 'SummarizedExperiment' getBimodality(x, assay.type = "counts", ...)
x |
A
|
... |
additional arguments.
|
name |
|
assay.type |
|
This function calculates coefficient of bimodality for each taxa. If the dataset includes grouping, for instance, individual systems or patients, the coefficient is calculated for each group separately.
The coefficient of bimodality measures whether a taxon has bimodal abundance. For instance, certain taxon can be high-abundant in some timepoints while in others it might be rare. The coefficient can help to determine these taxa.
The coefficient of bimodality (b) is defined as follows:
where skewness is calculated as follows
and kurtosis as follows
The coefficient ranges from 0-1, where 1 means high bimodality. The coefficient was introduced in the paper Shade A et al. 2014, where they used bimodality and abundance to determine conditionally rare taxa (CRT).
DataFrame or x with results added to its rowData.
Shade A, et al. (2014) Conditionally Rare Taxa Disproportionately Contribute to Temporal Changes in Microbial Diversity. doi: 10.1128/mbio.01371-14
mia::getConditionallyLowAbundant()
library(miaTime) data(SilvermanAGutData) tse <- SilvermanAGutData # In this example, we are only interested on vessel 1. tse <- tse[, tse[["Vessel"]] == 1] tse <- transformAssay(tse, method = "relabundance") # Calculate bimodality b <- getBimodality(tse) # Determine taxa with high bimodality bimodal_taxa <- names(b)[ which(b[[1]] > 0.95) ] # Determine taxa with abundance > 0.5% abundant <- getAbundant( tse, assay.type = "relabundance", abundant.th = 0.5/100) # The detected CRT crt <- intersect(bimodal_taxa, abundant) head(crt)library(miaTime) data(SilvermanAGutData) tse <- SilvermanAGutData # In this example, we are only interested on vessel 1. tse <- tse[, tse[["Vessel"]] == 1] tse <- transformAssay(tse, method = "relabundance") # Calculate bimodality b <- getBimodality(tse) # Determine taxa with high bimodality bimodal_taxa <- names(b)[ which(b[[1]] > 0.95) ] # Determine taxa with abundance > 0.5% abundant <- getAbundant( tse, assay.type = "relabundance", abundant.th = 0.5/100) # The detected CRT crt <- intersect(bimodal_taxa, abundant) head(crt)
Quantify intermediate stability with respect to a given reference point.
getStability(x, ...) addStability(x, ...) ## S4 method for signature 'SummarizedExperiment' addStability(x, time.col, name = "stability", ...) ## S4 method for signature 'SummarizedExperiment' getStability( x, time.col, assay.type = "counts", reference = NULL, group = NULL, ... )getStability(x, ...) addStability(x, ...) ## S4 method for signature 'SummarizedExperiment' addStability(x, time.col, name = "stability", ...) ## S4 method for signature 'SummarizedExperiment' getStability( x, time.col, assay.type = "counts", reference = NULL, group = NULL, ... )
x |
A
|
... |
additional arguments.
|
time.col |
|
name |
|
assay.type |
|
reference |
|
group |
|
These methods estimate intermediate stability described in Lahti et al. 2014. The method is heuristic and makes many simplifying assumptions. However, the stability estimation can be useful for exploration, and can provide a baseline for more advanced measures.
The stability is calculated by first defining reference point, ,
for each feature. User can define the reference points with reference.
If reference points are not defined, they are calculated by taking median
where denotes a single feature and abundance.
Then difference between consecutive time points, ,
and difference between the previous time point and reference,
, are calculated:
where denotes time point.
Optionally, time difference is calculated
The stability coefficient is calculated with correlation
or with linear model as follows
getStability returns DataFrame while addStability
returns results added to its rowData(x).
Lahti L, et al. (2014) Tipping elements in the human intestinal ecosystem. Nat Commun. doi: 10.1038/ncomms5344
library(miaTime) # Load time series data data(minimalgut) tse <- minimalgut # Apply clr transformation tse <- transformAssay(tse, method = "rclr") # Calculate stability single system tse_sub <- tse[, tse[["StudyIdentifier"]] == "Bioreactor A"] tse_sub <- addStability(tse_sub, assay.type = "rclr", time.col = "Time.hr") rowData(tse_sub) # Add custom reference values rowData(tse)[["ref_col"]] <- 1 # Calculate stability for each system simultaneously by taking time # difference into account tse <- addStability( tse, assay.type = "rclr", time.col = "Time.hr", group = "StudyIdentifier", ref_col = "ref_col", mode = "lm") rowData(tse)library(miaTime) # Load time series data data(minimalgut) tse <- minimalgut # Apply clr transformation tse <- transformAssay(tse, method = "rclr") # Calculate stability single system tse_sub <- tse[, tse[["StudyIdentifier"]] == "Bioreactor A"] tse_sub <- addStability(tse_sub, assay.type = "rclr", time.col = "Time.hr") rowData(tse_sub) # Add custom reference values rowData(tse)[["ref_col"]] <- 1 # Calculate stability for each system simultaneously by taking time # difference into account tse <- addStability( tse, assay.type = "rclr", time.col = "Time.hr", group = "StudyIdentifier", ref_col = "ref_col", mode = "lm") rowData(tse)
Calculates sample dissimilarity between consecutive time points along with time difference.
getStepwiseDivergence(x, ...) addStepwiseDivergence(x, ...) ## S4 method for signature 'ANY' getStepwiseDivergence( x, time.col, assay.type = "counts", time.interval = 1L, group = NULL, method = "bray", ... ) ## S4 method for signature 'SummarizedExperiment' addStepwiseDivergence( x, name = c("divergence", "time_diff", "ref_samples"), ... )getStepwiseDivergence(x, ...) addStepwiseDivergence(x, ...) ## S4 method for signature 'ANY' getStepwiseDivergence( x, time.col, assay.type = "counts", time.interval = 1L, group = NULL, method = "bray", ... ) ## S4 method for signature 'SummarizedExperiment' addStepwiseDivergence( x, name = c("divergence", "time_diff", "ref_samples"), ... )
x |
A
|
... |
Optional arguments passed into
|
time.col |
|
assay.type |
|
time.interval |
|
group |
|
method |
|
name |
|
These functions calculate time-wise divergence, meaning each sample is
compared to the previous i-th sample, where i is the specified time
interval (see time.interval). By default, the function calculates
divergence by comparing all samples with each other. However, it is often
more meaningful to calculate divergence within a specific patient or group
(see the group parameter).
getStepwiseDivergence returns DataFrame object
containing the sample dissimilarity and corresponding time difference between
samples. addStepwiseDivergence, on the other hand, returns a
SummarizedExperiment
object with these results in its colData.
library(miaTime) data(hitchip1006) tse <- transformAssay(hitchip1006, method = "relabundance") # Calculate divergence tse <- addStepwiseDivergence( tse, group = "subject", time.interval = 1, time.col = "time", assay.type = "relabundance" )library(miaTime) data(hitchip1006) tse <- transformAssay(hitchip1006, method = "relabundance") # Calculate divergence tse <- addStepwiseDivergence( tse, group = "subject", time.interval = 1, time.col = "time", assay.type = "relabundance" )
This data set contains genus-level microbiota profiling with HITChip for 1006 western adults with no reported health complications, reported in Lahti et al. (2014).
data(hitchip1006)data(hitchip1006)
The data set in
TreeSummarizedExperiment
format.
The data is also available for download from the Data Dryad
http://doi.org/10.5061/dryad.pk75d and was initially released
as a phyloseq object under the name atlas1006 in microbiome R package.
The miaTime package provides this as
TreeSummarizedExperiment
object.
Some of the subjects also have short time series.
Loads the data set in R.
Leo Lahti
Lahti et al. Tipping elements of the human intestinal ecosystem. Nature Communications 5:4344, 2014. https://doi.org/10.1038/ncomms5344.
The Kumaraswamy2024 includes microbiota and metabolite profiling data from 78 Indian individuals (40 males, 38 females).
The Indian subjects were grouped into four diet groups (~20 subjects per group), and fecal samples were collected across three seasonal time points.
The microbiota profiling was performed using HITChip microarray analysis (in duplicate), qPCR (in triplicate with eight-point standard curves), and LC-HRMS and HPLC metabolite profiling with internal standards.
Column metadata includes diet group assignment, sampling season, sex, BMI, age, and questionnaire-based lifestyle metadata.
Quality control metrics include Pearson correlation (>0.98) for HITChip, qPCR assay efficiency (>0.99), and technical replicates for HPLC and qPCR.
Data sources:
Microbiota HITChip microarray data
qPCR absolute abundance data
Chemical profiling data (HPLC, LC-HRMS)
Sample metadata (diet, lifestyle)
Processed and raw data are available via:
Zenodo (DOI: https://doi.org/10.5281/zenodo.14424024)
NCBI-SRA (fermented foods 16S rRNA sequencing, accession: PRJNA1191989)
data(Kumaraswamy2024)data(Kumaraswamy2024)
The data set in
TreeSummarizedExperiment
format.
Loads the data set in R.
Jeyaram, K., Lahti, L., Tims, S. et al
Jeyaram, K., Lahti, L., Tims, S. et al. Fermented foods affect the seasonal stability of gut bacteria in an Indian rural population. Nat Commun 16, 771 https://doi.org/10.1038/s41467-025-56014-6
This time-series data set contains absolute temporal abundances for 16 human gut species that were assembled in an in vitro gut system. These were subjected to a variety of disturbances over a period of 20 days. The sample data includes measurements for Acetate, Butyrate, Propionate, and optical density.
data(minimalgut)data(minimalgut)
The data set in
TreeSummarizedExperiment
format.
The data is available also on
https://github.com/microsud/syncomR
The data contains 183 samples from 3 in vitro gut system, 61 per bioreactor
collected three times daily.
The current implementation in miaTime provides this as
TreeSummarizedExperiment
object.
Loads the data set in R.
Sudarshan A. Shetty
Shetty, S.A., Kostopoulos, I., Geerlings, S.Y. et al. Dynamic metabolic interactions and trophic roles of human gut microbes identified using a minimal microbiome exhibiting ecological properties. ISME, 16, 2144–2159 (2022). https://doi.org/10.1038/s41396-022-01255-2.
The SilvermanAGutData dataset contains 16S rRNA gene based high-throughput profiling of 4 in vitro artificial gut models. The sampling was done hourly and daily to capture sub-daily dynamics of microbial community originating from human feces. The data consists of 413 taxa from 639 samples. The data set can be used to investigate longitudinal dynamics of microbial community in a controlled environment.
Column metadata includes the days of sampling, vessel identifier, sampling frequency pre-post challenge with Bacteroides ovatus.
The row metadata of the microbiome data contains taxonomic information on the Kingdom, Phylum, Class, Order, Family and Genus and Species level.
The row tree consists of a phylogenetic tree build using sequence information of 413 taxa.
As reference sequences the ASV are provided.
All data are downloaded from ExperimentHub and cached for local re-use
data(SilvermanAGutData)data(SilvermanAGutData)
The data set in
TreeSummarizedExperiment
format.
Loads the data set in R.
Sudarshan A. Shetty and Felix G.M. Ernst
Silveman J.D et al. (2018): Dynamic linear models guide design and analysis of microbiota studies within artificial human guts. Microbiome 6:202 https://doi.org/10.1186/s40168-018-0584-3
This time-series data set contains absolute temporal variation for most major gut genera, as well as diversity indicators for both relative and quantitative abundance profiles of healthy women living in an industrialized country.
data(temporalMicrobiome20)data(temporalMicrobiome20)
The data set in
TreeSummarizedExperiment
format.
The data is available also on
https://www.nature.com/articles/s41467-021-27098-7#Sec43
The data contains 713 fecal samples from 20 Belgian women collected
over six weeks.
The current implementation in miaTime provides this as
TreeSummarizedExperiment
object.
Loads the data set in R.
Doris Vandeputte
Vandeputte, D., De Commer, L., Tito, R.Y. et al. Temporal variability in quantitative human gut microbiome profiles and implications for clinical research. Nat Commun 12, 6740 (2021). https://doi.org/10.1038/s41467-021-27098-7