Package 'mist'

Title: Differential Methylation Analysis for scDNAm Data
Description: mist (Methylation Inference for Single-cell along Trajectory) is a hierarchical Bayesian framework for modeling DNA methylation trajectories and performing differential methylation (DM) analysis in single-cell DNA methylation (scDNAm) data. It estimates developmental-stage-specific variations, identifies genomic features with drastic changes along pseudotime, and, for two phenotypic groups, detects features with distinct temporal methylation patterns. mist uses Gibbs sampling to estimate parameters for temporal changes and stage-specific variations.
Authors: Daoyu Duan [aut, cre]
Maintainer: Daoyu Duan <[email protected]>
License: MIT + file LICENSE
Version: 0.99.18
Built: 2025-02-22 03:26:37 UTC
Source: https://github.com/bioc/mist

Help Index


Differential methylation evaluation over time for scDNA-seq data under the 1-group scenario.

Description

This function performs DM analysis to identify genomic features showing drastic changes along pseudotime, by modeling the methylation changes along pseudotime and estimating the developmental-stage-specific biological variations.

Usage

dmSingle(Dat_sce, BPPARAM = MulticoreParam())

Arguments

Dat_sce

The updated sce object with A numeric matrix of estimated parameters for all genomic features in the rowData, including:

  • β0\beta_0 to β4\beta_4: Estimated coefficients for the polynomial of degree 4.

  • σ12\sigma^2_1 to σ42\sigma^2_4: Estimated variances for each stage along the pseudotime.

BPPARAM

A BiocParallelParam object specifying the parallel backend for computations, as used in bplapply(). Defaults to MulticoreParam() for parallel processing.

Value

The updated sce object with A named numeric vector where each value corresponds to a genomic feature (e.g., a gene) in the rowData. The values represent the minimum area between the fitted curve of scDNA methylation levels along pseudotime and a constant horizontal line for each feature. This metric measures the magnitude of methylation level changes along pseudotime. The values are sorted in descending order, with larger values indicating more drastic changes.

Examples

library(mist)
data <- readRDS(system.file("extdata", "group1_sampleData_sce.rds", package = "mist"))
Dat_sce_new <- estiParam(
    Dat_sce = data,
    Dat_name = "Methy_level_group1",
    ptime_name = "pseudotime"
)
dm_sce <- dmSingle(Dat_sce_new)

Differential methylation evaluation over time for scDNA-seq data under the 2-group scenario.

Description

This function performs differential methylation (DM) analysis to identify genomic features showing significant changes between two groups along pseudotime. The function models methylation changes and compares the fitted curves for each group, calculating the integral of the differences between the curves.

Usage

dmTwoGroups(Dat_sce_g1, Dat_sce_g2, BPPARAM = MulticoreParam())

Arguments

Dat_sce_g1

A SingleCellExperiment object for group 1, containing:

  • mist_pars: A numeric matrix in rowData with estimated parameters for all genomic features (generated by estiParamTwo).

Dat_sce_g2

A SingleCellExperiment object for group 2, containing:

  • mist_pars: A numeric matrix in rowData with estimated parameters for all genomic features (generated by estiParamTwo).

BPPARAM

A BiocParallelParam object specifying the parallel backend for computations, as used in bplapply(). Defaults to MulticoreParam() for parallel processing.

Value

A named numeric vector where each value corresponds to a genomic feature (e.g., a gene). The values represent the integral of the differences between the fitted curves of scDNA methylation levels for the two groups. The vector is sorted in descending order, with larger values indicating more drastic differences between the groups.

Examples

library(SingleCellExperiment)
data_g1 <- readRDS(system.file("extdata", "group1_sampleData_sce.rds", package = "mist"))
data_g2 <- readRDS(system.file("extdata", "group2_sampleData_sce.rds", package = "mist"))
Dat_sce_g1 <- estiParam(
    Dat_sce = data_g1,
    Dat_name = "Methy_level_group1",
    ptime_name = "pseudotime"
)

Dat_sce_g2 <- estiParam(
    Dat_sce = data_g2,
    Dat_name = "Methy_level_group2",
    ptime_name = "pseudotime"
) 
# Run differential methylation analysis
dm_results <- dmTwoGroups(
    Dat_sce_g1 = Dat_sce_g1,
    Dat_sce_g2 = Dat_sce_g2
)

Parameter Estimation With mist

Description

This function performs the Gibbs sampling procedure based on hierarchical Bayesian modeling to produce the parameters required for differential methylation analysis.

Usage

estiParam(
  Dat_sce,
  Dat_name,
  ptime_name,
  BPPARAM = MulticoreParam(),
  verbose = TRUE
)

Arguments

Dat_sce

A SingleCellExperiment object containing the single-cell DNA methylation level. Methylation levels should be stored as an assay, with genomic feature (gene) names in rownames and cells in colnames.

Dat_name

A character string specifying the name of the assay to extract the methylation level data.

ptime_name

A character string specifying the name of the column in colData containing the pseudotime vector.

BPPARAM

A BiocParallelParam object specifying the parallel backend for computations, as used in bplapply(). Defaults to MulticoreParam() for parallel processing.

verbose

A logical value indicating whether to print progress messages to the console. Defaults to TRUE. Set to FALSE to suppress messages.

Value

The updated sce object with A numeric matrix of estimated parameters for all genomic features in the rowData, including:

  • β0\beta_0 to β4\beta_4: Estimated coefficients for the polynomial of degree 4.

  • σ12\sigma^2_1 to σ42\sigma^2_4: Estimated variances for each stage along the pseudotime.

Examples

library(SingleCellExperiment)
data <- readRDS(system.file("extdata", "group1_sampleData_sce.rds", package = "mist"))
Dat_sce_new <- estiParam(
    Dat_sce = data,
    Dat_name = "Methy_level_group1",
    ptime_name = "pseudotime"
)

Plot Methylation Levels vs. Pseudotime for a Specific Gene

Description

Generates a scatter plot of methylation levels vs. pseudotime for a specific gene, overlaid with the fitted curve based on the estimated coefficients.

Usage

plotGene(Dat_sce, Dat_name, ptime_name, gene_name)

Arguments

Dat_sce

The updated sce object with A numeric matrix of estimated parameters for all genomic features in the rowData, including:

  • β0\beta_0 to β4\beta_4: Estimated coefficients for the polynomial of degree 4.

  • σ12\sigma^2_1 to σ42\sigma^2_4: Estimated variances for each stage along the pseudotime.

Dat_name

A character string specifying the name of the assay to extract the methylation data.

ptime_name

A character string specifying the name of the colData to extract the pseudotime vector.

gene_name

A character string specifying the gene name to plot.

Value

A ggplot2 scatter plot with an overlayed fitted curve.

Examples

library(SingleCellExperiment)
library(ggplot2)
data <- readRDS(system.file("extdata", "group1_sampleData_sce.rds", package = "mist"))
Dat_sce_new <- estiParam(
    Dat_sce = data,
    Dat_name = "Methy_level_group1",
    ptime_name = "pseudotime"
)
plotGene(Dat_sce = Dat_sce_new,
Dat_name = "Methy_level_group1",
ptime_name = "pseudotime",
gene_name = "ENSMUSG00000000037")

Small Sample Single-Cell Data for Testing and Illustration

Description

A small single-cell dataset for testing and illustration purposes, derived from the GEO dataset GSE121708. The data contains DNA methylation information from mouse gastrulation experiments at the single-cell level.

Format

A SingleCellExperiment object containing:

  • Metadata about cells and genes.

  • DNA methylation levels across a subset of genes and cells.

Details

  • Data Processing Steps:

    1. Annotation: Data was annotated using the mm10 reference genome.

    2. Data Cleaning: Genes expressed in less than 10% of cells were removed, resulting in a scDNAm dataset containing 986 cells and 18,220 genes.

    3. Pseudotime Inference: Pseudotime was inferred using the default settings in Monocle3.

    4. Subsampling:

      • For illustration purposes (e.g., vignette generation), a random sample of 5 genes was selected to create group1_sampleData_sce.rds.

Source

The data originates from the GEO dataset GSE121708, as described in:

  1. Argelaguet R, Clark SJ, Mohammed H, Stapel LC, et al. Multi-omics profiling of mouse gastrulation at single-cell resolution. Nature 2019 Dec;576(7787):487-491. PMID: 31827285.

  2. Kapourani CA, Argelaguet R, Sanguinetti G, Vallejos CA. scMET: Bayesian modeling of DNA methylation heterogeneity at single-cell resolution. Genome Biol 2021 Apr 20;22(1):114. PMID: 33879195.

Examples

file_path <- system.file("extdata", "group1_sampleData_sce.rds", package = "mist")
Dat_sce <- readRDS(file_path)

Small Sample Single-Cell Data for Testing and Illustration

Description

A small single-cell dataset for testing and illustration purposes, derived from the GEO dataset GSE121708. The data contains DNA methylation information from mouse gastrulation experiments at the single-cell level.

Format

A SingleCellExperiment object containing:

  • Metadata about cells and genes.

  • DNA methylation levels across a subset of genes and cells.

Details

  • Data Processing Steps:

    1. Annotation: Data was annotated using the mm10 reference genome.

    2. Data Cleaning: Genes expressed in less than 10% of cells were removed, resulting in a scDNAm dataset containing 986 cells and 18,220 genes.

    3. Pseudotime Inference: Pseudotime was inferred using the default settings in Monocle3.

    4. Subsampling:

      • For illustration purposes (e.g., vignette generation), a random sample of 5 genes was selected to create group2_sampleData_sce.rds.

Source

The data originates from the GEO dataset GSE121708, as described in:

  1. Argelaguet R, Clark SJ, Mohammed H, Stapel LC, et al. Multi-omics profiling of mouse gastrulation at single-cell resolution. Nature 2019 Dec;576(7787):487-491. PMID: 31827285.

  2. Kapourani CA, Argelaguet R, Sanguinetti G, Vallejos CA. scMET: Bayesian modeling of DNA methylation heterogeneity at single-cell resolution. Genome Biol 2021 Apr 20;22(1):114. PMID: 33879195.

Examples

file_path <- system.file("extdata", "group2_sampleData_sce.rds", package = "mist")
Dat_sce <- readRDS(file_path)