mist
(Methylation Inference for Single-cell along
Trajectory) is an R package for differential methylation (DM) analysis
of single-cell DNA methylation (scDNAm) data. The package employs a
Bayesian approach to model methylation changes along pseudotime and
estimates developmental-stage-specific biological variations. It
supports both single-group and two-group analyses, enabling users to
identify genomic features exhibiting temporal changes in methylation
levels or different methylation patterns between groups.
This vignette demonstrates how to use mist
for: 1.
Single-group analysis. 2. Two-group analysis.
To install the latest version of mist
, run the following
commands:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
# Install mist from GitHub
BiocManager::install("https://github.com/dxd429/mist")
From Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("mist")
To view the package vignette in HTML format, run the following lines in R:
In this section, we will estimate parameters and perform differential methylation analysis using single-group data.
Here we load the example data from GSE121708.
estiParam
# Estimate parameters for single-group
Dat_sce <- estiParam(
Dat_sce = Dat_sce,
Dat_name = "Methy_level_group1",
ptime_name = "pseudotime"
)
# Check the output
head(rowData(Dat_sce)$mist_pars)
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4
## ENSMUSG00000000001 1.258305 -0.65671361 0.69294130 0.25567294 -0.03428851
## ENSMUSG00000000003 1.637308 1.52021623 2.99741289 -1.80182953 -3.09225058
## ENSMUSG00000000028 1.297487 -0.02606789 0.12666814 0.02381224 -0.02203108
## ENSMUSG00000000037 1.022418 -4.17092310 11.77587596 -5.42119974 -2.14640761
## ENSMUSG00000000049 1.032193 -0.02407027 0.08464866 0.04582820 0.04732236
## Sigma2_1 Sigma2_2 Sigma2_3 Sigma2_4
## ENSMUSG00000000001 5.584926 13.186113 3.420395 1.812427
## ENSMUSG00000000003 26.371100 5.821442 5.249663 8.953477
## ENSMUSG00000000028 7.756109 6.952036 3.331387 2.139187
## ENSMUSG00000000037 8.442394 14.361304 7.080631 2.661218
## ENSMUSG00000000049 5.798817 8.313150 3.476161 1.348409
dmSingle
# Perform differential methylation analysis for the single-group
Dat_sce <- dmSingle(Dat_sce)
# View the top genomic features with drastic methylation changes
head(rowData(Dat_sce)$mist_int)
## ENSMUSG00000000037 ENSMUSG00000000003 ENSMUSG00000000001 ENSMUSG00000000049
## 0.058965814 0.032287501 0.014025146 0.006945089
## ENSMUSG00000000028
## 0.004486784
plotGene
# Produce scatterplot with fitted curve of a specific gene
library(ggplot2)
plotGene(Dat_sce = Dat_sce,
Dat_name = "Methy_level_group1",
ptime_name = "pseudotime",
gene_name = "ENSMUSG00000000037")
In this section, we will estimate parameters and perform DM analysis using data from two phenotypic groups.
estiParam
# Estimate parameters for both groups
Dat_sce_g1 <- estiParam(
Dat_sce = Dat_sce_g1,
Dat_name = "Methy_level_group1",
ptime_name = "pseudotime"
)
Dat_sce_g2 <- estiParam(
Dat_sce = Dat_sce_g2,
Dat_name = "Methy_level_group2",
ptime_name = "pseudotime"
)
# Check the output
head(rowData(Dat_sce_g1)$mist_pars, n = 3)
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4
## ENSMUSG00000000001 1.253914 -0.41866795 0.4247328 0.17881565 0.032925690
## ENSMUSG00000000003 1.686343 1.65069566 2.8225110 -1.69819740 -3.205147042
## ENSMUSG00000000028 1.281487 -0.06818676 0.1733813 0.04922493 0.002760749
## Sigma2_1 Sigma2_2 Sigma2_3 Sigma2_4
## ENSMUSG00000000001 5.433284 13.723355 3.847083 1.665877
## ENSMUSG00000000003 26.654086 2.879913 5.332224 9.376792
## ENSMUSG00000000028 7.181474 8.596087 2.801461 2.423331
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4
## ENSMUSG00000000001 1.8850212 -1.061052 6.410020 -5.077464 -0.4616122
## ENSMUSG00000000003 -0.8114396 -1.848182 6.256698 -4.759478 0.3981777
## ENSMUSG00000000028 2.3387961 -5.734657 26.782505 -35.486932 14.5193955
## Sigma2_1 Sigma2_2 Sigma2_3 Sigma2_4
## ENSMUSG00000000001 5.425377 6.590355 3.158040 1.203816
## ENSMUSG00000000003 7.343238 11.066584 5.082078 2.939701
## ENSMUSG00000000028 10.579859 5.633411 3.745928 3.053246
dmTwoGroups
# Perform DM analysis to compare the two groups
dm_results <- dmTwoGroups(
Dat_sce_g1 = Dat_sce_g1,
Dat_sce_g2 = Dat_sce_g2
)
# View the top genomic features with different temporal patterns between groups
head(dm_results)
## ENSMUSG00000000037 ENSMUSG00000000003 ENSMUSG00000000001 ENSMUSG00000000028
## 0.049850022 0.030855435 0.022579178 0.016533701
## ENSMUSG00000000049
## 0.009363765
mist
provides a comprehensive suite of tools for
analyzing scDNAm data along pseudotime, whether you are working with a
single group or comparing two phenotypic groups. With the combination of
Bayesian modeling and differential methylation analysis,
mist
is a powerful tool for identifying significant genomic
features in scDNAm data.
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ggplot2_3.5.1 SingleCellExperiment_1.29.1
## [3] SummarizedExperiment_1.37.0 Biobase_2.67.0
## [5] GenomicRanges_1.59.1 GenomeInfoDb_1.43.4
## [7] IRanges_2.41.3 S4Vectors_0.45.4
## [9] BiocGenerics_0.53.6 generics_0.1.3
## [11] MatrixGenerics_1.19.1 matrixStats_1.5.0
## [13] mist_0.99.18 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 farver_2.1.2 dplyr_1.1.4
## [4] Biostrings_2.75.4 bitops_1.0-9 fastmap_1.2.0
## [7] RCurl_1.98-1.16 GenomicAlignments_1.43.0 XML_3.99-0.18
## [10] digest_0.6.37 lifecycle_1.0.4 survival_3.8-3
## [13] magrittr_2.0.3 compiler_4.4.2 rlang_1.1.5
## [16] sass_0.4.9 tools_4.4.2 yaml_2.3.10
## [19] rtracklayer_1.67.1 knitr_1.49 labeling_0.4.3
## [22] S4Arrays_1.7.3 curl_6.2.1 DelayedArray_0.33.6
## [25] abind_1.4-8 BiocParallel_1.41.2 withr_3.0.2
## [28] sys_3.4.3 grid_4.4.2 colorspace_2.1-1
## [31] scales_1.3.0 MASS_7.3-64 mcmc_0.9-8
## [34] cli_3.6.4 mvtnorm_1.3-3 rmarkdown_2.29
## [37] crayon_1.5.3 httr_1.4.7 rjson_0.2.23
## [40] cachem_1.1.0 splines_4.4.2 parallel_4.4.2
## [43] BiocManager_1.30.25 XVector_0.47.2 restfulr_0.0.15
## [46] vctrs_0.6.5 Matrix_1.7-2 jsonlite_1.9.0
## [49] SparseM_1.84-2 carData_3.0-5 car_3.1-3
## [52] MCMCpack_1.7-1 Formula_1.2-5 maketools_1.3.2
## [55] jquerylib_0.1.4 glue_1.8.0 codetools_0.2-20
## [58] gtable_0.3.6 BiocIO_1.17.1 UCSC.utils_1.3.1
## [61] munsell_0.5.1 tibble_3.2.1 pillar_1.10.1
## [64] htmltools_0.5.8.1 quantreg_6.00 GenomeInfoDbData_1.2.13
## [67] R6_2.6.1 evaluate_1.0.3 lattice_0.22-6
## [70] Rsamtools_2.23.1 bslib_0.9.0 MatrixModels_0.5-3
## [73] coda_0.19-4.1 SparseArray_1.7.6 xfun_0.51
## [76] buildtools_1.0.0 pkgconfig_2.0.3
estiParam
dmSingle
plotGene
estiParam
dmTwoGroups