Title: | Muscle Epigenetic Age Test |
---|---|
Description: | This package estimates epigenetic age in skeletal muscle, using DNA methylation data generated with the Illumina Infinium technology (HM27, HM450 and HMEPIC). |
Authors: | Sarah Voisin [aut, cre] , Steve Horvath [ctb] |
Maintainer: | Sarah Voisin <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.19.0 |
Built: | 2024-10-30 08:24:19 UTC |
Source: | https://github.com/bioc/MEAT |
BMIQcalibration uses an adapted version of the BMIQ algorithm to calibrate the beta-matrix stored in the input SummarizedExperiment object SE to the gold standard dataset used in the muscle clock (GSE50498).
BMIQcalibration(SE, version = "MEAT2.0")
SE |
A |
version |
A character specifying which version of the epigenetic clock
you would like to use. By default, |
BMIQcalibration was created by Steve Horvath, largely based on the BMIQ function from Teschendorff (2013), to adjust for the type-2 bias in Illumina HM450 and HMEPIC arrays. BMIQ stands for beta mixture quantile normalization. Horvath fixed minor errors in the v_1.2 version of the BMIQ algorithm and changed the optimization algorithm to make the code more robust. He used method = "Nelder-Mead" in optim since the other optimization method sometimes gets stuck. Toward this end, the function blc was replaced by blc2.
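To illustrate what choosing method = "Nelder-Mead" in optim looks like in practice, here is a minimal sketch that fits a single beta distribution to simulated values by maximum likelihood. It is not the blc2 code and it is a much simpler problem than the beta mixture fitted inside BMIQ; the simulated data, the function name negloglik and the starting values are all assumptions made for the example.

```r
# Illustrative only: maximum-likelihood fit of one beta distribution
# with optim(method = "Nelder-Mead"). This is NOT the blc2 code.
set.seed(1)
x <- rbeta(500, shape1 = 2, shape2 = 8)   # simulated beta-values

# Negative log-likelihood. Parameters are optimised on the log scale
# so that both shape parameters stay positive.
negloglik <- function(par, x) {
  -sum(dbeta(x, shape1 = exp(par[1]), shape2 = exp(par[2]), log = TRUE))
}

fit <- optim(par = c(0, 0), fn = negloglik, x = x, method = "Nelder-Mead")
shape_hat <- exp(fit$par)   # estimated shape1 and shape2
fit$convergence             # 0 means optim reports successful convergence
```

Nelder-Mead is derivative-free, so it only needs the objective function itself, which is one reason it can be a robust default for small problems like this one.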
SE needs to be a SummarizedExperiment object containing a matrix of beta-values that has been cleaned using clean_beta. Each sample in SE is iteratively calibrated to the gold standard values, so the time it takes to run BMIQcalibration is directly proportional to the number of samples in SE. This step is essential to estimate epigenetic age with accuracy.
A version of the input SE calibrated to the gold standard dataset GSE50498.
clean_beta to get the DNA methylation matrix ready for calibration, BMIQ for the original BMIQ algorithm, https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-10-r115 for the original paper describing Horvath's adapted BMIQ algorithm, and SummarizedExperiment-class for more details on how to create and manipulate SummarizedExperiment objects.
# Load matrix of beta-values of two individuals from dataset GSE121961
data("GSE121961", envir = environment())

# Load phenotypes of the two individuals from dataset GSE121961
data("GSE121961_pheno", envir = environment())

# Create a SummarizedExperiment object to coordinate phenotypes and
# methylation into one object.
library(SummarizedExperiment)
GSE121961_SE <- SummarizedExperiment(assays = list(beta = GSE121961),
                                     colData = GSE121961_pheno)

# Run clean_beta() to clean the beta-matrix
GSE121961_SE_clean <- clean_beta(SE = GSE121961_SE, version = "MEAT2.0")

# Run BMIQcalibration() to calibrate the clean beta-matrix
GSE121961_SE_calibrated <- BMIQcalibration(SE = GSE121961_SE_clean,
                                           version = "MEAT2.0")
clean_beta reduces the beta-matrix stored in the input SummarizedExperiment object SE to the right CpGs, imputes missing values if any, and replaces 0 and 1 with min and max values.
clean_beta(SE = NULL, version = "MEAT2.0")
SE |
A |
version |
A character specifying which version of the epigenetic clock
you would like to use. By default, |
clean_beta will transform the beta-matrix stored in SE by:
reducing it to the CpGs used to calibrate DNA methylation profiles to the gold standard. By default, clean_beta will reduce your beta-matrix to the 18,747 CpGs used in the updated version of MEAT (MEAT 2.0). If you would like to use the original version of MEAT, clean_beta will reduce your data to the 19,401 CpGs that are in common between the 12 datasets from the original publication,
checking whether it contains missing values and, if any, imputing them with impute.knn,
checking whether it contains 0 and 1 values and, if any, changing them to the minimum non-0 and maximum non-1 values in the beta-matrix.
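The three steps above can be sketched on a toy matrix in base R. This is only an illustration of the logic, not the code of clean_beta: the CpG names, the values and the clock_cpgs vector are made up, and missing values are filled with row means as a simple stand-in for impute.knn.

```r
# Toy beta-matrix: CpGs in rows, samples in columns (values are made up).
beta <- matrix(
  c(0.00, 0.20,
    0.50,   NA,
    1.00, 0.80,
    0.30, 0.40),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("cg01", "cg02", "cg03", "cg04"), c("s1", "s2"))
)
clock_cpgs <- c("cg01", "cg02", "cg03")   # hypothetical list of CpGs to keep

# 1. Keep only the CpGs of interest.
beta <- beta[rownames(beta) %in% clock_cpgs, , drop = FALSE]

# 2. Fill missing values. Row means are a simple stand-in used here;
#    clean_beta relies on impute.knn instead.
na_idx <- which(is.na(beta), arr.ind = TRUE)
beta[na_idx] <- rowMeans(beta, na.rm = TRUE)[na_idx[, "row"]]

# 3. Replace exact 0 and 1 with the smallest non-0 and largest non-1 values.
min_non0 <- min(beta[beta > 0])
max_non1 <- max(beta[beta < 1])
beta[beta == 0] <- min_non0
beta[beta == 1] <- max_non1
```

After these steps every value lies strictly between 0 and 1, which matters for methods that model beta-values with beta distributions.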
A clean version of the input SE reduced to the right CpGs, with missing values imputed, and without 0 or 1 values.
impute.knn for imputation of missing values, and SummarizedExperiment-class for more details on how to create and manipulate SummarizedExperiment objects.
# Load matrix of beta-values of two individuals from dataset GSE121961
data("GSE121961", envir = environment())

# Load phenotypes of the two individuals from dataset GSE121961
data("GSE121961_pheno", envir = environment())

# Create a SummarizedExperiment object to coordinate phenotypes and
# methylation into one object.
library(SummarizedExperiment)
GSE121961_SE <- SummarizedExperiment(assays = list(beta = GSE121961),
                                     colData = GSE121961_pheno)

# Run clean_beta() to clean the beta-matrix
GSE121961_SE_clean <- clean_beta(SE = GSE121961_SE, version = "MEAT2.0")
Detailed information on the 200 CpGs automatically selected by the elastic net model.
CpGs_in_MEAT
A data frame with 201 rows and 6 variables:
CpG name
Weight given by the elastic net model to the CpG
Chromosome where the CpG is located
Position in bp where the CpG is located (human genome build version hg38)
Gene annotated to the CpG. Each CpG was annotated to one or more genes using the annotation file from Zhou et al. to which we added annotation to long-range interaction promoters using chromatin states in male skeletal muscle from the Roadmap Epigenomics Project and GeneHancer information from the Genome Browser (hg38).
Chromatin state in male skeletal muscle from the Roadmap Epigenomics Project
https://onlinelibrary.wiley.com/doi/full/10.1002/jcsm.12556
Detailed information on the 156 CpGs automatically selected by the elastic net model.
CpGs_in_MEAT2.0
A data frame with 157 rows and 6 variables:
CpG name
Weight given by the elastic net model to the CpG
Chromosome where the CpG is located
Position in bp where the CpG is located (human genome build version hg38)
Gene annotated to the CpG. Each CpG was annotated to one or more genes using the annotation file from Zhou et al. to which we added annotation to long-range interaction promoters using chromatin states in male skeletal muscle from the Roadmap Epigenomics Project and GeneHancer information from the Genome Browser (hg38).
Chromatin state in male skeletal muscle from the Roadmap Epigenomics Project
Chromatin state in female skeletal muscle from the Roadmap Epigenomics Project
https://onlinelibrary.wiley.com/doi/full/10.1002/jcsm.12741
An object with S3 class "glmnet","elnet" generated by training on 682 skeletal muscle DNA methylation profiles against a transformed version of age. This elastic net model can take in any skeletal muscle DNA methylation profile that has been cleaned and calibrated to the GSE50498 gold standard dataset, to estimate epigenetic age in the sample.
elasticnet_model_MEAT
An elastic net model
An object with S3 class "glmnet","elnet" generated by training on 1,053 skeletal muscle DNA methylation profiles against a transformed version of age. This elastic net model can take in any skeletal muscle DNA methylation profile that has been cleaned and calibrated to the GSE50498 gold standard dataset, to estimate epigenetic age in the sample.
elasticnet_model_MEAT2.0
An elastic net model
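As a rough illustration of how a fitted elastic net turns a methylation profile into a prediction, the sketch below computes the linear predictor (intercept plus weighted sum of beta-values) for two toy samples. The weights, the intercept, the CpG names and the column names CpG and Weight are all hypothetical, and the result is on the transformed age scale; the back-transformation to years is not shown because it is not described here.

```r
# Hypothetical CpG weights and intercept (not taken from the package).
weights <- data.frame(
  CpG    = c("cg01", "cg02", "cg03"),
  Weight = c(2.0, -1.0, 0.5)
)
intercept <- 1.5

# Toy calibrated beta-values: CpGs in rows, samples in columns.
beta <- matrix(
  c(0.2, 0.5, 0.8,    # sample s1
    0.4, 0.1, 0.6),   # sample s2
  nrow = 3,
  dimnames = list(c("cg01", "cg02", "cg03"), c("s1", "s2"))
)

# Linear predictor per sample: intercept + sum(weight * beta-value).
# Rows of beta are matched to the weights by CpG name.
score <- intercept +
  as.vector(crossprod(beta[weights$CpG, , drop = FALSE], weights$Weight))
names(score) <- colnames(beta)
```

Only CpGs with a non-zero weight contribute to the score, which is why the elastic net's selected CpGs are enough to compute a prediction.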
epiage_estimation takes as input a SummarizedExperiment-class object whose assays contain a beta-matrix called "beta". This beta-matrix should contain DNA methylation profiles in skeletal muscle that have been cleaned with clean_beta and calibrated with BMIQcalibration. epiage_estimation will use the muscle clock to estimate epigenetic age in each sample.
epiage_estimation(SE = NULL, version = "MEAT2.0", age_col_name = NULL)
SE |
A |
version |
A character specifying which version of the epigenetic clock
you would like to use. By default, |
age_col_name |
The name of the column in colData from |
epiage_estimation estimates epigenetic age for each sample in the input SE based on DNA methylation profiles. SE needs to be a SummarizedExperiment-class object containing a matrix of beta-values called "beta" in assays. Beta must have been calibrated to the gold standard GSE50498 using BMIQcalibration to obtain good estimates of epigenetic age.
A SummarizedExperiment-class object identical to the input SE, with components added to colData. If no phenotypes were provided in the colData of the input SE, epiage_estimation will put in colData a tibble containing a single column called "DNAmage", corresponding to epigenetic age (in years) for each sample. If phenotypes were provided in the colData of the input SE, epiage_estimation will add to the existing colData three columns:
DNAmage: epigenetic age (in years).
AAdiff: the difference between predicted and actual age (in years).
AAresid: the residuals of a linear model (using lm) of DNAmage against actual age.
AAresid is only returned if the number of samples is > 2, as a linear model fitted to 2 or fewer samples leaves no informative residuals.
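The two age acceleration measures can be reproduced by hand in base R. The sketch below uses made-up ages and epigenetic ages, and it assumes that AAdiff is computed as predicted minus actual age; it is an illustration of the definitions, not the internal code of epiage_estimation.

```r
# Made-up chronological ages and epigenetic ages for five samples.
pheno <- data.frame(
  Age     = c(25, 40, 55, 63, 71),
  DNAmage = c(30, 38, 57, 60, 75)
)

# AAdiff: predicted minus actual age, in years (sign convention assumed).
pheno$AAdiff <- pheno$DNAmage - pheno$Age

# AAresid: residuals of a linear model of DNAmage against actual age.
pheno$AAresid <- residuals(lm(DNAmage ~ Age, data = pheno))

# With only two samples the fitted line passes through both points,
# so the residuals are zero (up to rounding) and carry no information.
two_samples <- pheno[1:2, c("Age", "DNAmage")]
resid_two <- residuals(lm(DNAmage ~ Age, data = two_samples))
```

Unlike AAdiff, AAresid is centred on zero within the dataset and is uncorrelated with age by construction, so it is less sensitive to any systematic over- or under-estimation by the clock.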
BMIQ for the original BMIQ algorithm, https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-10-r115 for the adapted version of the BMIQ algorithm, and https://onlinelibrary.wiley.com/doi/full/10.1002/jcsm.12556 for the elastic net model of the muscle clock.
# Load matrix of beta-values of two individuals from dataset GSE121961
data("GSE121961", envir = environment())

# Load phenotypes of the two individuals from dataset GSE121961
data("GSE121961_pheno", envir = environment())

# Create a SummarizedExperiment object to coordinate phenotypes and
# methylation into one object.
library(SummarizedExperiment)
GSE121961_SE <- SummarizedExperiment(assays = list(beta = GSE121961),
                                     colData = GSE121961_pheno)

# Run clean_beta() to clean the beta-matrix
GSE121961_SE_clean <- clean_beta(SE = GSE121961_SE, version = "MEAT2.0")

# Run BMIQcalibration() to calibrate the clean beta-matrix
GSE121961_SE_calibrated <- BMIQcalibration(SE = GSE121961_SE_clean,
                                           version = "MEAT2.0")

# Run epiage_estimation() to obtain DNAmage + optionally AAdiff and AAresid
GSE121961_SE_epiage <- epiage_estimation(SE = GSE121961_SE_calibrated,
                                         version = "MEAT2.0",
                                         age_col_name = "Age")
colData(GSE121961_SE_epiage)
Gold standard dataset GSE50498 containing the mean methylation across 24 young and 24 old individuals at the 19,401 CpGs used to calibrate DNA methylation profiles.
gold.mean.MEAT
A data frame with 19,401 rows and 2 variables:
CpG name
mean methylation across all samples at the corresponding CpG (between 0 and 1)
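For readers who want to see what a two-column table of this kind looks like, here is a minimal sketch that averages a toy beta-matrix across samples. The CpG names, values and the column names CpG and gold.mean are assumptions for illustration; this is not how the packaged object was necessarily built.

```r
# Toy beta-matrix: CpGs in rows, samples in columns (values are made up).
beta <- matrix(
  c(0.10, 0.30, 0.20,
    0.80, 0.90, 0.70),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("cg01", "cg02"), c("s1", "s2", "s3"))
)

# One row per CpG: its name and its mean methylation across all samples.
gold_mean <- data.frame(
  CpG       = rownames(beta),
  gold.mean = rowMeans(beta),
  row.names = NULL
)
```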
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE50nnn/GSE50498/matrix/
Gold standard dataset GSE50498 containing the mean methylation across 24 young and 24 old individuals at the 18,747 CpGs used to calibrate DNA methylation profiles.
gold.mean.MEAT2.0
A data frame with 18,747 rows and 2 variables:
CpG name
mean methylation across all samples at the corresponding CpG (between 0 and 1)
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE50nnn/GSE50498/matrix/
GSE121961 dataset containing 2 DNA methylation profiles generated with the HMEPIC technology, and used here as a test dataset.
GSE121961
A data frame with 866,091 CpGs (rows) and 2 individuals (columns)
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE121nnn/GSE121961/matrix/
GSE121961_pheno contains information on sex, age (missing for the controls), and group (Control, or SELENON/RYR mutant) for the 2 samples in the GSE121961 DNA methylation dataset.
GSE121961_pheno
A data frame with 2 samples (rows) and 4 phenotypes (columns).
https://onlinelibrary.wiley.com/doi/abs/10.1002/humu.23745