Title: | Placental DNA methylation analysis tools |
---|---|
Description: | This package contains R functions to predict biological variables from placental DNA methylation data generated from Infinium arrays. This includes inferring ethnicity/ancestry, gestational age, and cell composition from placental DNA methylation array (450k/850k) data. |
Authors: | Victor Yuan [aut, cre], Wendy P. Robinson [aut, ctb], Icíar Fernández-Boyano [aut, ctb] |
Maintainer: | Victor Yuan <[email protected]> |
License: | GPL-2 |
Version: | 1.15.5 |
Built: | 2025-03-04 06:22:45 UTC |
Source: | https://github.com/bioc/planet |
Coefficients from the three placental gestational age clocks from Lee Y et al. 2019.
Reference: Lee Y, Choufani S, Weksberg R, et al. Placental epigenetic clocks: estimating gestational age using placental DNA methylation levels. Aging (Albany NY). 2019;11(12):4238–4253. doi:10.18632/aging.102049. PMID: 31235674
data(ageCpGs)
A tibble with coefficients for the RPC, CPC, and refined RPC.
1860 CpGs used to predict ethnicity.
See Yuan et al. 2019 for details.
data(ethnicityCpGs)
A character vector of length 1860
https://pubmed.ncbi.nlm.nih.gov/31399127/
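Because the classifier expects all of its predictor CpGs, it can help to check how many of them are present in a beta matrix before predicting. The following is a minimal base R sketch, not part of planet: `predictors` is a stand-in for `ethnicityCpGs` with hypothetical probe IDs, and `betas` is a toy matrix.

```r
# Stand-in for the ethnicityCpGs character vector (hypothetical probe IDs).
predictors <- c("cg00000001", "cg00000002", "cg00000003", "cg00000004")

# Toy beta matrix: CpGs in rows, samples in columns; one predictor is absent.
betas <- matrix(
  c(0.10, 0.80, 0.55, 0.20, 0.75, 0.60),
  nrow = 3,
  dimnames = list(c("cg00000001", "cg00000002", "cg00000004"),
                  c("sample_1", "sample_2"))
)

# Predictors that are present in, or absent from, the row names of the data.
present <- intersect(predictors, rownames(betas))
absent <- setdiff(predictors, rownames(betas))

# Fraction of predictors covered by the data.
coverage <- length(present) / length(predictors)
message(length(present), " of ", length(predictors), " predictors found")
```

With real data, `predictors` would be replaced by the vector loaded with `data(ethnicityCpGs)`.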
6 DNA methylation profiles from preeclampsia and healthy control placentas. This data was downloaded from:
"Genome wide DNA methylation profiling of normal and preeclampsia placental samples. Illumina Infinium HumanMethylation450 BeadChip (450K array) was used to obtain DNA methylation profiles in placental samples. Samples included 16 samples from healthy uncomplicated pregnancies and 8 samples from pregnancies affected by preeclampsia." - from Yeung et al.
The DNA methylation data for 24 placental samples were downloaded from GSE75196. After normalizing using minfi::preprocessNoob and wateRmelon::BMIQ, the data were filtered to 6/24 samples and 10,000 random CpGs + those CpGs used in the gestational age clock and ethnicity classifier.
Reference: Yeung KR, Chiu CL, Pidsley R, Makris A et al. DNA methylation profiles in preeclampsia and healthy control placentas. Am J Physiol Heart Circ Physiol 2016 May 15;310(10):H1295-303. PMID:26968548
data(plBetas)
A matrix
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75196
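The filtering step described for plBetas (a subset of samples, a random draw of CpGs, plus the predictor CpGs) can be sketched in base R as follows. This is an illustration on simulated data with smaller, made-up dimensions, not the script used to build the dataset; `predictor_cpgs` and the choice of the first six samples are assumptions.

```r
set.seed(1)  # make the random draw reproducible

# Toy stand-in for a normalized beta matrix (CpGs in rows, samples in columns).
n_cpgs <- 500
n_samples <- 24
betas_full <- matrix(
  runif(n_cpgs * n_samples),
  nrow = n_cpgs,
  dimnames = list(paste0("cg", seq_len(n_cpgs)),
                  paste0("sample_", seq_len(n_samples)))
)

# Hypothetical predictor CpGs that must be retained (clock + classifier).
predictor_cpgs <- c("cg5", "cg42", "cg317")

# Draw a random subset of CpGs, then add the predictors back in.
random_cpgs <- sample(rownames(betas_full), size = 100)
keep_cpgs <- union(random_cpgs, predictor_cpgs)

# Keep 6 of the 24 samples (here simply the first six, an arbitrary choice).
keep_samples <- colnames(betas_full)[1:6]

betas_small <- betas_full[keep_cpgs, keep_samples]
dim(betas_small)
```

Using `union()` rather than `c()` avoids duplicated rows when a predictor CpG also happens to be in the random draw.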
First trimester coefficients for placental cellular deconvolution from Yuan V et al. 2020.
Reference: to be edited PMID: to be edited
data(plCellCpGsFirst)
A matrix with coefficients for Trophoblasts, Stromal, Endothelial, Hofbauer cells, nRBCs, and Syncytiotrophoblasts.
Third trimester coefficients for placental cellular deconvolution from Yuan V et al. 2020.
Reference: to be edited PMID: to be edited
data(plCellCpGsThird)
A matrix with coefficients for Trophoblasts, Stromal, Endothelial, Hofbauer cells, nRBCs, and Syncytiotrophoblasts.
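To illustrate how a reference matrix of this shape can be used for deconvolution, the sketch below estimates cell proportions for one simulated bulk sample by unconstrained least squares, followed by clipping and rescaling. This is a simplified stand-in chosen for illustration and is not the procedure used by planet or by dedicated deconvolution packages, which apply constrained fits. The reference values are simulated, not the actual coefficients.

```r
set.seed(42)

cell_types <- c("Trophoblasts", "Stromal", "Hofbauer",
                "Endothelial", "nRBC", "Syncytiotrophoblast")

# Toy reference matrix: CpGs in rows, cell types in columns (simulated values,
# not the plCellCpGsThird coefficients).
reference <- matrix(
  runif(60 * length(cell_types)),
  nrow = 60,
  dimnames = list(paste0("cg", 1:60), cell_types)
)

# Simulate one bulk sample as a known mixture of the reference profiles.
true_props <- c(0.10, 0.10, 0.05, 0.10, 0.05, 0.60)
names(true_props) <- cell_types
bulk <- as.vector(reference %*% true_props)

# Unconstrained least squares fit of bulk ~ reference (no intercept).
est <- qr.solve(reference, bulk)

# Crude post-processing: clip negatives and rescale to sum to 1.
est <- pmax(est, 0)
est <- est / sum(est)
round(est, 3)
```

Because the simulated sample contains no noise, the estimate recovers the mixing proportions; real data would not behave this cleanly.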
A nice color palette for placental cell types.
Used in Yuan V et al. 2020.
Contains colors for:
Syncytiotrophoblast
Trophoblast
Stromal
Hofbauer
Endothelial
nRBCs
data(plColors)
An object of class character of length 6.
pl_betas
Sex, disease, and gestational age information associated with pl_betas.
Downloaded from the GEO accession GSE75196.
Reference: Yeung KR, Chiu CL, Pidsley R, Makris A et al. DNA methylation profiles in preeclampsia and healthy control placentas. Am J Physiol Heart Circ Physiol 2016 May 15;310(10):H1295-303. PMID: 26968548
data(plPhenoData)
data(plPhenoData)
A tibble
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75196
Copied from mixOmicsTeam/mixOmics/refs/heads/master/R/predict.R on 2025 Jan 30. Some components that are not used in planet have been omitted.
## S3 method for class 'mixo_pls'
predict(
  object,
  newdata,
  study.test,
  dist = c("all", "max.dist", "centroids.dist", "mahalanobis.dist"),
  multilevel = NULL,
  ...
)

## S3 method for class 'mixo_spls'
predict(
  object,
  newdata,
  study.test,
  dist = c("all", "max.dist", "centroids.dist", "mahalanobis.dist"),
  multilevel = NULL,
  ...
)
object |
object of class inheriting from
|
newdata |
data matrix in which to look for explanatory variables to be used for prediction. Please note that this method does not perform multilevel decomposition or log ratio transformations, which need to be processed beforehand. |
study.test |
For MINT objects, grouping factor indicating which samples
of |
dist |
distance to be applied for discriminant methods to predict the
class of new data, should be a subset of |
multilevel |
Design matrix for multilevel analysis (for repeated
measurements). A numeric matrix or data frame. For a one level factor
decomposition, the input is a vector indicating the repeated measures on
each individual, i.e. the individuals ID. For a two level decomposition with
splsda models, the two factors are included in Y. Finally for a two level
decomposition with spls models, 2nd AND 3rd columns in design indicate those
factors (see example in |
... |
not used currently. |
predict produces a list with the following components:
predict |
predicted response values. The dimensions correspond to the observations, the response variables and the model dimension, respectively. For a supervised model, it corresponds to the predicted dummy variables. |
variates |
matrix of predicted variates. |
B.hat |
matrix of regression coefficients (without the intercept). |
AveragedPredict |
if more than one block, returns the average predicted
values over the blocks (using the |
WeightedPredict |
if more than one block, returns the weighted average
of the predicted values over the blocks (using the |
class |
predicted class of |
MajorityVote |
if more than one block, returns the majority class over the blocks. NA for a sample means that there is no consensus on the predicted class for this particular sample over the blocks. |
WeightedVote |
if more than one block, returns the weighted majority class over the blocks. NA for a sample means that there is no consensus on the predicted class for this particular sample over the blocks. |
weights |
Returns the weights of each block used for the weighted predictions, for each nrepeat and each fold |
centroids |
matrix of coordinates for centroids. |
dist |
type of distance requested. |
vote |
majority vote result for multi block analysis (see details above). |
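For a supervised model, the predicted dummy variables are turned into class labels by a distance rule. The sketch below illustrates the idea for the "max.dist" option, under the assumption that it assigns each sample to the class with the largest predicted dummy value. It uses base R only and made-up numbers; it does not call mixOmics.

```r
# Toy predicted dummy values: samples in rows, one column per class.
# Values are made up for illustration.
pred_dummy <- matrix(
  c(0.82, 0.18,
    0.35, 0.65,
    0.51, 0.49),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("sample_", 1:3), c("early-PE", "normotensive"))
)

# Maximum-distance rule (assumed): assign each sample to the class whose
# predicted dummy value is largest.
pred_class <- colnames(pred_dummy)[max.col(pred_dummy, ties.method = "first")]
names(pred_class) <- rownames(pred_dummy)
pred_class
```

The centroid and Mahalanobis options instead compare the predicted variates with class centroids, which is not shown here.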
Florian Rohart, Sébastien Déjean, Ignacio González, Kim-Anh Lê Cao, Al J Abadi
Rohart F, Gautier B, Singh A, Lê Cao K-A. mixOmics: an R package for 'omics feature selection and multiple data integration. PLoS Comput Biol 13(11): e1005752
Tenenhaus, M. (1998). La regression PLS: theorie et pratique. Paris: Editions Technic.
http://www.mixOmics.org for more details.
# example code
predictAge multiplies the coefficients from one of three epigenetic gestational age clocks by the corresponding CpGs in a supplied betas data.frame.
predictAge(betas, type = "RPC")
betas |
An n by m data frame of methylation values on the beta scale (0, 1), where the CpGs are arranged in rows and the samples in columns. Should contain all CpGs used in each clock. |
type |
One of the following: "RPC" (Robust), "CPC" (Control), or "RRPC" (Refined Robust). |
Predicts gestational age using one of 3 placental gestational age clocks: RPC, CPC, or refined RPC. Requires placental DNA methylation measured on the Infinium 27K/450k/EPIC methylation array. It is recommended that your data contain all predictive CpGs; if some are missing, accuracy may be reduced.
A vector of length m, containing inferred gestational age.
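The computation described above, a weighted sum of beta values per sample, can be sketched in base R as follows. The coefficients and probe IDs are invented, the presence of an intercept term is an assumption, and the handling of missing CpGs (using only the shared ones) is an illustrative choice rather than a description of what predictAge does internally.

```r
# Hypothetical clock coefficients: an intercept plus one weight per CpG.
# These numbers are invented and are not the Lee et al. coefficients.
intercept <- 20
coefs <- c(cg0001 = 12.5, cg0002 = -8.0, cg0003 = 4.2)

# Toy betas: CpGs in rows, samples in columns; cg0003 is absent on purpose.
betas <- matrix(
  c(0.60, 0.30, 0.10,
    0.80, 0.20, 0.50),
  nrow = 3,
  dimnames = list(c("cg0001", "cg0002", "cg9999"), c("s1", "s2"))
)

# Use only the clock CpGs that are present in the data.
shared <- intersect(names(coefs), rownames(betas))
if (length(shared) < length(coefs)) {
  message(length(coefs) - length(shared), " clock CpG(s) missing from betas")
}

# Weighted sum per sample: intercept + sum(coefficient * beta).
weighted <- coefs[shared] %*% betas[shared, , drop = FALSE]
inferred_ga <- intercept + as.vector(weighted)
names(inferred_ga) <- colnames(betas)
inferred_ga
```

Dropping a CpG removes its term from the sum, which is why missing predictors shift the estimate.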
# Load placenta DNAm data
library(dplyr)
data(plBetas)
data(plPhenoData)

plPhenoData %>%
  mutate(inferred_ga = predictAge(plBetas, type = "RPC"))
Uses 1860 CpGs to predict self-reported ethnicity on placental microarray data.
predictEthnicity(betas, threshold = 0.75, force = FALSE)
betas |
n x m dataframe of methylation values on the beta scale (0, 1), where the variables are arranged in rows, and samples in columns. Should contain all 1860 predictors and be normalized with NOOB and BMIQ. |
threshold |
A probability threshold in the range (0, 1). Samples whose highest class probability falls below the threshold are called 'ambiguous'. Defaults to 0.75. |
force |
run even if predictors are missing. Default is FALSE. |
Predicts self-reported ethnicity from 3 classes: Africans, Asians, and Caucasians, using placental DNA methylation data measured on the Infinium 450k/EPIC methylation array. Will return membership probabilities that often reflect genetic ancestry composition.
The input data should contain all 1860 predictors (CpGs) of the final GLMNET model.
It's recommended to use the same normalization methods used on the training data: NOOB and BMIQ.
A tibble
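The role of the threshold argument can be illustrated with a small base R sketch. It assumes that a sample is labelled as ambiguous when its highest class probability is below the threshold. The probabilities, class labels, and output column names are made up and are not the columns of the returned tibble.

```r
# Toy class probabilities: samples in rows, classes in columns (made up).
probs <- matrix(
  c(0.90, 0.05, 0.05,
    0.40, 0.35, 0.25,
    0.10, 0.80, 0.10),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("sample_", 1:3), c("African", "Asian", "Caucasian"))
)
threshold <- 0.75

# Highest probability and the corresponding class for each sample.
highest_prob <- apply(probs, 1, max)
top_class <- colnames(probs)[max.col(probs, ties.method = "first")]

# Samples below the threshold are labelled ambiguous (assumed rule).
predicted <- ifelse(highest_prob < threshold, "Ambiguous", top_class)

data.frame(sample = rownames(probs), highest_prob, predicted,
           row.names = NULL)
```

Lowering the threshold yields fewer ambiguous calls at the cost of accepting less certain class assignments.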
## To predict ethnicity on 450k/850k samples

# Load placenta DNAm data
data(plBetas)
predictEthnicity(plBetas)
Uses 45 CpGs to predict early preeclampsia (PE delivered before or at 34 weeks of gestation) on placental DNA methylation microarray data.
predictPreeclampsia(betas, ...)
betas |
matrix or array of methylation values on the beta scale (0, 1), where the variables are arranged in rows, and samples in columns. |
... |
feeds into outersect function |
Assigns the class labels "early-PE" or "normotensive" to each sample and returns a class probability.
Produces a list with components detailed in the mixOmics::predict R documentation.
prior to prediction. This was the normalization method used on the training data.
# To predict early preeclampsia on 450k/850k samples

# Load data
library(dplyr)  # provides the %>% pipe used below
library(ExperimentHub)
eh <- ExperimentHub()
query(eh, "eoPredData")

# test object
x_test <- eh[['EH8403']]
x_test %>% predictPreeclampsia()