Title: | Methylated CpGs Set Enrichment Analysis |
---|---|
Description: | Identification of diferentially methylated regions (DMRs) in predefined regions (promoters, CpG islands...) from the human genome using Illumina's 450K or EPIC microarray data. Provides methods to rank CpG probes based on linear models and includes plotting functions. |
Authors: | Jordi Martorell-Marugán and Pedro Carmona-Sáez |
Maintainer: | Jordi Martorell-Marugán <[email protected]> |
License: | GPL-2 |
Version: | 1.27.2 |
Built: | 2025-01-03 06:54:05 UTC |
Source: | https://github.com/bioc/mCSEA |
Identification of diferentially methylated regions (DMRs) in predefined regions (promoters, CpG islands...) from the human genome using Illumina's 450K or EPIC microarray data. Provides methods to rank CpG probes based on linear models and includes plotting functions.
Jordi Martorell Marugán
Maintainer: Jordi Martorell Marugán<[email protected]>
## Not run: library(mCSEA) data(mcseadata) myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") myResults <- mCSEATest(myRank, regionsTypes = "promoters", platform = "EPIC") ## End(Not run) data(precomputedmCSEA) head(myResults$promoters)
## Not run: library(mCSEA) data(mcseadata) myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") myResults <- mCSEATest(myRank, regionsTypes = "promoters", platform = "EPIC") ## End(Not run) data(precomputedmCSEA) head(myResults$promoters)
exprTest is a subset of 100 genes' microarray expression data for 20 bone marrow samples: 10 from Acute Lymphoblastic Leukemia patients and 10 from healthy patients. It is useful to test mCSEAIntegrate function.
data(exprTest)
data(exprTest)
matrix
Obtained from the leukemiasEset data package
Uses mCSEA methylation analysis results and expression values to search for significant correlations between DMRs methylation and close genes expression.
mCSEAIntegrate(mCSEAResults, exprData, regionType = c("promoters", "genes", "CGI", "custom"), geneIDs = "SYMBOL", dmrName = NULL, pcutoff = 0.05, minCor = 0.5, minP = 0.05, makePlot = TRUE, folder = ".", nproc = 1)
mCSEAIntegrate(mCSEAResults, exprData, regionType = c("promoters", "genes", "CGI", "custom"), geneIDs = "SYMBOL", dmrName = NULL, pcutoff = 0.05, minCor = 0.5, minP = 0.05, makePlot = TRUE, folder = ".", nproc = 1)
mCSEAResults |
The object generated by mCSEATest function |
exprData |
A matrix or data frame with genes in rows and samples in columns. A SummarizedExperiment object can be used too |
regionType |
The region types to be represented. Must be one or more of "promoters", "genes", "CGI" and "custom" |
geneIDs |
Gene identifiers used in exprData. One of "SYMBOL", "ENSEMBL", "ENTREZID", "GENEID", "REFSEQ" or "UNIGENE" |
dmrName |
The DMR of interest to correlate with expression (e.g. gene name, CGI name...). If NULL (default), all DMRs with P-Value < pcutoff are selected |
pcutoff |
P-Value threshold to select DMRs if dmrName = NULL |
minCor |
Correlation threshold to output the results |
minP |
Correlation P-Value threshold to output the results |
makePlot |
If TRUE, generate corelation and save them in the folder specified by folder parameter |
folder |
Directory to save the correlation plots if makePlot = TRUE |
nproc |
Number of processors to be used |
A data.frame with the integration results.
Jordi Martorell Marugán, [email protected]
data(precomputedmCSEA) data(exprTest) resultsInt <- mCSEAIntegrate(myResults, exprTest, "promoters", "ENSEMBL", "GATA2", makePlot = FALSE) resultsInt
data(precomputedmCSEA) data(exprTest) resultsInt <- mCSEAIntegrate(myResults, exprTest, "promoters", "ENSEMBL", "GATA2", makePlot = FALSE) resultsInt
Generate a graphic with the genomic context of the selected DMR, showing methylation status at each CpG site of different samples groups
mCSEAPlot(mCSEAResults, regionType, dmrName, extend = 1000, chromosome = TRUE, leadingEdge = TRUE, CGI = FALSE, genes = TRUE, transcriptAnnotation = "transcript", makePDF = TRUE, col = c("blue", "magenta", "green", "red", "black"))
mCSEAPlot(mCSEAResults, regionType, dmrName, extend = 1000, chromosome = TRUE, leadingEdge = TRUE, CGI = FALSE, genes = TRUE, transcriptAnnotation = "transcript", makePDF = TRUE, col = c("blue", "magenta", "green", "red", "black"))
mCSEAResults |
The object generated by mCSEATest function |
regionType |
The region type to be represented. Must be one of "promoters", "genes", "CGI" or "custom" |
dmrName |
The DMR of interest to be represented (e.g. gene name, CGI name...) |
extend |
The number of base pairs to extend the plot around the DMR of interest (default = 1000 bp) |
chromosome |
If TRUE, represent the chromosome and genome axis |
leadingEdge |
If TRUE, represent the bars indicating if each CpG belongs to the mCSEA leading edge (green) or not (red) |
CGI |
If TRUE, represent the annotated CpG islands |
genes |
If TRUE, represent the annotated genes |
transcriptAnnotation |
Labels showed at the genes track. Must be one of "transcript" (default), "symbol", "gene", "exon" or "feature" |
makePDF |
If TRUE, save the plot in pdf format in the working directory. Otherwise, draw the plot in the active graphics window |
col |
Vector with colors to plot methylation in different groups |
'NULL'
Jordi Martorell Marugán, [email protected]
rankProbes
, mCSEATest
,
mCSEAPlotGSEA
library(mCSEAdata) data(mcseadata) ## Not run: myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") set.seed(123) myResults <- mCSEATest(myRank, betaTest, phenoTest, regionsTypes = "promoters", platform = "EPIC") ## End(Not run) data(precomputedmCSEA) mCSEAPlot(myResults, "promoters", "CLIC6", transcriptAnnotation = "symbol", makePDF = FALSE)
library(mCSEAdata) data(mcseadata) ## Not run: myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") set.seed(123) myResults <- mCSEATest(myRank, betaTest, phenoTest, regionsTypes = "promoters", platform = "EPIC") ## End(Not run) data(precomputedmCSEA) mCSEAPlot(myResults, "promoters", "CLIC6", transcriptAnnotation = "symbol", makePDF = FALSE)
Generate an enrichment plot
mCSEAPlotGSEA(rank, mCSEAResults, regionType, dmrName)
mCSEAPlotGSEA(rank, mCSEAResults, regionType, dmrName)
rank |
A named numeric vector with the ranking statistic of each CpG site |
mCSEAResults |
The object generated by mCSEATest function |
regionType |
The region type to be represented. Must be one of "promoters", "genes", "CGI" or "custom" |
dmrName |
The DMR of interest to be represented (e.g. gene name, CGI name...) |
'NULL'
Jordi Martorell Marugán, [email protected]
Subramanian, A. et al (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles . PNAS 102, 15545-15550.
rankProbes
, mCSEATest
,
mCSEAPlot
## Not run: library(mCSEAdata) data(mcseadata) myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") set.seed(123) myResults <- mCSEATest(myRank, regionsTypes = "promoters", platform = "EPIC") ## End(Not run) data(precomputedmCSEA) mCSEAPlotGSEA(myRank, myResults, "promoters", "CLIC6")
## Not run: library(mCSEAdata) data(mcseadata) myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") set.seed(123) myResults <- mCSEATest(myRank, regionsTypes = "promoters", platform = "EPIC") ## End(Not run) data(precomputedmCSEA) mCSEAPlotGSEA(myRank, myResults, "promoters", "CLIC6")
Perform a methylated CpG sites enrichment analysis in predefined genomic regions
mCSEATest(rank, methData, pheno = NULL, column = 1, regionsTypes = c("promoters", "genes", "CGI"), customAnnotation = NULL, minCpGs = 5, nproc = 1, nperm = NULL, platform = "450k")
mCSEATest(rank, methData, pheno = NULL, column = 1, regionsTypes = c("promoters", "genes", "CGI"), customAnnotation = NULL, minCpGs = 5, nproc = 1, nperm = NULL, platform = "450k")
rank |
A named numeric vector with the ranking statistic of each CpG site |
methData |
A data frame or a matrix containing Illumina's CpG probes in rows and samples in columns. A SummarizedExperiment object can be used too |
pheno |
A data frame or a matrix containing samples in rows and covariates in columns. If NULL (default), pheno is extracted from the SummarizedExperiment object |
column |
The column name or number from pheno used to split the samples into groups (first column is used by default) |
regionsTypes |
A character or character vector indicating the predefined regions to be analyzed. NULL to skip this step and use customAnnotation. |
customAnnotation |
An optional list with the CpGs associated to each feature (default = NULL) |
minCpGs |
Minimum number of CpGs associated to a region. Regions below this threshold are not tested |
nproc |
Number of processors to use in GSEA step (default = 1) |
nperm |
(deprecated) Number of permutations to do in GSEA step in the previous implementation. Now, this parameter is ignored |
platform |
Platform used to get the methylation data (450k or EPIC) |
A list with the results of each of the analyzed regions. For each region type, a data frame with the results and a list with the probes associated to each region are generated. In addition, this list also contains the input methData, pheno and platform objects
Jordi Martorell Marugán, [email protected]
Subramanian, A. et al (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles . PNAS 102, 15545-15550.
rankProbes
, mCSEAPlot
,
mCSEAPlotGSEA
## Not run: library(mCSEAdata) data(mcseadata) myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") set.seed(123) myResults <- mCSEATest(myRank, betaTest, phenoTest, regionsTypes = "promoters", platform = "EPIC") ## End(Not run) data(precomputedmCSEA) head(myResults[["promoters"]]) head(myResults[["promoters_association"]])
## Not run: library(mCSEAdata) data(mcseadata) myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") set.seed(123) myResults <- mCSEATest(myRank, betaTest, phenoTest, regionsTypes = "promoters", platform = "EPIC") ## End(Not run) data(precomputedmCSEA) head(myResults[["promoters"]]) head(myResults[["promoters_association"]])
myRank is a result of rankProbes function. myResults is a result of mCSEATest function.
data(precomputedmCSEA)
data(precomputedmCSEA)
vector (myRank
) and list data.frame (myResults
)
Both objects were obtained with the example commands in the mCSEA vignette.
Apply a linear model to Illumina's 450k or EPIC methylation data to get the t-value of each CpG probe
rankProbes(methData, pheno = NULL, paired = FALSE, explanatory = 1, covariates = c(), pairColumn = c(), caseGroup = 1, refGroup = 2, continuous = NULL, typeInput = "beta", typeAnalysis = "M")
rankProbes(methData, pheno = NULL, paired = FALSE, explanatory = 1, covariates = c(), pairColumn = c(), caseGroup = 1, refGroup = 2, continuous = NULL, typeInput = "beta", typeAnalysis = "M")
methData |
A data frame or a matrix containing Illumina's CpG probes in rows and samples in columns. A SummarizedExperiment object can be used too |
pheno |
A data frame or a matrix containing samples in rows and covariates in columns. If NULL (default), pheno is extracted from the SummarizedExperiment object |
paired |
Perform a paired t-test (default = FALSE) |
explanatory |
The column name or position from pheno used to perform the comparison between groups (default = first column) |
covariates |
A list or character vector with column names from pheno used as data covariates in the linear model |
pairColumn |
Only for paired analysis. The column name or position from pheno used to connect the paired samples (default = NULL) |
caseGroup |
The group name or position from explanatory variable used as cases to perform the comparison (default = first group) |
refGroup |
The group name or position from explanatory variable used as reference to perform the comparison (default = second group) |
continuous |
A list or character vector with columns names from pheno which should be treated as continuous variables (default = none) |
typeInput |
Type of input methylation data. "beta" for Beta-values and "M" for M-values |
typeAnalysis |
"M" to use M-values to rank the CpG probes (default). "beta" to use Beta-values instead |
A named vector containing the t-values from the linear model for each CpG probe
Jordi Martorell Marugán, [email protected]
Smyth, G. K. (2005). Limma: linear models for microarray data. Bioinformatics and Computational Biology Solutions using R and Bioconductor, 397-420.
data(mcseadata) myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") head(myRank)
data(mcseadata) myRank <- rankProbes(betaTest, phenoTest, refGroup = "Control") head(myRank)