Title: | Differentially Methylated Regions caller |
---|---|
Description: | Uses bisulfite sequencing data from two conditions to identify differentially methylated regions (DMRs) between the conditions in CG and non-CG contexts. The input consists of the CX report files produced by Bismark and the output is a list of DMRs stored as GRanges objects. |
Authors: | Nicolae Radu Zabet, Jonathan Michael Foonlan Tsang, Alessandro Pio Greco and Ryan Merritt |
Maintainer: | Nicolae Radu Zabet |
License: | GPL-3 |
Version: | 1.39.0 |
Built: | 2024-10-30 05:23:04 UTC |
Source: | https://github.com/bioc/DMRcaller |
This function extracts from the methylation data the total number of reads, the number of methylated reads and the number of cytosines in the specified context for a set of regions (e.g. DMRs).
analyseReadsInsideRegionsForCondition(regions, methylationData, context, label = "", cores = 1)
regions |
a |
methylationData |
the methylation data in one condition
(see |
context |
the context in which to extract the reads ( |
label |
a string to be added to the columns to identify the condition |
cores |
the number of cores used to compute the DMRs. |
a GRanges object with four additional metadata columns:
the number of methylated reads
the total number of reads
the proportion of methylated reads
the number of cytosines in the regions
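The four returned quantities can be illustrated with a minimal base R sketch on toy data. This is not the package's implementation: the data frame layout, the helper `summariseRegion` and the names `sumReadsM`, `sumReadsN`, `proportion` and `cytosinesCount` are hypothetical and chosen only for illustration.

```r
# Toy per-cytosine data (hypothetical values, not from the package datasets)
cytosines <- data.frame(
  pos     = c(5, 12, 20, 33, 41, 58),
  context = c("CHH", "CG", "CHH", "CHH", "CG", "CHH"),
  readsM  = c(1, 8, 0, 2, 9, 3),   # methylated reads per cytosine
  readsN  = c(4, 10, 5, 6, 10, 6)  # total reads per cytosine
)

# Summarise one region [start, end] for a single context (hypothetical helper)
summariseRegion <- function(data, start, end, context) {
  inside <- data$pos >= start & data$pos <= end & data$context == context
  sumM <- sum(data$readsM[inside])
  sumN <- sum(data$readsN[inside])
  c(sumReadsM      = sumM,
    sumReadsN      = sumN,
    proportion     = if (sumN > 0) sumM / sumN else NA_real_,
    cytosinesCount = sum(inside))
}

regionSummary <- summariseRegion(cytosines, start = 1, end = 40,
                                 context = "CHH")
print(regionSummary)
```

Only the cytosines in the requested context that fall inside the region contribute to the sums, which is why a CG DMR can be re-analysed in CHH context as in the example below.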
Nicolae Radu Zabet
filterDMRs, computeDMRs, DMRsNoiseFilterCG, and mergeDMRsIteratively
# load the methylation data
data(methylationDataList)

# load the DMRs in CG context. These DMRs were computed with minGap = 200.
data(DMRsNoiseFilterCG)

# retrieve the number of reads in CHH context in WT
DMRsNoiseFilterCGreadsCHH <- analyseReadsInsideRegionsForCondition(
    DMRsNoiseFilterCG[1:10], methylationDataList[["WT"]],
    context = "CHH", label = "WT")
This function computes the differentially methylated regions between two conditions.
computeDMRs(methylationData1, methylationData2, regions = NULL, context = "CG", method = "noise_filter", windowSize = 100, kernelFunction = "triangular", lambda = 0.5, binSize = 100, test = "fisher", pValueThreshold = 0.01, minCytosinesCount = 4, minProportionDifference = 0.4, minGap = 200, minSize = 50, minReadsPerCytosine = 4, cores = 1)
methylationData1 |
the methylation data in condition 1
(see |
methylationData2 |
the methylation data in condition 2
(see |
regions |
a |
context |
the context in which the DMRs are computed ( |
method |
the method used to compute the DMRs ( |
windowSize |
the size of the triangle base measured in nucleotides.
This parameter is required only if the selected method is
|
kernelFunction |
a |
lambda |
numeric value required for the Gaussian filter
( |
binSize |
the size of the tiling bins in nucleotides. This parameter is
required only if the selected method is |
test |
the statistical test used to call DMRs ( |
pValueThreshold |
DMRs with p-values (when performing the statistical
test; see |
minCytosinesCount |
DMRs with fewer cytosines in the specified context
than |
minProportionDifference |
DMRs where the difference in methylation
proportion between the two conditions is lower than
|
minGap |
DMRs separated by a gap of at least |
minSize |
DMRs with a size smaller than |
minReadsPerCytosine |
DMRs with the average number of reads lower than
|
cores |
the number of cores used to compute the DMRs. |
the DMRs stored as a GRanges object with the following metadata columns:
a number indicating whether the region lost (-1) or gained (+1) methylation in condition 2 compared to condition 1.
the context in which the DMRs were computed ("CG", "CHG" or "CHH").
the number of methylated reads in condition 1.
the total number of reads in condition 1.
the proportion of methylated reads in condition 1.
the number of methylated reads in condition 2.
the total number of reads in condition 2.
the proportion of methylated reads in condition 2.
the number of cytosines in the DMR.
a string indicating whether the region lost ("loss") or gained ("gain") methylation in condition 2 compared to condition 1.
the p-value of the statistical test when the DMR was called, adjusted with the Benjamini and Hochberg method to control the false discovery rate.
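As a rough illustration of how such columns relate to each other, the following base R sketch tests three hypothetical candidate regions with Fisher's exact test, adjusts the p-values with `p.adjust(method = "BH")` and applies thresholds analogous to pValueThreshold and minProportionDifference. It is a sketch under stated assumptions, not the package's code: the counts are invented, the column names are illustrative, and whether the package compares with strict or non-strict inequalities is not stated in this documentation.

```r
# Hypothetical pooled read counts for three candidate regions
candidates <- data.frame(
  sumReadsM1 = c(90, 50, 10),
  sumReadsN1 = c(100, 100, 100),
  sumReadsM2 = c(10, 48, 85),
  sumReadsN2 = c(100, 100, 100)
)

# Fisher's exact test on the 2x2 table of methylated / unmethylated reads
fisherP <- function(m1, n1, m2, n2) {
  fisher.test(matrix(c(m1, n1 - m1, m2, n2 - m2), nrow = 2))$p.value
}
rawP <- mapply(fisherP,
               candidates$sumReadsM1, candidates$sumReadsN1,
               candidates$sumReadsM2, candidates$sumReadsN2)

# Benjamini and Hochberg adjustment to control the false discovery rate
candidates$pValue <- p.adjust(rawP, method = "BH")

# Proportions, direction (-1 / +1) and region type ("loss" / "gain")
candidates$proportion1 <- candidates$sumReadsM1 / candidates$sumReadsN1
candidates$proportion2 <- candidates$sumReadsM2 / candidates$sumReadsN2
candidates$direction   <- sign(candidates$proportion2 - candidates$proportion1)
candidates$regionType  <- ifelse(candidates$direction > 0, "gain", "loss")

# Keep regions passing both thresholds (assumed comparison operators)
keep <- candidates$pValue <= 0.01 &
  abs(candidates$proportion2 - candidates$proportion1) >= 0.4
called <- candidates[keep, ]
print(called)
```

The second candidate differs by only 0.02 in methylation proportion, so it is discarded regardless of its p-value; the first and third are retained as a loss and a gain respectively.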
Nicolae Radu Zabet and Jonathan Michael Foonlan Tsang
filterDMRs, mergeDMRsIteratively, analyseReadsInsideRegionsForCondition and DMRsNoiseFilterCG
# load the methylation data
data(methylationDataList)

# the regions where to compute the DMRs
regions <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E5))

# compute the DMRs in CG context with noise_filter method
DMRsNoiseFilterCG <- computeDMRs(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], regions = regions, context = "CG",
    method = "noise_filter", windowSize = 100, kernelFunction = "triangular",
    test = "score", pValueThreshold = 0.01, minCytosinesCount = 4,
    minProportionDifference = 0.4, minGap = 200, minSize = 50,
    minReadsPerCytosine = 4, cores = 1)

## Not run:
# compute the DMRs in CG context with neighbourhood method
DMRsNeighbourhoodCG <- computeDMRs(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], regions = regions, context = "CG",
    method = "neighbourhood", test = "score", pValueThreshold = 0.01,
    minCytosinesCount = 4, minProportionDifference = 0.4, minGap = 200,
    minSize = 50, minReadsPerCytosine = 4, cores = 1)

# compute the DMRs in CG context with bins method
DMRsBinsCG <- computeDMRs(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], regions = regions, context = "CG",
    method = "bins", binSize = 100, test = "score", pValueThreshold = 0.01,
    minCytosinesCount = 4, minProportionDifference = 0.4, minGap = 200,
    minSize = 50, minReadsPerCytosine = 4, cores = 1)
## End(Not run)
This function computes the differentially methylated regions between two conditions when multiple biological replicates are available.
computeDMRsReplicates(methylationData, condition = NULL, regions = NULL, context = "CG", method = "neighbourhood", binSize = 100, test = "betareg", pseudocountM = 1, pseudocountN = 2, pValueThreshold = 0.01, minCytosinesCount = 4, minProportionDifference = 0.4, minGap = 200, minSize = 50, minReadsPerCytosine = 4, cores = 1)
methylationData |
the methylation data containing all the conditions for all the replicates. |
condition |
a vector of strings indicating the conditions for each
sample in |
regions |
a |
context |
the context in which the DMRs are computed ( |
method |
the method used to compute the DMRs |
binSize |
the size of the tiling bins in nucleotides. This parameter is
required only if the selected method is |
test |
the statistical test used to call DMRs ( |
pseudocountM |
numerical value to be added to the methylated reads before modelling beta regression. |
pseudocountN |
numerical value to be added to the total reads before modelling beta regression. |
pValueThreshold |
DMRs with p-values (when performing the statistical
test; see |
minCytosinesCount |
DMRs with fewer cytosines in the specified context
than |
minProportionDifference |
DMRs where the difference in methylation
proportion between the two conditions is lower than
|
minGap |
DMRs separated by a gap of at least |
minSize |
DMRs with a size smaller than |
minReadsPerCytosine |
DMRs with the average number of reads lower than
|
cores |
the number of cores used to compute the DMRs. |
the DMRs stored as a GRanges object with the following metadata columns:
a number indicating whether the region lost (-1) or gained (+1) methylation in condition 2 compared to condition 1.
the context in which the DMRs were computed ("CG", "CHG" or "CHH").
the number of methylated reads in condition 1.
the total number of reads in condition 1.
the proportion of methylated reads in condition 1.
the number of methylated reads in condition 2.
the total number of reads in condition 2.
the proportion of methylated reads in condition 2.
the number of cytosines in the DMR.
a string indicating whether the region lost ("loss") or gained ("gain") methylation in condition 2 compared to condition 1.
the p-value of the statistical test when the DMR was called, adjusted with the Benjamini and Hochberg method to control the false discovery rate.
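The role of pseudocountM and pseudocountN can be sketched in base R. Beta regression requires response values strictly between 0 and 1, so replicates with no methylated reads or only methylated reads need adjusting first. The formula below, (readsM + pseudocountM) / (readsN + pseudocountN), is an assumption based on the argument descriptions above rather than a confirmed excerpt of the package, and the read counts are invented.

```r
# Hypothetical read counts for one region in four samples
readsM    <- c(0, 3, 10, 7)      # methylated reads per sample
readsN    <- c(10, 10, 10, 10)   # total reads per sample
condition <- c("a", "a", "b", "b")

pseudocountM <- 1
pseudocountN <- 2

# Raw proportions can hit the boundaries 0 and 1
rawProportion <- readsM / readsN

# Assumed adjustment: pseudocounts pull the values into the open interval (0, 1)
adjustedProportion <- (readsM + pseudocountM) / (readsN + pseudocountN)

# Mean adjusted proportion per condition
meanByCondition <- tapply(adjustedProportion, condition, mean)
print(rawProportion)
print(adjustedProportion)
print(meanByCondition)
```

With the defaults of 1 and 2, a sample with no coverage at all would be assigned a proportion of 0.5, which is a neutral starting point for the regression.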
Alessandro Pio Greco and Nicolae Radu Zabet
## Not run:
# starting with data joined using joinReplicates
data("syntheticDataReplicates")

# compute the DMRs in CHH context with neighbourhood method

# creating condition vector
condition <- c("a", "a", "b", "b")

# computing DMRs using the neighbourhood method
DMRsReplicatesNeighbourhood <- computeDMRsReplicates(
    methylationData = methylationData, condition = condition,
    regions = NULL, context = "CHH", method = "neighbourhood",
    test = "betareg", pseudocountM = 1, pseudocountN = 2,
    pValueThreshold = 0.01, minCytosinesCount = 4,
    minProportionDifference = 0.4, minGap = 200, minSize = 50,
    minReadsPerCytosine = 4, cores = 1)
## End(Not run)
This function computes the coverage for bisulfite sequencing data. It returns a vector with the proportion (or raw count) of cytosines whose number of reads is greater than or equal to each value in a vector of specified thresholds.
computeMethylationDataCoverage(methylationData, regions = NULL, context = "CG", breaks = NULL, proportion = TRUE)
methylationData |
the methylation data stored as a |
regions |
a |
context |
the context in which the DMRs are computed ( |
breaks |
a |
proportion |
a |
a vector with the proportion (or raw count) of cytosines whose number of reads is greater than or equal to the threshold values specified in the breaks vector.
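The returned quantity can be reproduced on a toy example in base R. The read counts below are invented and the sketch ignores regions and contexts; it only shows how each threshold in breaks maps to a count and a proportion.

```r
# Hypothetical total read counts for eight cytosines
readsN <- c(0, 2, 5, 7, 12, 20, 1, 15)

# Coverage thresholds, as in the breaks argument
breaks <- c(1, 5, 10, 15)

# Number of cytosines with at least b reads, for each threshold b
coverageCounts <- sapply(breaks, function(b) sum(readsN >= b))

# The same quantity expressed as a proportion of all cytosines
coverageProportion <- coverageCounts / length(readsN)

print(coverageCounts)
print(coverageProportion)
```

The counts correspond to proportion = FALSE and the proportions to proportion = TRUE; both are non-increasing as the threshold grows.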
Nicolae Radu Zabet and Jonathan Michael Foonlan Tsang
plotMethylationDataCoverage, methylationDataList
# load the methylation data
data(methylationDataList)

# compute coverage in CG context
breaks <- c(1,5,10,15)
coverage_CG_wt <- computeMethylationDataCoverage(methylationDataList[["WT"]],
    context="CG", breaks=breaks)
This function computes the correlation of the methylation levels as a function of the distance between cytosines. It returns a vector with the correlation of methylation levels at each distance in a vector of specified distances.
computeMethylationDataSpatialCorrelation(methylationData, regions = NULL, context = "CG", distances = NULL)
methylationData |
the methylation data stored as a |
regions |
a |
context |
the context in which the correlation is computed ( |
distances |
a |
a vector with the correlation of the methylation levels for cytosines located at the distances specified in the distances vector.
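One way to picture a distance-dependent correlation is to pair every cytosine with the cytosine located exactly d nucleotides downstream and correlate the methylation levels of the pairs. The base R sketch below does this on invented data. It is an illustration of the idea only; the helper `correlationAtDistance` is hypothetical and the package may pair cytosines or handle missing partners differently.

```r
# Hypothetical cytosine positions and methylation levels (proportions)
pos   <- c(1, 2, 3, 6, 7, 8, 11, 12, 13)
level <- c(0.9, 0.8, 0.85, 0.1, 0.2, 0.15, 0.7, 0.75, 0.6)

# Pearson correlation between cytosines exactly d nucleotides apart
correlationAtDistance <- function(pos, level, d) {
  partner <- match(pos + d, pos)   # index of the cytosine at pos + d, if any
  ok <- !is.na(partner)
  if (sum(ok) < 3) return(NA_real_)  # too few pairs for a meaningful value
  cor(level[ok], level[partner[ok]])
}

distances <- c(1, 5)
spatialCorrelation <- sapply(distances,
                             function(d) correlationAtDistance(pos, level, d))
print(spatialCorrelation)
```

In this toy data neighbouring cytosines have similar levels, so the correlation at distance 1 is strongly positive, while cytosines 5 nucleotides apart fall in differently methylated blocks and correlate negatively.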
Nicolae Radu Zabet
plotMethylationDataSpatialCorrelation, methylationDataList
# load the methylation data
data(methylationDataList)

# compute spatial correlation in CG context
distances <- c(1,5,10,15)
correlation_CG_wt <- computeMethylationDataSpatialCorrelation(
    methylationDataList[["WT"]], context="CG", distances=distances)
This function computes the low resolution profiles for the bisulfite sequencing data.
computeMethylationProfile(methylationData, region, windowSize = floor(width(region)/500), context = "CG")
methylationData |
the methylation data stored as a |
region |
a |
windowSize |
a |
context |
the context in which the DMRs are computed ( |
a GRanges object with equal sized tiles of the region. The object consists of the following metadata columns:
the number of methylated reads.
the total number of reads.
the proportion of methylated reads.
the context ("CG", "CHG" or "CHH").
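The tiling behind a low resolution profile can be sketched in base R: the region is cut into windows of windowSize nucleotides and the reads of all cytosines in a window are pooled. The data and the column names of the `profile` data frame are hypothetical, and the sketch leaves out the context filter for brevity.

```r
# Hypothetical cytosine positions and read counts inside one region
pos    <- c(3, 8, 14, 19, 22, 27, 35, 38)
readsM <- c(9, 8, 1, 2, 5, 5, 0, 1)
readsN <- rep(10, length(pos))

regionStart <- 1
regionEnd   <- 40
windowSize  <- 10

# Index of the tile each cytosine falls into
tile   <- (pos - regionStart) %/% windowSize + 1
nTiles <- ceiling((regionEnd - regionStart + 1) / windowSize)

# Pool the reads per tile
tileStart <- regionStart + (seq_len(nTiles) - 1) * windowSize
profile <- data.frame(
  start     = tileStart,
  end       = pmin(tileStart + windowSize - 1, regionEnd),
  sumReadsM = vapply(seq_len(nTiles),
                     function(i) sum(readsM[tile == i]), numeric(1)),
  sumReadsN = vapply(seq_len(nTiles),
                     function(i) sum(readsN[tile == i]), numeric(1))
)
profile$proportion <- profile$sumReadsM / profile$sumReadsN
print(profile)
```

Pooling reads before dividing weights each cytosine by its coverage, which differs from averaging per-cytosine proportions when coverage is uneven.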
Nicolae Radu Zabet and Jonathan Michael Foonlan Tsang
plotMethylationProfileFromData, plotMethylationProfile, methylationDataList
# load the methylation data
data(methylationDataList)

# the region where to compute the profile
region <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E6))

# compute low resolution profile in 20 Kb windows
lowResProfileWTCHH <- computeMethylationProfile(methylationDataList[["WT"]],
    region, windowSize = 20000, context = "CHH")

## Not run:
# compute low resolution profile in 10 Kb windows
lowResProfileWTCG <- computeMethylationProfile(methylationDataList[["WT"]],
    region, windowSize = 10000, context = "CG")

lowResProfileMet13CG <- computeMethylationProfile(
    methylationDataList[["met1-3"]], region,
    windowSize = 10000, context = "CG")
## End(Not run)
This function computes the distribution of a subset of regions (GRanges object) over a large region (GRanges object).
computeOverlapProfile(subRegions, largeRegion, windowSize = floor(width(largeRegion)/500), binary = TRUE, cores = 1)
subRegions |
a |
largeRegion |
a |
windowSize |
The |
binary |
a value indicating whether to count 1 for each overlap or to compute the width of the overlap |
cores |
the number of cores used to compute the DMRs. |
a GRanges object with equal sized tiles of the regions. The object has one metadata column, score, which represents the number of subRegions overlapping with the tile when binary = TRUE, and the width of the subRegions overlapping with the tile when binary = FALSE.
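The difference between the two scoring modes can be shown with plain interval arithmetic in base R. The intervals are invented and the helper `overlapWidth` is hypothetical; the sketch only mirrors the description above and is not taken from the package.

```r
# Hypothetical sub-regions (e.g. DMRs), as closed intervals
subStart <- c(5, 18, 32)
subEnd   <- c(12, 24, 36)

# Four equal sized tiles covering positions 1 to 40
tileStart <- c(1, 11, 21, 31)
tileEnd   <- c(10, 20, 30, 40)

# Width of the overlap between two closed intervals (0 if they do not overlap)
overlapWidth <- function(s1, e1, s2, e2) {
  pmax(0, pmin(e1, e2) - pmax(s1, s2) + 1)
}

# Matrix of overlap widths: one row per tile, one column per sub-region
widths <- outer(seq_along(tileStart), seq_along(subStart),
                function(i, j) overlapWidth(tileStart[i], tileEnd[i],
                                            subStart[j], subEnd[j]))

scoreBinary <- rowSums(widths > 0)  # analogous to binary = TRUE
scoreWidth  <- rowSums(widths)      # analogous to binary = FALSE
print(scoreBinary)
print(scoreWidth)
```

A sub-region spanning a tile boundary is counted once in each tile in the binary mode, whereas in the width mode its nucleotides are split between the two tiles.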
Nicolae Radu Zabet
plotOverlapProfile, filterDMRs, computeDMRs and mergeDMRsIteratively
# load the methylation data
data(methylationDataList)

# load the DMRs in CG context
data(DMRsNoiseFilterCG)

# the coordinates of the area to be plotted
largeRegion <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E5))

# compute overlaps distribution
hotspots <- computeOverlapProfile(DMRsNoiseFilterCG, largeRegion,
    windowSize = 10000, binary = FALSE)
Uses bisulfite sequencing data in two conditions and identifies differentially methylated regions between the conditions in CG and non-CG context. The input is the CX report files produced by Bismark and the output is a list of DMRs stored as GRanges objects.
The most important functions in the DMRcaller are:
readBismark
Reads the Bismark CX report files into a GRanges object.
readBismarkPool
Reads multiple CX report files and pools them together.
saveBismark
Saves the methylation data stored in a GRanges object into a Bismark CX report file.
poolMethylationDatasets
Pools together multiple methylation datasets.
poolTwoMethylationDatasets
Pools together two methylation datasets.
computeMethylationDataCoverage
Computes the coverage for the bisulfite sequencing data.
plotMethylationDataCoverage
Plots the coverage for the bisulfite sequencing data.
computeMethylationDataSpatialCorrelation
Computes the correlation between methylation levels as a function of the distances between the Cytosines.
plotMethylationDataSpatialCorrelation
Plots the correlation of methylation levels for Cytosines located at a certain distance apart.
computeMethylationProfile
Computes the low resolution profiles for the bisulfite sequencing data at certain locations.
plotMethylationProfile
Plots the low resolution profiles for the bisulfite sequencing data at certain locations.
plotMethylationProfileFromData
Plots the low resolution profiles for the loaded bisulfite sequencing data.
computeDMRs
Computes the differentially methylated regions between two conditions.
filterDMRs
Filters a list of (potential) differentially methylated regions.
mergeDMRsIteratively
Merge DMRs iteratively.
analyseReadsInsideRegionsForCondition
Analyse reads inside regions for condition.
plotLocalMethylationProfile
Plots the methylation profile at one locus for the bisulfite sequencing data.
computeOverlapProfile
Computes the distribution of a set of subregions on a large region.
plotOverlapProfile
Plots the distribution of a set of subregions on a large region.
getWholeChromosomes
Computes a GRanges object with each chromosome as an element, based on the methylationData.
joinReplicates
Merges two GRanges objects with single reads columns. This step is required before analysing DMRs with biological replicates.
computeDMRsReplicates
Computes the differentially methylated regions between two conditions with multiple biological replicates.
Nicolae Radu Zabet, Jonathan Michael Foonlan Tsang and Alessandro Pio Greco
Maintainer: Nicolae Radu Zabet
See vignette("rd", package = "DMRcaller") for an overview of the package.
## Not run:
# load the methylation data
data(methylationDataList)

# plot the low resolution profile at 5 Kb resolution
par(mar=c(4, 4, 3, 1)+0.1)
plotMethylationProfileFromData(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], conditionsNames=c("WT", "met1-3"),
    windowSize = 5000, autoscale = TRUE,
    context = c("CG", "CHG", "CHH"), labels = LETTERS)

# the region where to compute the profiles (first 1 Mbp of Chromosome 3);
# this definition was missing from the original example
region <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E6))

# compute low resolution profile in 10 Kb windows in CG context
lowResProfileWTCG <- computeMethylationProfile(methylationDataList[["WT"]],
    region, windowSize = 10000, context = "CG")
lowResProfileMet13CG <- computeMethylationProfile(
    methylationDataList[["met1-3"]], region,
    windowSize = 10000, context = "CG")
lowResProfileCG <- GRangesList("WT" = lowResProfileWTCG,
    "met1-3" = lowResProfileMet13CG)

# compute low resolution profile in 10 Kb windows in CHG context
lowResProfileWTCHG <- computeMethylationProfile(methylationDataList[["WT"]],
    region, windowSize = 10000, context = "CHG")
lowResProfileMet13CHG <- computeMethylationProfile(
    methylationDataList[["met1-3"]], region,
    windowSize = 10000, context = "CHG")
lowResProfileCHG <- GRangesList("WT" = lowResProfileWTCHG,
    "met1-3" = lowResProfileMet13CHG)

# plot the low resolution profile
par(mar=c(4, 4, 3, 1)+0.1)
par(mfrow=c(2,1))
plotMethylationProfile(lowResProfileCG, autoscale = FALSE,
    labels = LETTERS[1], title="CG methylation on Chromosome 3",
    col=c("#D55E00","#E69F00"), pch = c(1,0), lty = c(4,1))
plotMethylationProfile(lowResProfileCHG, autoscale = FALSE,
    labels = LETTERS[2], title="CHG methylation on Chromosome 3",
    col=c("#0072B2", "#56B4E9"), pch = c(16,2), lty = c(3,2))

# plot the coverage in all three contexts
plotMethylationDataCoverage(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], breaks = 1:15, regions = NULL,
    conditionsNames = c("WT","met1-3"), context = c("CG", "CHG", "CHH"),
    proportion = TRUE, labels = LETTERS, col = NULL,
    pch = c(1,0,16,2,15,17), lty = c(4,1,3,2,6,5), contextPerRow = FALSE)

# plot the correlation of methylation levels as a function of distance
plotMethylationDataSpatialCorrelation(methylationDataList[["WT"]],
    distances = c(1,5,10,15), regions = NULL,
    conditionsNames = c("WT","met1-3"), context = c("CG"),
    labels = LETTERS, col = NULL,
    pch = c(1,0,16,2,15,17), lty = c(4,1,3,2,6,5), contextPerRow = FALSE)

# the regions where to compute the DMRs
regions <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E6))

# compute the DMRs in CG context with noise_filter method
DMRsNoiseFilterCG <- computeDMRs(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], regions = regions, context = "CG",
    method = "noise_filter", windowSize = 100, kernelFunction = "triangular",
    test = "score", pValueThreshold = 0.01, minCytosinesCount = 4,
    minProportionDifference = 0.4, minGap = 200, minSize = 50,
    minReadsPerCytosine = 4, cores = 1)

# compute the DMRs in CG context with neighbourhood method
DMRsNeighbourhoodCG <- computeDMRs(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], regions = regions, context = "CG",
    method = "neighbourhood", test = "score", pValueThreshold = 0.01,
    minCytosinesCount = 4, minProportionDifference = 0.4, minGap = 200,
    minSize = 50, minReadsPerCytosine = 4, cores = 1)

# compute the DMRs in CG context with bins method
DMRsBinsCG <- computeDMRs(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], regions = regions, context = "CG",
    method = "bins", binSize = 100, test = "score", pValueThreshold = 0.01,
    minCytosinesCount = 4, minProportionDifference = 0.4, minGap = 200,
    minSize = 50, minReadsPerCytosine = 4, cores = 1)

# load the gene annotation data
data(GEs)

# select the genes
genes <- GEs[which(GEs$type == "gene")]

# keep only the genes inside the regions where the DMRs are computed
genes <- genes[overlapsAny(genes, regions)]

# filter genes that are differentially methylated in the two conditions
DMRsGenesCG <- filterDMRs(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], potentialDMRs = genes, context = "CG",
    test = "score", pValueThreshold = 0.01, minCytosinesCount = 4,
    minProportionDifference = 0.4, minReadsPerCytosine = 3, cores = 1)

# merge the DMRs
DMRsNoiseFilterCGLarger <- mergeDMRsIteratively(DMRsNoiseFilterCG,
    minGap = 500, respectSigns = TRUE,
    methylationDataList[["WT"]], methylationDataList[["met1-3"]],
    context = "CG", minProportionDifference=0.4, minReadsPerCytosine = 1,
    pValueThreshold=0.01, test="score", alternative = "two.sided")

# select the genes
genes <- GEs[which(GEs$type == "gene")]

# the coordinates of the area to be plotted
chr3Reg <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(510000,530000))

# load the DMRs in CG context
data(DMRsNoiseFilterCG)

DMRsCGlist <- list("noise filter"=DMRsNoiseFilterCG,
    "neighbourhood"=DMRsNeighbourhoodCG,
    "bins"=DMRsBinsCG,
    "genes"=DMRsGenesCG)

# plot the CG methylation
par(mar=c(4, 4, 3, 1)+0.1)
par(mfrow=c(1,1))
plotLocalMethylationProfile(methylationDataList[["WT"]],
    methylationDataList[["met1-3"]], chr3Reg, DMRsCGlist,
    c("WT", "met1-3"), GEs, windowSize=100, main="CG methylation")

# compute and plot the distribution of DMRs that lost or gained methylation
hotspotsHypo <- computeOverlapProfile(
    DMRsNoiseFilterCG[(DMRsNoiseFilterCG$regionType == "loss")],
    region, windowSize=2000, binary=TRUE, cores=1)
hotspotsHyper <- computeOverlapProfile(
    DMRsNoiseFilterCG[(DMRsNoiseFilterCG$regionType == "gain")],
    region, windowSize=2000, binary=TRUE, cores=1)
plotOverlapProfile(GRangesList("Chr3"=hotspotsHypo),
    GRangesList("Chr3"=hotspotsHyper),
    names=c("loss", "gain"), title="CG methylation")

# loading synthetic data
data("syntheticDataReplicates")

# creating condition vector
condition <- c("a", "a", "b", "b")

# computing DMRs using the neighbourhood method
DMRsReplicatesNeighbourhood <- computeDMRsReplicates(
    methylationData = methylationData, condition = condition,
    regions = NULL, context = "CHH", method = "neighbourhood",
    test = "betareg", pseudocountM = 1, pseudocountN = 2,
    pValueThreshold = 0.01, minCytosinesCount = 4,
    minProportionDifference = 0.4, minGap = 200, minSize = 50,
    minReadsPerCytosine = 4, cores = 1)
## End(Not run)
A GRangesList object containing the DMRs between Wild Type (WT) and met1-3 mutant (met1-3) in Arabidopsis thaliana (see methylationDataList). The DMRs were computed on the first 1 Mbp of Chromosome 3 with the noise filter method, using a triangular kernel and a windowSize of 100 bp.
The GRanges elements contain 11 metadata columns; see computeDMRs.
filterDMRs, computeDMRs, analyseReadsInsideRegionsForCondition and mergeDMRsIteratively
This function extracts GC sites in the genome
extractGC(methylationData, genome, contexts = c("ALL", "CG", "CHG", "CHH"))
methylationData |
the methylation data stored as a |
genome |
a BSgenome with the DNA sequence of the organism |
contexts |
the context in which the DMRs are computed ( |
a subset of methylationData consisting of all GC sites.
Ryan Merritt
## Not run:
# load the genome sequence
if(!require("BSgenome.Athaliana.TAIR.TAIR9", character.only = TRUE)){
  if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
  BiocManager::install("BSgenome.Athaliana.TAIR.TAIR9")
}
library(BSgenome.Athaliana.TAIR.TAIR9)
# load the methylation data
data(methylationDataList)
methylationDataWTGpCpG <- extractGC(methylationDataList[["WT"]],
                                    BSgenome.Athaliana.TAIR.TAIR9,
                                    "CG")
## End(Not run)
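To make the notion of a GC site concrete, the following is a minimal base-R sketch on a toy sequence. It is not how extractGC is implemented; the sequence, the `cytosines` table and all variable names are hypothetical, and only the forward strand is considered.

```r
# Hypothetical toy sequence (1-based positions): A G C T C G C A C A
sequence <- "AGCTCGCACA"

# Start positions of every "GC" dinucleotide in the sequence
gcStarts <- gregexpr("GC", sequence, fixed = TRUE)[[1]]

# The cytosine of a GC site sits one base after the G
gcCytosines <- as.integer(gcStarts) + 1L

# Hypothetical methylation table with one row per forward-strand cytosine
cytosines <- data.frame(pos    = c(3, 5, 7, 9),
                        readsM = c(2, 0, 5, 1),
                        readsN = c(4, 3, 6, 2))

# Keep only the cytosines that are preceded by a guanine
gcSites <- cytosines[cytosines$pos %in% gcCytosines, ]
```

Here the cytosines at positions 5 and 9 are dropped because they follow a T and an A respectively.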
This function verifies whether a set of potential DMRs (e.g. genes, transposons, CpG islands) are differentially methylated or not.
filterDMRs(methylationData1, methylationData2, potentialDMRs, context = "CG", test = "fisher", pValueThreshold = 0.01, minCytosinesCount = 4, minProportionDifference = 0.4, minReadsPerCytosine = 3, cores = 1)
methylationData1 |
the methylation data in condition 1
(see |
methylationData2 |
the methylation data in condition 2
(see |
potentialDMRs |
a |
context |
the context in which the DMRs are computed ( |
test |
the statistical test used to call DMRs ( |
pValueThreshold |
DMRs with p-values (when performing the statistical
test; see |
minCytosinesCount |
DMRs with fewer cytosines in the specified context
than |
minProportionDifference |
DMRs where the difference in methylation
proportion between the two conditions is lower than
|
minReadsPerCytosine |
DMRs with the average number of reads lower than
|
cores |
the number of cores used to compute the DMRs. |
a GRanges object with 11 metadata columns that contain the DMRs; see computeDMRs.
Nicolae Radu Zabet
DMRsNoiseFilterCG, computeDMRs, analyseReadsInsideRegionsForCondition and mergeDMRsIteratively
# load the methylation data
data(methylationDataList)
# load the gene annotation data
data(GEs)
# select the genes
genes <- GEs[which(GEs$type == "gene")]
# the regions where to compute the DMRs
regions <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E5))
genes <- genes[overlapsAny(genes, regions)]
# filter genes that are differentially methylated in the two conditions
DMRsGenesCG <- filterDMRs(methylationDataList[["WT"]],
                          methylationDataList[["met1-3"]],
                          potentialDMRs = genes,
                          context = "CG",
                          test = "score",
                          pValueThreshold = 0.01,
                          minCytosinesCount = 4,
                          minProportionDifference = 0.4,
                          minReadsPerCytosine = 3,
                          cores = 1)
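The four thresholds described for filterDMRs can be illustrated on a single candidate region with a small base-R sketch. This is only an approximation under stated assumptions: the read counts and the helper `passesFilters` are hypothetical, Fisher's exact test from base R stands in for the package's tests, no multiple-testing correction is applied, and requiring the read depth in each condition separately is an assumption rather than documented behaviour.

```r
# Hypothetical pooled read counts inside one candidate region
region <- list(readsM1 = 90, readsN1 = 100,   # condition 1
               readsM2 = 20, readsN2 = 100,   # condition 2
               cytosines = 10)

# Hypothetical helper: TRUE when the region passes all four filters
passesFilters <- function(r, pValueThreshold = 0.01, minCytosinesCount = 4,
                          minProportionDifference = 0.4,
                          minReadsPerCytosine = 3) {
  # methylation proportion in each condition
  p1 <- r$readsM1 / r$readsN1
  p2 <- r$readsM2 / r$readsN2
  # 2 x 2 table of methylated / unmethylated reads per condition
  counts <- matrix(c(r$readsM1, r$readsN1 - r$readsM1,
                     r$readsM2, r$readsN2 - r$readsM2), nrow = 2)
  pValue <- fisher.test(counts)$p.value
  # assumption: each condition must reach the minimum average depth
  readsPerCytosine <- min(r$readsN1, r$readsN2) / r$cytosines
  pValue <= pValueThreshold &&
    r$cytosines >= minCytosinesCount &&
    abs(p1 - p2) >= minProportionDifference &&
    readsPerCytosine >= minReadsPerCytosine
}

keepRegion <- passesFilters(region)
```

A region with proportions of 0.50 and 0.45 would be rejected by the same helper because the difference is below minProportionDifference.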
A GRanges object containing the Arabidopsis thaliana annotation.
A GRanges object.
The object was created by calling the import.gff3 function from the rtracklayer package on
ftp://ftp.arabidopsis.org/Maps/gbrowse_data/TAIR10/TAIR10_GFF3_genes_transposons.gff
Returns a GRanges object spanning from the first cytosine to the last one on each chromosome.
getWholeChromosomes(methylationData)
methylationData |
the methylation data stored as a |
a GRanges object with all chromosomes.
Nicolae Radu Zabet
# load the methylation data
data(methylationDataList)
# get all chromosomes
chromosomes <- getWholeChromosomes(methylationDataList[["WT"]])
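The idea of one range per chromosome, from its first to its last cytosine, can be sketched in base R on a plain data frame. The table and the names `cytosines` and `wholeChromosomes` are hypothetical; the package itself works on GRanges objects.

```r
# Hypothetical cytosine positions on two chromosomes
cytosines <- data.frame(chr = c("Chr1", "Chr1", "Chr1", "Chr3", "Chr3"),
                        pos = c(12, 250, 980, 101, 5000))

# One row per chromosome: first and last cytosine position
wholeChromosomes <- data.frame(
  chr   = sort(unique(cytosines$chr)),
  start = as.vector(tapply(cytosines$pos, cytosines$chr, min)),
  end   = as.vector(tapply(cytosines$pos, cytosines$chr, max))
)
```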
This function joins together data that come from biological replicates to perform analysis
joinReplicates(methylationData1, methylationData2, usecomplete = FALSE)
methylationData1 |
the methylation data stored as a |
methylationData2 |
the methylation data stored as a |
usecomplete |
Boolean; determines whether, when the two datasets differ in the number of cytosines, the smaller dataset should be padded with zero reads to match the larger dataset. |
returns a GRanges object containing multiple metadata columns with the reads from each object passed as parameter
Alessandro Pio Greco and Nicolae Radu Zabet
## Not run:
# load the methylation data
data(methylationDataList)
# Joins the wildtype and the mutant in a single object
joined_data <- joinReplicates(methylationDataList[["WT"]],
                              methylationDataList[["met1-3"]],
                              FALSE)
## End(Not run)
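The effect of padding with zero reads can be sketched with base R's merge on two toy tables. This is an illustration only: the tables and the helper `joinTwo` are hypothetical, and treating usecomplete = FALSE as "keep only the cytosines present in both datasets" is an assumption, not documented behaviour.

```r
# Hypothetical replicates; rep2 is missing the cytosine at position 25
rep1 <- data.frame(chr = "Chr3", pos = c(10, 25, 40),
                   readsM = c(3, 0, 7), readsN = c(5, 4, 9))
rep2 <- data.frame(chr = "Chr3", pos = c(10, 40),
                   readsM = c(2, 6), readsN = c(6, 8))

# Hypothetical helper: one row per cytosine, one pair of read columns per
# replicate (readsM1/readsN1 and readsM2/readsN2)
joinTwo <- function(a, b, usecomplete = FALSE) {
  joined <- merge(a, b, by = c("chr", "pos"), all = usecomplete,
                  suffixes = c("1", "2"))
  # cytosines missing from one replicate get zero reads
  joined[is.na(joined)] <- 0
  joined
}

sharedOnly <- joinTwo(rep1, rep2, usecomplete = FALSE)
padded     <- joinTwo(rep1, rep2, usecomplete = TRUE)
```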
This function takes a list of DMRs and attempts to merge DMRs while keeping the new DMRs statistically significant.
mergeDMRsIteratively(DMRs, minGap, respectSigns = TRUE, methylationData1, methylationData2, context = "CG", minProportionDifference = 0.4, minReadsPerCytosine = 4, pValueThreshold = 0.01, test = "fisher", alternative = "two.sided", cores = 1)
DMRs |
the list of DMRs as a |
minGap |
DMRs separated by a gap of at least |
respectSigns |
logical value indicating whether to respect the sign when joining DMRs. |
methylationData1 |
the methylation data in condition 1
(see |
methylationData2 |
the methylation data in condition 2
(see |
context |
the context in which the DMRs are computed ( |
minProportionDifference |
two adjacent DMRs are merged only if the
difference in methylation proportion of the new DMR is higher than
|
minReadsPerCytosine |
two adjacent DMRs are merged only if the number of
reads per cytosine of the new DMR is higher than |
pValueThreshold |
two adjacent DMRs are merged only if the p-value of
the new DMR (see |
test |
the statistical test used to call DMRs ( |
alternative |
indicates the alternative hypothesis and must be one of
|
cores |
the number of cores used to compute the DMRs. |
the reduced list of DMRs as a GRanges object; e.g. see computeDMRs
Nicolae Radu Zabet
filterDMRs, computeDMRs, analyseReadsInsideRegionsForCondition and DMRsNoiseFilterCG
# load the methylation data
data(methylationDataList)
# load the DMRs in CG context; they were computed with minGap = 200
data(DMRsNoiseFilterCG)
# merge the DMRs
DMRsNoiseFilterCGLarger <- mergeDMRsIteratively(DMRsNoiseFilterCG[1:100],
                             minGap = 500, respectSigns = TRUE,
                             methylationDataList[["WT"]],
                             methylationDataList[["met1-3"]],
                             context = "CG", minProportionDifference=0.4,
                             minReadsPerCytosine = 1, pValueThreshold=0.01,
                             test="score",alternative = "two.sided")
## Not run:
# set genomic coordinates where to compute DMRs
regions <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E5))
# compute DMRs and remove gaps smaller than 200 bp
DMRsNoiseFilterCG200 <- computeDMRs(methylationDataList[["WT"]],
                          methylationDataList[["met1-3"]], regions = regions,
                          context = "CG", method = "noise_filter",
                          windowSize = 100, kernelFunction = "triangular",
                          test = "score", pValueThreshold = 0.01,
                          minCytosinesCount = 1, minProportionDifference = 0.4,
                          minGap = 200, minSize = 0, minReadsPerCytosine = 1,
                          cores = 1)
DMRsNoiseFilterCG0 <- computeDMRs(methylationDataList[["WT"]],
                          methylationDataList[["met1-3"]], regions = regions,
                          context = "CG", method = "noise_filter",
                          windowSize = 100, kernelFunction = "triangular",
                          test = "score", pValueThreshold = 0.01,
                          minCytosinesCount = 1, minProportionDifference = 0.4,
                          minGap = 0, minSize = 0, minReadsPerCytosine = 1,
                          cores = 1)
DMRsNoiseFilterCG0Merged200 <- mergeDMRsIteratively(DMRsNoiseFilterCG0,
                             minGap = 200, respectSigns = TRUE,
                             methylationDataList[["WT"]],
                             methylationDataList[["met1-3"]],
                             context = "CG", minProportionDifference=0.4,
                             minReadsPerCytosine = 1, pValueThreshold=0.01,
                             test="score",alternative = "two.sided")
# check that all newly computed DMRs are identical
print(all(DMRsNoiseFilterCG200 == DMRsNoiseFilterCG0Merged200))
## End(Not run)
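The gap and sign rules used when joining neighbouring DMRs can be sketched in base R on plain intervals. This deliberately omits the key step of mergeDMRsIteratively, namely re-testing each merged region against minProportionDifference, minReadsPerCytosine and pValueThreshold. The data, the helper `mergeByGap` and the gap definition (bases strictly between two regions) are assumptions made for illustration.

```r
# Hypothetical DMRs on one chromosome
dmrs <- data.frame(start      = c(100, 400, 1500, 1650),
                   end        = c(300, 600, 1600, 1800),
                   regionType = c("loss", "loss", "gain", "loss"))

# Hypothetical helper: join neighbours closer than minGap; with
# respectSigns = TRUE only regions of the same type are joined
mergeByGap <- function(d, minGap, respectSigns = TRUE) {
  d <- d[order(d$start), ]
  merged <- d[1, ]
  for (i in seq_len(nrow(d))[-1]) {
    last <- nrow(merged)
    # number of bases between the previous region and the current one
    gap <- d$start[i] - merged$end[last] - 1
    sameSign <- !respectSigns || d$regionType[i] == merged$regionType[last]
    if (gap < minGap && sameSign) {
      # extend the previous region (its regionType label is kept as is)
      merged$end[last] <- max(merged$end[last], d$end[i])
    } else {
      merged <- rbind(merged, d[i, ])
    }
  }
  merged
}

mergedSigned   <- mergeByGap(dmrs, minGap = 200, respectSigns = TRUE)
mergedUnsigned <- mergeByGap(dmrs, minGap = 200, respectSigns = FALSE)
```

With respectSigns = TRUE the last two regions stay separate even though they are only 49 bp apart, because one is a gain and the other a loss.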
A GRangesList object containing the methylation data at each cytosine location in the genome in Wild Type (WT) and met1-3 mutant (met1-3) in Arabidopsis thaliana. The data only contains the first 1 Mbp from Chromosome 3.
The GRanges elements contain four metadata columns:
the context in which the DMRs are computed ("CG", "CHG" or "CHH").
the number of methylated reads.
the total number of reads.
the specific context of the cytosine (H is replaced by the actual nucleotide).
Each element was created by calling the readBismark function on the CX report files generated by Bismark
(http://www.bioinformatics.babraham.ac.uk/projects/bismark/)
for the http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM980986 dataset
in the case of Wild Type (WT) and the
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM981032
dataset in the case of met1-3 mutant (met1-3).
This function plots the methylation profile at one locus for the bisulfite
sequencing data. The points on the graph represent the methylation proportion
of individual cytosines; their colour indicates which sample they belong to,
and the intensity of the colour indicates how many reads that particular
cytosine had. This means that darker colours indicate stronger evidence that
the corresponding cytosine has the corresponding methylation proportion, while
lighter colours indicate weaker evidence. The solid lines represent the
smoothed profiles, and the intensity of the line represents the coverage at
the corresponding position (darker colours indicate more reads, lighter ones
fewer reads). The boxes on top represent the DMRs: a filled box represents a
DMR that gained methylation, while a box with a pattern represents a DMR that
lost methylation. The DMRs need to have a metadata field "regionType", which
can be either "gain" (where there is more methylation in condition 2 compared
to condition 1) or "loss" (where there is less methylation in condition 2
compared to condition 1). If this metadata field is missing, all DMRs are
drawn using a filled box. Finally, we also allow annotation of the DNA
sequence. Black boxes represent the exons, which are joined by a horizontal
black line that marks the full body of the gene. Grey boxes mark the
transposable elements. Both genes and transposable elements are plotted above
a mid line if they are on the positive strand and below the mid line if they
are on the negative strand.
plotLocalMethylationProfile(methylationData1, methylationData2, region, DMRs = NULL, conditionsNames = NULL, gff = NULL, windowSize = 150, context = "CG", labels = NULL, col = NULL, main = "", plotMeanLines = TRUE, plotPoints = TRUE)
methylationData1 |
the methylation data in condition 1
(see |
methylationData2 |
the methylation data in condition 2
(see |
region |
a |
DMRs |
a |
conditionsNames |
the names of the two conditions. This will be used to plot the legend. |
gff |
a |
windowSize |
the size of the triangle base used to smooth the average methylation profile. |
context |
the context in which the DMRs are computed ( |
labels |
a |
col |
a |
main |
a |
plotMeanLines |
a |
plotPoints |
a |
Invisibly returns NULL
Nicolae Radu Zabet
# load the methylation data
data(methylationDataList)
# load the gene annotation data
data(GEs)
# select the genes
genes <- GEs[which(GEs$type == "gene")]
# the coordinates of the area to be plotted
chr3Reg <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(510000,530000))
# load the DMRs in CG context
data(DMRsNoiseFilterCG)
DMRsCGlist <- list("noise filter"=DMRsNoiseFilterCG)
# plot the CG methylation
par(mar=c(4, 4, 3, 1)+0.1)
par(mfrow=c(1,1))
plotLocalMethylationProfile(methylationDataList[["WT"]],
                            methylationDataList[["met1-3"]], chr3Reg,
                            DMRsCGlist, c("WT", "met1-3"), GEs,
                            windowSize=100, main="CG methylation")
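The windowSize argument is described as the base of the triangle used for smoothing. A minimal base-R sketch of triangular-kernel smoothing at one position follows. It assumes that the smoothed value is a kernel-weighted ratio of methylated to total reads; the helper `smoothAt` and the toy data are hypothetical and may differ from what the plotting function does internally.

```r
# Hypothetical cytosines around the position of interest
pos    <- c(100, 150, 200)
readsM <- c(2, 8, 0)
readsN <- c(4, 8, 4)

# Hypothetical helper: smoothed methylation proportion at position x.
# Weights fall linearly from 1 at x to 0 at windowSize / 2 away from x.
smoothAt <- function(x, pos, readsM, readsN, windowSize = 150) {
  w <- pmax(0, 1 - abs(pos - x) / (windowSize / 2))
  sum(w * readsM) / sum(w * readsN)
}

smoothed <- smoothAt(150, pos, readsM, readsN, windowSize = 200)
```

With a 200 bp base the two flanking cytosines receive weight 0.5 and the central one weight 1, so the central cytosine dominates the estimate.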
This function plots the coverage for the bisulfite sequencing data.
plotMethylationDataCoverage(methylationData1, methylationData2 = NULL, breaks, regions = NULL, conditionsNames = NULL, context = "CG", proportion = TRUE, labels = NULL, col = NULL, pch = c(1, 0, 16, 2, 15, 17), lty = c(4, 1, 3, 2, 6, 5), contextPerRow = FALSE)
methylationData1 |
the methylation data in condition 1
(see |
methylationData2 |
the methylation data in condition 2
(see |
breaks |
a |
regions |
a |
conditionsNames |
a vector of character with the names of the conditions
for |
context |
the context in which the DMRs are computed ( |
proportion |
a |
labels |
a |
col |
a |
pch |
the R symbols used to plot the data. It needs to contain a minimum
of 2 symbols per condition. If not or if |
lty |
the line types used to plot the data. It needs to contain a
minimum of 2 line types per condition. If not or if |
contextPerRow |
a |
This function plots the proportion of cytosines in a specific context that have at least a certain number of reads (x-axis)
Invisibly returns NULL
Nicolae Radu Zabet
computeMethylationDataCoverage, methylationDataList
# load the methylation data
data(methylationDataList)
# plot the coverage in CG context
par(mar=c(4, 4, 3, 1)+0.1)
plotMethylationDataCoverage(methylationDataList[["WT"]],
                           methylationDataList[["met1-3"]],
                           breaks = c(1,5,10,15), regions = NULL,
                           conditionsNames = c("WT","met1-3"),
                           context = c("CG"), proportion = TRUE,
                           labels = LETTERS, col = NULL,
                           pch = c(1,0,16,2,15,17), lty = c(4,1,3,2,6,5),
                           contextPerRow = FALSE)
## Not run:
# plot the coverage in all three contexts
plotMethylationDataCoverage(methylationDataList[["WT"]],
                           methylationDataList[["met1-3"]],
                           breaks = 1:15, regions = NULL,
                           conditionsNames = c("WT","met1-3"),
                           context = c("CG", "CHG", "CHH"),
                           proportion = TRUE, labels = LETTERS, col = NULL,
                           pch = c(1,0,16,2,15,17), lty = c(4,1,3,2,6,5),
                           contextPerRow = FALSE)
## End(Not run)
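The plotted quantity, the proportion of cytosines with at least a given number of reads, can be computed by hand on a toy vector. The read counts below are hypothetical; `breaks` plays the same role as the argument of the same name.

```r
# Hypothetical total read counts for eight cytosines in one context
readsN <- c(0, 2, 5, 7, 12, 20, 1, 15)

# Thresholds on the number of reads (x-axis of the plot)
breaks <- c(1, 5, 10, 15)

# Proportion of cytosines covered by at least `b` reads, for each threshold
coverage <- sapply(breaks, function(b) mean(readsN >= b))
```

The resulting vector is non-increasing by construction, since raising the threshold can only exclude cytosines.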
This function plots the correlation of methylation levels for cytosines located at a certain distance apart.
plotMethylationDataSpatialCorrelation(methylationData1, methylationData2 = NULL, distances, regions = NULL, conditionsNames = NULL, context = "CG", labels = NULL, col = NULL, pch = c(1, 0, 16, 2, 15, 17), lty = c(4, 1, 3, 2, 6, 5), contextPerRow = FALSE, log = "")
methylationData1 |
the methylation data in condition 1
(see |
methylationData2 |
the methylation data in condition 2
(see |
distances |
a |
regions |
a |
conditionsNames |
a vector of character with the names of the conditions
for |
context |
the context in which the DMRs are computed ( |
labels |
a |
col |
a |
pch |
the R symbols used to plot the data. It needs to contain a minimum
of 2 symbols per condition. If not or if |
lty |
the line types used to plot the data. It needs to contain a
minimum of 2 line types per condition. If not or if |
contextPerRow |
a |
log |
a |
This function plots the correlation of methylation levels between cytosines in a specific context as a function of the distance separating them (x-axis)
Invisibly returns NULL
Nicolae Radu Zabet
computeMethylationDataSpatialCorrelation, methylationDataList
## Not run:
# load the methylation data
data(methylationDataList)
# plot the spatial correlation in CG context
par(mar=c(4, 4, 3, 1)+0.1)
plotMethylationDataSpatialCorrelation(methylationDataList[["WT"]],
                           distances = c(1,5,10,15), regions = NULL,
                           conditionsNames = c("WT","met1-3"),
                           context = c("CG"),
                           labels = LETTERS, col = NULL,
                           pch = c(1,0,16,2,15,17), lty = c(4,1,3,2,6,5),
                           contextPerRow = FALSE)
# plot the spatial correlation in all three contexts
plotMethylationDataSpatialCorrelation(methylationDataList[["WT"]],
                           methylationDataList[["met1-3"]],
                           distances = c(1,5,10,15,20,50,100,150,200,500,1000),
                           regions = NULL, conditionsNames = c("WT","met1-3"),
                           context = c("CG", "CHG", "CHH"),
                           labels = LETTERS, col = NULL,
                           pch = c(1,0,16,2,15,17), lty = c(4,1,3,2,6,5),
                           contextPerRow = FALSE, log="x")
## End(Not run)
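What a correlation at a given distance means can be sketched in base R. The sketch pairs every cytosine with the one located exactly `d` bases downstream and correlates their methylation proportions. The exact pairing rule used by the package is not stated here, so this rule, the helper `corAtDistance` and the toy data are assumptions.

```r
# Hypothetical cytosines at consecutive positions with alternating
# low / high methylation proportions
pos  <- 1:20
prop <- rep(c(0.1, 0.9), 10)

# Hypothetical helper: Pearson correlation between the methylation
# proportion of each cytosine and that of the cytosine d bases downstream
corAtDistance <- function(pos, prop, d) {
  partner <- match(pos + d, pos)   # index of the cytosine at pos + d, or NA
  keep <- !is.na(partner)
  cor(prop[keep], prop[partner[keep]])
}

correlations <- sapply(c(1, 2), function(d) corAtDistance(pos, prop, d))
```

In this artificial pattern, neighbours at distance 1 are perfectly anti-correlated and neighbours at distance 2 are perfectly correlated.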
This function plots the low resolution profiles for the bisulfite sequencing data.
plotMethylationProfile(methylationProfiles, autoscale = FALSE, labels = NULL, title = "", col = NULL, pch = c(1, 0, 16, 2, 15, 17), lty = c(4, 1, 3, 2, 6, 5))
methylationProfiles |
a |
autoscale |
a |
labels |
a |
title |
the plot title. |
col |
a |
pch |
the R symbols used to plot the data. |
lty |
the line types used to plot the data. |
Invisibly returns NULL
Nicolae Radu Zabet
plotMethylationProfileFromData, computeMethylationProfile and methylationDataList
# load the methylation data
data(methylationDataList)
# the region where to compute the profile
region <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E6))
# compute low resolution profile in 20 Kb windows
lowResProfileWTCG <- computeMethylationProfile(methylationDataList[["WT"]],
                     region, windowSize = 20000, context = "CG")
lowResProfilsCG <- GRangesList("WT" = lowResProfileWTCG)
# plot the low resolution profile
par(mar=c(4, 4, 3, 1)+0.1)
par(mfrow=c(1,1))
plotMethylationProfile(lowResProfilsCG, autoscale = FALSE,
                       title="CG methylation on Chromosome 3",
                       col=c("#D55E00","#E69F00"), pch = c(1,0),
                       lty = c(4,1))
## Not run:
# compute low resolution profile in 10 Kb windows in CG context
lowResProfileWTCG <- computeMethylationProfile(methylationDataList[["WT"]],
                     region, windowSize = 10000, context = "CG")
lowResProfileMet13CG <- computeMethylationProfile(
                     methylationDataList[["met1-3"]], region,
                     windowSize = 10000, context = "CG")
lowResProfileCG <- GRangesList("WT" = lowResProfileWTCG,
                   "met1-3" = lowResProfileMet13CG)
# compute low resolution profile in 10 Kb windows in CHG context
lowResProfileWTCHG <- computeMethylationProfile(methylationDataList[["WT"]],
                     region, windowSize = 10000, context = "CHG")
lowResProfileMet13CHG <- computeMethylationProfile(
                     methylationDataList[["met1-3"]], region,
                     windowSize = 10000, context = "CHG")
lowResProfileCHG <- GRangesList("WT" = lowResProfileWTCHG,
                   "met1-3" = lowResProfileMet13CHG)
# plot the low resolution profile
par(mar=c(4, 4, 3, 1)+0.1)
par(mfrow=c(2,1))
plotMethylationProfile(lowResProfileCG, autoscale = FALSE,
                       labels = LETTERS[1],
                       title="CG methylation on Chromosome 3",
                       col=c("#D55E00","#E69F00"), pch = c(1,0),
                       lty = c(4,1))
plotMethylationProfile(lowResProfileCHG, autoscale = FALSE,
                       labels = LETTERS[2],
                       title="CHG methylation on Chromosome 3",
                       col=c("#0072B2", "#56B4E9"), pch = c(16,2),
                       lty = c(3,2))
## End(Not run)
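A low resolution profile of the kind plotted here can be approximated by binning cytosines into fixed windows. The base-R sketch below assumes that the value of a window is the ratio of pooled methylated reads to pooled total reads, which may not match computeMethylationProfile exactly; the data and variable names are hypothetical.

```r
# Hypothetical cytosines on one chromosome
pos    <- c(5, 15000, 25000, 39000)
readsM <- c(1, 3, 0, 2)
readsN <- c(2, 6, 4, 4)

windowSize <- 20000

# Zero-based index of the window each cytosine falls into
window <- (pos - 1) %/% windowSize

# Methylation proportion per window from pooled read counts
profile <- tapply(readsM, window, sum) / tapply(readsN, window, sum)
```

Pooling reads before dividing weights each cytosine by its coverage, so poorly covered cytosines contribute less than they would in a plain average of per-cytosine proportions.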
This function plots the low resolution profiles for all bisulfite sequencing data.
plotMethylationProfileFromData(methylationData1, methylationData2 = NULL, regions = NULL, conditionsNames = NULL, context = "CG", windowSize = NULL, autoscale = FALSE, labels = NULL, col = NULL, pch = c(1, 0, 16, 2, 15, 17), lty = c(4, 1, 3, 2, 6, 5), contextPerRow = TRUE)
methylationData1 |
the methylation data in condition 1
(see |
methylationData2 |
the methylation data in condition 2
(see |
regions |
a |
conditionsNames |
the names of the two conditions. This will be used to plot the legend. |
context |
a |
windowSize |
a |
autoscale |
a |
labels |
a |
col |
a |
pch |
the R symbols used to plot the data. It needs to contain a minimum
of 2 symbols per condition. If not or if |
lty |
the line types used to plot the data. It needs to contain a
minimum of 2 line types per condition. If not or if |
contextPerRow |
a |
Invisibly returns NULL
Nicolae Radu Zabet
plotMethylationProfile, computeMethylationProfile and methylationDataList
# load the methylation data
data(methylationDataList)
# plot the low resolution profile at 20 Kb resolution
par(mar=c(4, 4, 3, 1)+0.1)
plotMethylationProfileFromData(methylationDataList[["WT"]],
                               methylationDataList[["met1-3"]],
                               conditionsNames=c("WT", "met1-3"),
                               windowSize = 20000, autoscale = TRUE,
                               context = c("CHG"))
## Not run:
# plot the low resolution profile at 5 Kb resolution
par(mar=c(4, 4, 3, 1)+0.1)
plotMethylationProfileFromData(methylationDataList[["WT"]],
                               methylationDataList[["met1-3"]],
                               conditionsNames=c("WT", "met1-3"),
                               windowSize = 5000, autoscale = TRUE,
                               context = c("CG", "CHG", "CHH"),
                               labels = LETTERS)
## End(Not run)
This function plots the distribution of a set of subregions on a large region.
plotOverlapProfile(overlapsProfiles1, overlapsProfiles2 = NULL, names = NULL, labels = NULL, col = NULL, title = "", logscale = FALSE, maxValue = NULL)
overlapsProfiles1 |
a |
overlapsProfiles2 |
a |
names |
a |
labels |
a |
col |
a |
title |
the title of the plot. |
logscale |
a |
maxValue |
a maximum value in a region. Used for the colour scheme. |
Invisibly returns NULL.
Nicolae Radu Zabet
computeOverlapProfile, filterDMRs, computeDMRs and mergeDMRsIteratively
# load the methylation data
data(methylationDataList)
# load the DMRs in CG context
data(DMRsNoiseFilterCG)
# the coordinates of the area to be plotted
largeRegion <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E5))
# compute overlaps distribution
hotspotsHypo <- computeOverlapProfile(DMRsNoiseFilterCG, largeRegion,
                 windowSize = 10000, binary = FALSE)
plotOverlapProfile(GRangesList("Chr3"=hotspotsHypo),
                   names = c("hypomethylated"), title = "CG methylation")
## Not run:
largeRegion <- GRanges(seqnames = Rle("Chr3"), ranges = IRanges(1,1E6))
hotspotsHypo <- computeOverlapProfile(
                 DMRsNoiseFilterCG[(DMRsNoiseFilterCG$regionType == "loss")],
                 largeRegion, windowSize=2000, binary=TRUE, cores=1)
hotspotsHyper <- computeOverlapProfile(
                 DMRsNoiseFilterCG[(DMRsNoiseFilterCG$regionType == "gain")],
                 largeRegion, windowSize=2000, binary=TRUE, cores=1)
plotOverlapProfile(GRangesList("Chr3"=hotspotsHypo),
                   GRangesList("Chr3"=hotspotsHyper),
                   names=c("loss", "gain"), title="CG methylation")
## End(Not run)
This function pools together multiple methylation datasets.
poolMethylationDatasets(methylationDataList)
methylationDataList |
a |
the methylation data stored as a GRanges object with four metadata columns (see methylationDataList).
Nicolae Radu Zabet
# load methylation data object
data(methylationDataList)
# pool the two datasets together
pooledMethylationData <- poolMethylationDatasets(methylationDataList)
This function pools together two methylation datasets.
poolTwoMethylationDatasets(methylationData1, methylationData2)
methylationData1 |
a |
methylationData2 |
a |
the methylation data stored as a GRanges object with four metadata columns (see methylationDataList).
Nicolae Radu Zabet
# load methylation data object
data(methylationDataList)
# pool the two datasets together
pooledMethylationData <- poolTwoMethylationDatasets(methylationDataList[[1]],
                                                    methylationDataList[[2]])
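To make the idea of pooling concrete, here is a minimal base R sketch. It assumes that pooling means adding the methylated and total read counts of matching cytosines, and it uses plain data frames as a hypothetical stand-in for GRanges objects. The column names readsM and readsN and the helpers poolTwoSketch and poolSketch are illustrative assumptions, not the package's implementation.

```r
# Hypothetical stand-ins for two methylation datasets: plain data frames
# instead of GRanges objects, with assumed column names.
wt <- data.frame(
  chr = "Chr3", pos = c(101L, 105L, 110L), strand = c("+", "-", "+"),
  context = c("CG", "CHH", "CHG"),
  readsM = c(8L, 0L, 3L),   # methylated reads per cytosine
  readsN = c(10L, 4L, 6L)   # total reads per cytosine
)
mutant <- transform(wt, readsM = c(1L, 0L, 2L), readsN = c(9L, 5L, 6L))

# Pool two datasets that list the same cytosines in the same order by
# adding their read counts position by position.
poolTwoSketch <- function(a, b) {
  stopifnot(identical(a$chr, b$chr),
            identical(a$pos, b$pos),
            identical(a$strand, b$strand))
  a$readsM <- a$readsM + b$readsM
  a$readsN <- a$readsN + b$readsN
  a
}

# Pool any number of datasets by folding the pairwise step over a list.
poolSketch <- function(datasets) Reduce(poolTwoSketch, datasets)

pooled <- poolSketch(list(WT = wt, mutant = mutant))
# Methylation proportion per cytosine after pooling.
pooledProportion <- pooled$readsM / pooled$readsN
```

Folding a pairwise step over the list keeps the two-dataset and many-dataset cases consistent, which mirrors how the two functions documented here relate to each other.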
This function takes as input a CX report file produced by Bismark and returns a GRanges object with four metadata columns. The file represents the bisulfite sequencing methylation data.
readBismark(file)
file |
The filename (including path) of the methylation data (CX report generated by Bismark) to be read. |
the methylation data stored as a GRanges object with four metadata columns (see methylationDataList).
Nicolae Radu Zabet and Jonathan Michael Foonlan Tsang
# load methylation data object
data(methylationDataList)
# save one dataset to a file
saveBismark(methylationDataList[["WT"]], "chr3test_a_thaliana_wt.CX_report")
# load the data
methylationDataWT <- readBismark("chr3test_a_thaliana_wt.CX_report")
# check that the loading worked
all(methylationDataWT == methylationDataList[["WT"]])
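For readers unfamiliar with the input, the following base R sketch parses a small CX report by hand. It relies on the Bismark genome-wide cytosine report layout (seven tab-separated columns: chromosome, position, strand, methylated count, unmethylated count, context, trinucleotide context). The derived columns readsM and readsN are assumed names for the methylated and total read counts, and the data frame is only a stand-in for the GRanges object the function returns.

```r
# Write a tiny CX report to a temporary file so the sketch is self-contained.
cxLines <- c(
  "Chr3\t101\t+\t8\t2\tCG\tCGA",
  "Chr3\t102\t-\t0\t4\tCHH\tCAT",
  "Chr3\t110\t+\t3\t3\tCHG\tCAG"
)
cxFile <- tempfile(fileext = ".CX_report")
writeLines(cxLines, cxFile)

# Read the seven tab-separated columns of the report.
cx <- read.table(
  cxFile, sep = "\t", header = FALSE, stringsAsFactors = FALSE,
  col.names = c("chr", "pos", "strand", "methylated", "unmethylated",
                "context", "trinucleotide_context"),
  colClasses = c("character", "integer", "character", "integer", "integer",
                 "character", "character")
)

# Derive the read counts (assumed names): methylated reads and total reads.
cx$readsM <- cx$methylated
cx$readsN <- cx$methylated + cx$unmethylated

# Keep the coordinates plus four per-cytosine annotation columns.
methylationSketch <- cx[, c("chr", "pos", "strand", "context",
                            "readsM", "readsN", "trinucleotide_context")]
```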
This function takes as input a vector of CX report files produced by Bismark and returns a GRanges object with four metadata columns (see methylationDataList). The files represent the pooled bisulfite sequencing data.
readBismarkPool(files)
files |
The filenames (including path) of the methylation data (CX reports generated with Bismark) to be read. |
the methylation data stored as a GRanges object with four metadata columns (see methylationDataList).
Nicolae Radu Zabet and Jonathan Michael Foonlan Tsang
# load methylation data object
data(methylationDataList)
# save the two datasets
saveBismark(methylationDataList[["WT"]], "chr3test_a_thaliana_wt.CX_report")
saveBismark(methylationDataList[["met1-3"]], "chr3test_a_thaliana_met13.CX_report")
# reload the two datasets and pool them
filenames <- c("chr3test_a_thaliana_wt.CX_report",
               "chr3test_a_thaliana_met13.CX_report")
methylationDataPool <- readBismarkPool(filenames)
This function takes as input a GRanges object generated with readBismark and saves it to a file in the Bismark CX report format.
saveBismark(methylationData, filename)
methylationData |
the methylation data stored as a |
filename |
the filename where the data will be saved. |
Invisibly returns NULL
Nicolae Radu Zabet
# load methylation data object
data(methylationDataList)
# save one dataset to a file
saveBismark(methylationDataList[["WT"]], "chr3test_a_thaliana_wt.CX_report")
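As a rough illustration of the output layout, the base R sketch below writes a small table in the seven-column, tab-separated Bismark CX report format. The data frame meth is a hypothetical stand-in for a GRanges object, the column names readsM and readsN are assumptions, and the unmethylated count is assumed to be the total reads minus the methylated reads. This is not the package's own code.

```r
# Hypothetical methylation data held in a plain data frame.
meth <- data.frame(
  chr = "Chr3", pos = c(101L, 102L), strand = c("+", "-"),
  context = c("CG", "CHH"),
  readsM = c(8L, 0L),    # methylated reads
  readsN = c(10L, 4L),   # total reads
  trinucleotide_context = c("CGA", "CAT")
)

# Arrange the columns in CX report order; the unmethylated count is
# recovered as total reads minus methylated reads.
cxOut <- data.frame(
  meth$chr, meth$pos, meth$strand,
  meth$readsM,
  meth$readsN - meth$readsM,
  meth$context, meth$trinucleotide_context
)

# Write tab-separated values without a header, quotes or row names.
outFile <- tempfile(fileext = ".CX_report")
write.table(cxOut, outFile, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read the raw lines back to inspect the file layout.
writtenLines <- readLines(outFile)
```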
A GRanges object containing simulated data for methylation in four samples. The conditions associated with each sample are a, a, b and b.
A GRanges object containing multiple metadata columns with the reads from each object passed as parameter.
The object was created by calling the joinReplicates function.