Title: | Differential NOMe-seq analysis |
---|---|
Description: | dinoR tests for significant differences in NOMe-seq footprints between two conditions, using genomic regions of interest (ROI) centered around a landmark, for example a transcription factor (TF) motif. This package takes NOMe-seq data (GCH methylation/protection) in the form of a Ranged Summarized Experiment as input. dinoR can be used to group sequencing fragments into 3 or 5 categories representing characteristic footprints (TF bound, nucleosome bound, open chromatin), and to plot the percentage of fragments in each category, either in a heatmap or averaged across different ROI groups, for example groups containing a common TF motif. It is designed to compare footprints between two sample groups, using edgeR's quasi-likelihood methods on the total fragment counts per ROI, sample, and footprint category. |
Authors: | Michaela Schwaiger [aut, cre] |
Maintainer: | Michaela Schwaiger <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.3.0 |
Built: | 2024-11-08 06:00:47 UTC |
Source: | https://github.com/bioc/dinoR |
Compare each footprint pattern in WT and KO samples (percentages and diNOMeTest results).
compareFootprints( footprint_percentages, res, WTsamples = c("WT_1", "WT_2"), KOsamples = c("KO_1", "KO_2"), plotcols, facetROIgroup = FALSE, plot = TRUE )
footprint_percentages |
A tibble where each column corresponds to a sample-footprint percentage and each row to a ROI, with the rows clustered by similarity. |
res |
A tibble with the results of differential fragment count testing for each ROI-footprint combination. |
WTsamples |
The control sample names. |
KOsamples |
The treatment sample names. |
plotcols |
A character vector of colors used to distinguish the ROI groups (must contain exactly one color per ROI group). |
facetROIgroup |
If TRUE, split the plots for each pattern by ROI group. |
plot |
If TRUE, will output a plot. |
Plots the percentages of reads in each ROI in WT versus KO samples (mean of two replicates) in each footprint pattern. The color indicates the ROI group and the shape the results of the diNOMeTest.
A scatter plot for each footprint pattern comparing WT and KO percentages and significance test results.
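The replicate averaging behind each point of the scatter plot can be illustrated with a minimal base R sketch. The percentages and the layout of the table (one column per sample, for a single footprint pattern) are made up for illustration and are not the layout returned by footprintPerc.

```r
# Made-up TF footprint percentages: rows are ROIs, columns are samples.
tf_perc <- data.frame(
  WT_1 = c(10, 40), WT_2 = c(14, 44),
  KO_1 = c(30, 42), KO_2 = c(34, 38),
  row.names = c("ROI1", "ROI2")
)
WTsamples <- c("WT_1", "WT_2")
KOsamples <- c("KO_1", "KO_2")

# Average the replicates of each condition; each row of 'means' would
# correspond to one point (x = WT, y = KO) in the scatter plot.
means <- data.frame(
  WT = rowMeans(tf_perc[, WTsamples]),
  KO = rowMeans(tf_perc[, KOsamples])
)
means
```

Points far from the diagonal are ROIs whose footprint percentage differs between conditions.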
NomeData <- createExampleData()
NomeData <- footprintCalc(NomeData)
NomeData <- footprintQuant(NomeData)
res <- diNOMeTest(NomeData,
    WTsamples = c("WT_1", "WT_2"),
    KOsamples = c("KO_1", "KO_2")
)
footprint_percentages <- footprintPerc(NomeData)
compareFootprints(footprint_percentages, res, plotcols = "black", plot = TRUE)
Creates an RSE object with mock NOMe-seq data.
createExampleData( samples = c("WT_1", "WT_2", "KO_1", "KO_2"), group = c("WT", "WT", "KO", "KO"), nROI = 4, randomMeth = TRUE )
samples |
The sample names. |
group |
The sample group names. |
nROI |
The number of ROIs that should be constructed. |
randomMeth |
Logical indicating whether the methylation/protection values should be randomly generated. |
RSE object with mock data.
createExampleData()
Tests for differences in fragment counts for each NOMe footprint pattern compared to total fragment counts in two conditions.
diNOMeTest( footprint_counts, WTsamples = c("WT_1", "WT_2"), KOsamples = c("KO_1", "KO_2"), minreads = 1, meanreads = 1, prior.count = 3, FDR = 0.05, FC = 2, combineNucCounts = FALSE )
footprint_counts |
A Summarized Experiment containing the
sample names (colData), ROI names (rowData),
and number of fragments in each NOMe footprint pattern
category (assays). For example
the output of the ( |
WTsamples |
The control sample names as they appear in
( |
KOsamples |
The treatment sample names as they appear
in ( |
minreads |
The minimum number of fragments (to which a footprint could be assigned) that a ROI must have in every sample. All other ROIs are filtered out before the differential NOMe analysis. |
meanreads |
The minimum number of fragments (to which a footprint could be assigned) that a ROI must have on average across all samples. All other ROIs are filtered out before the differential NOMe analysis. |
prior.count |
The pseudocount used for ( |
FDR |
The FDR cutoff for a ROI - footprint combination to be called regulated in the output. |
FC |
The fold change cutoff for a ROI - footprint combination to be called regulated in the output. |
combineNucCounts |
If TRUE, the upNuc, downNuc, and Nuc fragment counts will be combined into the Nuc category. |
Uses edgeR's quasi-likelihood methods to conveniently test for differential proportions of each one of 5 (or 3, if nucleosome footprints are combined) distinct footprints between at least two control and at least two treatment samples.
A tibble with the results of differential fragment count testing for each ROI-footprint combination.
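How the FDR and FC cutoffs could translate into a call for each ROI-footprint combination can be sketched in base R. This is only an illustration: the column names (logFC, FDR), the labels ("up", "down", "no"), and the reading of FC as a linear fold change compared on the log2 scale are assumptions, not the documented output of diNOMeTest.

```r
# Made-up test results; column names are hypothetical.
res <- data.frame(
  ROI = c("ROI1", "ROI1", "ROI2"),
  footprint = c("tf", "open", "tf"),
  logFC = c(1.8, -0.4, -1.3),
  FDR = c(0.01, 0.60, 0.03)
)

FDR_cutoff <- 0.05
FC <- 2

# Assumption: a combination counts as regulated when it passes both cutoffs,
# with the fold change cutoff applied to the absolute log2 fold change.
significant <- res$FDR < FDR_cutoff & abs(res$logFC) > log2(FC)
res$regulated <- ifelse(!significant, "no",
                        ifelse(res$logFC > 0, "up", "down"))
res
```

Requiring both cutoffs keeps small but statistically significant shifts from being labeled as regulated.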
NomeData <- createExampleData()
NomeData <- footprintCalc(NomeData)
footprint_counts <- footprintQuant(NomeData)
diNOMeTest(footprint_counts,
    WTsamples = c("WT_1", "WT_2"),
    KOsamples = c("KO_1", "KO_2")
)
Assign a footprint type to each fragment based on GCH protection values in pre-defined windows.
footprintCalc( NomeData, window_1 = c(-50, -25), window_2 = c(-8, 8), window_3 = c(25, 50) )
NomeData |
A Ranged Summarized Experiment (RSE) with an
entry for each ROI. The ( |
window_1 |
Integer vector with two elements representing start and end positions of the first window relative to the ROI center. |
window_2 |
Integer vector with two elements representing start and end positions of the second window relative to the ROI center. |
window_3 |
Integer vector with two elements representing start and end positions of the third window relative to the ROI center. |
Selects 3 windows (default: -50:-25, -8:8, 25:50) around the center of the provided region of interest (ROI) and calculates, for each fragment, the average GCH methylation protection across all GCHs in each window. A window is deemed protected if this average is above 0.5 and unprotected if it is below 0.5. Depending on the protection pattern across the three windows, a fragment is put into one of 5 footprint categories: TF bound (0 - 1 - 0), open chromatin (0 - 0 - 0), downstream positioned nucleosome (1 - 1 - 0), other nucleosome (1 - 1 - 1, 1 - 0 - 0, 0 - 0 - 1, 1 - 0 - 1), and upstream positioned nucleosome (0 - 1 - 1).
The Ranged Summarized Experiment with an assay "footprints" added, which contains a footprint type assigned to each fragment.
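The classification rule described above can be sketched for a single fragment in base R. The helpers window_mean and classify_footprint are hypothetical and not part of dinoR; treating an average of exactly 0.5 as unprotected, and returning NA when a window contains no GCH, are assumptions made for this sketch.

```r
# Hypothetical helper: mean protection of one fragment inside a window.
# 'protection' holds 1 (protected), 0 (unprotected) or NA (no GCH) per position.
window_mean <- function(protection, positions, window) {
  in_window <- positions >= window[1] & positions <= window[2]
  mean(protection[in_window], na.rm = TRUE)
}

# Hypothetical helper: map the three window means to a footprint category.
classify_footprint <- function(window_means, cutoff = 0.5) {
  # Assumption: a fragment without data in some window cannot be classified.
  if (anyNA(window_means)) return(NA_character_)
  pattern <- paste(as.integer(window_means > cutoff), collapse = "-")
  switch(pattern,
    "0-1-0" = "tf",
    "0-0-0" = "open",
    "1-1-0" = "downNuc",
    "0-1-1" = "upNuc",
    "Nuc" # remaining patterns: 1-1-1, 1-0-0, 0-0-1, 1-0-1
  )
}

# Made-up fragment: GCHs protected only near the ROI center.
positions <- -60:60
protection <- rep(NA_real_, length(positions))
protection[positions %in% c(-40, -30)] <- 0
protection[positions %in% c(-5, 3)] <- 1
protection[positions %in% c(30, 45)] <- 0

windows <- list(c(-50, -25), c(-8, 8), c(25, 50))
window_means <- sapply(windows, function(w) window_mean(protection, positions, w))
fragment_class <- classify_footprint(window_means)
fragment_class
```

The category labels follow the assay names listed for footprintQuant (tf, open, upNuc, downNuc, Nuc).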
NomeData <- createExampleData()
footprintCalc(NomeData)
Calculates the percentage of all fragments in a ROI-sample combination corresponding to each footprint pattern.
footprintPerc( footprint_counts, minreads = 1, meanreads = 1, ROIgroup = "motif", combineNucCounts = FALSE )
footprint_counts |
A Summarized Experiment containing the
sample names (colData), ROI names (rowData), and number of fragments
in each NOMe footprint pattern category as assays. For example
the output of the ( |
minreads |
The minimum number of fragments (to which a footprint could be assigned) that a ROI must have in every sample. All other ROIs are removed. |
meanreads |
The minimum number of fragments (to which a footprint could be assigned) that a ROI must have on average across all samples. All other ROIs are removed. |
ROIgroup |
Column name of a metadata column in the
( |
combineNucCounts |
If TRUE, the upNuc, downNuc, and Nuc fragment counts will be combined into the Nuc category. |
Calculates the percentage of all fragments in a ROI-sample combination in each footprint pattern. The table is then turned into wide format, where each column corresponds to a sample-footprint percentage and each row to a ROI, and the rows are clustered by similarity.
A tibble where each column corresponds to a sample-footprint percentage and each row to a ROI.
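The filtering and percentage calculation can be sketched in base R for one footprint pattern. The count matrices are made up, and using ">=" for both the minreads and the meanreads threshold is an assumption about how the cutoffs are applied.

```r
# Made-up counts: rows are ROIs, columns are samples.
tf_counts <- matrix(c(5, 0, 10, 6, 1, 12), nrow = 3,
  dimnames = list(c("ROI1", "ROI2", "ROI3"), c("WT_1", "KO_1")))
all_counts <- matrix(c(20, 0, 40, 30, 4, 48), nrow = 3,
  dimnames = dimnames(tf_counts))

minreads <- 1
meanreads <- 1

# Keep ROIs with at least 'minreads' fragments in every sample and at least
# 'meanreads' fragments on average across samples.
keep <- apply(all_counts >= minreads, 1, all) & rowMeans(all_counts) >= meanreads

# Percentage of fragments showing the TF footprint per ROI-sample combination.
tf_perc <- 100 * tf_counts[keep, , drop = FALSE] / all_counts[keep, , drop = FALSE]
tf_perc
```

ROI2 is dropped here because it has no fragments in WT_1, which also avoids a division by zero.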
NomeData <- createExampleData()
NomeData <- footprintCalc(NomeData)
footprint_counts <- footprintQuant(NomeData)
footprintPerc(footprint_counts)
Quantify the footprint types.
footprintQuant(NomeData)
NomeData |
A Ranged Summarized Experiment (RSE) with an
entry for each ROI. The ( |
Count the number of fragments corresponding to a footprint type for each sample-ROI combination.
The Ranged Summarized Experiment with an assay added for each footprint type, containing the number of fragments that contain that footprint. An assay with the total number of pattern-able fragments ("all") is also added. The footprint types are:
tf = transcription factor footprint
open = open chromatin footprint
upNuc = upstream nucleosome footprint
downNuc = downstream nucleosome footprint
Nuc = other nucleosome footprints
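The counting step can be sketched in base R for a single ROI. The footprint calls are made up, and representing unassignable fragments as NA is an assumption for this illustration.

```r
# Made-up footprint calls for the fragments of one ROI in two samples.
footprints <- list(
  WT_1 = c("tf", "tf", "open", "Nuc", NA),
  KO_1 = c("open", "open", "upNuc", "tf")
)
types <- c("tf", "open", "upNuc", "Nuc", "downNuc")

# Count fragments per footprint type; fixing the factor levels keeps
# zero counts, and table() drops the NA (unassigned) fragments.
counts <- sapply(footprints, function(fp) table(factor(fp, levels = types)))

# "all" is the number of fragments that received any footprint type.
counts <- rbind(counts, all = colSums(counts))
counts
```

Each row of the resulting matrix plays the role of one assay, restricted to a single ROI.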
NomeData <- createExampleData()
NomeData <- footprintCalc(NomeData)
footprintQuant(NomeData)
Draws heatmaps of the percentages of all fragments in a ROI-sample combination in each footprint pattern.
fpPercHeatmap( footprint_percentages, breaks = rep(list(c(0, 50, 100)), 5), plotcols = c("#236467", "#AA9B39", "#822B56", "#822B26", "#822B99") )
footprint_percentages |
A tibble where each column corresponds to a sample-footprint percentage and each row to a ROI, with the rows clustered by similarity. |
breaks |
A list of vectors indicating numeric breaks used
in ( |
plotcols |
A character vector of 5 colors to be used for the heatmaps of the 5 footprint patterns ("tf", "open", "upNuc", "Nuc", "downNuc"), or 3 colors if the nucleosome patterns have been combined. |
Draws heatmaps of the percentages of all fragments in a ROI-sample combination in each footprint pattern supplied (for example: "tf", "open", "upNuc", "Nuc", "downNuc"). The rows of the heatmaps are split by ROI group.
Heatmaps of the percentages of all fragments in a ROI-sample combination in each footprint pattern.
NomeData <- createExampleData()
NomeData <- footprintCalc(NomeData)
NomeData <- footprintQuant(NomeData)
footprint_percentages <- footprintPerc(NomeData)
fpPercHeatmap(footprint_percentages)
Plot the summarized GCH methylation protections across selected ROIs.
metaPlots(NomeData, nr = 2, nROI = 2, ROIgroup = "motif", span = 0.05)
NomeData |
A Ranged Summarized Experiment (RSE) with an entry
for each ROI. The ( |
nr |
Integer used as a cutoff to filter sample ROI
combinations that have less than
( |
nROI |
The number of ROIs that need to have a GpC methylation measurement at a given position for this position to be included in the plot. |
ROIgroup |
Column name of a metadata column in the rowData of the RSE, describing a group each ROI belongs to, for example, different transcription factor motifs at the center of the ROI. |
span |
The ( |
Summarizes the GCH methylation protections across selected ROIs.
A tibble with the methylation protection profiles summarized across all ROIs in a certain group.
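The role of the nROI cutoff when summarizing across ROIs can be sketched in base R. The protection values are made up, and averaging the per-ROI protection fractions with a plain mean is an assumption for this sketch; any smoothing of the resulting profile is left out.

```r
# Made-up protection fractions for one sample and one ROI group:
# rows are ROIs, columns are positions relative to the ROI center,
# NA marks positions without a measurement in that ROI.
protection <- rbind(
  ROI1 = c(0.2, NA, 0.8, 0.4),
  ROI2 = c(0.4, NA, 0.6, NA),
  ROI3 = c(NA, 0.5, 1.0, NA)
)
colnames(protection) <- c(-2, -1, 0, 1)

nROI <- 2

# Keep only positions measured in at least 'nROI' ROIs.
covered <- colSums(!is.na(protection)) >= nROI

# Average protection across ROIs at each retained position.
profile <- colMeans(protection[, covered, drop = FALSE], na.rm = TRUE)
profile
```

Positions -1 and 1 are dropped because only one ROI has a measurement there.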
NomeData <- createExampleData()
metaPlots(NomeData)
WT and AdnpKO mouse ES cells were subjected to guided NOMe-seq. 1500 regions were targeted using guideRNAs and Cas9, and bisulfite sequenced in 300bp paired-end mode. Reads were mapped to the genome using biscuit. Then we used UMI-tools to remove duplicated UMIs. The GCH protection was determined using the fetch-NOMe package, using the 1500 ROIs as input regions. The ROIs were all 600bp long, and centered around a transcription factor motif. The resulting tibble was converted into a RangedSummarizedExperiment using the NOMeConverteR package. To reduce file size, data were filtered to only those ROIs containing 20-180 fragments.
data(NomeData)
NomeData
A RangedSummarizedExperiment with 219 ROIs and 4 samples:
sample names
ROI names with the format: TFmotif_chromosome_start_end and motif type
nFragsAnalyzed: number of fragments that were analyzed for GCH methylation protection, reads: a GPos element for every sample-ROI combination
GPos elements with a 'protection' metadata column containing a sparse logical matrix indicating if a GCH was protected from methylation and a 'methylation' column containing a sparse logical matrix indicating if a GCH was methylated. The GPos has data for every position from 300bp upstream to 300bp downstream around a CTCF motif center.
A RangedSummarizedExperiment with 219 ROIs and 4 samples
generated by Lucas Kaaij