Title: | Bayesian hierarchical model for genome-wide nucleosome positioning with high-throughput short-read data (MNase-Seq) |
---|---|
Description: | This package does nucleosome positioning using informative Multinomial-Dirichlet prior in a t-mixture with reversible jump estimation of nucleosome positions for genome-wide profiling. |
Authors: | Pascal Belleau [aut], Rawane Samb [aut], Astrid Deschênes [cre, aut], Khader Khadraoui [aut], Lajmi Lakhal-Chaieb [aut], Arnaud Droit [aut] |
Maintainer: | Astrid Deschênes <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.31.0 |
Built: | 2024-10-31 04:29:34 UTC |
Source: | https://github.com/bioc/RJMCMCNucleosomes |
This package does nucleosome positioning using informative Multinomial-Dirichlet prior in a t-mixture with reversible jump estimation of nucleosome positions for genome-wide profiling.
Pascal Belleau, Rawane Samb, Astrid Deschênes, Khader Khadraoui, Lajmi Lakhal and Arnaud Droit
Maintainer: Astrid Deschenes <[email protected]>
rjmcmc for profiling of nucleosome positions for a segment.
rjmcmcCHR for profiling of nucleosome positions for a large region. The function will take care of splitting and merging.
segmentation for splitting a GRanges containing reads into a list of smaller segments for the rjmcmc function.
postTreatment for merging closely positioned nucleosomes.
mergeRDSFiles for merging nucleosome information from selected RDS files.
plotNucleosomes for generating a graph containing the nucleosome positions and the read coverage.
Merge nucleosome information from all RDS files present in the same directory into one object of class "rjmcmcNucleosomesMerge".
mergeAllRDSFilesFromDirectory(directory)
directory |
a |
a list of class "rjmcmcNucleosomesMerge" containing:
k: an integer, the number of nucleosomes.
mu: a GRanges containing the positions of the nucleosomes.
Pascal Belleau, Astrid Deschenes
    ## Use a directory present in the RJMCMC package
    directoryWithRDSFiles <- system.file("extdata",
        package = "RJMCMCNucleosomes")

    ## Merge nucleosomes info from RDS files present in directory
    ## It is assumed that all files present in the directory are nucleosomes
    ## result for the same chromosome
    result <- mergeAllRDSFilesFromDirectory(directoryWithRDSFiles)

    ## Print the number and the position of the nucleosomes
    result$k
    result$mu

    ## Class of the output object
    class(result)
Merge nucleosome information present in RDS files into one object of class "rjmcmcNucleosomesMerge".
mergeRDSFiles(RDSFiles)
RDSFiles |
a |
a list of class "rjmcmcNucleosomesMerge" containing:
k: an integer, the number of nucleosomes.
mu: a GRanges containing the positions of the nucleosomes.
Pascal Belleau, Astrid Deschenes
    ## Use RDS files present in the RJMCMC package
    RDSFiles <- dir(system.file("extdata", package = "RJMCMCNucleosomes"),
        full.names = TRUE, pattern = "*RDS")

    ## Merge nucleosomes info from the selected RDS files
    result <- mergeRDSFiles(RDSFiles)

    ## Print the number and the position of the nucleosomes
    result$k
    result$mu

    ## Class of the output object
    class(result)
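To make the idea of merging per-file results concrete, here is a minimal base R sketch. It is only an illustration under stated assumptions: the helper `mergeSketch`, the file names and the numeric positions are hypothetical, the positions are plain numeric vectors rather than the GRanges used by the package, and the sketch does not claim to reproduce how mergeRDSFiles works internally.

```r
## Hypothetical stand-ins for per-segment results; the package stores
## GRanges positions in 'mu', plain numeric vectors are used here instead
seg1 <- list(k = 2L, mu = c(10072, 10241))
seg2 <- list(k = 1L, mu = c(10578))

## Save each result in its own RDS file inside a temporary directory
tmpDir <- tempfile("rds_demo_")
dir.create(tmpDir)
saveRDS(seg1, file.path(tmpDir, "segment_1.RDS"))
saveRDS(seg2, file.path(tmpDir, "segment_2.RDS"))

## Illustrative helper (not part of RJMCMCNucleosomes): read every file and
## combine the positions, then recount the nucleosomes
mergeSketch <- function(files) {
    results <- lapply(files, readRDS)
    mu <- sort(unlist(lapply(results, function(r) r$mu)))
    list(k = length(mu), mu = mu)
}

files <- dir(tmpDir, pattern = "\\.RDS$", full.names = TRUE)
merged <- mergeSketch(files)

## Number and positions of the merged nucleosomes
merged$k
merged$mu
```

The sketch recounts k from the combined positions instead of summing the stored counts, which keeps the two fields consistent by construction.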
Generate a graph for a GRanges or a GRangesList of nucleosome positions. When there is only one prediction (with multiple nucleosome positions), a GRanges is used. When there is more than one prediction (for example, before and after post-treatment, or results from different software), a GRangesList with one entry per prediction is used. All predictions must have been obtained using the same reads.
plotNucleosomes(nucleosomePositions, reads, seqName = NULL, xlab = "position", ylab = "coverage", names = NULL)
nucleosomePositions |
a |
reads |
a |
seqName |
a |
xlab |
a |
ylab |
a |
names |
a |
a graph containing the nucleosome positions and the read coverage
Astrid Deschenes
    ## Load reads dataset
    data(reads_demo_01)

    ## Run RJMCMC method
    result <- rjmcmc(reads = reads_demo_01, seqName = "chr_SYNTHETIC",
        nbrIterations = 4000, lambda = 2, kMax = 30,
        minInterval = 146, maxInterval = 292, minReads = 5,
        vSeed = 10213)

    ## Create graph using the synthetic map
    plotNucleosomes(nucleosomePositions = result$mu,
        seqName = "chr_SYNTHETIC", reads = reads_demo_01)
rjmcmc function.
A helper function which merges closely positioned nucleosomes to rectify over-splitting and provide a more conservative approach. Beware that each chromosome must be treated separately.
postTreatment(reads, seqName = NULL, resultRJMCMC, extendingSize = 74L, chrLength)
reads |
a |
seqName |
a |
resultRJMCMC |
an object of |
extendingSize |
a positive |
chrLength |
a positive |
a GRanges, the updated nucleosome positions. When no nucleosome is present, NULL is returned.
Pascal Belleau, Astrid Deschenes
    ## Loading dataset
    data(reads_demo_02)

    ## Nucleosome positioning, running both merge and split functions
    result <- rjmcmc(reads = reads_demo_02, seqName = "chr_SYNTHETIC",
        nbrIterations = 1000, lambda = 2, kMax = 30,
        minInterval = 146, maxInterval = 490, minReads = 3, vSeed = 11)

    ## Before post-treatment
    result

    ## Post-treatment function which merges closely positioned nucleosomes
    postResult <- postTreatment(reads = reads_demo_02,
        seqName = "chr_SYNTHETIC", resultRJMCMC = result,
        extendingSize = 100, chrLength = 73500)

    ## After post-treatment
    postResult
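The general idea of merging closely positioned nucleosomes can be sketched in base R as follows. This is an assumption-laden illustration, not the algorithm used by postTreatment: the helper `mergeClosePositions` and its `minGap` threshold are hypothetical, positions are plain numbers, and each group of neighbours is simply replaced by its mean. The threshold is not claimed to correspond to the extendingSize argument.

```r
## Illustrative helper (not part of RJMCMCNucleosomes): merge positions
## whose distance to the previous position is at most 'minGap' bases
mergeClosePositions <- function(positions, minGap) {
    positions <- sort(positions)
    if (length(positions) == 0) {
        ## Echoes the documented NULL returned when no nucleosome is present
        return(NULL)
    }
    ## Start a new group each time the gap to the previous position
    ## is larger than minGap
    groupId <- cumsum(c(TRUE, diff(positions) > minGap))
    ## Represent each group by the mean of its members
    as.numeric(tapply(positions, groupId, mean))
}

## Hypothetical positions: two close pairs and one isolated nucleosome
mergedPositions <- mergeClosePositions(c(1000, 1040, 1300, 1320, 1900),
    minGap = 74)
mergedPositions
```

Because groups are chained through consecutive gaps, a long run of closely spaced positions collapses into a single position, so a more conservative rule may be preferable on real data.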
Generate a formatted output of a list marked as an rjmcmcNucleosomes class.
    ## S3 method for class 'rjmcmcNucleosomes'
    print(x, ...)
x |
the output object from |
... |
arguments passed to or from other methods |
An object of class rjmcmcNucleosomes
Astrid Deschenes
    ## Loading dataset
    data(RJMCMC_result)

    print(RJMCMC_result)
Generate a formatted output of a list marked as an rjmcmcNucleosomesBeforeAndAfterPostTreatment class.
    ## S3 method for class 'rjmcmcNucleosomesBeforeAndAfterPostTreatment'
    print(x, ...)
x |
the output object from |
... |
arguments passed to or from other methods |
an object of class rjmcmcNucleosomesBeforeAndAfterPostTreatment
Astrid Deschenes
    ## Load synthetic dataset of reads
    data(syntheticNucleosomeReads)

    ## Use dataset of reads to create GRanges object
    sampleGRanges <- GRanges(syntheticNucleosomeReads$dataIP)

    ## Run nucleosome detection on the entire sample
    ## Not run:
    result <- rjmcmcCHR(reads = sampleGRanges, zeta = 147, delta = 50,
        maxLength = 1200, nbrIterations = 1000, lambda = 3, kMax = 30,
        minInterval = 146, maxInterval = 292, minReads = 5,
        vSeed = 10113, nbCores = 2, saveAsRDS = FALSE)
    ## End(Not run)

    ## Print result
    ## Not run: print(result)
Generate a formatted output of a list marked as an rjmcmcNucleosomesMerge class.
    ## S3 method for class 'rjmcmcNucleosomesMerge'
    print(x, ...)
x |
the output object from |
... |
arguments passed to or from other methods |
an object of class rjmcmcNucleosomesMerge
Astrid Deschenes
    ## Use a directory present in the RJMCMC package
    directoryWithRDSFiles <- system.file("extdata",
        package = "RJMCMCNucleosomes")

    ## Merge nucleosomes info from RDS files present in directory
    ## It is assumed that all files present in the directory are nucleosomes
    ## result for the same chromosome
    result <- mergeAllRDSFilesFromDirectory(directoryWithRDSFiles)

    ## Show resulting nucleosomes
    print(result)

    ## or simply
    result
GRanges format (for demo purpose).
A group of forward and reverse reads, in a GRanges, that can be used to test the rjmcmc function.
data(reads_demo_01)
A GRanges containing forward and reverse reads.
rjmcmc for profiling of nucleosome positions.
    ## Loading dataset
    data(reads_demo_01)

    ## Nucleosome positioning
    rjmcmc(reads = reads_demo_01, nbrIterations = 100, lambda = 3,
        kMax = 30, minInterval = 146, maxInterval = 292, minReads = 5)
GRanges format (for demo purpose).
A group of forward and reverse reads that can be used to test the rjmcmc function.
data(reads_demo_02)
A GRanges containing forward and reverse reads.
rjmcmc for profiling of nucleosome positions.
rjmcmcCHR for profiling of nucleosome positions for a large region. The function will take care of splitting and merging.
segmentation for splitting a GRanges containing reads into a list of smaller segments for the rjmcmc function.
postTreatment for merging closely positioned nucleosomes.
mergeRDSFiles for merging nucleosome information from selected RDS files.
plotNucleosomes for generating a graph containing the nucleosome positions and the read coverage.
    ## Loading dataset
    data(reads_demo_02)

    ## Nucleosome positioning
    ## Since there is only one chromosome present in reads_demo_02, the name
    ## of the chromosome does not need to be specified
    rjmcmc(reads = reads_demo_02, nbrIterations = 150, lambda = 3,
        kMax = 30, minInterval = 144, maxInterval = 290, minReads = 6)
Use of a fully Bayesian hierarchical model for chromosome-wide profiling of nucleosome positions based on high-throughput short-read data (MNase-Seq data). Beware that for genome-wide profiling, each chromosome must be treated separately. This function is optimized to run on segments that are smaller sections of the chromosome.
rjmcmc(reads, seqName = NULL, nbrIterations, kMax, lambda = 3, minInterval, maxInterval, minReads = 5, adaptIterationsToReads = TRUE, vSeed = -1, saveAsRDS = FALSE)
reads |
a |
seqName |
a |
nbrIterations |
a positive |
kMax |
a positive |
lambda |
a positive |
minInterval |
a |
maxInterval |
a |
minReads |
a positive |
adaptIterationsToReads |
a |
vSeed |
a |
saveAsRDS |
a |
a list of class "rjmcmcNucleosomes" containing:
call: the matched call.
k: an integer, the final estimation of the number of nucleosomes. 0 when no nucleosome is detected.
mu: a GRanges containing the positions of the nucleosomes and '*' as strand. The seqnames of the GRanges correspond to the seqName input value. NA when no nucleosome is detected.
k_max: an integer, the maximum number of nucleosomes obtained during the iteration process. NA when no nucleosome is detected.
Rawane Samb, Pascal Belleau, Astrid Deschenes
    ## Loading dataset
    data(reads_demo_01)

    ## Nucleosome positioning, running both merge and split functions
    result <- rjmcmc(reads = reads_demo_01, seqName = "chr_SYNTHETIC",
        nbrIterations = 1000, lambda = 2, kMax = 30,
        minInterval = 146, maxInterval = 292, minReads = 5,
        vSeed = 10113, saveAsRDS = FALSE)

    ## Print the final estimation of the number of nucleosomes
    result$k

    ## Print the position of nucleosomes
    result$mu

    ## Print the maximum number of nucleosomes obtained during the iteration
    ## process
    result$k_max
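Since k is 0 and both mu and k_max are NA when no nucleosome is detected, downstream code may want to check k before using the other fields. The following base R sketch shows one way to do this. The helper `describeResult` and the mock result lists are hypothetical and only mimic the documented fields; they are not produced by the package.

```r
## Illustrative helper (not part of RJMCMCNucleosomes): summarize a result
## list, guarding against the empty case before touching mu or k_max
describeResult <- function(result) {
    if (is.null(result$k) || result$k == 0) {
        return("no nucleosome detected")
    }
    sprintf("%d nucleosome(s) detected (maximum during iterations: %d)",
        result$k, result$k_max)
}

## Mock results with the documented fields k, mu and k_max
emptyResult <- list(k = 0L, mu = NA, k_max = NA)
filledResult <- list(k = 3L, mu = c(10072, 10241, 10578), k_max = 5L)

describeResult(emptyResult)
describeResult(filledResult)
```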
A list of class "rjmcmcNucleosomes" which contains the information about the detected nucleosomes.
data(RJMCMC_result)
A list of class "rjmcmcNucleosomes" containing:
call: the matched call.
k: an integer, the final estimation of the number of nucleosomes. 0 when no nucleosome is detected.
mu: a vector of numeric of length k, the positions of the nucleosomes. NA when no nucleosome is detected.
k_max: an integer, the maximum number of nucleosomes obtained during the iteration process. NA when no nucleosome is detected.
rjmcmc for profiling of nucleosome positions.
rjmcmcCHR for profiling of nucleosome positions for a large region. The function will take care of splitting and merging.
segmentation for splitting a GRanges containing reads into a list of smaller segments for the rjmcmc function.
postTreatment for merging closely positioned nucleosomes.
mergeRDSFiles for merging nucleosome information from selected RDS files.
plotNucleosomes for generating a graph containing the nucleosome positions and the read coverage.
    ## Loading dataset
    data(RJMCMC_result)
    data(reads_demo_02)

    ## Results before post-treatment
    RJMCMC_result$mu

    ## Post-treatment function which merges closely positioned nucleosomes
    postResult <- postTreatment(reads = reads_demo_02,
        extendingSize = 60, chrLength = 100000,
        resultRJMCMC = RJMCMC_result)

    ## Results after post-treatment
    postResult
Use of a fully Bayesian hierarchical model for chromosome-wide profiling of nucleosome positions based on high-throughput short-read data (MNase-Seq data). Beware that for genome-wide profiling, each chromosome must be treated separately. This function is optimized to run on an entire chromosome.
The function proceeds by splitting the GRanges of reads (for example, the reads from a chromosome) into a list of smaller GRanges segments that can be run by the rjmcmc function. All those steps are done automatically.
rjmcmcCHR(reads, seqName = NULL, zeta = 147, delta, maxLength, nbrIterations, kMax, lambda = 3, minInterval, maxInterval, minReads = 5, adaptIterationsToReads = TRUE, vSeed = -1, nbCores = 1, dirOut = "out", saveAsRDS = FALSE, saveSEG = TRUE)
reads |
a |
seqName |
a |
zeta |
a positive |
delta |
a positive |
maxLength |
a positive |
nbrIterations |
a positive |
kMax |
a positive |
lambda |
a positive |
minInterval |
a |
maxInterval |
a |
minReads |
a positive |
adaptIterationsToReads |
a |
vSeed |
a |
nbCores |
a positive |
dirOut |
a |
saveAsRDS |
a |
saveSEG |
a |
a list of class "rjmcmcNucleosomesBeforeAndAfterPostTreatment" containing:
k: an integer, the number of nucleosomes.
mu: a GRanges containing the positions of the nucleosomes.
kPost: an integer, the number of nucleosomes after post-treatment. NA when no nucleosome is detected.
muPost: a GRanges containing the positions of the nucleosomes after post-treatment and '*' as strand. The seqnames of the GRanges correspond to the seqName input value. NA when no nucleosome is detected.
Pascal Belleau, Astrid Deschenes
    ## Load synthetic dataset of reads
    data(syntheticNucleosomeReads)

    ## Use dataset of reads to create GRanges object
    sampleGRanges <- GRanges(syntheticNucleosomeReads$dataIP)

    ## Run nucleosome detection on the entire sample
    ## Not run:
    result <- rjmcmcCHR(reads = sampleGRanges, zeta = 147, delta = 50,
        maxLength = 1200, nbrIterations = 1000, lambda = 3, kMax = 30,
        minInterval = 146, maxInterval = 292, minReads = 5,
        vSeed = 10113, nbCores = 2, saveAsRDS = FALSE)
    ## End(Not run)
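Because each chromosome must be treated separately, reads covering several chromosomes have to be separated before the analysis. A minimal base R sketch of that step is shown below. The `readsDF` table and its values are hypothetical; the column names chr, start, end and strand follow those used for the dataIP data.frame in the segmentation example, and converting each piece to a GRanges is left out so that the sketch runs without Bioconductor.

```r
## Hypothetical reads table covering two chromosomes
readsDF <- data.frame(
    chr    = c("chr1", "chr1", "chr2", "chr2", "chr2"),
    start  = c(100, 180, 50, 120, 300),
    end    = c(136, 216, 86, 156, 336),
    strand = c("+", "-", "+", "-", "+"),
    stringsAsFactors = FALSE
)

## One data.frame per chromosome
readsByChr <- split(readsDF, readsDF$chr)

## Number of reads per chromosome; each element could then be converted to
## a GRanges and analysed on its own
readsPerChr <- vapply(readsByChr, nrow, integer(1))
readsPerChr
```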
GRanges containing reads in a list of smaller segments for the rjmcmc function.
Split a GRanges of reads (for example, the reads from a chromosome) into a list of smaller GRanges so that the rjmcmc function can be run on each segment.
segmentation(reads, zeta = 147, delta, maxLength)
reads |
a |
zeta |
a positive |
delta |
a positive |
maxLength |
a positive |
a GRangesList containing all the segments.
Pascal Belleau, Astrid Deschenes
    ## Load synthetic dataset of reads
    data(syntheticNucleosomeReads)

    ## Use dataset of reads to create GRanges object
    sampleGRanges <- GRanges(
        seqnames = syntheticNucleosomeReads$dataIP$chr,
        ranges = IRanges(start = syntheticNucleosomeReads$dataIP$start,
            end = syntheticNucleosomeReads$dataIP$end),
        strand = syntheticNucleosomeReads$dataIP$strand)

    ## Segmentation of the reads
    segmentation(reads = sampleGRanges, zeta = 147, delta = 50,
        maxLength = 1000)
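The general idea of cutting a long region into smaller, overlapping pieces can be sketched in base R as follows. This is a generic illustration only: the helper `segmentSketch` and its `overlap` parameter are hypothetical, positions are plain numbers rather than a GRanges, and the sketch does not claim to reproduce how segmentation uses its zeta and delta arguments, whose roles are not described in this section.

```r
## Illustrative helper (not part of RJMCMCNucleosomes): split positions into
## windows of 'maxLength' bases, consecutive windows sharing 'overlap' bases
segmentSketch <- function(positions, maxLength, overlap) {
    stopifnot(maxLength > overlap, overlap >= 0)
    if (length(positions) == 0) {
        return(list())
    }
    ## Distance between the starts of two consecutive windows
    step <- maxLength - overlap
    starts <- seq(from = min(positions), to = max(positions), by = step)
    ## Keep, for each window, the positions falling inside it
    lapply(starts, function(s) {
        positions[positions >= s & positions < s + maxLength]
    })
}

## Hypothetical read positions
segments <- segmentSketch(c(1, 400, 900, 1100, 1800),
    maxLength = 1000, overlap = 200)
segments
```

Positions located in an overlap appear in two windows, so results from neighbouring windows would need to be reconciled afterwards, for example with a merging step.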
nucleoSim package (for demo purpose).
A list of class "syntheticNucReads" which contains the information about synthetic reads related to nucleosomes. The dataset has been created using a total of 300 well-positioned nucleosomes, 30 fuzzy nucleosomes with variance of reads following a Normal distribution.
data(syntheticNucleosomeReads)
A list containing:
call: the call that generated the dataset.
dataIP: a data.frame with the chromosome name, the starting and ending positions and the direction of all forward and reverse reads for all well-positioned and fuzzy nucleosomes. Paired-end reads are identified with a unique id.
wp: a data.frame with the positions of all the well-positioned nucleosomes, as well as the number of paired-reads associated to each one.
fuz: a data.frame with the positions of all the fuzzy nucleosomes, as well as the number of paired-reads associated to each one.
paired: a data.frame with the starting and ending positions of the reads used to generate the paired-end reads. Paired-end reads are identified with a unique id.