Title: | Detection of de novo deletion in targeted sequencing trios |
---|---|
Description: | A package for the detection of de novo copy number deletions in targeted sequencing of trios with high sensitivity and positive predictive value. |
Authors: | Jack M.. Fu [aut, cre] |
Maintainer: | Jack M.. Fu <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.27.0 |
Built: | 2024-11-30 02:48:11 UTC |
Source: | https://github.com/bioc/MDTS |
The MDTS package for Detection of Denovo Deletions from Targeted Sequencing Data Using Minimum-Distance
This function will randomly select a sample of bam files to calculate dynamic MDTS bins for subsequent read-depth analysis.
calcBins(metaData, n, readLength, medianCoverage, minimumCoverage, genome, mappabilityFile, seed = 1337)
calcBins(metaData, n, readLength, medianCoverage, minimumCoverage, genome, mappabilityFile, seed = 1337)
metaData |
A table in the format of the output of getMetaData(). |
n |
The number of subsamples to use. |
readLength |
The read length of the experiment. |
medianCoverage |
The median number of reads across sub-samples to reach before creating a new bin. |
minimumCoverage |
The miminum number of coverage across all sub-samples required to create the proto-region. |
genome |
The BSGenome object that assists in calculations of the GC content of the bins. |
mappabilityFile |
A path to the bigwig file of 100mer mappability of the corresponding genome. |
seed |
Sets the seed so results are reproducible. Defaults to 1337. |
Returns a GRanges
object depicting the dynamic bins that MDTS
calculates.
load(system.file("extdata", 'bins.RData', package = "MDTS")) bins
load(system.file("extdata", 'bins.RData', package = "MDTS")) bins
This function will return a matrix of read counts where ecah column is a sample, and each row is a bin.
calcCounts(metaData, bins, rl, mc.cores = 1)
calcCounts(metaData, bins, rl, mc.cores = 1)
metaData |
A table in the format of the output of getMetaData(). |
bins |
The set of bins determined by calcBins(). |
rl |
The read length of the experiment. |
mc.cores |
The number of cores to use for multi-threaded analysis. Defaults to 1. |
A data.frame
that contains the counts for each sample in the
metaData
input that fall into each segment of bins
.
## Not run: pD <- getMetaData( 'https://raw.githubusercontent.com/JMF47/MDTSData/master/data/pD.ped') genome = BSgenome.Hsapiens.UCSC.hg19 map_file <- "https://raw.githubusercontent.com/JMF47/MDTSData/master/data/chr1.map.bw" bins = calcBins(pD, n=5, rl=100, med=150, min=5, genome, map_file) ## End(Not run) load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) counts
## Not run: pD <- getMetaData( 'https://raw.githubusercontent.com/JMF47/MDTSData/master/data/pD.ped') genome = BSgenome.Hsapiens.UCSC.hg19 map_file <- "https://raw.githubusercontent.com/JMF47/MDTSData/master/data/chr1.map.bw" bins = calcBins(pD, n=5, rl=100, med=150, min=5, genome, map_file) ## End(Not run) load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) counts
This function will return a matrix of minimum distances where ecah column is a family, and each row is a bin.
calcMD(mCounts, metaData)
calcMD(mCounts, metaData)
mCounts |
A matrix of normalized coverage output by normalizedCounts(). |
metaData |
A table in the format of the output of metaData(). |
A data.frame
of minimum distances. Each column is a trio,
while each row is an entry in bins
load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts <- normalizeCounts(counts, bins) md <- calcMD(mCounts, pD)
load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts <- normalizeCounts(counts, bins) md <- calcMD(mCounts, pD)
This function will return a single GRanges object containing all denovo deletions that passed filtering from a Circular Binary Segmentation object with supplementary information.
denovoDeletions(cbs, mCounts, bins)
denovoDeletions(cbs, mCounts, bins)
cbs |
The output from segmentMD(). |
mCounts |
The normalized counts matrix output by normalizeCounts(). |
bins |
The set of bins determined by calcBins(). |
A GRanges
object that reports all detected denovo deletions
passing requite filters.
load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts = normalizeCounts(counts, bins) md = calcMD(mCounts, pD) cbs = segmentMD(md, bins) denovo = denovoDeletions(cbs, mCounts, bins)
load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts = normalizeCounts(counts, bins) md = calcMD(mCounts, pD) cbs = segmentMD(md, bins) denovo = denovoDeletions(cbs, mCounts, bins)
This function allows constructor of phenotype information necessary for downstream analysis. See format of required fields. Function will also rearrange the rows such that trios are grouped together - with proband first, mother second, and father third.
getMetaData(path, id = "subj_id", familyId = "family_id", fatherId = "father_id", motherId = "mother_id", bamPath = "bam_path")
getMetaData(path, id = "subj_id", familyId = "family_id", fatherId = "father_id", motherId = "mother_id", bamPath = "bam_path")
path |
The path pointing to the file that contains information on each subject in the dataset. |
id |
The column name that identifies each sample. Defaults to 'subj_id'. |
familyId |
The column name that identifies which family the sample belongs to. Defaults to 'family_id'. |
fatherId |
The column name that identifies the id of the father. Defaults to 'father_id'. |
motherId |
The column name that identifies the id of the mother. Defaults to 'mother_id'. |
bamPath |
The column name that identifies where to find the bam file for each subject. Defaults to 'bam_path'. |
Returns a data.frame
of required sample information for
running MDTS.
meta <- getMetaData( 'https://raw.githubusercontent.com/JMF47/MDTSData/master/data/pD.ped')
meta <- getMetaData( 'https://raw.githubusercontent.com/JMF47/MDTSData/master/data/pD.ped')
This function will return a matrix of normalized M scores where ecah column is a sample, and each row is a bin.
normalizeCounts(counts, bins, GC = TRUE, map = TRUE, mc.cores = 1)
normalizeCounts(counts, bins, GC = TRUE, map = TRUE, mc.cores = 1)
counts |
A matrix of raw coverage output by calcCounts(). |
bins |
The set of bins determined by calcBins(). |
GC |
Whether to loess adjust for GC. Defaults to TRUE. |
map |
Whether to loess adjust for mappability. Defaults to TRUE. Defaults to 1. |
mc.cores |
The number of cores to use for multi-threaded analysis. |
A data.frame
of normalized counts. Each column is a sample,
and each row is a entry of bins
.
load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts <- normalizeCounts(counts, bins)
load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts <- normalizeCounts(counts, bins)
This function will return a GRanges object containing the copy number segments of all families in the input minimum distance matrix. It calls segment() from DNAcopy (alpha=0.001, undo.splits="sdundo", undo.SD=4).
segmentMD(md, bins, alpha = 0.001, undo.splits = "sdundo", undo.SD = 4, mc.cores = 1)
segmentMD(md, bins, alpha = 0.001, undo.splits = "sdundo", undo.SD = 4, mc.cores = 1)
md |
The minimum distance matrix produced by calcMD. |
bins |
The set of bins determined by calcBins. |
alpha |
Controls the alpha option in calling DNAcopy::segment() |
undo.splits |
Controls the undo.splits option in DNAcopy::segment() |
undo.SD |
Controls the undo.SD option in calling DNAcopy::segment() |
mc.cores |
The number of cores to use for multi-threaded analysis. Defaults to 1. |
A data.frame
containing the segmented regions based to be
parsed by denovoDeletions()
minimum distance.
load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts <- normalizeCounts(counts, bins) md <- calcMD(mCounts, pD) cbs <- segmentMD(md, bins)
load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts <- normalizeCounts(counts, bins) md <- calcMD(mCounts, pD) cbs <- segmentMD(md, bins)
This function plots the raw read information from the location of interest for a trio.
visualizeDeletion(deletion, bins, metaData, mCounts, md, save = FALSE)
visualizeDeletion(deletion, bins, metaData, mCounts, md, save = FALSE)
deletion |
A GRanges object in the format of the output of denovoDeletions(). |
bins |
The set of bins determined by calcBins(). |
metaData |
A table in the format of the output of getMetaData(). |
mCounts |
A matrix of normalized coverage output by normalizedCounts(). |
md |
The minimum distance matrix output by calcMD() |
save |
If TRUE will save plot to current working directory instead of rendering. |
The file name if the plot was saved.
## Not run: load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts <- normalizeCounts(counts, bins) md <- calcMD(mCounts, pD) cbs <- segmentMD(md, bins) denovo <- denovoDeletions(cbs, mCounts, bins) visualizeDeletion(denovo[1], bins, pD, mCounts, md) ## End(Not run)
## Not run: load(system.file("extdata", 'bins.RData', package = "MDTS")) load(system.file("extdata", 'counts.RData', package = "MDTS")) load(system.file("extdata", 'pD.RData', package = "MDTS")) mCounts <- normalizeCounts(counts, bins) md <- calcMD(mCounts, pD) cbs <- segmentMD(md, bins) denovo <- denovoDeletions(cbs, mCounts, bins) visualizeDeletion(denovo[1], bins, pD, mCounts, md) ## End(Not run)