Title: | cn.mops - Mixture of Poissons for CNV detection in NGS data |
---|---|
Description: | cn.mops (Copy Number estimation by a Mixture Of PoissonS) is a data processing pipeline for copy number variations and aberrations (CNVs and CNAs) from next generation sequencing (NGS) data. The package supplies functions to convert BAM files into read count matrices or genomic ranges objects, which are the input objects for cn.mops. cn.mops models the depths of coverage across samples at each genomic position. Therefore, it does not suffer from read count biases along chromosomes. Using a Bayesian approach, cn.mops decomposes read variations across samples into integer copy numbers and noise by its mixture components and Poisson distributions, respectively. cn.mops guarantees a low FDR because wrong detections are indicated by high noise and filtered out. cn.mops is very fast and written in C++. |
Authors: | Guenter Klambauer [aut], Gundula Povysil [cre] |
Maintainer: | Gundula Povysil <[email protected]> |
License: | LGPL (>= 2.0) |
Version: | 1.51.0 |
Built: | 2024-06-30 02:56:11 UTC |
Source: | https://github.com/bioc/cn.mops |
This generic function calculates the fractional copy
numbers of a CNV detection method stored in an instance
of CNVDetectionResult-class
. Must be a
result of "referencecn.mops".
object |
An instance of "CNVDetectionResult" |
segStat |
Which statistic per segment should be used. Can be either "mean" or "median". (Default="mean"). |
calcFractionalCopyNumbers
returns an instance of
"CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) r <- referencecn.mops(X[,1:2],apply(X,1,median)) calcFractionalCopyNumbers(r)
data(cn.mops) r <- referencecn.mops(X[,1:2],apply(X,1,median)) calcFractionalCopyNumbers(r)
This generic function calculates the fractional copy numbers of
a CNV detection method stored in an instance of
CNVDetectionResult-class
. Must be a result of
"referencecn.mops".
## S4 method for signature 'CNVDetectionResult' calcFractionalCopyNumbers(object, segStat = "mean")
## S4 method for signature 'CNVDetectionResult' calcFractionalCopyNumbers(object, segStat = "mean")
object |
An instance of "CNVDetectionResult" |
segStat |
Which statistic per segment should be used. Can be either "mean" or "median". (Default="mean"). |
calcFractionalCopyNumbers
returns an
instance of "CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) r <- referencecn.mops(X[,1:2],apply(X,1,median)) calcFractionalCopyNumbers(r)
data(cn.mops) r <- referencecn.mops(X[,1:2],apply(X,1,median)) calcFractionalCopyNumbers(r)
This generic function calculates the integer copy numbers
of a CNV detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult" |
calcIntegerCopyNumbers
returns an instance of
"CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) calcIntegerCopyNumbers(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) calcIntegerCopyNumbers(r)
This generic function calculates the integer copy numbers of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' calcIntegerCopyNumbers(object)
## S4 method for signature 'CNVDetectionResult' calcIntegerCopyNumbers(object)
object |
An instance of "CNVDetectionResult" |
calcIntegerCopyNumbers
returns an
instance of "CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) calcIntegerCopyNumbers(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) calcIntegerCopyNumbers(r)
This function performs the cn.mops algorithm for copy number detection in NGS data.
cn.mops(input, I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4), classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7", "CN8"), priorImpact = 1, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.5, lowerThreshold = -0.9, minWidth = 3, segAlgorithm = "fast", minReadCount = 5, useMedian = FALSE, returnPosterior = FALSE, ...)
cn.mops(input, I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4), classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7", "CN8"), priorImpact = 1, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.5, lowerThreshold = -0.9, minWidth = 3, segAlgorithm = "fast", minReadCount = 5, useMedian = FALSE, returnPosterior = FALSE, ...)
input |
Either an instance of "GRanges" or a raw data matrix, where columns are interpreted as samples and rows as genomic regions. An entry is the read count of a sample in the genomic region. |
I |
Vector positive real values that contain the expected fold change of the copy number classes. Length of this vector must be equal to the length of the "classes" parameter vector. For human copy number polymorphisms we suggest to use the default I = c(0.025,0.5,1,1.5,2,2.5,3,3.5,4). |
classes |
Vector of characters of the same length as the parameter vector "I". One vector element must be named "CN2". The names reflect the labels of the copy number classes. Default = c("CN0","CN1","CN2","CN3","CN4","CN5","CN6","CN7","CN8"). |
priorImpact |
Positive real value that reflects how strong the prior assumption affects the result. The higher the value the more samples will be assumed to have copy number 2. Default = 1. |
cyc |
Positive integer that sets the number of cycles for the algorithm. Usually after less than 15 cycles convergence is reached. Default = 20. |
parallel |
How many cores are used for the computation. If set to zero than no parallelization is applied. Default = 0. |
norm |
The normalization strategy to be used. If set to 0 the read counts are not normalized and cn.mops does not model different coverages. If set to 1 the read counts are normalized. If set to 2 the read counts are not normalized and cn.mops models different coverages. (Default=1). |
normType |
Mode of the normalization technique. Possible values are "mean","min","median","quant", "poisson" and "mode". Read counts will be scaled sample-wise. Default = "poisson". |
sizeFactor |
By this parameter one can decide to how the size factors are calculated. Possible choices are the the mean, median or mode coverage ("mean", "median", "mode") or any quantile ("quant"). |
normQu |
Real value between 0 and 1. If the "normType" parameter is set to "quant" then this parameter sets the quantile that is used for the normalization. Default = 0.25. |
quSizeFactor |
Quantile of the sizeFactor if sizeFactor is set to "quant". 0.75 corresponds to "upper quartile normalization". Real value between 0 and 1. Default = 0.75. |
upperThreshold |
Positive real value that sets the cut-off for copy number gains. All CNV calling values above this value will be called as "gain". The value should be set close to the log2 of the expected foldchange for copy number 3 or 4. Default = 0.5. |
lowerThreshold |
Negative real value that sets the cut-off for copy number losses. All CNV calling values below this value will be called as "loss". The value should be set close to the log2 of the expected foldchange for copy number 1 or 0. Default = -0.9. |
minWidth |
Positive integer that is exactly the parameter "min.width" of the "segment" function of "DNAcopy". minWidth is the minimum number of segments a CNV should span. Default = 3. |
segAlgorithm |
Which segmentation algorithm should be used. If set to "DNAcopy" circular binary segmentation is performed. Any other value will initiate the use of our fast segmentation algorithm. Default = "fast". |
minReadCount |
If all samples are below this value the algorithm will return the prior knowledge. This prevents that the algorithm from being applied to segments with very low coverage. Default=5. |
useMedian |
Whether "median" instead of "mean" of a segment should be used for the CNV call. Default=FALSE. |
returnPosterior |
Flag that decides whether the posterior probabilities should be returned. The posterior probabilities have a dimension of samples times copy number states times genomic regions and therefore consume a lot of memory. Default=FALSE. |
... |
Additional parameters will be passed to the "DNAcopy" or the standard segmentation algorithm. |
An instance of "CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) cn.mops(XRanges) cn.mops(XRanges,parallel=2)
data(cn.mops) cn.mops(XRanges) cn.mops(XRanges,parallel=2)
S4 class for storing results of a CNV detection method.
The following slots are defined for CNVDetectionResult objects:
The segments in which the reads are counted. GRanges object.
The normalized data.
The data to which the segmentation algorithm is applied. These can be z-Scores, ratios, log-ratios or I/NI calls.
The CNV call that the method provides for a specific sample
The CNV call that the method provides a specific segment.
The posterior probabilities for different copy numbers.
The detected CNVs.
The detected CNV regions.
The segmentation of the reference sequence (sample-wise).
The most probable integer copy number.
The sample names.
The parameters with which the method was run.
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(x = "CNVDetectionResult", y = "missing")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
signature(object = "CNVDetectionResult")
: ...
Guenter Klambauer [email protected]
showClass("CNVDetectionResult")
showClass("CNVDetectionResult")
CNVDetectionResult-class
.This generic function returns CNV regions of a CNV
detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult" |
cnvr
returns a eturns a "GRanges" object
containing the CNV regions.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) cnvr(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) cnvr(r)
CNVDetectionResult-class
.This generic function returns CNV regions of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' cnvr(object)
## S4 method for signature 'CNVDetectionResult' cnvr(object)
object |
An instance of "CNVDetectionResult" |
cnvr
returns a eturns a "GRanges" object containing
the CNV regions.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) cnvr(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) cnvr(r)
This data set gives the starts, ends, and the integer copy number of the
simulated CNVs in the data set XRanges
object.
CNVRanges
CNVRanges
A GRanges object with 20 rows and 40 value columns across 1 space.
http://www.bioinf.jku.at/cnmops/cnmops.html.
Guenter Klambauer, Karin Schwarzbauer, Andreas Mayr, Djork-Arne Clevert, Andreas Mitterecker, Ulrich Bodenhofer, Sepp Hochreiter. cn.MOPS: mixture of Poissons for discovering copy number variations in next generation sequencing data with a low false discovery rate. Nucleic Acids Research 2012 40(9); doi:10.1093/nar/gks003.
CNVDetectionResult-class
.This generic function returns CNVs of a CNV detection
method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult" |
cnvs
returns a eturns a "GRanges" object
containing the CNVs.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) cnvs(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) cnvs(r)
CNVDetectionResult-class
.This generic function returns CNVs of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' cnvs(object)
## S4 method for signature 'CNVDetectionResult' cnvs(object)
object |
An instance of "CNVDetectionResult" |
cnvs
returns a eturns a "GRanges" object containing
the CNVs.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) cnvs(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) cnvs(r)
Performs the cn.mops algorithm for copy number detection in NGS data with parameters adjusted to exome sequencing data.
exomecn.mops(input, I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4), classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7", "CN8"), priorImpact = 10, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.5, lowerThreshold = -0.8, minWidth = 5, segAlgorithm = "fast", minReadCount = 1, useMedian = FALSE, returnPosterior = FALSE, ...)
exomecn.mops(input, I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4), classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7", "CN8"), priorImpact = 10, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.5, lowerThreshold = -0.8, minWidth = 5, segAlgorithm = "fast", minReadCount = 1, useMedian = FALSE, returnPosterior = FALSE, ...)
input |
Either an instance of "GRanges" or a raw data matrix, where columns are interpreted as samples and rows as genomic regions. An entry is the read count of a sample in the genomic region. |
I |
Vector positive real values that contain the expected fold change of the copy number classes. Length of this vector must be equal to the length of the "classes" parameter vector. For human copy number polymorphisms we suggest to use the default I = c(0.025,0.5,1,1.5,2,2.5,3,3.5,4). |
classes |
Vector of characters of the same length as the parameter vector "I". One vector element must be named "CN2". The names reflect the labels of the copy number classes. Default = c("CN0","CN1","CN2","CN3","CN4","CN5","CN6","CN7","CN8"). |
priorImpact |
Positive real value that reflects how strong the prior assumption affects the result. The higher the value the more samples will be assumed to have copy number 2. Default = 10. |
cyc |
Positive integer that sets the number of cycles for the algorithm. Usually after less than 15 cycles convergence is reached. Default = 20. |
parallel |
How many cores are used for the computation. If set to zero than no parallelization is applied. Default = 0. |
norm |
The normalization strategy to be used. If set to 0 the read counts are not normalized and cn.mops does not model different coverages. If set to 1 the read counts are normalized. If set to 2 the read counts are not normalized and cn.mops models different coverages. (Default=1). |
normType |
Mode of the normalization technique. Possible values are "mean","min","median","quant", "poisson" and "mode". Read counts will be scaled sample-wise. Default = "poisson". |
sizeFactor |
By this parameter one can decide to how the size factors are calculated. Possible choices are the the mean, median or mode coverage ("mean", "median", "mode") or any quantile ("quant"). |
normQu |
Real value between 0 and 1. If the "normType" parameter is set to "quant" then this parameter sets the quantile that is used for the normalization. Default = 0.25. |
quSizeFactor |
Quantile of the sizeFactor if sizeFactor is set to "quant". 0.75 corresponds to "upper quartile normalization". Real value between 0 and 1. Default = 0.75. |
upperThreshold |
Positive real value that sets the cut-off for copy number gains. All CNV calling values above this value will be called as "gain". The value should be set close to the log2 of the expected foldchange for copy number 3 or 4. Default = 0.55. |
lowerThreshold |
Negative real value that sets the cut-off for copy number losses. All CNV calling values below this value will be called as "loss". The value should be set close to the log2 of the expected foldchange for copy number 1 or 0. Default = -0.8. |
minWidth |
Positive integer that is exactly the parameter "min.width" of the "segment" function of "DNAcopy". minWidth is the minimum number of segments a CNV should span. Default = 5. |
segAlgorithm |
Which segmentation algorithm should be used. If set to "DNAcopy" circular binary segmentation is performed. Any other value will initiate the use of our fast segmentation algorithm. Default = "fast". |
minReadCount |
If all samples are below this value the algorithm will return the prior knowledge. This prevents that the algorithm from being applied to segments with very low coverage. Default=1. |
useMedian |
Whether "median" instead of "mean" of a segment should be used for the CNV call. Default=FALSE. |
returnPosterior |
Flag that decides whether the posterior probabilities should be returned. The posterior probabilities have a dimension of samples times copy number states times genomic regions and therefore consume a lot of memory. Default=FALSE. |
... |
Additional parameters will be passed to the "DNAcopy" or the standard segmentation algorithm. |
An instance of "CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) exomecn.mops(exomeCounts)
data(cn.mops) exomecn.mops(exomeCounts)
This data set gives the read counts on chromosome 22 (hg19) of 22 samples in 3785 exons. The rows correspond to targeted regions or exons and columns to samples. An entry is the number of reads that map to the specific segment, i.e. targeted region or exon, of the sample. The GRanges object contains the information of the genomic location. The read counts were generated from freely available exome sequencing data of the 1000Genomes Project.
exomeCounts
exomeCounts
A GRanges object of 3785 rows and 22 columns.
http://www.bioinf.jku.at/software/cnmops/cnmops.html.
Guenter Klambauer, Karin Schwarzbauer, Andreas Mayr, Djork-Arne Clevert, Andreas Mitterecker, Ulrich Bodenhofer, Sepp Hochreiter. cn.MOPS: mixture of Poissons for discovering copy number variations in next generation sequencing data with a low false discovery rate. Nucleic Acids Research 2012 40(9); doi:10.1093/nar/gks003.
The 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 2010 467(1061-1073); doi:10.1038/nature09534.
Generates the read counts from BAM Files. These counts are necessary for CNV detection methods based on depth of coverage information.
This function can also be run in a parallel version.
getReadCountsFromBAM(BAMFiles, sampleNames, refSeqNames, WL = 25000, parallel = 0, ...)
getReadCountsFromBAM(BAMFiles, sampleNames, refSeqNames, WL = 25000, parallel = 0, ...)
BAMFiles |
BAMFiles |
sampleNames |
The corresponding sample names to the BAM Files. |
refSeqNames |
Names of the reference sequence that should be analyzed. The name must appear in the header of the BAM file. If it is not given the function will select the first reference sequence that appears in the header of the BAM files. Can be set to analyze multipe chromosomes at once, e.g. refSeqNames=c("chr1","chr2") |
WL |
Windowlength. Length of the initial segmentation of the genome in basepairs. Should be chosen such that on the average 100 reads are contained in each segment. |
parallel |
The number of parallel processes to be used for this function. Default=0. |
... |
Additional parameters passed to the function "countBamInGRanges" of the Bioconductor package "exomeCopy". Quality filters for read counts can be adjusted there. |
An instance of "GRanges", that contains the breakpoints of the initial segments and the raw read counts that were extracted from the BAM files. This object can be used as input for cn.mops and other CNV detection methods.
Guenter Klambauer [email protected]
BAMFiles <- list.files(system.file("extdata", package="cn.mops"),pattern=".bam$",full.names=TRUE) bamDataRanges <- getReadCountsFromBAM(BAMFiles, sampleNames=paste("Sample",1:3),WL=5000) X <- getReadCountsFromBAM(BAMFiles, sampleNames=paste("Sample",1:3),WL=5000,parallel=2)
BAMFiles <- list.files(system.file("extdata", package="cn.mops"),pattern=".bam$",full.names=TRUE) bamDataRanges <- getReadCountsFromBAM(BAMFiles, sampleNames=paste("Sample",1:3),WL=5000) X <- getReadCountsFromBAM(BAMFiles, sampleNames=paste("Sample",1:3),WL=5000,parallel=2)
Generates the read counts from BAM Files for predefined segments. This is the appropiate choice for exome sequencing data, where the bait regions, target regions or exons are the predefined segments. These counts are necessary for CNV detection methods based on depth of coverage information.
This function can also be run in a parallel version.
getSegmentReadCountsFromBAM(BAMFiles, GR, sampleNames, parallel = 0, ...)
getSegmentReadCountsFromBAM(BAMFiles, GR, sampleNames, parallel = 0, ...)
BAMFiles |
BAMFiles |
GR |
A genomic ranges object that contains the genomic coordinates of the segments. |
sampleNames |
The corresponding sample names to the BAM Files. |
parallel |
The number of parallel processes to be used for this function. Default=0. |
... |
Additional parameters passed to the function "countBamInGRanges" of the Bioconductor package "exomeCopy". Quality filters for read counts can be adjusted there. Please see "??countBamInGRanges" for more information. |
An instance of "GRanges", that contains the breakpoints of the initial segments and the raw read counts that were extracted from the BAM files. This object can be used as input for cn.mops and other CNV detection methods.
Guenter Klambauer [email protected]
BAMFiles <- list.files(system.file("extdata", package="cn.mops"),pattern=".bam$", full.names=TRUE) gr <- GRanges(c("20","20"),IRanges(c(60000,70000),c(70000,80000))) bamDataRanges <- getSegmentReadCountsFromBAM(BAMFiles,GR=gr) bamDataRanges <- getSegmentReadCountsFromBAM(BAMFiles,GR=gr,parallel=2)
BAMFiles <- list.files(system.file("extdata", package="cn.mops"),pattern=".bam$", full.names=TRUE) gr <- GRanges(c("20","20"),IRanges(c(60000,70000),c(70000,80000))) bamDataRanges <- getSegmentReadCountsFromBAM(BAMFiles,GR=gr) bamDataRanges <- getSegmentReadCountsFromBAM(BAMFiles,GR=gr,parallel=2)
CNVDetectionResult-class
.This generic function returns the genomic ranges of a CNV
detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult". |
normalizedData
returns a "GRanges" object
containing the normalized data.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) gr(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) gr(r)
CNVDetectionResult-class
.This generic function returns the genomic ranges
of a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' gr(object)
## S4 method for signature 'CNVDetectionResult' gr(object)
object |
An instance of "CNVDetectionResult". |
normalizedData
returns a "GRanges" object containing
the normalized data.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) gr(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) gr(r)
Performs the cn.mops algorithm for copy number detection in NGS data adjusted to haploid genomes. It is assumed that the normal state is copy number 1. This is an experimental method at the moment.
haplocn.mops(input, I = c(0.025, 1, 2, 3, 4, 5, 6, 7, 8), classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7", "CN8"), priorImpact = 1, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.6, lowerThreshold = -0.9, minWidth = 3, segAlgorithm = "fast", minReadCount = 1, returnPosterior = FALSE, ...)
haplocn.mops(input, I = c(0.025, 1, 2, 3, 4, 5, 6, 7, 8), classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7", "CN8"), priorImpact = 1, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.6, lowerThreshold = -0.9, minWidth = 3, segAlgorithm = "fast", minReadCount = 1, returnPosterior = FALSE, ...)
input |
Either an instance of "GRanges" or a raw data matrix, where columns are interpreted as samples and rows as genomic regions. An entry is the read count of a sample in the genomic region. |
I |
Vector positive real values that contain the expected fold change of the copy number classes. Length of this vector must be equal to the length of the "classes" parameter vector. For copy number polymorphisms in haploid organisms we suggest to use the default I = c(0.025,1,2,3,4,5,6,7,8). |
classes |
Vector of characters of the same length as the parameter vector "I". One vector element must be named "CN1". The names reflect the labels of the copy number classes. Default = c("CN0","CN1","CN2","CN3","CN4","CN5","CN6","CN7","CN8"). |
priorImpact |
Positive real value that reflects how strong the prior assumption affects the result. The higher the value the more samples will be assumed to have copy number 1. Default = 1. |
cyc |
Positive integer that sets the number of cycles for the algorithm. Usually after less than 15 cycles convergence is reached. Default = 20. |
parallel |
How many cores are used for the computation. If set to zero than no parallelization is applied. Default = 0. |
norm |
The normalization strategy to be used. If set to 0 the read counts are not normalized and cn.mops does not model different coverages. If set to 1 the read counts are normalized. If set to 2 the read counts are not normalized and cn.mops models different coverages. (Default=1). |
normType |
Mode of the normalization technique. Possible values are "mean","min","median","quant", "poisson" and "mode". Read counts will be scaled sample-wise. Default = "poisson". |
sizeFactor |
By this parameter one can decide to how the size factors are calculated. Possible choices are the the mean, median or mode coverage ("mean", "median", "mode") or any quantile ("quant"). |
normQu |
Real value between 0 and 1. If the "normType" parameter is set to "quant" then this parameter sets the quantile that is used for the normalization. Default = 0.25. |
quSizeFactor |
Quantile of the sizeFactor if sizeFactor is set to "quant". 0.75 corresponds to "upper quartile normalization". Real value between 0 and 1. Default = 0.75. |
upperThreshold |
Positive real value that sets the cut-off for copy number gains. All CNV calling values above this value will be called as "gain". The value should be set close to the log2 of the expected foldchange for copy number 3 or 4. Default = 0.5. |
lowerThreshold |
Negative real value that sets the cut-off for copy number losses. All CNV calling values below this value will be called as "loss". The value should be set close to the log2 of the expected foldchange for copy number 1 or 0. Default = -0.9. |
minWidth |
Positive integer that is exactly the parameter "min.width" of the "segment" function of "DNAcopy". minWidth is the minimum number of segments a CNV should span. Default = 4. |
segAlgorithm |
Which segmentation algorithm should be used. If set to "DNAcopy" circular binary segmentation is performed. Any other value will initiate the use of our fast segmentation algorithm. Default = "fast". |
minReadCount |
If all samples are below this value the algorithm will return the prior knowledge. This prevents that the algorithm from being applied to segments with very low coverage. |
returnPosterior |
Flag that decides whether the posterior probabilities should be returned. The posterior probabilities have a dimension of samples times copy number states times genomic regions and therefore consume a lot of memory. Default=FALSE. |
... |
Additional parameters will be passed to the "DNAcopy" or the standard segmentation algorithm. |
An instance of "CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) haplocn.mops(XRanges[1:200, ]) haplocn.mops(XRanges[1:200, ],parallel=2)
data(cn.mops) haplocn.mops(XRanges[1:200, ]) haplocn.mops(XRanges[1:200, ],parallel=2)
CNVDetectionResult-class
.This generic function returns the individual calls of a
CNV detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult" |
individualCalls
returns a "GRanges" object
containing the individual calls.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) individualCall(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) individualCall(r)
CNVDetectionResult-class
.This generic function returns the individual calls of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' individualCall(object)
## S4 method for signature 'CNVDetectionResult' individualCall(object)
object |
An instance of "CNVDetectionResult" |
individualCalls
returns a "GRanges" object containing
the individual calls.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) individualCall(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) individualCall(r)
CNVDetectionResult-class
.
The I/NI call is a measure for a genomic
segment across all samples, whether this segment is a CNV region
(informative) or a normal genomic region (non-informative).This generic function returns the
informative/non-informative call of a CNV detection
method stored in an instance of
CNVDetectionResult-class
. The I/NI call is
a measure for a genomic segment across all samples,
whether this segment is a CNV region (informative) or a
normal genomic region (non-informative).
object |
An instance of "CNVDetectionResult" |
iniCall
returns a "GRanges" object containing the
individual calls.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) iniCall(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) iniCall(r)
CNVDetectionResult-class
.
The I/NI call is a measure for a genomic
segment across all samples, whether this segment is a CNV region
(informative) or a normal genomic region (non-informative).This generic function returns the informative/non-informative call of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
The I/NI call is a measure for a genomic
segment across all samples, whether this segment is a CNV region
(informative) or a normal genomic region (non-informative).
## S4 method for signature 'CNVDetectionResult' iniCall(object)
## S4 method for signature 'CNVDetectionResult' iniCall(object)
object |
An instance of "CNVDetectionResult" |
iniCall
returns a "GRanges" object containing
the individual calls.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) iniCall(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) iniCall(r)
CNVDetectionResult-class
.This generic function returns the integer copy numbers of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult" |
integerCopyNumber
returns a eturns a "GRanges"
object containing the integer copy numbers.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) integerCopyNumber(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) integerCopyNumber(r)
CNVDetectionResult-class
.This generic function returns the integer copy numbers of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' integerCopyNumber(object)
## S4 method for signature 'CNVDetectionResult' integerCopyNumber(object)
object |
An instance of "CNVDetectionResult" |
integerCopyNumber
returns a eturns a "GRanges" object containing
the integer copy numbers.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) integerCopyNumber(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) integerCopyNumber(r)
CNVDetectionResult-class
. For other CNV detection methods
this can be (log-) ratios or z-scores.This generic function returns the local assessments, i.e.
signed individual informative/non-informative calls, of a
CNV detection method stored in an instance of
CNVDetectionResult-class
. For other CNV
detection methods this can be (log-) ratios or z-scores.
object |
An instance of "CNVDetectionResult" |
localAssessments
returns a "GRanges" object
containing the local assessments.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) localAssessments(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) localAssessments(r)
CNVDetectionResult-class
. For other CNV detection methods
this can be (log-) ratios or z-scores.This generic function returns the local assessments, i.e.
signed individual informative/non-informative calls,
of a CNV detection method stored in an instance of
CNVDetectionResult-class
. For other CNV detection methods
this can be (log-) ratios or z-scores.
## S4 method for signature 'CNVDetectionResult' localAssessments(object)
## S4 method for signature 'CNVDetectionResult' localAssessments(object)
object |
An instance of "CNVDetectionResult" |
localAssessments
returns a "GRanges" object containing
the local assessments.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) localAssessments(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) localAssessments(r)
This generic function calculates robust CNV regions by
segmenting the I/NI call per genomic region
of an object CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' makeRobustCNVR(object, robust = 0.5, minWidth = 4, ...)
## S4 method for signature 'CNVDetectionResult' makeRobustCNVR(object, robust = 0.5, minWidth = 4, ...)
object |
An instance of "CNVDetectionResult" |
robust |
Robustness parameter. The higher the value, the more samples are required to have a CNV that confirms the CNV region. Setting this parameter to 0 restores the original CNV regions. (Default=0.5) |
minWidth |
The minimum length measured in genomic regions a CNV region has to span in order to be called. A parameter of the segmentation algorithm. (Default=4). |
... |
Additional parameters passed to the segmentation algorithm. |
This generic function calculates robust CNV regions by
segmenting the I/NI call per genomic region
of an object CNVDetectionResult-class
.
cn.mops usually reports a CNV region if at least one individual has a CNV in this region. For some applications it is useful to find more common CNV regions, i.e., regions in which more than one sample has a CNV. The I/NI call measures both signal strength and how many sample show an abnormal copy number, therefore segmentation of the I/NI call can provide robust CNV regions.
makeRobustCNVR
returns a "CNVDetectionResult"
object containing new values in the slot "cnvr".
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) rr <- calcIntegerCopyNumbers(makeRobustCNVR(r,robust=0.1,minWidth=3))
data(cn.mops) r <- cn.mops(X[1:100,1:5]) rr <- calcIntegerCopyNumbers(makeRobustCNVR(r,robust=0.1,minWidth=3))
Normalize quantitative NGS data in order to make counts comparable over samples, i.e., correcting for different library sizes or coverages. Scales each samples' reads such that the coverage is even for all samples after normalization.
normalizeChromosomes(X, chr, normType = "poisson", sizeFactor = "mean", qu = 0.25, quSizeFactor = 0.75, ploidy)
normalizeChromosomes(X, chr, normType = "poisson", sizeFactor = "mean", qu = 0.25, quSizeFactor = 0.75, ploidy)
X |
Matrix of positive real values, where columns are interpreted as samples and rows as genomic regions. An entry is the read count of a sample in the genomic region. Alternatively this can be a GRanges object containing the read counts as values. |
chr |
Character vector that has as many elements as "X" has rows. The vector assigns each genomic segment to a reference sequence (chromosome). |
normType |
Type of the normalization technique. Each samples' read counts are scaled such that the total number of reads are comparable across samples. If this parameter is set to the value "mode", the read counts are scaled such that each samples' most frequent value (the "mode") is equal after normalization. Accordingly for the other options are "mean","median","poisson", "quant", and "mode". Default = "poisson". |
sizeFactor |
By this parameter one can decide to how the size factors are calculated. Possible choices are the the mean, median or mode coverage ("mean", "median", "mode") or any quantile ("quant"). |
qu |
Quantile of the normType if normType is set to "quant" .Real value between 0 and 1. Default = 0.25. |
quSizeFactor |
Quantile of the sizeFactor if sizeFactor is set to "quant". 0.75 corresponds to "upper quartile normalization". Real value between 0 and 1. Default = 0.75. |
ploidy |
An integer value for each sample or each column in the read count matrix. At least two samples must have a ploidy of 2. Default = "missing". |
A data matrix of normalized read counts with the same dimensions as the input matrix X.
Guenter Klambauer [email protected]
data(cn.mops) X.norm <- normalizeChromosomes(X)
data(cn.mops) X.norm <- normalizeChromosomes(X)
CNVDetectionResult-class
.This generic function returns the normalized data of a
CNV detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult". |
normalizedData
returns a "GRanges" object
containing the normalized data.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) normalizedData(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) normalizedData(r)
CNVDetectionResult-class
.This generic function returns the normalized data
of a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' normalizedData(object)
## S4 method for signature 'CNVDetectionResult' normalizedData(object)
object |
An instance of "CNVDetectionResult". |
normalizedData
returns a "GRanges" object containing
the normalized data.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) normalizedData(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) normalizedData(r)
Normalize quantitative NGS data in order to make counts comparable over samples. Scales each samples' reads such that the coverage is even for all samples after normalization.
normalizeGenome(X, normType = "poisson", sizeFactor = "mean", qu = 0.25, quSizeFactor = 0.75, ploidy)
normalizeGenome(X, normType = "poisson", sizeFactor = "mean", qu = 0.25, quSizeFactor = 0.75, ploidy)
X |
Matrix of positive real values, where columns are interpreted as samples and rows as genomic regions. An entry is the read count of a sample in the genomic region. Alternatively this can be a GRanges object containing the read counts as values. |
normType |
Type of the normalization technique. Each samples' read counts are scaled such that the total number of reads are comparable across samples. If this parameter is set to the value "mode", the read counts are scaled such that each samples' most frequent value (the "mode") is equal after normalization. Accordingly for the other options are "mean","median","poisson", "quant", and "mode". Default = "poisson". |
sizeFactor |
By this parameter one can decide to how the size factors are calculated. Possible choices are the the mean, median or mode coverage ("mean", "median", "mode") or any quantile ("quant"). |
qu |
Quantile of the normType if normType is set to "quant" .Real value between 0 and 1. Default = 0.25. |
quSizeFactor |
Quantile of the sizeFactor if sizeFactor is set to "quant". 0.75 corresponds to "upper quartile normalization". Real value between 0 and 1. Default = 0.75. |
ploidy |
An integer value for each sample or each column in the read count matrix. At least two samples must have a ploidy of 2. Default = "missing". |
A data matrix of normalized read counts with the same dimensions as the input matrix X.
Guenter Klambauer [email protected]
data(cn.mops) X.norm <- normalizeGenome(X)
data(cn.mops) X.norm <- normalizeGenome(X)
CNVDetectionResult-class
.This generic function returns the parameters of a CNV
detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult" |
params
returns a eturns a "GRanges" object
containing the parameters.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) params(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) params(r)
CNVDetectionResult-class
.This generic function returns the parameters of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' params(object)
## S4 method for signature 'CNVDetectionResult' params(object)
object |
An instance of "CNVDetectionResult" |
params
returns a eturns a "GRanges" object containing
the parameters.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) params(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) params(r)
Plots read counts, call values and CNV calls in an identified CNV region.
## S4 method for signature 'CNVDetectionResult,missing' plot(x, which,margin=c(10,10),toFile=FALSE)
## S4 method for signature 'CNVDetectionResult,missing' plot(x, which,margin=c(10,10),toFile=FALSE)
x |
An instance of "CNVDetectionResult" |
which |
The index of the CNV region to be plotted. |
margin |
Vector of two positive integers that states how many segments left and right of the CNV region should be included in the plot. Default = c(10,10). |
toFile |
Logical value whether the output should be plotted to a file. Default = FALSE. |
Generates a CNV calling plot.
Guenter Klambauer [email protected]
CNVDetectionResult-class
.
The posterior probabilities are represented
as a three dimensional array, where the three dimensions are segment,
copy number and individual.This generic function returns the posterior probabilities
of a CNV detection method stored in an instance of
CNVDetectionResult-class
. The posterior
probabilities are represented as a three dimensional
array, where the three dimensions are segment, copy
number and individual.
object |
An instance of "CNVDetectionResult" |
posteriorProbs
returns a three dimensional array.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) posteriorProbs(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) posteriorProbs(r)
CNVDetectionResult-class
.
The posterior probabilities are represented
as a three dimensional array, where the three dimensions are segment,
copy number and individual.This generic function returns the posterior probabilities of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
The posterior probabilities are represented
as a three dimensional array, where the three dimensions are segment,
copy number and individual.
## S4 method for signature 'CNVDetectionResult' posteriorProbs(object)
## S4 method for signature 'CNVDetectionResult' posteriorProbs(object)
object |
An instance of "CNVDetectionResult" |
posteriorProbs
returns a three dimensional array.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) posteriorProbs(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) posteriorProbs(r)
This function performs the an alternative version of the cn.mops algorithm adapted to a setting of control versus cases
referencecn.mops(cases, controls, I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 8, 16, 32, 64), classes = paste("CN", c(0:8, 16, 32, 64, 128), sep = ""), priorImpact = 1, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.5, lowerThreshold = -0.9, minWidth = 4, segAlgorithm = "DNAcopy", minReadCount = 1, verbose = 1, returnPosterior = FALSE, ...)
referencecn.mops(cases, controls, I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 8, 16, 32, 64), classes = paste("CN", c(0:8, 16, 32, 64, 128), sep = ""), priorImpact = 1, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.5, lowerThreshold = -0.9, minWidth = 4, segAlgorithm = "DNAcopy", minReadCount = 1, verbose = 1, returnPosterior = FALSE, ...)
cases |
Either an instance of "GRanges" or a raw data matrix, where columns are interpreted as samples and rows as genomic regions. An entry is the read count of a sample in the genomic region. |
controls |
Either an instance of "GRanges" or a raw data matrix, where columns are interpreted as samples and rows as genomic regions. An entry is the read count of a sample in the genomic region. |
I |
Vector positive real values that contain the expected fold change of the copy number classes. Length of this vector must be equal to the length of the "classes" parameter vector. For human copy number polymorphisms we suggest to use the default I = c(0.025,0.5,1,1.5,2,2.5,3,3.5,4,8,16,32,64). |
classes |
Vector of characters of the same length as the parameter vector "I". One vector element must be named "CN2". The names reflect the labels of the copy number classes. Default = paste("CN",c(0:8,16,32,64,128),sep=""). |
priorImpact |
Positive real value that reflects how strong the prior assumption affects the result. The higher the value the more samples will be assumed to have copy number 2. Default = 1. |
cyc |
Positive integer that sets the number of cycles for the algorithm. Usually after less than 15 cycles convergence is reached. Default = 20. |
parallel |
How many cores are used for the computation. If set to zero than no parallelization is applied. Default = 0. |
norm |
The normalization strategy to be used. If set to 0 the read counts are not normalized and cn.mops does not model different coverages. If set to 1 the read counts are normalized. If set to 2 the read counts are not normalized and cn.mops models different coverages. (Default=1). |
normType |
Mode of the normalization technique. Possible values are "mean","min","median","quant", "poisson" and "mode". Read counts will be scaled sample-wise. Default = "poisson". |
sizeFactor |
By this parameter one can decide to how the size factors are calculated. Possible choices are the the mean, median or mode coverage ("mean", "median", "mode") or any quantile ("quant"). |
normQu |
Real value between 0 and 1. If the "normType" parameter is set to "quant" then this parameter sets the quantile that is used for the normalization. Default = 0.25. |
quSizeFactor |
Quantile of the sizeFactor if sizeFactor is set to "quant". 0.75 corresponds to "upper quartile normalization". Real value between 0 and 1. Default = 0.75. |
upperThreshold |
Positive real value that sets the cut-off for copy number gains. All CNV calling values above this value will be called as "gain". The value should be set close to the log2 of the expected foldchange for copy number 3 or 4. Default = 0.5. |
lowerThreshold |
Negative real value that sets the cut-off for copy number losses. All CNV calling values below this value will be called as "loss". The value should be set close to the log2 of the expected foldchange for copy number 1 or 0. Default = -0.9. |
minWidth |
Positive integer that is exactly the parameter "min.width" of the "segment" function of "DNAcopy". minWidth is the minimum number of segments a CNV should span. Default = 3. |
segAlgorithm |
Which segmentation algorithm should be used. If set to "DNAcopy" circular binary segmentation is performed. Any other value will initiate the use of our fast segmentation algorithm. Default = "DNAcopy". |
minReadCount |
If all samples are below this value the algorithm will return the prior knowledge. This prevents that the algorithm from being applied to segments with very low coverage. Default=1. |
verbose |
Flag that decides whether referencecn.mops gives status if (verbose>0) messages. Default=1. |
returnPosterior |
Flag that decides whether the posterior probabilities should be returned. The posterior probabilities have a dimension of samples times copy number states times genomic regions and therefore consume a lot of memory. Default=FALSE. |
... |
Additional parameters will be passed to the "DNAcopy" or the standard segmentation algorithm. |
An instance of "CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) referencecn.mops(X[1:200, ],apply(X[1:200, ],1, median)) referencecn.mops(X[1:200, ],apply(X[1:200, ],1, median),parallel=2)
data(cn.mops) referencecn.mops(X[1:200, ],apply(X[1:200, ],1, median)) referencecn.mops(X[1:200, ],apply(X[1:200, ],1, median),parallel=2)
CNVDetectionResult-class
.This generic function returns the sample names of a CNV
detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult" |
sampleNames
returns a eturns a "GRanges" object
containing the parameters.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) sampleNames(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) sampleNames(r)
CNVDetectionResult-class
.This generic function returns the sample names of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' sampleNames(object)
## S4 method for signature 'CNVDetectionResult' sampleNames(object)
object |
An instance of "CNVDetectionResult" |
sampleNames
returns a eturns a "GRanges" object containing
the parameters.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) sampleNames(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) sampleNames(r)
Performs a fast segmentation algorithm based on the cyber t test and the t statistics. This is a special version for log-ratios or I/NI calls that are assumed to be centered around 0. For segmentation of data with different characteristics you can a) substract the mean/median/mode from your data or b) use the more general version of this algorithm in the R Bioconductor package "fastseg".
segment(x, alpha = 0.05, segMedianT = NULL, minSeg = 3, eps = 0, delta = 20, maxInt = 40, cyberWeight = 50)
segment(x, alpha = 0.05, segMedianT = NULL, minSeg = 3, eps = 0, delta = 20, maxInt = 40, cyberWeight = 50)
x |
Values to be segmented. |
alpha |
Real value between 0 and 1 is interpreted as the percentage of total points that are considered as initial breakpoints. An integer greater than 1 is interpreted as number of initial breakpoints. Default = 0.05. |
segMedianT |
Vector of length 2. Thresholds on the segment's median. Segments' medians above the first element are considered as gains and below the second value as losses. If set to NULL the segmentation algorithm tries to determine the thresholds itself. If set to 0 the gain and loss segments are not merged. (Default = NULL). |
minSeg |
Minimum length of segments. Default = 3. |
eps |
Real value greater or equal zero. A breakpoint is only possible between to consecutive values of x that have a distance of at least "eps". Default = 0. |
delta |
Positive integer. A parameter to make the segmentation more efficient. If the statistics of a breakpoint lowers while extending the window, the algorithm extends the windows by "delta" more points until it stops. Default = 20. |
maxInt |
The maximum length of a segment left of the breakpoint and right of the breakpoint that is considered. Default = 40. |
cyberWeight |
The "nu" parameter of the cyber t-test. Default = 50. |
A data frame containing the segments.
Guenter Klambauer [email protected]
x <- rnorm(n=500,sd=0.5) x[150:200] <- rnorm(n=51,mean=3,sd=0.5) segment(x)
x <- rnorm(n=500,sd=0.5) x[150:200] <- rnorm(n=51,mean=3,sd=0.5) segment(x)
CNVDetectionResult-class
.This generic function returns segmentation of a CNV
detection method stored in an instance of
CNVDetectionResult-class
.
object |
An instance of "CNVDetectionResult" |
segmentation
returns a eturns a "GRanges" object
containing the segmentation.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) segmentation(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) segmentation(r)
CNVDetectionResult-class
.This generic function returns segmentation of
a CNV detection method stored in an instance of
CNVDetectionResult-class
.
## S4 method for signature 'CNVDetectionResult' segmentation(object)
## S4 method for signature 'CNVDetectionResult' segmentation(object)
object |
An instance of "CNVDetectionResult" |
segmentation
returns a eturns a "GRanges" object containing
the segmentation.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:100,1:5]) segmentation(r)
data(cn.mops) r <- cn.mops(X[1:100,1:5]) segmentation(r)
Plots the log normalized read counts and the detected segments as a segmentation plot.
r |
An instance of "CNVDetectionResult" |
mainCN |
The name of the main copy number. That is "CN2" for diploid individuals. For haplocn.mops this should be set to "CN1". |
sampleIdx |
The index of the samples to be plotted. (Default = missing) |
seqnames |
The names of the reference sequence (chromosomes) to be plotted. (Default = missing) |
segStat |
Whether the segment line should display the mean or the median of a segments calls. (Default = "mean"). |
plot.type |
the type of plot. (Default = "s"). |
altcol |
logical flag to indicate if chromosomes should be plotted in alternating colors in the whole genome plot. (Default = TRUE). |
sbyc.layout |
|
cbys.layout |
|
cbys.nchrom |
the number of chromosomes per page in the layout. (Default = 1). |
include.means |
logical flag to indicate whether segment means are to be drawn. (Default = TRUE). |
zeroline |
logical flag to indicate whether a horizontal line at y=0 is to be drawn. (Default = TRUE). |
pt.pch |
the plotting character used for plotting the log-ratio values. (Default = ".") |
pt.cex |
the size of plotting character used for the log-ratio values (Default = 3). |
pt.cols |
the color list for the points. The colors alternate between chromosomes. (Default = c("green","black").) |
segcol |
the color of the lines indicating the segment means. (Default = "red"). |
zlcol |
the color of the zeroline. (Default = "grey"). |
ylim |
this argument is present to override the default limits which is the range of symmetrized log-ratios. (Default = NULL). |
lwd |
line weight of lines for segment mean and zeroline. (Default = 3). |
... |
other arguments which will be passed to
|
Generates a segmentation plot.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:200, ]) segplot(r,sampleIdx=1)
data(cn.mops) r <- cn.mops(X[1:200, ]) segplot(r,sampleIdx=1)
Plots the log normalized read counts and the detected segments as a segmentation plot.
## S4 method for signature 'CNVDetectionResult' segplot(r, mainCN = "CN2", sampleIdx, seqnames, segStat = "mean", plot.type = "s", altcol = TRUE, sbyc.layout, cbys.nchrom = 1, cbys.layout, include.means = TRUE, zeroline = TRUE, pt.pch = ".", pt.cex = 3, pt.cols = c("green", "black"), segcol = "red", zlcol = "grey", ylim, lwd = 3, ...)
## S4 method for signature 'CNVDetectionResult' segplot(r, mainCN = "CN2", sampleIdx, seqnames, segStat = "mean", plot.type = "s", altcol = TRUE, sbyc.layout, cbys.nchrom = 1, cbys.layout, include.means = TRUE, zeroline = TRUE, pt.pch = ".", pt.cex = 3, pt.cols = c("green", "black"), segcol = "red", zlcol = "grey", ylim, lwd = 3, ...)
r |
An instance of "CNVDetectionResult" |
mainCN |
The name of the main copy number. That is "CN2" for diploid individuals. For haplocn.mops this should be set to "CN1". |
sampleIdx |
The index of the samples to be plotted. (Default = missing) |
seqnames |
The names of the reference sequence (chromosomes) to be plotted. (Default = missing) |
segStat |
Whether the segment line should display the mean or the median of a segments calls. (Default = "mean"). |
plot.type |
the type of plot. (Default = "s"). |
altcol |
logical flag to indicate if chromosomes should be plotted in alternating colors in the whole genome plot. (Default = TRUE). |
sbyc.layout |
|
cbys.nchrom |
the number of chromosomes per page in the layout. (Default = 1). |
cbys.layout |
|
include.means |
logical flag to indicate whether segment means are to be drawn. (Default = TRUE). |
zeroline |
logical flag to indicate whether a horizontal line at y=0 is to be drawn. (Default = TRUE). |
pt.pch |
the plotting character used for plotting the log-ratio values. (Default = ".") |
pt.cex |
the size of plotting character used for the log-ratio values (Default = 3). |
pt.cols |
the color list for the points. The colors alternate between chromosomes. (Default = c("green","black").) |
segcol |
the color of the lines indicating the segment means. (Default = "red"). |
zlcol |
the color of the zeroline. (Default = "grey"). |
ylim |
this argument is present to override the default limits which is the range of symmetrized log-ratios. (Default = NULL). |
lwd |
line weight of lines for segment mean and zeroline. (Default = 3). |
... |
other arguments which will be passed to |
Generates a segmentation plot.
Guenter Klambauer [email protected]
data(cn.mops) r <- cn.mops(X[1:200, ]) segplot(r,sampleIdx=1)
data(cn.mops) r <- cn.mops(X[1:200, ]) segplot(r,sampleIdx=1)
Displays method for S4 class CNVDetectionResult
## S4 method for signature 'CNVDetectionResult' show(object)
## S4 method for signature 'CNVDetectionResult' show(object)
object |
An instance of a "CNVDetectionResult". |
Displays the result object of a CNV detection method.
Guenter Klambauer [email protected]
This function performs the an alternative version of the cn.mops algorithm adapted to a setting of a single sample.
singlecn.mops(x, I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4), classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7", "CN8"), priorImpact = 1, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.5, lowerThreshold = -0.9, minWidth = 3, segAlgorithm = "fast", minReadCount = 1, returnPosterior = FALSE, ...)
singlecn.mops(x, I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4), classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7", "CN8"), priorImpact = 1, cyc = 20, parallel = 0, norm = 1, normType = "poisson", sizeFactor = "mean", normQu = 0.25, quSizeFactor = 0.75, upperThreshold = 0.5, lowerThreshold = -0.9, minWidth = 3, segAlgorithm = "fast", minReadCount = 1, returnPosterior = FALSE, ...)
x |
Either an instance of "GRanges" or a raw data matrix with one column or a vector of read counts. An entry is the read count of the sample in the genomic region. |
I |
Vector positive real values that contain the expected fold change of the copy number classes. Length of this vector must be equal to the length of the "classes" parameter vector. For human copy number polymorphisms we suggest to use the default I = c(0.025,0.5,1,1.5,2,2.5,3,3.5,4). |
classes |
Vector of characters of the same length as the parameter vector "I". One vector element must be named "CN2". The names reflect the labels of the copy number classes. Default = c("CN0","CN1","CN2","CN3","CN4","CN5","CN6","CN7","CN8"). |
priorImpact |
Positive real value that reflects how strong the prior assumption affects the result. The higher the value the more samples will be assumed to have copy number 2. Default = 1. |
cyc |
Positive integer that sets the number of cycles for the algorithm. Usually after less than 15 cycles convergence is reached. Default = 20. |
parallel |
How many cores are used for the computation. If set to zero than no parallelization is applied. Default = 0. |
norm |
The normalization strategy to be used. If set to 0 the read counts are not normalized and cn.mops does not model different coverages. If set to 1 the read counts are normalized. If set to 2 the read counts are not normalized and cn.mops models different coverages. (Default=1). |
normType |
Mode of the normalization technique. Possible values are "mean","min","median","quant", "poisson" and "mode". Read counts will be scaled sample-wise. Default = "poisson". |
sizeFactor |
By this parameter one can decide to how the size factors are calculated. Possible choices are the the mean, median or mode coverage ("mean", "median", "mode") or any quantile ("quant"). |
normQu |
Real value between 0 and 1. If the "normType" parameter is set to "quant" then this parameter sets the quantile that is used for the normalization. Default = 0.25. |
quSizeFactor |
Quantile of the sizeFactor if sizeFactor is set to "quant". 0.75 corresponds to "upper quartile normalization". Real value between 0 and 1. Default = 0.75. |
upperThreshold |
Positive real value that sets the cut-off for copy number gains. All CNV calling values above this value will be called as "gain". The value should be set close to the log2 of the expected foldchange for copy number 3 or 4. Default = 0.5. |
lowerThreshold |
Negative real value that sets the cut-off for copy number losses. All CNV calling values below this value will be called as "loss". The value should be set close to the log2 of the expected foldchange for copy number 1 or 0. Default = -0.9. |
minWidth |
Positive integer that is exactly the parameter "min.width" of the "segment" function of "DNAcopy". minWidth is the minimum number of segments a CNV should span. Default = 3. |
segAlgorithm |
Which segmentation algorithm should be used. If set to "DNAcopy" circular binary segmentation is performed. Any other value will initiate the use of our fast segmentation algorithm. Default = "fast". |
minReadCount |
If all samples are below this value the algorithm will return the prior knowledge. This prevents that the algorithm from being applied to segments with very low coverage. Default=1. |
returnPosterior |
Flag that decides whether the posterior probabilities should be returned. The posterior probabilities have a dimension of samples times copy number states times genomic regions and therefore consume a lot of memory. Default=FALSE. |
... |
Additional parameters will be passed to the "DNAcopy" or the standard segmentation algorithm. |
An instance of "CNVDetectionResult".
Guenter Klambauer [email protected]
data(cn.mops) singlecn.mops(XRanges[,1])
data(cn.mops) singlecn.mops(XRanges[,1])
This data set gives the read counts of 40 samples in 5000 genomic
locations. The rows correspond to genomic segments of 25kbp length and the
columns to samples. An entry is the number of reads that map to the specific
segment of the sample. The rownames contain the information of the genomic
location - they are in the format refseqname_startposition_endposition.
The simulated data contains CNVs given in the CNVRanges
object. It was generated using
distributions of read counts as they appear in real sequencing experiments.
CNVs were implanted under the assumption that the expected read count is linear
dependent on the copy number (e.g. in a certain genomic we expect
reads for copy number 2, then we expect
reads for copy number 4).
X
X
A data matrix of 5000 rows and 40 columns.
http://www.bioinf.jku.at/software/cnmops/cnmops.html.
Guenter Klambauer, Karin Schwarzbauer, Andreas Mayr, Djork-Arne Clevert, Andreas Mitterecker, Ulrich Bodenhofer, Sepp Hochreiter. cn.MOPS: mixture of Poissons for discovering copy number variations in next generation sequencing data with a low false discovery rate. Nucleic Acids Research 2012 40(9); doi:10.1093/nar/gks003.
This data set gives the read counts of 40 samples in 5000 genomic
locations. The rows correspond to genomic segments of 25kbp length and the
columns to samples. An entry is the number of reads that map to the specific
segment of the sample. The "GRanges" object contains the name of the
reference sequence, start and end position of the genomic segments.
The simulated data contains CNVs given in the CNVRanges
object. It was generated using
distributions of read counts as they appear in real sequencing experiments.
CNVs were implanted under the assumption that the expected read count is linear
dependent on the copy number (e.g. in a certain genomic we expect
reads for copy number 2, then we expect
reads for copy number 4).
XRanges
XRanges
A GRanges object with 5000 rows and 40 value columns across 1 space.
http://www.bioinf.jku.at/software/cnmops/cnmops.html.
Guenter Klambauer, Karin Schwarzbauer, Andreas Mayr, Djork-Arne Clevert, Andreas Mitterecker, Ulrich Bodenhofer, Sepp Hochreiter. cn.MOPS: mixture of Poissons for discovering copy number variations in next generation sequencing data with a low false discovery rate. Nucleic Acids Research 2012 40(9); doi:10.1093/nar/gks003.