Package 'CODEX' reference manual

Title:	A Normalization and Copy Number Variation Detection Method for Whole Exome Sequencing
Description:	A normalization and copy number variation calling procedure for whole exome DNA sequencing data. CODEX relies on the availability of multiple samples processed using the same sequencing pipeline for normalization, and does not require matched controls. The normalization model in CODEX includes terms that specifically remove biases due to GC content, exon length and targeting and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based recursive segmentation procedure that explicitly models the count-based exome sequencing data.
Authors:	Yuchao Jiang, Nancy R. Zhang
Maintainer:	Yuchao Jiang <[email protected]>
License:	GPL-2
Version:	1.39.0
Built:	2025-03-29 05:35:38 UTC
Source:	https://github.com/bioc/CODEX

A Normalization and Copy Number Variation Detection Method for Whole Exome Sequencing

Description

CODEX is a normalization and copy number variation calling procedure for whole exome DNA sequencing data. CODEX relies on the availability of multiple samples processed using the same sequencing pipeline for normalization, and does not require matched controls. The normalization model in CODEX includes terms that specifically remove biases due to GC content, exon length and targeting and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based recursive segmentation procedure that explicitly models the count-based exome sequencing data.

Details

Package:	CODEX
Type:	Package
Version:	0.99.0
Date:	2015-01-13
License:	GPL-2

CODEX takes as input the bam files/directories for whole exome sequencing datasets and bed files for exonic positions, returns raw and normalized coverage for each exon, and calls copy number variations with genotyping results.
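The workflow described above can be sketched as one function that chains the package's steps for a single chromosome. This is a minimal sketch under stated assumptions, not the authors' reference script: the wrapper name `run_codex_chr` is hypothetical, the argument list of `getmapp` is assumed to mirror that of `getgc`, the defaults for `mapqthres`, `K`, and `lmax` are illustrative, and choosing `optK` as the K with the largest BIC assumes the criterion is oriented so that larger is better.

```r
# Hypothetical wrapper chaining the CODEX steps for one chromosome.
# The CODEX package is only needed when the function is called.
run_codex_chr <- function(bamdir, bedFile, sampname, projectname, chr,
                          mapqthres = 20, K = 1:9, lmax = 200,
                          mode = "integer") {
  # 1. Collect bam paths, sample names, and exonic targets.
  bambedObj <- CODEX::getbambed(bamdir = bamdir, bedFile = bedFile,
                                sampname = sampname,
                                projectname = projectname, chr = chr)
  ref <- bambedObj$ref

  # 2. Raw depth of coverage per exon and sample.
  coverageObj <- CODEX::getcoverage(bambedObj, mapqthres = mapqthres)

  # 3. Per-exon GC content and mappability (getmapp signature is assumed).
  gc <- CODEX::getgc(chr, ref)
  mapp <- CODEX::getmapp(chr, ref)

  # 4. Quality control using the recommended thresholds.
  qcObj <- CODEX::qc(coverageObj$Y, bambedObj$sampname, chr, ref, mapp, gc,
                     cov_thresh = c(20, 4000), length_thresh = c(20, 2000),
                     mapp_thresh = 0.9, gc_thresh = c(20, 80))

  # 5. Normalize over a range of latent factor counts.
  normObj <- CODEX::normalize(qcObj$Y_qc, qcObj$gc_qc, K = K)

  # 6. Pick K (assumption: larger BIC is better).
  optK <- normObj$K[which.max(normObj$BIC)]

  # 7. Segment and call copy number variations.
  CODEX::segment(qcObj$Y_qc, normObj$Yhat, optK = optK, K = normObj$K,
                 sampname_qc = qcObj$sampname_qc, ref_qc = qcObj$ref_qc,
                 chr = chr, lmax = lmax, mode = mode)
}
```

Calling `run_codex_chr()` once per chromosome and combining the returned calls is one way to cover a whole exome; in practice the AIC, BIC, and RSS curves are usually inspected before fixing K.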

Author(s)

Yuchao Jiang <[email protected]>, Nancy R. Zhang

Demo data pre-stored for bambedObj.

Description

Pre-stored bambedObj data for demonstration purposes.

Usage

data(bambedObjDemo)

Details

Pre-computed using whole exome sequencing data of 46 HapMap samples.

Value

bambedObj demo data (list) pre-computed.

`AIC`	vector of AIC for each K returned from `normalize`
`BIC`	vector of BIC for each K returned from `normalize`
`RSS`	vector of RSS for each K returned from `normalize`
`K`	vector of K returned from `normalize`
`filename`	Filename of the output plot of AIC and RSS
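To illustrate how these vectors are typically used, the sketch below plots the three criteria against K and picks one value. The numbers are made up for illustration, the output file name is hypothetical, and selecting the K with the largest BIC assumes the criterion is oriented so that larger is better. It is not a reproduction of `choiceofK` itself.

```r
# Made-up model-selection criteria for K = 1..5 (illustration only).
K   <- 1:5
AIC <- c(1000, 1080, 1110, 1120, 1125)
BIC <- c(990, 1060, 1075, 1070, 1060)
RSS <- c(50, 38, 33, 31, 30)

# Assumption: larger BIC indicates a better trade-off of fit and complexity.
optK <- K[which.max(BIC)]

# Plot each criterion against K, side by side, into a PDF file.
filename <- file.path(tempdir(), "choiceofK_sketch.pdf")
pdf(filename, width = 9, height = 3)
par(mfrow = c(1, 3))
plot(K, RSS, type = "b", xlab = "Number of latent factors K", ylab = "RSS")
plot(K, AIC, type = "b", xlab = "Number of latent factors K", ylab = "AIC")
plot(K, BIC, type = "b", xlab = "Number of latent factors K", ylab = "BIC")
abline(v = optK, lty = 2)  # mark the selected K on the BIC panel
invisible(dev.off())
```

With real data, the vectors would come from the output of `normalize` rather than being typed in by hand.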

`bamdir`	Column vector. Each line specifies the directory of a bam file. Must be in the same order as the sample names in sampname.
`bedFile`	Path to the bed file specifying exonic targets, given as a character string.
`sampname`	Column vector. Each line specifies the name of the sample corresponding to a bam file. Must be in the same order as the bam directories in bamdir.
`projectname`	String specifying the name of the project. Data will be saved using this as prefix.
`chr`	Chromosome.

`bamdir`	Bam directories
`sampname`	Sample names
`ref`	IRanges object specifying exonic positions
`projectname`	String specifying the name of the project.
`chr`	Chromosome

`bambedObj`	Object returned from `getbambed`
`mapqthres`	Mapping quality threshold of reads.

`Y`	Read depth matrix
`readlength`	Vector of read length for each sample

`chr`	Chromosome returned from `getbambed`
`ref`	IRanges object returned from `getbambed`

`Y_qc`	Read depth matrix after quality control procedure returned from `qc`
`gc_qc`	Vector of GC content for each exon after quality control procedure returned from `qc`
`K`	Number of latent Poisson factors. Either a single integer, if the optimal value has already been chosen, or a vector of integers, in which case AIC, BIC, and RSS are computed for each value to guide the choice of the optimal K.

`Yhat`	Normalized read depth matrix
`AIC`	AIC for model selection
`BIC`	BIC for model selection
`RSS`	RSS for model selection
`K`	Number of latent Poisson factors

`Y`	Original read depth matrix returned from `getcoverage`
`sampname`	Vector of sample names returned from `getbambed`
`chr`	Chromosome.
`ref`	IRanges object specifying exonic positions returned from `getbambed`
`mapp`	Vector of mappability for each exon returned from `getmapp`
`gc`	Vector of GC content for each exon returned from `getgc`
`cov_thresh`	Vector specifying the lower and upper bounds of the exonic median coverage threshold for QC. 20-4000 recommended.
`length_thresh`	Vector specifying the lower and upper bounds of the exonic length threshold for QC. 20-2000 recommended.
`mapp_thresh`	Scalar specifying the exonic mappability threshold for QC. 0.9 recommended.
`gc_thresh`	Vector specifying the lower and upper bounds of the exonic GC content threshold for QC. 20-80 recommended.
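The effect of these thresholds can be illustrated with a small, self-contained filter in base R. This is a sketch of the idea only, not the package's `qc` implementation: the exon summaries are invented, the helper `in_range` is hypothetical, and treating the bounds as inclusive is an assumption.

```r
# Invented per-exon summaries for five exons (illustration only).
exons <- data.frame(
  median_cov = c(150, 5, 900, 4500, 300),   # median read depth across samples
  length_bp  = c(120, 200, 15, 800, 2500),  # exon length in base pairs
  mapp       = c(0.98, 0.95, 1.00, 0.99, 0.70),
  gc         = c(45, 50, 60, 85, 40)        # GC content
)

# Recommended thresholds from the argument descriptions above.
cov_thresh    <- c(20, 4000)
length_thresh <- c(20, 2000)
mapp_thresh   <- 0.9
gc_thresh     <- c(20, 80)

# Hypothetical helper: TRUE where x lies within [lower, upper] (inclusive).
in_range <- function(x, bounds) x >= bounds[1] & x <= bounds[2]

# An exon is kept only if it passes every criterion.
pass <- in_range(exons$median_cov, cov_thresh) &
  in_range(exons$length_bp, length_thresh) &
  exons$mapp >= mapp_thresh &
  in_range(exons$gc, gc_thresh)

exons_qc <- exons[pass, ]
```

Here only the first exon survives: the others fail on coverage, length, GC content, or mappability.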

`Y_qc`	Updated `Y` after QC
`sampname_qc`	Updated `sampname` after QC
`gc_qc`	Updated `gc` after QC
`mapp_qc`	Updated `mapp` after QC
`ref_qc`	Updated `ref` after QC
`qcmat`	Matrix specifying results of exon-wise QC procedures

`Y_qc`	Raw read depth matrix after quality control procedure returned from `qc`
`Yhat`	Normalized read depth matrix returned from `normalize`
`optK`	Optimal value `K` returned from `choiceofK`
`K`	Number of latent Poisson factors. Either a single integer, if the optimal value has already been chosen, or a vector of integers, in which case AIC, BIC, and RSS are computed for each value to guide the choice of the optimal K.
`sampname_qc`	Vector of sample names after quality control procedure returned from `qc`
`ref_qc`	IRanges object of genomic positions of each exon after quality control procedure returned from `qc`
`chr`	Chromosome number returned from `getbambed`
`lmax`	Maximum CNV length in number of exons returned.
`mode`	Can be either "integer" or "fraction", which respectively correspond to format of the returned copy numbers.
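As a rough illustration of the two output formats, the sketch below derives a copy number estimate for one candidate segment from raw and normalized depth. It assumes a diploid baseline of 2 and uses the ratio of observed to expected reads. The helper `estimate_cn` is hypothetical, the depths are invented, and rounding is only a simple stand-in for integer calling that may differ from what `segment` does internally.

```r
# Hypothetical helper: copy number for one segment from raw depth (y)
# and normalized, expected depth (yhat), assuming a diploid baseline.
estimate_cn <- function(y, yhat, mode = c("integer", "fraction")) {
  mode <- match.arg(mode)
  cn <- 2 * sum(y) / sum(yhat)
  if (mode == "integer") round(cn) else cn
}

# Invented depths for a four-exon segment in one sample.
y    <- c(60, 70, 65, 61)      # observed reads per exon
yhat <- c(100, 100, 100, 100)  # expected reads per exon under copy number 2

cn_fraction <- estimate_cn(y, yhat, mode = "fraction")
cn_integer  <- estimate_cn(y, yhat, mode = "integer")
```

In this example the fractional estimate is 1.28 and the integer estimate is 1, which would be read as a heterozygous deletion.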

Help Index

A Normalization and Copy Number Variation Detection Method for Whole Exome Sequencing
Demo data pre-stored for bambedObj.
Determine the number of latent factors K.
Demo data pre-stored for coverageObj.
Demo data pre-stored for GC content.
Get bam file directories, sample names, and exonic positions
Get depth of coverage from whole exome sequencing
Get GC content for each exonic target
Get mappability for each exonic target
Position reference for pre-computed mappability results.
Pre-computed mappabilities