Package 'iCNV' reference manual

Title:	Integrated Copy Number Variation detection
Description:	Integrative copy number variation (CNV) detection from multiple platform and experimental design.
Authors:	Zilu Zhou, Nancy Zhang
Maintainer:	Zilu Zhou <[email protected]>
License:	GPL-2
Version:	1.27.0
Built:	2025-02-22 05:59:15 UTC
Source:	https://github.com/bioc/iCNV

Get BAM baf information from vcf

Description

If your vcf follow the format in the example, you could use this function to extract NGS baf from vcf files. Remember to load library before hands. Save 6 lists, each list has N entry. N = # of individuals (or vcf file) ngs_baf.nm: name of the bamfiles; ngs_baf.chr: the chromosome; ngs_baf.pos: the position of the variants; ngs_baf: the BAF of the variants; ngs_baf.id: the ID of the variants; filenm:the file name

Usage

bambaf_from_vcf(dir = ".", vcf_list, chr = NULL, projname = "")
bambaf_from_vcf(dir = ".", vcf_list, chr = NULL, projname = "")

Arguments

`dir`	The directory to all the vcf stored; default is right in this folder. Type character. Defualt '.'
`vcf_list`	All the vcf names stored in vcf.list; could use command:"ls *.vcf > vcf.list" to generate. Type character.
`chr`	Specify the chromosome you want to generate. Must be of int from 1-22. If not specify, this function will generate all chromosomes. Defualt NULL
`projname`	Name of the project. Type character. Default ”

Value

void

Examples

dir <- system.file("extdata", package="iCNV")
bambaf_from_vcf(dir,'bam_vcf.list',projname='icnv.demo.')
bambaf_from_vcf(dir,'bam_vcf.list',chr=22,projname='icnv.demo.')
dir <- system.file("extdata", package="iCNV")
bambaf_from_vcf(dir,'bam_vcf.list',projname='icnv.demo.')
bambaf_from_vcf(dir,'bam_vcf.list',chr=22,projname='icnv.demo.')

Generate BED file for WGS dataset.

Description

Default position generated from USCS genome browser

Usage

bed_generator(chr = numeric(), hg = numeric(), start = NULL, end = NULL,
  by = 1000)
bed_generator(chr = numeric(), hg = numeric(), start = NULL, end = NULL,
  by = 1000)

Arguments

`chr`	Specify the chromosome you want to generate. Must be of int from 1-22. Type integer.
`hg`	Specify the coordinate you want to generate from. Start and end position of hg19 and hg38 have been pre-implemented. Type integer.
`start`	The start position of your BED file. Default NULL
`end`	The end position of your BED file. Default NULL
`by`	The chunk of your DNA for each bin. Type integer. Default 1000.

Value

void

Examples

bed_generator(chr=22,hg=38)
bed_generator(22,38,5001,10000,by=500)
bed_generator(chr=22,hg=38)
bed_generator(22,38,5001,10000,by=500)

chromosome of the example

Description

data chromosome

Usage

    chr
chr

Format

integer

Value

Name of the file

Description

Example NGS VCF files for the 1000 Genome Project, value stored at filenm

Usage

    filenm
filenm

Format

vector with NGS vcf file names

Value

File names for the NGS vcf

Get array information from given format

Description

If your array input file follow the format in the example, you could use this function to extract array LRR and baf. Remember to load library before hands. Save 4*[# of chr] lists, each list has N entry. N = # of individuals snp_lrr: SNP LRR intensity; snp_lrr.pos: the position of the SNPs snp_baf: the BAF of the SNPs; snp_baf.pos: the position of the SNPs

Usage

get_array_input(dir = character(), pattern = character(), chr = NULL,
  projname = "")
get_array_input(dir = character(), pattern = character(), chr = NULL,
  projname = "")

Arguments

`dir`	A string. The directory path to the folder where store signal intensity file according to chr. Type character
`pattern`	A string. The pattern of all the intensity file. Type character
`chr`	Specify the chromosome you want to generate. Must be of int from 1-22. If not specify, this function will generate files for all chromosomes. Default NULL
`projname`	Name of the project. Type character

Value

void

Examples

dir <- system.file("extdata", package="iCNV")
pattern <- paste0('*.csv.arrayicnv$')
get_array_input(dir,pattern,chr=22,projname='icnv.demo.')
dir <- system.file("extdata", package="iCNV")
pattern <- paste0('*.csv.arrayicnv$')
get_array_input(dir,pattern,chr=22,projname='icnv.demo.')

CNV detection

Description

Copy number variation detection tool for germline data. Able to combine intensity and BAF from SNP array and NGS data.

Usage

iCNV_detection(ngs_plr = NULL, snp_lrr = NULL, ngs_baf = NULL,
  snp_baf = NULL, ngs_plr.pos = NULL, snp_lrr.pos = NULL,
  ngs_baf.pos = NULL, snp_baf.pos = NULL, maxIt = 50, visual = 0,
  projname = "iCNV.", CN = 0, mu = c(-3, 0, 2), cap = FALSE)
iCNV_detection(ngs_plr = NULL, snp_lrr = NULL, ngs_baf = NULL,
  snp_baf = NULL, ngs_plr.pos = NULL, snp_lrr.pos = NULL,
  ngs_baf.pos = NULL, snp_baf.pos = NULL, maxIt = 50, visual = 0,
  projname = "iCNV.", CN = 0, mu = c(-3, 0, 2), cap = FALSE)

Arguments

`ngs_plr`	A list of NGS intensity data. Each entry is an individual. If no NGS data, no need to specify.
`snp_lrr`	A list of SNP array intensity data. Each entry is an individual. If no SNP array data, no need to specify.
`ngs_baf`	A list of NGS BAF data. Each entry is an individual. If no NGS data, no need to specify.
`snp_baf`	A list of SNP array BAF data. Each entry is an individual. If no SNP array data, no need to specify.
`ngs_plr.pos`	A list of NGS intensity postion data. Each entry is an individual with dimension= (#of bins or exons, 2(start and end position)). If no NGS data, no need to specify.
`snp_lrr.pos`	A list of SNP array intensity postion data. Each entry is an individual with length=#of SNPs. If no SNP array data, no need to specify.
`ngs_baf.pos`	A list of NGS BAF postion data. Each entry is an individual with length=#of BAFs. If no NGS data, no need to specify.
`snp_baf.pos`	A list of SNP array BAF postion data. Each entry is an individual with length=#of BAFs. If no SNP array data, no need to specify.
`maxIt`	An integer number indicate the maximum number of EM iteration if not converged during parameter inference. Type integer. Default 50.
`visual`	An indicator variable with value 0,1,2. 0 indicates no visualization, 1 indicates basic visualization, 2 indicates complete visualization (Note visual 2 only work for single platform and integer CN inferenced). Type integer. Default 0
`projname`	A string as the name of this project. Type character. Default 'iCNV.'
`CN`	An indicator variable with value 0,1 for whether wants to infer exact copy number. 0 no exact CN, 1 exact CN. Type integer. Default 0.
`mu`	A length tree vectur specify means of intensity in mixture normal distribution (Deletion, Diploid, Duplification). Default c(-3,0,2)
`cap`	A boolean decides whether we cap insane intensity value due to double deletion or mutiple amplification. Type logical. Default False

Value

(1) CNV inference, contains CNV inference, Start and end position for each inference, Conditional probability for each inference, mu for mixture normal, sigma for mixture normal, probability of CNVs, Z score for each inference.

(2) exact copy number for each CNV inference, if CN=1.

Examples

# icnv call without genotype (just infer deletion, duplication)
projname <- 'icnv.demo.'
icnv_res0 <- iCNV_detection(ngs_plr,snp_lrr,
                         ngs_baf,snp_baf,
                         ngs_plr.pos,snp_lrr.pos,
                         ngs_baf.pos,snp_baf.pos,
                         projname=projname,CN=0,mu=c(-3,0,2),cap=TRUE,visual = 1)
# icnv call with genotype inference and complete plot
projname <- 'icnv.demo.geno.'
icnv_res1 <- iCNV_detection(ngs_plr,snp_lrr,
                         ngs_baf,snp_baf,
                         ngs_plr.pos,snp_lrr.pos,
                         ngs_baf.pos,snp_baf.pos,
                         projname=projname,CN=1,mu=c(-3,0,2),cap=TRUE,visual = 2)
# icnv call without genotype (just infer deletion, duplication)
projname <- 'icnv.demo.'
icnv_res0 <- iCNV_detection(ngs_plr,snp_lrr,
                         ngs_baf,snp_baf,
                         ngs_plr.pos,snp_lrr.pos,
                         ngs_baf.pos,snp_baf.pos,
                         projname=projname,CN=0,mu=c(-3,0,2),cap=TRUE,visual = 1)
# icnv call with genotype inference and complete plot
projname <- 'icnv.demo.geno.'
icnv_res1 <- iCNV_detection(ngs_plr,snp_lrr,
                         ngs_baf,snp_baf,
                         ngs_plr.pos,snp_lrr.pos,
                         ngs_baf.pos,snp_baf.pos,
                         projname=projname,CN=1,mu=c(-3,0,2),cap=TRUE,visual = 2)

Convert icnv.output to input for Genome Browser.

Description

We could add the output to custom tracks on Genome Browser. Remeber to choose human assembly matches your input data. We color coded the CNVs to make it as consistant as IGV. To show color, click 'User Track after submission', and edit config to 'visibility=2 itemRgb="On"'. Color see Github page for more example.

Usage

icnv_output_to_gb(chr = numeric(), icnv.output)
icnv_output_to_gb(chr = numeric(), icnv.output)

Arguments

`chr`	CNV chromosome. Type integer.
`icnv.output`	output from output_list_function

Value

matrix for Genome browser

Examples

icnv.output <- output_list(icnv_res=icnv_res0,sampleid=sampname_qc, CN=0, min_size=10000)
gb_input <- icnv_output_to_gb(chr=22,icnv.output)
write.table(gb_input,file='icnv_res_gb_chr22.tab',quote=FALSE,col.names=FALSE,row.names=FALSE)
icnv.output <- output_list(icnv_res=icnv_res0,sampleid=sampname_qc, CN=0, min_size=10000)
gb_input <- icnv_output_to_gb(chr=22,icnv.output)
write.table(gb_input,file='icnv_res_gb_chr22.tab',quote=FALSE,col.names=FALSE,row.names=FALSE)

Example iCNV calling results.

Description

iCNV calling result of all the samples

Usage

    icnv_res
icnv_res

Format

A list containing the calling result of CNVs:

1st item: HMM call result without Copy number
2nd item: exact copy number

Value

iCNV calling result

BAF list from NGS

Description

10 samples BAF value extracted from VCF files, location stored at ngs_baf.pos

Usage

    ngs_baf
ngs_baf

Format

A list of ten, which each entry is the BAF value for a individual

Value

BAF value

BAF chromosome from NGS

Description

46 samples BAF chromosome. Pre-computed using whole exome sequencing data of 46 HapMap samples.

Usage

    ngs_baf.chr
ngs_baf.chr

Format

A list of 46, which each entry is the BAF chromosome for a individual position

Value

BAF chromosome

BAF variants id from NGS

Description

46 samples BAF ids. Pre-computed using whole exome sequencing data of 46 HapMap samples.

Usage

    ngs_baf.id
ngs_baf.id

Format

A list of 46, which each entry is the BAF variants id a individual position

Value

BAF variants id

BAF variants sample name from NGS

Description

46 samples BAF names.

Usage

    ngs_baf.nm
ngs_baf.nm

Format

A list of 46, which each entry is the sample name

Value

BAF variants sample names

BAF position list from NGS

Description

10 samples BAF position extracted from VCF files, value stored at ngs_baf

Usage

    ngs_baf.pos
ngs_baf.pos

Format

A list of ten, which each entry is the BAF positions for a individual

Value

BAF position

Normalized Poisson likelihood ratio list from NGS

Description

10 samples PLR value from BAM calculated by CODEX, exon position stored at ngs_plr.pos

Usage

    ngs_plr
ngs_plr

Format

A list of ten, which each entry is the PLR value for a individual, calculated from CODEX

Value

PLR value

Exon location list from NGS

Description

10 samples exon position extracted from BED files, value stored at ngs_plr

Usage

    ngs_plr.pos
ngs_plr.pos

Format

A list of ten, which each entry is the Exon positions for a individual

Value

Exon position

Demo data pre-stored for normObj.

Description

Pre-stored normObj data for demonstration purposes.

Usage

normObjnormObj

Details

Pre-computed using whole exome sequencing data of 46 HapMap samples.

Value

normObj demo data (list) pre-computed.

Author(s)

Zilu Zhou [email protected]

Examples

Yhat <- normObjDemo$Yhat
AIC <- normObjDemo$AIC
BIC <- normObjDemo$BIC
RSS <- normObjDemo$RSS
K <- normObjDemo$K
Yhat <- normObjDemo$Yhat
AIC <- normObjDemo$AIC
BIC <- normObjDemo$BIC
RSS <- normObjDemo$RSS
K <- normObjDemo$K

Generate ouput list.

Description

Generate human readable output from result calculated by iCNV_detection function

Usage

output_list(icnv_res, sampleid = NULL, CN = 0, min_size = 0)
output_list(icnv_res, sampleid = NULL, CN = 0, min_size = 0)

Arguments

`icnv_res`	CNV inference result. Output from iCNV_detection()
`sampleid`	the name of the sample, same order as the input
`CN`	An indicator variable with value 0,1 for whether exact copy number inferred in iCNV_detection. 0 no exact CN, 1 exact CN. Type integer. Default 0.
`min_size`	A integer which indicate the minimum length of the CNV you are interested in. This could remove super short CNVs due to noise. Type integer. Default 0. Recommend 1000.

Value

output CNV list of each individual

Examples

icnv.output <- output_list(icnv_res=icnv_res0,sampleid=sampname_qc, CN=0)
icnv.output <- output_list(icnv_res=icnv_res0,sampleid=sampname_qc, CN=0)

plot out the NGS plr or array lrr.

Description

For quality checking purpose during intermediate steps

Usage

plot_intensity(intensity, chr = numeric())
plot_intensity(intensity, chr = numeric())

Arguments

`intensity`	Specify the ngs_plr object generated by CODEX or SNP array.
`chr`	Specify the chromosome you want to generate. Must be of int from 1-22. Type integer

Value

void

Examples

chr <- 22
plot_intensity(ngs_plr,chr)
plot_intensity(snp_lrr,chr)
chr <- 22
plot_intensity(ngs_plr,chr)
plot_intensity(snp_lrr,chr)

Plot CNV inference score.

Description

Plot out CNV inference score. Each row is a sample, each column is a SNP or, exon (WES) or bin (WGS). Red color indicate score favor duplication whereas blue favor deletion.

Usage

plotHMMscore(icnv_res, h = NULL, t = NULL, title = "score plot",
  output = NULL, col = "")
plotHMMscore(icnv_res, h = NULL, t = NULL, title = "score plot",
  output = NULL, col = "")

Arguments

`icnv_res`	CNV inference result. Result from iCNV_detection() (i.e. iCNV_detection(...))
`h`	start position of this plot. Default Start of the whole chromosome
`t`	end position of this plot. Default End of the whole chromosome
`title`	of this plot. Character value. Type character Default "score plot"
`output`	generated from output_list_function. If it isn't null, only CNVs in output file will be highlighted. Default NULL
`col`	Specify if would like to plot in DGV color scheme ('DGV',red for deletion, blue for duplication and grey for diploid) or default color scheme (blue for deletion, red for duplicatin and and green for diploid) Type character. Default ”

Value

void

Examples

plotHMMscore(icnv_res0,h=21000000, t=22000000, title='my favorite subject')
plotHMMscore(icnv_res0,h=21000000, t=22000000, title='my favorite subject',col='DGV')
plotHMMscore(icnv_res0,h=21000000, t=22000000, title='my favorite subject')
plotHMMscore(icnv_res0,h=21000000, t=22000000, title='my favorite subject',col='DGV')

Individual sample plot

Description

Plot relationship between platforms and features for each individual. Only work for muli-platform inference.

Usage

plotindi(ngs_plr, snp_lrr, ngs_baf, snp_baf, ngs_plr.pos, snp_lrr.pos,
  ngs_baf.pos, snp_baf.pos, icnvres, I = numeric(), h = NULL, t = NULL)
plotindi(ngs_plr, snp_lrr, ngs_baf, snp_baf, ngs_plr.pos, snp_lrr.pos,
  ngs_baf.pos, snp_baf.pos, icnvres, I = numeric(), h = NULL, t = NULL)

Arguments

`ngs_plr`	A list of NGS intensity data. Each entry is an individual. If no NGS data, no need to specify.
`snp_lrr`	A list of SNP array intensity data. Each entry is an individual. If no SNP array data, no need to specify.
`ngs_baf`	A list of NGS BAF data. Each entry is an individual. If no NGS data, no need to specify.
`snp_baf`	A list of SNP array BAF data. Each entry is an individual. If no SNP array data, no need to specify.
`ngs_plr.pos`	A list of NGS intensity postion data. Each entry is an individual with dimension= (#of bins or exons, 2(start and end position)). If no NGS data, no need to specify.
`snp_lrr.pos`	A list of SNP array intensity postion data. Each entry is an individual with length=#of SNPs. If no SNP array data, no need to specify.
`ngs_baf.pos`	A list of NGS BAF postion data. Each entry is an individual with length=#of BAFs. If no NGS data, no need to specify.
`snp_baf.pos`	A list of SNP array BAF postion data. Each entry is an individual with length=#of BAFs. If no SNP array data, no need to specify.
`icnvres`	CNV inference result. The output from iCNV_detection()
`I`	Indicating the position of the individual to plot. Type integer.
`h`	start position of this plot. Default Start of the whole chromosome
`t`	end position of this plot. Default End of the whole chromosome

Value

void

Examples

plotindi(ngs_plr,snp_lrr,ngs_baf,snp_baf,
 ngs_plr.pos,snp_lrr.pos,ngs_baf.pos,snp_baf.pos,
 icnv_res0,I=1)
plotindi(ngs_plr,snp_lrr,ngs_baf,snp_baf,
 ngs_plr.pos,snp_lrr.pos,ngs_baf.pos,snp_baf.pos,
 icnv_res0,I=1)

name of project

Description

name of project

Usage

    projname
projname

Format

string

Value

name of project

Demo data pre-stored for qcObj.

Description

Pre-stored qcObj data for demonstration purposes.

Usage

qcObjqcObj

Details

Pre-computed using whole exome sequencing data of 46 HapMap samples.

Value

qcObj demo data (list) pre-computed.

Author(s)

Zilu Zhou [email protected]

Examples

Y_qc <- qcObj$Y_qc
sampname_qc <- qcObj$sampname_qc
gc_qc <- qcObj$gc_qc
mapp_qc <- qcObj$mapp_qc
ref_qc <- qcObj$ref_qc
Y_qc <- qcObj$Y_qc
sampname_qc <- qcObj$sampname_qc
gc_qc <- qcObj$gc_qc
mapp_qc <- qcObj$mapp_qc
ref_qc <- qcObj$ref_qc

CODEX sample name

Description

46 samples BAM names.

Usage

    sampname
sampname

Format

A vector of 46, which each entry is the sample name

Value

CODEX sample names

QCed sample name

Description

QCed sample name

Usage

    sampname_qc
sampname_qc

Format

string

Value

name of samples after QC

BAF list from Array

Description

10 samples BAF value extracted from standard format files, location stored at snp_baf.pos

Usage

    snp_baf
snp_baf

Format

A list of ten, which each entry is the BAF value for a individual

Value

BAF value

BAF position list from Array

Description

10 samples BAF position extracted from standard format, value stored at snp_baf

Usage

    snp_baf.pos
snp_baf.pos

Format

A list of ten, which each entry is the BAF positions for a individual

Value

BAF position

Normalized log R ratio list from Array

Description

10 samples LRR value from standard format, SNP position stored at snp_lrr.pos

Usage

    snp_lrr
snp_lrr

Format

A list of ten, which each entry is the LRR value for a individual

Value

LRR value

SNP position list from Array

Description

10 samples SNP position extracted from standard format, value stored at snp_lrr

Usage

    snp_lrr.pos
snp_lrr.pos

Format

A list of ten, which each entry is the SNP positions for a individual

Value

SNP position

Package 'iCNV'

Help Index

Get BAM baf information from vcf

Description

Usage

Arguments

Value

Examples

Generate BED file for WGS dataset.

Description

Usage

Arguments

Value

Examples

chromosome of the example

Description

Usage

Format

Value

Name of the file

Description

Usage

Format

Value

Get array information from given format

Description

Usage

Arguments

Value

Examples

CNV detection

Description

Usage

Arguments

Value

Examples

Convert icnv.output to input for Genome Browser.

Description

Usage

Arguments

Value

Examples

Example iCNV calling results.

Description

Usage

Format

Value

BAF list from NGS

Description

Usage

Format

Value

BAF chromosome from NGS

Description

Usage

Format

Value

BAF variants id from NGS

Description

Usage

Format

Value

BAF variants sample name from NGS

Description

Usage

Format

Value

BAF position list from NGS

Description

Usage

Format

Value

Normalized Poisson likelihood ratio list from NGS

Description

Usage

Format

Value

Exon location list from NGS

Description

Usage