Package 'SICtools'

Title: Find SNV/Indel differences between two bam files with near relationship
Description: This package finds SNV/Indel differences between two closely related bam files by pairwise comparison at each base position across the genome region of interest. Differences are inferred by Fisher's exact test and Euclidean distance. The inputs are the base counts (A,T,G,C) at a given position and, for indels, the counts of reads that span no less than 2bp on both sides of the indel region.
Authors: Xiaobin Xing, Wu Wei
Maintainer: Xiaobin Xing <[email protected]>
License: GPL (>=2)
Version: 1.35.0
Built: 2024-07-01 06:02:44 UTC
Source: https://github.com/bioc/SICtools


tools for SNV/Indel Comparison between two bam files with near relationship

Description

This package finds SNV/Indel differences between two closely related bam files by pairwise comparison at each base position across the genome region of interest. Differences are inferred by Fisher's exact test and Euclidean distance. The inputs are the base counts (A,T,G,C) at a given position and, for indels, the counts of reads that span no less than 2bp on both sides of the indel region called by samtools+bcftools.

Details

Package: SICtools
Type: Package
Version: 1.0
Date: 2014-07-24
License: GPL (>=2)
LazyLoad: Yes

Author(s)

Xiaobin Xing

Maintainer: Xiaobin Xing <[email protected]>


main function to call indel difference between the bam files

Description

Tests indel-read count differences at a given indel position between the two bam files. Indel positions are first obtained by samtools+bcftools, and then the reads that span no less than 3bp of the indel boundary are counted. The read-count matrix at a given indel region from the two bam files is tested by Fisher's exact test and Euclidean distance. If no difference is found, NULL is returned.

Usage

indelDiff(bam1, bam2, refFsa, regChr, regStart, regEnd, minBaseQuality = 13, minMapQuality = 0, nCores = 1, pValueCutOff = 0.05, gtDistCutOff = 0.1, verbose = TRUE)

Arguments

bam1

the first bam file to be compared

bam2

the second bam file to be compared

refFsa

the reference fasta file used for bam1 and bam2 alignments

regChr

chromosome name of the region of interest; it should match the chromosome name in the reference

regStart

the start position (1-based) of the region of interest

regEnd

the end position (1-based) of the region of interest

minBaseQuality

the minimum base quality to be used for indel-read count

minMapQuality

the minimum read mapping quality to be used for indel-read count

nCores

number of threads used for parallel computation

pValueCutOff

p.value cutoff from fisher.test for displaying output. If there is no difference between the two compared positions (p.value = 1 and d.value = 0), NULL is returned even when pValueCutOff = 1.

gtDistCutOff

Euclidean distance cutoff from dist(,method='euclidean') for displaying output. If there is no difference between the two compared positions (p.value = 1 and d.value = 0), NULL is returned even when gtDistCutOff = 0.

verbose

print progress on screen, default = TRUE.

Value

indelDiff: returns a data.frame with difference information: chromosome, position, reference genotype, two alt genotypes and their indel-read counts for the two bam files, p.value (Fisher's exact test of these read counts) and d.value (Euclidean distance of these read counts)

Author(s)

Xiaobin Xing, <email:[email protected]>

References

Li H.*, Handsaker B.*, Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. and 1000 Genome Project Data Processing Subgroup (2009) The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics, 25, 2078-9. [PMID: 19505943]

Examples

bam1 <- system.file(package='SICtools','extdata','example1.bam')
bam2 <- system.file(package='SICtools','extdata','example2.bam')
refFsa <- system.file(package='SICtools','extdata','example.ref.fasta')

indelDiff(bam1,bam2,refFsa,'chr07',828514,828914,pValueCutOff=1,gtDistCutOff=0)
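
With pValueCutOff = 1 and gtDistCutOff = 0, as in the call above, every position showing any difference is reported, so a stricter filter can be applied afterwards. The following is a minimal base-R sketch on a mock data.frame, not output from the package: the positions and values are invented, only the p.value and d.value column names follow the Value section, and chr and pos are assumed names.

```r
# Mock result table (invented values). Only p.value and d.value are named
# in the documentation; chr and pos are hypothetical column names.
res <- data.frame(
  chr     = c('chr07', 'chr07', 'chr07'),
  pos     = c(828600, 828650, 828700),
  p.value = c(1e-6, 0.20, 0.003),
  d.value = c(0.45, 0.30, 0.05),
  stringsAsFactors = FALSE
)

# Keep rows that pass both a stricter p.value and a stricter distance threshold.
strict <- res[res$p.value <= 0.01 & res$d.value >= 0.2, , drop = FALSE]

# Order the remaining candidates by decreasing distance.
strict <- strict[order(-strict$d.value), , drop = FALSE]
strict
```

Requiring both thresholds is a conservative choice; either one could be used alone, depending on whether statistical support or effect size matters more for the comparison.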

main function to test point differences between the two bam files

Description

Tests base count (A,T,G,C) differences at a given position between the two bam files. The base count matrix is tested by Fisher's exact test and Euclidean distance. If no difference is found, NULL is returned.

Usage

snpDiff(bam1, bam2, refFsa, regChr, regStart, regEnd, minBaseQuality = 13, minMapQuality = 0, nCores = 1, pValueCutOff = 0.05, baseDistCutOff = 0.1, verbose = TRUE)

Arguments

bam1

the first bam file to be compared

bam2

the second bam file to be compared

refFsa

the reference fasta file used for bam1 and bam2 alignments

regChr

chromosome name of the region of interest; it should match the chromosome name in the reference

regStart

the start position (1-based) of the region of interest

regEnd

the end position (1-based) of the region of interest

minBaseQuality

the minimum base quality to be used for base count

minMapQuality

the minimum read mapping quality to be used for base count

nCores

number of threads used for parallel computation

pValueCutOff

p.value cutoff from fisher.test for displaying output. If there is no difference between the two compared positions (p.value = 1 and d.value = 0), NULL is returned even when pValueCutOff = 1.

baseDistCutOff

Euclidean distance cutoff from dist(,method='euclidean') for displaying output. If there is no difference between the two compared positions (p.value = 1 and d.value = 0), NULL is returned even when baseDistCutOff = 0.

verbose

print progress on screen, default = TRUE.

Value

snpDiff: returns a data.frame with difference information: chromosome, position, reference base, base counts (A,C,G,T,N) for the two bam files, p.value (Fisher's exact test of these base counts) and d.value (Euclidean distance of these base counts)

Author(s)

Xiaobin Xing, <email:[email protected]>

References

Morgan M, Pages H, Obenchain V and Hayden N. Rsamtools: Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import.

Examples

bam1 <- system.file(package='SICtools','extdata','example1.bam')
bam2 <- system.file(package='SICtools','extdata','example2.bam')
refFsa <- system.file(package='SICtools','extdata','example.ref.fasta')

snpDiff(bam1,bam2,refFsa,'chr04',962501,1026983,pValueCutOff=1,baseDistCutOff=0)
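
The two statistics named in the Description can be illustrated on a toy base-count table with base R alone. This is a sketch under assumptions, not the package's internal code: the counts are invented, and normalizing counts to per-sample proportions before calling dist() is an assumption that the documentation does not confirm.

```r
# Toy base counts (A, C, G, T) at one position; values are invented.
counts1 <- c(A = 30, C = 0, G = 0, T = 0)    # first bam: all reads support A
counts2 <- c(A = 14, C = 0, G = 16, T = 0)   # second bam: mix of A and G

countMat <- rbind(bam1 = counts1, bam2 = counts2)

# Drop bases absent from both samples; fisher.test needs at least 2 columns,
# so a position where both samples show a single identical base cannot be tested.
countMat <- countMat[, colSums(countMat) > 0, drop = FALSE]

# Fisher's exact test on the 2 x k count table.
pValue <- fisher.test(countMat)$p.value

# Euclidean distance between per-sample base proportions
# (the normalization step is an assumption of this sketch).
propMat <- countMat / rowSums(countMat)
dValue <- as.numeric(dist(propMat, method = 'euclidean'))

c(p.value = pValue, d.value = dValue)
```

Working on proportions keeps the distance independent of sequencing depth, which is one plausible reason for a default cutoff as small as 0.1.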