Package 'qsea'

Title:	IP-seq data analysis and vizualization
Description:	qsea (quantitative sequencing enrichment analysis) was developed as the successor of the MEDIPS package for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, qsea provides several functionalities for the analysis of other kinds of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential enrichment between groups of samples.
Authors:	Matthias Lienhard [aut, cre] , Lukas Chavez [aut] , Ralf Herwig [aut]
Maintainer:	Matthias Lienhard <[email protected]>
License:	GPL-2
Version:	1.33.0
Built:	2025-01-29 06:43:18 UTC
Source:	https://github.com/bioc/qsea

Help Index

QSEA: Quantitative sequencing enrichment analysis and visualization
estimate CNV information and add to qseaSet object
fit GLMs to reduced model and test for significance
Import sequencing data
Enrichment analysis
Estimate effective library size
Extends an exisiting qseaSet by new samples
Estimate background reads
Infer sequence pattern density values and add to qseaSet object
Add sequence preference to qseaSet object
Prepares a qseaSet Object
Fit GLM for each window
Simulation of MeDIP seq QSEA set
Principle Component Analysis (PCA) in QSea
Finds Significant Regions
Create a Results Table
Definition of normalization procedure
Plots a Heatmap-like Overview of the CNVs
Plots a genome-browser-like image of a region
Plotting functions for enrichment profiles
Plots for Principle Component Analysis (PCA) in QSEA
qseaGLM class and its methods
qseaPCA class and its methods
qseaSet class and its methods
Counts the Windows in Regions of Interest

QSEA: Quantitative sequencing enrichment analysis and visualization

Description

QSEA (quantitative sequencing enrichment analysis) was developed as the successor of the MEDIPS package for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, qsea provides functionality for the analysis of other kinds of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential enrichment between groups of samples.

Author(s)

Matthias Lienhard, Lukas Chavez and Ralf Herwig

Maintainer: Matthias Lienhard <[email protected]>

References

Lienhard M, Grimm C, Morkel M, Herwig R, Chavez L., (Bioinformatics, 2014): MEDIPS: genome-wide differential coverage analysis of sequencing data derived from DNA enrichment experiments.

estimate CNV information and add to qseaSet object

Description

This function adds information on Copy Number Variation (CNV) to the qseaSet object, which is used for normalization. Sample wise CNV information can either be provided, or estimated from input or enrichment sequencing data, by incorporating functions of the HMMcopy package.

Usage

addCNV(qs,file_name, window_size=1000000, paired=FALSE, fragment_length,cnv, 
mu=log2(c(1/2, 2/3, 1, 3/2,2,3)), normal_idx, plot_dir, MeDIP=FALSE, 
parallel=FALSE)
addCNV(qs,file_name, window_size=1000000, paired=FALSE, fragment_length,cnv, 
mu=log2(c(1/2, 2/3, 1, 3/2,2,3)), normal_idx, plot_dir, MeDIP=FALSE, 
parallel=FALSE)

Arguments

`qs`	the qseaSet object
`cnv`	pre-computed CNV information for each sample. If provided, the following parameters are ignored
`file_name`	column name of the sample table for the sequencing files, from which CNV information are computed
`window_size`	window size for CNV analysis
`paired`	are files in file_name column paired end
`fragment_length`	for single end sequencing, provide the average fragment length
`mu`	a priori CNV levels of different states, parameter passed to HMMcopy
`normal_idx`	index of samples which are assumed to be CNV free. The median of these samples serves as "normal" CNV reference level, and CNV are computed relative to this reference level. By default, QSEA looks for samples with "normal" or "control" in its name.
`plot_dir`	If provided, detail CNV plots for each chromosome and each sample are created in the provided directory
`MeDIP`	If set TRUE, QSEA assumes that provided files are methylation enriched sequencing data. In this case, only fragments without CpG dinucleotides are considered. This option allows QSEA to infer CNV information from MeDIP or MDB seq experiments directly
`parallel`	Switch for parallel computing, using BiocParallel

Value

The qseaSet object, extended by the CNV information

Author(s)

Mathias Lienhard

Examples

library("BSgenome.Hsapiens.UCSC.hg19")

bam_hESCs_1 = system.file("extdata", 
    "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")
bam_hESCs_2 = system.file("extdata", 
    "hESCs.MeDIP.Rep2.chr22.bam", package="MEDIPSData")
sample_table=data.frame(sample_name=paste0("hESCs_", 1:2), 
    file_name=c(bam_hESCs_1,bam_hESCs_2), 
    group=rep("hESC",2), stringsAsFactors=FALSE)
qseaSet=createQseaSet(sampleTable=sample_table, 
        BSgenome="BSgenome.Hsapiens.UCSC.hg19", 
        chr.select="chr22", 
        window_size=500)

#this is an example for computing CNVs from MeDIP data. A very limited example
#however, since the samples do not contain CNVs.
qseaSet=addCNV(qseaSet, fragment_length=300, file_name="file_name", MeDIP=TRUE,
window_size=1000000)


library("BSgenome.Hsapiens.UCSC.hg19")

bam_hESCs_1 = system.file("extdata", 
    "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")
bam_hESCs_2 = system.file("extdata", 
    "hESCs.MeDIP.Rep2.chr22.bam", package="MEDIPSData")
sample_table=data.frame(sample_name=paste0("hESCs_", 1:2), 
    file_name=c(bam_hESCs_1,bam_hESCs_2), 
    group=rep("hESC",2), stringsAsFactors=FALSE)
qseaSet=createQseaSet(sampleTable=sample_table, 
        BSgenome="BSgenome.Hsapiens.UCSC.hg19", 
        chr.select="chr22", 
        window_size=500)

#this is an example for computing CNVs from MeDIP data. A very limited example
#however, since the samples do not contain CNVs.
qseaSet=addCNV(qseaSet, fragment_length=300, file_name="file_name", MeDIP=TRUE,
window_size=1000000)

fit GLMs to reduced model and test for significance

Description

This function fits negative binomial GLMs to reduced models defined either by the "contrast" parameter, or by one or several model coefficients (specified by "coef" parameter) set to zero. Subsequently, a likelihood ratio test is applied, to identify windows significantly dependent on the tested coefficient.

Usage

addContrast(qs,glm,contrast,coef,name,verbose=TRUE, nChunks = NULL, 
    parallel = FALSE )
addContrast(qs,glm,contrast,coef,name,verbose=TRUE, nChunks = NULL, 
    parallel = FALSE )

Arguments

`qs`	a qseaSet object
`glm`	a qseaGLM object
`contrast`	numeric vector specifying a contrast of the model coefficients. This contrast can for example be defined using limma::makeContrasts()
`coef`	alternatively defines the contrast by coefficient(s) of the model tested to be equal to zero.
`name`	short descriptive name for the contrast (as "TvsN"), used for examples in columns of result tables
`verbose`	more messages that document the process
`nChunks`	fit GLMs in multiple chunks
`parallel`	use multicore processing

Value

This function returns the qseaGLM object, extended by the fitted coefficients of the reduced GLMs, as well as the test statistics. Note that one qseaGLM object can contain several contrasts.

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet()
design=model.matrix(~group, getSampleTable(qs))
TvN_glm=fitNBglm(qs, design, norm_method="beta")
TvN_glm=addContrast(qs,TvN_glm, coef=2, name="TvN")
qs=getExampleQseaSet()
design=model.matrix(~group, getSampleTable(qs))
TvN_glm=fitNBglm(qs, design, norm_method="beta")
TvN_glm=addContrast(qs,TvN_glm, coef=2, name="TvN")

Import sequencing data

Description

This function imports the alignment files (in sam/bam format) and counts the reads per genomic window or directly imports coverage files (in wiggle/bigwiggle format)

Usage

addCoverage(qs, fragment_length, uniquePos=TRUE, minMapQual=1, paired=FALSE, 
parallel=FALSE)
addCoverage(qs, fragment_length, uniquePos=TRUE, minMapQual=1, paired=FALSE, 
parallel=FALSE)

Arguments

`qs`	qseaSet object, e.g. produced by the createQseaSet() function
`fragment_length`	For single end data, provide the expected fragment length
`paired`	If set to TRUE, data is considered to be paired end sequencing, and the actual fragments size is used.
`uniquePos`	If set to TRUE, fragments with same position and orientation are considered to be PCR duplicates and replaced by one representative.
`minMapQual`	The minimal mapping quality for reads to be considered. Note that the definition of mapping quality depends on the alignment tool.
`parallel`	Switch for parallel computing, using BiocParallel

Details

The coverage is imported from the files specified in the file_name column of the sample table, provided for the createQseaSet() function. In case of alignment files, the reads are counted for the window at the center of the sequencing fragment. For single end data, Filetypes is detected automatically from the file suffix.

Value

The function returns the qseaSet object, extended by the number of reads per window for all samples

Author(s)

Mathias Lienhard

Examples

library("BSgenome.Hsapiens.UCSC.hg19")

bam_hESCs_1 = system.file("extdata", 
    "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")
bam_hESCs_2 = system.file("extdata", 
    "hESCs.MeDIP.Rep2.chr22.bam", package="MEDIPSData")
sample_table=data.frame(sample_name=paste0("hESCs_", 1:2), 
    file_name=c(bam_hESCs_1,bam_hESCs_2), 
    group=rep("hESC",2), stringsAsFactors=FALSE)
qseaSet=createQseaSet(sampleTable=sample_table, 
        BSgenome="BSgenome.Hsapiens.UCSC.hg19", 
        chr.select="chr22", 
        window_size=500)
qseaSet=addCoverage(qseaSet, fragment_length=300)
library("BSgenome.Hsapiens.UCSC.hg19")

bam_hESCs_1 = system.file("extdata", 
    "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")
bam_hESCs_2 = system.file("extdata", 
    "hESCs.MeDIP.Rep2.chr22.bam", package="MEDIPSData")
sample_table=data.frame(sample_name=paste0("hESCs_", 1:2), 
    file_name=c(bam_hESCs_1,bam_hESCs_2), 
    group=rep("hESC",2), stringsAsFactors=FALSE)
qseaSet=createQseaSet(sampleTable=sample_table, 
        BSgenome="BSgenome.Hsapiens.UCSC.hg19", 
        chr.select="chr22", 
        window_size=500)
qseaSet=addCoverage(qseaSet, fragment_length=300)

Enrichment analysis

Description

This function analyses the dependency of enrichment on a sequence pattern, based on a subset of windows for which the signal is known.

Usage

addEnrichmentParameters(qs, enrichmentPattern, signal, windowIdx, 
min_wd=5,bins=seq(.5,40.5,1))
addEnrichmentParameters(qs, enrichmentPattern, signal, windowIdx, 
min_wd=5,bins=seq(.5,40.5,1))

Arguments

`qs`	The qseaSet object
`enrichmentPattern`	The name of the pattern, on which the enrichment depends on (usually CpG for methylation analysis). This name must correspond to the name specified in addPatternDensity()
`windowIdx`	vector of window indices, for which "true" values are known (or can be estimated)
`signal`	Matrix containing the known (or estimated) values for all samples and all specified windows, as a numeric matrix. These values are expected to be between 0 and 1.
`bins`	For the enrichment analysis, windows are binned according to pattern density. This parameter specifies the bins.
`min_wd`	minimal number of windows per bin to be considered

Value

The function returns the qseaSet object, extended by the parameters of the enrichment profiles for all samples

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet(enrichmentAnalysis=FALSE)
#this procedure assumes that regions with low CpG density is 80% methylated
#on average, and regions within CpG islands are 25% methylated on average. 
wd=which(getRegions(qs)$CpG_density>1 &
    getRegions(qs)$CpG_density<15)
signal=(15-getRegions(qs)$CpG_density[wd])*.55/15+.25
signal=matrix(signal,nrow=length(signal),ncol=length(getSampleNames(qs)))
qs=addEnrichmentParameters(qs, enrichmentPattern="CpG", 
    windowIdx=wd, signal=signal)
qs=getExampleQseaSet(enrichmentAnalysis=FALSE)
#this procedure assumes that regions with low CpG density is 80% methylated
#on average, and regions within CpG islands are 25% methylated on average. 
wd=which(getRegions(qs)$CpG_density>1 &
    getRegions(qs)$CpG_density<15)
signal=(15-getRegions(qs)$CpG_density[wd])*.55/15+.25
signal=matrix(signal,nrow=length(signal),ncol=length(getSampleNames(qs)))
qs=addEnrichmentParameters(qs, enrichmentPattern="CpG", 
    windowIdx=wd, signal=signal)

Estimate effective library size

Description

Normalization factors for effective library size are computed using the trimmed mean of m-values approach (TMM).

Usage

addLibraryFactors(qs, factors,...)
addLibraryFactors(qs, factors,...)

Arguments

`qs`	The qseaSet object
`factors`	In case normalization factors have been pre-computed by the user, they can be passed with this parameter. In this case QSEA adds this factors to the qseaSet object and does not compute normalization factors.
`...`	Further parameters used for the TMM normalization (see details)

Details

The user can specify the TMM normalization by setting the following additional parameters, which are passed to the internal functions. \trimA [default: c(.5,.99)] lower and upper quantiles for trimming of A values \trimM [default: c(.1,.9)] lower and upper quantiles for trimming of M values \doWeighting [default: TRUE] computes a weighted TMM \ref [default: 1] the index of the reference sample \plot [default: FALSE] if set to TRUE, MvsA plots depicting the TMM normalization are created. \nReg [default: 500000] Number of regions to be analyzed for normalization. Regions are drawn uniformly over the whole genome.

Value

This function returns the qseaSet object, containing effective library size normalization factors.

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet(expSamplingDepth=500*10^(1:5), repl=5)
#in this case, the first sample has only view reads, so it is important to set
#the reference sample
qs=addLibraryFactors(qs, plot=TRUE, ref="Sim5N")

qs=getExampleQseaSet(expSamplingDepth=500*10^(1:5), repl=5)
#in this case, the first sample has only view reads, so it is important to set
#the reference sample
qs=addLibraryFactors(qs, plot=TRUE, ref="Sim5N")

Extends an exisiting qseaSet by new samples

Description

This function allows the qseaSet to be extended by new samples, provided in the sample table.

Usage

addNewSamples(qs, sampleTable, force=FALSE, parallel=FALSE)
addNewSamples(qs, sampleTable, force=FALSE, parallel=FALSE)

Arguments

`qs`	The qseaSet object to be extended
`sampleTable`	data.frame, describing the samples. Must be in same format as getSampleTable(qs)
`force`	force adding of new samples, even if existing CNV or enrichment information requires recomputation
`parallel`	parallel processing of alignment files

Value

An object of class qseaSet, including the new samples.

Author(s)

Mathias Lienhard

Examples

library("BSgenome.Hsapiens.UCSC.hg19")
data(samplesNSCLC, package="MEDIPSData")
path=system.file("extdata", package="MEDIPSData")
samples_NSCLC$file_name=paste0(path,"/",samples_NSCLC$file_name )
originalQseaSet=createQseaSet(sampleTable=samples_NSCLC[1:4,], 
        BSgenome="BSgenome.Hsapiens.UCSC.hg19", chr.select="chr22", 
        window_size=500)
originalQseaSet=addCoverage(originalQseaSet, uniquePos=TRUE, paired=TRUE)
qseaSet=addNewSamples(originalQseaSet, samples_NSCLC)
library("BSgenome.Hsapiens.UCSC.hg19")
data(samplesNSCLC, package="MEDIPSData")
path=system.file("extdata", package="MEDIPSData")
samples_NSCLC$file_name=paste0(path,"/",samples_NSCLC$file_name )
originalQseaSet=createQseaSet(sampleTable=samples_NSCLC[1:4,], 
        BSgenome="BSgenome.Hsapiens.UCSC.hg19", chr.select="chr22", 
        window_size=500)
originalQseaSet=addCoverage(originalQseaSet, uniquePos=TRUE, paired=TRUE)
qseaSet=addNewSamples(originalQseaSet, samples_NSCLC)

Estimate background reads

Description

This function sets the background reads offset parameters for the qseaSet object, either by estimating offset reads, or by setting user provided values.

Usage

addOffset(qs,enrichmentPattern , maxPatternDensity=0.01,offset)

addOffset(qs,enrichmentPattern , maxPatternDensity=0.01,offset)

Arguments

`qs`	the qseaSet object
`enrichmentPattern`	name of the enrichment pattern, as specified in addPatternDensity
`maxPatternDensity`	Maximum pattern density, at which the window is treated as pattern free.
`offset`	This parameter alternatively allows to specify the amount of background reads for each sample manually. In this case, please provide average background reads for CNV free windows in rpkm scale.

Value

The function returns the qseaSet object, extended by the estimated amount of background reads for all samples

Author(s)

Mathias Lienhard

Examples

#simulate data with varing background fractions
qs=getExampleQseaSet(expSamplingDepth=5e4, repl=5,bgfraction=seq(0,.8,.2))
#estimate the background in simulated data
addOffset(qs, "CpG", maxPatternDensity=0.7)
#return the background on different scales
getOffset(qs, scale="fraction") #estimated fraction of total reads
getOffset(qs, scale="rpw") #average background reads per CNV free window

#simulate data with varing background fractions
qs=getExampleQseaSet(expSamplingDepth=5e4, repl=5,bgfraction=seq(0,.8,.2))
#estimate the background in simulated data
addOffset(qs, "CpG", maxPatternDensity=0.7)
#return the background on different scales
getOffset(qs, scale="fraction") #estimated fraction of total reads
getOffset(qs, scale="rpw") #average background reads per CNV free window

Infer sequence pattern density values and add to qseaSet object

Description

This function estimates the average occurrences of a sequence pattern (such as CpG dinucleotides) within the overlapping sequencing fragments for each genomic window

Usage

addPatternDensity(qs, pattern,name, fragment_length, fragment_sd,
patternDensity, fixed=TRUE, masks=c("AGAPS","AMB", "RM", "TRF")[1:2])
addPatternDensity(qs, pattern,name, fragment_length, fragment_sd,
patternDensity, fixed=TRUE, masks=c("AGAPS","AMB", "RM", "TRF")[1:2])

Arguments

`qs`	a qseaSet object
`pattern`	actual sequence of the pattern (e.g. “CG”)

name

a name for the sequence pattern(e.g. “CpG”)

`fragment_length`	the average fragment length to be assumed for pattern density estimation. If omitted, this parameter is taken from the qseaSet object.
`fragment_sd`	the standard deviation of fragment length to be assumed for pattern density estimation. If omitted, this parameter is taken from the qseaSet object.
`patternDensity`	this parameter alternatively allows to specify the pattern density manually. In this case, please provide a numerical vector, containing a value (greater than 0) for each genomic window.
`fixed`	if FALSE, an IUPAC ambiguity code in the pattern can match any letter in the reference genome that is associated with the code, and vice versa.
`masks`	names of the BSgenome masks to be active.

Value

The function returns the qseaSet object, extended by the pattern density for all genomic windows

Author(s)

Mathias Lienhard

Examples

library("BSgenome.Hsapiens.UCSC.hg19")
bam_hESCs_1 = system.file("extdata", 
    "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")
bam_hESCs_2 = system.file("extdata", 
    "hESCs.MeDIP.Rep2.chr22.bam", package="MEDIPSData")
sample_table=data.frame(sample_name=paste0("hESCs_", 1:2), 
    file_name=c(bam_hESCs_1,bam_hESCs_2), 
    group=rep("hESC",2), stringsAsFactors=FALSE)
qseaSet=createQseaSet(sampleTable=sample_table, 
        BSgenome="BSgenome.Hsapiens.UCSC.hg19", 
        chr.select="chr22", 
        window_size=500)
qseaSet=addPatternDensity(qseaSet, "CG", name="CpG", fragment_length=300)


library("BSgenome.Hsapiens.UCSC.hg19")
bam_hESCs_1 = system.file("extdata", 
    "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")
bam_hESCs_2 = system.file("extdata", 
    "hESCs.MeDIP.Rep2.chr22.bam", package="MEDIPSData")
sample_table=data.frame(sample_name=paste0("hESCs_", 1:2), 
    file_name=c(bam_hESCs_1,bam_hESCs_2), 
    group=rep("hESC",2), stringsAsFactors=FALSE)
qseaSet=createQseaSet(sampleTable=sample_table, 
        BSgenome="BSgenome.Hsapiens.UCSC.hg19", 
        chr.select="chr22", 
        window_size=500)
qseaSet=addPatternDensity(qseaSet, "CG", name="CpG", fragment_length=300)

Add sequence preference to qseaSet object

Description

This function allows to add window specific sequencing preference, that can be used by the normalization procedure. This preference can be defined by the user, or estimated from sequencing of input libraries.

Usage

addSeqPref(qs, seqPref,file_name, fragment_length, paired=FALSE, 
uniquePos=TRUE, alpha=0.05, pseudocount=5, cut=3)
addSeqPref(qs, seqPref,file_name, fragment_length, paired=FALSE, 
uniquePos=TRUE, alpha=0.05, pseudocount=5, cut=3)

Arguments

`qs`	a qseaSet object
`seqPref`	A vector with predefined sequencing preference for each window. Values are interpreted as log2 ratios relative to normal/average sequencing preference.
`file_name`	alternatively, the sequencing preference can be estimated from input sequencing. In this case, provide the column of the sample table that contains the file names for input sequencing alignment or coverage files.
`fragment_length`	for single end data, provide the expected fragment length
`paired`	if set to TRUE, data is considered to be paired end sequencing, and the actual fragments size is used.
`uniquePos`	if set to TRUE, fragments with same position and orientation are considered to be PCR duplicates and replaced by one representative.
`alpha`	currently ignored
`pseudocount`	this value is added to the coverage of each window, to smooth the estimates.
`cut`	absolute log2 value threshold for windows to be excluded from later analysis due to extreme preference values.

Value

the function returns the qseaSet object, extended by the sequencing preference for all genomic windows.

Author(s)

Mathias Lienhard

Prepares a qseaSet Object

Description

This method prepares the qseaSet object, and prepares genome wide bins. Coverage and normalization parameters are added in succeeding functions.

Usage

createQseaSet(sampleTable,BSgenome, chr.select,Regions, window_size=250 )
createQseaSet(sampleTable,BSgenome, chr.select,Regions, window_size=250 )

Arguments

`BSgenome`	name of BSgenome package
`Regions`	GRanges object. If specified, only selected regions are processed
`chr.select`	If specified, only selected chromosomes are processed
`sampleTable`	data.frame, containing at least 3 columns: the sample names (sample_name), paths to alignment or coverage file in sam/bam/wiggle/bigwig format (file_name), and one or more test condition(s) (group). Optionally it may contain a column with alignment or coverage files for CNV analysis, and further information in the samples that are of interest for the analysis.
`window_size`	size for the genome wide bins in base pairs

Value

An object of class qseaSet, containing the sample and genome information.

Author(s)

Mathias Lienhard

Examples

library("BSgenome.Hsapiens.UCSC.hg19")
bam_hESCs_1 = system.file("extdata", "hESCs.MeDIP.Rep1.chr22.bam", 
    package="MEDIPSData")
bam_hESCs_2 = system.file("extdata", "hESCs.MeDIP.Rep2.chr22.bam", 
    package="MEDIPSData")
samplesTable=data.frame(sample_name=paste0("hESCs_", 1:2), 
file_name=c(bam_hESCs_1,bam_hESCs_2), 
group=rep("hESC",2),stringsAsFactors=FALSE)
qs=createQseaSet(samplesTable, BSgenome="BSgenome.Hsapiens.UCSC.hg19", 
    chr.select="chr22", window_size=500)

library("BSgenome.Hsapiens.UCSC.hg19")
bam_hESCs_1 = system.file("extdata", "hESCs.MeDIP.Rep1.chr22.bam", 
    package="MEDIPSData")
bam_hESCs_2 = system.file("extdata", "hESCs.MeDIP.Rep2.chr22.bam", 
    package="MEDIPSData")
samplesTable=data.frame(sample_name=paste0("hESCs_", 1:2), 
file_name=c(bam_hESCs_1,bam_hESCs_2), 
group=rep("hESC",2),stringsAsFactors=FALSE)
qs=createQseaSet(samplesTable, BSgenome="BSgenome.Hsapiens.UCSC.hg19", 
    chr.select="chr22", window_size=500)

Fit GLM for each window

Description

This function fits a negative binomial GLM for each genomic window, according to the design matrix.

Usage

fitNBglm(qs,design,link="log",keep, disp_method="region_wise",
norm_method="rpkm",init_disp=0.5 ,verbose=TRUE, minRowSum=10, pseudocount=1,
    disp_iter = 3, nChunks = NULL, parallel = FALSE)
fitNBglm(qs,design,link="log",keep, disp_method="region_wise",
norm_method="rpkm",init_disp=0.5 ,verbose=TRUE, minRowSum=10, pseudocount=1,
    disp_iter = 3, nChunks = NULL, parallel = FALSE)

Arguments

`qs`	a qseaSet object
`design`	the design matrix for the GLMs
`link`	name of the link function. Currently, only the canonical dQuotelog link function is implemented.
`keep`	indices of windows to be included in the analysis.
`disp_method`	method to estimate dispersion parameters. Allowed values are dQuoteregion_wise for independent window wise estimates, dQuotecommon for a single estimate for all windows, dQuotecutAtQuantiles for window wise estimates trimmed at the 25% and 75% quantiles, or dQuoteinitial for using the dispersion parameters provided with the init_disp parameter.
`norm_method`	normalization method, as defined by normMethod() function
`init_disp`	initial estimate for dispersion parameter. Either a single parameter for all regions, or a vector with window wise parameters.
`verbose`	more messages that document the process
`minRowSum`	filter out windows with less than minRowSum reads over all samples
`pseudocount`	this value is added to the read counts
`disp_iter`	number of iterations for dispersion estimation
`nChunks`	fit GLMs in multiple chunks
`parallel`	use multicore processing

Value

This function returns a qseaGLM object, containing the fitted coefficients of the GLMs.

Author(s)

Mathias Lienhard

Examples

#tbd
qs=getExampleQseaSet()
design=model.matrix(~group, getSampleTable(qs))
TvN_glm=fitNBglm(qs, design, norm_method="beta")
#tbd
qs=getExampleQseaSet()
design=model.matrix(~group, getSampleTable(qs))
TvN_glm=fitNBglm(qs, design, norm_method="beta")

Simulation of MeDIP seq QSEA set

Description

Creates a example qseaSet object by sampling reads for simulated Tumor and Normal samples. Number of replicates, sequencing depth and fraction of background reads can be specified.

Usage

getExampleQseaSet(CpG=TRUE,CNV=TRUE,repl=2, 
    doSampling=TRUE,enrichmentAnalysis=TRUE, expSamplingDepth=50000, 
    bgfraction=.1)
getExampleQseaSet(CpG=TRUE,CNV=TRUE,repl=2, 
    doSampling=TRUE,enrichmentAnalysis=TRUE, expSamplingDepth=50000, 
    bgfraction=.1)

Arguments

`CpG`	if TRUE CpG density is added to the object
`CNV`	if TRUE CNV are emulated for the tumor samples
`repl`	number of replicates for tumor and normal samples
`doSampling`	if TRUE, read counts are sampled and added to the object
`enrichmentAnalysis`	if TRUE, parameters for enrichment profiles are added
`expSamplingDepth`	expected value of sequencing depth
`bgfraction`	fraction of background reads

Details

The function creates an example and test qseaSet object for an toy example genome (one chromosome, 50kb) with 500 bases windows.

Value

The qseaSet object

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet()
qs=getExampleQseaSet()

Principle Component Analysis (PCA) in QSea

Description

The getPCA() function performs a Principle Component Analysis (PCA) of the coverage profiles from a qsea object for exploratory data analysis.

Usage

getPCA(qs, chr= getChrNames(qs),ROIs, minRowSum=20, keep ,
    norm_method=normMethod(logRPM =
    c("log", "library_size", "cnv", "preference", "psC10")), topVar=1000,
    samples=getSampleNames(qs), minEnrichment = 0)
getPCA(qs, chr= getChrNames(qs),ROIs, minRowSum=20, keep ,
    norm_method=normMethod(logRPM =
    c("log", "library_size", "cnv", "preference", "psC10")), topVar=1000,
    samples=getSampleNames(qs), minEnrichment = 0)

Arguments

`qs`	DIPSset (mandatory)
`chr`	chromosomes to consider
`ROIs`	If specified, only windows overlapping ROIs are considered.
`minRowSum`	minimal number of total read counts per window over all samples
`keep`	windows to consider
`norm_method`	name of predefined normalization (e.g. "beta"), or user defined normalization by calling normMethod() function
`topVar`	only the top variable windows are considered
`samples`	names of samples to be considered
`minEnrichment`	for transformation to absolute methylation level, you can specify the minimal number of expected reads for a fully methylated window. This avoids inaccurate estimates, due to low enrichment.

Details

The principle component analysis is calculated using the singular value decomposition (svd).

Value

getPCA() returns a list object, containing the svd and information on the selected windows.

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet( repl=5)
pca=getPCA(qs, norm_method="beta")
colors=c(rep("red", 5), rep("green", 5))
plotPCA(pca, bgColor=colors)
#plotPCAfactors is more interesting, if ROIs have been specified in getPCA
plotPCAfactors(pca)

qs=getExampleQseaSet( repl=5)
pca=getPCA(qs, norm_method="beta")
colors=c(rep("red", 5), rep("green", 5))
plotPCA(pca, bgColor=colors)
#plotPCAfactors is more interesting, if ROIs have been specified in getPCA
plotPCAfactors(pca)

Finds Significant Regions

Description

This function looks for regions, where the test statistic is below the defined thresholds

Usage

isSignificant(glm, contrast = NULL, fdr_th = NULL, pval_th = NULL,
            absLogFC_th = NULL, direction = "both")
isSignificant(glm, contrast = NULL, fdr_th = NULL, pval_th = NULL,
            absLogFC_th = NULL, direction = "both")

Arguments

`glm`	A qseaGLM object (mandatory)
`contrast`	name of contrast to be used
`fdr_th`	a threshold for the false discovery rate
`pval_th`	a p value threshold
`absLogFC_th`	the threshold for the absolute value of logFC
`direction`	direction of change: either "both", "loss", or "gain"

Details

If a threshold is NULL, it is ignored.

For the direction parameter, the following synonyms are valid:

"loss" == "less" == "hypo"

"gain" == "more" == "hyper"

Value

A vector with indices of significant windows, which can be passed to keep parameter of makeTable() function

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet()
design=model.matrix(~group, getSampleTable(qs))
TvN_glm=fitNBglm(qs, design, norm_method="beta")
TvN_glm=addContrast(qs,TvN_glm, coef=2, name="TvN")
sig=isSignificant(TvN_glm, fdr_th=0.01)
qs=getExampleQseaSet()
design=model.matrix(~group, getSampleTable(qs))
TvN_glm=fitNBglm(qs, design, norm_method="beta")
TvN_glm=addContrast(qs,TvN_glm, coef=2, name="TvN")
sig=isSignificant(TvN_glm, fdr_th=0.01)

Create a Results Table

Description

This function creates a table from the qsea objects qseaSet and qseaTvN_glm

Usage

makeTable(qs,glm,norm_methods="counts",samples,groupMeans, keep, ROIs,
    annotation, minPvalSummarize, CNV=FALSE, verbose=TRUE, minEnrichment=3,
    chunksize=1e5)
makeTable(qs,glm,norm_methods="counts",samples,groupMeans, keep, ROIs,
    annotation, minPvalSummarize, CNV=FALSE, verbose=TRUE, minEnrichment=3,
    chunksize=1e5)

Arguments

`qs`	a qseaSet object (mandatory)
`glm`	a list of one or more qseaGLM objects (optional)
`norm_methods`	ether a character vector of pre-defined normalization combinations, or a list defining normalization combinations. This affects both individual and mean values.
`samples`	The indices of the samples for which individual values are to be written out in the specified order
`groupMeans`	a named list of indices vectors, defining groups for which mean values are to be written out
`keep`	a vector of indices of the windows that are considered (as created by isSignificant)
`ROIs`	A GRanges object, containing regions of interest (ROIs). Only windows overlapping ROIs are considered.
`annotation`	a named list of GRange objects, containing annotations (e.g. genes, CpG islands, ...) that are added to the table.
`minPvalSummarize`	If ROIs are given, you can specify a QseaTvN_glm object. For each ROI the window with the most significant differential coverage is written out
`CNV`	If set TRUE, the CNV logFC for the samples specified by samples are written out.
`verbose`	verbosity level
`minEnrichment`	for transformation to absolute methylation level, you can specify the minimal number of expected reads for a fully methylated window. This avoids inaccurate estimates, due to low enrichment.
`chunksize`	For efficient memory usage, the table is built up in chunks. With this parameter, the maximum number of windows processed in one chunk is specified.

Details

Note that, if overlapping ROIs are specified, windows might emerge in the table several times.

Value

A result table containing the specified normalized values for the selected windows and samples/groups

Author(s)

Mathias Lienhard

Examples

#create example set
qs=getExampleQseaSet()
design=model.matrix(~group, getSampleTable(qs))
TvN_glm=fitNBglm(qs, design, norm_method="beta")
TvN_glm=addContrast(qs,TvN_glm, coef=2, name="TvN")
sig=isSignificant(TvN_glm, fdr_th=0.01)

##Table containing all significant windows
tab1=makeTable(qs=qs, glm=TvN_glm, 
    keep=sig, samples=getSampleNames(qs))
##additional CNV logFC for the selected samples
tab2=makeTable(qs=qs, glm=TvN_glm, 
    keep=sig, samples=getSampleNames(qs), CNV=TRUE)
##explicit selection of normalization: 
##counts (i.e. no normalization, only counts)
tab3=makeTable(qs=qs, glm=TvN_glm, keep=sig,
    samples=getSampleNames(qs), norm_method="counts")

##counts AND %methylation values for individual samples and group means
tab4=makeTable(qs=qs, glm=TvN_glm, keep=sig,
    samples=getSampleNames(qs), groupMeans=getSampleGroups(qs),
    norm_method=c("counts", "beta"))

#create example set
qs=getExampleQseaSet()
design=model.matrix(~group, getSampleTable(qs))
TvN_glm=fitNBglm(qs, design, norm_method="beta")
TvN_glm=addContrast(qs,TvN_glm, coef=2, name="TvN")
sig=isSignificant(TvN_glm, fdr_th=0.01)

##Table containing all significant windows
tab1=makeTable(qs=qs, glm=TvN_glm, 
    keep=sig, samples=getSampleNames(qs))
##additional CNV logFC for the selected samples
tab2=makeTable(qs=qs, glm=TvN_glm, 
    keep=sig, samples=getSampleNames(qs), CNV=TRUE)
##explicit selection of normalization: 
##counts (i.e. no normalization, only counts)
tab3=makeTable(qs=qs, glm=TvN_glm, keep=sig,
    samples=getSampleNames(qs), norm_method="counts")

##counts AND %methylation values for individual samples and group means
tab4=makeTable(qs=qs, glm=TvN_glm, keep=sig,
    samples=getSampleNames(qs), groupMeans=getSampleGroups(qs),
    norm_method=c("counts", "beta"))

Definition of normalization procedure

Description

This function allows to define normalization methods by specifying components.

Usage

normMethod(methods, ...)
normMethod(methods, ...)

Arguments

`methods`	names of predefined normalization methods ( for a list of predefined methods, see details)
`...`	sets of normalization components, that can be combined to user defined normalization methods

Details

Predefined normalization methods:

“counts”: no normalization, simply raw count values

“reads”: same as counts

“rpm”: reads per million mappable reads

“nrpm”: CNV normalized reads per million mappable reads

“beta”: transformation to % methylation, posterior mean point estimator

“logitbeta”: logit transformed beta values

“betaLB”: 2.5 lower bound for the point estimator

“betaUB”: 97.5 upper bound for the point estimator

Allowd components for user defined normalization methods:

“library_size”: scale by effective library size

“region_length”: scale by window size

“preference”: scale by positional sequencing preference

“cnv”: scale by CNV ratio

“enrichment”: use enrichment profiles for transformation to absolute methylation level

“qXY”: quantile estimator for transformation to absolute methylation level. XY must be replaced by the quantile (see example with self defined lower and upper bound)

“offset”: consider background reads

WARNING: not all combinations are allowed (eg qXY requires enrichment) and not all allowed combinations are meaningful. Inexperienced users should stick to predefined normalization methods.

Value

a list object, containing the components for the specified normalization procedure

Author(s)

Mathias Lienhard

Examples

#simply raw counts
nm=normMethod("counts")
#beta-values (% methylation) including lower and upper bounds
nm=normMethod(c("beta", "betaLB", "betaUB"))
#self defined lower and upper bound: 10% and 90% quantile
nm=normMethod("beta", 
betaLB_10=c("enrichment", "cnv", "library_size",
    "region_length", "preference","q10", "offset"),
betaUB_90=c("enrichment", "cnv", "library_size",
    "region_length", "preference","q90", "offset")
)
#simply raw counts
nm=normMethod("counts")
#beta-values (% methylation) including lower and upper bounds
nm=normMethod(c("beta", "betaLB", "betaUB"))
#self defined lower and upper bound: 10% and 90% quantile
nm=normMethod("beta", 
betaLB_10=c("enrichment", "cnv", "library_size",
    "region_length", "preference","q10", "offset"),
betaUB_90=c("enrichment", "cnv", "library_size",
    "region_length", "preference","q90", "offset")
)

Plots a Heatmap-like Overview of the CNVs

Description

This function plots the Copy Number Variations (CNVs) of the samples in a heatmap like representation. Amplified regions are depicted in red, whereas deletions are depicted green, and CNV free regions blue. The samples are ordered by an hierarchical clustering.

Usage

plotCNV(qs, dist = c("euclid", "cor")[1], clust_method = "complete",
            chr = getChrNames(qs), samples =getSampleNames(qs),
            cex = 1, labels = c(TRUE, TRUE,TRUE, TRUE), naColor = "darkgrey",
            indicateLogFC = TRUE )
plotCNV(qs, dist = c("euclid", "cor")[1], clust_method = "complete",
            chr = getChrNames(qs), samples =getSampleNames(qs),
            cex = 1, labels = c(TRUE, TRUE,TRUE, TRUE), naColor = "darkgrey",
            indicateLogFC = TRUE )

Arguments

`qs`	a qseaSet object (mandatory)
`dist`	distance measure for clustering. dQuoteeuclidian or dQuotecorrelation based (1-cor)
`clust_method`	method to be passed to hclust
`chr`	vector of chromosomes to be depicted
`samples`	samples for which CNVs are depicted
`cex`	font size of labels
`labels`	Boolean vector of length four (bottom, left, top, right), specifying the sides of the map to be labeled
`naColor`	Color for regions without CNV information
`indicateLogFC`	indicate the CNV logFC values in the legend

Value

This function returns the pairwise distances of the CNV profiles, on which the clustering is based on.

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet()
plotCNV(qs, labels=c(FALSE, TRUE, TRUE, FALSE))
qs=getExampleQseaSet()
plotCNV(qs, labels=c(FALSE, TRUE, TRUE, FALSE))

Plots a genome-browser-like image of a region

Description

This function plots the normalized coverage of specified samples in a specified region, together with annotations, in a genome-browser-like fashion

Usage

plotCoverage(qs,test_results, chr, start, end, samples,samples2, 
norm_method="nrpkm", yoffset, xlab="Position", 
ylab="MeDIP seq", col="black", main, reorder="non", indicate_reorder=TRUE, 
distfun=dist, clustmethod="complete", scale=TRUE, steps=TRUE, space=0.05, 
baselines=TRUE, scale_val, scale_unit=NULL, logFC_pc=.1, cex=1, smooth_width, 
smooth_function=mean, regions, regions_lwd=1, regions_col, 
regions_offset, regions_names, regions_dash=0.1)
plotCoverage(qs,test_results, chr, start, end, samples,samples2, 
norm_method="nrpkm", yoffset, xlab="Position", 
ylab="MeDIP seq", col="black", main, reorder="non", indicate_reorder=TRUE, 
distfun=dist, clustmethod="complete", scale=TRUE, steps=TRUE, space=0.05, 
baselines=TRUE, scale_val, scale_unit=NULL, logFC_pc=.1, cex=1, smooth_width, 
smooth_function=mean, regions, regions_lwd=1, regions_col, 
regions_offset, regions_names, regions_dash=0.1)

Arguments

`qs`	a qseaSet object
`chr`	the chromosome of the region to be depicted
`start`	the start position of the region to be depicted
`end`	the end position of the region to be depicted
`samples`	the indices of the samples to be depicted
`samples2`	if specified, used to calculated logFC (samples/samples2) profiles, must be of same length as samples
`logFC_pc`	if samples2 is specified and logFC are calculated, this parameter specifies the pseudocount to avoid division by zero
`norm_method`	a vector of normalization methods to be combined
`yoffset`	horizontal offset, used to adjust the space between the profiles
`xlab`	title for the x axis
`ylab`	title for the y axis
`main`	an overall title for the plot
`col`	color vector for the samples (is recycled)
`reorder`	indicate whether, and if yes how, the samples are reordered. Valid values are "non", "clust", "max", "minP", or a genomic position within the range that is depicted
`test_results`	a qseaGLM object, used to find the region with minimal p value (only if reorder="minP")
`indicate_reorder`	indicate the window that has been used for reordering by an arrow.
`distfun`	if reorder="clust": for hierarchical clustering for reordering
`clustmethod`	if reorder="clust": for hierarchical clustering for reordering
`scale`	if set TRUE, print a bar scale
`scale_val`	length of the bar scale
`scale_unit`	unit of the bar scale
`steps`	plot the coverage as step function (steps=TRUE), or as lines
`space`	fraction of the plot set aside for sample names etc.
`baselines`	depict the baselines (zero) of the coverage profiles
`cex`	font size
`smooth_width`	number of windows to be considered for sliding window smoothing
`smooth_function`	function to be applied on the sliding windows for smoothing
`regions`	named list of GenomicRanges objects, containing annotation (eg exons) to be depicted below the coverage profiles
`regions_lwd`	vector of line width for the
`regions_col`	vector of colors for the regions
`regions_offset`	offset value, defining the space between the regions
`regions_names`	vector of column names, that store the names of the regions
`regions_dash`	vector, specifying the length of the end dashes of the regions

Value

list containing a table containing the plotted coverage values, the position that has been used for ordering, and the image coordinates

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet(repl=5)
colors=c(rep("red", 5), rep("green", 5))
plot(1)
plotCoverage(qs,samples=getSampleNames(qs), 
    chr="chr1", start=1960001, end=1970001,col=colors, 
    norm_method="beta", yoffset=1,space=.2, reorder=1964500)
plotCoverage(qs,samples=getSampleNames(qs), 
    chr="chr1", start=1960001, end=1970001,col=colors, 
    norm_method="beta", yoffset=1,space=.2, reorder="clust")

qs=getExampleQseaSet(repl=5)
colors=c(rep("red", 5), rep("green", 5))
plot(1)
plotCoverage(qs,samples=getSampleNames(qs), 
    chr="chr1", start=1960001, end=1970001,col=colors, 
    norm_method="beta", yoffset=1,space=.2, reorder=1964500)
plotCoverage(qs,samples=getSampleNames(qs), 
    chr="chr1", start=1960001, end=1970001,col=colors, 
    norm_method="beta", yoffset=1,space=.2, reorder="clust")

Plotting functions for enrichment profiles

Description

Plots the estimated sequence pattern dependent enrichment profile for one or several samples as a matrix of plots

Usage

plotEnrichmentProfile(qs,sample, sPoints=seq(0,30,1), 
    fitPar=list(lty=2, col="green"),cfPar=list(lty=1), densityPar, meanPar,... )
plotEPmatrix(qs, sa=getSampleNames(qs),nrow=ceiling(sqrt(length(sa))), 
    ncol=ceiling(length(sa)/nrow), ...)
plotEnrichmentProfile(qs,sample, sPoints=seq(0,30,1), 
    fitPar=list(lty=2, col="green"),cfPar=list(lty=1), densityPar, meanPar,... )
plotEPmatrix(qs, sa=getSampleNames(qs),nrow=ceiling(sqrt(length(sa))), 
    ncol=ceiling(length(sa)/nrow), ...)

Arguments

`qs`	The qseaSet object
`sample`	The index of the sample for which the enrichment profile should be depicted
`sPoints`	The values at which the enrichment profile function is evaluated
`fitPar`	List of parameters for depiction of the fitted enrichment profile function (see details)
`cfPar`	List of parameters for depiction of the empirical enrichment profile (see details)
`densityPar`	List of parameters for depiction high density scatterplot of coverage and pattern density (see details)
`meanPar`	List of parameters for depiction of the mean coverage per pattern density bin (see details)
`sa`	vector of samples to be depicted in matrix plot
`nrow`	number of rows in matrix plot
`ncol`	number of columns in matrix plot
`...`	Further graphical parameters may also be supplied

Details

Parameter lists for lines in the plot (e.g. fitPar, cfPar and meanPar) are passed to graphics::lines(), densityPar are passed to graphics::smoothScatter() function.

Value

plotEnrichmentProfile returns the coordinates of the enrichment profile. plotEPmatrix returns enrichment profile coordinates for all depicted samples.

Author(s)

Mathias Lienhard

Examples

#create example object with different sequencing depth
qs=getExampleQseaSet(expSamplingDepth=50*10^(1:4), repl=4)
#enrichment profile for one sample
plotEnrichmentProfile(qs, "Sim4T")
#enrichment profile for all samples
plotEPmatrix(qs)
#create example object with different sequencing depth
qs=getExampleQseaSet(expSamplingDepth=50*10^(1:4), repl=4)
#enrichment profile for one sample
plotEnrichmentProfile(qs, "Sim4T")
#enrichment profile for all samples
plotEPmatrix(qs)

Plots for Principle Component Analysis (PCA) in QSEA

Description

The principle components can be depicted using the plotting methods plotPCA and plotPCAfactors

Usage

## S4 method for signature 'qseaPCA'
plotPCA(object,plotComponents=c(1,2), fgColor="black",
    bgColor = "white", legend, plotLabels=TRUE, radius=5, labelOffset=.5,
    labelPos=1, labelAdj, labelColor="black", cex=1, ...)

## S4 method for signature 'qseaPCA'
plotPCAfactors(object,plotComponents=c(1,2), 
    fgColor="black",bgColor = "white", plotTopLabels=100, labelsOfInterest, 
    radius=1, labelOffset=.5,labelPos=1,labelColor="black", cex=1, ...)
## S4 method for signature 'qseaPCA'
plotPCA(object,plotComponents=c(1,2), fgColor="black",
    bgColor = "white", legend, plotLabels=TRUE, radius=5, labelOffset=.5,
    labelPos=1, labelAdj, labelColor="black", cex=1, ...)

## S4 method for signature 'qseaPCA'
plotPCAfactors(object,plotComponents=c(1,2), 
    fgColor="black",bgColor = "white", plotTopLabels=100, labelsOfInterest, 
    radius=1, labelOffset=.5,labelPos=1,labelColor="black", cex=1, ...)

Arguments

`object`	the qseaPCA object, resulting from the getPCA function
`plotComponents`	vector of the two components of the PCA
`fgColor`	vector of foreground colors for the circles
`bgColor`	vector of background colors for the circles
`legend`	add a legend to the plot
`plotLabels`	if set TRUE, the labels of the samples are written in the plot
`radius`	defines the size of the plotted circles
`labelOffset`	defines the offset of the labels to the circles
`labelPos`	specify position of the labels in the plot (see graphics::text)
`labelAdj`	alternative way to specify position of the labls in the plot (see graphics::text)
`labelColor`	a vector of colors for the labels
`cex`	font size of the labels
`plotTopLabels`	labels of factors with strongest contribution to plotted components are shown
`labelsOfInterest`	vector of factor names that are highlighted and labeled in the plot
`...`	further graphical parameters

Value

The functions return a list with the coordinates of the depicted components

Author(s)

Mathias Lienhard

Examples

 
qs=getExampleQseaSet( repl=5)
pca=getPCA(qs, norm_method="beta")
colors=c(rep("red", 5), rep("green", 5))

plotPCA(pca, bgColor=colors)
#plotPCAfactors is more interesting, if ROIs have been specified in getPCA
plotPCAfactors(pca)

qs=getExampleQseaSet( repl=5)
pca=getPCA(qs, norm_method="beta")
colors=c(rep("red", 5), rep("green", 5))

plotPCA(pca, bgColor=colors)
#plotPCAfactors is more interesting, if ROIs have been specified in getPCA
plotPCAfactors(pca)

qseaGLM class and its methods

Description

The qseaGLM class is used in qsea to store fitted coefficients of the GLM.

Slots

fullModelDesign:: design matrix of full model
fullModel :: list containing parameters and fitted coefficients of full model
parameters:: list of parameters used to create the object
contrast:: list of lists containing parameters and the fitted model coeficients of the reduced models
windows:: vector of window indices, for which GLMs have been fitted

Author(s)

Matthias Lienhard

Examples

showClass("qseaGLM")
showClass("qseaGLM")

qseaPCA class and its methods

Description

The qseaPCA class is used in qsea to store results of the principle component analysis.

Slots

svd:: singular value decomposition
sample_names :: names of the samples
factor_names:: names of the genomic windows involved

Author(s)

Matthias Lienhard

Examples

showClass("qseaPCA")
showClass("qseaPCA")

qseaSet class and its methods

Description

The qseaSet class is used in qsea to store information about the coverage, the dependent organism, the chromosomes included in the input file, the length of the included chromosomes (automatically loaded), the number of regions, and optionally CNV information.

Slots

sampleTable:: Object of class "data.frame": the sample table
count_matrix:: Object of class "matrix": matrix containing the coverage for all samples
zygosity:: Object of class "matrix": matrix containing the zygosity for all chromosomes and all samples
regions:: Object of class "GenomicRanges": the genomic regions for the coverage matrix
parameters:: Object of class "list": the parameter list used to create this object
cnv:: Object of class "GenomicRanges": CNV ranges and logFCs
enrichment :: Object of class "list": parameters of the sequence pattern enrichment analysis
libraries :: Object of class "matrix": parameters of the sequencing libraries

Methods

getSampleTable: signature(object = "qseaSet"): extracts the sample table of a qsea set
getSampleNames: signature(object = "qseaSet"): extracts the sample names of a qsea set
getSampleGroups: signature(object = "qseaSet"): extracts the sample groups of a qsea set
getChrNames: signature(object = "qseaSet"): returns the analysed chromosomes
getCounts: signature(object = "qseaSet"): extracts the count matrix a qsea set
getRegions: signature(object = "qseaSet"): extracts the regions object of a qsea set
getParameters: signature(object = "qseaSet"): extracts the parameter list of a qsea set
getLibSize: signature(object = "qseaSet"): extracts the library size (eg the total number of read counts per sample)
getNormFactors: signature(object = "qseaSet"): extracts the list with the different normalization factors
hasCNV: signature(object = "qseaSet"): TRUE if CNV information is present, FALSE otherwise
getCNV: signature(object = "qseaSet"): extracts the CNV regions and logFCs
getOffset: signature(object = "qseaSet"): extracts offset of rpkm scaled background reads
getWindowSize: signature(object = "qseaSet"): returns the window size of the object
getZyosity: signature(object = "qseaSet"): returns the zygosity matrix of the object
setZygosity: signature(object = "qseaSet", zygosityMatrix): sets the zygosity matrix, and resets CNV

Author(s)

Matthias Lienhard

Examples

showClass("qseaSet")
showClass("qseaSet")

Counts the Windows in Regions of Interest

Description

This function takes a list of window indices and a list of ROIs and counts the number of overlapping windows

Usage

regionStats(qs, subsets = list(covered = which(rowSums(getCounts(qs)) >= 20)),
 ROIs = list(), minoverlap = 0, maxgap = -1)
regionStats(qs, subsets = list(covered = which(rowSums(getCounts(qs)) >= 20)),
 ROIs = list(), minoverlap = 0, maxgap = -1)

Arguments

`qs`	A qsea Set object
`subsets`	A list of window indices
`ROIs`	A list of Regions of Interest
`minoverlap`	Passed to findOverlaps
`maxgap`	Passed to findOverlaps

Value

a matrix, containing the total number of windows overlapping the ROIs and the numbers of windows from the subset list overlapping ROIs

Author(s)

Mathias Lienhard

Examples

qs=getExampleQseaSet()
#as an example, we analyze the fraction of reads covered by at least 10
#or at least 20 reads, for bins of CpG density
ROIs=list()
regs=getRegions(qs)
cpg=getRegions(qs)$CpG_density
bins=seq(0,30,5)
for(i in 1:(length(bins)-1)){
    n=paste0(bins[i],"-",bins[i+1]," CpGs")
    ROIs[[n]]=regs[which(cpg>=bins[i] & cpg < bins[i+1])]
}
subsets = list(
    ">10" = which(rowSums(getCounts(qs)) >= 10),
    ">20" = which(rowSums(getCounts(qs)) >= 20))
coverage_stats=regionStats(qs, subsets, ROIs)
coverage_stats_rel=coverage_stats[,-1]/coverage_stats[,1]
x=barplot(t(coverage_stats_rel)*100,ylab="fraction of windows[%]",
    beside=TRUE, legend=TRUE, las=2, args.legend=list(x="topleft"), 
    main="Covered Windows")


qs=getExampleQseaSet()
#as an example, we analyze the fraction of reads covered by at least 10
#or at least 20 reads, for bins of CpG density
ROIs=list()
regs=getRegions(qs)
cpg=getRegions(qs)$CpG_density
bins=seq(0,30,5)
for(i in 1:(length(bins)-1)){
    n=paste0(bins[i],"-",bins[i+1]," CpGs")
    ROIs[[n]]=regs[which(cpg>=bins[i] & cpg < bins[i+1])]
}
subsets = list(
    ">10" = which(rowSums(getCounts(qs)) >= 10),
    ">20" = which(rowSums(getCounts(qs)) >= 20))
coverage_stats=regionStats(qs, subsets, ROIs)
coverage_stats_rel=coverage_stats[,-1]/coverage_stats[,1]
x=barplot(t(coverage_stats_rel)*100,ylab="fraction of windows[%]",
    beside=TRUE, legend=TRUE, las=2, args.legend=list(x="topleft"), 
    main="Covered Windows")

Package 'qsea'

Help Index

QSEA: Quantitative sequencing enrichment analysis and visualization

Description

Author(s)

References

estimate CNV information and add to qseaSet object

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

fit GLMs to reduced model and test for significance

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Import sequencing data

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

Enrichment analysis

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Estimate effective library size

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

Extends an exisiting qseaSet by new samples

Description

Usage

Arguments

Value

Author(s)

Examples

Estimate background reads

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Infer sequence pattern density values and add to qseaSet object

Description

Usage

Arguments

Value

Author(s)

Examples

Add sequence preference to qseaSet object

Description

Usage

Arguments

Value

Author(s)

Prepares a qseaSet Object

Description

Usage

Arguments