Package 'vulcan' reference manual

Title:	VirtUaL ChIP-Seq data Analysis using Networks
Description:	Vulcan (VirtUaL ChIP-Seq Analysis through Networks) is a package that interrogates gene regulatory networks to infer cofactors significantly enriched in a differential binding signature coming from ChIP-Seq data. In order to do so, our package combines strategies from different BioConductor packages: DESeq for data normalization, ChIPpeakAnno and DiffBind for annotation and definition of ChIP-Seq genomic peaks, csaw to define optimal peak width and viper for applying a regulatory network over a differential binding signature.
Authors:	Federico M. Giorgi, Andrew N. Holding, Florian Markowetz
Maintainer:	Federico M. Giorgi <[email protected]>
License:	LGPL-3
Version:	1.29.0
Built:	2025-03-22 06:19:26 UTC
Source:	https://github.com/bioc/vulcan

Define the average fragment length

Description

A function to get average fragment length from a ChIP-Seq experiment

Usage

average_fragment_length(bam.files, plot = TRUE, max.dist = 550)
average_fragment_length(bam.files, plot = TRUE, max.dist = 550)

Arguments

`bam.files`	a vector of BAM files locations
`plot`	logical. Should a plot be generated?
`max.dist`	numeric. Maximum fragment length accepted. Default=550

Value

nothing

Examples

library(vulcandata)
sheetfile<-'deleteme.csv'
vulcandata::vulcansheet(sheetfile)
a<-read.csv(sheetfile,as.is=TRUE)
bams<-a$bamReads
unlink(sheetfile)
average_fragment_length(bams,plot=TRUE)
library(vulcandata)
sheetfile<-'deleteme.csv'
vulcandata::vulcansheet(sheetfile)
a<-read.csv(sheetfile,as.is=TRUE)
bams<-a$bamReads
unlink(sheetfile)
average_fragment_length(bams,plot=TRUE)

Convert correlation coefficient to p-value

Description

This functions converts an R value from a correlation calculation into a p-value, using a T distribution with the provided number of samples N minus 2 degrees of freedom

Usage

corr2p(r, N)
corr2p(r, N)

Arguments

`r`	a correlation coefficient
`N`	a number of samples

Value

p a p-value

Examples

set.seed(1)
a<-rnorm(1000)
b<-a+rnorm(1000,sd=10)
r<-cor(a,b,method='pearson')
corr2p(r,N=length(a))
set.seed(1)
a<-rnorm(1000)
b<-a+rnorm(1000,sd=10)
r<-cor(a,b,method='pearson')
corr2p(r,N=length(a))

densityauc - Calculate the AUC of a density object

Description

This function will calculate the AUC of a density object generated by the 'density' function.

Usage

densityauc(dens, window)
densityauc(dens, window)

Arguments

`dens`	a density object
`window`	a vector with two values, specifying the left and right borders for the AUC to be calculated

Value

a numeric value for the density AUC

Examples

set.seed(1)
a<-rnorm(1000)
d<-density(a)
window<-c(2,3)
da<-densityauc(d,window)

plot(d,main='')
abline(v=window,lty=2)
title(paste0('AUC between lines=',da))



set.seed(1)
a<-rnorm(1000)
d<-density(a)
window<-c(2,3)
da<-densityauc(d,window)

plot(d,main='')
abline(v=window,lty=2)
title(paste0('AUC between lines=',da))

Probability density of Pareto distributions

Description

Gives NA on values below the threshold

Usage

dpareto(x, threshold = 1, exponent, log = FALSE)
dpareto(x, threshold = 1, exponent, log = FALSE)

Arguments

`x`	Data vector of log probability densities
`threshold`	numeric value to define the start of the tail
`exponent`	the exponent obtained from the pareto.fit function
`log`	logical, should the values be log-transformed?

Value

Vector of (log) probability densities

Examples

set.seed(1)
x<-abs(rnorm(1000))
n<-length(x)
exponent<-1+n/sum(log(x))
dp<-dpareto(x,exponent=exponent)
plot(dp)
set.seed(1)
x<-abs(rnorm(1000))
n<-length(x)
exponent<-1+n/sum(log(x))
dp<-dpareto(x,exponent=exponent)
plot(dp)

Fisher integration of p-values

Description

This function applies the Fisher integration of pvalues

Usage

fisherp(ps)
fisherp(ps)

Arguments

`ps`	a vector of p-values

Value

p.val an integrated p-value

Examples

ps<-c(0.01,0.05,0.03,0.2)
fisherp(ps)
ps<-c(0.01,0.05,0.03,0.2)
fisherp(ps)

GSEA

Description

This function performs Gene Set Enrichment Analysis

Usage

gsea(
  reflist,
  set,
  method = c("permutation", "pareto"),
  np = 1000,
  w = 1,
  gsea_null = NULL
)
gsea(
  reflist,
  set,
  method = c("permutation", "pareto"),
  np = 1000,
  w = 1,
  gsea_null = NULL
)

Arguments

`reflist`	named vector of reference scores
`set`	element set
`method`	one of 'permutation' or 'pareto'
`np`	Number of permutations (Default: 1000)
`w`	exponent used to raise the supplied scores. Default is 1 (original scores unchanged)
`gsea_null`	a GSEA null distribution (Optional)

Value

A GSEA object. Basically a list of s components:

ES: The enrichment score
NES: The normalized enrichment socre
ledge: The items in the leading edge
p.value: The permutation-based p-value

Examples

reflist<-setNames(-sort(rnorm(1000)),paste0('gene',1:1000))
set<-paste0('gene',sample(1:200,50))
obj<-gsea(reflist,set,method='pareto',np=1000)
obj$p.value
reflist<-setNames(-sort(rnorm(1000)),paste0('gene',1:1000))
set<-paste0('gene',sample(1:200,50))
obj<-gsea(reflist,set,method='pareto',np=1000)
obj$p.value

kmgformat - Nice Formatting of Numbers

Description

This function will convert thousand numbers to K, millions to M, billions to G, trillions to T, quadrillions to P

Usage

kmgformat(input, roundParam = 1)
kmgformat(input, roundParam = 1)

Arguments

`input`	A vector of values
`roundParam`	How many decimal digits you want

Value

A character vector of formatted numebr names

Examples

# Thousands
set.seed(1)
a<-runif(1000,0,1e4)
plot(a,yaxt='n')
kmg<-kmgformat(pretty(a))
axis(2,at=pretty(a),labels=kmg)

# Millions to Billions
set.seed(1)
a<-runif(1000,0,1e9)
plot(a,yaxt='n',pch=20,col=val2col(a))
kmg<-kmgformat(pretty(a))
axis(2,at=pretty(a),labels=kmg)
# Thousands
set.seed(1)
a<-runif(1000,0,1e4)
plot(a,yaxt='n')
kmg<-kmgformat(pretty(a))
axis(2,at=pretty(a),labels=kmg)

# Millions to Billions
set.seed(1)
a<-runif(1000,0,1e9)
plot(a,yaxt='n',pch=20,col=val2col(a))
kmg<-kmgformat(pretty(a))
axis(2,at=pretty(a),labels=kmg)

Calculate Null Distribution for GSEA

Description

This function generates a GSEA null distribution from

Usage

null_gsea(set, reflist, w = 1, np = 1000)
null_gsea(set, reflist, w = 1, np = 1000)

Arguments

`set`	A vector containing gene names.
`reflist`	A named vector containing the weights of the entire signature
`w`	exponent used to raise the supplied scores. Default is 1 (original scores unchanged)
`np`	Number of permutations (Default: 1000)

Value

A vector of null scores appropriate for the set/reflist combination provided

Examples

reflist<-setNames(-sort(rnorm(26)),LETTERS)
set<-c('A','B','D','F')
nulldist<-null_gsea(set,reflist)
nulldist[1:10]
reflist<-setNames(-sort(rnorm(26)),LETTERS)
set<-c('A','B','D','F')
nulldist<-null_gsea(set,reflist)
nulldist[1:10]

Convert p-value to correlation coefficient

Description

This functions converts an p-value into a the corresponding correlation coefficient, using a T distribution with the provided number of samples N minus 2 degrees of freedom

Usage

p2corr(p, N)
p2corr(p, N)

Arguments

`p`	a p-value
`N`	a number of samples

Value

r a correlation coefficient

Examples

N<-100
p<-0.05
p2corr(p,N)
N<-100
p<-0.05
p2corr(p,N)

p2z

Description

This function gives a gaussian Z-score corresponding to the provided p-value Careful: sign is not provided

Usage

p2z(p)
p2z(p)

Arguments

p

a p-value

Value

z a Z score

Examples

p<-0.05
p2z(p)
p<-0.05
p2z(p)

Estimate parameters of Pareto distribution

Description

A wrapper for functions implementing actual methods

Usage

pareto.fit(data, threshold)
pareto.fit(data, threshold)

Arguments

`data`	data vector, lower threshold (or 'find', indicating it should be found from the data), method (likelihood or regression, defaulting to former)
`threshold`	numeric value to define the start of the tail

Value

List indicating type of distribution ('pareto'), parameters, information about fit (depending on method), OR a warning and NA if method is not recognized

Examples

# Estimate the tail of a pospulation normally distributed
set.seed(1)
x<-rnorm(1000)
q95<-as.numeric(quantile(abs(x),0.95))
fit<-pareto.fit(abs(x),threshold=q95)
# We can infer the pvalue of a value very much right to the tail of the
# distribution
value<-5
pvalue<-ppareto(value, threshold=q95, exponent=fit$exponent,
lower.tail=FALSE)/20
plot(density(abs(x)),xlim=c(0,value+0.3),main='Pareto fit')
arrows(value,0.2,value,0)
text(value,0.2,labels=paste0('p=',signif(pvalue,2)))
# Estimate the tail of a pospulation normally distributed
set.seed(1)
x<-rnorm(1000)
q95<-as.numeric(quantile(abs(x),0.95))
fit<-pareto.fit(abs(x),threshold=q95)
# We can infer the pvalue of a value very much right to the tail of the
# distribution
value<-5
pvalue<-ppareto(value, threshold=q95, exponent=fit$exponent,
lower.tail=FALSE)/20
plot(density(abs(x)),xlim=c(0,value+0.3),main='Pareto fit')
arrows(value,0.2,value,0)
text(value,0.2,labels=paste0('p=',signif(pvalue,2)))

Plot GSEA results

Description

This function generates a GSEA plot from a gsea object

Usage

plot_gsea(
  gsea.obj,
  twoColors = c("red", "blue"),
  plotNames = FALSE,
  colBarcode = "black",
  title = "Running Enrichment Score",
  bottomYtitle = "List Values",
  bottomYlabel = "Signature values",
  ext_nes = NULL,
  omit_middle = FALSE
)
plot_gsea(
  gsea.obj,
  twoColors = c("red", "blue"),
  plotNames = FALSE,
  colBarcode = "black",
  title = "Running Enrichment Score",
  bottomYtitle = "List Values",
  bottomYlabel = "Signature values",
  ext_nes = NULL,
  omit_middle = FALSE
)

Arguments

`gsea.obj`	GSEA object produced by the `gsea` function
`twoColors`	the two colors to use for positive[1] and negative[2] enrichment scores
`plotNames`	Logical. Should the set names be plotted?
`colBarcode`	The color of the barcode
`title`	String to be plotted above the Running Enrichment Score
`bottomYtitle`	String for the title of the bottom part of the plot
`bottomYlabel`	String for the label
`ext_nes`	Provide a NES from an external calculation
`omit_middle`	If TRUE, will not plot the running score (FALSE by default)

Value

Nothing, a plot is generated in the default output device

Examples

reflist<-setNames(-sort(rnorm(26)),LETTERS)
set<-c('A','B','D','F')
obj<-gsea(reflist,set,method='pareto')
plot_gsea(obj)
reflist<-setNames(-sort(rnorm(26)),LETTERS)
set<-c('A','B','D','F')
obj<-gsea(reflist,set,method='pareto')
plot_gsea(obj)

Cumulative distribution function of the Pareto distributions ' Gives NA on values < threshold

Description

Cumulative distribution function of the Pareto distributions ' Gives NA on values < threshold

Usage

ppareto(x, threshold = 1, exponent, lower.tail = TRUE)
ppareto(x, threshold = 1, exponent, lower.tail = TRUE)

Arguments

`x`	Data vector, lower threshold, scaling exponent, usual flags
`threshold`	numeric value to define the start of the tail
`exponent`	the exponent obtained from the pareto.fit function
`lower.tail`	logical. If the lower tail of the distribution should be considered. Default is TRUE

Value

Vector of (log) probabilities

Examples

# Estimate the tail of a pospulation normally distributed
set.seed(1)
x<-rnorm(1000)
q95<-as.numeric(quantile(abs(x),0.95))
fit<-pareto.fit(abs(x),threshold=q95)
# We can infer the pvalue of a value very much right to the tail of the
# distribution
value<-5
pvalue<-ppareto(value, threshold=q95, exponent=fit$exponent,
lower.tail=FALSE)/20
plot(density(abs(x)),xlim=c(0,value+0.3),main='Pareto fit')
arrows(value,0.2,value,0)
text(value,0.2,labels=paste0('p=',signif(pvalue,2)))
# Estimate the tail of a pospulation normally distributed
set.seed(1)
x<-rnorm(1000)
q95<-as.numeric(quantile(abs(x),0.95))
fit<-pareto.fit(abs(x),threshold=q95)
# We can infer the pvalue of a value very much right to the tail of the
# distribution
value<-5
pvalue<-ppareto(value, threshold=q95, exponent=fit$exponent,
lower.tail=FALSE)/20
plot(density(abs(x)),xlim=c(0,value+0.3),main='Pareto fit')
arrows(value,0.2,value,0)
text(value,0.2,labels=paste0('p=',signif(pvalue,2)))

REA: Rank EnrichmeNt Analysis

Description

REA Calculates enrichment of groups of objects over a vector of values associated to a population of objects

Usage

rea(signatures, groups, sweights = NULL, gweights = NULL, minsize = 1)
rea(signatures, groups, sweights = NULL, gweights = NULL, minsize = 1)

Arguments

`signatures`	a named vector, with values as signature values (e.g. logFC) and names as object names (e.g. gene symbols)
`groups`	a list of vectors of objects (e.g. pathways)
`sweights`	weights associated to objects in the signature. If NULL (default) all objects are treated according to the signature rank
`gweights`	weights associated to association strength between each object and each group. If NULL (default) all associations are treated equally
`minsize`	integer. Minimum size of the groups to be analyzed. Default=1

Value

A numeric vector of normalized enrichment scores

Examples

signatures<-setNames(-sort(rnorm(1000)),paste0('gene',1:1000))
set1<-paste0('gene',sample(1:200,50))
set2<-paste0('gene',sample(1:1000,50))
groups<-list(set1=set1,set2=set2)
obj<-rea(signatures=signatures,groups=groups)
obj
signatures<-setNames(-sort(rnorm(1000)),paste0('gene',1:1000))
set1<-paste0('gene',sample(1:200,50))
set2<-paste0('gene',sample(1:1000,50))
groups<-list(set1=set1,set2=set2)
obj<-rea(signatures=signatures,groups=groups)
obj

Slice

Description

This function prints a slice of a matrix

Usage

slice(matrix)
slice(matrix)

Arguments

matrix

A matrix

Value

prints it

Examples

set.seed(1)
example<-matrix(rnorm(1000),nrow=100,ncol=10)
slice(example)
set.seed(1)
example<-matrix(rnorm(1000),nrow=100,ncol=10)
slice(example)

Stouffer integration of Z scores

Description

This function gives a gaussian Z-score corresponding to the provided p-value Careful: sign is not provided

Usage

stouffer(x)
stouffer(x)

Arguments

`x`	a vector of Z scores

Value

Z an integrated Z score

Examples

zs<-c(1,3,5,2,3)
stouffer(zs)
zs<-c(1,3,5,2,3)
stouffer(zs)

textplot2 - An x y plot of non-overlapping text

Description

This function is an extension of the 'textplot' function from the 'wordcloud' package, with the extra functionality of specifiying the color of the points

Usage

textplot2(
  x,
  y,
  words,
  cex = 1,
  pch = 16,
  pointcolor = "#FFFFFF00",
  new = TRUE,
  show.lines = TRUE,
  ...
)
textplot2(
  x,
  y,
  words,
  cex = 1,
  pch = 16,
  pointcolor = "#FFFFFF00",
  new = TRUE,
  show.lines = TRUE,
  ...
)

Arguments

`x`	x coordinates
`y`	y coordinates
`words`	the text to plot
`cex`	font size
`pch`	pch parameter for the plotted points
`pointcolor`	a string specifying the color of the points (default #FFFFFF00)
`new`	should a new plot be created
`show.lines`	if true, then lines are plotted between x,y and the word, for those words not covering their x,y coordinates
`...`	Additional parameters to be passed to wordlayout and text.

Value

nothing

Examples

obj_names<-apply(expand.grid(LETTERS,LETTERS),1,paste,collapse='')
a<-setNames(runif(26*26),obj_names)
b<-setNames(rnorm(26*26),obj_names)
plot(a,b,pch=20,col='grey')
top<-names(sort(-a))[1:50]
textplot2(a[top],b[top],words=top,new=FALSE,pointcolor='black')
obj_names<-apply(expand.grid(LETTERS,LETTERS),1,paste,collapse='')
a<-setNames(runif(26*26),obj_names)
b<-setNames(rnorm(26*26),obj_names)
plot(a,b,pch=20,col='grey')
top<-names(sort(-a))[1:50]
textplot2(a[top],b[top],words=top,new=FALSE,pointcolor='black')

Convert a numeric vector into colors

Description

Convert a numeric vector into colors

Usage

val2col(
  z,
  col1 = "navy",
  col2 = "white",
  col3 = "red3",
  nbreaks = 100,
  center = TRUE,
  rank = FALSE
)
val2col(
  z,
  col1 = "navy",
  col2 = "white",
  col3 = "red3",
  nbreaks = 100,
  center = TRUE,
  rank = FALSE
)

Arguments

`z`	a vector of numbers
`col1`	a color name for the min value, default 'navy'
`col2`	a color name for the middle value, default 'white'
`col3`	a color name for the max value, default 'red3'
`nbreaks`	Number of colors to be generated. Default is 30.
`center`	boolean, should the data be centered? Default is TRUE
`rank`	boolean, should the data be ranked? Default is FALSE

Value

a vector of colors

Examples

a<-rnorm(1000)
cols<-val2col(a)
plot(a,col=cols,pch=16)
a<-rnorm(1000)
cols<-val2col(a)
plot(a,col=cols,pch=16)

VULCAN - VirtUaL Chipseq data Analysis using Networks

Description

This function calculates the enrichment of a gene regulatory network over a ChIP-Seq derived signature

Usage

vulcan(vobj, network, contrast, annotation = NULL, minsize = 10)
vulcan(vobj, network, contrast, annotation = NULL, minsize = 10)

Arguments

`vobj`	a list, the output of the `'vulcan.normalize'` function
`network`	an object of class `'viper::regulon'`
`contrast`	a vector of two fields, containing the condition names to be compared (1 vs 2)
`annotation`	an optional named vector to convert gene identifiers (e.g. entrez ids to gene symbols) Default (NULL) won't convert gene names.
`minsize`	integer indicating the minimum regulon size for the analysis to be run. Default: 10

Value

A list of components:

peakcounts: A matrix of raw peak counts, peaks as rows, samples as columns
peakrpkms: A matrix of peak RPKMs, peaks as rows, samples as columns
rawcounts: A matrix of raw gene counts, genes as rows, samples as columns. The counts are associated to the promoter region of the gene
rpkms: A matrix of RPKMs, genes as rows, samples as columns. The RPKMs are associated to the promoter region of the gene
normalized: A matrix of gene abundances normalized by Variance-Stabilizing Transformation (VST), genes as rows, samples as columns. The abundances are associated to the promoter region of the gene
samples: A vector of sample names and conditions
msviper: a multisample virtual proteomics object from the viper package
mrs: A table of master regulators for a specific signature, indicating their Normalized Enrichment Score (NES) and p-value

Examples

library(vulcandata)
# Get an example vulcan object (generated with vulcan.import() using the
# dummy dataset contained in the \textit{vulcandata} package)
vobj<-vulcandata::vulcanexample()
# Annotate peaks to gene names
vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
# Normalize data for VULCAN analysis
vobj<-vulcan.normalize(vobj)
# Detect which conditions are present
names(vobj$samples)

# Load an ARACNe network
# This is a regulon object as specified in the VIPER package, named 'network'
load(system.file('extdata','network.rda',package='vulcandata',mustWork=TRUE))
# Run VULCAN
# We can reduce the minimum regulon size, since in this example only one
# chromosome
# was measured, and the networks would otherwise have too few hits
vobj_analysis<-vulcan(vobj,network=network,contrast=c('t90','t0'),minsize=5)
# Visualize output using the msviper plotting function
plot(vobj_analysis$msviper,mrs=7)

library(vulcandata)
# Get an example vulcan object (generated with vulcan.import() using the
# dummy dataset contained in the \textit{vulcandata} package)
vobj<-vulcandata::vulcanexample()
# Annotate peaks to gene names
vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
# Normalize data for VULCAN analysis
vobj<-vulcan.normalize(vobj)
# Detect which conditions are present
names(vobj$samples)

# Load an ARACNe network
# This is a regulon object as specified in the VIPER package, named 'network'
load(system.file('extdata','network.rda',package='vulcandata',mustWork=TRUE))
# Run VULCAN
# We can reduce the minimum regulon size, since in this example only one
# chromosome
# was measured, and the networks would otherwise have too few hits
vobj_analysis<-vulcan(vobj,network=network,contrast=c('t90','t0'),minsize=5)
# Visualize output using the msviper plotting function
plot(vobj_analysis$msviper,mrs=7)

Function to annotate peaks for VULCAN analysis

Description

This function coalesces and annotates a set of BAM files into peak-centered data. It implements the ChIPPeakANno methods, with specific choices dealing with defining the genomic area around the promoter and which peaks to include.

Usage

vulcan.annotate(
  vobj,
  lborder = -10000,
  rborder = 10000,
  method = c("closest", "strongest", "sum", "topvar", "farthest", "lowvar"),
  TxDb = NULL
)
vulcan.annotate(
  vobj,
  lborder = -10000,
  rborder = 10000,
  method = c("closest", "strongest", "sum", "topvar", "farthest", "lowvar"),
  TxDb = NULL
)

Arguments

`vobj`	A list of peakcounts, samples and peakrpkms (i.e. the output of the function vulcan.import)
`lborder`	Boundary for peak annotation (in nucleotides) upstream of the Transcription starting site (default: -10000)
`rborder`	Boundary for peak annotation (in nucleotides) downstream of the Transcription starting site (default: 10000)
`method`	Method to deal with multiple peaks found within gene promoter boundaries. One of sum (default), closest, strongest, topvar, farthest or lowvar. This will affect only genes with multiple possible peaks. When a single peak can be mapped to the promoter region of the gene, that peak abundance will be considered as the gene promoter's occupancy. sum when multiple peaks are found, sum their contributions closest when multiple peaks are found, keep only the closest to the TSS as the representative one strongest when multiple peaks are found, keep the strongest as the representative one farthest when multiple peaks are found, keep only the closest to the TSS as the representative one topvar when multiple peaks are found, keep the most varying as the representative one lowvar when multiple peaks are found, keep the least varying as the representative one
`TxDb`	TxDb annotation object containing the knownGene track. If NULL (the default), TxDb.Hsapiens.UCSC.hg19.knownGene is loaded

Value

A list of components:

peakcounts: A matrix of raw peak counts, peaks as rows, samples as columns
peakrpkms: A matrix of peak RPKMs, peaks as rows, samples as columns
rawcounts: A matrix of raw gene counts, genes as rows, samples as columns. The counts are associated to the promoter region of the gene
rpkms: A matrix of RPKMs, genes as rows, samples as columns. The RPKMs are associated to the promoter region of the gene
samples: A vector of sample names and conditions

Examples

library(vulcandata)
vobj<-vulcandata::vulcanexample()
vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
library(vulcandata)
vobj<-vulcandata::vulcanexample()
vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')

Function to import BAM files

Description

This function coalesces and annotates a set of BAM files into peak-centered data

Usage

vulcan.import(sheetfile, intervals = NULL)
vulcan.import(sheetfile, intervals = NULL)

Arguments

`sheetfile`	path to a csv annotation file containing sample information and BAM location
`intervals`	size of the peaks. If NULL (default) it is inferred from the average fragment length observed in the dataset

Value

A list of components:

peakcounts: A matrix of raw peak counts, peaks as rows, samples as columns
peakrpkms: A matrix of peak RPKMs, peaks as rows, samples as columns
samples: A vector of sample names and conditions

Examples

library(vulcandata)
# Generate an annotation file from the dummy ChIP-Seq dataset
vfile<-tempfile()
vulcandata::vulcansheet(vfile)
# Import BAM and BED information into a list object
# vobj<-vulcan.import(vfile)
# This vobj is identical to the object returned by
# vulcandata::vulcanexample()
unlink(vfile)
library(vulcandata)
# Generate an annotation file from the dummy ChIP-Seq dataset
vfile<-tempfile()
vulcandata::vulcansheet(vfile)
# Import BAM and BED information into a list object
# vobj<-vulcan.import(vfile)
# This vobj is identical to the object returned by
# vulcandata::vulcanexample()
unlink(vfile)

Function to normalize promoter peak data

Description

This function normalizes gene-centered ChIP-Seq data using VST

Usage

vulcan.normalize(vobj)
vulcan.normalize(vobj)

Arguments

vobj

a list, the output of the 'vulcan.annotate' function

Value

A list of components:

peakcounts: A matrix of raw peak counts, peaks as rows, samples as columns
peakrpkms: A matrix of peak RPKMs, peaks as rows, samples as columns
rawcounts: A matrix of raw gene counts, genes as rows, samples as columns. The counts are associated to the promoter region of the gene
rpkms: A matrix of RPKMs, genes as rows, samples as columns. The RPKMs are associated to the promoter region of the gene
normalized: A matrix of gene abundances normalized by Variance-Stabilizing Transformation (VST), genes as rows, samples as columns. The abundances are associated to the promoter region of the gene
samples: A vector of sample names and conditions

Examples

## Not run: 
library(vulcandata)
vobj<-vulcandata::vulcanexample()
vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
vobj<-vulcan.normalize(vobj)

## End(Not run)
## Not run: 
library(vulcandata)
vobj<-vulcandata::vulcanexample()
vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
vobj<-vulcan.normalize(vobj)

## End(Not run)

Function to calculate pathway enrichment over a ChIP-Seq profile

Description

This function applies Gene Set Enrichment Analysis or Rank Enrichment Analysis over a ChIP-Seq signature contained in a vulcan package object

Usage

vulcan.pathways(
  vobj,
  pathways,
  contrast = NULL,
  method = c("GSEA", "REA"),
  np = 1000
)
vulcan.pathways(
  vobj,
  pathways,
  contrast = NULL,
  method = c("GSEA", "REA"),
  np = 1000
)

Arguments

`vobj`	a list, the output of the `'vulcan.annotate'` function
`pathways`	a list of vectors, one vector of gene identifiers per pathway
`contrast`	a vector with the name of the two conditions to compare. If method=='REA', contrast can be set to 'all', and Rank Enrichment Analysis will be performed for every sample independently, compared to the mean of the dataset.
`method`	either 'REA' for Rank Enrichment Analysis or 'GSEA' for Gene Set Enrichment Analysis
`np`	numeric, only for GSEA, the number of permutations to build the null distribution. Default is 1000

Value

if method=='GSEA', a named vector, with pathway names as names, and the normalized enrichment score of either the GSEA as value. If method=='REA', a matrix, with pathway names as rows and specific contrasts as columns (the method 'REA' allows for multiple contrasts to be calculated at the same time)

Examples

library(vulcandata)
vfile<-tempfile()
vulcandata::vulcansheet(vfile)
#vobj<-vulcan.import(vfile)
vobj<-vulcandata::vulcanexample()
unlink(vfile)
vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
vobj<-vulcan.normalize(vobj)
# Create a dummy pathway list (in entrez ids)
pathways<-list(
   pathwayA=rownames(vobj$normalized)[1:20],
   pathwayB=rownames(vobj$normalized)[21:50]
)
# Which contrast groups can be queried
names(vobj$samples)
results_gsea<-vulcan.pathways(vobj,pathways,contrast=c('t90','t0'),
method='GSEA')
results_rea<-vulcan.pathways(vobj,pathways,contrast=c('all'),method='REA')
library(vulcandata)
vfile<-tempfile()
vulcandata::vulcansheet(vfile)
#vobj<-vulcan.import(vfile)
vobj<-vulcandata::vulcanexample()
unlink(vfile)
vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
vobj<-vulcan.normalize(vobj)
# Create a dummy pathway list (in entrez ids)
pathways<-list(
   pathwayA=rownames(vobj$normalized)[1:20],
   pathwayB=rownames(vobj$normalized)[21:50]
)
# Which contrast groups can be queried
names(vobj$samples)
results_gsea<-vulcan.pathways(vobj,pathways,contrast=c('t90','t0'),
method='GSEA')
results_rea<-vulcan.pathways(vobj,pathways,contrast=c('all'),method='REA')

Weighted Stouffer integration of Z scores

Description

This function gives a gaussian Z-score corresponding to the provided p-value Careful: sign is not provided

Usage

wstouffer(x, w)
wstouffer(x, w)

Arguments

`x`	a vector of Z scores
`w`	weight for each Z score

Value

Z an integrated Z score

Examples

zs<-c(1,-3,5,2,3)
ws<-c(1,10,1,2,1)
wstouffer(zs,ws)
zs<-c(1,-3,5,2,3)
ws<-c(1,10,1,2,1)
wstouffer(zs,ws)

z2p

Description

This function gives a gaussian p-value corresponding to the provided Z-score

Usage

z2p(z)
z2p(z)

Arguments

z

a Z score

Value

a p-value

Examples

z<-1.96
z2p(z)
z<-1.96
z2p(z)

Package 'vulcan'

Help Index

Define the average fragment length

Description

Usage

Arguments

Value

Examples

Convert correlation coefficient to p-value

Description

Usage

Arguments

Value

Examples

densityauc - Calculate the AUC of a density object

Description

Usage

Arguments

Value

Examples

Probability density of Pareto distributions

Description

Usage

Arguments

Value

Examples

Fisher integration of p-values

Description

Usage

Arguments

Value

Examples

GSEA

Description

Usage

Arguments

Value

Examples

kmgformat - Nice Formatting of Numbers

Description

Usage

Arguments

Value

Examples

Calculate Null Distribution for GSEA

Description

Usage

Arguments

Value

Examples

Convert p-value to correlation coefficient

Description

Usage

Arguments

Value

Examples

p2z

Description

Usage

Arguments

Value

Examples

Estimate parameters of Pareto distribution

Description

Usage

Arguments

Value

Examples

Plot GSEA results

Description

Usage

Arguments

Value

Examples

Cumulative distribution function of the Pareto distributions ' Gives NA on values < threshold

Description

Usage

Arguments

Value

Examples