Title: | VirtUaL ChIP-Seq data Analysis using Networks |
---|---|
Description: | Vulcan (VirtUaL ChIP-Seq Analysis through Networks) is a package that interrogates gene regulatory networks to infer cofactors significantly enriched in a differential binding signature coming from ChIP-Seq data. In order to do so, our package combines strategies from different BioConductor packages: DESeq for data normalization, ChIPpeakAnno and DiffBind for annotation and definition of ChIP-Seq genomic peaks, csaw to define optimal peak width and viper for applying a regulatory network over a differential binding signature. |
Authors: | Federico M. Giorgi, Andrew N. Holding, Florian Markowetz |
Maintainer: | Federico M. Giorgi <[email protected]> |
License: | LGPL-3 |
Version: | 1.29.0 |
Built: | 2024-12-22 06:23:51 UTC |
Source: | https://github.com/bioc/vulcan |
A function to get average fragment length from a ChIP-Seq experiment
average_fragment_length(bam.files, plot = TRUE, max.dist = 550)
average_fragment_length(bam.files, plot = TRUE, max.dist = 550)
bam.files |
a vector of BAM files locations |
plot |
logical. Should a plot be generated? |
max.dist |
numeric. Maximum fragment length accepted. Default=550 |
nothing
library(vulcandata) sheetfile<-'deleteme.csv' vulcandata::vulcansheet(sheetfile) a<-read.csv(sheetfile,as.is=TRUE) bams<-a$bamReads unlink(sheetfile) average_fragment_length(bams,plot=TRUE)
library(vulcandata) sheetfile<-'deleteme.csv' vulcandata::vulcansheet(sheetfile) a<-read.csv(sheetfile,as.is=TRUE) bams<-a$bamReads unlink(sheetfile) average_fragment_length(bams,plot=TRUE)
This functions converts an R value from a correlation calculation into a p-value, using a T distribution with the provided number of samples N minus 2 degrees of freedom
corr2p(r, N)
corr2p(r, N)
r |
a correlation coefficient |
N |
a number of samples |
p a p-value
set.seed(1) a<-rnorm(1000) b<-a+rnorm(1000,sd=10) r<-cor(a,b,method='pearson') corr2p(r,N=length(a))
set.seed(1) a<-rnorm(1000) b<-a+rnorm(1000,sd=10) r<-cor(a,b,method='pearson') corr2p(r,N=length(a))
This function will calculate the AUC of a density object generated by the
'density'
function.
densityauc(dens, window)
densityauc(dens, window)
dens |
a density object |
window |
a vector with two values, specifying the left and right borders for the AUC to be calculated |
a numeric value for the density AUC
set.seed(1) a<-rnorm(1000) d<-density(a) window<-c(2,3) da<-densityauc(d,window) plot(d,main='') abline(v=window,lty=2) title(paste0('AUC between lines=',da))
set.seed(1) a<-rnorm(1000) d<-density(a) window<-c(2,3) da<-densityauc(d,window) plot(d,main='') abline(v=window,lty=2) title(paste0('AUC between lines=',da))
Gives NA on values below the threshold
dpareto(x, threshold = 1, exponent, log = FALSE)
dpareto(x, threshold = 1, exponent, log = FALSE)
x |
Data vector of log probability densities |
threshold |
numeric value to define the start of the tail |
exponent |
the exponent obtained from the pareto.fit function |
log |
logical, should the values be log-transformed? |
Vector of (log) probability densities
set.seed(1) x<-abs(rnorm(1000)) n<-length(x) exponent<-1+n/sum(log(x)) dp<-dpareto(x,exponent=exponent) plot(dp)
set.seed(1) x<-abs(rnorm(1000)) n<-length(x) exponent<-1+n/sum(log(x)) dp<-dpareto(x,exponent=exponent) plot(dp)
This function applies the Fisher integration of pvalues
fisherp(ps)
fisherp(ps)
ps |
a vector of p-values |
p.val an integrated p-value
ps<-c(0.01,0.05,0.03,0.2) fisherp(ps)
ps<-c(0.01,0.05,0.03,0.2) fisherp(ps)
This function performs Gene Set Enrichment Analysis
gsea( reflist, set, method = c("permutation", "pareto"), np = 1000, w = 1, gsea_null = NULL )
gsea( reflist, set, method = c("permutation", "pareto"), np = 1000, w = 1, gsea_null = NULL )
reflist |
named vector of reference scores |
set |
element set |
method |
one of 'permutation' or 'pareto' |
np |
Number of permutations (Default: 1000) |
w |
exponent used to raise the supplied scores. Default is 1 (original scores unchanged) |
gsea_null |
a GSEA null distribution (Optional) |
A GSEA object. Basically a list of s components:
The enrichment score
The normalized enrichment socre
The items in the leading edge
The permutation-based p-value
reflist<-setNames(-sort(rnorm(1000)),paste0('gene',1:1000)) set<-paste0('gene',sample(1:200,50)) obj<-gsea(reflist,set,method='pareto',np=1000) obj$p.value
reflist<-setNames(-sort(rnorm(1000)),paste0('gene',1:1000)) set<-paste0('gene',sample(1:200,50)) obj<-gsea(reflist,set,method='pareto',np=1000) obj$p.value
This function will convert thousand numbers to K, millions to M, billions to G, trillions to T, quadrillions to P
kmgformat(input, roundParam = 1)
kmgformat(input, roundParam = 1)
input |
A vector of values |
roundParam |
How many decimal digits you want |
A character vector of formatted numebr names
# Thousands set.seed(1) a<-runif(1000,0,1e4) plot(a,yaxt='n') kmg<-kmgformat(pretty(a)) axis(2,at=pretty(a),labels=kmg) # Millions to Billions set.seed(1) a<-runif(1000,0,1e9) plot(a,yaxt='n',pch=20,col=val2col(a)) kmg<-kmgformat(pretty(a)) axis(2,at=pretty(a),labels=kmg)
# Thousands set.seed(1) a<-runif(1000,0,1e4) plot(a,yaxt='n') kmg<-kmgformat(pretty(a)) axis(2,at=pretty(a),labels=kmg) # Millions to Billions set.seed(1) a<-runif(1000,0,1e9) plot(a,yaxt='n',pch=20,col=val2col(a)) kmg<-kmgformat(pretty(a)) axis(2,at=pretty(a),labels=kmg)
This function generates a GSEA null distribution from
null_gsea(set, reflist, w = 1, np = 1000)
null_gsea(set, reflist, w = 1, np = 1000)
set |
A vector containing gene names. |
reflist |
A named vector containing the weights of the entire signature |
w |
exponent used to raise the supplied scores. Default is 1 (original scores unchanged) |
np |
Number of permutations (Default: 1000) |
A vector of null scores appropriate for the set/reflist combination provided
reflist<-setNames(-sort(rnorm(26)),LETTERS) set<-c('A','B','D','F') nulldist<-null_gsea(set,reflist) nulldist[1:10]
reflist<-setNames(-sort(rnorm(26)),LETTERS) set<-c('A','B','D','F') nulldist<-null_gsea(set,reflist) nulldist[1:10]
This functions converts an p-value into a the corresponding correlation coefficient, using a T distribution with the provided number of samples N minus 2 degrees of freedom
p2corr(p, N)
p2corr(p, N)
p |
a p-value |
N |
a number of samples |
r a correlation coefficient
N<-100 p<-0.05 p2corr(p,N)
N<-100 p<-0.05 p2corr(p,N)
This function gives a gaussian Z-score corresponding to the provided p-value Careful: sign is not provided
p2z(p)
p2z(p)
p |
a p-value |
z a Z score
p<-0.05 p2z(p)
p<-0.05 p2z(p)
A wrapper for functions implementing actual methods
pareto.fit(data, threshold)
pareto.fit(data, threshold)
data |
data vector, lower threshold (or 'find', indicating it should be found from the data), method (likelihood or regression, defaulting to former) |
threshold |
numeric value to define the start of the tail |
List indicating type of distribution ('pareto'), parameters, information about fit (depending on method), OR a warning and NA if method is not recognized
# Estimate the tail of a pospulation normally distributed set.seed(1) x<-rnorm(1000) q95<-as.numeric(quantile(abs(x),0.95)) fit<-pareto.fit(abs(x),threshold=q95) # We can infer the pvalue of a value very much right to the tail of the # distribution value<-5 pvalue<-ppareto(value, threshold=q95, exponent=fit$exponent, lower.tail=FALSE)/20 plot(density(abs(x)),xlim=c(0,value+0.3),main='Pareto fit') arrows(value,0.2,value,0) text(value,0.2,labels=paste0('p=',signif(pvalue,2)))
# Estimate the tail of a pospulation normally distributed set.seed(1) x<-rnorm(1000) q95<-as.numeric(quantile(abs(x),0.95)) fit<-pareto.fit(abs(x),threshold=q95) # We can infer the pvalue of a value very much right to the tail of the # distribution value<-5 pvalue<-ppareto(value, threshold=q95, exponent=fit$exponent, lower.tail=FALSE)/20 plot(density(abs(x)),xlim=c(0,value+0.3),main='Pareto fit') arrows(value,0.2,value,0) text(value,0.2,labels=paste0('p=',signif(pvalue,2)))
This function generates a GSEA plot from a gsea object
plot_gsea( gsea.obj, twoColors = c("red", "blue"), plotNames = FALSE, colBarcode = "black", title = "Running Enrichment Score", bottomYtitle = "List Values", bottomYlabel = "Signature values", ext_nes = NULL, omit_middle = FALSE )
plot_gsea( gsea.obj, twoColors = c("red", "blue"), plotNames = FALSE, colBarcode = "black", title = "Running Enrichment Score", bottomYtitle = "List Values", bottomYlabel = "Signature values", ext_nes = NULL, omit_middle = FALSE )
gsea.obj |
GSEA object produced by the |
twoColors |
the two colors to use for positive[1] and negative[2] enrichment scores |
plotNames |
Logical. Should the set names be plotted? |
colBarcode |
The color of the barcode |
title |
String to be plotted above the Running Enrichment Score |
bottomYtitle |
String for the title of the bottom part of the plot |
bottomYlabel |
String for the label |
ext_nes |
Provide a NES from an external calculation |
omit_middle |
If TRUE, will not plot the running score (FALSE by default) |
Nothing, a plot is generated in the default output device
reflist<-setNames(-sort(rnorm(26)),LETTERS) set<-c('A','B','D','F') obj<-gsea(reflist,set,method='pareto') plot_gsea(obj)
reflist<-setNames(-sort(rnorm(26)),LETTERS) set<-c('A','B','D','F') obj<-gsea(reflist,set,method='pareto') plot_gsea(obj)
Cumulative distribution function of the Pareto distributions ' Gives NA on values < threshold
ppareto(x, threshold = 1, exponent, lower.tail = TRUE)
ppareto(x, threshold = 1, exponent, lower.tail = TRUE)
x |
Data vector, lower threshold, scaling exponent, usual flags |
threshold |
numeric value to define the start of the tail |
exponent |
the exponent obtained from the pareto.fit function |
lower.tail |
logical. If the lower tail of the distribution should be considered. Default is TRUE |
Vector of (log) probabilities
# Estimate the tail of a pospulation normally distributed set.seed(1) x<-rnorm(1000) q95<-as.numeric(quantile(abs(x),0.95)) fit<-pareto.fit(abs(x),threshold=q95) # We can infer the pvalue of a value very much right to the tail of the # distribution value<-5 pvalue<-ppareto(value, threshold=q95, exponent=fit$exponent, lower.tail=FALSE)/20 plot(density(abs(x)),xlim=c(0,value+0.3),main='Pareto fit') arrows(value,0.2,value,0) text(value,0.2,labels=paste0('p=',signif(pvalue,2)))
# Estimate the tail of a pospulation normally distributed set.seed(1) x<-rnorm(1000) q95<-as.numeric(quantile(abs(x),0.95)) fit<-pareto.fit(abs(x),threshold=q95) # We can infer the pvalue of a value very much right to the tail of the # distribution value<-5 pvalue<-ppareto(value, threshold=q95, exponent=fit$exponent, lower.tail=FALSE)/20 plot(density(abs(x)),xlim=c(0,value+0.3),main='Pareto fit') arrows(value,0.2,value,0) text(value,0.2,labels=paste0('p=',signif(pvalue,2)))
REA Calculates enrichment of groups of objects over a vector of values associated to a population of objects
rea(signatures, groups, sweights = NULL, gweights = NULL, minsize = 1)
rea(signatures, groups, sweights = NULL, gweights = NULL, minsize = 1)
signatures |
a named vector, with values as signature values (e.g. logFC) and names as object names (e.g. gene symbols) |
groups |
a list of vectors of objects (e.g. pathways) |
sweights |
weights associated to objects in the signature. If NULL (default) all objects are treated according to the signature rank |
gweights |
weights associated to association strength between each object and each group. If NULL (default) all associations are treated equally |
minsize |
integer. Minimum size of the groups to be analyzed. Default=1 |
A numeric vector of normalized enrichment scores
signatures<-setNames(-sort(rnorm(1000)),paste0('gene',1:1000)) set1<-paste0('gene',sample(1:200,50)) set2<-paste0('gene',sample(1:1000,50)) groups<-list(set1=set1,set2=set2) obj<-rea(signatures=signatures,groups=groups) obj
signatures<-setNames(-sort(rnorm(1000)),paste0('gene',1:1000)) set1<-paste0('gene',sample(1:200,50)) set2<-paste0('gene',sample(1:1000,50)) groups<-list(set1=set1,set2=set2) obj<-rea(signatures=signatures,groups=groups) obj
This function prints a slice of a matrix
slice(matrix)
slice(matrix)
matrix |
A matrix |
prints it
set.seed(1) example<-matrix(rnorm(1000),nrow=100,ncol=10) slice(example)
set.seed(1) example<-matrix(rnorm(1000),nrow=100,ncol=10) slice(example)
This function gives a gaussian Z-score corresponding to the provided p-value Careful: sign is not provided
stouffer(x)
stouffer(x)
x |
a vector of Z scores |
Z an integrated Z score
zs<-c(1,3,5,2,3) stouffer(zs)
zs<-c(1,3,5,2,3) stouffer(zs)
This function is an extension of the 'textplot'
function from the
'wordcloud'
package, with the extra functionality of specifiying the
color of the points
textplot2( x, y, words, cex = 1, pch = 16, pointcolor = "#FFFFFF00", new = TRUE, show.lines = TRUE, ... )
textplot2( x, y, words, cex = 1, pch = 16, pointcolor = "#FFFFFF00", new = TRUE, show.lines = TRUE, ... )
x |
x coordinates |
y |
y coordinates |
words |
the text to plot |
cex |
font size |
pch |
pch parameter for the plotted points |
pointcolor |
a string specifying the color of the points (default #FFFFFF00) |
new |
should a new plot be created |
show.lines |
if true, then lines are plotted between x,y and the word, for those words not covering their x,y coordinates |
... |
Additional parameters to be passed to wordlayout and text. |
nothing
obj_names<-apply(expand.grid(LETTERS,LETTERS),1,paste,collapse='') a<-setNames(runif(26*26),obj_names) b<-setNames(rnorm(26*26),obj_names) plot(a,b,pch=20,col='grey') top<-names(sort(-a))[1:50] textplot2(a[top],b[top],words=top,new=FALSE,pointcolor='black')
obj_names<-apply(expand.grid(LETTERS,LETTERS),1,paste,collapse='') a<-setNames(runif(26*26),obj_names) b<-setNames(rnorm(26*26),obj_names) plot(a,b,pch=20,col='grey') top<-names(sort(-a))[1:50] textplot2(a[top],b[top],words=top,new=FALSE,pointcolor='black')
Convert a numeric vector into colors
val2col( z, col1 = "navy", col2 = "white", col3 = "red3", nbreaks = 100, center = TRUE, rank = FALSE )
val2col( z, col1 = "navy", col2 = "white", col3 = "red3", nbreaks = 100, center = TRUE, rank = FALSE )
z |
a vector of numbers |
col1 |
a color name for the min value, default 'navy' |
col2 |
a color name for the middle value, default 'white' |
col3 |
a color name for the max value, default 'red3' |
nbreaks |
Number of colors to be generated. Default is 30. |
center |
boolean, should the data be centered? Default is TRUE |
rank |
boolean, should the data be ranked? Default is FALSE |
a vector of colors
a<-rnorm(1000) cols<-val2col(a) plot(a,col=cols,pch=16)
a<-rnorm(1000) cols<-val2col(a) plot(a,col=cols,pch=16)
This function calculates the enrichment of a gene regulatory network over a ChIP-Seq derived signature
vulcan(vobj, network, contrast, annotation = NULL, minsize = 10)
vulcan(vobj, network, contrast, annotation = NULL, minsize = 10)
vobj |
a list, the output of the |
network |
an object of class |
contrast |
a vector of two fields, containing the condition names to be compared (1 vs 2) |
annotation |
an optional named vector to convert gene identifiers (e.g. entrez ids to gene symbols) Default (NULL) won't convert gene names. |
minsize |
integer indicating the minimum regulon size for the analysis to be run. Default: 10 |
A list of components:
A matrix of raw peak counts, peaks as rows, samples as columns
A matrix of peak RPKMs, peaks as rows, samples as columns
A matrix of raw gene counts, genes as rows, samples as columns. The counts are associated to the promoter region of the gene
A matrix of RPKMs, genes as rows, samples as columns. The RPKMs are associated to the promoter region of the gene
A matrix of gene abundances normalized by Variance-Stabilizing Transformation (VST), genes as rows, samples as columns. The abundances are associated to the promoter region of the gene
A vector of sample names and conditions
a multisample virtual proteomics object from the viper package
A table of master regulators for a specific signature, indicating their Normalized Enrichment Score (NES) and p-value
library(vulcandata) # Get an example vulcan object (generated with vulcan.import() using the # dummy dataset contained in the \textit{vulcandata} package) vobj<-vulcandata::vulcanexample() # Annotate peaks to gene names vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum') # Normalize data for VULCAN analysis vobj<-vulcan.normalize(vobj) # Detect which conditions are present names(vobj$samples) # Load an ARACNe network # This is a regulon object as specified in the VIPER package, named 'network' load(system.file('extdata','network.rda',package='vulcandata',mustWork=TRUE)) # Run VULCAN # We can reduce the minimum regulon size, since in this example only one # chromosome # was measured, and the networks would otherwise have too few hits vobj_analysis<-vulcan(vobj,network=network,contrast=c('t90','t0'),minsize=5) # Visualize output using the msviper plotting function plot(vobj_analysis$msviper,mrs=7)
library(vulcandata) # Get an example vulcan object (generated with vulcan.import() using the # dummy dataset contained in the \textit{vulcandata} package) vobj<-vulcandata::vulcanexample() # Annotate peaks to gene names vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum') # Normalize data for VULCAN analysis vobj<-vulcan.normalize(vobj) # Detect which conditions are present names(vobj$samples) # Load an ARACNe network # This is a regulon object as specified in the VIPER package, named 'network' load(system.file('extdata','network.rda',package='vulcandata',mustWork=TRUE)) # Run VULCAN # We can reduce the minimum regulon size, since in this example only one # chromosome # was measured, and the networks would otherwise have too few hits vobj_analysis<-vulcan(vobj,network=network,contrast=c('t90','t0'),minsize=5) # Visualize output using the msviper plotting function plot(vobj_analysis$msviper,mrs=7)
This function coalesces and annotates a set of BAM files into peak-centered data. It implements the ChIPPeakANno methods, with specific choices dealing with defining the genomic area around the promoter and which peaks to include.
vulcan.annotate( vobj, lborder = -10000, rborder = 10000, method = c("closest", "strongest", "sum", "topvar", "farthest", "lowvar"), TxDb = NULL )
vulcan.annotate( vobj, lborder = -10000, rborder = 10000, method = c("closest", "strongest", "sum", "topvar", "farthest", "lowvar"), TxDb = NULL )
vobj |
A list of peakcounts, samples and peakrpkms (i.e. the output of the function vulcan.import) |
lborder |
Boundary for peak annotation (in nucleotides) upstream of the Transcription starting site (default: -10000) |
rborder |
Boundary for peak annotation (in nucleotides) downstream of the Transcription starting site (default: 10000) |
method |
Method to deal with multiple peaks found within gene promoter boundaries. One of sum (default), closest, strongest, topvar, farthest or lowvar. This will affect only genes with multiple possible peaks. When a single peak can be mapped to the promoter region of the gene, that peak abundance will be considered as the gene promoter's occupancy.
|
TxDb |
TxDb annotation object containing the knownGene track. If NULL (the default), TxDb.Hsapiens.UCSC.hg19.knownGene is loaded |
A list of components:
A matrix of raw peak counts, peaks as rows, samples as columns
A matrix of peak RPKMs, peaks as rows, samples as columns
A matrix of raw gene counts, genes as rows, samples as columns. The counts are associated to the promoter region of the gene
A matrix of RPKMs, genes as rows, samples as columns. The RPKMs are associated to the promoter region of the gene
A vector of sample names and conditions
library(vulcandata) vobj<-vulcandata::vulcanexample() vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
library(vulcandata) vobj<-vulcandata::vulcanexample() vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum')
This function coalesces and annotates a set of BAM files into peak-centered data
vulcan.import(sheetfile, intervals = NULL)
vulcan.import(sheetfile, intervals = NULL)
sheetfile |
path to a csv annotation file containing sample information and BAM location |
intervals |
size of the peaks. If NULL (default) it is inferred from the average fragment length observed in the dataset |
A list of components:
A matrix of raw peak counts, peaks as rows, samples as columns
A matrix of peak RPKMs, peaks as rows, samples as columns
A vector of sample names and conditions
library(vulcandata) # Generate an annotation file from the dummy ChIP-Seq dataset vfile<-tempfile() vulcandata::vulcansheet(vfile) # Import BAM and BED information into a list object # vobj<-vulcan.import(vfile) # This vobj is identical to the object returned by # vulcandata::vulcanexample() unlink(vfile)
library(vulcandata) # Generate an annotation file from the dummy ChIP-Seq dataset vfile<-tempfile() vulcandata::vulcansheet(vfile) # Import BAM and BED information into a list object # vobj<-vulcan.import(vfile) # This vobj is identical to the object returned by # vulcandata::vulcanexample() unlink(vfile)
This function normalizes gene-centered ChIP-Seq data using VST
vulcan.normalize(vobj)
vulcan.normalize(vobj)
vobj |
a list, the output of the |
A list of components:
A matrix of raw peak counts, peaks as rows, samples as columns
A matrix of peak RPKMs, peaks as rows, samples as columns
A matrix of raw gene counts, genes as rows, samples as columns. The counts are associated to the promoter region of the gene
A matrix of RPKMs, genes as rows, samples as columns. The RPKMs are associated to the promoter region of the gene
A matrix of gene abundances normalized by Variance-Stabilizing Transformation (VST), genes as rows, samples as columns. The abundances are associated to the promoter region of the gene
A vector of sample names and conditions
## Not run: library(vulcandata) vobj<-vulcandata::vulcanexample() vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum') vobj<-vulcan.normalize(vobj) ## End(Not run)
## Not run: library(vulcandata) vobj<-vulcandata::vulcanexample() vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum') vobj<-vulcan.normalize(vobj) ## End(Not run)
This function applies Gene Set Enrichment Analysis or Rank Enrichment Analysis over a ChIP-Seq signature contained in a vulcan package object
vulcan.pathways( vobj, pathways, contrast = NULL, method = c("GSEA", "REA"), np = 1000 )
vulcan.pathways( vobj, pathways, contrast = NULL, method = c("GSEA", "REA"), np = 1000 )
vobj |
a list, the output of the |
pathways |
a list of vectors, one vector of gene identifiers per pathway |
contrast |
a vector with the name of the two conditions to compare. If method=='REA', contrast can be set to 'all', and Rank Enrichment Analysis will be performed for every sample independently, compared to the mean of the dataset. |
method |
either 'REA' for Rank Enrichment Analysis or 'GSEA' for Gene Set Enrichment Analysis |
np |
numeric, only for GSEA, the number of permutations to build the null distribution. Default is 1000 |
if method=='GSEA', a named vector, with pathway names as names, and the normalized enrichment score of either the GSEA as value. If method=='REA', a matrix, with pathway names as rows and specific contrasts as columns (the method 'REA' allows for multiple contrasts to be calculated at the same time)
library(vulcandata) vfile<-tempfile() vulcandata::vulcansheet(vfile) #vobj<-vulcan.import(vfile) vobj<-vulcandata::vulcanexample() unlink(vfile) vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum') vobj<-vulcan.normalize(vobj) # Create a dummy pathway list (in entrez ids) pathways<-list( pathwayA=rownames(vobj$normalized)[1:20], pathwayB=rownames(vobj$normalized)[21:50] ) # Which contrast groups can be queried names(vobj$samples) results_gsea<-vulcan.pathways(vobj,pathways,contrast=c('t90','t0'), method='GSEA') results_rea<-vulcan.pathways(vobj,pathways,contrast=c('all'),method='REA')
library(vulcandata) vfile<-tempfile() vulcandata::vulcansheet(vfile) #vobj<-vulcan.import(vfile) vobj<-vulcandata::vulcanexample() unlink(vfile) vobj<-vulcan.annotate(vobj,lborder=-10000,rborder=10000,method='sum') vobj<-vulcan.normalize(vobj) # Create a dummy pathway list (in entrez ids) pathways<-list( pathwayA=rownames(vobj$normalized)[1:20], pathwayB=rownames(vobj$normalized)[21:50] ) # Which contrast groups can be queried names(vobj$samples) results_gsea<-vulcan.pathways(vobj,pathways,contrast=c('t90','t0'), method='GSEA') results_rea<-vulcan.pathways(vobj,pathways,contrast=c('all'),method='REA')
This function gives a gaussian Z-score corresponding to the provided p-value Careful: sign is not provided
wstouffer(x, w)
wstouffer(x, w)
x |
a vector of Z scores |
w |
weight for each Z score |
Z an integrated Z score
zs<-c(1,-3,5,2,3) ws<-c(1,10,1,2,1) wstouffer(zs,ws)
zs<-c(1,-3,5,2,3) ws<-c(1,10,1,2,1) wstouffer(zs,ws)
This function gives a gaussian p-value corresponding to the provided Z-score
z2p(z)
z2p(z)
z |
a Z score |
a p-value
z<-1.96 z2p(z)
z<-1.96 z2p(z)