Title: | An R package for fast computing for adaptively weighted fisher's method |
---|---|
Description: | Implementation of the adaptively weighted fisher's method, including fast p-value computing, variability index, and meta-pattern. |
Authors: | Zhiguang Huo |
Maintainer: | Zhiguang Huo <[email protected]> |
License: | GPL-3 |
Version: | 1.21.0 |
Built: | 2024-11-04 06:05:12 UTC |
Source: | https://github.com/bioc/AWFisher |
R package for fast computing for adaptively weighted fisher's method
AWFisher_pvalue(p.values)
AWFisher_pvalue(p.values)
p.values |
Input G by K p-value matrix. Each row represent a gene and each column represent a study. Note that K has to be >=2 and <=100. |
fast computing for adaptively weighted fisher's method
A list consisting of AWFisher pvalues and AWweight.
pvalues |
AWFisher pvalues. |
weights |
G by K binary weight matrix W. $W_gk = 1$ represents for gene $g$, study $k$ contributes to the meta-analysis result. $W_gk = 0$ otherwise. |
Zhiguang Huo
K <- 40 G <- 10000 p.values = matrix(rbeta(K*G, 1,1), ncol=K) res = AWFisher_pvalue(p.values) hist(res$pvalues, breaks=40) table(rowSums(res$weights)) pvalues=res$pvalues[order(res$pvalues)] plot(-log10((1:NROW(pvalues))/(1+NROW(pvalues))), -log10(pvalues),xlab='theoretical quantile', ylab='observed quantile') lines(c(0,100), c(0,100), col=2)
K <- 40 G <- 10000 p.values = matrix(rbeta(K*G, 1,1), ncol=K) res = AWFisher_pvalue(p.values) hist(res$pvalues, breaks=40) table(rowSums(res$weights)) pvalues=res$pvalues[order(res$pvalues)] plot(-log10((1:NROW(pvalues))/(1+NROW(pvalues))), -log10(pvalues),xlab='theoretical quantile', ylab='observed quantile') lines(c(0,100), c(0,100), col=2)
biomarker categrorization
biomarkerCategorization(studies, afunction, B = 10, DEindex = NULL, fdr = NULL, silence = FALSE)
biomarkerCategorization(studies, afunction, B = 10, DEindex = NULL, fdr = NULL, silence = FALSE)
studies |
a list of K studies. Each element (kth study) of the list is another list consisting gene expression matrix and and label information. |
afunction |
A function for DE analysis. Options can be function_limma or function_edgeR. Default option is function_limma. However, use could define their own function. The input of afunction should be list(data, label) which is consistent with one element of the studies list/argument. The return of afunction should be list(pvalue=apvalue, effectSize=aeffectsize) |
B |
number of permutation should be used. B=1000 is suggested. |
DEindex |
If NULL, BH method will be applied to p-values and FDR 0.05 will be used. User could specify a logical vector as DEindex. |
fdr |
Default is 0.05. The co-membership matrix calculation will base on genes with this specified fdr. |
silence |
If TRUE, will print out the bootstrapping procedure. |
biomarker categrorization via boostrap AW weight.
A list consisting of biomarker categrorization result.
varibility |
Varibility index for all genes |
dissimilarity |
Dissimilarity matrix of genes of DEindex==TRUE |
DEindex |
DEindex for Dissimilarity |
Zhiguang Huo
N0 = 10 G <- 1000 GDEp <- 50 GDEn <- 50 K = 4 studies <- NULL set.seed(15213) for(k in seq_len(K)){ astudy <- matrix(rnorm(N0*2*G),nrow=G,ncol=N0*2) ControlLabel <- seq_len(N0) caseLabel <- (N0 + 1):(2*N0) astudy[1:GDEp,caseLabel] <- astudy[1:GDEp,caseLabel] + 2 astudy[1:GDEp + GDEn,caseLabel] <- astudy[1:GDEp + GDEn,caseLabel] - 2 alabel = c(rep(0,length(ControlLabel)),rep(1,length(caseLabel))) studies[[k]] <- list(data=astudy, label=alabel) } result <- biomarkerCategorization(studies,function_limma,B=100,DEindex=NULL) sum(result$DEindex) head(result$varibility) print(result$dissimilarity[1:4,1:4])
N0 = 10 G <- 1000 GDEp <- 50 GDEn <- 50 K = 4 studies <- NULL set.seed(15213) for(k in seq_len(K)){ astudy <- matrix(rnorm(N0*2*G),nrow=G,ncol=N0*2) ControlLabel <- seq_len(N0) caseLabel <- (N0 + 1):(2*N0) astudy[1:GDEp,caseLabel] <- astudy[1:GDEp,caseLabel] + 2 astudy[1:GDEp + GDEn,caseLabel] <- astudy[1:GDEp + GDEn,caseLabel] - 2 alabel = c(rep(0,length(ControlLabel)),rep(1,length(caseLabel))) studies[[k]] <- list(data=astudy, label=alabel) } result <- biomarkerCategorization(studies,function_limma,B=100,DEindex=NULL) sum(result$DEindex) head(result$varibility) print(result$dissimilarity[1:4,1:4])
The purpose of the multi-tissue mouse metabolism transcriptomic data is to study how the gene expression changes with respect to the energy deficiency using mouse models. Very long-chain acyl-CoA dehydrogenase (VLCAD) deficiency was found to be associated with energy metabolism disorder in children. Two genotypes of the mouse model - wild type (VLCAD +/+) and VLCAD-deficient (VLCAD -/-) - were studied for three types of tissues (brown fat, liver, heart) with 3 to 4 mice in each genotype group. The sample size information is available in the table below. A total of 6,883 genes are available in this example dataset.
data_mouseMetabolism
data_mouseMetabolism
A list of data.frame with 6,883 genes (rows) and 3 - 4 mouse samples in each genotype group (columns).
data for the brown fat tissue
data for the heart tissue
data for the liver tissue
https://projecteuclid.org/download/pdfview_1/euclid.aoas/1310562214
data(data_mouseMetabolism)
data(data_mouseMetabolism)
use edgeR function to get pvalue
function_edgeR(astudy)
function_edgeR(astudy)
astudy |
A list contains a data matrix and a vector of group label |
use edgeR function to get pvalue
A list of pvalue and effect size
Zhiguang Huo
N0 = 10 G <- 1000 GDEp <- 50 GDEn <- 50 set.seed(15213) astudy <- matrix(rpois(N0*2*G,10),nrow=G,ncol=N0*2) ControlLabel <- 1:N0 caseLabel <- (N0 + 1):(2*N0) astudy[1:GDEp,caseLabel] <- astudy[1:GDEp,caseLabel] + 2 astudy[1:GDEp + GDEn,caseLabel] <- astudy[1:GDEp,caseLabel] - 2 alabel <- c(rep(0,length(ControlLabel)),rep(1,length(caseLabel))) Study <- list(data=astudy, label=alabel) result <- function_edgeR(Study) fdr <- p.adjust(result$pvalue) sum(fdr<=0.05)
N0 = 10 G <- 1000 GDEp <- 50 GDEn <- 50 set.seed(15213) astudy <- matrix(rpois(N0*2*G,10),nrow=G,ncol=N0*2) ControlLabel <- 1:N0 caseLabel <- (N0 + 1):(2*N0) astudy[1:GDEp,caseLabel] <- astudy[1:GDEp,caseLabel] + 2 astudy[1:GDEp + GDEn,caseLabel] <- astudy[1:GDEp,caseLabel] - 2 alabel <- c(rep(0,length(ControlLabel)),rep(1,length(caseLabel))) Study <- list(data=astudy, label=alabel) result <- function_edgeR(Study) fdr <- p.adjust(result$pvalue) sum(fdr<=0.05)
use limma function to get pvalue
function_limma(astudy)
function_limma(astudy)
astudy |
A list contains a data matrix and a vector of group label |
use limma function to get pvalue
A list of pvalue and effect size
Zhiguang Huo
N0 = 10 G <- 1000 GDEp <- 50 GDEn <- 50 set.seed(15213) astudy <- matrix(rnorm(N0*2*G),nrow=G,ncol=N0*2) ControlLabel <- 1:N0 caseLabel <- (N0 + 1):(2*N0) astudy[1:GDEp,caseLabel] <- astudy[1:GDEp,caseLabel] + 2 astudy[1:GDEp + GDEn,caseLabel] <- astudy[1:GDEp,caseLabel] - 2 alabel <- c(rep(0,length(ControlLabel)),rep(1,length(caseLabel))) Study <- list(data=astudy, label=alabel) result <- function_limma(Study) fdr <- p.adjust(result$pvalue) sum(fdr<=0.05)
N0 = 10 G <- 1000 GDEp <- 50 GDEn <- 50 set.seed(15213) astudy <- matrix(rnorm(N0*2*G),nrow=G,ncol=N0*2) ControlLabel <- 1:N0 caseLabel <- (N0 + 1):(2*N0) astudy[1:GDEp,caseLabel] <- astudy[1:GDEp,caseLabel] + 2 astudy[1:GDEp + GDEn,caseLabel] <- astudy[1:GDEp,caseLabel] - 2 alabel <- c(rep(0,length(ControlLabel)),rep(1,length(caseLabel))) Study <- list(data=astudy, label=alabel) result <- function_limma(Study) fdr <- p.adjust(result$pvalue) sum(fdr<=0.05)
Variability Index
variabilityIndex(studies, afunction, B = 10, silence = FALSE)
variabilityIndex(studies, afunction, B = 10, silence = FALSE)
studies |
a list of K studies. Each element (kth study) of the list is another list consisting gene expression matrix and and label information. |
afunction |
A function for DE analysis. Options can be function_limma or function_edgeR. Default option is function_limma. However, use could define their own function. The input of afunction should be list(data, label) which is consistent with one element of the studies list/argument. The return of afunction should be list(pvalue=apvalue, effectSize=aeffectsize) |
B |
number of permutation should be used. B=1000 is suggested. |
silence |
If TRUE, will print out the bootstrapping procedure. |
Variability Index via boostrap AW weight.
A list consisting of biomarker categrorization result.
varibility |
Varibility index for all genes |
Zhiguang Huo
N0 = 10 G <- 1000 GDEp <- 50 GDEn <- 50 K = 4 studies <- NULL set.seed(15213) for(k in 1:K){ astudy <- matrix(rnorm(N0*2*G),nrow=G,ncol=N0*2) ControlLabel <- 1:N0 caseLabel <- (N0 + 1):(2*N0) astudy[1:GDEp,caseLabel] <- astudy[1:GDEp,caseLabel] + 2 astudy[1:GDEp + GDEn,caseLabel] <- astudy[1:GDEp + GDEn,caseLabel] - 2 alabel = c(rep(0,length(ControlLabel)),rep(1,length(caseLabel))) studies[[k]] <- list(data=astudy, label=alabel) } result <- variabilityIndex(studies,function_limma,B=100) head(result)
N0 = 10 G <- 1000 GDEp <- 50 GDEn <- 50 K = 4 studies <- NULL set.seed(15213) for(k in 1:K){ astudy <- matrix(rnorm(N0*2*G),nrow=G,ncol=N0*2) ControlLabel <- 1:N0 caseLabel <- (N0 + 1):(2*N0) astudy[1:GDEp,caseLabel] <- astudy[1:GDEp,caseLabel] + 2 astudy[1:GDEp + GDEn,caseLabel] <- astudy[1:GDEp + GDEn,caseLabel] - 2 alabel = c(rep(0,length(ControlLabel)),rep(1,length(caseLabel))) studies[[k]] <- list(data=astudy, label=alabel) } result <- variabilityIndex(studies,function_limma,B=100) head(result)