Title: | Artificial Components Detection of Differentially Expressed Genes |
---|---|
Description: | This package provides a multivariate inferential analysis method for detecting differentially expressed genes in gene expression data. It uses artificial components, close to the data's principal components but with an exact interpretation in terms of differential genetic expression, to identify differentially expressed genes while controlling the false discovery rate (FDR). The methods on this package are described in the vignette or in the article 'Multivariate Method for Inferential Identification of Differentially Expressed Genes in Gene Expression Experiments' by J. P. Acosta, L. Lopez-Kleine and S. Restrepo (2015, pending publication). |
Authors: | Juan Pablo Acosta, Liliana Lopez-Kleine |
Maintainer: | Juan Pablo Acosta <[email protected]> |
License: | GPL-3 |
Version: | 1.37.0 |
Built: | 2024-10-30 03:23:59 UTC |
Source: | https://github.com/bioc/acde |
This package provides a multivariate inferential analysis method for detecting differentially expressed genes in gene expression data. It uses artificial components, close to the data's principal components but with an exact interpretation in terms of differential genetic expression, to identify differentially expressed genes while controlling the false discovery rate (FDR). The methods on this package are described in the article Multivariate Method for Inferential Identification of Differentially Expressed Genes in Gene Expression Experiments by Acosta (2015).
Package: | acde |
Type: | Package |
Version: | 1.0 |
Date: | 2015-02-25 |
License: | GLP-3 |
LazyData: | yes |
Depends: | R(>= 3.1), ade4(>= 1.6), boot(>= 1.3) |
Encoding: | UTF-8 |
Built: | R 3.1.2; 2015-05-01; unix |
Index:
ac Artificial Components for Gene Expression Data acde-package Artificial Components Detection of Differentially Expressed Genes bcaFDR BCa Confidence Upper Bound for the FDR. fdr False Discovery Rate Computation phytophthora Gene Expression Data for Tomato Plants Inoculated with _Phytophthora infestans_ plot.STP Plot Method for Single Time Point Analysis plot.TC Plot Method for Time Course Analysis print.STP Print Method for Single Time Point Analysis print.TC Print Method for Time Course Analysis qval Q-Values Computation stp Single Time Point Analysis for Detecting Differentially Expressed Genes tc Time Course Analysis for Detecting Differentially Expressed Genes
Juan Pablo Acosta, Liliana Lopez-Kleine
Maintainer: Juan Pablo Acosta <[email protected]>
Acosta, J. P. (2015) Strategy for Multivariate Identification of Differentially Expressed Genes in Microarray Data. Unpublished MS thesis. Universidad Nacional de Colombia, Bogot\'a.
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 resSTP <- stp(Z, des) resSTP plot(resSTP) ## Time course analysis for 500 genes with 10 treatment ## replicates and 10 control replicates tPts <- c("h0", "12h", "24h") n <- 500; p <- 20; p1 <- 10 Z <- vector("list", 3) des <- vector("list", 3) for(tp in 1:3){ des[[tp]] <- c(rep(1, p1), rep(2, (p-p1))) } mu <- as.matrix(rexp(n, rate=1)) ### h0 time point (no diff. expr.) Z[[1]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### h12 time point (diff. expr. begins) Z[[2]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### Up regulated genes Z[[2]][1:5,1:p1] <- Z[[2]][1:5,1:p1] + matrix(runif(5*p1, 1, 3), nrow=5) #### Down regulated genes Z[[2]][6:15,(p1+1):p] <- Z[[2]][6:15,(p1+1):p] + matrix(runif(10*(p-p1), 1, 2), nrow=10) ### h24 time point (maximum differential expression) Z[[3]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### 5 up regulated genes Z[[3]][1:5,1:p1] <- Z[[3]][1:5,1:p1] + 5 #### 10 down regulated genes Z[[3]][6:15,(p1+1):p] <- Z[[3]][6:15,(p1+1):p] + 4 resTC <- tc(Z, des) resTC summary(resTC) plot(resTC)
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 resSTP <- stp(Z, des) resSTP plot(resSTP) ## Time course analysis for 500 genes with 10 treatment ## replicates and 10 control replicates tPts <- c("h0", "12h", "24h") n <- 500; p <- 20; p1 <- 10 Z <- vector("list", 3) des <- vector("list", 3) for(tp in 1:3){ des[[tp]] <- c(rep(1, p1), rep(2, (p-p1))) } mu <- as.matrix(rexp(n, rate=1)) ### h0 time point (no diff. expr.) Z[[1]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### h12 time point (diff. expr. begins) Z[[2]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### Up regulated genes Z[[2]][1:5,1:p1] <- Z[[2]][1:5,1:p1] + matrix(runif(5*p1, 1, 3), nrow=5) #### Down regulated genes Z[[2]][6:15,(p1+1):p] <- Z[[2]][6:15,(p1+1):p] + matrix(runif(10*(p-p1), 1, 2), nrow=10) ### h24 time point (maximum differential expression) Z[[3]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### 5 up regulated genes Z[[3]][1:5,1:p1] <- Z[[3]][1:5,1:p1] + 5 #### 10 down regulated genes Z[[3]][6:15,(p1+1):p] <- Z[[3]][6:15,(p1+1):p] + 4 resTC <- tc(Z, des) resTC summary(resTC) plot(resTC)
Computes the artificial components for gene expression data between two conditions for a single time point.
ac(Z, design) ac2(Z, design)
ac(Z, design) ac2(Z, design)
Z |
a numeric matrix or data.frame with |
design |
a vector of length |
This function computes the artificial components of Z, based on the
specified design vector. First, the function scales so that
its columns have zero mean and unit variance. Then computation of
the artificial components
and
is performed as
, where
,
and
, where
. Here,
is the number of treatment replicates, and
has
positive and
negative entries.
ac
returns a matrix with the artificial components
and
in the columns.
ac2
returns a matrix with the second artificial component
in the only column.
Juan Pablo Acosta ([email protected]).
Acosta, J. P. (2015) Strategy for Multivariate Identification of Differentially Expressed Genes in Microarray Data. Unpublished MS thesis. Universidad Nacional de Colombia, Bogot\'a.
## Computes the artificial components for the ## phitophthora infestans data at 60 hai. psi <- ac(phytophthora[[4]], c(rep(1,8), rep(2,8))) plot(x=psi[,1], y=psi[,2])
## Computes the artificial components for the ## phitophthora infestans data at 60 hai. psi <- ac(phytophthora[[4]], c(rep(1,8), rep(2,8))) plot(x=psi[,1], y=psi[,2])
For internal use in function stp
. Computes a BCa confidence
upper bound for the FDR following Algorithm 2 in the vignette.
bcaFDR(Z, design, th = NULL, B = 100, lambda = 0.5, PER = FALSE, R = 1000, gamma = 0.95, Q = NULL, ...)
bcaFDR(Z, design, th = NULL, B = 100, lambda = 0.5, PER = FALSE, R = 1000, gamma = 0.95, Q = NULL, ...)
Z |
a matrix or data.frame representing genes' expression levels. The rows of
|
design |
a vector of length equal to the number of columns in |
th |
Threshold values for estimating the FDR. If |
B |
Number of bootstrap or permutation replications for estimating the FDR at
each iteration (as passed from |
lambda |
Parameter for the estimation of |
R |
Number of bootstrap replications for the computation of the FDR's BCa
confidence upper bound (as passed from |
gamma |
Confidence level for the FDR's BCa upper confidence bound (as passed
from |
PER |
If |
Q |
Estimated FDR as returned in object |
... |
additional arguments for parallel computation in |
cbound |
BCa upper confidence bound for the FDR for each threshold value in |
warnings |
warning messages generated from use of |
Juan Pablo Acosta ([email protected]).
Acosta, J. P. (2015) Strategy for Multivariate Identification of Differentially Expressed Genes in Microarray Data. Unpublished MS thesis. Universidad Nacional de Colombia, Bogot\'a.
Storey, J. D. (2002) A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3): 479–498.
Efron B. and Tibshirani R. J. (1994) An Introduction to the Bootstrap. Chapman & Hall/CRC, 1993.
stp
.
## Single time point analysis for 50 genes with 10 treatment ## replicates and 10 control replicates n <- 50; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 5 resFdr <- fdr(Z, des) bca <- bcaFDR(Z, des, Q=resFdr$Q, B=50, R=500) plot(resFdr$th, resFdr$Q, type="l", col="blue") lines(resFdr$th, bca$cbound, col="green") legend(x="topright", legend=c("FDR", "BCa upper bound"), lty=c(1,1), col=c("blue", "green")) ## Note: Discontinuities in the BCa upper bound are due to warnings ## generated during computations with function \code{boot.ci} ## from package \code{boot}.
## Single time point analysis for 50 genes with 10 treatment ## replicates and 10 control replicates n <- 50; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 5 resFdr <- fdr(Z, des) bca <- bcaFDR(Z, des, Q=resFdr$Q, B=50, R=500) plot(resFdr$th, resFdr$Q, type="l", col="blue") lines(resFdr$th, bca$cbound, col="green") legend(x="topright", legend=c("FDR", "BCa upper bound"), lty=c(1,1), col=c("blue", "green")) ## Note: Discontinuities in the BCa upper bound are due to warnings ## generated during computations with function \code{boot.ci} ## from package \code{boot}.
For internal use in functions stp
and bcaFDR
.
Computes steps 2.1 to 2.4 from Algorithm 1 in the vignette.
fdr(Z, design, th = NULL, B = 100, lambda = 0.5, PER = FALSE, ...)
fdr(Z, design, th = NULL, B = 100, lambda = 0.5, PER = FALSE, ...)
Z |
a matrix or data.frame representing genes' expression levels. The rows
of |
design |
a vector of length equal to the number of columns in |
th |
Threshold values for estimating the FDR. If |
B |
Number of bootstrap or permutation replications for estimating the FDR
(as passed from |
lambda |
Parameter for the estimation of |
PER |
If |
... |
additional arguments for parallel computation in |
Q |
Estimations of the FDR using each value in |
th |
Threshold values used for estimating the FDR. |
pi0 |
Estimation of |
B |
Number of bootstrap or permutation replications used for estimating the FDR. |
lambda |
Parameter used for the estimation of |
call |
The matched call. |
Juan Pablo Acosta ([email protected]).
Acosta, J. P. (2015) Strategy for Multivariate Identification of Differentially Expressed Genes in Microarray Data. Unpublished MS thesis. Universidad Nacional de Colombia, Bogot\'a.
Storey, J. D. (2002) A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3): 479–498.
stp
.
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 res <- fdr(Z, des) plot(res$th, res$Q, type="l", col="blue") legend(x="topright", legend="FDR", lty=1, col="blue")
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 res <- fdr(Z, des) plot(res$th, res$Q, type="l", col="blue") legend(x="topright", legend="FDR", lty=1, col="blue")
Gene expression data for 16 tomato plants (line IL6-2) in field conditions. 8 of these plants were inoculated with Phytophthora infestans, and the other 8 were mock-inoculated with sterile water. Leaf tissue samples from each replicate were taken at 12 hours before and 12, 36 and 60 hours after inoculation. We refer to 12 hours before inoculation as the h0 time point. Expression levels were obtained for 13440 genes.
data("phytophthora")
data("phytophthora")
A list with four matrices representing expression levels for 13440 genes (rows) in 16 replicates (columns) at time points "h0", "h12", "h36" and "h60". At each time point, the first 8 columns correspond to treatment (inoculated) replicates and the last 8 columns correspond to control (mock-inoculated) replicates. The names of the genes are the names of the rows in each matrix.
For details about experimental conditions, see Restrepo et al. (2005) and Cai et al. (2013).
Tomato Expression Database website (http://ted.bti.cornell.edu/), experiment E022 (Restrepo et al., 2005).
Restrepo, S., Cai, G., Fry, W. E. and Smart, C. D. (2005) Gene expression profiling of infection of tomato by Phytophthora infestans in the field. Phytopathology, 95(S88).
Cai, G., Restrepo, S., Myers, K., Zuluaga, P., Danies, G., Smart, C. and Fry W.E. (2013) Gene profiling in partially resistant and susceptible near-isogenic tomatoes in response to late blight in the field. Molecular plant pathology, 14(2): 171–184.
for(tp in 1:4){ cat(paste("Time Point:", names(phytophthora[tp]), "\n")) print(phytophthora[[tp]][1:10,]) cat("...\n \n") }
for(tp in 1:4){ cat(paste("Time Point:", names(phytophthora[tp]), "\n")) print(phytophthora[[tp]][1:10,]) cat("...\n \n") }
a method for the plot generic. It is designed for displaying plots of the estimated FDR and the genes' classification when performing a Single Time Point Analysis for detecting differentially expressed genes in gene expression data.
## S3 method for class 'STP' plot(x, FDR=TRUE, AC=TRUE, WARNINGS=FALSE, tp=NULL, ...)
## S3 method for class 'STP' plot(x, FDR=TRUE, AC=TRUE, WARNINGS=FALSE, tp=NULL, ...)
x |
an object of class ' |
FDR |
if |
AC |
if |
WARNINGS |
if |
tp |
a character string to be added at the end of the plot's title
(used for adding time points in |
... |
further arguments passed to or from other methods. |
Juan Pablo Acosta ([email protected]).
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 resSTP <- stp(Z, des) resSTP plot(resSTP)
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 resSTP <- stp(Z, des) resSTP plot(resSTP)
a method for the plot generic. It is designed for displaying plots of the estimated FDR and the genes' classification when performing a Time Course Analysis for detecting differentially expressed genes in gene expression data.
## S3 method for class 'TC' plot(x, iRatios=TRUE, FDR = TRUE, AC = TRUE, WARNINGS = FALSE, ...)
## S3 method for class 'TC' plot(x, iRatios=TRUE, FDR = TRUE, AC = TRUE, WARNINGS = FALSE, ...)
x |
if |
iRatios |
an object of class ' |
FDR |
if |
AC |
if |
WARNINGS |
if |
... |
further arguments passed to or from other methods. |
Juan Pablo Acosta ([email protected]).
tc
, print.TC
, summary.TC
.
## Time course analysis for 500 genes with 10 treatment ## replicates and 10 control replicates tPts <- c("h0", "12h", "24h") n <- 500; p <- 20; p1 <- 10 Z <- vector("list", 3) des <- vector("list", 3) for(tp in 1:3){ des[[tp]] <- c(rep(1, p1), rep(2, (p-p1))) } mu <- as.matrix(rexp(n, rate=1)) ### h0 time point (no diff. expr.) Z[[1]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### h12 time point (diff. expr. begins) Z[[2]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### Up regulated genes Z[[2]][1:5,1:p1] <- Z[[2]][1:5,1:p1] + matrix(runif(5*p1, 1, 3), nrow=5) #### Down regulated genes Z[[2]][6:15,(p1+1):p] <- Z[[2]][6:15,(p1+1):p] + matrix(runif(10*(p-p1), 1, 2), nrow=10) ### h24 time point (maximum differential expression) Z[[3]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### 5 up regulated genes Z[[3]][1:5,1:p1] <- Z[[3]][1:5,1:p1] + 5 #### 10 down regulated genes Z[[3]][6:15,(p1+1):p] <- Z[[3]][6:15,(p1+1):p] + 4 resTC <- tc(Z, des) resTC summary(resTC) plot(resTC) ## Not run: ## Phytophthora Infestans Time Course Analysis (takes time...) dataPI <- phytophthora desPI <- vector("list", 4) for(tp in 1:4){ desPI[[tp]] <- c(rep(1, 8), rep(2, 8)) } resPI <- tc(dataPI, desPI) resPI summary(resPI) plot(resPI) ## End(Not run)
## Time course analysis for 500 genes with 10 treatment ## replicates and 10 control replicates tPts <- c("h0", "12h", "24h") n <- 500; p <- 20; p1 <- 10 Z <- vector("list", 3) des <- vector("list", 3) for(tp in 1:3){ des[[tp]] <- c(rep(1, p1), rep(2, (p-p1))) } mu <- as.matrix(rexp(n, rate=1)) ### h0 time point (no diff. expr.) Z[[1]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### h12 time point (diff. expr. begins) Z[[2]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### Up regulated genes Z[[2]][1:5,1:p1] <- Z[[2]][1:5,1:p1] + matrix(runif(5*p1, 1, 3), nrow=5) #### Down regulated genes Z[[2]][6:15,(p1+1):p] <- Z[[2]][6:15,(p1+1):p] + matrix(runif(10*(p-p1), 1, 2), nrow=10) ### h24 time point (maximum differential expression) Z[[3]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### 5 up regulated genes Z[[3]][1:5,1:p1] <- Z[[3]][1:5,1:p1] + 5 #### 10 down regulated genes Z[[3]][6:15,(p1+1):p] <- Z[[3]][6:15,(p1+1):p] + 4 resTC <- tc(Z, des) resTC summary(resTC) plot(resTC) ## Not run: ## Phytophthora Infestans Time Course Analysis (takes time...) dataPI <- phytophthora desPI <- vector("list", 4) for(tp in 1:4){ desPI[[tp]] <- c(rep(1, 8), rep(2, 8)) } resPI <- tc(dataPI, desPI) resPI summary(resPI) plot(resPI) ## End(Not run)
a method for the print generic. It prints relevant results when performing a Single Time Point Analysis for detecting differentially expressed genes in gene expression data.
## S3 method for class 'STP' print(x, headerSTP = TRUE, ...)
## S3 method for class 'STP' print(x, headerSTP = TRUE, ...)
x |
an object of class ' |
headerSTP |
if |
... |
further arguments passed to or from other methods. |
If the desired FDR level was achieved (i.e. x$astar <= x$alpha
),
the results are printed for the differentially expressed genes and 10
more rows only. If the desired FDR level was not achieved, only ten
rows are displayed.
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 resSTP <- stp(Z, des) resSTP plot(resSTP)
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 resSTP <- stp(Z, des) resSTP plot(resSTP)
methods for the print and summary generics that print relevant results when performing a Time Course Analysis for detecting differentially expressed genes in gene expression data.
## S3 method for class 'TC' print(x, ...) ## S3 method for class 'TC' summary(object, ...)
## S3 method for class 'TC' print(x, ...) ## S3 method for class 'TC' summary(object, ...)
x |
an object of class ' |
object |
an object of class ' |
... |
further arguments passed to or from other methods. |
With print
, at each time point, if the desired FDR level
was achieved (i.e. x$astar <= x$alpha
), the results are
printed for the differentially expressed genes and 10 more rows
only. If the desired FDR level was not achieved, only ten rows
are displayed.
summary
prints a more concise version of the results.
## Time course analysis for 500 genes with 10 treatment ## replicates and 10 control replicates tPts <- c("h0", "12h", "24h") n <- 500; p <- 20; p1 <- 10 Z <- vector("list", 3) des <- vector("list", 3) for(tp in 1:3){ des[[tp]] <- c(rep(1, p1), rep(2, (p-p1))) } mu <- as.matrix(rexp(n, rate=1)) ### h0 time point (no diff. expr.) Z[[1]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### h12 time point (diff. expr. begins) Z[[2]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### Up regulated genes Z[[2]][1:5,1:p1] <- Z[[2]][1:5,1:p1] + matrix(runif(5*p1, 1, 3), nrow=5) #### Down regulated genes Z[[2]][6:15,(p1+1):p] <- Z[[2]][6:15,(p1+1):p] + matrix(runif(10*(p-p1), 1, 2), nrow=10) ### h24 time point (maximum differential expression) Z[[3]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### 5 up regulated genes Z[[3]][1:5,1:p1] <- Z[[3]][1:5,1:p1] + 5 #### 10 down regulated genes Z[[3]][6:15,(p1+1):p] <- Z[[3]][6:15,(p1+1):p] + 4 resTC <- tc(Z, des) resTC summary(resTC) plot(resTC) ## Not run: ## Phytophthora Infestans Time Course Analysis (takes time...) dataPI <- phytophthora desPI <- vector("list", 4) for(tp in 1:4){ desPI[[tp]] <- c(rep(1, 8), rep(2, 8)) } resPI <- tc(dataPI, desPI) resPI summary(resPI) plot(resPI) ## End(Not run)
## Time course analysis for 500 genes with 10 treatment ## replicates and 10 control replicates tPts <- c("h0", "12h", "24h") n <- 500; p <- 20; p1 <- 10 Z <- vector("list", 3) des <- vector("list", 3) for(tp in 1:3){ des[[tp]] <- c(rep(1, p1), rep(2, (p-p1))) } mu <- as.matrix(rexp(n, rate=1)) ### h0 time point (no diff. expr.) Z[[1]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### h12 time point (diff. expr. begins) Z[[2]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### Up regulated genes Z[[2]][1:5,1:p1] <- Z[[2]][1:5,1:p1] + matrix(runif(5*p1, 1, 3), nrow=5) #### Down regulated genes Z[[2]][6:15,(p1+1):p] <- Z[[2]][6:15,(p1+1):p] + matrix(runif(10*(p-p1), 1, 2), nrow=10) ### h24 time point (maximum differential expression) Z[[3]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### 5 up regulated genes Z[[3]][1:5,1:p1] <- Z[[3]][1:5,1:p1] + 5 #### 10 down regulated genes Z[[3]][6:15,(p1+1):p] <- Z[[3]][6:15,(p1+1):p] + 4 resTC <- tc(Z, des) resTC summary(resTC) plot(resTC) ## Not run: ## Phytophthora Infestans Time Course Analysis (takes time...) dataPI <- phytophthora desPI <- vector("list", 4) for(tp in 1:4){ desPI[[tp]] <- c(rep(1, 8), rep(2, 8)) } resPI <- tc(dataPI, desPI) resPI summary(resPI) plot(resPI) ## End(Not run)
For internal use in function stp
. Computes the genes' Q-Values
in the Single Time Point Analysis according to Algorithm 3
in the vignette.
qval(Q, psi2)
qval(Q, psi2)
Q |
vector with the estimated FDRs when the threshold values
used are |
psi2 |
vector with the second artificial component as returned by |
returns a vector with the computed Q-Values for each gene in the experiment.
Juan Pablo Acosta ([email protected]).
Acosta, J. P. (2015) Strategy for Multivariate Identification of Differentially Expressed Genes in Microarray Data. Unpublished MS thesis. Universidad Nacional de Colombia, Bogot\'a.
Storey, J. D. (2002) A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3): 479–498.
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 res <- fdr(Z, des) qValues <- qval(res$Q, ac2(Z, des)) plot(res$th, res$Q, type="l", col="blue") lines(res$th, qValues[order(abs(ac2(Z, des)))], col="green") legend(x="topright", legend=c("FDR", "Q Values"), lty=c(1,1), col=c("blue", "green"))
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 res <- fdr(Z, des) qValues <- qval(res$Q, ac2(Z, des)) plot(res$th, res$Q, type="l", col="blue") lines(res$th, qValues[order(abs(ac2(Z, des)))], col="green") legend(x="topright", legend=c("FDR", "Q Values"), lty=c(1,1), col=c("blue", "green"))
Performs the Single Time Point Analysis for detecting differentially expressed genes following Acosta (2015).
stp(Z, design, alpha = 0.05, B = 100, lambda = 0.5, th = NULL, PER = FALSE, BCa = FALSE, gamma = 0.95, R = 1000, ...)
stp(Z, design, alpha = 0.05, B = 100, lambda = 0.5, th = NULL, PER = FALSE, BCa = FALSE, gamma = 0.95, R = 1000, ...)
Z |
a matrix or data.frame representing genes' expression levels. The
rows of |
design |
a vector of length equal to the number of columns in |
alpha |
between 0 and 1. Desired level for controlling the false discovery rate (FDR). |
B |
Number of bootstrap or permutation replications for estimating the FDR. |
lambda |
Parameter for the estimation of |
th |
Threshold values for estimating the FDR. If |
PER |
If |
BCa |
If |
gamma |
Confidence level for the FDR's BCa confidence upper bound. |
R |
Number of bootstrap replications for the computation of the FDR's BCa confidence upper bound. |
... |
additional arguments for parallel computation in |
For details on the computations performed in this function, see Acosta (2015).
Additional parameters in the '...
' argument are used for
parallel computation in bootstrap calculations. These are supplied
to calls to the boot
function in package boot
. With
this in mind, the use of additional arguments must be restricted to
arguments parallel
and ncpus
from function boot
.
stp
returns an object of class 'STP
', which is a
list with components:
dgenes |
factor with the classification of each gene in |
tstar |
Threshold value used to identify differentially expressed genes. |
astar |
Achieved FDR level. |
Q |
Estimations of the FDR using each value in |
th |
Threshold values used for estimating the FDR. |
qvalues |
Estimated Q-Values for the genes in the analysis. If argument
|
pi0 |
Estimation of |
B |
Number of bootstrap or permutation replications used for estimating the FDR. |
lambda |
Parameter used for the estimation of |
ac |
Artificial components of |
gNames |
Gene names (by default the row names in |
iRatio |
Inertia ratio |
bca |
BCa upper confidence bounds for the FDR using each value in |
gamma |
Confidence level used in the computation of the BCa upper bounds. |
alpha |
Desired FDR level. |
call |
The matched call. |
If argument BCa=TRUE
, computations may take a considerable
amount of time.
Juan Pablo Acosta ([email protected]).
Acosta, J. P. (2015) Strategy for Multivariate Identification of Differentially Expressed Genes in Microarray Data. Unpublished MS thesis. Universidad Nacional de Colombia, Bogot\'a.
Storey, J. D. (2002) A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3): 479–498.
Efron B. and Tibshirani R. J. (1994) An Introduction to the Bootstrap. Chapman & Hall/CRC, 1993.
tc
for Time Course Analysis; plot.STP
,
print.STP
.
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 resSTP <- stp(Z, des) resSTP plot(resSTP) ## Not run: ## Phytophthora Infestans Single Time Point Analysis (takes time...) dataPI <- phytophthora[[4]] desPI <- c(rep(1,8), rep(2,8)) resPI <- stp(dataPI, desPI) resPI plot(resPI, tp="60 hai") ## End(Not run)
## Single time point analysis for 500 genes with 10 treatment ## replicates and 10 control replicates n <- 500; p <- 20; p1 <- 10 des <- c(rep(1, p1), rep(2, (p-p1))) mu <- as.matrix(rexp(n, rate=1)) Z <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### 5 up regulated genes Z[1:5,1:p1] <- Z[1:5,1:p1] + 5 ### 10 down regulated genes Z[6:15,(p1+1):p] <- Z[6:15,(p1+1):p] + 4 resSTP <- stp(Z, des) resSTP plot(resSTP) ## Not run: ## Phytophthora Infestans Single Time Point Analysis (takes time...) dataPI <- phytophthora[[4]] desPI <- c(rep(1,8), rep(2,8)) resPI <- stp(dataPI, desPI) resPI plot(resPI, tp="60 hai") ## End(Not run)
Performs the Time Course Analysis from Acosta (2015) for detecting differentially expressed genes in time course experiments for gene expression data.
tc(data, designs, tPoints = NULL, method = c("active vs complementary", "groups conformation"), activeTP = NULL, alpha = 0.05, B = 100, lambda = 0.5, PER = FALSE, BCa = FALSE, gamma = 0.95, R = 1000, ...)
tc(data, designs, tPoints = NULL, method = c("active vs complementary", "groups conformation"), activeTP = NULL, alpha = 0.05, B = 100, lambda = 0.5, PER = FALSE, BCa = FALSE, gamma = 0.95, R = 1000, ...)
data |
a list with matrices or data.frames representing genes' expression levels at each time point. The rows of each matrix correspond to the genes in the experiment, and the columns correspond to the replicates. Treatment replicates are to the left, control replicates to the right. |
designs |
a list with a vector for each time point, of length equal to the number
of columns in the respective matrix or data.frame in |
tPoints |
a character vector with the names of the timepoints. |
method |
if "active vs complementary", an analysis following the active vs complementary time points approach (Acosta, 2015) is performed. If "groups conformation", an analysis following the groups conformation through time approach (Acosta, 2015) is performed. The default is both. |
activeTP |
numeric. The index of the active timepoint in |
alpha |
between 0 and 1. Desired level for controlling the false discovery rate (FDR). |
B |
Number of bootstrap or permutation replications for estimating the FDR. |
lambda |
Parameter for the estimation of |
PER |
If |
BCa |
If TRUE, a BCa confidence upper bound for the FDR is computed (see Efron and Tibshirani, 1994). |
gamma |
Confidence level for the FDR's BCa upper confidence bound. |
R |
Number of bootstrap replications for the computation of the FDR's BCa upper confidence bound. |
... |
additional arguments for parallel computation in |
In the active vs complementary time points approach, the
time point that maximizes the inertia ratio is selected as the
active time point. Then, a Single Time Point Analysis (stp
)
is performed on this time point and plots of the behavior throughout the
time course of the differentially expressed genes identified in this time
point are displayed.
In the groups conformation through time approach, a Single Time
Point Analysis (stp
) is performed at each time point and plots
are displayed showing the behaviour of the differential expression process
throughout the time course; that is, how many genes are differentially
expressed and how strong is the differential expression at each time point.
For details on the computations performed in this function, see Acosta (2015).
Additional parameters in the '...
' argument are used for parallel
computation in bootstrap calculations. These are supplied to calls to the
boot
function in package boot
. With this in mind, the use
of additional arguments must be restricted to arguments parallel
and ncpus
from function boot
.
tc
returns an object of class 'TC
', which is a
list with components:
iRatios |
inertia ratios for each time point. |
gct |
results for the groups conformation through time approach.
A list with an object of class ' |
act |
results for the active vs complementary time points approach.
A list with an object of class ' |
activeTP |
the index of the active timepoint in |
tPoints |
a character vector with the names of the timepoints. |
call |
The matched call. |
If argument BCa=TRUE
, computations may take a considerable
amount of time.
Juan Pablo Acosta ([email protected]).
Acosta, J. P. (2015) Strategy for Multivariate Identification of Differentially Expressed Genes in Microarray Data. Unpublished MS thesis. Universidad Nacional de Colombia, Bogot\'a.
Storey, J. D. (2002) A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3): 479–498.
Efron B. and Tibshirani R. J. (1994) An Introduction to the Bootstrap. Chapman & Hall/CRC, 1993.
stp
for Single Time Point Analysis;
plot.TC
, print.TC
, summary.TC
.
## Time course analysis for 500 genes with 10 treatment ## replicates and 10 control replicates tPts <- c("h0", "12h", "24h") n <- 500; p <- 20; p1 <- 10 Z <- vector("list", 3) des <- vector("list", 3) for(tp in 1:3){ des[[tp]] <- c(rep(1, p1), rep(2, (p-p1))) } mu <- as.matrix(rexp(n, rate=1)) ### h0 time point (no diff. expr.) Z[[1]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### h12 time point (diff. expr. begins) Z[[2]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### Up regulated genes Z[[2]][1:5,1:p1] <- Z[[2]][1:5,1:p1] + matrix(runif(5*p1, 1, 3), nrow=5) #### Down regulated genes Z[[2]][6:15,(p1+1):p] <- Z[[2]][6:15,(p1+1):p] + matrix(runif(10*(p-p1), 1, 2), nrow=10) ### h24 time point (maximum differential expression) Z[[3]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### 5 up regulated genes Z[[3]][1:5,1:p1] <- Z[[3]][1:5,1:p1] + 5 #### 10 down regulated genes Z[[3]][6:15,(p1+1):p] <- Z[[3]][6:15,(p1+1):p] + 4 resTC <- tc(Z, des) resTC summary(resTC) plot(resTC) ## Not run: ## Phytophthora Infestans Time Course Analysis (takes time...) dataPI <- phytophthora desPI <- vector("list", 4) for(tp in 1:4){ desPI[[tp]] <- c(rep(1, 8), rep(2, 8)) } resPI <- tc(dataPI, desPI) resPI summary(resPI) plot(resPI) ## End(Not run)
## Time course analysis for 500 genes with 10 treatment ## replicates and 10 control replicates tPts <- c("h0", "12h", "24h") n <- 500; p <- 20; p1 <- 10 Z <- vector("list", 3) des <- vector("list", 3) for(tp in 1:3){ des[[tp]] <- c(rep(1, p1), rep(2, (p-p1))) } mu <- as.matrix(rexp(n, rate=1)) ### h0 time point (no diff. expr.) Z[[1]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) ### h12 time point (diff. expr. begins) Z[[2]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### Up regulated genes Z[[2]][1:5,1:p1] <- Z[[2]][1:5,1:p1] + matrix(runif(5*p1, 1, 3), nrow=5) #### Down regulated genes Z[[2]][6:15,(p1+1):p] <- Z[[2]][6:15,(p1+1):p] + matrix(runif(10*(p-p1), 1, 2), nrow=10) ### h24 time point (maximum differential expression) Z[[3]] <- t(apply(mu, 1, function(mui) rnorm(p, mean=mui, sd=1))) #### 5 up regulated genes Z[[3]][1:5,1:p1] <- Z[[3]][1:5,1:p1] + 5 #### 10 down regulated genes Z[[3]][6:15,(p1+1):p] <- Z[[3]][6:15,(p1+1):p] + 4 resTC <- tc(Z, des) resTC summary(resTC) plot(resTC) ## Not run: ## Phytophthora Infestans Time Course Analysis (takes time...) dataPI <- phytophthora desPI <- vector("list", 4) for(tp in 1:4){ desPI[[tp]] <- c(rep(1, 8), rep(2, 8)) } resPI <- tc(dataPI, desPI) resPI summary(resPI) plot(resPI) ## End(Not run)