Title: | Laplace Mixture Model in Microarray Experiments |
---|---|
Description: | Laplace mixture modelling of microarray experiments. A hierarchical Bayesian approach is used, and the hyperparameters are estimated using empirical Bayes. The main purpose is to identify differentially expressed genes. |
Authors: | Yann Ruffieux, contributions from Debjani Bhowmick, Anthony C. Davison, and Darlene R. Goldstein |
Maintainer: | Yann Ruffieux <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.73.0 |
Built: | 2024-10-30 08:34:16 UTC |
Source: | https://github.com/bioc/lapmix |
Creates a volcano plot of log-fold changes versus log-odds of differential expression under the Laplace mixture model.
lap.volcanoplot(res, highlight=0, ...)
res |
the output from the |
highlight |
number of genes to be highlighted; genes are highlighted in descending order of their posterior odds |
... |
additional arguments given to the plot function |
A volcano plot is any plot which displays fold changes versus a measure of statistical significance of the change.
A plot is created on the current graphics device.
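As a rough illustration of the idea, the sketch below draws a volcano plot with base R graphics only. It does not call lap.volcanoplot: the vectors M and lods are simulated placeholders standing in for average log-fold changes and posterior log-odds, and the highlighting rule (the top `highlight` genes by log-odds) is an assumption based on the argument description above.

```r
# Hypothetical stand-ins for per-gene summaries; NOT output of lapmix.Fit.
set.seed(1)
G    <- 500
M    <- rnorm(G)                               # simulated average log-fold changes
lods <- 2 * abs(M) - 3 + rnorm(G, sd = 0.3)    # simulated log-odds, larger for big |M|

# Indices of the genes to highlight, highest log-odds first.
highlight <- 10
top <- order(lods, decreasing = TRUE)[seq_len(highlight)]

# Fold change on the x-axis, evidence for differential expression on the y-axis.
plot(M, lods, pch = 20, col = "grey40",
     xlab = "Average log-fold change",
     ylab = "Log-odds of differential expression")
points(M[top], lods[top], pch = 20, col = "red")  # highlighted genes
```

The characteristic "volcano" shape appears because genes with large absolute fold changes tend to have the strongest evidence of differential expression.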
Yann Ruffieux
# See lapmix.Fit example
Computes posterior odds of differential expression under the Laplace mixture model, with parameters estimated using an empirical Bayes approach.
lapmix.Fit(Y, asym=FALSE, fast=TRUE, two.step=TRUE, w=0.1, V=10, beta=0, gamma=1, alpha=0.1)
Y |
data.frame or matrix containing the log relative expression levels, where each row represents a gene. Alternatively, a list of arrays of possibly different sizes, or an object of class |
asym |
indicates whether the asymmetric Laplace model is used rather than the symmetric one. |
fast |
indicates whether the 'fast' estimation method is used. |
two.step |
indicates whether the two-step estimation method is used; otherwise the marginal likelihood is maximised in one step. |
w, V, beta, gamma, alpha |
initial values given to the optimization algorithms for the estimation of the hyperparameters. |
This method fits a Laplace mixture model to the results of a microarray experiment. These results are assumed to take the form of normalized base-2 logarithms of the expression ratios. An empirical Bayes approach is used to estimate the hyperparameters of the model. The lap.lods statistic is sometimes known as the L-statistic (if the symmetric model is used) or the AL-statistic (if the asymmetric model is used). These statistics can be used to rank the genes according to the posterior odds of differential expression, via the routine laptopTable. They can be visualized using the lap.volcanoplot function.
If the number of replicates differs between genes, the data can be supplied as a list of arrays. If a matrix representation is preferred, the missing replicates can be filled with NaN values.
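The NaN-padding idea can be sketched in base R as follows. The list layout used here (one numeric vector of replicates per gene) and the gene names are assumptions made for illustration; they are not necessarily the list format that lapmix.Fit itself accepts.

```r
# Hypothetical ragged data: each list element holds one gene's replicates.
reps <- list(geneA = c(0.4, 0.1, -0.2, 0.3),
             geneB = c(1.8, 2.1),
             geneC = c(-0.9, -1.2, -1.0))

n_max <- max(lengths(reps))   # longest replicate count across genes

# Pad each gene's vector with NaN up to n_max, then stack the padded
# vectors as rows so that each row represents one gene.
Y <- t(vapply(reps,
              function(x) c(x, rep(NaN, n_max - length(x))),
              numeric(n_max)))

Y
rowSums(!is.nan(Y))   # replicates actually observed per gene
```

The resulting matrix has one row per gene and as many columns as the largest replicate count, with NaN marking the positions where no measurement exists.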
The ‘fast’ estimation method ignores the integrals that cannot be computed with the t-distribution function. This method is recommended: these problematic integrals are rare, so the estimates are barely affected, and it avoids the potential problems that arise when integrating numerically with the integrate function.
lap.lods |
numeric vector containing the posterior log-odds of differential expression |
prob |
numeric vector containing the posterior probabilities of differential expression |
med.number |
number of differentially expressed genes according to the median rule |
M |
numeric vector with average log fold changes within genes |
s_sq |
numeric vector with sample variances within genes |
nb.rep |
numeric vector with number of replicates within genes |
estimates |
list containing the empirical Bayes estimates of the hyperparameters |
code |
integer indicating why the likelihood optimization terminated, cf. |
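To make the per-gene summaries in this list concrete, here is a minimal base R sketch computing analogues of M, s_sq and nb.rep from a NaN-padded matrix. It does not reproduce the internals of lapmix.Fit. Two points are assumptions rather than documented behaviour: that the log-odds are natural logarithms (so that probabilities follow from the logistic transform), and that the median rule counts genes whose posterior probability exceeds 0.5. The log-odds values are made up.

```r
# Hypothetical NaN-padded matrix: rows are genes, columns are replicates.
Y <- rbind(c( 0.4,  0.1, -0.2, 0.3),
           c( 1.8,  2.1,  NaN, NaN),
           c(-0.9, -1.2, -1.0, NaN))

nb.rep <- rowSums(!is.nan(Y))              # replicates observed per gene
M      <- rowMeans(Y, na.rm = TRUE)        # average log-fold change per gene
s_sq   <- apply(Y, 1, var, na.rm = TRUE)   # sample variance per gene

# ASSUMPTION: log-odds are natural logs, so prob = odds / (1 + odds).
lods <- c(-2.0, 1.5, 0.2)                  # made-up log-odds, one per gene
prob <- plogis(lods)                       # exp(lods) / (1 + exp(lods))

# ASSUMPTION: the median rule counts genes with probability above 0.5.
med.number <- sum(prob > 0.5)
```

Under these assumptions a gene is counted by the median rule exactly when its log-odds are positive.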
Yann Ruffieux
Bhowmick, D., Davison, A.C., and Goldstein, D.R. (2006). A Laplace mixture model for identification of differential expression in microarray experiments.
# Simulate gene expression data under Laplace mixture model: 3000 genes with
# 4 duplicates each; one gene in ten is differentially expressed.
G <- 3000
Y <- NULL
sigma_sq <- 1/rgamma(G, shape=2.8, scale=0.04)
mu <- rexp(G, rate=1/(sigma_sq*1.2))-rexp(G, rate=1/(sigma_sq*1.2))
is.diff <- sample(c(0,1), replace=TRUE, prob=c(0.9,0.1), size=G)
mu <- mu*is.diff
for(g in 1:G) Y <- rbind(Y, rnorm(4,mu[g], sd=sqrt(sigma_sq[g])))

# with symmetric model
res <- lapmix.Fit(Y)
res$estimates
laptopTable(res, 20)
lap.volcanoplot(res, highlight=res$med.number)

# with asymmetric model
res2 <- lapmix.Fit(Y, asym=TRUE)
res2$estimates
laptopTable(res2, 20)
lap.volcanoplot(res2, highlight=res2$med.number)
Extract a table of the top-ranked genes from a Laplace mixture model fit.
laptopTable(res, number=res$med.number, sort.by='L')
res |
the output from the |
number |
number of genes to select; if missing, the number is determined by the median rule |
sort.by |
character string specifying statistic to sort the selected genes by in the output data.frame. |
This function summarizes a Laplace mixture model fit object produced by lapmix.Fit by selecting the top-ranked genes according to the posterior log-odds or M-values.
The sort.by argument specifies the criterion used to select the top genes. Two choices are currently available: "M" to sort by the (absolute) coefficient representing the log-fold-change, and "L" to sort by the posterior odds of differential expression under the Laplace mixture model.
A data frame with one row for each of the top number genes and the following columns:
M |
average log fold change |
log.odds |
log posterior odds that the gene is differentially expressed |
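The two ranking criteria can be sketched in base R as below. This is not the laptopTable implementation: the data frame is hand-made, the gene names are hypothetical, and the sketch simply ranks the whole table by the chosen statistic. Whether laptopTable first selects genes and then sorts them is not specified here.

```r
# Hypothetical per-gene summaries; column names follow the table above.
tab <- data.frame(M        = c(0.15, 1.95, -1.03, -2.40, 0.05),
                  log.odds = c(-2.0, 1.5, 0.2, 0.9, -3.1),
                  row.names = paste0("gene", 1:5))
number <- 3

# Analogue of sort.by = "L": rank by log-odds, largest first.
top_L <- tab[order(tab$log.odds, decreasing = TRUE)[seq_len(number)], ]

# Analogue of sort.by = "M": rank by absolute log-fold change, largest first.
top_M <- tab[order(abs(tab$M), decreasing = TRUE)[seq_len(number)], ]

top_L
top_M
```

The two rankings can disagree: a gene with a large fold change but noisy replicates may rank high by "M" and lower by "L".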
Yann Ruffieux
# See lapmix.Fit example