Package 'HEM' reference manual

Title:	Heterogeneous error model for identification of differentially expressed genes under multiple conditions
Description:	This package fits heterogeneous error models for analysis of microarray data
Authors:	HyungJun Cho <[email protected]> and Jae K. Lee <[email protected]>
Maintainer:	HyungJun Cho <[email protected]>
License:	GPL (>= 2)
Version:	1.79.0
Built:	2025-03-10 03:48:27 UTC
Source:	https://github.com/bioc/HEM

AM transformation for LPE

Description

Computes AM for LPE

Author(s)

HyungJun Cho and Jae K. Lee

AM transformation for LPE

Description

Computes AM for LPE

Author(s)

HyungJun Cho and Jae K. Lee

Baseline ASE estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline PSE estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error bootstrap estimation for oligonucleotide arrays

Description

Estimates baseline error using bootstrap samples for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error bootstrap estimation for oligonucleotide arrays

Description

Estimates baseline error using bootstrap samples for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error bootstrap estimation for oligonucleotide arrays

Description

Estimates baseline error using bootstrap samples for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Prediction using smoothing spine

Description

Makes predictions using smoothing spine

Author(s)

HyungJun Cho and Jae K. Lee

Heterogeneous Error Model for Identification of Differential Expressed Genes Under Multiple Conditions

Description

Fits an error model with heterogeneous experimental and biological variances.

Usage

hem(dat, probe.ID=NULL, n.layer, design, burn.ins=1000, n.samples=3000,
    method.var.e="gam", method.var.b="gam", method.var.t="gam",           
    var.e=NULL, var.b=NULL, var.t=NULL, var.g=1, var.c=1, var.r=1,
    alpha.e=3, beta.e=.1, alpha.b=3, beta.b=.1, alpha.t=3, beta.t=.2,
    n.digits=10, print.message.on.screen=TRUE)
hem(dat, probe.ID=NULL, n.layer, design, burn.ins=1000, n.samples=3000,
    method.var.e="gam", method.var.b="gam", method.var.t="gam",           
    var.e=NULL, var.b=NULL, var.t=NULL, var.g=1, var.c=1, var.r=1,
    alpha.e=3, beta.e=.1, alpha.b=3, beta.b=.1, alpha.t=3, beta.t=.2,
    n.digits=10, print.message.on.screen=TRUE)

Arguments

`dat`	data
`probe.ID`	a vector of probe set IDs
`n.layer`	number of layers; 1=one-layer EM, 2=two-layer EM
`design`	design matrix
`burn.ins`	number of burn-ins for MCMC
`n.samples`	number of samples for MCMC
`method.var.e`	prior specification method for experimental variance; "gam"=Gamma(alpha,beta), "peb"=parametric EB prior specification, "neb"=nonparametric EB prior specification
`method.var.b`	prior specification method for biological variance; "gam"=Gamma(alpha,beta), "peb"=parametric EB prior specification
`method.var.t`	prior specification method for total variance; "gam"=Gamma(alpha,beta), "peb"=parametric EB prior specification, "neb"=nonparametric EB prior specification
`var.e`	prior estimate matrix for experimental variance
`var.b`	prior estimate matrix for biological variance
`var.t`	prior estimate matrix for total variance
`var.g`	N(0, var.g); prior parameter for gene effect
`var.c`	N(0, var.c); prior parameter for condition effect
`var.r`	N(0, var.r); prior parameter for interaction effect of gene and condition
`alpha.e`, `beta.e`	Gamma(alpha.e,alpha.e); prior parameters for inverse of experimental variance
`alpha.b`, `beta.b`	Gamma(alpha.b,alpha.b); prior parameters for inverse of biological variance
`alpha.t`, `beta.t`	Gamma(alpha.b,alpha.b); prior parameters for inverse of total variance
`n.digits`	number of digits
`print.message.on.screen`	if TRUE, process status is shown on screen.

Value

`n.gene`	numer of genes
`n.chip`	number of chips
`n.cond`	number of conditions
`design`	design matrix
`burn.ins`	number of burn-ins for MCMC
`n.samples`	number of samples for MCMC
`priors`	prior parameters
`m.mu`	estimated mean expression intensity for each gene under each condition
`m.x`	estimated unobserved expression intensity for each combination of genes, conditions, and individuals (n.layer=2)
`m.var.b`	estimated biological variances (n.layer=2)
`m.var.e`	estimated experiemental variances (n.layer=2)
`m.var.t`	estimated total variances (n.layer=1)
`H`	H-scores

Author(s)

HyungJun Cho and Jae K. Lee

References

Cho, H. and Lee, J.K. (2004) Bayesian Hierarchical Error Model for Analysis of Gene Expression Data, Bioinformatics, 20: 2016-2025.

Examples


#Example 1: Two-layer HEM

data(pbrain)

##construct a design matrix
cond <- c(1,1,1,1,1,1,2,2,2,2,2,2) #condition
ind  <- c(1,1,2,2,3,3,1,1,2,2,3,3) #biological replicate
rep  <- c(1,2,1,2,1,2,1,2,1,2,1,2) #experimental replicate
design <- data.frame(cond,ind,rep)

##normalization
pbrain.nor <- hem.preproc(pbrain[,2:13])

##fit HEM with two layers of error
##using the small numbers of burn-ins and MCMC samples for a testing purpose;
##but increase the numbers for a practical purpose 
#pbrain.hem <- hem(pbrain.nor, n.layer=2, design=design, 
#                  burn.ins=10, n.samples=30)

##print H-scores
#pbrain.hem$H 


#Example 2: One-layer HEM

data(mubcp)

##construct a design matrix
cond <- c(rep(1,6),rep(2,5),rep(3,5),rep(4,5),rep(5,5))
ind  <- c(1:6,rep((1:5),4))
design <- data.frame(cond,ind)

##construct a design matrix
mubcp.nor <- hem.preproc(mubcp)

#fit HEM with one layers of error
#using the small numbers of burn-ins and MCMC samples for a testing purpose;
#but increase the numbers for a practical purpose 
#mubcp.hem <- hem(mubcp.nor, n.layer=1,design=design, burn.ins=10, n.samples=30)

##print H-scores
#mubcp.hem$H


###NOTE: Use 'hem.fdr' for FDR evaluation
###NOTE: Use 'hem.eb.prior' for Empirical Bayes (EB) prior sepecification
###NOTE: Use EB-HEM ('hem' after 'hem.eb.prior') for small data sets

#Example 1: Two-layer HEM

data(pbrain)

##construct a design matrix
cond <- c(1,1,1,1,1,1,2,2,2,2,2,2) #condition
ind  <- c(1,1,2,2,3,3,1,1,2,2,3,3) #biological replicate
rep  <- c(1,2,1,2,1,2,1,2,1,2,1,2) #experimental replicate
design <- data.frame(cond,ind,rep)

##normalization
pbrain.nor <- hem.preproc(pbrain[,2:13])

##fit HEM with two layers of error
##using the small numbers of burn-ins and MCMC samples for a testing purpose;
##but increase the numbers for a practical purpose 
#pbrain.hem <- hem(pbrain.nor, n.layer=2, design=design, 
#                  burn.ins=10, n.samples=30)

##print H-scores
#pbrain.hem$H 


#Example 2: One-layer HEM

data(mubcp)

##construct a design matrix
cond <- c(rep(1,6),rep(2,5),rep(3,5),rep(4,5),rep(5,5))
ind  <- c(1:6,rep((1:5),4))
design <- data.frame(cond,ind)

##construct a design matrix
mubcp.nor <- hem.preproc(mubcp)

#fit HEM with one layers of error
#using the small numbers of burn-ins and MCMC samples for a testing purpose;
#but increase the numbers for a practical purpose 
#mubcp.hem <- hem(mubcp.nor, n.layer=1,design=design, burn.ins=10, n.samples=30)

##print H-scores
#mubcp.hem$H


###NOTE: Use 'hem.fdr' for FDR evaluation
###NOTE: Use 'hem.eb.prior' for Empirical Bayes (EB) prior sepecification
###NOTE: Use EB-HEM ('hem' after 'hem.eb.prior') for small data sets

Empirical Bayes (EB) Prior Specification

Description

Estimates experimental and biological variances by LPE and resampling

Usage

hem.eb.prior(dat,  n.layer, design, 
             method.var.e="neb", method.var.b="peb", method.var.t="neb", 
             rep=TRUE, baseline.var="LPE", p.remove=0, max.chip=4,
             q=0.01, B=25, n.digits=10, print.message.on.screen=TRUE)
hem.eb.prior(dat,  n.layer, design, 
             method.var.e="neb", method.var.b="peb", method.var.t="neb", 
             rep=TRUE, baseline.var="LPE", p.remove=0, max.chip=4,
             q=0.01, B=25, n.digits=10, print.message.on.screen=TRUE)

Arguments

`dat`	data
`n.layer`	number of layers
`design`	design matrix
`method.var.e`	prior specification method for experimental variance; "peb"=parametric EB prior specification, "neb"=nonparametric EB prior specification
`method.var.b`	prior specification method for biological variance; "peb"=parametric EB prior specification
`method.var.t`	prior specification method for total variance; "peb"=parametric EB prior specification, "neb"=nonparametric EB prior specification
`rep`	no replication if FALSE
`baseline.var`	baseline variance estimation method: LPE for replicated data and BLPE, PSE, or ASE for unreplicated data
`p.remove`	percent of removed rank-variance genes for BLPE
`max.chip`	maximum number of chips to estimate errors
`q`	quantile for paritioning genes based on expression levels
`B`	number of iterations for resampling
`n.digits`	number of digits
`print.message.on.screen`	if TRUE, process status is shown on screen.

Value

`var.b`	prior estimate matrix for biological variances (n.layer=2)
`var.e`	prior estimate matrix for experiemtnal variances (n.layer=2)
`var.t`	prior estimate matrix for total variances (n.layer=1)

Author(s)

HyungJun Cho and Jae K. Lee

Examples


#Example 1: Two-layer HEM with EB prior specification

data(pbrain)

##construct a design matrix
cond <- c(1,1,1,1,1,1,2,2,2,2,2,2)
ind  <- c(1,1,2,2,3,3,1,1,2,2,3,3)
rep  <- c(1,2,1,2,1,2,1,2,1,2,1,2)
design <- data.frame(cond,ind,rep)

##normalization
pbrain.nor <- hem.preproc(pbrain[,2:13])

##take a subset for a testing purpose;
##use all genes for a practical purpose
pbrain.nor <- pbrain.nor[1:1000,]

##estimate hyperparameters of variances by LPE
#pbrain.eb  <- hem.eb.prior(pbrain.nor, n.layer=2,  design=design,
#                           method.var.e="neb", method.var.b="peb")     

#fit HEM with two layers of error
#using the small numbers of burn-ins and MCMC samples for a testing purpose;
#but increase the numbers for a practical purpose 
#pbrain.hem <- hem(pbrain.nor, n.layer=2,  design=design,burn.ins=10, n.samples=30, 
#              method.var.e="neb", method.var.b="peb", 
#              var.e=pbrain.eb$var.e, var.b=pbrain.eb$var.b)


#Example 2: One-layer HEM with EB prior specification

data(mubcp)

##construct a design matrix
cond <- c(rep(1,6),rep(2,5),rep(3,5),rep(4,5),rep(5,5))
ind  <- c(1:6,rep((1:5),4))
design <- data.frame(cond,ind)

##normalization
mubcp.nor <- hem.preproc(mubcp)

##take a subset for a testing purpose;
##use all genes for a practical purpose
mubcp.nor <- mubcp.nor[1:1000,] 

##estimate hyperparameters of variances by LPE
#mubcp.eb  <- hem.eb.prior(mubcp.nor, n.layer=1, design=design,
#             method.var.t="neb")                                

#fit HEM with two layers of error
#using the small numbers of burn-ins and MCMC samples for a testing purpose;
#but increase the numbers for a practical purpose 
#mubcp.hem <- hem(mubcp.nor, n.layer=1, design=design,  burn.ins=10, n.samples=30, 
#             method.var.t="neb", var.t=mubcp.eb$var.t)

#Example 1: Two-layer HEM with EB prior specification

data(pbrain)

##construct a design matrix
cond <- c(1,1,1,1,1,1,2,2,2,2,2,2)
ind  <- c(1,1,2,2,3,3,1,1,2,2,3,3)
rep  <- c(1,2,1,2,1,2,1,2,1,2,1,2)
design <- data.frame(cond,ind,rep)

##normalization
pbrain.nor <- hem.preproc(pbrain[,2:13])

##take a subset for a testing purpose;
##use all genes for a practical purpose
pbrain.nor <- pbrain.nor[1:1000,]

##estimate hyperparameters of variances by LPE
#pbrain.eb  <- hem.eb.prior(pbrain.nor, n.layer=2,  design=design,
#                           method.var.e="neb", method.var.b="peb")     

#fit HEM with two layers of error
#using the small numbers of burn-ins and MCMC samples for a testing purpose;
#but increase the numbers for a practical purpose 
#pbrain.hem <- hem(pbrain.nor, n.layer=2,  design=design,burn.ins=10, n.samples=30, 
#              method.var.e="neb", method.var.b="peb", 
#              var.e=pbrain.eb$var.e, var.b=pbrain.eb$var.b)


#Example 2: One-layer HEM with EB prior specification

data(mubcp)

##construct a design matrix
cond <- c(rep(1,6),rep(2,5),rep(3,5),rep(4,5),rep(5,5))
ind  <- c(1:6,rep((1:5),4))
design <- data.frame(cond,ind)

##normalization
mubcp.nor <- hem.preproc(mubcp)

##take a subset for a testing purpose;
##use all genes for a practical purpose
mubcp.nor <- mubcp.nor[1:1000,] 

##estimate hyperparameters of variances by LPE
#mubcp.eb  <- hem.eb.prior(mubcp.nor, n.layer=1, design=design,
#             method.var.t="neb")                                

#fit HEM with two layers of error
#using the small numbers of burn-ins and MCMC samples for a testing purpose;
#but increase the numbers for a practical purpose 
#mubcp.hem <- hem(mubcp.nor, n.layer=1, design=design,  burn.ins=10, n.samples=30, 
#             method.var.t="neb", var.t=mubcp.eb$var.t)

FDR Evaluation

Description

Computes resampling-based False Discovery Rate (FDR)

Usage

hem.fdr(dat,  n.layer, design, rep=TRUE, hem.out, eb.out=NULL, n.iter=5, q.trim=0.9,
        target.fdr=c(0.001, 0.005, 0.01, 0.05, 0.1, 0.15, 0.20, 0.30, 0.40, 0.50),
        n.digits=10, print.message.on.screen=TRUE)
hem.fdr(dat,  n.layer, design, rep=TRUE, hem.out, eb.out=NULL, n.iter=5, q.trim=0.9,
        target.fdr=c(0.001, 0.005, 0.01, 0.05, 0.1, 0.15, 0.20, 0.30, 0.40, 0.50),
        n.digits=10, print.message.on.screen=TRUE)

Arguments

`dat`	data
`n.layer`	number of layers: 1=one-layer EM; 2=two-layer EM
`design`	design matrix
`rep`	no replication if FALSE
`hem.out`	output from hem function
`eb.out`	output from hem.eb.prior function
`n.iter`	number of iterations
`q.trim`	quantile used for estimtaing the proportion of true negatives (pi0)
`target.fdr`	Target FDRs
`n.digits`	number of digits
`print.message.on.screen`	if TRUE, process status is shown on screen.

Value

`fdr`	H-values and corresponding FDRs
`pi0`	estimated proportion of true negatives
`H.null`	H-scores from null data
`targets`	given target FDRs, corrsponding critical values and numbers of significant genes are provided

Author(s)

HyungJun Cho and Jae K. Lee

Examples


data(pbrain)

##construct a design matrix
cond <- c(1,1,1,1,1,1,2,2,2,2,2,2)
ind  <- c(1,1,2,2,3,3,1,1,2,2,3,3)
rep  <- c(1,2,1,2,1,2,1,2,1,2,1,2)
design <- data.frame(cond,ind,rep)

##normalization
pbrain.nor <- hem.preproc(pbrain[,2:13])

##take a subset for a testing purpose;
##use all genes for a practical purpose
pbrain.nor <- pbrain.nor[1:1000,]

##estimate hyperparameters of variances by LPE
#pbrain.eb  <- hem.eb.prior(pbrain.nor, n.layer=2,  design=design,
#                           method.var.e="neb", method.var.b="peb")     

##fit HEM with two layers of error
##using the small numbers of burn-ins and MCMC samples for a testing purpose;
##but increase the numbers for a practical purpose 
#pbrain.hem <- hem(pbrain.nor, n.layer=2,  design=design,burn.ins=10, n.samples=30, 
#              method.var.e="neb", method.var.b="peb", 
#              var.e=pbrain.eb$var.e, var.b=pbrain.eb$var.b)

##Estimate FDR based on resampling 
#pbrain.fdr <- hem.fdr(pbrain.nor, n.layer=2,  design=design, 
#                  hem.out=pbrain.hem, eb.out=pbrain.eb)

data(pbrain)

##construct a design matrix
cond <- c(1,1,1,1,1,1,2,2,2,2,2,2)
ind  <- c(1,1,2,2,3,3,1,1,2,2,3,3)
rep  <- c(1,2,1,2,1,2,1,2,1,2,1,2)
design <- data.frame(cond,ind,rep)

##normalization
pbrain.nor <- hem.preproc(pbrain[,2:13])

##take a subset for a testing purpose;
##use all genes for a practical purpose
pbrain.nor <- pbrain.nor[1:1000,]

##estimate hyperparameters of variances by LPE
#pbrain.eb  <- hem.eb.prior(pbrain.nor, n.layer=2,  design=design,
#                           method.var.e="neb", method.var.b="peb")     

##fit HEM with two layers of error
##using the small numbers of burn-ins and MCMC samples for a testing purpose;
##but increase the numbers for a practical purpose 
#pbrain.hem <- hem(pbrain.nor, n.layer=2,  design=design,burn.ins=10, n.samples=30, 
#              method.var.e="neb", method.var.b="peb", 
#              var.e=pbrain.eb$var.e, var.b=pbrain.eb$var.b)

##Estimate FDR based on resampling 
#pbrain.fdr <- hem.fdr(pbrain.nor, n.layer=2,  design=design, 
#                  hem.out=pbrain.hem, eb.out=pbrain.eb)

Generation of null data

Description

Generates null data by resampling

Author(s)

HyungJun Cho and Jae K. Lee

Generation of null data

Description

Generates null data by resampling

Author(s)

HyungJun Cho and Jae K. Lee

Generation of null data

Description

Generates null data by resampling

Author(s)

HyungJun Cho and Jae K. Lee

Preprocessing

Description

Performs IQR normalization, thesholding, and log2-transformation

Usage

hem.preproc(x, data.type = "MAS5")
hem.preproc(x, data.type = "MAS5")

Arguments

`x`	data
`data.type`	data type: MAS5 or MAS4

Author(s)

HyungJun Cho and Jae K. Lee

Examples


data(pbrain)
pbrain.nor <- hem.preproc(pbrain[,2:13])

data(pbrain)
pbrain.nor <- hem.preproc(pbrain[,2:13])

Gene expression data for mouse B cell development

Description

This data set consists of gene expression of the five consecutive stages (pre-B1, large pre-B2, small pre-B2, immature B, and mature B cells) of mouse B cell development. The data were obtained with high-density oligonucleotide arrays, Affymetrix Mu11k GeneChips, from flow-cytometrically purified cells.

Usage

data(mubcp)data(mubcp)

Format

A matrix containing 13,207 probe sets and 26 chips; first 6 chips for pre-B1 cell and next 20 chips for other stages (5 chips for each)

Source

Hoffmann, R., Seidl, T., Neeb, M., Rolink, A. and Melchers, F. (2002). Changes in gene expression profiles in developing B cells of murine bone marrow, Genome Research 12:98-111.

Baseline error nonparametric estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error nonparametric estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error nonparametric estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error parametric estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error parametric estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Baseline error parametric estimation for oligonucleotide arrays

Description

Estimates baseline error for oligonucleotide arrays

Author(s)

HyungJun Cho and Jae K. Lee

Gene expression data for primate brains

Description

This data set consists of gene expression of primate brains (Affymetrix U95A GeneChip). The frozen brains of three humans (H1, H2, H3) and three chimpanzees (C1, C2, C3) were used to take the postmortem tissue samples, and two independent tissue samples for each individual were taken.

Usage

data(pbrain)data(pbrain)

Format

A matrix containing 12,600 probe sets and 12 chips (H1,H1,H2,H2,H3,H3,C1,C1,C2,C2,C3,C3); the first column is probe set ID

Source

Enard, W., Khaitovich, P., Klose, J., Zollner, S., Heissig, F., Giavalisco, P., Nieselt-Struwe, K., Muchmore, E., Varki, A., Ravid, R., Doxiadis, G.M., Bontrop, R.R., and Paabo, S. (2002) Intra- and interspecific variation in primate gene expression patterns, Science 296:340-343

Package 'HEM'

Help Index

AM transformation for LPE

Description

Author(s)

AM transformation for LPE

Description

Author(s)

Baseline ASE estimation for oligonucleotide arrays

Description

Author(s)

Baseline error estimation for oligonucleotide arrays

Description

Author(s)

Baseline error estimation for oligonucleotide arrays

Description

Author(s)

Baseline PSE estimation for oligonucleotide arrays

Description

Author(s)

Baseline error bootstrap estimation for oligonucleotide arrays

Description

Author(s)

Baseline error bootstrap estimation for oligonucleotide arrays

Description

Author(s)

Baseline error bootstrap estimation for oligonucleotide arrays

Description

Author(s)

Prediction using smoothing spine

Description

Author(s)

Heterogeneous Error Model for Identification of Differential Expressed Genes Under Multiple Conditions

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Empirical Bayes (EB) Prior Specification

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

FDR Evaluation

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Generation of null data

Description

Author(s)

Generation of null data

Description

Author(s)

Generation of null data

Description

Author(s)

Preprocessing

Description

Usage

Arguments

Author(s)

See Also

Examples

Gene expression data for mouse B cell development

Description

Usage

Format

Source

Baseline error nonparametric estimation for oligonucleotide arrays

Description