Title: | Background Adjustment Using Sequence Information |
---|---|
Description: | Background adjustment using sequence information |
Authors: | Jean(ZHIJIN) Wu, Rafael Irizarry with contributions from James MacDonald <[email protected]> Jeff Gentry |
Maintainer: | Z. Wu <[email protected]> |
License: | LGPL |
Version: | 2.79.0 |
Built: | 2024-10-30 07:58:06 UTC |
Source: | https://github.com/bioc/gcrma |
Spline coefficients for estimation of affinity from probe sequence
data(affinity.spline.coefs)
An internal function to be used by gcrma.
bg.adjust.fullmodel(pms,mms,ncs=NULL,apm,amm,anc=NULL,index.affinities,k=6 * fast + 0.25 * (1 - fast),rho=.7,fast=FALSE)
bg.adjust.affinities(pms,ncs,apm,anc,index.affinities,k=6 * fast + 0.25 * (1 - fast),fast=FALSE,nomm=FALSE)
pms |
PM intensities after optical background correction, before non-specific-binding correction. |
mms |
MM intensities after optical background correction, before non-specific-binding correction. |
ncs |
Negative control probe intensities after optical background correction, before
non-specific-binding correction. If |
index.affinities |
The index of pms with known sequences. (For some types of arrays the sequences of a small subset of probes are not provided by Affymetrix.) |
apm |
Probe affinities for PM probes with known sequences. |
amm |
Probe affinities for MM probes with known sequences. |
anc |
Probe affinities for Negative control probes with known
sequences. This is ignored when |
rho |
correlation coefficient of log background intensity in a pair of pm/mm probes. Default=.7 |
k |
A tuning parameter. See details. |
fast |
Logical value. If |
nomm |
Logical value indicating whether MM intensities are available and will be used to estimate background. |
Assumes PM = background1 + signal and MM = background2, where (log(background1), log(background2))' follows a bivariate normal distribution and the signal follows a power law. bg.parameters.gcrma and sg.parameters.gcrma provide ad hoc estimates of the parameters.
The original gcrma uses an empirical Bayes estimate, which requires a complicated numerical integration. An ad hoc method tries to imitate the empirical Bayes estimate with PM-B, where values of PM-B<k are set to k. This can be thought of as a shrunken MVUE. For more details see Wu et al. (2003).
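The model and the ad hoc rule described above can be illustrated with a small base R simulation. This is only a sketch under stated assumptions, not the code used inside gcrma: the location and scale of the log backgrounds, the exponential stand-in for the signal, and the use of the MM value as the background estimate B are all hypothetical choices made for illustration.

```r
set.seed(42)
n   <- 1000
rho <- 0.7   # correlation of log background within a PM/MM pair
k   <- 6     # tuning parameter; 6 is the default in the usage above when fast = TRUE

# (log(background1), log(background2))' drawn from a bivariate normal with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)
background1 <- exp(5 + 0.5 * z1)   # hypothetical location and scale
background2 <- exp(5 + 0.5 * z2)

# Stand-in signal only; this is NOT the power-law distribution the model assumes
signal <- rexp(n, rate = 1 / 200)

pm <- background1 + signal   # PM = background1 + signal
mm <- background2            # MM = background2

# Ad hoc adjustment: subtract a background estimate B, then set values below k to k
b        <- mm               # crude stand-in for the estimated background B
adjusted <- pmax(pm - b, k)
```

Because of the `pmax()` step, no adjusted value falls below k, which is what keeps the later log transformation well defined.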
a vector of same length as x.
Rafael Irizarry, Zhijin (Jean) Wu
This function performs background adjustment (optical noise and non-specific binding) on an AffyBatch object and returns an AffyBatch object in which the PM intensities are adjusted.
bg.adjust.gcrma(object,affinity.info=NULL, affinity.source=c("reference","local"), NCprobe=NULL, type=c("fullmodel","affinities","mm","constant"), k=6*fast+0.5*(1-fast),stretch=1.15*fast+1*(1-fast),correction=1, GSB.adjust=TRUE, rho=.7,optical.correct=TRUE,verbose=TRUE,fast=TRUE)
object |
an |
affinity.info |
|
affinity.source |
|
NCprobe |
Index of negative control probes. When set as
|
type |
"fullmodel" for sequence and MM model. "affinities" for sequence information only. "mm" for using MM without sequence information. |
k |
A tuning factor. |
stretch |
. |
correction |
. |
GSB.adjust |
Logical value. If |
rho |
correlation coefficient of log background intensity in a pair of pm/mm probes. Default=.7 |
optical.correct |
Logical value. If |
verbose |
Logical value. If |
fast |
Logical value. If |
The returned value is an AffyBatch
object, in which the PM probe intensities
have been background adjusted. The rest is left the same as the
starting AffyBatch
object.
The tuning factor k has a different meaning depending on whether one uses the fast (ad hoc) algorithm or the empirical Bayes approach. See Wu et al. (2003).
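The default expressions for k and stretch in the usage above are arithmetic switches on the logical `fast`, since R coerces `TRUE` to 1 and `FALSE` to 0. The helper names below are hypothetical and exist only to show how those defaults resolve; they are not part of the package.

```r
# Hypothetical helpers that mirror the default expressions shown in the usage
default_k <- function(fast, slow_value = 0.5) {
  6 * fast + slow_value * (1 - fast)
}
default_stretch <- function(fast) {
  1.15 * fast + 1 * (1 - fast)
}

default_k(TRUE)                       # 6
default_k(FALSE)                      # 0.5
default_k(FALSE, slow_value = 0.25)   # 0.25, as in bg.adjust.fullmodel
default_stretch(TRUE)                 # 1.15
default_stretch(FALSE)                # 1
```

The two values of k live on different scales, which is consistent with the note that k does not mean the same thing under the two algorithms.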
An AffyBatch.
Rafael Irizarry
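if (require(affydata) && require(hgu95av2probe) && require(hgu95av2cdf)) {
  data(Dilution)
  # Compute probe affinities from the probe sequences for this chip type
  ai <- compute.affinities(cdfName(Dilution))
  # Background-adjust the PM intensities using sequence information only
  Dil.adj <- bg.adjust.gcrma(Dilution, affinity.info=ai, type="affinities")
}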
An internal function to be used by gcrma
bg.parameters.ns(x,affinities,affinities2=NULL,affinities3=NULL,span=.2)
x |
PM or MM intensities after optical background correction, before non-specific-binding correction. |
affinities |
Probe affinities for probes with known sequences. Used to estimate the function relating non-specific binding to affinities. |
affinities2 |
Probe affinities for the probes whose expected non-specific binding intensity is to be predicted. |
affinities3 |
Probe affinities for an additional group of probes whose expected non-specific binding intensity is to be predicted. |
span |
The span parameter passed to the loess function. |
a vector of same length as x.
Rafael Irizarry, Zhijin (Jean) Wu
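The role of the affinities, affinities2 and span arguments can be pictured with a plain loess fit in base R. This is a minimal sketch on simulated data, assuming the smooth is fitted to log intensities; it is not the body of bg.parameters.ns, and the simulated relationship and noise level are hypothetical.

```r
set.seed(1)

# Simulated probes with known affinities and their log intensities (hypothetical values)
affinities <- runif(500, min = 0, max = 2)
log_nsb    <- 6 + 1.5 * affinities + rnorm(500, sd = 0.3)

# Smooth log intensity as a function of affinity; span = 0.2 matches the default above
fit <- loess(log_nsb ~ affinities, span = 0.2)

# Predict the expected log non-specific binding for a second group of probes
affinities2  <- seq(0.2, 1.8, by = 0.4)
expected_nsb <- predict(fit, newdata = data.frame(affinities = affinities2))
```

A smaller span follows the data more closely at the cost of a noisier curve. Note that `predict()` on a default loess fit returns `NA` for affinities outside the range used for fitting.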
An internal function to calculate probe affinities from their sequences.
compute.affinities(cdfname,verbose=TRUE)
compute.affinities2(cdfname,verbose=TRUE)
check.probes(probepackage,cdfname)
cdfname |
Object of class |
probepackage |
|
verbose |
Logical value. If |
The affinity of a probe is described as the sum of position-dependent base affinities. Each base at each position contributes to the total affinity of a probe in an additive fashion. For a given type of base, the positional effect is modeled as a spline function with 5 degrees of freedom.
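The additive model in the paragraph above can be sketched with the `splines` package that ships with R. The coefficients here are random and purely hypothetical (the package's fitted values are stored in affinity.spline.coefs), the 25-base probe length is an assumption, and `probe_affinity` is an illustrative helper, not a gcrma function.

```r
library(splines)

probe_length <- 25                    # assumption: 25-mer probes
bases <- c("A", "C", "G", "T")

# Natural spline basis over probe position with 5 degrees of freedom (25 x 5)
basis <- ns(seq_len(probe_length), df = 5)

set.seed(7)
# Hypothetical coefficients: 5 spline coefficients per base type
coefs <- matrix(rnorm(5 * 4, sd = 0.2), nrow = 5,
                dimnames = list(NULL, bases))

# Effect of each base at each position (25 x 4)
position_effect <- basis %*% coefs

# Affinity = sum over positions of the effect of the base found at that position
probe_affinity <- function(sequence) {
  chars <- strsplit(sequence, "")[[1]]
  stopifnot(length(chars) == probe_length, all(chars %in% bases))
  sum(position_effect[cbind(seq_len(probe_length), match(chars, bases))])
}

probe_affinity("ACGTACGTACGTACGTACGTACGTA")
```

Because the model is additive, changing one base changes the affinity only by the difference between the two bases' effects at that position.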
Use compute.affinities2
if there are no MM probes.
check.probes
makes sure things are matching as they should.
compute.affinities
returns an AffyBatch
with the
affinities for PM probes in the pm locations and the affinities for MM
probes in the mm locations. NA will be added for probes with no
sequence information.
Rafael Irizarry
Hekstra, D., Taussig, A. R., Magnasco, M., and Naef, F. (2003) Absolute mRNA concentrations from sequence-specific calibration of oligonucleotide array. Nucleic Acids Research, 31. 1962-1968.
This function converts an AffyBatch into an ExpressionSet using the robust multi-array average (RMA) expression measure with the help of probe sequence information.
gcrma(object,affinity.info=NULL, affinity.source=c("reference","local"),NCprobe=NULL, type=c("fullmodel","affinities","mm","constant"), k=6*fast+0.5*(1-fast),stretch=1.15*fast+1*(1-fast),correction=1, GSB.adjust=TRUE, rho=.7,optical.correct=TRUE,verbose=TRUE,fast=TRUE, subset=NULL,normalize=TRUE,...)
object |
an |
affinity.info |
|
affinity.source |
|
NCprobe |
Index of negative control probes. When set as
|
type |
"fullmodel" for sequence and MM model. "affinities" for sequence information only. "mm" for using MM without sequence information. |
k |
A tuning factor. |
stretch |
. |
correction |
. |
GSB.adjust |
Logical value. If |
rho |
correlation coefficient of log background intensity in a pair of pm/mm probes. Default=.7 |
optical.correct |
Logical value. If |
verbose |
Logical value. If |
fast |
Logical value. If |
subset |
a character vector with the names of the probesets to be used in expression calculation. |
normalize |
logical value. If 'TRUE' normalize data using quantile normalization. |
... |
further arguments to be passed (not currently implemented - stub for future use). |
Note that this expression measure is given to you in log base 2 scale. This differs from most of the other expression measure methods.
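Working on the log base 2 scale means that differences between arrays are log2 fold changes, and raw-scale values are recovered with `2^x`. The matrix below is made-up data with hypothetical probeset and array names, standing in for the expression matrix of an ExpressionSet.

```r
# Hypothetical log2 expression values: 2 probesets x 2 arrays
log2_expr <- matrix(c(7.0, 8.0, 10.5, 9.5), nrow = 2,
                    dimnames = list(c("probeset_a", "probeset_b"),
                                    c("array1", "array2")))

# Back-transform to the original intensity scale
raw_expr <- 2^log2_expr

# A difference on the log2 scale is a log2 fold change
log2_fc     <- log2_expr[, "array2"] - log2_expr[, "array1"]
fold_change <- 2^log2_fc
```

Here `log2_fc` is 3.5 for probeset_a and 1.5 for probeset_b, so a one-unit difference corresponds to a doubling on the original scale.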
The tuning factor k has a different meaning depending on whether one uses the fast (ad hoc) algorithm or the empirical Bayes approach. See Wu et al. (2003).
An ExpressionSet.
Rafael Irizarry
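if (require(affydata) && require(hgu95av2probe) && require(hgu95av2cdf)) {
  data(Dilution)
  # Compute probe affinities from the probe sequences for this chip type
  ai <- compute.affinities(cdfName(Dilution))
  # Compute the GCRMA expression measure using sequence information only
  Dil.expr <- gcrma(Dilution, affinity.info=ai, type="affinities")
}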
This function adjusts for non-specific binding when all arrays in the dataset share the same probe affinity information. It takes matrices of PM probe intensities, MM probe intensities, other negative control probe intensities (optional) and the associated probe affinities, and returns one matrix of PM probe intensities corrected for non-specific binding.
gcrma.engine(pms,mms,ncs=NULL, pm.affinities=NULL,mm.affinities=NULL,anc=NULL, type=c("fullmodel","affinities","mm","constant"), k=6*fast+0.5*(1-fast), stretch=1.15*fast+1*(1-fast),correction=1,GSB.adjust=TRUE,rho=0.7, verbose=TRUE,fast=FALSE)
pms |
The matrix of PM intensities |
mms |
The matrix of MM intensities |
ncs |
The matrix of negative control probe intensities. When left
as |
pm.affinities |
The vector of PM probe affinities. Note: This can be
shorter than the number of rows in |
mm.affinities |
The vector of MM probe affinities. |
anc |
The vector of Negative Control probe affinities. This is
ignored if MMs are used as negative controls ( |
type |
"fullmodel" for sequence and MM model. "affinities" for sequence information only. "mm" for using MM without sequence information. |
k |
A tuning factor. |
stretch |
. |
correction |
. |
GSB.adjust |
Logical value. If |
rho |
correlation coefficient of log background intensity in a pair of pm/mm probes. Default=.7 |
verbose |
Logical value. If |
fast |
Logical value. If |
Note that this expression measure is given to you in log base 2 scale. This differs from most of the other expression measure methods.
The tuning factor k has a different meaning depending on whether one uses the fast (ad hoc) algorithm or the empirical Bayes approach. See Wu et al. (2003).
A matrix of PM intensities.
Rafael Irizarry & Zhijin Wu
gcrma.engine2
This function adjusts for non-specific binding when each array has its own probe affinity information. It takes an AffyBatch object of probe intensities and an AffyBatch of probe affinities, and returns one matrix of PM probe intensities corrected for non-specific binding.
gcrma.engine2(object,pmIndex=NULL,mmIndex=NULL, NCprobe=NULL,affinity.info, type=c("fullmodel","affinities","mm","constant"), k=6*fast+0.5*(1-fast), stretch=1.15*fast+1*(1-fast),correction=1,GSB.adjust=TRUE,rho=0.7, verbose=TRUE,fast=TRUE)
object |
an |
pmIndex |
Index of PM probes. This will be computed within the
function if left |
mmIndex |
Index of MM probes. This will be computed within the
function if left |
NCprobe |
Index of negative control probes. When set as
|
affinity.info |
|
type |
"fullmodel" for sequence and MM model. "affinities" for sequence information only. "mm" for using MM without sequence information. |
k |
A tuning factor. |
stretch |
. |
correction |
. |
GSB.adjust |
Logical value. If |
rho |
correlation coefficient of log background intensity in a pair of pm/mm probes. Default=.7 |
verbose |
Logical value. If |
fast |
Logical value. If |
Note that this expression measure is given to you in log base 2 scale. This differs from most of the other expression measure methods.
The tuning factor k has a different meaning depending on whether one uses the fast (ad hoc) algorithm or the empirical Bayes approach. See Wu et al. (2003).
A matrix of PM intensities.
Rafael Irizarry & Zhijin Wu
gcrma.engine
This function converts CEL files into an ExpressionSet using the robust multi-array average (RMA) expression measure with the help of probe sequences.
just.gcrma(..., filenames=character(0), phenoData=new("AnnotatedDataFrame"), description=NULL, notes="", compress=getOption("BioC")$affy$compress.cel, normalize=TRUE, bgversion=2, affinity.info=NULL, type=c("fullmodel","affinities","mm","constant"), k=6*fast+0.5*(1-fast), stretch=1.15*fast+1*(1-fast), correction=1, rho=0.7, optical.correct=TRUE, verbose=TRUE, fast=TRUE, minimum=1, optimize.by = c("speed","memory"), cdfname = NULL, read.verbose = FALSE)
justGCRMA(..., filenames=character(0), widget=getOption("BioC")$affy$use.widgets, compress=getOption("BioC")$affy$compress.cel, celfile.path=getwd(), sampleNames=NULL, phenoData=NULL, description=NULL, notes="", normalize=TRUE, bgversion=2, affinity.info=NULL, type=c("fullmodel","affinities","mm","constant"), k=6*fast+0.5*(1-fast), stretch=1.15*fast+1*(1-fast), correction=1, rho=0.7, optical.correct=TRUE, verbose=TRUE, fast=TRUE, minimum=1, optimize.by = c("speed","memory"), cdfname = NULL, read.verbose = FALSE)
... |
file names separated by comma. |
filenames |
file names in a character vector. |
widget |
a logical specifying if widgets should be used. |
compress |
are the CEL files compressed? |
phenoData |
a |
description |
a |
notes |
notes. |
affinity.info |
|
type |
"fullmodel" for sequence and MM model. "affinities" for sequence information only. "mm" for using MM without sequence information. |
k |
A tuning factor. |
rho |
correlation coefficient of log background intensity in a pair of pm/mm probes. Default=.7. |
stretch |
. |
correction |
. |
normalize |
Logical value. If |
optical.correct |
Logical value. If |
verbose |
Logical value. If |
fast |
Logical value. If |
optimize.by |
"speed" will use a faster algorithm but more RAM, and "memory" will be slower, but require less RAM. |
bgversion |
integer value indicating which RMA background to use. 1: use background similar to the pure R rma background given in affy version 1.0 - 1.0.2; 2: use background similar to the pure R rma background given in affy version 1.1 and above. |
minimum |
. |
celfile.path |
a character string giving the path where 'ReadAffy' should look for CEL files. |
sampleNames |
a character vector of sample names to be used in the 'AffyBatch'. |
cdfname |
Used to specify the name of an alternative cdf package. If set to
|
read.verbose |
Logical value. If |
This method should require much less RAM than the conventional method of first creating an AffyBatch and then running gcrma.
This is a simpler version than gcrma, so some of the arguments available in gcrma are not available here. For example, it is not possible to use the MM probes to estimate background. Instead, the internal NSB estimates are used (which is also the default for gcrma).
Note that this expression measure is given to you in log base 2 scale. This differs from most of the other expression measure methods.
The tuning factor k has a different meaning depending on whether one uses the fast (ad hoc) algorithm or the empirical Bayes approach. See Wu et al. (2003).
fast.bkg and mem.bkg are two internal functions.
An ExpressionSet object.
James W. MacDonald