Title: | Variance stabilization and calibration for microarray data |
---|---|
Description: | The package implements a method for normalising microarray intensities from single- and multiple-color arrays. It can also be used for data from other technologies, as long as they have similar format. The method uses a robust variant of the maximum-likelihood estimator for an additive-multiplicative error model and affine calibration. The model incorporates data calibration step (a.k.a. normalization), a model for the dependence of the variance on the mean intensity and a variance stabilizing data transformation. Differences between transformed intensities are analogous to "normalized log-ratios". However, in contrast to the latter, their variance is independent of the mean, and they are usually more sensitive and specific in detecting differential transcription. |
Authors: | Wolfgang Huber, with contributions from Anja von Heydebreck. Many comments and suggestions by users are acknowledged, among them Dennis Kostka, David Kreil, Hans-Ulrich Klein, Robert Gentleman, Deepayan Sarkar and Gordon Smyth |
Maintainer: | Wolfgang Huber <[email protected]> |
License: | Artistic-2.0 |
Version: | 3.75.0 |
Built: | 2024-12-31 03:04:56 UTC |
Source: | https://github.com/bioc/vsn |
vsn
The main function of the package is vsn2
.
Interesting for its applications are also
predict
and the wrapper function justvsn
.
vsn2
can be applied to objects of class
ExpressionSet
,
NChannelSet
,
AffyBatch
(from the affy
package) and
RGList
(from the limma
package),
matrix
and vector
. It returns an object of class
vsn
, which contains the results of fitting the
vsn
model to the data.
The most common use case is that you will want to construct a new data object with the vsn-normalized data whose class is the same as that of the input data and which preserves the metadata. This can be achieved by
fit = vsn2(x, ...) nx = predict(fit, newdata=x)
To simplify this, there exists also a simple wrapper
justvsn
.
Wolfgang Huber
justvsn
is equivalent to calling
fit = vsn2(x, ...) nx = predict(fit, newdata=x, useDataInFit = TRUE)
vsnrma
is a wrapper around vsn2
and rma
.
justvsn(x, ...) vsnrma(x, ...)
justvsn(x, ...) vsnrma(x, ...)
x |
For |
... |
Further arguments that get passed on to |
vsnrma
does probe-wise
background correction and between-array normalization by calling
vsn2
on the perfect match (PM) values only. Probeset
summaries are calculated with the medianpolish algorithm of
rma
.
justvsn
returns the vsn-normalised intensities in
an object generally of the same class as its first
argument (see the man page of predict
for
details). It preserves the metadata.
vsnrma
returns an ExpressionSet
.
Wolfgang Huber
##-------------------------------------------------- ## use "vsn2" to produce a "vsn" object ##-------------------------------------------------- data("kidney") fit = vsn2(kidney) nkid = predict(fit, newdata=kidney) ##-------------------------------------------------- ## justvsn on ExpressionSet ##-------------------------------------------------- nkid2 = justvsn(kidney) stopifnot(identical(exprs(nkid), exprs(nkid2))) ##-------------------------------------------------- ## justvsn on RGList ##-------------------------------------------------- rg = new("RGList", list(R=exprs(kidney)[,1,drop=FALSE], G=exprs(kidney)[,2,drop=FALSE])) erge = justvsn(rg)
##-------------------------------------------------- ## use "vsn2" to produce a "vsn" object ##-------------------------------------------------- data("kidney") fit = vsn2(kidney) nkid = predict(fit, newdata=kidney) ##-------------------------------------------------- ## justvsn on ExpressionSet ##-------------------------------------------------- nkid2 = justvsn(kidney) stopifnot(identical(exprs(nkid), exprs(nkid2))) ##-------------------------------------------------- ## justvsn on RGList ##-------------------------------------------------- rg = new("RGList", list(R=exprs(kidney)[,1,drop=FALSE], G=exprs(kidney)[,2,drop=FALSE])) erge = justvsn(rg)
Intensity data for one cDNA slide with two adjacent tissue samples from a nephrectomy (kidney)
data(kidney)
data(kidney)
kidney
is an
ExpressionSet
containing the data from one cDNA
chip. The 8704x2 matrix exprs(kidney)
contains the
spot intensities for the red (635 nm) and green color channels
(532 nm) respectively. For each spot, a background estimate from a
surrounding region was subtracted.
The chip was produced in 2001 by Holger Sueltmann at the Division of Molecular Genome Analysis at the German Cancer Research Center in Heidelberg.
Huber W, Boer JM, von Heydebreck A, Gunawan B, Vingron M, Fuzesi L, Poustka A, Sueltmann H. Transcription profiling of renal cell carcinoma. Verh Dtsch Ges Pathol. 2002;86:153-64. PMID: 12647365
data("kidney") plot(exprs(kidney), pch = ".", log = "xy") abline(a = 0, b = 1, col = "blue")
data("kidney") plot(exprs(kidney), pch = ".", log = "xy") abline(a = 0, b = 1, col = "blue")
logLik
calculates the log likelihood and its gradient
for the vsn model. plotVsnLogLik
makes a false color plot for
a 2D section of the likelihood landscape.
## S4 method for signature 'vsnInput' logLik(object, p, mu = numeric(0), sigsq=as.numeric(NA), calib="affine") plotVsnLogLik(object, p, whichp = 1:2, expand = 1, ngrid = 31L, fun = logLik, main = "log likelihood", ...)
## S4 method for signature 'vsnInput' logLik(object, p, mu = numeric(0), sigsq=as.numeric(NA), calib="affine") plotVsnLogLik(object, p, whichp = 1:2, expand = 1, ngrid = 31L, fun = logLik, main = "log likelihood", ...)
object |
A |
p |
For |
mu |
Numeric vector of length 0 or |
sigsq |
Numeric scalar. |
calib |
as in |
whichp |
Numeric vector of length 2, with the indices of those
two parameters in |
expand |
Numeric vector of length 1 or 2 with expansion factors for the plot range. The range is auto-calculated using a heuristic, but manual adjustment can be useful; see example. |
ngrid |
Integer scalar, the grid size. |
fun |
Function to use for log-likelihood calculation. This parameter is exposed only for testing purposes. |
main |
This parameter is passed on |
... |
Arguments that get passed on to |
logLik
is an R interface to the likelihood computations in vsn (which are done in C).
For logLik
, a numeric matrix of size nrow(p)+1
by ncol(p)
.
Its columns correspond to the columns of p
.
Its first row are the likelihood values, its rows 2...nrow(p)+1
contain the gradients.
If mu
and sigsq
are
specified, the ordinary negative log likelihood is calculated using these
parameters as given. If they are not specified, the profile negative log likelihood
is calculated.
For plotVsnLogLik
, a dataframe with the 2D grid coordinates and
log likelihood values.
Wolfgang Huber
data("kidney") v = new("vsnInput", x=exprs(kidney), pstart=array(as.numeric(NA), dim=c(1, ncol(kidney), 2))) fit = vsn2(kidney) print(coef(fit)) p = sapply(seq(-1, 1, length=31), function(f) coef(fit)+c(0,0,f,0)) ll = logLik(v, p) plot(p[3, ], ll[1, ], type="l", xlab=expression(b[1]), ylab=expression(-log(L))) abline(v=coef(fit)[3], col="red") plotVsnLogLik(v, coef(fit), whichp=c(1,3), expand=0.2)
data("kidney") v = new("vsnInput", x=exprs(kidney), pstart=array(as.numeric(NA), dim=c(1, ncol(kidney), 2))) fit = vsn2(kidney) print(coef(fit)) p = sapply(seq(-1, 1, length=31), function(f) coef(fit)+c(0,0,f,0)) ll = logLik(v, p) plot(p[3, ], ll[1, ], type="l", xlab=expression(b[1]), ylab=expression(-log(L))) abline(v=coef(fit)[3], col="red") plotVsnLogLik(v, coef(fit), whichp=c(1,3), expand=0.2)
8 cDNA chips from Alizadeh lymphoma paper
data(lymphoma)
data(lymphoma)
lymphoma
is an ExpressionSet
containing the data from 8 chips
from the lymphoma data set by Alizadeh et al. (see references). Each
chip represents two samples: on color channel 1 (CH1, Cy3, green) the
common reference sample, and on color channel 2 (CH2, Cy5, red) the
various disease samples. See pData(lymphoma)
. The 9216x16
matrix exprs(lymphoma)
contains the background-subtracted spot
intensities (CH1I-CH1B and CH2I-CH2B, respectively).
The chip intensity files were downloaded from the Stanford microarray database. Starting from the link below, this was done by following the links Published Data -> Alizadeh AA, et al. (2000) Nature 403(6769):503-11 -> Data in SMD -> Display Data, and selecting the following 8 slides:
lc7b019 |
lc7b047 |
lc7b048 |
lc7b056 |
lc7b057 |
lc7b058 |
lc7b069 |
lc7b070 |
Then, the script makedata.R
from the scripts
subdirectory
of this package was run to generate the R data object.
http://genome-www5.stanford.edu/MicroArray/SMD
A. Alizadeh et al., Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403(6769):503-11, Feb 3, 2000.
data("lymphoma") lymphoma pData(lymphoma)
data("lymphoma") lymphoma pData(lymphoma)
Methods for objects of classes
matrix
,
ExpressionSet
,
vsn
and
MAList
to plot row standard deviations versus row means.
meanSdPlot(x, ranks = TRUE, xlab = ifelse(ranks, "rank(mean)", "mean"), ylab = "sd", pch, plot = TRUE, bins = 50, ...)
meanSdPlot(x, ranks = TRUE, xlab = ifelse(ranks, "rank(mean)", "mean"), ylab = "sd", pch, plot = TRUE, bins = 50, ...)
x |
An object of class
|
ranks |
Logical, indicating whether the x-axis (means) should be plotted
on the original scale ( |
xlab |
Character, label for the x-axis. |
ylab |
Character, label for the y-axis. |
pch |
Ignored - exists for backward compatibility. |
plot |
Logical. If |
bins |
Gets passed on to |
... |
Further arguments that get passed on to |
Standard deviation and mean are calculated row-wise from the
expression matrix (in) x
. The scatterplot of these versus each other
allows you to visually verify whether there is a dependence of the standard
deviation (or variance) on the mean.
The red line depicts the running median estimator (window-width 10%).
If there is no variance-mean dependence, then the line should be approximately horizontal.
A named list with five components: its elements px
and
py
are the x- and y-coordinates of the individual data points
in the plot; its first and second element are the x-coordinates and values of
the running median estimator (the red line in the plot).
Its element gg
is the plot object (see examples).
Depending on the value of plot
, the method can (and by default does) have a side effect,
which is to print gg
on the active graphics device.
Wolfgang Huber
data("kidney") log.na <- function(x) log(ifelse(x>0, x, NA)) exprs(kidney) <- log.na(exprs(kidney)) msd <- meanSdPlot(kidney) ## The `ggplot` object is returned in list element `gg`, here is an example of how to modify the plot library("ggplot2") msd$gg + ggtitle("Hello world") + scale_fill_gradient(low = "yellow", high = "darkred") + scale_y_continuous(limits = c(0, 7)) ## Try this out with not log-transformed data, vsn2-transformed data, the lymphoma data, your data ...
data("kidney") log.na <- function(x) log(ifelse(x>0, x, NA)) exprs(kidney) <- log.na(exprs(kidney)) msd <- meanSdPlot(kidney) ## The `ggplot` object is returned in list element `gg`, here is an example of how to modify the plot library("ggplot2") msd$gg + ggtitle("Hello world") + scale_fill_gradient(low = "yellow", high = "darkred") + scale_y_continuous(limits = c(0, 7)) ## Try this out with not log-transformed data, vsn2-transformed data, the lymphoma data, your data ...
Wrapper for vsn2
to be used as a normalization
method with the expresso function of the package affy. The expresso
function is deprecated, consider using justvsn
instead. The normalize.AffyBatch.vsn can still be useful on its own,
as it provides some additional control of the normalization process
(fitting on subsets, alternate transform parameters).
normalize.AffyBatch.vsn( abatch, reference, strata = NULL, subsample = if (nrow(exprs(abatch))>30000L) 30000L else 0L, subset, log2scale = TRUE, log2asymp=FALSE, ...)
normalize.AffyBatch.vsn( abatch, reference, strata = NULL, subsample = if (nrow(exprs(abatch))>30000L) 30000L else 0L, subset, log2scale = TRUE, log2asymp=FALSE, ...)
abatch |
An object of type
|
reference |
Optional, a 'vsn' object from a previous fit. If this
argument is specified, the data in 'x' are normalized "towards" an
existing set of reference arrays whose parameters are stored in the
object 'reference'. If this argument is not specified, then the data
in 'x' are normalized "among themselves". See |
strata |
The 'strata' functionality is not supported, the parameter is ignored. |
subsample |
Is passed on to |
subset |
This allows the specification of a subset of expression measurements to be used for the vsn fit. The transformation with the parameters of this fit is then, however, applied to the whole dataset. This is useful for excluding expression measurements that are known to be differentially expressed or control probes that may not match the vsn model, thus avoiding that they influence the normalization process. This operates at the level of probesets, not probes. Both 'subset' and 'subsample' can be used together. |
log2scale |
If TRUE, this will perform a global affine transform on the data to put them on a similar scale as the original non-transformed data. Many users prefer this. Fold-change estimates are not affected by this transform. In some situations, however, it may be helpful to turn this off, e.g., when comparing independently normalized subsets of the data. |
log2asymp |
If TRUE, this will perform a global affine transform on the data to make the generalized log (asinh) transform be asymptotically identical to a log base 2 transform. Some people find this helpful. Only one of 'log2scale' or 'log2asymp' can be set to TRUE. Fold-change estimates are not affected by this transform. |
... |
Further parameters for |
Please refer to the Details and References
sections of the
man page for vsn2
for more details about this method.
Important note: after calling vsn2
, the function
normalize.AffyBatch.vsn
exponentiates the data (base 2).
This is done in order to make the behavior of this function similar to the
other normalization methods in affy. That packages uses the convention
of taking the logarithm to base in
subsequent analysis steps (e.g. in medpolish
).
An object of class AffyBatch
.
The vsn
object returned, which can be used as reference
for
subsequent fits, is provided by
description(abatch)@preprocessing$vsnReference
.
D. P. Kreil http://bioinf.boku.ac.at/, Wolfgang Huber
## Please see vignette.
## Please see vignette.
Functions to validate and assess the performance of vsn through simulation of data.
sagmbSimulateData(n=8064, d=2, de=0, up=0.5, nrstrata=1, miss=0, log2scale=FALSE) sagmbAssess(h1, sim)
sagmbSimulateData(n=8064, d=2, de=0, up=0.5, nrstrata=1, miss=0, log2scale=FALSE) sagmbAssess(h1, sim)
n |
Numeric. Number of probes (rows). |
d |
Numeric. Number of arrays (columns). |
de |
Numeric. Fraction of differentially expressed genes. |
up |
Numeric. Fraction of up-regulated genes among the differentially expressed genes. |
nrstrata |
Numeric. Number of probe strata. |
miss |
Numeric. Fraction of data points that is randomly sampled
and set to |
log2scale |
Logical. If |
h1 |
Matrix. Calibrated and transformed data, according, e.g., to vsn |
sim |
List. The output of a previous call to
|
Please see the vignette.
For sagmbSimulateData
, a list with four components:
hy
, an n x d
matrix with the true (=simulated)
calibrated, transformed data;
y
, an n x d
matrix with the simulated
uncalibrated raw data - this is intended to be fed into
vsn2
;
is.de
, a logical vector of length n
, specifying
which probes are simulated to be differentially expressed.
strata
, a factor of length n
.
For sagmbSimulateData
, a number: the root mean squared
difference between true and estimated transformed data.
Wolfgang Huber
Wolfgang Huber, Anja von Heydebreck, Holger Sueltmann, Annemarie Poustka, and Martin Vingron (2003) "Parameter estimation for the calibration and variance stabilization of microarray data", Statistical Applications in Genetics and Molecular Biology: Vol. 2: No. 1, Article 3. http://www.bepress.com/sagmb/vol2/iss1/art3
sim <- sagmbSimulateData(nrstrata = 4) ny <- vsn2(sim$y, strata = sim$strata) res <- sagmbAssess(exprs(ny), sim) res
sim <- sagmbSimulateData(nrstrata = 4) ny <- vsn2(sim$y, strata = sim$strata) res <- sagmbAssess(exprs(ny), sim) res
The transformation that is applied to the scaling parameter of the vsn model
scalingFactorTransformation(b)
scalingFactorTransformation(b)
b |
Real vector. |
A real vector of same length as b, with transformation f
applied (see
vignette Likelihood Calculations for vsn).
Wolfgang Huber
b = seq(-3, 2, length=20) fb = scalingFactorTransformation(b) if(interactive()) plot(b, fb, type="b", pch=16)
b = seq(-3, 2, length=20) fb = scalingFactorTransformation(b) if(interactive()) plot(b, fb, type="b", pch=16)
Class to contain result of a vsn fit
new("vsn")
vsn2(x)
with x
being an
ExpressionSet
.
coefficients
:A 3D array of size (number of strata) x (number of columns of the data matrix) x 2. It contains the fitted normalization parameters (see vignette).
strata
:A factor of length 0 or n. If its length is n, then its levels correspond to different normalization strata (see vignette).
mu
:A numeric vector of length n with the fitted
parameters , for
.
sigsq
:A numeric scalar, .
hx
:A numeric matrix with 0 or n rows. If the number of
rows is n, then hx
contains the transformed data matrix.
lbfgsb
:An integer scalar containing the return code from the L-BFGS-B optimizer.
hoffset
:Numeric scalar, the overall offset - see
manual page of
vsn2
.
calib
:Character of length 1, see manual page of
vsn2
.
[
Subset
dim
Get dimensions of data matrix.
nrow
Get number of rows of data matrix.
ncol
Get number of columns of data matrix.
show
Print a summary of the object
exprs
Accessor to slot hx
.
coef
, coefficients
Accessors to slot
coefficients
.
Wolfgang Huber
data("kidney") v = vsn2(kidney) show(v) dim(v) v[1:10, ]
data("kidney") v = vsn2(kidney) show(v) dim(v) v[1:10, ]
vsn2
fits the vsn model to the data
in x
and returns a vsn
object with
the fit parameters and the transformed data matrix.
The data are, typically, feature intensity readings from a
microarray, but this function may also be useful for other kinds of
intensity data that obey an additive-multiplicative error model.
To obtain an object of the same class as x
, containing
the normalised data and the same metdata as x
, use
fit = vsn2(x, ...) nx = predict(fit, newdata=x)
or the wrapper justvsn
.
Please see the vignette Introduction to vsn.
vsnMatrix(x, reference, strata, lts.quantile = 0.9, subsample = 0L, verbose = interactive(), returnData = TRUE, calib = "affine", pstart, minDataPointsPerStratum = 42L, optimpar = list(), defaultpar = list(factr=5e7, pgtol=2e-4, maxit=60000L, trace=0L, cvg.niter=7L, cvg.eps=0)) ## S4 method for signature 'ExpressionSet' vsn2(x, reference, strata, ...) ## S4 method for signature 'AffyBatch' vsn2(x, reference, strata, subsample, ...) ## S4 method for signature 'NChannelSet' vsn2(x, reference, strata, backgroundsubtract=FALSE, foreground=c("R","G"), background=c("Rb", "Gb"), ...) ## S4 method for signature 'RGList' vsn2(x, reference, strata, ...)
vsnMatrix(x, reference, strata, lts.quantile = 0.9, subsample = 0L, verbose = interactive(), returnData = TRUE, calib = "affine", pstart, minDataPointsPerStratum = 42L, optimpar = list(), defaultpar = list(factr=5e7, pgtol=2e-4, maxit=60000L, trace=0L, cvg.niter=7L, cvg.eps=0)) ## S4 method for signature 'ExpressionSet' vsn2(x, reference, strata, ...) ## S4 method for signature 'AffyBatch' vsn2(x, reference, strata, subsample, ...) ## S4 method for signature 'NChannelSet' vsn2(x, reference, strata, backgroundsubtract=FALSE, foreground=c("R","G"), background=c("Rb", "Gb"), ...) ## S4 method for signature 'RGList' vsn2(x, reference, strata, ...)
x |
An object containing the data to which the model is fitted. |
reference |
Optional, a |
strata |
Optional, a |
lts.quantile |
Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression. Allowed values are between 0.5 and 1. A value of 1 corresponds to ordinary least sum of squares regression. |
subsample |
Integer of length 1. If its value is greater than 0,
the model parameters are
estimated from a subsample of the data of size |
backgroundsubtract |
Logical of length 1: should local background estimates be subtracted before fitting vsn? |
foreground , background
|
Aligned character vectors of the same length,
naming the channels of |
verbose |
Logical. If TRUE, some messages are printed. |
returnData |
Logical. If TRUE, the transformed data are returned
in a slot of the resulting |
calib |
Character of length 1. Allowed values are |
pstart |
Optional, a three-dimensional numeric array that
specifies start values for the iterative parameter
estimation algorithm.
If not specified, the function tries to guess useful start values.
The first dimension corresponds to the levels of |
minDataPointsPerStratum |
The minimum number of data points per stratum. Normally there is no need for the user to change this; refer to the vignette for further documentation. |
optimpar |
Optional, a list with parameters for the likelihood
optimisation algorithm. Default parameters are taken from
|
defaultpar |
The default parameters for the likelihood
optimisation algorithm. Values in |
... |
Arguments that get passed on to |
An object of class vsn
.
The data are returned on a scale to base 2. More precisely,
the transformed data are subject to the transformation
, where the function
is called the
generalised logarithm, the offset
and the scaling parameter
are the fitted model parameters
(see references), and
is a parameter transformation that
allows ensuring positivity of the factor in front of
while
using an unconstrained optimisation over
[4].
The overall offset
is computed from the
's such that for
large
the transformation approximately corresponds to the
function. This is done separately for each stratum, but with the
same value across arrays. More precisely, if the element
b[s,i]
of the array b is the scaling parameter for the s
-th
stratum and the i
-th array, then c[s]
is computed as
log2(2*f(mean(b[,i])))
.
The offset c is inconsequential for all differential
expression calculations, but many users like to see the data in a
range that they are familiar with.
vsn2
methods exist for
ExpressionSet
,
NChannelSet
,
AffyBatch
(from the affy
package),
RGList
(from the limma
package),
matrix
and numeric
.
If x
is an NChannelSet
, then
vsn2
is applied to the matrix that is obtained
by horizontally concatenating the color channels.
Optionally, available background estimates can be subtracted before.
If x
is an RGList
, it is
converted into an NChannelSet
using a copy of Martin Morgan's code for RGList
to
NChannelSet
coercion, then the NChannelSet
method is called.
If the reference
argument is not specified, then the model
parameters and
are fit from the data in
x
.
This is the mode of operation described in [1]
and that was the only option in versions 1.X of this package.
If reference
is specified, the model parameters
and
are taken from it.
This allows for 'incremental' normalization [4].
L-BFGS-B
uses three termination criteria:
(f_k - f_{k+1}) / max(|f_k|, |f_{k+1}|, 1) <= factr * epsmch
where epsmch
is the machine precision.
|gradient| < pgtol
iterations > maxit
These are set by the elements factr
, pgtol
and
maxit
of optimpar
. The remaining elements are
trace
An integer between 0 and 6, indicating the
verbosity level of L-BFGS-B
, higher values
create more output.
cvg.niter
The number of iterations to be used in the least trimmed sum of squares regression.
cvg.eps
Numeric. A convergence threshold for the least trimmed sum of squares regression.
Wolfgang Huber
[1] Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Wolfgang Huber, Anja von Heydebreck, Holger Sueltmann, Annemarie Poustka, Martin Vingron; Bioinformatics (2002) 18 Suppl.1 S96-S104.
[2] Parameter estimation for the calibration and variance stabilization of microarray data, Wolfgang Huber, Anja von Heydebreck, Holger Sueltmann, Annemarie Poustka, and Martin Vingron; Statistical Applications in Genetics and Molecular Biology (2003) Vol. 2 No. 1, Article 3. http://www.bepress.com/sagmb/vol2/iss1/art3.
[3] L-BFGS-B: Fortran Subroutines for Large-Scale Bound Constrained Optimization, C. Zhu, R.H. Byrd, P. Lu and J. Nocedal, Technical Report, Northwestern University (1996).
[4] Package vignette: Likelihood Calculations for vsn
data("kidney") fit = vsn2(kidney) ## fit nkid = predict(fit, newdata=kidney) ## apply fit plot(exprs(nkid), pch=".") abline(a=0, b=1, col="red")
data("kidney") fit = vsn2(kidney) ## fit nkid = predict(fit, newdata=kidney) ## apply fit plot(exprs(nkid), pch=".") abline(a=0, b=1, col="red")
Apply the vsn transformation to data.
## S4 method for signature 'vsn' predict(object, newdata, strata=object@strata, log2scale=TRUE, useDataInFit=FALSE)
## S4 method for signature 'vsn' predict(object, newdata, strata=object@strata, log2scale=TRUE, useDataInFit=FALSE)
object |
An object of class |
newdata |
Object of class
|
strata |
Optional, a |
log2scale |
If |
useDataInFit |
If fit = vsn2(x, ...) nx = predict(fit, newdata=x) and is used, for example, in the |
An object typically of the same class as newdata
. There are two
exceptions: if newdata
is an
RGList
, the return value is an
NChannelSet
, and
if newdata
is numeric, the return value is a matrix
with 1
column.
Wolfgang Huber
data("kidney") ## nb: for random subsampling, the 'subsample' argument of vsn ## provides an easier way to do this fit = vsn2(kidney[sample(nrow(kidney), 500), ]) tn = predict(fit, newdata=exprs(kidney))
data("kidney") ## nb: for random subsampling, the 'subsample' argument of vsn ## provides an easier way to do this fit = vsn2(kidney[sample(nrow(kidney), 500), ]) tn = predict(fit, newdata=exprs(kidney))
Class to contain input data and parameters for vsn functions
new("vsnInput")
x
:A numeric matrix with the input data.
reference
:An object of vsn
,
typically this would have been
obtained from a previous fit to a set of reference arrays (data).
strata
:A factor of length 0 or n. If its length is
n, then its levels correspond to different normalization strata
(see vsn2
).
ordered
:Logical scalar; are the rows reordered so that the strata are contiguous.
lts.quantile
:Numeric scalar, seevsn2
.
subsample
:Integer scalar, seevsn2
.
verbose
:Logical scalar, seevsn2
.
calib
Character of length 1, see manual page of vsn2
.
pstart
:A 3D array of size (number of strata) x (number of columns of the data matrix) x 2. It contains the start parameters.
optimpar
:List with parameters for the numerical
optimiser L-BFGS-B
; see the manual page of vsn2
.
[
Subset
dim
Get dimensions of data matrix.
nrow
Get number of rows of data matrix.
ncol
Get number of columns of data matrix.
show
Print a summary of the object
Wolfgang Huber