Title: | The GASSCO method for correcting for slide-dependent gene-specific dye bias |
---|---|
Description: | Many two-colour hybridizations suffer from a dye bias that is both gene-specific and slide-specific. The former depends on the content of the nucleotide used for labeling; the latter depends on the labeling percentage. The slide-dependency was hitherto not recognized, and made addressing the artefact impossible. Given a reasonable number of dye-swapped pairs of hybridizations, or of same vs. same hybridizations, both the gene- and slide-biases can be estimated and corrected using the GASSCO method (Margaritis et al., Mol. Sys. Biol. 5:266 (2009), doi:10.1038/msb.2009.21) |
Authors: | Philip Lijnzaad and Thanasis Margaritis |
Maintainer: | Philip Lijnzaad <[email protected]> |
License: | GPL-3 |
Version: | 1.67.0 |
Built: | 2024-12-19 03:52:52 UTC |
Source: | https://github.com/bioc/dyebias |
Package: | dyebias |
Type: | Package |
Version: | 1.7.1 |
Date: | 26 May 2010 |
Licence: | GPL-3 |
Philip Lijnzaad and Thanasis Margaritis
Philip Lijnzaad <[email protected]> (Maintainer).
dyebias.application.subset
,
dyebias.apply.correction
,
dyebias.boxplot
,
dyebias.estimate.iGSDBs
,
dyebias.monotonicity
,
dyebias.monotonicityplot
,
dyebias.rgplot
,
dyebias.trendplot
.
Convenience function returning a subset of reporters that can be expected to be corrected reasonably well. Often, the logical AND of this set and that of maW(data.norm) == 1.0 is used. The resulting subset is passed as the application.subset argument to dyebias.apply.correction.
dyebias.application.subset(data.raw=NULL, min.SNR=1.5, use.background=FALSE, maxA=15)
data.raw |
A |
min.SNR |
The minimum signal to noise ratio to require. It is loosely defined here as the foreground over the background signal. The background signal may not be real; see below. |
use.background |
Logical indicating whether or not to use the background signals |
maxA |
The maximum signal that is still allowed. |
This routine requires an marrayRaw object since only that contains the background intensities. If you only have normalized data, use something like
bg <- matrix(0.5, nrow=maNspots(data.norm), ncol=maNsamples(data.norm))
data.raw <- new("marrayRaw", maRf=maR(data.norm), maGf=maG(data.norm), maRb=bg, maGb=bg, maW=maW(data.norm))
A matrix of logicals with the same dimensions as those of maRf(data.raw) is returned.
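To make the shape of the returned value concrete, the following base-R sketch builds such a logical matrix from toy intensity matrices. It only illustrates the idea described above (foreground over background of at least min.SNR, average log2 intensity of at most maxA). The helper name toy.application.subset and the exact filtering rule are assumptions for illustration, not the package's implementation.

```r
# Hypothetical helper, NOT part of dyebias: a signal-to-noise and
# saturation filter on plain matrices (rows = spots, columns = slides).
toy.application.subset <- function(Rf, Gf, Rb, Gb, min.SNR = 1.5, maxA = 15) {
  # average log2 intensity per spot and slide
  A <- (log2(Rf) + log2(Gf)) / 2
  # both channels must be sufficiently above their background
  snr.ok <- (Rf / Rb >= min.SNR) & (Gf / Gb >= min.SNR)
  snr.ok & (A <= maxA)
}

set.seed(1)
Rf <- matrix(2^runif(12, 5, 16), nrow = 4)   # toy red foreground
Gf <- matrix(2^runif(12, 5, 16), nrow = 4)   # toy green foreground
bg <- matrix(50, nrow = 4, ncol = 3)         # toy constant background

subset.ok <- toy.application.subset(Rf, Gf, bg, bg)
dim(subset.ok)                  # same dimensions as the intensity matrices
summary(as.vector(subset.ok))
```

In a real analysis the result of dyebias.application.subset would be combined with the spot weights, as in the example further down.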
Philip Lijnzaad [email protected]
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R and Holstege, F.C.P. (2009) Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, 5:266, 2009. doi: 10.1038/msb.2009.21.
## First load data and estimate the iGSDBs
## (see dyebias.estimate.iGSDBs)
### choose the estimators and which spots to correct:
estimator.subset <- dyebias.umcu.proper.estimators(maInfo(maGnames(data.norm)))
### choose which genes to dye bias correct. Typically, this is based
### both on flagged spots and intensity
application.subset <- maW(data.norm) == 1 &
  dyebias.application.subset(data.raw=data.raw, use.background=TRUE)
summary(application.subset)
Corrects the gene- and slide specific dye bias in a data set, using the GASSCO method by Margaritis et al.
dyebias.apply.correction(data.norm, iGSDBs, estimator.subset=TRUE, application.subset=TRUE, dyebias.percentile=5, minmaxA.perc=25, minA.abs=NULL, maxA.abs=NULL, verbose=FALSE)
data.norm |
A |
iGSDBs |
A data frame with the intrinsic gene specific dye bias per reporter
(i.e., oligo or cDNA). The data frame would typically have come
from a call to The data frame must have (at least) the following columns:
The order of the rows in this data frame is irrelevant. There must
be no rows with duplicate For any reporter in |
estimator.subset |
An index indicating which reporters are fit to be used as estimators of the slide bias. This set of reporters is used throughout the whole data set. Reporters that are typically excluded are those corresponding to parasitic DNA elements or mitochondrial genes. |
application.subset |
An index indicating which values must be dye
bias-corrected. It should be either a vector with as many values as
spots, or a matrix of the same dimensions as
Often it is prudent not to dye bias-correct measurements that are
close to the detection limit or close to signal saturation. A
convenience function for this is provided; see |
dyebias.percentile |
The slide bias estimation uses a small subset of reporters having the strongest green or red iGSDB, as specified by this percentile. The default should suffice in practically all cases. |
minmaxA.perc |
To obtain a robust estimate of the slide bias, the range of the
average expression |
minA.abs |
If specified, reporters with an average expression
( |
maxA.abs |
If specified, reporters with an average expression
( |
verbose |
Logical specifying whether to be verbose or not |
This function corrects the gene-specific dye bias of two-colour microarrays with the GASSCO method. This method is general, robust and fast, and is based on the observation that the total bias per gene is the product of a slide-specific factor (strongly related to the labeling percentage) and an intrinsic gene-specific factor (iGSDB), which is strongly related to the probe sequence.
The slide bias is estimated from the total bias of the dyebias.percentile percentage of reporters having the strongest iGSDB. The iGSDBs can be estimated with dyebias.estimate.iGSDBs.
If the signal of certain oligos is too weak or, in contrast, tends to be saturated, they are poor estimators of the slide bias. Therefore, only reporters with an average expression level that is not too extreme are allowed to be slide bias estimators. (This is the reason for the A column in the iGSDBs data frame.)
Full control over which reporters to allow as slide bias estimators is given by the arguments minmaxA.perc, minA.abs, and maxA.abs; see there for details. To not exclude any reporter (e.g., when A is not available and therefore artificially set), you can use minA.abs = -Inf and maxA.abs = Inf.
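The multiplicative model behind the correction can be illustrated with simulated data in base R. The sketch below is a simplification under stated assumptions: all variable names are made up, and the per-slide estimator (a ratio of medians over the top and bottom dyebias.percentile of reporters) is one plausible choice rather than the exact computation performed by dyebias.apply.correction.

```r
# Toy illustration of "total bias = slide bias x iGSDB".
# Names and the estimator are hypothetical, not the package's code.
set.seed(42)
n.genes  <- 2000
n.slides <- 8
iGSDB      <- rnorm(n.genes, sd = 0.3)               # intrinsic gene-specific dye bias
slide.bias <- seq(0.2, 1.6, length.out = n.slides)   # slide-specific factor
M <- outer(iGSDB, slide.bias) +                      # total bias is the product
     matrix(rnorm(n.genes * n.slides, sd = 0.1), n.genes, n.slides)

# estimators: reporters in the bottom and top 5 percent of iGSDB
perc  <- 5
green <- iGSDB <= quantile(iGSDB, perc / 100)
red   <- iGSDB >= quantile(iGSDB, 1 - perc / 100)

# per-slide factor: median total bias over median iGSDB,
# averaged over the red and the green estimator sets
est.slide.bias <- sapply(seq_len(n.slides), function(j) {
  mean(c(median(M[red, j])   / median(iGSDB[red]),
         median(M[green, j]) / median(iGSDB[green])))
})

# correction: subtract the predicted total bias from every measurement
M.corrected <- M - outer(iGSDB, est.slide.bias)

round(cbind(true = slide.bias, estimated = est.slide.bias), 2)
```

Using only the reporters with the most extreme iGSDBs keeps the ratio well away from a division by values near zero, which is one reason a small percentile of strong estimators is attractive.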
For further details concerning the method, see the dyebias
vignette and the publication. If your research benefits from using this
package, we kindly request that you cite this work.
The data returned is a list with the following elements:
data.corrected |
A |
estimators |
Another list, containing the details of the
reporters that were used to obtain an estimate of the slide bias.
The contents of the
|
summary |
A data frame summarizing the correction process per slide. It consists of the following columns:
|
data.uncorrected |
The uncorrected input |
Note that the input data should be normalized, and that the dye swaps should not have been swapped back (if needed, this can of course be done afterwards).
Philip Lijnzaad [email protected]
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R and Holstege, F.C.P. (2009). Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, 5:266, 2009. doi: 10.1038/msb.2009.21.
dyebias.estimate.iGSDBs
,
dyebias.application.subset
,
dyebias.rgplot
,
dyebias.maplot
,
dyebias.boxplot
,
dyebias.trendplot
## First load data and estimate the iGSDBs
## (see dyebias.estimate.iGSDBs)
### choose the estimators and which spots to correct:
estimator.subset <- dyebias.umcu.proper.estimators(maInfo(maGnames(data.norm)))
### choose which genes to dye bias correct:
application.subset <- (maW(data.norm) == 1 &
                       dyebias.application.subset(data.raw=data.raw, use.background=TRUE))
### do the correction:
correction <- dyebias.apply.correction(data.norm=data.norm,
                                       iGSDBs = iGSDBs.estimated,
                                       estimator.subset=estimator.subset,
                                       application.subset = application.subset,
                                       verbose=FALSE)
## Not run:
edit(correction$summary)
## End(Not run)
## give overview:
correction$summary[,c("slide", "file", "avg.correction", "reduction.perc", "p.value")]
## and summary:
summary(as.numeric(correction$summary[, "reduction.perc"]))
The aim of this routine is to show the magnitude of the dye bias across the data set, as well as the extent to which the GASSCO method could get rid of it. Typically, two boxplots would be shown, one before and one after dye bias correction. For esthetic reasons, the boxplots are usually ordered by the overall slide bias of the uncorrected data set. See also Margaritis et al. (2009), Fig. 1 and 3.
dyebias.boxplot(data, iGSDBs, dyebias.percentile=5, application.subset=TRUE, order, output=NULL, ylim=c(-4,4), ...)
data |
The |
iGSDBs |
A data frame with intrinsic gene-specific dye biases,
the same as that used in |
dyebias.percentile |
The percentile of intrinsic gene specific dye biases (iGSDBs) for which to highlight the reporters. |
application.subset |
The set of reporters that was eligible for dye bias correction; same
argument as for |
order |
If |
output |
Specifies the output. If |
ylim |
As for |
... |
Other arguments (such as |
The order obtained, for use in a later call to this same function.
Philip Lijnzaad [email protected]
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R and Holstege, F.C.P. (2009) Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, 5:266, 2009. doi: 10.1038/msb.2009.21.
dyebias.estimate.iGSDBs
,
dyebias.apply.correction
,
dyebias.rgplot
,
dyebias.maplot
,
dyebias.trendplot
ylim <- c(-1, 1)
layout(matrix(1:2, nrow=1, ncol=2))
order <- dyebias.boxplot(data=data.norm,
                         iGSDBs=iGSDBs.estimated, # from e.g. dyebias.estimate.iGSDBs
                         order=NULL,              # i.e., order by increasing slide bias
                         output=NULL,
                         main="before correction",
                         ylim=ylim)
order <- dyebias.boxplot(data=correction$data.corrected, # from dyebias.apply.correction
                         iGSDBs=iGSDBs.estimated,
                         order=order,             # order by the original slide bias
                         output=NULL,
                         main="after correction",
                         ylim=ylim)
Obtain estimates for the intrinsic gene-specific dye bias (iGSDB) using a set of normalized data, as part of the GASSCO method.
dyebias.estimate.iGSDBs(data.norm, is.balanced=TRUE, reference="ref", verbose=FALSE)
data.norm |
A If the data is unbalanced (so |
is.balanced |
The use of this argument is discouraged, since designs should generally be
balanced. The values other than Logical indicating whether the data set represents a balanced design
(which is by far the most common case). A design is balanced if all
factor values are present an equal number of times in both the
forward and reverse dye orientations. A self-self design is by
definition balanced (even if the number of slides is uneven). If
If |
reference |
If the design contains a single common reference,
the |
verbose |
Logical, indicating whether or not to be verbose. |
This function implements the first step of the GASSCO method: estimating the so-called intrinsic gene specific dye biases, or briefly iGSDB. They can be estimated from a (preferably large) data set containing either self-self experiments, or dye-swapped slides.
The assumption underlying this approach is that with self-selfs, or with pairs of dye swaps, the only effect that can lead to systematic changes between Cy5 and Cy3, is in fact the dye effect.
There are two cases to distinguish: the balanced case and the unbalanced case. In the balanced case, the iGSDB estimate is simply the average M (where M = log2(R/G) = log2(Cy5/Cy3)) over all slides. A set of slides is balanced if all factor values are present in as many dye-swapped as non-dye-swapped slides. A set of self-self slides is in fact a degenerate form of this, and is therefore also balanced.
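Why a plain average suffices in the balanced case can be seen in a small base-R simulation. This is a sketch on made-up data with hypothetical variable names. It only mirrors the averaging step and does not call the package.

```r
# Toy dye-swap data set; all names are illustrative only.
set.seed(7)
n.genes <- 500
dyebias.true <- rnorm(n.genes, sd = 0.3)   # the quantity we want to recover
effect       <- rnorm(n.genes, sd = 1.0)   # biological log2 ratio, sample vs. reference

# two forward slides (sample in Cy5) and two dye-swapped slides (sample in Cy3);
# the dye swaps are NOT swapped back, so M is always log2(Cy5/Cy3)
orientation <- c(+1, +1, -1, -1)
M <- sapply(orientation,
            function(s) dyebias.true + s * effect + rnorm(n.genes, sd = 0.1))

# balanced design: the biological effect cancels in the plain average
iGSDB.est <- rowMeans(M)
cor(iGSDB.est, dyebias.true)
```

Because each orientation occurs equally often, the biological effect enters the average with a net weight of zero, leaving only the dye effect plus noise.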
In the unbalanced case, one could omit slides until the data set is balanced. However, this is wasteful as we can use linear modelling to obtain estimates. We use the limma package for this (Smyth, 2005). The only unbalanced designs currently supported are a common reference design, and a set of common reference designs.
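The benefit of a linear model in the unbalanced case can be sketched with base R. The package itself relies on limma, as stated above. The example below substitutes a per-gene least-squares fit via lm.fit purely for illustration, on simulated data and with hypothetical variable names.

```r
# Toy unbalanced common-reference design; illustrative only.
set.seed(11)
n.genes <- 300
dyebias.true <- rnorm(n.genes, sd = 0.3)
effect       <- rnorm(n.genes, sd = 1.0)

# three forward slides and a single dye swap
orientation <- c(+1, +1, +1, -1)
M <- sapply(orientation,
            function(s) dyebias.true + s * effect + rnorm(n.genes, sd = 0.1))

# the plain average is contaminated by the biological effect ...
naive <- rowMeans(M)

# ... whereas fitting M = dyebias + orientation * effect separates the two
design <- cbind(dyebias = 1, effect = orientation)
coefs  <- t(apply(M, 1, function(m) lm.fit(design, m)$coefficients))
iGSDB.est <- coefs[, "dyebias"]

c(naive = cor(naive, dyebias.true), model = cor(iGSDB.est, dyebias.true))
```

The intercept column plays the role of the dye effect, while the orientation column absorbs the sample-versus-reference effect, so no slides need to be discarded to restore balance.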
There is no weights or subset argument to this function; the estimation is done for all reporters found. If there are replicate spots, they are averaged prior to the estimation (the reason being that we are not interested in p-values for the estimate).
Having obtained the iGSDB estimates, the corrections can be applied either to the hybridizations given by the data.norm argument, or to a different set of slides that is thought to have very similar iGSDBs. Applying the corrections is done with dyebias.apply.correction.
A data frame is returned with as many rows as there are reporters
(replicate spots have been averaged), and the following columns:
reporterId |
The name of the reporter |
dyebias |
The intrinsic gene-specific dye bias (iGSDB) of this reporter |
A |
The average expression level of this reporter in the given data set |
p.value |
The p-value for the |
This data frame is typically used as input to dyebias.apply.correction
.
Note that the input data should be normalized, and that the dye swaps should not have been swapped back. After all, we're interested in the difference of Cy5 over Cy3, not the difference of experiment over reference.
Philip Lijnzaad [email protected]
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R and Holstege, F.C.P. (2009) Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, 5:266, 2009. doi: 10.1038/msb.2009.21.
Dudoit, S. and Yang, Y.H. (2002) Bioconductor R packages for exploratory analysis and normalization of cDNA microarray data. In: Parmigiani, G., Garrett, E.S. , Irizarry, R.A., and Zeger, S.L. (eds.) The Analysis of Gene Expression Data: Methods and Software, Springer, New York.
Smyth, G.K. (2005) Limma: linear models for microarray data. In: Gentleman, R., Carey, V., Dudoit, S., Irizarry, R. and Huber, W. (eds). Bioinformatics and Computational Biology Solutions using R and Bioconductor, Springer, New York.
dyebias.apply.correction
iGSDBs.estimated <- dyebias.estimate.iGSDBs(data.norm, is.balanced=TRUE, verbose=FALSE)
summary(iGSDBs.estimated)
## Not run:
hist(iGSDBs.estimated$dyebias, breaks=50)
## End(Not run)
If you order genes by their iGSDB, and hybridizations by slide bias, the graphs of each gene should form a 'fan' out of the origin (see also dyebias.trendplot). This function gives a measure of the extent to which this is true.
This function has been deprecated, as it is of limited use and takes too long to compute.
dyebias.monotonicity(data, iGSDBs, dyebias.percentile = 5, order = NULL)
data |
The |
iGSDBs |
A data frame with intrinsic gene-specific dye biases,
the same as that used in |
dyebias.percentile |
The percentile of intrinsic gene specific dye biases (iGSDBs) for which to highlight the reporters. Default should suffice in almost all cases. |
order |
If |
The total dye bias appears to be the product of iGSDB and slide bias. In other words, it is monotonic (always increasing or always decreasing), both with respect to the intrinsic gene specific dye bias and with respect to the slide bias. This function orders genes by their iGSDB and the slides by slide bias. Subsequently, a linear regression is done for each gene, with x being the slide bias rank (not the slide bias itself), and y being the M. The slopes of the regression lines should form an increasing array of values, representing the 'fan' of lines. The degree to which this array is increasing is tested using the Mann-Kendall test, and is returned. In the case of uncorrected data, tau is generally larger than 0.3. After correction, tau should be close to zero.
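The procedure described here can be mimicked on simulated data with base R alone. The sketch below is an approximation under stated assumptions: the data are synthetic, the variable names are made up, and the trend test is expressed as Kendall's tau between gene rank and regression slope using cor.test, which may differ in detail from what dyebias.monotonicity does internally.

```r
# Toy data following the product model; illustrative only.
set.seed(3)
n.genes  <- 200
n.slides <- 10
iGSDB      <- sort(rnorm(n.genes, sd = 0.3))          # genes ordered by iGSDB
slide.bias <- seq(0.2, 1.5, length.out = n.slides)    # slides ordered by slide bias
M <- outer(iGSDB, slide.bias) +
     matrix(rnorm(n.genes * n.slides, sd = 0.1), n.genes, n.slides)

# per-gene regression of M on the slide bias rank (not the slide bias itself)
slide.rank <- seq_len(n.slides)
slopes <- apply(M, 1, function(m) coef(lm(m ~ slide.rank))[2])

# trend test: Kendall's tau between the iGSDB rank of a gene and its slope
trend <- cor.test(seq_len(n.genes), slopes, method = "kendall")
trend$estimate   # tau
trend$p.value
```

Fitting one regression per gene is what makes this approach slow on real arrays with tens of thousands of reporters.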
dyebias.monotonicity uses cor.test, which returns an htest object. To this list an extra element, order, is added, which indicates the ordering of the data set by slide bias. The degree of monotonicity is indicated by the estimate element; its significance by the p.value element.
This function takes very long to compute, since it calculates regressions for each gene.
Philip Lijnzaad [email protected]
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R and Holstege, F.C.P. (2009). Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, 5:266, 2009. doi:10.1038/msb.2009.21.
dyebias.trendplot
,
dyebias.monotonicityplot
## Not run:
options(stringsAsFactors = FALSE)
library(dyebias)
library(dyebiasexamples)
data(data.raw)
data(data.norm)
### obtain estimate for the iGSDBs:
iGSDBs.estimated <- dyebias.estimate.iGSDBs(data.norm, is.balanced=TRUE, verbose=FALSE)
### choose the estimators and which spots to correct:
estimator.subset <- dyebias.umcu.proper.estimators(maInfo(maGnames(data.norm)))
application.subset <- maW(data.norm) == 1 &
  dyebias.application.subset(data.raw=data.raw, use.background=TRUE)
### do the correction:
correction <- dyebias.apply.correction(data.norm=data.norm,
                                       iGSDBs = iGSDBs.estimated,
                                       estimator.subset=estimator.subset,
                                       application.subset = application.subset,
                                       verbose=FALSE)
cat("monotonicity before correction")
monotonicity <- dyebias.monotonicity(data=data.norm, iGSDBs=iGSDBs.estimated, order=NULL)
monotonicity
cat("monotonicity after correction")
dyebias.monotonicity(data=correction$data.corrected, iGSDBs=iGSDBs.estimated, order=monotonicity$order)
## End(Not run)
If you order genes by their iGSDB, and hybridizations by slide bias, the graphs of each gene should form a 'fan' out of the origin (see also dyebias.trendplot). This function plots the regression slope of each gene, ordered by iGSDB and slide bias. If the uncorrected total dye bias is indeed monotonic, an increasing trend should be visible.
This function has been deprecated, as it is of limited use and takes too long to compute.
dyebias.monotonicityplot(data, iGSDBs, dyebias.percentile = 5, order = NULL, output = NULL, pch = 19, cex = 0.3, cex.lab = 1.4, ylim = c(-0.2, 0.2), xlab = "rank", ylab = "slope", sub = NULL, ...)
data |
The |
iGSDBs |
A data frame with intrinsic gene-specific dye biases,
the same as that used in |
dyebias.percentile |
The percentile of intrinsic gene specific dye biases (iGSDBs) for which to highlight the reporters. Default should suffice in almost all cases. |
order |
If |
output |
Specifies the output. If |
pch , cex , cex.lab , ylim , xlab , ylab
|
As for |
sub |
The subtitle. If |
... |
Other arguments are passed on to |
The total dye bias appears to be the product of iGSDB and slide bias. In other words, it is monotonic (always increasing or always decreasing), both with respect to the intrinsic gene specific dye bias and with respect to the slide bias. This function orders genes by their iGSDB and the slides by slide bias. Subsequently, a linear regression is done for each gene, with x being the slide bias rank (not the slide bias itself), and y being the M. The slopes of the regression lines should form an increasing array of values, representing the 'fan' of lines. The array of slopes is plotted (versus the rank). Generally, a clear trend is visible for uncorrected hybridizations, and the trend has disappeared after dye bias correction.
The order of the slide bias is returned, for use in plotting the behaviour of the regression slopes in the corrected data set.
This function takes very long to compute, since it calculates regressions for each gene.
Philip Lijnzaad [email protected]
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R and Holstege, F.C.P. (2009). Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, 5:266, 2009. doi: 10.1038/msb.2009.21.
dyebias.monotonicity
,
dyebias.trendplot
## Not run:
options(stringsAsFactors = FALSE)
library(dyebias)
library(dyebiasexamples)
data(data.raw)
data(data.norm)
### obtain estimate for the iGSDBs:
iGSDBs.estimated <- dyebias.estimate.iGSDBs(data.norm, is.balanced=TRUE, verbose=FALSE)
### choose the estimators and which spots to correct:
estimator.subset <- dyebias.umcu.proper.estimators(maInfo(maGnames(data.norm)))
application.subset <- maW(data.norm) == 1 &
  dyebias.application.subset(data.raw=data.raw, use.background=TRUE)
### do the correction:
correction <- dyebias.apply.correction(data.norm=data.norm,
                                       iGSDBs = iGSDBs.estimated,
                                       estimator.subset=estimator.subset,
                                       application.subset = application.subset,
                                       verbose=FALSE)
layout(matrix(1:2, nrow=1, ncol=2))
order <- dyebias.monotonicityplot(data=data.norm,
                                  iGSDBs=iGSDBs.estimated, # from e.g. dyebias.estimate.iGSDBs
                                  order=NULL,              # i.e., order by increasing slide bias
                                  output=NULL,
                                  main="before correction")
order <- dyebias.monotonicityplot(data=correction$data.corrected,
                                  iGSDBs=iGSDBs.estimated,
                                  order=order,             # order by the original slide bias
                                  output=NULL,
                                  main="after correction")
## End(Not run)
Plots the log2(R) vs. log2(G) (or alternatively M vs. A) signal of one slide, highlighting the reporters with the strongest red and green dye bias. Two lines indicate two-fold change. See also Margaritis et al. (2009), Fig. 1.
dyebias.rgplot(data, slide, iGSDBs, dyebias.percentile=5, application.subset=TRUE, output=NULL, xlim = c(log2(50),log2(50000)), ylim = c(log2(50),log2(50000)), xticks = c(100,1000,10000,10000), yticks = c(100,1000,10000,10000), pch = 19, cex = 0.3, cex.lab = 1.4, ...)
dyebias.maplot(data, slide, iGSDBs, dyebias.percentile=5, application.subset=TRUE, output=NULL, xlim = c(6,16), ylim = c(-2,2), pch = 19, cex = 0.3, cex.lab = 1.4, ...)
data |
The |
slide |
The index of the slide to plot; must be > 1, and < |
iGSDBs |
A data frame with intrinsic gene-specific dye biases,
the same as that used in |
dyebias.percentile |
The percentile of intrinsic gene specific dye biases (iGSDBs) for which to highlight the reporters. |
application.subset |
The set of reporters that was eligible for dye bias correction; same
argument as for |
output |
Specifies the output. If |
xlim , ylim , xticks , yticks , pch , cex , cex.lab
|
Graphical parameters; see |
... |
Other arguments (such as |
None.
The highlighted spots are all spots with an iGSDB that lies in the top or bottom dyebias.percentile of iGSDBs. That is, not just the estimator genes are highlighted.
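Selecting the reporters to highlight amounts to a quantile cut on the dyebias column of the iGSDBs data frame. The base-R sketch below uses a made-up data frame with the reporterId and dyebias columns described for dyebias.estimate.iGSDBs. The cutoff rule shown is an assumption about how such a selection could be done, not the plotting code of the package.

```r
# Toy iGSDB table; reporter names and values are invented.
set.seed(5)
iGSDBs <- data.frame(reporterId = sprintf("rep%04d", 1:1000),
                     dyebias    = rnorm(1000, sd = 0.3))
dyebias.percentile <- 5

# cutoffs at the bottom and top percentile of the iGSDB distribution
cutoffs <- quantile(iGSDBs$dyebias,
                    probs = c(dyebias.percentile, 100 - dyebias.percentile) / 100)

# M = log2(Cy5/Cy3), so negative bias is green and positive bias is red
green <- iGSDBs$reporterId[iGSDBs$dyebias <= cutoffs[1]]
red   <- iGSDBs$reporterId[iGSDBs$dyebias >= cutoffs[2]]
c(n.green = length(green), n.red = length(red))
```

Note that no estimator.subset or intensity filter enters this selection, which is why more reporters may be highlighted than were actually used as slide bias estimators.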
Philip Lijnzaad [email protected]
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R and Holstege, F.C.P. (2009). Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, 5:266, 2009. doi: 10.1038/msb.2009.21.
dyebias.estimate.iGSDBs
,
dyebias.apply.correction
,
dyebias.rgplot
,
dyebias.maplot
,
dyebias.boxplot
,
dyebias.trendplot
## show both an RG-plot and an MA-plot of the uncorrected data and the
## corrected data next to each other.
slide <- 3 # or any other, of course
layout(matrix(1:4, nrow=2, ncol=2, byrow=TRUE))
dyebias.rgplot(data=data.norm,
               slide=slide,
               iGSDBs=iGSDBs.estimated, # from dyebias.estimate.iGSDBs
               main=sprintf("RG-plot, uncorrected, slide %d", slide),
               output=NULL)
dyebias.rgplot(data=correction$data.corrected,
               slide=slide,
               iGSDBs=iGSDBs.estimated,
               main=sprintf("RG-plot, corrected, slide %d", slide),
               output=NULL)
dyebias.maplot(data=data.norm,
               slide=slide,
               iGSDBs=iGSDBs.estimated,
               main=sprintf("MA-plot, uncorrected, slide %d", slide),
               output=NULL)
dyebias.maplot(data=correction$data.corrected,
               slide=slide,
               iGSDBs=iGSDBs.estimated,
               main=sprintf("MA-plot, corrected, slide %d", slide),
               output=NULL)
The aim of this routine is to show the monotonicity of the total dye bias in the (uncorrected) data set. This is to judge whether the total dye bias of one reporter in one hybridization indeed behaves as the product of an intrinsic gene specific dye bias (iGSDB) and a slide specific factor (the slide bias), which is at the heart of the GASSCO method.
Showing the total dye bias of all reporters is too overwhelming,
therefore the medians of the total dye bias after binning by intrinsic
gene specific dye bias (as given in dyebias$dyebias
) are
plotted.
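The binning step can be sketched in base R on simulated data. The variable names are hypothetical, and the use of equally populated bins (quantile breaks) is an assumption made for this illustration. The package may bin the reporters differently.

```r
# Toy data following the product model; illustrative only.
set.seed(9)
n.genes  <- 2000
n.slides <- 6
iGSDB      <- rnorm(n.genes, sd = 0.3)
slide.bias <- seq(0.2, 1.5, length.out = n.slides)   # slides ordered by slide bias
M <- outer(iGSDB, slide.bias) +
     matrix(rnorm(n.genes * n.slides, sd = 0.2), n.genes, n.slides)

# classify reporters into n.bins equally populated bins by iGSDB
n.bins <- 20
breaks <- quantile(iGSDB, probs = seq(0, 1, length.out = n.bins + 1))
bin    <- cut(iGSDB, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# median M per bin (rows) and per slide (columns)
bin.medians <- apply(M, 2, function(m) tapply(m, bin, median))
dim(bin.medians)
round(bin.medians[c(1, n.bins), ], 2)   # most green and most red bin
```

Drawing one line per row of bin.medians against the slide bias rank (for instance with matplot on the transposed matrix) would give the fan shape that the trend plot is meant to reveal.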
dyebias.trendplot(data, iGSDBs, dyebias.percentile=5, application.subset=TRUE, n.bins=20, order, output=NULL, ylim=c(-1,1), cex=0.3, lty=1, lwd=1, type="median", main="dye bias trend plot", xlab="slide bias rank", ylab="M", sub=NULL, ...)
data |
The |
iGSDBs |
A data frame with intrinsic gene-specific dye biases,
the same as that used in |
dyebias.percentile |
The percentile of intrinsic gene specific dye biases (iGSDBs) for which to highlight the reporters. Default should suffice in almost all cases. |
application.subset |
The set of reporters that was eligible for dye bias correction; same
argument as for |
n.bins |
The number of bins into which to classify the reporters, based on their intrinsic gene-specific dye bias. The median of each bin is plotted. |
type |
What to print for each bin and hybridization. Valid values are:
|
order |
If |
output |
Specifies the output. If |
ylim , lty , lwd , main , sub , cex , xlab , ylab
|
As for |
... |
Other arguments are passed on to |
The order obtained, for use in a later call to this same function.
Philip Lijnzaad [email protected]
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R and Holstege, F.C.P. (2009). Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, 5:266, 2009. doi: 10.1038/msb.2009.21.
dyebias.estimate.iGSDBs
,
dyebias.apply.correction
,
dyebias.rgplot
,
dyebias.maplot
,
dyebias.monotonicity
,
dyebias.monotonicityplot
## show trend plots of uncorrected and corrected next to each other:
ylim <- c(-0.6, 0.6)
layout(matrix(1:2, nrow=1, ncol=2))
order <- dyebias.trendplot(data=data.norm,
                           iGSDBs=iGSDBs.estimated, # from e.g. dyebias.estimate.iGSDBs
                           order=NULL,              # i.e., order by increasing slide bias
                           output=NULL,
                           main="before correction",
                           ylim=ylim)
order <- dyebias.trendplot(data=correction$data.corrected, # from dyebias.apply.correction
                           iGSDBs=iGSDBs.estimated,
                           order=order,             # order by the original slide bias
                           output=NULL,
                           main="after correction",
                           ylim=ylim)