Title: | Methods for Spike-in Arrays |
---|---|
Description: | The package contains functions that can be used to compare expression measures on different array platforms. |
Authors: | Matthew N McCall <[email protected]>, Rafael A Irizarry <[email protected]> |
Maintainer: | Matthew N McCall <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.63.0 |
Built: | 2024-10-31 05:25:59 UTC |
Source: | https://github.com/bioc/spkTools |
This is a SpikeInExpressionSet object containing the data from the Affymetrix HGU133A Spike-in Experiment.
data(affy)
It contains a matrix of expression values and a matrix of nominal concentrations.
For more information see Irizarry, R.A., et al. NAR (2003) http://www.biostat.jhsph.edu/~ririzarr/papers/index.html
Plots boxplots of the data resulting from a call to spkBox.
plotSpkBox(boxs, fc=2, box.names=NULL, ...)
boxs | the output of a call to spkBox |
fc | expected fold change |
box.names | names to be printed below each boxplot |
... | parameters passed to boxplot |
Boxplots for spike-in and non-spike-in comparisons stratified by ALE strata are produced.
Matthew N. McCall
data(affy)
affySlope <- spkSlope(affy)
affyBox <- spkBox(affy, affySlope)
plotSpkBox(affyBox)
This is a class representation for spike-in expression data. The SpikeInExpressionSet class is derived from ExpressionSet, and requires a matrix named exprs and a matrix named spikeIn.
Extends class ExpressionSet.
createSpikeInExpressionSet(exprs, spikeIn, ...)
new("SpikeInExpressionSet",
phenoData = new("AnnotatedDataFrame"),
featureData = new("AnnotatedDataFrame"),
experimentData = new("MIAME"),
annotation = character(0),
exprs = new("matrix"),
spikeIn = new("matrix"))
This creates a SpikeInExpressionSet with assayData implicitly created to contain exprs and spikeIn. Additional named matrix arguments with the same dimensions as exprs are added to assayData; the row and column names of these additional matrices should match those of exprs and spikeIn.
new("SpikeInExpressionSet",
assayData = assayDataNew(exprs=new("matrix"),spikeIn=new("matrix")),
phenoData = new("AnnotatedDataFrame"),
featureData = new("AnnotatedDataFrame"),
experimentData = new("MIAME"),
annotation = character(0))
This creates a SpikeInExpressionSet with assayData provided explicitly. In this form, the only required named argument is assayData.
Inherited from ExpressionSet:
assayData: Contains matrices with equal dimensions, and with column number equal to nrow(phenoData). assayData must contain a matrix exprs and a matrix spikeIn with rows representing features and columns representing samples.
phenoData: See eSet
annotation: See eSet
featureData: See eSet
experimentData: See eSet
Class-specific methods:
spikeIn(SpikeInExpressionSet), spikeIn(SpikeInExpressionSet)<-: Access and set elements named spikeIn in the AssayData-class slot.
spkSplit(SpikeInExpressionSet): creates two SpikeInExpressionSet objects, one with the spike-in probes and one with the non-spike-in probes.
For derived methods, see ExpressionSet.
eSet-class, ExpressionSet-class.
# create an instance of SpikeInExpressionSet
new("SpikeInExpressionSet")
new("SpikeInExpressionSet",
    exprs=matrix(runif(1000), nrow=100),
    spikeIn=matrix(rep(1:10,100), nrow=100))
# class specific methods
data(affy)
affySpikes <- spikeIn(affy)
affySplit <- spkSplit(affy)
Estimates the standard deviation for spike-ins at the lowest possible fold change in each bin.
spkAccSD(object, spkSlopeOut, tol=3)
object | a SpikeInExpressionSet object |
spkSlopeOut | the output from the spkSlope function |
tol | number of digits after decimal point |
returns the median absolute deviation (MAD) for each bin.
Matthew N. McCall
data(affy)
affySlope <- spkSlope(affy)
# store the result under its own name so the function spkAccSD is not masked
affyAccSD <- spkAccSD(affy, affySlope)
A wrapper for the functions contained in the spkTools package, which calls each function.
spkAll(object, label, model=expr~spike+probe+array, fc=NULL, tol=3, xrngs=NULL, yrngs=NULL, cuts=c(.6,.99), potQuantile=.995, gnn=c(25,100,10000), pch=".", output="eps")
object | a SpikeInExpressionSet object |
label | a character string to insert into the graphs and tables produced |
model | model to be passed to spkAnova |
fc | the fold change for which fold change plots will be produced |
tol | the number of digits after the decimal point in fc |
xrngs | ranges for the x-axis of each plot. d=density, s=slope, v=box, m=M vs A |
yrngs | ranges for the y-axis of each plot. d=density, s=slope, v=box, m=M vs A |
cuts | quantiles used to make the low, medium, and high bins |
potQuantile | the desired quantile to compute the probability of being above |
gnn | a vector of 3 numbers passed to spkGNN: the desired number of true positives, the number of truly expressed genes, and the number of truly unexpressed genes |
pch | plotting point to be used in spkSlope |
output | the format in which to save the plots produced. Options are "pdf" and "eps" |
The full complement of plots and tables described in the vignette is created and saved in the current working directory.
Matthew N. McCall
data(affy)
spkAll(affy, label="affy", fc=2)
Computes the mean squared errors of a microarray spike-in design due to concentration, probe, array, and error.
spkAnova(object, model=expr~spike+probe+array)
object | a SpikeInExpressionSet object |
model | the anova model |
A vector of the mean squared errors from the anova model.
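As a rough illustration of what such a vector of mean squared errors looks like, the sketch below fits the default model formula to simulated data using only base R. The data frame, its values, and the variable names design, fit, tab and mse are hypothetical, and this is not necessarily how spkAnova builds its model internally.

```r
# Simulated long-format spike-in data; all values are made up for illustration.
set.seed(1)
design <- expand.grid(
  spike = factor(c(0.5, 1, 2, 4)),   # nominal concentrations
  probe = factor(paste0("p", 1:5)),  # spike-in probes
  array = factor(paste0("a", 1:3))   # arrays
)
# Expression rises with log2 concentration, plus random noise.
design$expr <- 2 * log2(as.numeric(as.character(design$spike))) +
  rnorm(nrow(design), sd = 0.3)

# Fit the same formula that spkAnova uses as its default model.
fit <- lm(expr ~ spike + probe + array, data = design)
tab <- anova(fit)

# One mean square per model term, plus the residual (error) mean square.
mse <- setNames(tab[["Mean Sq"]], rownames(tab))
mse
```

In this simulation the mean square for spike dominates the others because concentration is the only systematic effect that was simulated.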
Matthew N. McCall
data(affy)
spkAnova(affy)
Computes the imbalance of a microarray spike-in design due to probes and arrays.
spkBal(object)
object | a SpikeInExpressionSet object |
The probe and array imbalances.
Matthew N. McCall
Wu, Chien-Fu, Iterative Construction of Nearly Balanced Assignments I: Categorical Covariates. Technometrics, Vol. 23, No. 1. (Feb, 1981), pp. 37-44.
data(affy)
spkBal(affy)
Calculates the log-ratios for a given fold change, stratified by which ALE strata (bins) are being compared to produce that fold change.
spkBox(object, spkSlopeOut, fc = 2, tol = 3, reduce=TRUE)
object | a SpikeInExpressionSet object |
spkSlopeOut | the output of the spkSlope function |
fc | the fold change of interest |
tol | the precision (number of digits after decimal point) in fc |
reduce | if TRUE the number of points plotted in the null bins is reduced |
This function requires the output of spkSlope.
A list with the log-ratios separated by ALE strata comparison.
Matthew N. McCall
data(affy)
affySlope <- spkSlope(affy)
spkBox(affy,affySlope)
A density plot of the non-spike-in expression with a rug of the average expression at each spike-in level.
spkDensity(object, spkSlopeOut, cuts=TRUE, label = NULL, ...)
object | a SpikeInExpressionSet object |
spkSlopeOut | the output from the spkSlope function |
cuts | if TRUE vertical lines are drawn at the expression values separating low vs medium and medium vs high ALE strata |
label | a character string to insert into the plot title |
... | arguments passed to the plot function |
This function requires the output of spkSlope.
Density plot is produced.
Matthew N. McCall
data(affy)
affySlope <- spkSlope(affy)
spkDensity(affy,affySlope)
Computes the number of genes one would need to consider to obtain a given number of truly positive genes if one considered genes in order of decreasing observed fold change.
spkGNN(n, n.expr, n.unexpr, AccuracySlope, AccuracySD, nullfc)
n | the desired number of true positives |
n.expr | the actual number of truly expressed genes |
n.unexpr | the actual number of truly unexpressed genes |
AccuracySlope | the signal detect slope from the spkSlope function |
AccuracySD | the standard deviation of the signal detect slope from the spkAccSD function |
nullfc | a vector of null fold changes from the spkBox function |
This function returns the expected number of genes one would have to consider to obtain N true positives under the given conditions.
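The quantity described here can be approximated by simulation. The base R sketch below is a Monte Carlo illustration under stated assumptions, not the calculation spkGNN performs. It assumes normally distributed log-ratios, a nominal fold change of 2, and hypothetical values for slope, sd_acc and null_sd.

```r
set.seed(6)
n        <- 25      # desired number of true positives
n_expr   <- 100     # truly expressed (changed) genes
n_unexpr <- 10000   # truly unexpressed (null) genes
slope    <- 0.8     # hypothetical signal detect slope
sd_acc   <- 0.2     # hypothetical SD of observed log-ratios for changed genes
null_sd  <- 0.15    # hypothetical SD of null log-ratios

genes_needed <- replicate(200, {
  # Observed log2 fold changes: changed genes centred at slope * log2(2),
  # null genes centred at zero.
  obs <- c(rnorm(n_expr, mean = slope * log2(2), sd = sd_acc),
           rnorm(n_unexpr, mean = 0, sd = null_sd))
  truth <- rep(c(TRUE, FALSE), c(n_expr, n_unexpr))
  # Walk down the list in order of decreasing observed fold change and
  # record the rank at which the n-th true positive is reached.
  ranked <- truth[order(obs, decreasing = TRUE)]
  which(cumsum(ranked) == n)[1]
})

# Average list length needed to collect n true positives.
mean(genes_needed)
```

A smaller slope or a larger null spread lengthens the list, since more null genes outrank the truly changed ones.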
Matthew N. McCall
data(affy)
spkSlopeOut <- spkSlope(affy)
spkBoxOut <- spkBox(affy, spkSlopeOut, fc=2)
# use the full element name "slopes" rather than relying on partial matching
AccuracySlope <- round(spkSlopeOut$slopes[-1], digits=2)
AccuracySD <- round(spkAccSD(affy, spkSlopeOut), digits=2)
spkGNN(n=25, n.expr=100, n.unexpr=10000, AccuracySlope[2], AccuracySD[2], spkBoxOut[[2]])
Plots log-ratios (M) vs. average log expression (A) for a SpikeInExpressionSet object.
spkMA(object, spkSlopeOut, fc=2, tol=3, label=NULL, ylim=NULL, outlier=1, reduce=TRUE, plot.legend=TRUE)
object | a SpikeInExpressionSet object |
spkSlopeOut | the output from the spkSlope function |
fc | the fold change of interest |
tol | the precision (number of digits after decimal point) in fc |
label | a character string to insert into the plot title |
ylim | limits of y-axis |
outlier | log fold change cut-off for outliers |
reduce | if TRUE some points are removed from the background to speed plotting |
plot.legend | if TRUE a legend is plotted |
The MA plot is produced.
Matthew N. McCall
data(affy)
affySlope <- spkSlope(affy)
spkMA(affy, affySlope)
Compute log-ratios among spike-in genes.
spkPair(object)
object | a SpikeInExpressionSet object |
An array containing log-ratios (M), average log expression (A), and nominal concentrations (N1 and N2). Dimension one is genes, dimension two is array pairings, and dimension three is M, A, N1, and N2.
Matthew N. McCall
data(affy)
affyPair <- spkPair(affy)
Compute log-ratios among non-spike-in genes.
spkPairNS(object, output="M")
object | a SpikeInExpressionSet object |
output | what to return; either "M" for log-ratios or "A" for average log expression. |
A matrix containing either log-ratios (M) or average log expression (A). Rows are genes and columns are array pairings.
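To make the shape of this result concrete, the sketch below computes M and A for every pair of arrays from a small simulated log2 expression matrix using base R. The matrix expr, the direction of the subtraction, and the column naming are assumptions made for illustration. spkPairNS may order or label its pairings differently.

```r
set.seed(2)
# Hypothetical log2 expression matrix: 4 genes (rows) by 3 arrays (columns).
expr <- matrix(rnorm(12, mean = 8), nrow = 4,
               dimnames = list(paste0("g", 1:4), paste0("a", 1:3)))

# Every unordered pair of arrays, one pair per column.
pairs <- combn(ncol(expr), 2)

# M: difference of log expression; A: average log expression.
M <- apply(pairs, 2, function(p) expr[, p[2]] - expr[, p[1]])
A <- apply(pairs, 2, function(p) (expr[, p[1]] + expr[, p[2]]) / 2)

# Label each column with the pair of arrays it compares.
pair_names <- apply(pairs, 2, function(p) paste(colnames(expr)[p], collapse = "-"))
colnames(M) <- colnames(A) <- pair_names

M
```

With 3 arrays there are 3 pairings, so both M and A have one row per gene and 3 columns.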
Matthew N. McCall
data(affy)
affyPairNS <- spkPairNS(affy)
Compute the probability that a spike-in with a nominal fold change of 2 appears in the top 0.5% (default) of log-ratios.
spkPot(object, spkSlopeOut, sig, SD, precisionQuantile)
object | a SpikeInExpressionSet object |
spkSlopeOut | the output from the spkSlope function |
sig | the signal detect slopes from a call to spkSlope |
SD | the standard deviation from spkAccSD |
precisionQuantile | the desired quantile to compute the probability of being above |
A vector of probabilities, one for each ALE stratum.
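One way to picture such a probability is with a normal approximation. The sketch below assumes the observed log-ratio of a spike-in with nominal fold change 2 is normally distributed with mean slope * log2(2) and the given standard deviation. The null log-ratios and the values in null_m, slope and sd_acc are hypothetical, and the package's own calculation may differ.

```r
set.seed(3)
# Hypothetical null log-ratios from non-spike-in genes.
null_m <- rnorm(10000, sd = 0.25)

# Hypothetical signal detect slopes and SDs for three ALE strata.
slope  <- c(low = 0.4, med = 0.9, high = 0.7)
sd_acc <- c(low = 0.3, med = 0.2, high = 0.2)

# Threshold: the 99.5th percentile of the null log-ratios.
q <- unname(quantile(null_m, 0.995))

# Probability that the observed log-ratio exceeds the threshold,
# under the normal approximation described above.
pot <- setNames(1 - pnorm(q, mean = slope * log2(2), sd = sd_acc),
                names(slope))
pot
```

Strata with a steeper slope and a smaller standard deviation give a higher probability, because the expected observed fold change sits further above the null threshold.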
Matthew N. McCall
data(affy)
affySlope <- spkSlope(affy)
affyAccSD <- spkAccSD(affy, affySlope)
spkPot(affy, affySlope, affySlope$slopes, affyAccSD, .995)
An internal function called by spkSlope.
spkQuantile(amt, avgE, ens, p)
amt | a vector of nominal concentrations |
avgE | the observed average expression corresponding to each nominal concentration |
ens | the average expression across arrays of unexpressed genes |
p | the quantiles to make the bins |
Matthew N. McCall
data(affy)
affySlope <- spkSlope(affy)
Plots observed expression vs. nominal concentration. The overall regression slope and the regression slopes for the low, medium, and high bins are computed, and the regression lines are plotted.
spkSlope(object, label = NULL, cuts=c(.6,.99), ...)
object | a SpikeInExpressionSet object |
label | a character string to insert into the plot title |
cuts | quantiles used to make the low, medium, and high bins |
... | arguments passed to the plot function |
The bins are created by computing the proportion of non-spike-in genes with expression values less than or equal to the average expression value at each nominal concentration. Using the default value of cuts, the high bin contains nominal concentrations with 99 percent or more of the non-spike-in expression values lower than it. The medium bin contains nominal concentrations with between 60 and 99 percent of the non-spike-in expression values lower than it. The low bin contains nominal concentrations with less than 60 percent of the non-spike-in expression values lower than it.
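The binning rule above can be sketched in base R as follows. The simulated non-spike-in expression values and the average expression at each nominal concentration are made up, and the names ns_expr, avg_exp, prop and bin are hypothetical. This illustrates the rule as described rather than the internals of spkSlope.

```r
set.seed(4)
# Hypothetical non-spike-in expression values (log2 scale).
ns_expr <- rnorm(5000, mean = 7, sd = 1.5)

# Hypothetical average expression at each nominal concentration.
avg_exp <- c("0.125" = 5.5, "0.5" = 6.8, "2" = 8.0, "8" = 9.5, "32" = 12.5)

# Proportion of non-spike-in values at or below each average expression.
prop <- sapply(avg_exp, function(e) mean(ns_expr <= e))

# Default cuts: low is below 0.6, medium is from 0.6 up to (not including) 0.99,
# high is 0.99 or more.
cuts <- c(0.6, 0.99)
bin <- cut(prop, breaks = c(-Inf, cuts, Inf),
           labels = c("low", "medium", "high"), right = FALSE)

# Nominal concentrations grouped by bin.
split(names(avg_exp), bin)
```

Setting right = FALSE makes each interval closed on the left, so a proportion of exactly 0.99 lands in the high bin, matching "99 percent or more".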
avgExp | average expression at each nominal concentration |
slopes | the regression slopes, overall and for each bin |
breaks | which spike-in levels fall in each bin |
brkpts | the expression value of the cut points between bins |
prop | the proportion of non-spike-in probes with expression less than the average expression at each nominal concentration |
Matthew N. McCall
data(affy)
spkSlope(affy)
A collection of functions to examine microarray datasets that include spike-ins. In particular, it allows one to explore the distribution of spike-ins within the range of possible expression values, the relationship between nominal concentration and expression, and the relationship between expected and observed fold change for different levels of comparison.
Package: | spkTools |
Type: | Package |
Version: | 0.0.1 |
Date: | 2007-10-9 |
License: | GPL version 2 or newer |
Matthew N. McCall
Maintainer: Matthew N. McCall <[email protected]>
## The Three Plots
data(affy)
par(mfrow=c(2,2))
affySlope <- spkSlope(affy)
spkDensity(affy, affySlope)
spkBox(affy, affySlope)
## The Full Wrapper
data(affy)
spkAll(affy, label="Affymetrix", fc=2)
Compute an estimate of the standard deviation in expression at each nominal concentration.
spkVar(object)
object | a SpikeInExpressionSet object |
a matrix containing spike-in levels and corresponding MADs.
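A result of this shape can be reproduced on toy data with base R. The sketch below computes the MAD of simulated expression values at each nominal concentration. The vectors conc and expr are hypothetical, and the exact layout of the matrix returned by spkVar may differ.

```r
set.seed(5)
# Hypothetical nominal concentrations, six observations per level.
conc <- rep(c(0.5, 1, 2, 4), each = 6)
# Simulated log2 expression: linear in log2 concentration plus noise.
expr <- 6 + log2(conc) + rnorm(length(conc), sd = 0.2)

# Median absolute deviation of expression at each spike-in level.
# stats::mad() scales by 1.4826 so it estimates the SD under normality.
mads <- tapply(expr, conc, mad)

# Two-column matrix: spike-in level and its MAD.
res <- cbind(level = as.numeric(names(mads)), mad = as.vector(mads))
res
```

The MAD is used here as a robust stand-in for the standard deviation, so a single outlying probe has little effect on the estimate.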
Matthew N. McCall
data(affy)
spkVar(affy)
Prints a summary table of the data resulting from a call to spkBox.
summarySpkBox(boxs)
boxs | the output of a call to spkBox |
A data frame with 2 columns: the mean fold change and the median absolute deviation (MAD) of the fold changes.
Matthew N. McCall
data(affy)
affySlope <- spkSlope(affy)
affyBox <- spkBox(affy, affySlope)
summarySpkBox(affyBox)