Title: | Detect differential expression in microarray and proteomics datasets with the Power Law Global Error Model (PLGEM) |
---|---|
Description: | The Power Law Global Error Model (PLGEM) has been shown to faithfully model the variance-versus-mean dependence that exists in a variety of genome-wide datasets, including microarray and proteomics data. The use of PLGEM has been shown to improve the detection of differentially expressed genes or proteins in these datasets. |
Authors: | Mattia Pelizzola <[email protected]> and Norman Pavelka <[email protected]> |
Maintainer: | Norman Pavelka <[email protected]> |
License: | GPL-2 |
Version: | 1.79.0 |
Built: | 2024-12-30 03:09:01 UTC |
Source: | https://github.com/bioc/plgem |
This ExpressionSet
object contains a subset of the microarray data used
in References for PLGEM set-up and validation. Briefly, it contains
normalized gene expression values from a total of 6 hybridizations on
MG-U74Av2 Affymetrix GeneChip microarrays. Two experimental conditions are
represented in this dataset: the baseline condition (‘C’) contains data
of immature murine dendritic cells (4 replicates); the other condition
(‘LPS’) contains data of the same cells stimulated for 4 hours with LPS
(2 replicates).
data(LPSeset)
data(LPSeset)
An object of class ExpressionSet
.
Mattia Pelizzola [email protected]
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17;5:203.; http://www.biomedcentral.com/1471-2105/5/203
data(LPSeset) library(Biobase) head(exprs(LPSeset))
data(LPSeset) library(Biobase) head(exprs(LPSeset))
This function selects differentially expressed genes/proteins (DEG) at a given
significance level, based on observed PLGEM signal-to-noise ratio (STN)
values (typically obtained via a call to plgem.obsStn
) and
pre-computed p-values (typically obtained via a call to
plgem.pValue
).
plgem.deg(observedStn, plgemPval, delta=0.001, verbose=FALSE)
plgem.deg(observedStn, plgemPval, delta=0.001, verbose=FALSE)
observedStn |
|
plgemPval |
|
delta |
numeric vector; the significance level(s) to be used for the selection of DEG; value(s) must be between 0 and 1 (excluded). |
verbose |
|
This function allows for the selection of DEG by setting a significance
cut-off on pre-calculated p-values. The significance level delta
roughly represents the false positive rate of the DEG selection, e.g. if a
delta
of 0.001 is chosen in a microarray dataset with 10,000 genes
(none of which is truly differentially expressed), on average 10
genes/proteins are expected to be selected by chance alone.
A list
of four elements:
fit |
the input |
PLGEM.STN |
the input |
p-value |
the input |
significant |
a |
Mattia Pelizzola [email protected]
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
plgem.fit
, plgem.obsStn
,
plgem.resampledStn
, plgem.pValue
,
run.plgem
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset, fittingEval=FALSE) LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit) set.seed(123) LPSresampledStn <- plgem.resampledStn(data=LPSeset, plgemFit=LPSfit) LPSpValues <- plgem.pValue(LPSobsStn, LPSresampledStn) LPSdegList <- plgem.deg(observedStn=LPSobsStn, plgemPval=LPSpValues, delta=0.001)
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset, fittingEval=FALSE) LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit) set.seed(123) LPSresampledStn <- plgem.resampledStn(data=LPSeset, plgemFit=LPSfit) LPSpValues <- plgem.pValue(LPSobsStn, LPSresampledStn) LPSdegList <- plgem.deg(observedStn=LPSobsStn, plgemPval=LPSpValues, delta=0.001)
Function for fitting and evaluating goodness of fit of a PLGEM on a set
of replicated samples defined in an ExpressionSet
.
plgem.fit(data, covariate=1, fitCondition=1, p=10, q=0.5, trimAllZeroRows=FALSE, zeroMeanOrSD=c("replace", "trim"), fittingEval=FALSE, plot.file=FALSE, prefix=NULL, gPar=setGpar(), verbose=FALSE)
plgem.fit(data, covariate=1, fitCondition=1, p=10, q=0.5, trimAllZeroRows=FALSE, zeroMeanOrSD=c("replace", "trim"), fittingEval=FALSE, plot.file=FALSE, prefix=NULL, gPar=setGpar(), verbose=FALSE)
data |
an object of class |
covariate |
|
fitCondition |
|
p |
|
q |
|
trimAllZeroRows |
|
zeroMeanOrSD |
either |
fittingEval |
|
plot.file |
|
prefix |
optional |
gPar |
optional |
verbose |
|
plgem.fit
fits a Power Law Global Error Model (PLGEM) to an
ExpressionSet
and optionally evaluates the quality of the fit. This
PLGEM aims to find the mathematical relationship between standard
deviation and mean gene expression values (or protein abundance levels) in a
set of replicated microarray (or proteomics) samples, according to the
following power law:
It has been demonstrated that this model fits to Affymetrix GeneChip datasets, as well as to datasets of normalized spectral counts obtained by mass spectrometry-based proteomics. Technically, two replicates are required and sufficient to fit a PLGEM. Having 3 or more replicates, of course, improves the fitting and is recommended (see References for details).
The phenoData
slot of the ExpressionSet
given as input is
expected to contain the necessary information to distinguish the various
experimental conditions from one another. The columns of the pData
are
referred to as ‘covariates’. There has to be at least one covariate
defined in the input ExpressionSet
. The sample attributes according to
this covariate must be distinct for samples that are to be treated as distinct
experimental conditions and identical for samples that are to be treated as
replicates.
There is a couple different ways to specify the covariate
: If an
integer
or a numeric
is given, it will be taken as the covariate
number (in the same order in which the covariates appear in the
colnames
of the pData
). If a character
is given, it will
be taken as the covariate name itself (in the same way the covariates are
specified in the colnames
of the pData
). By default, the first
covariate appearing in the colnames
of the pData
is used.
Similarly, there is a couple different ways to specify on which experimental
condition to fit the model. The available ‘condition names’ are taken
from unique(as.character(pData(data)[, covariate]))
. If
fitCondition
is given as a character
, it will be taken as the
condition name itself. If fitCondition
is given as an integer
or a numeric
value, it will be taken as the condition number (in the
same order of appearance as in the ‘condition names’). By default, the
first condition name is used.
Setting trimAllZeroRows=TRUE
is especially useful in proteomics data
sets, where there is no guarantee of identifying a protein across all
experimental conditions. Since PLGEM is fitted only to the data
corresponding to a single experimental condition (as defined by
fitCondition
), it is possible to generate a non-negligible number of
rows containing only zero values, even if there were no such rows in the
original (complete) data set containing all experimental conditions.
Setting zeroMeanOrSD="replace"
(the current default, for backward
compatibility) will cause the function to replace zero or negative means with
the smallest positive mean found in the data set and to replace zero standard
deviations with the smallest non-zero standard deviation found in the data
set. Setting zeroMeanOrSD="trim"
is the current recommended option,
especially for spectral counting proteomics data sets that are typically
characterized by a high data granularity or for microarray data sets with a
small number of replicates. In both cases, there are chances for data values
for a same gene or protein to be identical across replicates (and therefore
with zero standard deviation) by chance alone. Note that setting
trimAllZeroRows=TRUE
does not guarantee that there will be no rows with
zero mean or zero standard deviation.
If argument fittingEval
is set to TRUE
, a graphical control of
the goodness of the PLGEM fitting is produced and a plot containing
four panels is generated. The top-left panel shows the power law,
characterized by a ‘SLOPE’ and an ‘INTERCEPT’. The top-right
panel represents the distribution of model residuals. The bottom-left reports
the contour plot of ranked residuals. The bottom-right panel finally shows the
relationship between the distribution of observed residuals and the normal
distribution. A good fit normally gives a horizontal symmetric rank-plot and a
near normal distribution of residuals.
Warnings are issued if the fitted PLGEM slope is above 1 or under 0.5, if the
adjusted is below 0.95 or if the Pearson correlation
coefficient is below 0.85. These are the ranges of values inside which most
GeneChip MAS5 dataset and NSAF proteomics dataset have been empirically
observed to lie (see References).
A list
of six elements (see Details):
SLOPE |
the slope of the fitted PLGEM. |
INTERCEPT |
the intercept of the fitted PLGEM. |
DATA.PEARSON |
the Pearson correlation coefficient between the
|
ADJ.R2.MP |
the adjusted |
COVARIATE |
a |
FIT.CONDITION |
a |
Mattia Pelizzola [email protected]
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
setGpar
, plgem.obsStn
,
plgem.resampledStn
, plgem.pValue
,
plgem.deg
, run.plgem
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset, fittingEval=TRUE) as.data.frame(LPSfit)
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset, fittingEval=TRUE) as.data.frame(LPSfit)
This function computes observed signal-to-noise ratio (STN) values using
PLGEM fitting parameters (obtained via a call to function
plgem.fit
) to detect differential expression in an
ExpressionSet
, containing either microarray or proteomics data.
plgem.obsStn(data, plgemFit, covariate=1, baselineCondition=1, verbose=FALSE)
plgem.obsStn(data, plgemFit, covariate=1, baselineCondition=1, verbose=FALSE)
data |
an object of class |
plgemFit |
|
covariate |
|
baselineCondition |
|
verbose |
|
The phenoData
slot of the ExpressionSet
given as input is
expected to contain the necessary information to distinguish the various
experimental conditions from one another. The columns of the pData
are
referred to as ‘covariates’. There has to be at least one covariate
defined in the input ExpressionSet
. The sample attributes according to
this covariate must be distinct for samples that are to be treated as distinct
experimental conditions and identical for samples that are to be treated as
replicates.
There is a couple different ways how to specify the covariate
: If an
integer
or a numeric
is given, it will be taken as the covariate
number (in the same order in which the covariates appear in the
colnames
of the pData
). If a character
is given, it will
be taken as the covariate name itself (in the same way the covariates are
specified in the colnames
of the pData
). By default, the first
covariate appearing in the colnames
of the pData
is used.
Similarly, there is a couple different ways how to specify which experimental
condition to treat as the baseline. The available ‘condition names’ are
taken from unique(as.character(pData(data)[, covariate]))
. If
baselineCondition
is given as a character
, it will be taken as
the condition name itself. If baselineCondition
is given as an
integer
or a numeric
value, it will be taken as the condition
number (in the same order of appearance as in the ‘condition names’).
By default, the first condition name is used.
PLGEM-STN values are a measure of the degree of differential expression between a condition and the baseline:
where:
plgem.obsStn
determines the observed PLGEM-STN values for each gene
or protein in data
; see References for details.
A list
of two elements:
fit |
the input |
PLGEM.STN |
a |
Mattia Pelizzola [email protected]
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
plgem.fit
, plgem.resampledStn
,
plgem.pValue
, plgem.deg
, run.plgem
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset) LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit) head(LPSobsStn[["PLGEM.STN"]])
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset) LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit) head(LPSobsStn[["PLGEM.STN"]])
This function computes p-values for observed PLGEM signal-to-noise ratio (STN)
values (typically obtained via a call to plgem.obsStn
) from
resampled STN values (typically obtained via a call to
plgem.resampledStn
).
plgem.pValue(observedStn, plgemResampledStn, verbose=FALSE)
plgem.pValue(observedStn, plgemResampledStn, verbose=FALSE)
observedStn |
|
plgemResampledStn |
|
verbose |
|
The p-value of each given observed STN value is computed based on the quantile that the given value occupies in the corresponding distribution of resampled PLGEM-STN values, based on the following relationship:
A matrix
with the same dim
ensions and
dimnames
as the input observedStn$PLGEM.STN
, where each
entry represents the p-value of the corresponding observed PLGEM-STN value.
Mattia Pelizzola [email protected]
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
plgem.fit
, plgem.obsStn
,
plgem.resampledStn
, plgem.deg
,
run.plgem
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset) LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit) head(LPSobsStn[["PLGEM.STN"]]) set.seed(123) LPSresampledStn <- plgem.resampledStn(data=LPSeset, plgemFit=LPSfit) LPSpValues <- plgem.pValue(LPSobsStn, LPSresampledStn) head(LPSpValues)
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset) LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit) head(LPSobsStn[["PLGEM.STN"]]) set.seed(123) LPSresampledStn <- plgem.resampledStn(data=LPSeset, plgemFit=LPSfit) LPSpValues <- plgem.pValue(LPSobsStn, LPSresampledStn) head(LPSpValues)
This function computes resampled signal-to-noise ratio (STN) values using
PLGEM fitting parameters (obtained via a call to function
plgem.fit
) to detect differential expression in an
ExpressionSet
, containing either microarray or proteomics data.
plgem.resampledStn(data, plgemFit, covariate=1, baselineCondition=1, iterations="automatic", verbose=FALSE)
plgem.resampledStn(data, plgemFit, covariate=1, baselineCondition=1, iterations="automatic", verbose=FALSE)
data |
an object of class |
plgemFit |
|
covariate |
|
baselineCondition |
|
verbose |
|
iterations |
number of iterations for the resampling step; if
|
The phenoData
slot of the ExpressionSet
given as input is
expected to contain the necessary information to distinguish the various
experimental conditions from one another. The columns of the pData
are
referred to as ‘covariates’. There has to be at least one covariate
defined in the input ExpressionSet
. The sample attributes according to
this covariate must be distinct for samples that are to be treated as distinct
experimental conditions and identical for samples that are to be treated as
replicates.
There is a couple different ways how to specify the covariate
: If an
integer
or a numeric
is given, it will be taken as the covariate
number (in the same order in which the covariates appear in the
colnames
of the pData
). If a character
is given, it will
be taken as the covariate name itself (in the same way the covariates are
specified in the colnames
of the pData
). By default, the first
covariate appearing in the colnames
of the pData
is used.
Similarly, there is a couple different ways how to specify which experimental
condition to treat as the baseline. The available ‘condition names’ are
taken from unique(as.character(pData(data)[, covariate]))
. If
baselineCondition
is given as a character
, it will be taken as
the condition name itself. If baselineCondition
is given as an
integer
or a numeric
value, it will be taken as the condition
number (in the same order of appearance as in the ‘condition names’).
By default, the first condition name is used.
PLGEM-STN values are a measure of the degree of differential expression between a condition and the baseline:
where:
plgem.resampledStn
determines the resampled PLGEM-STN values for each
gene or protein in data
using a resampling approach; see References
for details. The number of iterations should be chosen depending on the number
of available replicates of the condition used for fitting the model.
A list
of two elements:
RESAMPLED.STN |
|
REPL.NUMBER |
the number of replicates found for each experimental condition; see References for details. |
Mattia Pelizzola [email protected]
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
plgem.fit
, plgem.obsStn
,
plgem.pValue
, plgem.deg
, run.plgem
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset) LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit) set.seed(123) LPSresampledStn <- plgem.resampledStn(data=LPSeset, plgemFit=LPSfit) plot(density(LPSresampledStn[["RESAMPLED.STN"]], bw=0.01), col="black", lwd=2, xlab="PLGEM STN values", main="Distribution of observed\nand resampled PLGEM-STN values") lines(density(LPSobsStn[["PLGEM.STN"]], bw=0.01), col="red") legend("topright", legend=c("resampled", "observed"), col=c("black", "red"), lwd=2:1)
data(LPSeset) LPSfit <- plgem.fit(data=LPSeset) LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit) set.seed(123) LPSresampledStn <- plgem.resampledStn(data=LPSeset, plgemFit=LPSfit) plot(density(LPSresampledStn[["RESAMPLED.STN"]], bw=0.01), col="black", lwd=2, xlab="PLGEM STN values", main="Distribution of observed\nand resampled PLGEM-STN values") lines(density(LPSobsStn[["PLGEM.STN"]], bw=0.01), col="red") legend("topright", legend=c("resampled", "observed"), col=c("black", "red"), lwd=2:1)
This function writes the output of function plgem.deg
or
run.plgem
to a series of files in the current working directory.
plgem.write.summary(x, prefix=NULL, verbose=FALSE)
plgem.write.summary(x, prefix=NULL, verbose=FALSE)
x |
|
prefix |
optional |
verbose |
|
This function writes three types of files to the current working directory:
1) A comma-separated text file containing the PLGEM fitting parameters;
2) A comma-separated text file containing the observed STN values and their associated p-values for all performed comparisons; (STN values and p-values from different comparisons appear in different columns, with column headers reflecting the underlying comparison)
3) One or more plain text files containing the identifiers of the significant genes or proteins, with filenames reflecting the specific comparisons that were performed (i.e. which experimental condition was compared to the baseline) and the specific significance threshold that were used in the DEG selection step.
Before files are written, the function checks for existence of files with identical names in the working directory and prompts the user to decide whether to abort the writing process or to overwrite the existing files.
The function returns no value. It is called for its side effect to write files to the working directory.
Mattia Pelizzola [email protected]
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
## Not run: data(LPSeset) LPSdegList <- run.plgem(LPSeset, fitting.eval=FALSE) plgem.write.summary(LPSdegList, prefix="test", verbose=TRUE) ## End(Not run)
## Not run: data(LPSeset) LPSdegList <- run.plgem(LPSeset, fitting.eval=FALSE) plgem.write.summary(LPSdegList, prefix="test", verbose=TRUE) ## End(Not run)
This function automatically performs PLGEM fitting and evaluation, determination of observed and resampled PLGEM-STN values, and selection of differentially expressed genes/proteins (DEG) using the PLGEM method.
run.plgem(esdata, signLev=0.001, rank=100, covariate=1, baselineCondition=1, Iterations="automatic", trimAllZeroRows=FALSE, zeroMeanOrSD=c("replace", "trim"), fitting.eval=TRUE, plotFile=FALSE, writeFiles=FALSE, Prefix=NULL, Verbose=FALSE)
run.plgem(esdata, signLev=0.001, rank=100, covariate=1, baselineCondition=1, Iterations="automatic", trimAllZeroRows=FALSE, zeroMeanOrSD=c("replace", "trim"), fitting.eval=TRUE, plotFile=FALSE, writeFiles=FALSE, Prefix=NULL, Verbose=FALSE)
esdata |
an object of class |
signLev |
numeric vector; significance level(s) for the DEG selection. Value(s) must be in (0,1). |
rank |
|
covariate |
|
baselineCondition |
|
Iterations |
number of iterations for the resampling step; if
|
trimAllZeroRows |
|
zeroMeanOrSD |
either |
fitting.eval |
|
plotFile |
|
writeFiles |
|
Prefix |
optional |
Verbose |
|
The phenoData
slot of the ExpressionSet
given as input is
expected to contain the necessary information to distinguish the various
experimental conditions from one another. The columns of the pData
are
referred to as ‘covariates’. There has to be at least one covariate
defined in the input ExpressionSet
. The sample attributes according to
this covariate must be distinct for samples that are to be treated as distinct
experimental conditions and identical for samples that are to be treated as
replicates.
There is a couple different ways how to specify the covariate
: If an
integer
or a numeric
is given, it will be taken as the covariate
number (in the same order in which the covariates appear in the
colnames
of the pData
). If a character
is given, it will
be taken as the covariate name itself (in the same way the covariates are
specified in the colnames
of the pData
). By default, the first
covariate appearing in the colnames
of the pData
is used.
Similarly, there is a couple different ways how to specify which experimental
condition to treat as the baseline. The available ‘condition names’ are
taken from unique(as.character(pData(data)[, covariate]))
. If
baselineCondition
is given as a character
, it will be taken as
the condition name itself. If baselineCondition
is given as an
integer
or a numeric
value, it will be taken as the condition
number (in the same order of appearance as in the ‘condition names’).
By default, the first condition name is used.
The model is fitted on the most replicated condition. When more conditions
exist with the maximum number of replicates, the condition providing the best
fit is chosen (based on the adjusted ). If there is again a
tie, the first one is arbitrarily taken.
If less than 3 replicates are provided for the condition used for fitting,
then the selection is based on ranking according to the observed PLGEM-STN
values. In this case the first rank
genes or proteins are selected for
each comparison.
Otherwise DEG are selected comparing the observed and resampled PLGEM-STN
values at the signLev
significance level(s), based on p-values obtained
via a call to function plgem.pValue
. See References for details.
A list
of four elements:
fit |
the input |
PLGEM.STN |
a |
p-value |
a |
significant |
a |
Mattia Pelizzola [email protected]
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
plgem.fit
, plgem.obsStn
,
plgem.resampledStn
, plgem.pValue
,
plgem.deg
, plgem.write.summary
data(LPSeset) set.seed(123) LPSdegList <- run.plgem(esdata=LPSeset, fitting.eval=FALSE)
data(LPSeset) set.seed(123) LPSdegList <- run.plgem(esdata=LPSeset, fitting.eval=FALSE)
Function to set graphical parameters for PLGEM fitting evaluation
plots produced by function plgem.fit
.
setGpar(minLnM=NULL, maxLnM=NULL, minLnSD=NULL, maxLnSD=NULL, minRes=NULL, maxRes=NULL)
setGpar(minLnM=NULL, maxLnM=NULL, minLnSD=NULL, maxLnSD=NULL, minRes=NULL, maxRes=NULL)
minLnM |
minimum 'ln(mean)' value in upper left plot. |
maxLnM |
maximum 'ln(mean)' value in upper left plot. |
minLnSD |
minimum 'ln(sd)' value in upper left plot. |
maxLnSD |
maximum 'ln(sd)' value in upper left plot. |
minRes |
minimum 'residual' value in upper right plot. |
maxRes |
maximum 'residual' value in upper right plot. |
A call to setGpar()
is the recommended way to set graphical parameters
in plgem.fit
. If parameters are left unspecified, suitable defaults
will be found by plgem.fit
.
A list
of six elements:
minLnM |
minimum 'ln(mean)' value in upper left plot. |
maxLnM |
maximum 'ln(mean)' value in upper left plot. |
minLnSD |
minimum 'ln(sd)' value in upper left plot. |
maxLnSD |
maximum 'ln(sd)' value in upper left plot. |
minRes |
minimum 'residual' value in upper right plot. |
maxRes |
maximum 'residual' value in upper right plot. |
Norman Pavelka [email protected]
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
setGpar(minLnM=-1, maxLnM=8)
setGpar(minLnM=-1, maxLnM=8)