Title: | Signal-to-Noise applied to Gene Expression Experiments |
---|---|
Description: | Signal-to-Noise applied to Gene Expression Experiments. Signal-to-noise ratios (SNRs) can be used as a proxy for the quality of gene expression studies and samples. The SNRs can be calculated on any gene expression data set as long as gene IDs are available; no access to the raw data files is necessary. This makes it possible to flag problematic studies and samples in any public data set. |
Authors: | David Venet <[email protected]> |
Maintainer: | David Venet <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.47.0 |
Built: | 2024-11-30 04:49:48 UTC |
Source: | https://github.com/bioc/SNAGEE |
Signal-to-Noise Applied to Gene Expression Experiments
Package: | SNAGEE |
Version: | 0.99.0 |
Date: | 2012-01-26 |
Depends: | R (>= 2.6.0) |
Imports: | SNAGEEdata |
Suggests: | ALL |
Enhances: | parallel |
License: | Artistic-2.0 |
URL: | http://fleming.ulb.ac.be/SNAGEE |
Index:
qualStudy         Quality of a study
qualSample        Quality of samples in a study
toSnageeFormat    Turns an Eset to a list usable by SNAGEE
David Venet <[email protected]>
Maintainer: David Venet <[email protected]>
# Get the list of genes
geneList = getCC()$g
# Create a random data set
d = list(genes=geneList, data=matrix(rnorm(length(geneList)*50), ncol=50))
# Calculate its quality (it's going to be very close to 0)
qualStudy(d, disattenuate=FALSE)
# Calculate individual sample qualities
qs = qualSample(d)
Calculate the relative quality of all samples from a study.
qualSample(data,mode="complete",cc=NULL,multicore=FALSE)
data |
The study data. If an Eset, toSnageeFormat is called on it. Otherwise, must be a list with fields 'genes' containing the vector of gene IDs (from Entrez) and 'data' containing the gene expression data. |
mode |
Which gene-gene correlation matrix should be used. Can be 'complete' (using all platforms) or 'woAffy' (without the Affy platforms). |
cc |
Can be used if wishing to use a custom gene-gene correlation matrix. Must be a list with fields 'g' containing the gene IDs and 'cc' containing the (upper triangular part of the) correlations. |
multicore |
Should the parallel version be used? This relies on the parallel package; if that package cannot be loaded, the function falls back to a single core, with a warning. |
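As an illustration of the 'cc' argument, the following sketch builds a list with the two documented fields, 'g' and 'cc', from a small reference matrix. The gene IDs and the reference data are made up, and the ordering of the upper triangular part (here the column-major order returned by upper.tri) is an assumption that should be checked against the value returned by getCC() before relying on it.

```r
# Hypothetical reference data: 5 genes (rows) x 30 samples (columns).
set.seed(42)
genes <- c("10", "25", "31", "47", "52")   # made-up Entrez gene IDs
ref   <- matrix(rnorm(5 * 30), nrow = 5, dimnames = list(genes, NULL))

# Gene-gene correlation matrix (5 x 5).
full <- cor(t(ref))

# Keep only the upper triangular part, as a vector.
# ASSUMPTION: column-major order, as produced by upper.tri().
customCC <- list(g = genes, cc = full[upper.tri(full)])

str(customCC)
```

Such a list could then be supplied through the 'cc' argument in place of the built-in matrix selected by 'mode'.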
The function calculates the quality of all samples in a study. Lower values indicate lower quality. The numerical values of the study (the 'data' field) should be in log-scale, and normalized. It is recommended to use medpolish on the data.
Each gene should only appear once in the gene list. Duplicated genes must be merged before using the function. Non-finite values should also be removed first (using the impute package for instance).
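The preparation steps described above can be sketched in base R. This is a minimal illustration on a toy matrix, not the package's own code: the gene IDs are invented, features with non-finite values are simply dropped instead of imputed, duplicated genes are merged by averaging, and using the residuals of medpolish as the polished data is an assumption.

```r
# Toy log-scale expression matrix: 6 features x 4 samples.
set.seed(1)
expr <- matrix(rnorm(24, mean = 8), nrow = 6)
ids  <- c("10", "10", "25", "31", "31", "47")  # made-up Entrez IDs with duplicates
expr[3, 2] <- NA                                # one non-finite value

# Drop features containing non-finite values (imputation is an alternative).
keep <- apply(expr, 1, function(x) all(is.finite(x)))
expr <- expr[keep, , drop = FALSE]
ids  <- ids[keep]

# Merge duplicated genes by averaging their rows.
sums   <- rowsum(expr, group = ids)             # one row per gene ID
counts <- as.vector(table(ids)[rownames(sums)]) # number of features per gene
merged <- sums / counts

# Median polish; taking the residuals is an assumption of this sketch.
polished <- medpolish(merged, trace.iter = FALSE)$residuals

# List in the format described for the 'data' argument.
d <- list(genes = rownames(merged), data = polished)
str(d)
```

The resulting list has one row per gene ID and no missing values, which is the shape expected for the 'data' argument.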
SNAGEE, qualStudy, toSnageeFormat
# Get the list of genes
geneList = getCC()$g
# Create a random data set
d = list(genes=geneList, data=matrix(rnorm(length(geneList)*50), ncol=50))
# And calculate the quality of the samples (they are all about the same)
qualSample(d)
Calculate the quality of a study.
qualStudy(d,mode="complete",cc=NULL,disattenuate=TRUE)
d |
The study data. If an Eset, toSnageeFormat is called on it. Otherwise, must be a list with fields 'genes' containing the vector of gene IDs (from NCBI's Gene DB) and 'data' containing the actual data. |
mode |
Which gene-gene correlation matrix should be used. Can be 'complete' (using all platforms) or 'woAffy' (without the Affy platforms). |
cc |
Can be used if wishing to use a custom gene-gene correlation matrix. Must be a list with fields 'g' containing the gene IDs and 'cc' containing the (upper triangular part of the) correlations. |
disattenuate |
Should the qualities be disattenuated? |
The function calculates the quality of a study. The numerical values of the study (the 'data' field) should be in log-scale, and normalized. It is recommended to use medpolish on the data.
Each gene should only appear once in the gene list. Duplicated genes must be merged before using the function.
The mode 'woAffy' may be useful to compare Affymetrix with non-Affymetrix studies. As the median gene correlation matrix was calculated with a majority of Affymetrix platforms, those platforms tend to be given higher quality than the others with the 'complete' mode, which may be misleading.
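A quick way to see the effect of the mode is to score the same study with both correlation matrices. The sketch below reuses the random data set from the examples, so both values should be near 0; it only runs if SNAGEE is installed, and it assumes that the gene list returned by getCC() is also acceptable with the 'woAffy' matrix.

```r
# Runs only when the SNAGEE package is available.
if (requireNamespace("SNAGEE", quietly = TRUE)) {
  library(SNAGEE)

  # Random data set, as in the examples of this manual.
  geneList <- getCC()$g
  d <- list(genes = geneList,
            data  = matrix(rnorm(length(geneList) * 50), ncol = 50))

  # Same study, scored against the two built-in correlation matrices.
  qComplete <- qualStudy(d, mode = "complete", disattenuate = FALSE)
  qWoAffy   <- qualStudy(d, mode = "woAffy",   disattenuate = FALSE)

  print(c(complete = qComplete, woAffy = qWoAffy))
}
```

On real data, a study from a non-Affymetrix platform that scores noticeably better under 'woAffy' than under 'complete' would be consistent with the platform effect described above.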
SNAGEE, qualSample, toSnageeFormat
# Get the list of genes
geneList = getCC()$g
# Create a random data set
d = list(genes=geneList, data=matrix(rnorm(length(geneList)*50), ncol=50))
# And calculate its quality (it's going to be close to 0)
qualStudy(d, disattenuate=FALSE)
Turns an Eset into a list usable by SNAGEE.
toSnageeFormat(data)
data |
An Eset. If already a list, leaves it as it is. |
The function turns an Eset into a list usable by SNAGEE. Gene ID annotations are found using the annotation slot of the Eset and the related annotation DB. If no annotation DB can be found, an error is raised.
In addition, features with identical gene IDs are averaged, and the data are medpolished.
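For an actual Eset, a conversion could look like the sketch below. It uses the ALL data set listed under Suggests and assumes that its annotation package is hgu95av2.db; the block is skipped unless SNAGEE, ALL, and that annotation package are all installed.

```r
# Runs only when the required packages are available.
# ASSUMPTION: hgu95av2.db is the annotation DB matching the ALL Eset.
if (requireNamespace("SNAGEE", quietly = TRUE) &&
    requireNamespace("ALL", quietly = TRUE) &&
    requireNamespace("hgu95av2.db", quietly = TRUE)) {
  library(SNAGEE)

  # Load the ALL ExpressionSet.
  data(ALL, package = "ALL")

  # Convert to the list format (fields 'genes' and 'data').
  d <- toSnageeFormat(ALL)
  str(d, max.level = 1)

  # The converted list can be passed to the quality functions.
  print(qualStudy(d))
}
```

Since qualStudy and qualSample call toSnageeFormat themselves when given an Eset, the explicit conversion is mainly useful for inspecting the gene list or for reusing the converted data across several calls.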
# Get the list of genes
geneList = getCC()$g
# Create a random data set
d = list(genes=geneList, data=matrix(rnorm(length(geneList)*50), ncol=50))
# And calculate its quality (it's going to be close to 0)
qualStudy(d, disattenuate=FALSE)