Package 'ROTS' reference manual

Title:	Reproducibility-Optimized Test Statistic
Description:	Calculates the Reproducibility-Optimized Test Statistic (ROTS) for differential testing in omics data.
Authors:	Fatemeh Seyednasrollah, Tomi Suomi, Laura L. Elo
Maintainer:	Tomi Suomi <[email protected]>
License:	GPL (>= 2)
Version:	1.99.11
Built:	2025-03-26 03:32:52 UTC
Source:	https://github.com/bioc/ROTS

Gene expression data from the Affymetrix spike-in experiment

Description

The Affymetrix HG-U95Av2 spike-in data contains two groups of 12 samples. In this carefully controlled experiment, it is known that 14 genes are differentially expressed between the two sample groups, while the rest of the genes are equally expressed. In order to reduce the running time of the vignette example, this package uses a subset of 1000 genes of the original dataset including all the spiked genes with 5 samples from each group. The preprocessed RMA-normalized data were obtained from the Bioconductor 2.2 package DEDS. For more details about the spike-in experiment, see http://www.affymetrix.com.

Usage

affySpikeInaffySpikeIn

Format

affySpikeIn: a numeric matrix of gene expression levels with 12626 rows (genes) and 24 columns (samples).
affySpikeIn.L: a vector indicating the sample groups.
affySpikeIn.gnames: a character vector containing the names of the genes.
spikedgene: a numeric vector indicating the locations of the 14 spiked genes.

Reproducibility-Optimized Test Statistic (ROTS)

Description

Calculates the reproducibility-optimized test statistic (ROTS) for ranking genes in order of evidence for differential expression using linear mixed-effects model

Usage

	lmeROTS(formula, data, metadata, B = 100, K = NULL, seed = NULL,
	BPPARAM = bpparam())
lmeROTS(formula, data, metadata, B = 100, K = NULL, seed = NULL,
	BPPARAM = bpparam())

Arguments

`formula`	A string representing a formula that follows the `lme4` syntax and where the dependent variable (e.g. expression ~ ...) is omitted. `lmeROTS` uses each row of `data` as the dependent variable and looks for explanatory variables from `metadata`.
`data`	A numeric data matrix in which rows correspond to genes and columns correspond to samples.
`metadata`	A numeric data matrix in which rows correspond to samples and columns correspond to explanatory variables.
`B`	An integer specifying the number of bootstrap and permutation resamplings (default 100).
`K`	An integer indicating the largest top list size considered. If no value is given, 1/4 of the features are used.
`seed`	An integer seed for the random number generator.
`BPPARAM`	BiocParallel back-end if other than default.

Details

The reproducibility-optimization procedure ROTS enables the selection of a suitable gene ranking statistic directly from the given dataset. This function allows the use of linear mixed-effects models as a basis for reproducibility-optimization procedure.

Value

lmeROTS returns an object of class regROTS, which is a list of ROTS results for each of the explanatory variables in the formula. They contain the following components:

`d`	the value of the optimized ROTS-statistic for each gene.
`coef`	coefficients of the model for each gene.
`pvalue`	the corresponding pvalues.
`FDR`	the corresponding FDR-values.
`a1`	the optimized parameter a1.
`a2`	the optimized parameter a2.
`k`	the optimized top list size.
`R`	the optimized reproducibility value.
`Z`	the optimized reproducibility Z-score.
`ztable`	the full reproducibility table.

Author(s)

Tomi Suomi, Laura L. Elo
Maintainer: Tomi Suomi <[email protected]>

Examples

## lmeROTS-statistic
## Not run: rots.out <- lmeROTS("Group + Age + (1|Individual)", data, metadata)
## lmeROTS-statistic
## Not run: rots.out <- lmeROTS("Group + Age + (1|Individual)", data, metadata)

Reproducibility-Optimized Test Statistic (ROTS)

Description

Calculates the reproducibility-optimized test statistic (ROTS) for ranking genes in order of evidence for differential expression using linear model

Usage

	lmROTS(formula, data, metadata, B = 100, K = NULL, seed = NULL,
	BPPARAM = bpparam())
lmROTS(formula, data, metadata, B = 100, K = NULL, seed = NULL,
	BPPARAM = bpparam())

Arguments

`formula`	A string representing a formula that follows the `lm` syntax and where the dependent variable (e.g. expression ~ ...) is omitted. `lmROTS` uses each row of `data` as the dependent variable and looks for explanatory variables from `metadata`.
`data`	A numeric data matrix in which rows correspond to genes and columns correspond to samples.
`metadata`	A numeric data matrix in which rows correspond to samples and columns correspond to explanatory variables.
`B`	An integer specifying the number of bootstrap and permutation resamplings (default 100).
`K`	An integer indicating the largest top list size considered. If no value is given, 1/4 of the features are used.
`seed`	An integer seed for the random number generator.
`BPPARAM`	BiocParallel back-end if other than default.

Details

The reproducibility-optimization procedure ROTS enables the selection of a suitable gene ranking statistic directly from the given dataset. This function allows the use of linear models as a basis for reproducibility-optimization procedure.

Value

lmROTS returns an object of class regROTS, which is a list of ROTS results for each of the explanatory variables in the formula. They contain the following components:

`d`	the value of the optimized ROTS-statistic for each gene.
`coef`	coefficients of the model for each gene.
`pvalue`	the corresponding pvalues.
`FDR`	the corresponding FDR-values.
`a1`	the optimized parameter a1.
`a2`	the optimized parameter a2.
`k`	the optimized top list size.
`R`	the optimized reproducibility value.
`Z`	the optimized reproducibility Z-score.
`ztable`	the full reproducibility table.

Author(s)

Tomi Suomi, Laura L. Elo
Maintainer: Tomi Suomi <[email protected]>

Examples

## lmROTS-statistic
## Not run: rots.out <- lmeROTS("Group + Age", data, metadata)
## lmROTS-statistic
## Not run: rots.out <- lmeROTS("Group + Age", data, metadata)

Plotting of a regROTS object

Description

Plots the regROTS objects created with the ROTS package.

Usage

  ## S3 method for class 'regROTS'
plot(x, fdr=0.05, type=NULL, features=NULL, labels=FALSE, ...)
## S3 method for class 'regROTS'
plot(x, fdr=0.05, type=NULL, features=NULL, labels=FALSE, ...)

Arguments

`x`	A `regROTS` object created from differential expression testing run by lmeROTS or lmROTS.
`fdr`	Selected cutoff for FDR value.
`type`	Type of plot to be created. Options are: 'volcano', 'reproducibility', 'pvalue'.
`features`	Option to plot only selected variables from the output.
`labels`	Option to print labels for differentially expressed features.
`...`	Other arguments passed to the plot function.

Details

This function plots the results from a ROTS object using given false discovery rate threshold.

Value

Plots the results from ROTS object.

Author(s)

Tomi Suomi, Laura L. Elo
Maintainer: Tomi Suomi <[email protected]>

Examples

  ## Run and plot lmeROTS-statistic 
  ## Not run: rots.out <- lmeROTS("Group + Age + (1|Individual)", data, metadata)
  ## Not run: plot(rots.out, type="reproducibility")
## Run and plot lmeROTS-statistic 
  ## Not run: rots.out <- lmeROTS("Group + Age + (1|Individual)", data, metadata)
  ## Not run: plot(rots.out, type="reproducibility")

Plotting of a ROTS object

Description

Plots the ROTS objects created with the ROTS package.

Usage

  ## S3 method for class 'ROTS'
plot(x, fdr=0.05, type=NULL, labels=FALSE, ...)
## S3 method for class 'ROTS'
plot(x, fdr=0.05, type=NULL, labels=FALSE, ...)

Arguments

`x`	A `ROTS` object created from differential expression testing run by ROTS.
`fdr`	Selected cutoff for FDR value.
`type`	Type of plot to be created. Options are: 'volcano', 'heatmap', 'ma', 'reproducibility', 'pvalue', 'pca'.
`labels`	Option to print labels for differentially expressed features.
`...`	Other arguments passed to the plot function.

Details

This function plots the results from a ROTS object using given false discovery rate threshold.

Value

Plots the results from ROTS object.

Author(s)

Fatemeh Seyednasrollah, Tomi Suomi, Laura L. Elo
Maintainer: Tomi Suomi <[email protected]>

Examples

  ## ROTS-statistic for the Affymetrix spike-in data. 
  rots.out <- ROTS(data = affySpikeIn, groups = c(rep(0,5), rep(1,5)),
      B = 100, K = 500 , seed = 1234)
  ## Plotting of the ROTS results.
  plot(rots.out, type="volcano")
## ROTS-statistic for the Affymetrix spike-in data. 
  rots.out <- ROTS(data = affySpikeIn, groups = c(rep(0,5), rep(1,5)),
      B = 100, K = 500 , seed = 1234)
  ## Plotting of the ROTS results.
  plot(rots.out, type="volcano")

Gathering results from a regROTS object

Description

Gathers the results from regROTS objects created with the ROTS package.

Usage

  results(object, order=FALSE, pvalue=NULL, FDR=NULL, coef=NULL, logfc=NULL)
results(object, order=FALSE, pvalue=NULL, FDR=NULL, coef=NULL, logfc=NULL)

Arguments

`object`	A `regROTS` object created from differential expression testing run by lmeROTS or lmROTS.
`order`	Ordering of results based on significance (default FALSE).
`pvalue`	Optional p-value cutoff for filtering the results.
`FDR`	Optional false discovery rate cutoff for filtering the results.
`coef`	Optional model coefficient cutoff for filtering the results (`regROTS`).
`logfc`	Optional log fold change cutoff for filtering the results (`ROTS`).

Details

This is a convenience function to gather results from regROTS objects.

Value

A list of reproducibility-optimized results for each variable in the model.

Author(s)

Tomi Suomi, Laura L. Elo
Maintainer: Tomi Suomi <[email protected]>

Examples

  ## Run and plot lmeROTS-statistic 
  ## Not run: rots.out <- lmeROTS("Group + Age + (1|Individual)", data, metadata)
  ## Not run: results <- results(rots.out)
## Run and plot lmeROTS-statistic 
  ## Not run: rots.out <- lmeROTS("Group + Age + (1|Individual)", data, metadata)
  ## Not run: results <- results(rots.out)

Reproducibility-Optimized Test Statistic (ROTS)

Description

Calculates the reproducibility-optimized test statistic (ROTS) for ranking genes in order of evidence for differential expression in two-group or multi-group comparisons.

Usage

	ROTS(data, groups, B = 1000, K = NULL, paired = FALSE, seed = NULL,
	a1 = NULL, a2 = NULL, log = TRUE, progress = FALSE, verbose = TRUE,
	time = NULL, event = NULL)
ROTS(data, groups, B = 1000, K = NULL, paired = FALSE, seed = NULL,
	a1 = NULL, a2 = NULL, log = TRUE, progress = FALSE, verbose = TRUE,
	time = NULL, event = NULL)

Arguments

`data`	A numeric data matrix or an ExpressionSet instance, in which rows correspond to genes and columns correspond to samples.
`groups`	A vector indicating the sample groups.
`B`	An integer specifying the number of bootstrap and permutation resamplings (default 1000).
`K`	An integer indicating the largest top list size considered. If no value is given, 1/4 of the features are used.
`paired`	A logical indicating whether a paired test is performed. The samples are expected to be in the same order in both groups.
`seed`	An integer seed for the random number generator.
`a1`, `a2`	Non-negative parameters. See details section for further information.
`log`	A logical (deafult TRUE) indicating whether input data is log2 scaled. This information is only used to calculate log fold change.
`progress`	A logical indicating if additional progress bars are shown.
`verbose`	A logical indicating if messages are shown.
`time`	Time to event (for survival data).
`event`	Censoring information (for survival data): 1=event, 0=censored.

Details

The reproducibility-optimization procedure ROTS enables the selection of a suitable gene ranking statistic directly from the given dataset. In two-group comparisons, the statistic is optimized among a family of t-type statistics d = m/(a1+a2*s), where m is the absolute difference between the group averages, s is the pooled standard error, and a1 and a2 are the non-negative parameters to be optimized. Two special cases of this family are the ordinary t-statistic (a1=0, a2=1) and the signal log-ratio (a1=1, a2=0). The optimality is defined in terms of maximal overlap of top-ranked genes in group-preserving bootstrap datasets. Importantly, besides the group labels, no a priori information about the properties of the data is required and no fixed cutoff for the gene rankings needs to be specified. For more details about the reproducibility-optimization procedure, see Elo et al. (2008).

The user is given the option to adjust the largest top list size considered in the reproducibility calculations, since lowering this size can markedly reduce the computation time. In large data matrices with thousands of rows, we generally recommend using a size of several thousands. In smaller data matrices, and especially if there are many rows with only a few non-missing entries, the size of K should be decreased accordingly.

ROTS tolerates a moderate number of missing values in the data matrix by effectively ignoring their contribution during the operation of the procedure. However, each row of the data matrix must contain at least two values in both groups. The rows containing only a few non-missing values should be removed; or alternatively, the missing data entries can be imputed using, e.g., the K-nearest neighbors imputation, which is implemented in the Bioconductor package impute. ROTS assumes the input data matrix is log2 transformed (the default for log parameter is set to TRUE). Although, this only affects fold change values, we recommend setting log parameter to FALSE if the input matrix is not log transformed to avoid downstream confusions.

If the parameter values a1 and a2 are set by the user, then no optimization is performed but the statistic and FDR-values are calculated for the given parameters. The false discovery rate (FDR) for the optimized test statistic is calculated by permuting the sample labels. The results for all the genes can be obtained by setting the FDR cutoff to 1.

Value

ROTS returns an object of class ROTS, which is a list containing the following components

`data`	the expression data matrix.
`B`	the number of bootstrap and permutation resamplings.
`d`	the value of the optimized ROTS-statistic for each gene.
`pvalue`	the corresponding pvalues.
`FDR`	the corresponding FDR-values.
`a1`	the optimized parameter a1.
`a2`	the optimized parameter a2.
`k`	the optimized top list size.
`R`	the optimized reproducibility value.
`Z`	the optimized reproducibility Z-score.

print prints the optimized parameters a1 and a2, the optimized top list size and the corresponding reproducibility values.

summary summarizes the results of a ROTS analysis. If fdr and num.genes are not specified, then the optimized parameters a1 and a2, the optimized top list size and the corresponding reproducibility values are shown. If fdr or num.genes is specified, then also the gene-specific information is shown for the genes at the specified FDR-level or top list size, respectively.

Author(s)

Fatemeh Seyednasrollah, Tomi Suomi, Laura L. Elo
Maintainer: Tomi Suomi <[email protected]>

References

Suomi T, Seyednasrollah F, Jaakkola MK, Faux T, Elo LL.
ROTS: An R package for reproducibility-optimized statistical testing.
PLoS Comput Biol 2017; 13: e1005562.

Examples

## ROTS-statistic for the Affymetrix spike-in data. 
rots.out <- ROTS(data = affySpikeIn, groups = c(rep(0,5), rep(1,5)),
    B = 100, K = 500 , seed = 1234)
## Summary of the ROTS results.
rots.summary <- summary(rots.out, fdr = 0.05)
## ROTS-statistic for the Affymetrix spike-in data. 
rots.out <- ROTS(data = affySpikeIn, groups = c(rep(0,5), rep(1,5)),
    B = 100, K = 500 , seed = 1234)
## Summary of the ROTS results.
rots.summary <- summary(rots.out, fdr = 0.05)

Summary of a regROTS object

Description

Summarizes the differential testing results from ROTS package.

Usage

  ## S3 method for class 'regROTS'
summary(object, fdr=0.05, ...)
## S3 method for class 'regROTS'
summary(object, fdr=0.05, ...)

Arguments

`object`	A `regROTS` object created from differential expression testing run by ROTS.
`fdr`	Selected cutoff for FDR value.
`...`	Other arguments passed to the summary function.

Details

This function returns the summary information of regROTS object.

Value

Summary of reproducibility-optimisation.

Author(s)

Tomi Suomi, Laura L. Elo
Maintainer: Tomi Suomi <[email protected]>

Examples

  ## Run and get summary of lmeROTS-statistic 
  ## Not run: rots.out <- lmeROTS("Group + Age + (1|Individual)", data, metadata)
  ## Not run: summary(rots.out)
## Run and get summary of lmeROTS-statistic 
  ## Not run: rots.out <- lmeROTS("Group + Age + (1|Individual)", data, metadata)
  ## Not run: summary(rots.out)

Summary of a ROTS object

Description

Summarizes the differential testing results from ROTS package.

Usage

  ## S3 method for class 'ROTS'
summary(object, fdr=NULL, num.genes=NULL, verbose=TRUE, ...)
## S3 method for class 'ROTS'
summary(object, fdr=NULL, num.genes=NULL, verbose=TRUE, ...)

Arguments

`object`	a `ROTS` object created from differential expression testing run by ROTS.
`fdr`	selected cutoff for FDR value.
`num.genes`	selected cutoff number for number of differentially detected features.
`verbose`	If TRUE (default), summary function will print out 10 first detections which fulfill the cutoff criteria.
`...`	other arguments passed to the summary function.

Details

This function returns the summary information (including row number, test-statistic, pvalue and FDR value) for the selected features.

Value

Returns a matrix where the rows are the selected features and columns are the Row number, ROTS-statistic, pvalue and FDR.

Author(s)

Fatemeh Seyednasrollah, Tomi Suomi, Laura L. Elo
Maintainer: Tomi Suomi <[email protected]>

References

Suomi T, Seyednasrollah F, Jaakkola MK, Faux T, Elo LL.
ROTS: An R package for reproducibility-optimized statistical testing.
PLoS Comput Biol 2017; 13: e1005562.

Examples

  ## ROTS-statistic for the Affymetrix spike-in data. 
  rots.out <- ROTS(data = affySpikeIn, groups = c(rep(0,5), rep(1,5)),
      B = 100, K = 500 , seed = 1234)
  ## Summary of the ROTS results.
  rots.summary <- summary(rots.out, fdr = 0.05)
## ROTS-statistic for the Affymetrix spike-in data. 
  rots.out <- ROTS(data = affySpikeIn, groups = c(rep(0,5), rep(1,5)),
      B = 100, K = 500 , seed = 1234)
  ## Summary of the ROTS results.
  rots.summary <- summary(rots.out, fdr = 0.05)

Protein expression data from the CPTAC Technology Assessment (Study 6)

Description

The data contains preprocessed expression values from CPTAC Technology Assessment, where equimolar concentrations of 48 human proteins (Sigma UPS 1) are spiked in five different concentrations into yeast proteome background and then processed. The example data contains only two groups from the study 6.

Usage

upsSpikeInupsSpikeIn

Format

upsSpikeIn: A numeric matrix of protein expressions.

Package 'ROTS'

Help Index

Gene expression data from the Affymetrix spike-in experiment

Description

Usage

Format

Reproducibility-Optimized Test Statistic (ROTS)

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Reproducibility-Optimized Test Statistic (ROTS)

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Plotting of a regROTS object

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Plotting of a ROTS object

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

Gathering results from a regROTS object

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Reproducibility-Optimized Test Statistic (ROTS)

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Summary of a regROTS object

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Summary of a ROTS object

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Protein expression data from the CPTAC Technology Assessment (Study 6)

Description

Usage

Format