Package 'randRotation' reference manual

Title:	Random Rotation Methods for High Dimensional Data with Batch Structure
Description:	A collection of methods for performing random rotations on high-dimensional, normally distributed data (e.g. microarray or RNA-seq data) with batch structure. The random rotation approach allows exact testing of dependent test statistics with linear models following arbitrary batch effect correction methods.
Authors:	Peter Hettegger [aut, cre]
Maintainer:	Peter Hettegger <[email protected]>
License:	GPL-3
Version:	1.19.0
Built:	2025-03-27 04:06:41 UTC
Source:	https://github.com/bioc/randRotation

randRotation: Random Rotation Methods for High Dimensional Data with Batch Structure

Description

A collection of methods for performing random rotations on high-dimensional, normally distributed data (e.g. microarray or RNA-seq data) with batch structure. The random rotation approach allows exact testing of dependent test statistics with linear models following arbitrary batch effect correction methods.

Details

Please refer to the package vignette for further details on usage and for a "quick start". rotateStat is the central function of the package. Methods are described in (Hettegger et al. 2021).

Author(s)

Maintainer: Peter Hettegger [email protected] (ORCID)

References

Create transformed model matrix for contrast rotation

Description

This function takes a model matrix X and a contrast matrix C and creates a transformed model matrix corresponding to a transformed set of coefficients.

Usage

contrastModel(X, C, coef.h = seq_len(ncol(C)))
contrastModel(X, C, coef.h = seq_len(ncol(C)))

Arguments

`X`	`(numeric)` model matrix with dimensions `samples x coefficients`.
`C`	`(numeric)` contrast matrix with dimensions `coefficients x contrasts`. The contrast matrix must have full column rank.
`coef.h`	column numbers of contrasts (in `C`) which should be set as `coef.h` in the transformed model, see `initRandrot`. All columns are set as `coef.h` by default.

Details

The last n coefficients of the transformed model matrix correspond to the n contrasts. By default, all contrasts are set as coef.h. See package vignette for examples of data rotations with contrasts.

Value

A transformed model matrix with coef.h set as attribute.

Author(s)

Peter Hettegger

Examples


group <- c("A", "A", "B", "B")
X <- model.matrix(~0+group)
C <- cbind(contrast1 = c(1, -1))
X2 <- contrastModel(X, C)
group <- c("A", "A", "B", "B")
X <- model.matrix(~0+group)
C <- cbind(contrast1 = c(1, -1))
X2 <- contrastModel(X, C)

Dimensions of an Object

Description

Retrieve the dimensions of an object.

Usage

## S4 method for signature 'initRandrot'
dim(x)

## S4 method for signature 'initBatchRandrot'
dim(x)
## S4 method for signature 'initRandrot'
dim(x)

## S4 method for signature 'initBatchRandrot'
dim(x)

Arguments

`x`	An object of class `initRandrot-class` or `initBatchRandrot-class`.

Value

Vector of length two with number of features and number of samples. See also initRandrot.

Dimnames of an Object

Description

Retrieve the dimnames of an object.

Usage

## S4 method for signature 'initRandrot'
dimnames(x)

## S4 method for signature 'initBatchRandrot'
dimnames(x)
## S4 method for signature 'initRandrot'
dimnames(x)

## S4 method for signature 'initBatchRandrot'
dimnames(x)

Arguments

`x`	An object of class `initRandrot-class` or `initBatchRandrot-class`.

Value

A list with names of features and samples, see initRandrot.

Initialised random rotation batch object

Description

This class contains initRandrot or initRandrotW class objects for each batch. See also descriptions in initRandrot and initRandrot-class.

Components

batch.obj: List of initRandrot or initRandrotW class objects for each batch.
split.by: List of sample indices for each batch.

Author(s)

Peter Hettegger

Initialisation of a random rotation Object

Description

Initialization of a linear model for subsequent generation of randomly rotated data (randrot) associated with the null hypothesis $H_{0}: \beta_{coef.h} = 0$ . Basics of rotation tests are found in (Hettegger et al. 2021) and (Langsrud 2005).

Usage

initRandrot(Y = NULL, X = NULL, coef.h = NULL, weights = NULL, cormat = NULL)

initBatchRandrot(
  Y = NULL,
  X = NULL,
  coef.h = NULL,
  batch = NULL,
  weights = NULL,
  cormat = NULL
)

## S4 method for signature 'list'
initBatchRandrot(
  Y = NULL,
  X = Y$design,
  coef.h = NULL,
  batch = NULL,
  weights = Y$weights,
  cormat = NULL
)

## S4 method for signature 'list'
initRandrot(
  Y = NULL,
  X = Y$design,
  coef.h = NULL,
  weights = Y$weights,
  cormat = NULL
)
initRandrot(Y = NULL, X = NULL, coef.h = NULL, weights = NULL, cormat = NULL)

initBatchRandrot(
  Y = NULL,
  X = NULL,
  coef.h = NULL,
  batch = NULL,
  weights = NULL,
  cormat = NULL
)

## S4 method for signature 'list'
initBatchRandrot(
  Y = NULL,
  X = Y$design,
  coef.h = NULL,
  batch = NULL,
  weights = Y$weights,
  cormat = NULL
)

## S4 method for signature 'list'
initRandrot(
  Y = NULL,
  X = Y$design,
  coef.h = NULL,
  weights = Y$weights,
  cormat = NULL
)

Arguments

`Y`	a data matrix with `features x samples` dimensions or a list with elements `E`, `design` and `weights` (see `Details`). Missing values (`NA`) are allowed but e.g. lead to NAs for all samples of the respective features in the rotated dataset and should thus be avoided. We highly recommend avoiding missing values by e.g. replacing them by imputation or removing features containing NAs.
`X`	the design matrix of the experiment with `samples x coefficients` dimensions. For `initBatchRandrot`, specify the design matrix without the batch variable. A warning is generated if `X[,coef.d]` does not have full rank, see Details.
`coef.h`	single integer or vector of integers specifying the "hypothesis coefficients" (`H0` coefficients). `coef.h` should correspond to the last columns in `X` (see `Details`). If available, `attr(X, "coef.h")` is used, see `contrastModel`. By default, all coefficients are set as `H0` coefficients. If `coef.h` is set `-1`, no coefficient is set as `H0` coefficient.
`weights`	numerical matrix of finite positive weights > 0 (as in weighted least squares regression. Dimensions must be equal to dimensions of `Y`.
`cormat`	the sample correlation matrix with `samples x samples` dimensions. Must be a real symmetric positive-definite square matrix. See `Details` for usage in `initBatchRandrot`.
`batch`	Batch covariate of the same length as `ncol(Y)`.

Details

This function performs basic initial checks and preparatory calculations for random rotation data generation. Nomenclature of variables is mainly as in (Langsrud 2005) and (Hettegger et al. 2021). See also package vignette for application examples.

Y can also be a list with elements E, design and weights. Y$E is thereby used as Y, Y$design is used as X and Y$weights is used as weights. By this, the functions are compatible with results from e.g. voom (limma package), see Examples.

coef.h specifies the model coefficients associated with the null hypothesis ("hypothesis coefficients"). All other model coefficients are considered as "determined coefficients" coef.d (Langsrud 2005). The design matrix is rearranged so that coef.h correspond to the last columns of the design matrix and coef.d correspond to the first columns of the design matrix. This is necessary for adequate transformation of the combined null-hypothesis $H_{0}: \beta_{coef.h} = 0$ by QR decomposition. If X[,coef.d] does not have full rank, a warning is generated and coef.d is set to coef.d <- seq_len(qr(X[,coef.d])$rank).

Weights must be finite positive numerics greater zero. This is necessary for model (QR) decomposition and for back transformation of the rotated data into the original variance structure, see also randrot. Weights as estimated e.g. by voom (Law et al. 2014) are suitable and can be used without further processing. Note that due to the whitening transformation (i.e. by using the arguments weights and/or cormat) the rank of the transformed (whitened) design matrix X could change (become smaller), which could become dangerous for the fitting procedures. If you get errors using weights and/or cormat, try the routine without using weights and/or cormat to exclude this source of errors.

The following section provides a brief summary how rotations are calculated. A more general introduction is given in (Langsrud 2005). For reasons of readability, we omit writing %*% for matrix multiplication and write * for transposed matrix. The rotation is done by multiplying the features x samples data matrix Y with the transpose of the restricted random rotation matrix Rt

Rt = Xd Xd* + [Xh Xe] R [Xh Xe]*

with R being a (reduced) random rotation matrix and Xd, Xh and Xe being columns of the full QR decomposition of the design matrix X. [Xd Xh Xe] = qr.Q(qr(X), complete = TRUE), where Xd correspond to columns coef.d, Xh to columns coef.h and Xe to the remaining columns.

If weights and/or cormat are specified, each feature Y[i,] and the design matrix X are whitening transformed before rotation. The whitening matrix T is defined as T = solve(C) w, where solve(C) is the inverse Cholesky decompostion of the correlation matrix (cormat = CC*) and w is a diagonal matrix of the square roots of the sample weights for the according feature (w = diag(sqrt(weights[i,]))).

The rotated data for one feature y.r[i,] is thus calculated as

y.r[i,] = ( solve(T) Rt T (y[i,])* )* and [Xd Xh Xe] = qr.Q(qr(TX), complete = TRUE)

For weights = NULL and cormat = NULL, T is the identity matrix.

Note that a separate QR decomposition is calculated for each feature if weights are specified. The restricted random orthogonal matrix Rt is calculated with the same reduced random orthogonal matrix R for all features.

When using initBatchRandrot, initRandrot is called for each batch separately. When using initBatchRandrot with cormat, cormat needs to be a list of correlation matrices with one matrix for each batch. Note that this implicitly assumes a block design of the sample correlation matrix, where sample correlation coefficients between batches are zero. For a more general sample correlation matrix, allowing non-zero sample correlation coefficients between batches, see package vignette. Batches are split according to split(seq_along(batch), batch).

Value

An initialised initRandrot, initRandrotW or initBatchRandrot object.

Author(s)

Peter Hettegger

References

Hettegger P, Vierlinger K, Weinhaeusel A (2021). “Random rotation for identifying differentially expressed genes with linear models following batch effect correction.” Bioinformatics. ISSN 1367-4803, doi:10.1093/bioinformatics/btab063.

Langsrud O (2005). “Rotation tests.” Statistics and Computing, 15(1), 53–60. ISSN 09603174, doi:10.1007/s11222-005-4789-5.

Law CW, Chen Y, Shi W, Smyth GK (2014). “Voom: Precision weights unlock linear model analysis tools for RNA-seq read counts.” Genome Biology, 15(2), 1–17. ISSN 1474760X, doi:10.1186/gb-2014-15-2-r29, http://www.ncbi.nlm.nih.gov/pubmed/24485249.

Examples

# For further examples see '?rotateStat' and package vignette.

# Example 1: Compatibility with limma::voom

## Not run: 
v <- voom(counts, design)
ir <- initRandrot(v)
## End(Not run)

# Example 2:

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1
# See '?rotateStat'
# For further examples see '?rotateStat' and package vignette.

# Example 1: Compatibility with limma::voom

## Not run: 
v <- voom(counts, design)
ir <- initRandrot(v)
## End(Not run)

# Example 2:

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1
# See '?rotateStat'

Initialised random rotation class

Description

List-based S4 class containing all information necessary to generate randomly rotated data with the randrot method. initRandrot and initRandrotW objects are created with the initRandrot method.

initRandrotW is organised as its base class initRandrot, altough some components are changed or added.

Components

The following components are included as list elements:

X: Original (non-transformed) design matrix.
Xhe, Xhe.Y.w, Yd: Pre-multiplied matrix products needed for generation of rotated data (randrot).
coef.h, coef.d: Indices of $H_0$ coefficients (coef.h or "hypothesis coefficients") and indices of all other coefficients (coef.d or "determined coefficients").
cormat: Correlation matrix, see initRandrot.
tcholC: Cholesky decomposition of cormat: cormat = crossprod(tcholC).
rank: Rank of the qr decomposition of (transformed/whitened) X

The following components are changed or added in initRandrotW-class as compared to initRandrot-class:

decomp.list: List containing Xd, Xhe and rank of the transformed/whitened design matrix for each feature, see also X_decomp.
w: Numeric matrix with dimensions features x samples containing component wise square root of the weight matrix, see initRandrot.

Author(s)

Peter Hettegger

Calculate resampling based p-values and FDRs

Description

This function calculates either (1) resampling based p-values with subsequent p-value adjustment using stats::p.adjust or (2) resampling based false-discovery-rates (FDRs) for rotated statistics from a rotateStat object.

Usage

pFdr(obj, method = "none", pooled = TRUE, na.rm = FALSE, beta = 0.05)
pFdr(obj, method = "none", pooled = TRUE, na.rm = FALSE, beta = 0.05)

Arguments

`obj`	A `rotateStat` object as returned by `rotateStat`.
`method`	Can be either `"none"` (default), `"fdr.q"`, `"fdr.qu"` or any term that can be passed as `method` argument to `stats::p.adjust`, see `Details`. If `method = "none"`, resampling based p-values without further adjustment are calculated.
`pooled`	`logical`. `TRUE` (default) if marginal distributions are exchangeable for all features so that rotated stats can be pooled, see `Details`.
`na.rm`	`logical`. `NA` values are ignored if set `TRUE`. `NA` values should be avoided and could e.g. be removed by imputation in original data or by removing features that contain `NA` values. Few `NA` values do not have a large effect, but many `NA` values can lead to wrong estimations of p-values and FDRs. We highly recommend avoiding `NA` values.
`beta`	`numeric` between `0` and `1`. Corresponds to beta in (Yekutieli and Benjamini 1999).

Details

Larger values of obj$s0 are considered more significant when compared to the empirical distribution. E.g. for calculation of resampling based p-values (with pooled = FALSE) we in principle use p.val <- (rowSums(obj$stats >= obj$s0)+1)/(ncol(obj$stats)+1) according to (Phipson and Smyth 2010).

method = "fdr.q" and method = "fdr.qu" are resampling based fdr estimates and can only be used with pooled = TRUE. method = "fdr.q" is the FDR local estimator and method = "fdr.qu" is the FDR upper limit, see (Reiner et al. 2003; Yekutieli and Benjamini 1999). For all other method arguments resampling based p-values are calculated and passed to stats::p.adjust for p-value adjustment. So these methods provide resampling based p-values with (non-resampling based) p-value adjustment. method = "fdr.q" and method = "fdr.qu" were adapted from package fdrame (Benjamini et al. 2019; Reiner et al. 2003).

When pooled = TRUE, marginal distributions of the test statistics are considered exchangeable for all features. The resampling based p-values of each feature are then calculated from all rotated statistics (all features, all rotations). For these cases, if the number of features is reasonably large, usually only few resamples (argument R in rotateStat) are required. We want to emphasize that in order for the marginal distributions to be exchangeable, the statistics must be a pivotal quantity (i.e. it must be scale independent). Pivotal quantities are e.g. t values. Using e.g. linear models with coef as statistics is questionable if the different features are measured on different scales. The resampled coefficients then have different variances and pooled = TRUE is not applicable. We thus highly recommend using pivotal quantities as statistics in rotateStat if possible.

When pooled = FALSE the resampling based p-values are calculcated for each feature separately. This is required if one expects the resampling based statistics to be distributed differently for individual features. For most common applications this should not be the case and the marginal distribution are exchangeable for all features, hence pooled = TRUE by default.

If method = "fdr.q" or method = "fdr.qu" and weights were specified when initialising the random rotation object (see parameter initialised.obj in rotateStat), a warning is displayed. The correlation structure (dependence structure) of linear model coefficients between different features is not generally preserved if different weights are used for different features. Methods fdr.q and fdr.qu rely on preserved correlation structure of dependent statistics and thus should not be used if statistics based on model coefficients (e.g. t statistics of model coefficients) are used in combination with different weights.

P-values and FDRs are calculated for each column of obj$s0 separately.

Value

A numeric matrix of corrected p-values or FDRs with dimension dim(obj$s0).

Author(s)

Peter Hettegger

References

Benjamini Y, Kenigsberg E, Reiner A, Yekutieli D (2019). fdrame: FDR adjustments of Microarray Experiments (FDR-AME). R package version 1.56.0.

Phipson B, Smyth GK (2010). “Permutation P-values should never be zero: Calculating exact P-values when permutations are randomly drawn.” Statistical Applications in Genetics and Molecular Biology, 9(1). ISSN 15446115, doi:10.2202/1544-6115.1585, 1603.05766, http://www.ncbi.nlm.nih.gov/pubmed/21044043.

Reiner A, Yekutieli D, Benjamini Y (2003). “Identifying differentially expressed genes using false discovery rate controlling procedures.” Bioinformatics, 19(3), 368–375. ISSN 13674803, doi:10.1093/bioinformatics/btf877, http://www.ncbi.nlm.nih.gov/pubmed/12584122.

Yekutieli D, Benjamini Y (1999). “Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics.” Journal of Statistical Planning and Inference, 82(1-2), 171–196. ISSN 03783758, doi:10.1016/S0378-3758(99)00041-5, http://www.ncbi.nlm.nih.gov/pubmed/83580500015.

Examples

# See also '?rotateStat':

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1

# Definition of the batch effect correction procedure with subsequent calculation
# of two-sided test statistics
statistic <- function(., batch, mod, coef){

  # The "capture.output" and "suppressMessages" simply suppress any output
  capture.output(suppressMessages(
    Y.tmp <- sva::ComBat(., batch = batch, mod)
  ))

  fit1 <- lm.fit(mod, t(Y.tmp))
  abs(coef(fit1)[coef,])
}

# We calculate test statistics for the second coefficient

res1 <- rotateStat(initialised.obj = init1,
                    R = 10,
                    statistic = statistic,
                    batch = pdata$batch, mod = mod1, coef = 2)

hist(pFdr(res1))
# See also '?rotateStat':

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1

# Definition of the batch effect correction procedure with subsequent calculation
# of two-sided test statistics
statistic <- function(., batch, mod, coef){

  # The "capture.output" and "suppressMessages" simply suppress any output
  capture.output(suppressMessages(
    Y.tmp <- sva::ComBat(., batch = batch, mod)
  ))

  fit1 <- lm.fit(mod, t(Y.tmp))
  abs(coef(fit1)[coef,])
}

# We calculate test statistics for the second coefficient

res1 <- rotateStat(initialised.obj = init1,
                    R = 10,
                    statistic = statistic,
                    batch = pdata$batch, mod = mod1, coef = 2)

hist(pFdr(res1))

Quantile-Quantile plot of data sample against uniform theoretical quantiles

Description

qqunif produces a QQ plot of the values in ps against the theoretical quantiles of the uniform distribution.

Usage

qqunif(
  ps,
  log = "xy",
  pch = 20,
  xlab = "theoretical quantiles",
  ylab = "sample quantiles",
  plot.it = TRUE,
  ...
)
qqunif(
  ps,
  log = "xy",
  pch = 20,
  xlab = "theoretical quantiles",
  ylab = "sample quantiles",
  plot.it = TRUE,
  ...
)

Arguments

`ps`	`numeric` vector of values (e.g. p-values). Values must be between 0 and 1. Values like `NA`, NaN, Inf etc. produce an error.
`log`	`character` indicating whether axis should be plotted in log scale. Either `""`, `"x"`, `"y"` or `"xy"`.
`pch`	Point symbol, see `par`.
`xlab`	Label for the x axis.
`ylab`	Label for the y axis.
`plot.it`	`logical` whether the result should be plotted.
`...`	Graphical parameters forwarded to `qqplot`

Details

This function can e.g. be used for comparing p-values against the uniform distribution. The log scale of the x and y axes allow a closer look at low p-values.

This function is a modified version of the examples in the qqnorm documentation page.

Value

A list of x and y coordinates, as in qqplot.

Examples

qqunif(runif(100))
qqunif(runif(100))

Random orthogonal matrix

Description

Generation of a random orthogonal n x n matrix.

Usage

randorth(n, type = c("orthonormal", "unitary"), I.matrix = FALSE)
randorth(n, type = c("orthonormal", "unitary"), I.matrix = FALSE)

Arguments

`n`	`numeric` of length 1 defining the dimensions of the `n x n` square matrix.
`type`	Either `"orthonormal"` or `"unitary"` defining whether a real orthonormal matrix or a complex unitary matrix should be returned.
`I.matrix`	If `TRUE`, the identity matrix is returned.

Details

A random orthogonal matrix R is generated in order that t(R) (for "orthonormal") or Conj(t(R)) (for "unitary") equals the inverse matrix of R.

This function was adapted from the pracma package (pracma::randortho).

The random orthogonal matrices are distributed with Haar measure over O(n), where O(n) is the set of orthogonal matrices of order n. The random orthogonal matrices are basically distributed "uniformly" in the space of random orthogonal matrices of dimension n x n. See also the Examples and (Stewart 1980; Mezzadri 2007).

Value

A random orthogonal matrix of dimension n x n.

Author(s)

Peter Hettegger

References

Mezzadri F (2007). “How to generate random matrices from the classical compact groups.” Notices of the American Mathematical Society, 54(5), 592–604. ISSN 1088-9477, 0609050, http://arxiv.org/abs/math-ph/0609050.

Stewart GW (1980). “The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators.” SIAM Journal on Numerical Analysis. ISSN 0036-1429, doi:10.1137/0717034.

Examples


# The following example shows the orthogonality of the random orthogonal matrix:
R1 <- randorth(4)
zapsmall(t(R1) %*% R1)

R1 <- randorth(4, "unitary")
zapsmall(Conj(t(R1)) %*% R1)

# The following example shows the distribution of 2-dimensional random orthogonal vectors
# on the unit circle.
tmp1 <- vapply(1:400, function(i)randorth(2)[,1], numeric(2))
plot(t(tmp1), xlab = "x", ylab = "y")

# The following example shows the orthogonality of the random orthogonal matrix:
R1 <- randorth(4)
zapsmall(t(R1) %*% R1)

R1 <- randorth(4, "unitary")
zapsmall(Conj(t(R1)) %*% R1)

# The following example shows the distribution of 2-dimensional random orthogonal vectors
# on the unit circle.
tmp1 <- vapply(1:400, function(i)randorth(2)[,1], numeric(2))
plot(t(tmp1), xlab = "x", ylab = "y")

Generate random permutation matrix for n samples

Description

Generate a random permutation matrix for n samples.

Usage

randpermut(n)
randpermut(n)

Arguments

`n`	Number of samples

Details

This methods generates an orthogonal matrix with only one entry in each row and column being 1, all other entries being 0.

Value

A random permutation matrix of dimension n x n

Author(s)

Peter Hettegger

Examples

tmp1 <- randpermut(5)
t(tmp1) %*% tmp1
tmp1 <- randpermut(5)
t(tmp1) %*% tmp1

Random rotation of initialised object

Description

Perform random data rotation of a previously initialised object (see initRandrot) associated with the null hypothesis $H_{0}: \beta_{coef.h} = 0$ .

Usage

randrot(object, ...)

## S4 method for signature 'initRandrot'
randrot(object, ...)

## S4 method for signature 'initRandrotW'
randrot(object, ...)

## S4 method for signature 'initBatchRandrot'
randrot(object, ...)
randrot(object, ...)

## S4 method for signature 'initRandrot'
randrot(object, ...)

## S4 method for signature 'initRandrotW'
randrot(object, ...)

## S4 method for signature 'initBatchRandrot'
randrot(object, ...)

Arguments

`object`	An initialised object of class `initRandrot-class` or `initBatchRandrot-class`.
`...`	further arguments passed to `randorth`

Details

This function generates a randomly rotated dataset from an initialised randrot object (see initRandrot). See also package vignette for application examples. Only the numerical matrix of rotated data is returned, no design matrix, weights or other info is return for efficiency purposes. Please consider that, if you e.g. use weights or if you use rotateStat, you may need to forward the design matrix X, weights etc. to subsequent analyses. See the example in rotateStat.

Details on the calculation of a rotated dataset are given in initRandrot, (Langsrud 2005) and (Hettegger et al. 2021).

Value

numeric matrix of rotated data under the specified combined null hypothesis.

Author(s)

Peter Hettegger

References

Examples

# For further examples see '?rotateStat' and package vignette.

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2,
                            batch = pdata$batch)
init1

### Fit model to original data

fit.orig <- lm.fit(mod1, t(edata))
head(t(coef(fit.orig)))

### Fit model to rotated data

edata.rot <- randrot(init1)
fit.rot <- lm.fit(mod1, t(edata.rot))
head(t(coef(fit.rot)))


# Note that the coefficients stay equal if we regress only on the
# non-hypothesis coefficients

mod0 <- model.matrix(~1, pdata)
fit.orig0 <- lm.fit(mod0, t(edata))
fit.rot0  <- lm.fit(mod0, t(edata.rot))
head(t(coef(fit.orig0)))
head(t(coef(fit.rot0)))

# For further examples see '?rotateStat' and package vignette.

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2,
                            batch = pdata$batch)
init1

### Fit model to original data

fit.orig <- lm.fit(mod1, t(edata))
head(t(coef(fit.orig)))

### Fit model to rotated data

edata.rot <- randrot(init1)
fit.rot <- lm.fit(mod1, t(edata.rot))
head(t(coef(fit.rot)))


# Note that the coefficients stay equal if we regress only on the
# non-hypothesis coefficients

mod0 <- model.matrix(~1, pdata)
fit.orig0 <- lm.fit(mod0, t(edata))
fit.rot0  <- lm.fit(mod0, t(edata.rot))
head(t(coef(fit.orig0)))
head(t(coef(fit.rot0)))

Defunct functions in package ‘randRotation’

Description

These functions are defunct and no longer available.

Details

Defunct functions are: df_estimate

Generate data rotations and calculate statistics on it

Description

This function generates rotations of data and calculates the provided statistic on each rotation and the non-rotated (original) data. This is the central function of the package.

Usage

rotateStat(
  initialised.obj,
  R = 10,
  statistic,
  ...,
  parallel = FALSE,
  BPPARAM = BiocParallel::bpparam()
)
rotateStat(
  initialised.obj,
  R = 10,
  statistic,
  ...,
  parallel = FALSE,
  BPPARAM = BiocParallel::bpparam()
)

Arguments

`initialised.obj`	An initialised random rotation object as returned by `initRandrot` and `initBatchRandrot`.
`R`	The number of resamples/rotations. Single `numeric` larger than 1.
`statistic`	A function which takes a data matrix (same dimensions as `Y` - see also `initRandrot`) as first argument and returns a statistic of interest. Any further arguments are passed to it with the `...` argument. We highly recommend using pivotal quantities as `statistic` if possible (see also `Details` in `pFdr`). Note that `pFdr` considers larger values of statistics as more significant, so one-tailed tests may require reversal of the sign and two-tailed tests may require taking absolute values, see `Examples`. The results of `statistic` for each resample are finally combined with `as.matrix` and `cbind`, so ensure that `statistic` returns either a vector or a matrix. Results with multiple columns are possible and handled adequately in subsequent functions (e.g. `pFdr`). Note that `statistic` must not necessarily be of the same length as `nrow(Y)`, but can also be e.g. a summary statistic of genes (like in gene set testing).
`...`	Further named arguments for `statistic` which are passed unchanged each time it is called. Avoid partial matching to arguments of `rotateStat`. See also the `Examples`.
`parallel`	`logical` if parallel computation should be performed, see details for use of parallel computing.
`BPPARAM`	An optional `BiocParallelParam` instance, see documentation of `BiocParallel` package of Bioconductor.

Details

The function takes an initialised randrot object (initRandrot) and a function that calculates a statistic on the data. The statistic function thereby takes the a matrix Y as first argument. Any further arguments are passed to it by ....

Together with pFdr, this function implements the workflow described in (Hettegger et al. 2021).

Be aware that only data is rotated (see also randrot), so any additional information including weights, X etc. need to be provided to statistic. See also package vignette and Examples.

Parallel processing is implemented with the BiocParallel package of Bioconductor. The default argument BiocParallel::bpparam() for BPPARAM returns the registered default backend. See package documentation for further information and usage options. If parallel = TRUE the function calls in statistic need to be called explicitly with package name and "::". So e.g. calling lmFit from the limma package is done with limma::lmFit(...), see also the examples in the package vignette.

Value

An object of class rotateStat.

Author(s)

Peter Hettegger

References

Examples

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1

# Definition of the batch effect correction procedure with subsequent calculation
# of two-sided test statistics
statistic <- function(., batch, mod, coef){

  # The "capture.output" and "suppressMessages" simply suppress any output
  capture.output(suppressMessages(
    Y.tmp <- sva::ComBat(., batch = batch, mod)
  ))

  fit1 <- lm.fit(mod, t(Y.tmp))
  abs(coef(fit1)[coef,])
}

# We calculate test statistics for the second coefficient

res1 <- rotateStat(initialised.obj = init1,
                    R = 10,
                    statistic = statistic,
                    batch = pdata$batch, mod = mod1, coef = 2)

hist(pFdr(res1))
#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1

# Definition of the batch effect correction procedure with subsequent calculation
# of two-sided test statistics
statistic <- function(., batch, mod, coef){

  # The "capture.output" and "suppressMessages" simply suppress any output
  capture.output(suppressMessages(
    Y.tmp <- sva::ComBat(., batch = batch, mod)
  ))

  fit1 <- lm.fit(mod, t(Y.tmp))
  abs(coef(fit1)[coef,])
}

# We calculate test statistics for the second coefficient

res1 <- rotateStat(initialised.obj = init1,
                    R = 10,
                    statistic = statistic,
                    batch = pdata$batch, mod = mod1, coef = 2)

hist(pFdr(res1))

Rotated object containing rotated and non-rotated statistics

Description

This list based class contains calculated statistics for the original data (s0) and rotated data (stats). See also rotateStat.

Components

s0: Calculated statistics for original (non-rotated) data as returned by the statistic function (rotateStat).
stats: List of length ncol.s containing statistics on rotated data for each column returned by the statistic function.
ncol.s: Number of columns returned by the statistic function.
R: Number of resamples/rotations.

Author(s)

Peter Hettegger

Show an Object

Description

Display the object by printing structured summary information.

Usage

## S4 method for signature 'initRandrot'
show(object)

## S4 method for signature 'initBatchRandrot'
show(object)

## S4 method for signature 'rotateStat'
show(object)
## S4 method for signature 'initRandrot'
show(object)

## S4 method for signature 'initBatchRandrot'
show(object)

## S4 method for signature 'rotateStat'
show(object)

Arguments

object

An object of class initRandrot-class, initRandrotW-class or initBatchRandrot-class.

Details

The show method always displays the original design matrix (X), not the transformed (whitened) versions.

Value

show returns an invisible NULL.

Extract model weights

Description

weights is a generic function which extracts fitting weights from objects returned by modeling functions. NOTE: This man page is for the weights S4 generic function defined in the randRotation package.

Usage

## S4 method for signature 'initRandrot'
weights(object, ...)

## S4 method for signature 'initBatchRandrot'
weights(object, ...)
## S4 method for signature 'initRandrot'
weights(object, ...)

## S4 method for signature 'initBatchRandrot'
weights(object, ...)

Arguments

`object`	An object of class `initRandrot-class`, `initRandrotW-class` or `initBatchRandrot-class`.
`...`	Kept for compatibility with the default method, see `?stats::weights`. For objects defined in package randRotation, this argument is currently not needed.

Value

Weights extracted from the object object. NULL if no weights were specified. See ?stats::weights for the value returned by the default method.

Examples

weights
showMethods("weights")
selectMethod("weights", "ANY")  # the default method

weights
showMethods("weights")
selectMethod("weights", "ANY")  # the default method

Decomposition of the design matrix for random rotation generation

Description

Full QR decomposition of the design matrix X. No argument checks are performed, see Details.

Usage

X_decomp(X = NULL, coef.d = seq_len(ncol(X) - 1))
X_decomp(X = NULL, coef.d = seq_len(ncol(X) - 1))

Arguments

`X`	Design matrix as generated by `model.matrix`.
`coef.d`	Non-`H0` coefficients.

Details

The design matrix X is QR decomposed into X = Xq Xr. By performing a full QR decomposition, Xq is automatically extended to a full basis. Xq is further split into Xd and Xhe, where Xd corresponds to columns coef.d (non-H0 or non-Null-Hypothesis columns) and Xhe correspond to all other columns (H0 and error columns), see initRandrot. No argument checks are performed for reasons of performance as this function is called frequently by initRandrot when weights are used. See (Hettegger et al. 2021) and (Langsrud 2005) for further details.

Value

A list object containing matrices Xd, Xhe and rank of the qr decomposition.

Author(s)

Peter Hettegger

References

Examples

design <- cbind(1, rep(0:1, 5))
X_decomp(design)
design <- cbind(1, rep(0:1, 5))
X_decomp(design)

Package 'randRotation'

Help Index

randRotation: Random Rotation Methods for High Dimensional Data with Batch Structure

Description

Details

Author(s)

References

See Also

Create transformed model matrix for contrast rotation

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Dimensions of an Object

Description

Usage

Arguments

Value

Dimnames of an Object

Description

Usage

Arguments

Value

Initialised random rotation batch object

Description

Components

Author(s)

Initialisation of a random rotation Object

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Initialised random rotation class

Description

Components

Author(s)

Calculate resampling based p-values and FDRs

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Quantile-Quantile plot of data sample against uniform theoretical quantiles

Description

Usage

Arguments

Details

Value

Examples

Random orthogonal matrix

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Generate random permutation matrix for n samples

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Random rotation of initialised object

Description