Package 'xcore' reference manual

Title:	xcore expression regulators inference
Description:	xcore is an R package for transcription factor activity modeling based on known molecular signatures and user's gene expression data. Accompanying xcoredata package provides a collection of molecular signatures, constructed from publicly available ChiP-seq experiments. xcore use ridge regression to model changes in expression as a linear combination of molecular signatures and find their unknown activities. Obtained, estimates can be further tested for significance to select molecular signatures with the highest predicted effect on the observed expression changes.
Authors:	Maciej Migdał [aut, cre] , Bogumił Kaczkowski [aut]
Maintainer:	Maciej Migdał <[email protected]>
License:	GPL-2
Version:	1.11.0
Built:	2025-03-30 06:15:26 UTC
Source:	https://github.com/bioc/xcore

re-export magrittr pipe operator

Description

re-export magrittr pipe operator

Add molecular signatures to MultiAssayExperiment

Description

addSignatures extends mae by adding to it new experiments. Rows consistency is ensured by taking an intersection of rows after new experiments are added.

Usage

addSignatures(mae, ..., intersect_rows = TRUE)
addSignatures(mae, ..., intersect_rows = TRUE)

Arguments

`mae`	MultiAssayExperiment object.
`...`	named experiments to be added to `mae`.
`intersect_rows`	logical flag indicating if only common rows across experiments should be included. Only set to `FALSE` if you know what you are doing.

Value

MultiAssayExperiment object with new experiments added.

Examples

data("rinderpest_mini", "remap_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)
mae <- addSignatures(mae, remap = remap_mini)

data("rinderpest_mini", "remap_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)
mae <- addSignatures(mae, remap = remap_mini)

Apply function over groups of columns

Description

Returns a array obtained by applying a function to rows of submatrices of the input matrix, where the submatrices are divided into specified groups of columns.

Usage

applyOverColumnGroups(mat, groups, f, ...)
applyOverColumnGroups(mat, groups, f, ...)

Arguments

`mat`	a matrix.
`groups`	a vector giving columns grouping.
`f`	function to be applied.
`...`	optional arguments to `f`.

Value

a matrix of dimensions nrow(mat) x nlevels(groups).

Apply function over selected column in list of data frames

Description

applyOverDFList operates on a list of data frames where all data frames has the same size and columns. Column of interest is extracted from each data frame and column binded in groups, next fun is applied over rows. Final result is a matrix with result for each group on a separate column. Function is parallelized over groups.

Usage

applyOverDFList(list_of_df, col_name, fun, groups)
applyOverDFList(list_of_df, col_name, fun, groups)

Arguments

`list_of_df`	list of `data.frame`s.
`col_name`	string specifying column in `data.frame`s to apply `fun` on.
`fun`	function to apply, should take a single vector as a argument.
`groups`	factor defining how elements of `list_of_df` should be grouped.

Value

matrix with nrow(list_of_df[[1]]) rows and nlevels(groups) columns.

Transform design matrix to factor

Description

Transform design matrix to factor

Usage

design2factor(design)
design2factor(design)

Arguments

design

design matrix

Value

factor

Examples

## Not run: 
design <- matrix(data = c(1, 1, 0, 0, 0, 0, 1, 1),
                 nrow = 4,
                 ncol = 2,
                 dimnames = list(c(paste("sample", 1:4)), c("gr1", "gr2")))
design2factor(design)

## End(Not run)

## Not run: 
design <- matrix(data = c(1, 1, 0, 0, 0, 0, 1, 1),
                 nrow = 4,
                 ncol = 2,
                 dimnames = list(c(paste("sample", 1:4)), c("gr1", "gr2")))
design2factor(design)

## End(Not run)

Estimate linear models goodness of fit statistic

Description

Estimate goodness of fit statistic of penalized linear regression models. Works with different goodness of fit statistic functions.

Usage

estimateStat(x, y, u, s, method = "cv", nfold = 10, statistic = rsq, alpha = 0)
estimateStat(x, y, u, s, method = "cv", nfold = 10, statistic = rsq, alpha = 0)

Arguments

`x`	input matrix, of dimension nobs x nvars; each row is an observation vector. Can be in sparse matrix format (inherit from class `"sparseMatrix"` as in package `Matrix`)
`y`	response variable. Quantitative for `family="gaussian"`, or `family="poisson"` (non-negative counts). For `family="binomial"` should be either a factor with two levels, or a two-column matrix of counts or proportions (the second column is treated as the target class; for a factor, the last level in alphabetical order is the target class). For `family="multinomial"`, can be a `nc>=2` level factor, or a matrix with `nc` columns of counts or proportions. For either `"binomial"` or `"multinomial"`, if `y` is presented as a vector, it will be coerced into a factor. For `family="cox"`, preferably a `Surv` object from the survival package: see Details section for more information. For `family="mgaussian"`, `y` is a matrix of quantitative responses.
`u`	offset vector as in `glmnet`. `"U"` experiment in mae.
`s`	user supplied lambda.
`method`	currently only cross-validation is implemented.
`nfold`	number of fold to use in cross-validation.
`statistic`	function computing goodness of fit statistic. Should accept `y`, `x`, `offset` arguments and return a numeric vector of the same length. See `rsq`, `mse` for examples.
`alpha`	The elasticnet mixing parameter, with $0\le\alpha\le 1$ . The penalty is defined as $(1-\alpha)/2\|\|\beta\|\|_2^2+\alpha\|\|\beta\|\|_1.$ `alpha=1` is the lasso penalty, and `alpha=0` the ridge penalty.

Value

numeric vector of statistic estimates.

Filter signatures by coverage

Description

Filter signatures overlapping low or high number of promoters. Useful to get rid of signatures that have very low variance.

Usage

filterSignatures(
  mae,
  min = 0.05,
  max = 0.95,
  ref_experiment = "Y",
  omit_experiments = c("Y", "U")
)
filterSignatures(
  mae,
  min = 0.05,
  max = 0.95,
  ref_experiment = "Y",
  omit_experiments = c("Y", "U")
)

Arguments

`mae`	MultiAssayExperiment object.
`min`	length one numeric between 0 and 1 defining minimum promoter coverage for the signature to pass filtering.
`max`	length one numeric between 0 and 1 defining maximum promoter coverage for the signature to pass filtering.
`ref_experiment`	string giving name of experiment to use for inferring total number of promoters.
`omit_experiments`	character giving names of experiments to exclude from filtering.

Value

MultiAssayExperiment object with selected experiments filtered.

Examples

data("rinderpest_mini", "remap_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)
mae <- addSignatures(mae, remap = remap_mini)
mae <- filterSignatures(mae)

data("rinderpest_mini", "remap_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)
mae <- addSignatures(mae, remap = remap_mini)
mae <- filterSignatures(mae)

Combine p-values using Fisher method

Description

Fisher's method is a meta-analysis technique used to combine the results from independent statistical tests with the same hypothesis (Wikipedia article).

Usage

fisherMethod(p.value, lower.tail = FALSE, log.p = TRUE)
fisherMethod(p.value, lower.tail = FALSE, log.p = TRUE)

Arguments

`p.value`	a numeric vector of p-values to combine.
`lower.tail`	logical; if TRUE (default), probabilities are $P[X \le x]$ , otherwise, $P[X > x]$ .
`log.p`	logical; if TRUE, probabilities p are given as log(p).

Value

a number giving combined p-value.

Calculate regions coverage

Description

getCoverage calculates coverage of regions (rows in interaction matrix) by features (columns). It is possible to specify features grouping variable gr then coverage tells how many distinct groups the region overlap with.

Usage

getCoverage(mat, gr)
getCoverage(mat, gr)

Arguments

`mat`	dgCMatrix interaction matrix such as produced by `getInteractionMatrix`.
`gr`	factor specifying features groups. Must have length equal to number of columns in `mat`.

Value

Numeric vector.

Examples

data("remap_mini")
y <- colnames(remap_mini)

# simple coverage
gr <- seq_along(y) %>% as.factor()
getCoverage(remap_mini, gr)

# per cell type coverage
gr <- sub(".*\\.", "", y) %>% as.factor()
getCoverage(remap_mini, gr)

data("remap_mini")
y <- colnames(remap_mini)

# simple coverage
gr <- seq_along(y) %>% as.factor()
getCoverage(remap_mini, gr)

# per cell type coverage
gr <- sub(".*\\.", "", y) %>% as.factor()
getCoverage(remap_mini, gr)

Compute interaction matrix

Description

getInteractionMatrix construct interaction matrix between two Granges objects. Names of object a became row names and names of b column names.

Usage

getInteractionMatrix(a, b, ext = 500, count = FALSE)
getInteractionMatrix(a, b, ext = 500, count = FALSE)

Arguments

`a`	GRanges object.
`b`	GRanges object.
`ext`	Integer specifying number of base pairs the `a` coordinates should be extended in upstream and downstream directions.
`count`	Logical indicating if matrix should hold number of overlaps between `a` and `b` or if FALSE presence / absence indicators.

Value

Sparse matrix of class dgCMatrix, with rows corresponding to a and columns to b. Each cell holds a number indicating how many times a and b overlapped.

Examples

a <- GenomicRanges::GRanges(
  seqnames = c("chr20", "chr4"),
  ranges = IRanges::IRanges(
    start = c(62475984L, 173530220L),
    end = c(62476001L, 173530236L)),
  strand = c("-", "-"),
  name = c("hg19::chr20:61051039..61051057,-;hg_188273.1",
           "hg19::chr4:174451370..174451387,-;hg_54881.1"))
b <- GenomicRanges::GRanges(
  seqnames = c("chr4", "chr20"),
  ranges = IRanges::IRanges(
    start = c(173530229L, 63864270L),
    end = c(173530236L, 63864273L)),
  strand = c("-", "-"),
  name = c("HAND2", "GATA5"))
getInteractionMatrix(a, b)

a <- GenomicRanges::GRanges(
  seqnames = c("chr20", "chr4"),
  ranges = IRanges::IRanges(
    start = c(62475984L, 173530220L),
    end = c(62476001L, 173530236L)),
  strand = c("-", "-"),
  name = c("hg19::chr20:61051039..61051057,-;hg_188273.1",
           "hg19::chr4:174451370..174451387,-;hg_54881.1"))
b <- GenomicRanges::GRanges(
  seqnames = c("chr4", "chr20"),
  ranges = IRanges::IRanges(
    start = c(173530229L, 63864270L),
    end = c(173530236L, 63864273L)),
  strand = c("-", "-"),
  name = c("HAND2", "GATA5"))
getInteractionMatrix(a, b)

Calculate variance weighted average coefficients matrix

Description

Calculate variance weighted average coefficients matrix

Usage

getVarianceWeightedAvgCoeff(pvalues, groups)
getVarianceWeightedAvgCoeff(pvalues, groups)

Arguments

`pvalues`	list of `data.frame`s outputs from `ridgePvals`.
`groups`	factor giving the grouping.

Value

variance weighted average coefficients matrix

Check if argument is a binary flag

Description

Check if argument is a binary flag

Usage

isTRUEorFALSE(x)
isTRUEorFALSE(x)

Arguments

`x`	object to test

Value

binary flag

Calculate Mean Absolute Error

Description

Calculate Mean Absolute Error

Usage

mae(y, yhat, ...)
mae(y, yhat, ...)

Arguments

`y`	numeric vector of observed expression values.
`yhat`	numeric vector of predicted expression values.
`...`	not used.

Value

numeric vector

Helper summarizing MAE object

Description

Helper summarizing MAE object

Usage

maeSummary(mae)
maeSummary(mae)

Arguments

mae

MultiAssayExperiment object.

Value

named list giving number of rows and columns, overall mean and standard deviation in mae's experiments.

Gene expression modeling pipeline

Description

modelGeneExpression uses parallelization if parallel backend is registered. For that reason we advise against passing parallel argument to internally called cv.glmnet routine.

Usage

modelGeneExpression(
  mae,
  yname = "Y",
  uname = "U",
  xnames,
  design = NULL,
  standardize = TRUE,
  parallel = FALSE,
  pvalues = TRUE,
  precalcmodels = NULL,
  ...
)
modelGeneExpression(
  mae,
  yname = "Y",
  uname = "U",
  xnames,
  design = NULL,
  standardize = TRUE,
  parallel = FALSE,
  pvalues = TRUE,
  precalcmodels = NULL,
  ...
)

Arguments

`mae`	MultiAssayExperiment object such as produced by `prepareCountsForRegression`.
`yname`	string indicating experiment in `mae` to use as the expression input.
`uname`	string indicating experiment in `mae` to use as the basal expression level.
`xnames`	character indicating experiments in `mae` to use as molecular signatures.
`design`	matrix giving the design matrix for the samples. Default (`NULL`) is to use design found in `mae` metadata. Columns corresponds to samples groups and rows to samples names. Only samples included in the design will be processed.
`standardize`	logical flag indicating if the molecular signatures should be scaled. Advised to be set to `TRUE`.
`parallel`	parallel argument to internally used `cv.glmnet` function. Advised to be set to `FALSE` as it might interfere with parallelization used in `modelGeneExpression`.
`pvalues`	logical flag indicating if significance testing for the estimated molecular signatures activities should be performed.
`precalcmodels`	optional list of precomputed `'cv.glmnet'` objects for each molecular signature and sample. The elements of this list should be matching the `xnames` vector. Each of those elements should be a named list holding `'cv.glmnet'` objects for each sample. If provided those models will be used instead of running regression from scratch.
`...`	arguments passed to glmnet::cv.glmnet.

Details

For speeding up the calculations consider lowering number of folds used in internally run cv.glmnet by specifying nfolds argument. By default 10 fold cross validation is used.

The relationship between the expression (Y) and molecular signatures (X) is described using linear model formulation. The pipeline attempts to model the change in expression between basal expression level (u) and each sample, with the goal of finding the unknown molecular signatures activities. Linear models are fit using popular ridge regression implementation glmnet (Friedman, Hastie, and Tibshirani 2010).

If pvalues is set to TRUE the significance of the estimated molecular signatures activities is tested using methodology introduced by (Cule, Vineis, and De Iorio 2011) which original implementation can be found in ridge-package.

If replicates are available the signatures activities estimates and their standard error estimates can be combined. This is done by averaging signatures activities estimates and pooling their significance estimates using Stouffer's method for the Z-scores and Fisher's method for the p-values.

For detailed pipeline description we refer interested user to paper accompanying this package.

Value

Nested list with following elements

regression_models: Named list with elements corresponding to signatures specified in xnames. Each of these is a list holding 'cv.glmnet' objects corresponding to each sample.
pvalues: Named list with elements corresponding to signatures specified in xnames. Each of these is a list holding data.frame of signature's p-values and test statistics estimated for each sample.
zscore_avg: Named list with elements corresponding to signatures specified in xnames. Each of these is a matrix holding replicate average Z-scores with columns corresponding to groups in the design.
coef_avg: Named list with elements corresponding to signatures specified in xnames. Each of these is a matrix holding replicate averaged signatures activities with columns corresponding to groups in the design.
results: Named list of a data.frames holding replicate average molecular signatures, overall molecular signatures Z-score and p-values calculated over groups using Stouffer's and Fisher's methods.

Examples

data("rinderpest_mini", "remap_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)
mae <- addSignatures(mae, remap = remap_mini)
mae <- filterSignatures(mae)
res <- modelGeneExpression(
  mae = mae,
  xnames = "remap",
  nfolds = 5)

data("rinderpest_mini", "remap_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)
mae <- addSignatures(mae, remap = remap_mini)
mae <- filterSignatures(mae)
res <- modelGeneExpression(
  mae = mae,
  xnames = "remap",
  nfolds = 5)

Ridge regression wrapper for modelGeneExpression

Description

Internal function used in modelGeneExpression. It runs ridge regression parallelly across signatures and samples as specified by experiment design.

Usage

modelGeneExpression_ridge_regression_wraper(
  mae,
  yname,
  uname,
  xnames,
  groups,
  standardize,
  parallel,
  precalcmodels,
  ...
)
modelGeneExpression_ridge_regression_wraper(
  mae,
  yname,
  uname,
  xnames,
  groups,
  standardize,
  parallel,
  precalcmodels,
  ...
)

Arguments

`mae`	MultiAssayExperiment object such as produced by `prepareCountsForRegression`.
`yname`	string indicating experiment in `mae` to use as the expression input.
`uname`	string indicating experiment in `mae` to use as the basal expression level.
`xnames`	character indicating experiments in `mae` to use as molecular signatures.
`groups`	factor representation of design matrix.
`standardize`	logical flag indicating if the molecular signatures should be scaled. Advised to be set to `TRUE`.
`parallel`	parallel argument to internally used `cv.glmnet` function. Advised to be set to `FALSE` as it might interfere with parallelization used in `modelGeneExpression`.
`precalcmodels`	optional list of precomputed `'cv.glmnet'` objects for each molecular signature and sample. The elements of this list should be matching the `xnames` vector. Each of those elements should be a named list holding `'cv.glmnet'` objects for each sample. If provided those models will be used instead of running regression from scratch.
`...`	arguments passed to glmnet::cv.glmnet.

Value

Named list with elements corresponding to signatures specified in xnames. Each of these is a list holding 'cv.glmnet' objects corresponding to each sample.

Statistical testing of ridge regression estimates wrapper for modelGeneExpression

Description

Internal function used in modelGeneExpression. It runs ridgePvals parallelly across signatures and samples as specified by experiment design.

Usage

modelGeneExpression_significance_testing_wraper(
  mae,
  yname,
  uname,
  xnames,
  groups,
  standardize,
  regression_models
)
modelGeneExpression_significance_testing_wraper(
  mae,
  yname,
  uname,
  xnames,
  groups,
  standardize,
  regression_models
)

Arguments

`mae`	MultiAssayExperiment object such as produced by `prepareCountsForRegression`.
`yname`	string indicating experiment in `mae` to use as the expression input.
`uname`	string indicating experiment in `mae` to use as the basal expression level.
`xnames`	character indicating experiments in `mae` to use as molecular signatures.
`groups`	factor representation of design matrix.
`standardize`	logical flag indicating if the molecular signatures should be scaled. Advised to be set to `TRUE`.
`regression_models`	Named list with elements corresponding to signatures specified in `xnames`. Each of these is a list holding `'cv.glmnet'` objects corresponding to each sample. Usually returned by `modelGeneExpression_ridge_regression_wraper`.

Value

Named list with elements corresponding to signatures specified in xnames. Each of these is a list holding data.frame of signature's p-values and test statistics estimated for each sample.

Calculate Mean Squared Error

Description

Calculate Mean Squared Error

Usage

mse(y, yhat, ...)
mse(y, yhat, ...)

Arguments

`y`	numeric vector of observed expression values.
`yhat`	numeric vector of predicted expression values.
`...`	not used.

Value

numeric vector

Process count matrix for expression modeling

Description

Expression counts are processed using edgeR following User's Guide. Shortly, counts for each sample are filtered for lowly expressed promoters, normalized for the library size and transformed into counts per million (CPM). Optionally, CPM are log2 transformed with addition of pseudo count. Basal level expression is calculated by averaging base_lvl samples expression values.

Usage

prepareCountsForRegression(
  counts,
  design,
  base_lvl,
  log2 = TRUE,
  pseudo_count = 1L,
  drop_base_lvl = TRUE
)
prepareCountsForRegression(
  counts,
  design,
  base_lvl,
  log2 = TRUE,
  pseudo_count = 1L,
  drop_base_lvl = TRUE
)

Arguments

`counts`	matrix of read counts.
`design`	matrix giving the design matrix for the samples. Columns corresponds to samples groups and rows to samples names.
`base_lvl`	string indicating group in `design` corresponding to basal expression level. The reference samples to which expression change will be compared.
`log2`	logical flag indicating if counts should be log2(counts per million) should be returned.
`pseudo_count`	integer count to be added before taking log2.
`drop_base_lvl`	logical flag indicating if `base_lvl` samples should be dropped from resulting MultiAssayExperiment object.

Value

MultiAssayExperiment object with two experiments:

U: matrix giving expression values averaged over basal level samples
Y: matrix of expression values

design with base_lvl dropped is stored in metadata and directly available for modelGeneExpression.

Examples

data("rinderpest_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)

data("rinderpest_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)

Create MultiAssayExperiment object for expression modeling

Description

regressionData orgnize expression data and experiment design into MultiAssayExperiment object that can be further used in xcore framework. Additionally, function calculate basal expression level, for latter use in expression modeling, by averaging base_lvl samples expression values.

Usage

regressionData(expr_mat, design, base_lvl, drop_base_lvl = TRUE)
regressionData(expr_mat, design, base_lvl, drop_base_lvl = TRUE)

Arguments

`expr_mat`	matrix of expression values.
`design`	matrix giving the design matrix for the samples. Columns corresponds to samples groups and rows to samples names.
`base_lvl`	string indicating group in `design` corresponding to basal expression level. The reference samples to which expression change will be compared.
`drop_base_lvl`	logical flag indicating if `base_lvl` samples should be dropped from resulting MultiAssayExperiment object.

Details

Note that regressionData does not apply any normalization or transformation to the input data! Use prepareCountsForRegression if you want to start with raw expression counts.

Value

MultiAssayExperiment object with two experiments:

U: matrix giving expression values averaged over basal level samples
Y: matrix of expression values

design with base_lvl dropped is stored in metadata and directly available for modelGeneExpression.

Examples

data("rinderpest_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- regressionData(
  expr_mat = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)

data("rinderpest_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- regressionData(
  expr_mat = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)

xcore example molecular signatures

Description

Molecular signatures data intended for use in xcore vignette and examples. It is build ReMap2020 molecular signatures constructed against FANTOM5 annotation, which can be found in xcoredata package. Here the data is only a subset limited to core promoters (promoters_f5_core) and randomly selected 600 signatures.

Usage

data(remap_mini)
data(remap_mini)

Format

A dgCMatrix with 14191 rows and 600 columns holding interaction matrix for subset of ReMap2020 molecular signatures against FANTOM5 annotation. Rows corresponds to FANTOM5 promoters and columns to signatures.

Calculate replicate variance weighted averaged Z-scores

Description

Replicate averaged Z-scores is calculated by dividing replicate average coefficient by replicate pooled standard error.

Usage

repVarianceWeightedAvgZscore(pvalues, groups)
repVarianceWeightedAvgZscore(pvalues, groups)

Arguments

`pvalues`	Data frame with `'se'` (standard error) and `'coef'` (coefficient) columns. Such as in `pvalues` output of `modelGeneExpression` .
`groups`	Factor giving group membership for samples in `pvalues`.

Value

Numeric matrix of averaged Z-scores. Columns correspond to groups and rows to predictors.

Significance testing in linear ridge regression

Description

Standard error estimation and significance testing for coefficients estimated in linear ridge regression. ridgePvals re-implement original method by (Cule et al. BMC Bioinformatics 2011.) found in ridge-package. This function is intended to use with cv.glmnet output.

Usage

ridgePvals(x, y, beta, lambda, standardizex = TRUE, svdX = NULL)
ridgePvals(x, y, beta, lambda, standardizex = TRUE, svdX = NULL)

Arguments

`x`	input matrix, same as used in `cv.glmnet`.
`y`	response variable, same as used in `cv.glmnet`.
`beta`	matrix of coefficients, estimated using `cv.glmnet`.
`lambda`	lambda value for which `beta` was estimated.
`standardizex`	logical flag for x variable standardization, should be set to same value as `standarize` flag in `cv.glmnet`.
`svdX`	optional singular-value decomposition of `x` matrix. One can be obtained using `link[base]{svd}`. Passing this argument omits internal call to `link[base]{svd}`, this is useful when calling `ridgePvals` repeatedly using same `x`.

Value

a data.frame with columns

coef: beta's names
se: beta's standard errors
tstat: beta's test statistic
pval: beta's p-values

xcore example expression data

Description

Expression data intended for use in xcore vignette and examples. It is build from FANTOM5's 293SLAM rinderpest infection time course dataset. Here the data is only a subset limited to core promoters (promoters_f5_core).

Usage

data(rinderpest_mini)
data(rinderpest_mini)

Format

A matrix with 14191 rows and 6 columns holding expression counts from CAGE-seq experiment. Rows corresponds to FANTOM5 promoters and columns to time points at which expression was measured 0 and 24 hours post infection.

Calculate $R^2$

Description

Calculate $R^2$

Usage

rsq(y, yhat, offset)
rsq(y, yhat, offset)

Arguments

`y`	numeric vector of observed expression values.
`yhat`	numeric vector of predicted expression values.
`offset`	numeric vector giving basal expression level.

Value

numeric vector

Simplify Interaction Matrix

Description

Simplify Interaction Matrix

Usage

simplifyInteractionMatrix(mat, alpha = 0.5, colname = NA)
simplifyInteractionMatrix(mat, alpha = 0.5, colname = NA)

Arguments

`mat`	dgCMatrix interaction matrix such as produced by `getInteractionMatrix`.
`alpha`	Number between 0 and 1 specifying voting threshold. Eg. for 3 column matrix alpha 0.5 will give voting criteria >= 2.
`colname`	character giving new column name.

Value

dgCMatrix

Combine Z-scores using Stouffer's method

Description

Stouffer's Z-score method is a meta-analysis technique used to combine the results from independent statistical tests with the same hypothesis. It is closely related to Fisher's method, but operates on Z-scores instead of p-values (Wikipedia article).

Usage

stoufferZMethod(z)
stoufferZMethod(z)

Arguments

`z`	a numeric vector of Z-score to combine.

Value

a number giving combined Z-score.

Subset keeping missing

Description

Subset matrix keeping unmatched rows as NA.

Usage

subsetWithMissing(mat, rows)
subsetWithMissing(mat, rows)

Arguments

`mat`	matrix
`rows`	character

Value

a matrix

Translate counts matrix rownames

Description

translateCounts renames counts matrix rownames according to supplied dictionary. Function can handle many to one assignments by taking a sum or an average over counts rows. Other types of ambiguous assignments are not supported.

Usage

translateCounts(counts, dict)
translateCounts(counts, dict)

Arguments

`counts`	matrix of expression values.
`dict`	named character vector mapping `counts` rownames to new values. Values of vector should correspond to new desired rownames, and its names to current rownames.

Value

matrix of expression values with new rownames.

Examples

counts <- matrix(
data = c(5, 4, 3, 2),
nrow = 2,
   dimnames = list(
     c("ENSG00000130700", "ENSG00000089225"),
     c("treatment", "control")
   )
 )
dict <- c(ENSG00000130700 = "GATA5", ENSG00000089225 = "TBX5")
translateCounts(counts, dict)

counts <- matrix(
data = c(5, 4, 3, 2),
nrow = 2,
   dimnames = list(
     c("ENSG00000130700", "ENSG00000089225"),
     c("treatment", "control")
   )
 )
dict <- c(ENSG00000130700 = "GATA5", ENSG00000089225 = "TBX5")
translateCounts(counts, dict)

Package 'xcore'

Help Index

re-export magrittr pipe operator

Description

Add molecular signatures to MultiAssayExperiment

Description

Usage

Arguments

Value

Examples

Apply function over groups of columns

Description

Usage

Arguments

Value

Apply function over selected column in list of data frames

Description

Usage

Arguments

Value

Transform design matrix to factor

Description

Usage

Arguments

Value

Examples

Estimate linear models goodness of fit statistic

Description

Usage

Arguments

Value

Filter signatures by coverage

Description

Usage

Arguments

Value

Examples

Combine p-values using Fisher method

Description

Usage

Arguments

Value

Calculate regions coverage

Description

Usage

Arguments

Value

Examples

Compute interaction matrix

Description

Usage

Arguments

Value

Examples

Calculate variance weighted average coefficients matrix

Description

Usage

Arguments

Value

Check if argument is a binary flag

Description

Usage

Arguments

Value

Calculate Mean Absolute Error

Description

Usage

Arguments

Value

Helper summarizing MAE object

Description

Usage

Arguments

Value

Gene expression modeling pipeline

Description

Usage

Arguments

Details

Value