Package 'DEGreport' reference manual

Title:	Report of DEG analysis
Description:	Creation of ready-to-share figures of differential expression analyses of count data. It integrates some of the code mentioned in DESeq2 and edgeR vignettes, and report a ranked list of genes according to the fold changes mean and variability for each selected gene.
Authors:	Lorena Pantano [aut, cre], John Hutchinson [ctb], Victor Barrera [ctb], Mary Piper [ctb], Radhika Khetani [ctb], Kenneth Daily [ctb], Thanneer Malai Perumal [ctb], Rory Kirchner [ctb], Michael Steinbaugh [ctb], Ivo Zeller [ctb]
Maintainer:	Lorena Pantano <[email protected]>
License:	MIT + file LICENSE
Version:	1.43.0
Built:	2025-03-18 04:20:33 UTC
Source:	https://github.com/bioc/DEGreport

Deprecated functions in package DEGreport

Description

These functions are provided for compatibility with older versions of DEGreport only and will be defunct at the next release.

Details

The following functions are deprecated and will be made defunct; use the replacement indicated below:

degRank, degPR, degBIcmd, degBI, degFC, degComb, degNcomb: DESeq2::lcfShrink. This function was trying to avoid big FoldChange in variable genes. There are other methods nowadays like lcfShrink function. DEGreport

Author(s)

Maintainer: Lorena Pantano [email protected]

Other contributors:

John Hutchinson [contributor]
Victor Barrera [contributor]
Mary Piper [contributor]
Radhika Khetani [contributor]
Kenneth Daily [contributor]
Thanneer Malai Perumal [contributor]
Rory Kirchner [contributor]
Michael Steinbaugh [contributor]
Ivo Zeller [contributor]

Create report of RNAseq DEG anlaysis

Description

This function get the count matrix, pvalues, and FC of a DEG analysis and create a report to help to detect possible problems with the data.

Usage

createReport(g, counts, tags, pvalues, path, pop = 400, name = "DEGreport")
createReport(g, counts, tags, pvalues, path, pop = 400, name = "DEGreport")

Arguments

`g`	Character vector with the group the samples belong to.
`counts`	Matrix with counts for each samples and each gene. Should be same length than pvalues vector.
`tags`	Genes of DEG analysis
`pvalues`	pvalues of DEG analysis
`path`	path to save the figure
`pop`	random genes for background
`name`	name of the html file

Value

A HTML file with all figures and tables

Method to get all table stored for an specific comparison

Description

Method to get all table stored for an specific comparison

Usage

deg(object, value = NULL, tidy = NULL, top = NULL, ...)

## S4 method for signature 'DEGSet'
deg(object, value = NULL, tidy = NULL, top = NULL, ...)
deg(object, value = NULL, tidy = NULL, top = NULL, ...)

## S4 method for signature 'DEGSet'
deg(object, value = NULL, tidy = NULL, top = NULL, ...)

Arguments

`object`	DEGSet
`value`	Character to specify which table to use.
`tidy`	Return data.frame, tibble or original class.
`top`	Limit number of rows to return. Default: All.
`...`	Other parameters to pass for other methods.

Author(s)

Lorena Pantano

References

Testing if top is whole number or not comes from: https://stackoverflow.com/a/3477158

Distribution of gene ratios used to calculate Size Factors.

Description

This function check the median ratio normalization used by DESeq2 and similarly by edgeR to visualy check whether the median is the best size factor to represent depth.

Usage

degCheckFactors(counts, each = FALSE)
degCheckFactors(counts, each = FALSE)

Arguments

`counts`	Matrix with counts for each samples and each gene. row number should be the same length than pvalues vector.
`each`	Plot each sample separately.

Details

This function will plot the gene ratios for each sample. To calculate the ratios, it follows the simliar logic than DESeq2/edgeR uses, where the expression of each gene is divided by the mean expression of that gene. The distribution of the ratios should approximate to a normal shape and the factors should be similar to the median of distributions. If some samples show different distribution, the factor may be bias due to some biological or technical factor.

Value

ggplot2 object

References

Code to calculate size factors comes from DESeq2::estimateSizeFactorsForMatrix().

Examples

data(humanGender)
library(SummarizedExperiment)
degCheckFactors(assays(humanGender)[[1]][, 1:10])
data(humanGender)
library(SummarizedExperiment)
degCheckFactors(assays(humanGender)[[1]][, 1:10])

Make nice colors for metadata

Description

The function will take a metadata table and use Set2 palette when number of levels is > 3 or a set or orange/blue colors other wise.

Usage

degColors(
  ann,
  col_fun = FALSE,
  con_values = c("grey80", "black"),
  cat_values = c("orange", "steelblue"),
  palette = "Set2"
)
degColors(
  ann,
  col_fun = FALSE,
  con_values = c("grey80", "black"),
  cat_values = c("orange", "steelblue"),
  palette = "Set2"
)

Arguments

`ann`	Data.frame with metadata information. Each column will be used to generate a palette suitable for the values in there.
`col_fun`	Whether to return a function for continuous variables (compatible with `ComplexHeatmap::HeatmapAnnotation()`) or the colors themself (comparible with [pheatmap::pheatmap())]). [pheatmap::pheatmap())]: R:pheatmap::pheatmap())
`con_values`	Color to be used for continuous variables.
`cat_values`	Color to be used for 2-levels categorical variables.
`palette`	Palette to use from `brewer.pal()` for multi-levels categorical variables.

Examples

data(humanGender)
library(DESeq2)
library(ComplexHeatmap)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:10, idx],
  colData(humanGender)[idx,], design=~group)
th <- HeatmapAnnotation(df = colData(dse),
                       col = degColors(colData(dse), TRUE))
Heatmap(log2(counts(dse)+0.5), top_annotation = th)

custom <- degColors(colData(dse), TRUE,
          con_values = c("white", "red"),
          cat_values = c("white", "black"),
          palette = "Set1")
th <- HeatmapAnnotation(df = colData(dse),
                        col = custom)
Heatmap(log2(counts(dse)+0.5), top_annotation = th)
data(humanGender)
library(DESeq2)
library(ComplexHeatmap)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:10, idx],
  colData(humanGender)[idx,], design=~group)
th <- HeatmapAnnotation(df = colData(dse),
                       col = degColors(colData(dse), TRUE))
Heatmap(log2(counts(dse)+0.5), top_annotation = th)

custom <- degColors(colData(dse), TRUE,
          con_values = c("white", "red"),
          cat_values = c("white", "black"),
          palette = "Set1")
th <- HeatmapAnnotation(df = colData(dse),
                        col = custom)
Heatmap(log2(counts(dse)+0.5), top_annotation = th)

Automatize the use of `results()` for multiple comparisons

Description

This function will extract the output of DESeq2::results() and DESeq2::lfcShrink() for multiple comparison using:

Usage

degComps(
  dds,
  combs = NULL,
  contrast = NULL,
  alpha = 0.05,
  skip = FALSE,
  type = "normal",
  pairs = FALSE,
  fdr = "default"
)
degComps(
  dds,
  combs = NULL,
  contrast = NULL,
  alpha = 0.05,
  skip = FALSE,
  type = "normal",
  pairs = FALSE,
  fdr = "default"
)

Arguments

`dds`	DESeq2::DESeqDataSet obcject.
`combs`	Optional vector indicating the coefficients or columns fom `colData(dds)` to create group comparisons.
`contrast`	Optional vector to specify contrast. See `DESeq2::results()`.
`alpha`	Numeric value used in independent filtering in `DESeq2::results()`.
`skip`	Boolean to indicate whether skip shrinkage. For instance when it comes from LRT method.
`type`	Type of shrinkage estimator. See `DESeq2::lfcShrink()`.
`pairs`	Boolean to indicate whether create all comparisons or only use the coefficient already created from `DESeq2::resultsNames()`.
`fdr`	type of fdr correction. `default` is FDR value, `lfdr-stat` is for local FDR using the statistics of the test, `lfdr-pvalue` is for local FDR using the p-value of the test. fdrtools needs to be installed and loaded by the user

Details

coefficients
contrast
Multiple columns in colData that match coefficients
Multiple columns in colData to create all possible contrasts

Value

DEGSet with unSrunken and Srunken results.

Author(s)

Lorena Pantano

Examples

library(DESeq2)
dds <- makeExampleDESeqDataSet(betaSD=1)
colData(dds)[["treatment"]] <- sample(colData(dds)[["condition"]], 12)
  design(dds) <-  ~ condition + treatment
dds <- DESeq(dds)
res <- degComps(dds, combs = c("condition", 2),
                contrast = list("treatment_B_vs_A", c("condition", "A", "B")))
# library(fdrtools)
#res <- degComps(dds,contrast = list("treatment_B_vs_A"),
#                fdr="lfdr-stat")
library(DESeq2)
dds <- makeExampleDESeqDataSet(betaSD=1)
colData(dds)[["treatment"]] <- sample(colData(dds)[["condition"]], 12)
  design(dds) <-  ~ condition + treatment
dds <- DESeq(dds)
res <- degComps(dds, combs = c("condition", 2),
                contrast = list("treatment_B_vs_A", c("condition", "A", "B")))
# library(fdrtools)
#res <- degComps(dds,contrast = list("treatment_B_vs_A"),
#                fdr="lfdr-stat")

Calculate the correlation relationshipt among all covariates in the metadata table

Description

This function will calculate the correlation among all columns in the metadata

Usage

degCorCov(metadata, fdr = 0.05, use_pval = FALSE, ...)
degCorCov(metadata, fdr = 0.05, use_pval = FALSE, ...)

Arguments

`metadata`	data.frame with samples metadata.
`fdr`	numeric value to use as cutoff to determine the minimum fdr to consider significant correlations between pcs and covariates.
`use_pval`	boolean to indicate to use p-value instead of FDR to hide non-significant correlation.
`...`	Parameters to pass to `ComplexHeatmap::Heatmap()`.

Value

: list: a) cor, data.frame with pair-wise correlations, pvalues, FDR b) corMat, data.frame with correlation matrix c) fdrMat, data.frame with FDR matrix b) plot, Heatmap plot of correlation matrix

Author(s)

: Lorena Pantano, Kenneth Daily and Thanneer Malai Perumal

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
cor <- degCorCov(colData(dse))
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
cor <- degCorCov(colData(dse))

Find correlation between pcs and covariates

Description

This function will calculate the pcs using prcomp function, and correlate categorical and numerical variables from metadata. The size of the dots indicates the importance of the metadata, for instance, when the range of the values is pretty small (from 0.001 to 0.002 in ribosimal content), the correlation results is not important. If black stroke lines are shown, the correlation analysis has a FDR < 0.05 for that variable and PC. Only significant variables according the linear model are colored. See details to know how this is calculated.

Usage

degCovariates(
  counts,
  metadata,
  fdr = 0.1,
  scale = FALSE,
  minPC = 5,
  correlation = "kendall",
  addCovDen = TRUE,
  legacy = FALSE,
  smart = TRUE,
  method = "lm",
  plot = TRUE
)
degCovariates(
  counts,
  metadata,
  fdr = 0.1,
  scale = FALSE,
  minPC = 5,
  correlation = "kendall",
  addCovDen = TRUE,
  legacy = FALSE,
  smart = TRUE,
  method = "lm",
  plot = TRUE
)

Arguments

`counts`	normalized counts matrix
`metadata`	data.frame with samples metadata.
`fdr`	numeric value to use as cutoff to determine the minimum fdr to consider significant correlations between pcs and covariates.
`scale`	boolean to determine wether counts matrix should be scaled for pca. default FALSE.
`minPC`	numeric value that will be used as cutoff to select only pcs that explain more variability than this.
`correlation`	character determining the method for the correlation between pcs and covariates.
`addCovDen`	boolean. Whether to add the covariates dendograme to the plot to see covariates relationship. It will show `degCorCov()` dendograme on top of the columns of the heatmap.
`legacy`	boolean. Whether to plot the legacy version.
`smart`	boolean. Whether to avoid normalization of the numeric covariates when calculating importance. This is not used if `legacy = TRUE`. See @details for more information.
`method`	character. Whether to use `lm` to calculate the significance of the variable during reduction step. See @details for more information.
`plot`	Whether to plot or not the correlation matrix.

Details

This method is adapeted from Daily et al 2017 article. Principal components from PCA analysis are correlated with covariates metadata. Factors are transformed to numeric variables. Correlation is measured by cor.test function with Kendall method by default.

The size of the dot, or importance, indicates the importance of the covariate based on the range of the values. Covariates where the range is very small (like a % of mapped reads that varies between 0.001 to 0.002) will have a very small size (0.1*max_size). The maximum value is set to 5 units. To get to importance, each covariate is normalized using this equation: 1 - min(v/max(v)), and the minimum and maximum values are set to 0.01 and 1 respectively. For instance, 0.5 would mean there is at least 50% of difference between the minimum value and the maximum value. Categorical variables are plot using the maximum size always, since it is not possible to estimate the variability. By default, it won't do v/max(v) if the values are already between 0-1 or 0-100 (already normalized values as rates and percentages). If you want to ignore the importance, use legacy = TRUE.

Finally, a linear model is used to calculate the significance of the covariates effect on the PCs. For that, this function uses lm to regress the data and uses the p-value calculated by each variable in the model to define significance (pvalue < 0.05). Variables with a black stroke are significant after this step. Variables with grey stroke are significant at the first pass considering p.value < 0.05 for the correlation analysis.

Value

: list:

plot, heatmap showing the signifcance of the variables.
corMatrix, correlation, p-value, FDR values for each covariate and PCA pais
pcsMatrix: PCs loading for each sample
scatterPlot: plot for each significant covariate and the PC values.
significants: contains the significant covariates using a linear model to predict the coefficient of covariates that have some color in the plot. All the significant covariates from the liner model analysis are returned.

Author(s)

: Lorena Pantano, Victor Barrera, Kenneth Daily and Thanneer Malai Perumal

References

Daily, K. et al. Molecular, phenotypic, and sample-associated data to describe pluripotent stem cell lines and derivatives. Sci Data 4, 170030 (2017).

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
res <- degCovariates(log2(counts(dse)+0.5), colData(dse))
res <- degCovariates(log2(counts(dse)+0.5),
  colData(dse), legacy = TRUE)
res$plot
res$scatterPlot[[1]]
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
res <- degCovariates(log2(counts(dse)+0.5), colData(dse))
res <- degCovariates(log2(counts(dse)+0.5),
  colData(dse), legacy = TRUE)
res$plot
res$scatterPlot[[1]]

Method to get the default table to use.

Description

It can accept a list of new padj values matching the same dimmensions than the current vector.

Usage

degDefault(object)

degCorrect(object, fdr)

## S4 method for signature 'DEGSet'
degDefault(object)

## S4 method for signature 'DEGSet'
degCorrect(object, fdr)
degDefault(object)

degCorrect(object, fdr)

## S4 method for signature 'DEGSet'
degDefault(object)

## S4 method for signature 'DEGSet'
degCorrect(object, fdr)

Arguments

`object`	DEGSet
`fdr`	It can be `fdr-stat`, `fdr-pvalue`, vector of new padj

Author(s)

Lorena Pantano

Examples

library(DESeq2)
library(dplyr)
dds <- makeExampleDESeqDataSet(betaSD=1)
colData(dds)[["treatment"]] <- sample(colData(dds)[["condition"]], 12)
design(dds) <-  ~ condition + treatment
dds <- DESeq(dds)
res <- degComps(dds, contrast = list("treatment_B_vs_A"))
library(DESeq2)
library(dplyr)
dds <- makeExampleDESeqDataSet(betaSD=1)
colData(dds)[["treatment"]] <- sample(colData(dds)[["condition"]], 12)
design(dds) <-  ~ condition + treatment
dds <- DESeq(dds)
res <- degComps(dds, contrast = list("treatment_B_vs_A"))

Filter genes by group

Description

This function will keep only rows that have a minimum counts of 1 at least in a min number of samples (default 80%).

Usage

degFilter(counts, metadata, group, min = 0.8, minreads = 0)
degFilter(counts, metadata, group, min = 0.8, minreads = 0)

Arguments

`counts`	Matrix with expression data, columns are samples and rows are genes or other feature.
`metadata`	Data.frame with information about each column in counts matrix. Rownames should match `colnames(counts)`.
`group`	Character column in metadata used to group samples and applied the cutoff.
`min`	Percentage value indicating the minimum number of samples in each group that should have more than 0 in count matrix.
`minreads`	Integer minimum number of reads to consider a feature expressed.

Value

count matrix after filtering genes (features) with not enough expression in any group.

Examples

data(humanGender)
library(SummarizedExperiment)
idx <- c(1:10, 75:85)
c <- degFilter(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], "group", min=1)
data(humanGender)
library(SummarizedExperiment)
idx <- c(1:10, 75:85)
c <- degFilter(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], "group", min=1)

MA-plot from base means and log fold changes

Description

MA-plot addaptation to show the shrinking effect.

Usage

degMA(
  results,
  title = NULL,
  label_points = NULL,
  label_column = "symbol",
  limit = NULL,
  diff = 5,
  raw = FALSE,
  correlation = FALSE
)
degMA(
  results,
  title = NULL,
  label_points = NULL,
  label_column = "symbol",
  limit = NULL,
  diff = 5,
  raw = FALSE,
  correlation = FALSE
)

Arguments

`results`	DEGSet class.
`title`	Optional. Plot title.
`label_points`	Optionally label these particular points.
`label_column`	Match label_points to this column in the results.
`limit`	Absolute maximum to plot on the log2FoldChange.
`diff`	Minimum difference between logFoldChange before and after shrinking.
`raw`	Whether to plot just the unshrunken log2FC.
`correlation`	Whether to plot the correlation of the two logFCs.

Value

MA-plot ggplot.

Author(s)

Victor Barrera

Rory Kirchner

Lorena Pantano

Examples

library(DESeq2)
dds <- makeExampleDESeqDataSet(betaSD=1)
dds <- DESeq(dds)
res <- degComps(dds, contrast = list("condition_B_vs_A"))
degMA(res)
library(DESeq2)
dds <- makeExampleDESeqDataSet(betaSD=1)
dds <- DESeq(dds)
res <- degComps(dds, contrast = list("condition_B_vs_A"))
degMA(res)

Distribution of expression of DE genes compared to the background

Description

Distribution of expression of DE genes compared to the background

Usage

degMB(tags, group, counts, pop = 400)
degMB(tags, group, counts, pop = 400)

Arguments

`tags`	List of genes that are DE.
`group`	Character vector with group name for each sample in the same order than counts column names.
`counts`	Matrix with counts for each samples and each gene Should be same length than pvalues vector.
`pop`	number of random samples taken for background comparison

Value

ggplot2 object

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degMB(row.names(res)[1:20], colData(dds)[["group"]],
  counts(dds, normalized = TRUE))
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degMB(row.names(res)[1:20], colData(dds)[["group"]],
  counts(dds, normalized = TRUE))

Plot MDS from normalized count data

Description

Uses cmdscale to get multidimensional scaling of data matrix, and plot the samples with ggplot2.

Usage

degMDS(counts, condition = NULL, k = 2, d = "euclidian", xi = 1, yi = 2)
degMDS(counts, condition = NULL, k = 2, d = "euclidian", xi = 1, yi = 2)

Arguments

`counts`	matrix samples in columns, features in rows
`condition`	vector define groups of samples in counts. It has to be same order than the count matrix for columns.
`k`	integer number of dimensions to get
`d`	type of distance to use, c("euclidian", "cor").
`xi`	number of component to plot in x-axis
`yi`	number of component to plot in y-axis

Value

ggplot2 object

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
degMDS(counts(dse), condition = colData(dse)[["group"]])
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
degMDS(counts(dse), condition = colData(dse)[["group"]])

Distribution of pvalues by expression range

Description

This function plot the p-values distribution colored by the quantiles of the average count data.

Usage

degMean(pvalues, counts)
degMean(pvalues, counts)

Arguments

`pvalues`	pvalues of DEG analysis.
`counts`	Matrix with counts for each samples and each gene. row number should be the same length than pvalues vector.

Value

ggplot2 object

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degMean(res[, 4], counts(dds))
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degMean(res[, 4], counts(dds))

Integrate data comming from degPattern into one data object

Description

The simplest case is if you want to convine the pattern profile for gene expression data and proteomic data. It will use the first element as the base for the integration. Then, it will loop through clusters and run degPatterns in the second data set to detect patterns that match this one.

Usage

degMerge(
  matrix_list,
  cluster_list,
  metadata_list,
  summarize = "group",
  time = "time",
  col = "condition",
  scale = TRUE,
  mapping = NULL
)
degMerge(
  matrix_list,
  cluster_list,
  metadata_list,
  summarize = "group",
  time = "time",
  col = "condition",
  scale = TRUE,
  mapping = NULL
)

Arguments

`matrix_list`	list expression data for each element
`cluster_list`	list df item from degPattern output
`metadata_list`	list data.frames from each element with design experiment. Normally `colData` output
`summarize`	character column to use to group samples
`time`	character column to use as x-axes in figures
`col`	character column to color samples in figures
`scale`	boolean scale by row expression matrix
`mapping`	data.frame mapping table in case elements use different ID in the row.names of expression matrix. For instance, when integrating miRNA/mRNA.

Value

A data.frame with information on what genes are in each cluster in all data set, and the correlation value for each pair cluster comparison.

Correlation of the standard desviation and the mean of the abundance of a set of genes.

Description

Correlation of the standard desviation and the mean of the abundance of a set of genes.

Usage

degMV(group, pvalues, counts, sign = 0.01)
degMV(group, pvalues, counts, sign = 0.01)

Arguments

`group`	Character vector with group name for each sample in the same order than counts column names.
`pvalues`	pvalues of DEG analysis.
`counts`	Matrix with counts for each samples and each gene.
`sign`	Defining the cutoff to label significant features. row number should be the same length than pvalues vector.

Value

ggplot2 object

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degMV(colData(dds)[["group"]],
      res[, 4],
      counts(dds, normalized = TRUE))
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degMV(colData(dds)[["group"]],
      res[, 4],
      counts(dds, normalized = TRUE))

Create a deg object that can be used to plot expression values at shiny server:runGist(9930881)

Description

Create a deg object that can be used to plot expression values at shiny server:runGist(9930881)

Usage

degObj(counts, design, outfile)
degObj(counts, design, outfile)

Arguments

`counts`	Output from get_rank function.
`design`	Colour used for each gene.
`outfile`	File that will contain the object.

Value

R object to be load into vizExp.

Examples

data(humanGender)
library(SummarizedExperiment)
degObj(assays(humanGender)[[1]], colData(humanGender), NULL)
data(humanGender)
library(SummarizedExperiment)
degObj(assays(humanGender)[[1]], colData(humanGender), NULL)

Make groups of genes using expression profile.

Description

Note that this function doesn't calculate significant difference between groups, so the matrix used as input should be already filtered to contain only genes that are significantly different or the most interesting genes to study.

Usage

degPatterns(
  ma,
  metadata,
  minc = 15,
  summarize = "merge",
  time = "time",
  col = NULL,
  consensusCluster = FALSE,
  reduce = FALSE,
  cutoff = 0.7,
  scale = TRUE,
  pattern = NULL,
  groupDifference = NULL,
  eachStep = FALSE,
  plot = TRUE,
  fixy = NULL,
  nClusters = NULL,
  skipDendrogram = TRUE
)
degPatterns(
  ma,
  metadata,
  minc = 15,
  summarize = "merge",
  time = "time",
  col = NULL,
  consensusCluster = FALSE,
  reduce = FALSE,
  cutoff = 0.7,
  scale = TRUE,
  pattern = NULL,
  groupDifference = NULL,
  eachStep = FALSE,
  plot = TRUE,
  fixy = NULL,
  nClusters = NULL,
  skipDendrogram = TRUE
)

Arguments

`ma`	log2 normalized count matrix
`metadata`	data frame with sample information. Rownames should match `ma` column names row number should be the same length than p-values vector.
`minc`	integer minimum number of genes in a group that will be return
`summarize`	character column name in metadata that will be used to group replicates. If the column doesn't exist it'll merge the `time` and the `col` columns, if `col` doesn't exist it'll use `time` only. For instance, a merge between summarize and time parameters: control_point0 ... etc
`time`	character column name in metadata that will be used as variable that changes, normally a time variable.
`col`	character column name in metadata to separate samples. Normally control/mutant
`consensusCluster`	Indicates whether using ConsensusClusterPlus or `cluster::diana()`
`reduce`	boolean remove genes that are outliers of the cluster distribution. `boxplot` function is used to flag a gene in any group defined by `time` and `col` as outlier and it is removed from the cluster. Not used if `consensusCluster` is TRUE.
`cutoff`	This is deprecated.
`scale`	boolean scale the `ma` values by row
`pattern`	numeric vector to be used to find patterns like this from the count matrix. As well, it can be a character indicating the genes inside the count matrix to be used as reference.
`groupDifference`	Minimum abundance difference between the maximum value and minimum value for each feature. Please, provide the value in the same range than the `ma` value ( if `ma` is in log2, `groupDifference` should be inside that range).
`eachStep`	Whether apply `groupDifference` at each stem over `time` variable. This only work properly for one group with multiple time points.
`plot`	boolean plot the clusters found
`fixy`	vector integers used as ylim in plot
`nClusters`	an integer scalar or vector with the desired number of groups
`skipDendrogram`	a boolean to run or not dendextend. Temporary fix to memory issue in linux.

Details

It can work with one or more groups with 2 or more several time points. Before calculating the genes similarity among samples, all samples inside the same time point (time parameter) and group (col parameter) are collapsed together, and the mean value is the representation of the group for the gene abundance. Then, all pair-wise gene expression is calculated using cor.test R function using kendall as the statistical method. A distance matrix is created from those values. After that, cluster::diana() is used for the clustering of gene-gene distance matrix and cut the tree using the divisive coefficient of the clustering, giving as well by diana. Alternatively, if consensusCluster is on, it would use ConsensusClusterPlus to cut the tree in stable clusters. Finally, for each group of genes, only the ones that have genes higher than minc parameter will be added to the figure. The y-axis in the figure is the results of applying scale() R function, what is similar to creating a Z-score where values are centered to the mean and scaled to the ⁠standard desviation⁠ by each gene.

The different patterns can be merged to get similar ones into only one pattern. The expression correlation of the patterns will be used to decide whether some need to be merged or not.

Value

list wiht two items:

df is a data.frame with two columns. The first one with genes, the second with the clusters they belong.
pass is a vector of the clusters that pass the minc cutoff.
plot ggplot figure.
hr clustering of the genes in hclust format.
profile normalized count data used in the plot.
raw data.frame with gene values summarized by biological replicates and with metadata information attached.
summarise data.frame with clusters values summarized by group and with the metadata information attached.
normalized data.frame with the clusters values as used in the plot.
benchmarking plot showing the different patterns at different values for clustering cuttree function.
benchmarking_curve plot showing how the numbers of clusters and genes changed at different values for clustering cuttree function.

Examples

data(humanGender)
library(SummarizedExperiment)
library(ggplot2)
ma <- assays(humanGender)[[1]][1:100,]
des <- colData(humanGender)
des[["other"]] <- sample(c("a", "b"), 85, replace = TRUE)
res <- degPatterns(ma, des, time="group", col = "other")
# Use the data yourself for custom figures
 ggplot(res[["normalized"]],
        aes(group, value, color = other, fill = other)) +
  geom_boxplot() +
   geom_point(position = position_jitterdodge(dodge.width = 0.9)) +
   # change the method to make it smoother
   geom_smooth(aes(group=other), method = "lm")
data(humanGender)
library(SummarizedExperiment)
library(ggplot2)
ma <- assays(humanGender)[[1]][1:100,]
des <- colData(humanGender)
des[["other"]] <- sample(c("a", "b"), 85, replace = TRUE)
res <- degPatterns(ma, des, time="group", col = "other")
# Use the data yourself for custom figures
 ggplot(res[["normalized"]],
        aes(group, value, color = other, fill = other)) +
  geom_boxplot() +
   geom_point(position = position_jitterdodge(dodge.width = 0.9)) +
   # change the method to make it smoother
   geom_smooth(aes(group=other), method = "lm")

smart PCA from count matrix data

Description

nice plot using ggplot2 from prcomp function

Usage

degPCA(
  counts,
  metadata = NULL,
  condition = NULL,
  pc1 = "PC1",
  pc2 = "PC2",
  name = NULL,
  shape = NULL,
  data = FALSE
)
degPCA(
  counts,
  metadata = NULL,
  condition = NULL,
  pc1 = "PC1",
  pc2 = "PC2",
  name = NULL,
  shape = NULL,
  data = FALSE
)

Arguments

`counts`	matrix with count data
`metadata`	dara.frame with sample information
`condition`	character column in metadata to use to color samples
`pc1`	character PC to plot on x-axis
`pc2`	character PC to plot on y-axis
`name`	character if given, column in metadata to print label
`shape`	character if given, column in metadata to shape points
`data`	Whether return PCA data or just plot the PCA.

Value

if ⁠results <-⁠ used, the function return the output of prcomp().

Author(s)

Lorena Pantano, Rory Kirchner, Michael Steinbaugh

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
colData(humanGender)[idx,], design=~group)
degPCA(log2(counts(dse)+0.5), colData(dse),
  condition="group", name="group", shape="group")
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
colData(humanGender)[idx,], design=~group)
degPCA(log2(counts(dse)+0.5), colData(dse),
  condition="group", name="group", shape="group")

Plot top genes allowing more variables to color and shape points

Description

Plot top genes allowing more variables to color and shape points

Usage

degPlot(
  dds,
  xs,
  res = NULL,
  n = 9,
  genes = NULL,
  group = NULL,
  batch = NULL,
  metadata = NULL,
  ann = c("geneID", "symbol"),
  slot = 1L,
  log2 = TRUE,
  xsLab = xs,
  ysLab = "abundance",
  color = "black",
  groupLab = group,
  batchLab = batch,
  sizePoint = 1
)
degPlot(
  dds,
  xs,
  res = NULL,
  n = 9,
  genes = NULL,
  group = NULL,
  batch = NULL,
  metadata = NULL,
  ann = c("geneID", "symbol"),
  slot = 1L,
  log2 = TRUE,
  xsLab = xs,
  ysLab = "abundance",
  color = "black",
  groupLab = group,
  batchLab = batch,
  sizePoint = 1
)

Arguments

`dds`	DESeq2::DESeqDataSet object or SummarizedExperiment or Matrix or data.frame. In case of a DESeqDataSet object, always the normalized expression will be used from `counts(dds, normalized = TRUE)`.
`xs`	Character, colname in colData that will be used as X-axes.
`res`	DESeq2::DESeqResults object.
`n`	Integer number of genes to plot from the `res` object. It will take the top N using padj values to order the table.
`genes`	Character of gene names matching rownames of count data.
`group`	Character, colname in colData to color points and add different lines for each level.
`batch`	Character, colname in colData to shape points, normally used by batch effect visualization.
`metadata`	Metadata in case dds is a matrix.
`ann`	Columns in rowData (if available) used to print gene names. First element in the vector is the column name in rowData that matches the row.names of the `dds` or `count` object. Second element in the vector is the column name in rowData that it will be used as the title for each gene or feature figure.
`slot`	Name of the slot to use to get count data.
`log2`	Whether to apply or not log2 transformation.
`xsLab`	Character, alternative label for x-axis (default: same as xs)
`ysLab`	Character, alternative label for y-axis..
`color`	Color to use to plot groups. It can be one color, or a palette compatible with `ggplot2::scale_color_brewer()`.
`groupLab`	Character, alternative label for group (default: same as group).
`batchLab`	Character, alternative label for batch (default: same as batch).
`sizePoint`	Integer, indicates the size of the plotted points (default 1).

Value

ggplot showing the expresison of the genes

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dse <- DESeq(dse)
degPlot(dse, genes = rownames(dse)[1:10], xs = "group")
degPlot(dse, genes = rownames(dse)[1:10], xs = "group", color = "orange")
degPlot(dse, genes = rownames(dse)[1:10], xs = "group", group = "group",
        color = "Accent")
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dse <- DESeq(dse)
degPlot(dse, genes = rownames(dse)[1:10], xs = "group")
degPlot(dse, genes = rownames(dse)[1:10], xs = "group", color = "orange")
degPlot(dse, genes = rownames(dse)[1:10], xs = "group", group = "group",
        color = "Accent")

Plot clusters from degPattern function output

Description

This function helps to format the cluster plots from degPatterns(). It allows to control the layers and it returns a ggplot object that can accept more ggplot functions to allow customization.

Usage

degPlotCluster(
  table,
  time,
  color = NULL,
  min_genes = 10,
  process = FALSE,
  points = TRUE,
  boxes = TRUE,
  smooth = TRUE,
  lines = TRUE,
  facet = TRUE,
  cluster_column = "cluster",
  prefix_title = "Group: "
)
degPlotCluster(
  table,
  time,
  color = NULL,
  min_genes = 10,
  process = FALSE,
  points = TRUE,
  boxes = TRUE,
  smooth = TRUE,
  lines = TRUE,
  facet = TRUE,
  cluster_column = "cluster",
  prefix_title = "Group: "
)

Arguments

`table`	`normalized` element from `degPatterns()` output. It can be a data.frame with the following columns in there: `⁠genes, sample, expression, cluster, xaxis_column, color_column⁠`.
`time`	column name to use in the x-axis.
`color`	column name to use to color and divide the samples.
`min_genes`	minimum number of genes to be added to the plot.
`process`	whether to process the table if it is not ready for plotting.
`points`	Add points to the plot.
`boxes`	Add boxplot to the plot.
`smooth`	Add regression line to the plot.
`lines`	Add gene lines to the plot.
`facet`	Split figures based on cluster ID.
`cluster_column`	column name if cluster is in a column with a different name. Usefull, to plot cluster with different cutoffs used when grouping genes from the clustering step.
`prefix_title`	text to add before the cluster ID in the figure title.

Value

ggplot2 object.

Examples

data(humanGender)
library(SummarizedExperiment)
library(ggplot2)
ma <- assays(humanGender)[[1]][1:100,]
des <- colData(humanGender)
des[["other"]] <- sample(c("a", "b"), 85, replace = TRUE)
res <- degPatterns(ma, des, time="group", col = "other", plot = FALSE)
degPlotCluster(res$normalized, "group", "other")
degPlotCluster(res$normalized, "group", "other", lines = FALSE)

library(dplyr)
library(tidyr)
library(tibble)
table <- rownames_to_column(as.data.frame(ma), "genes") %>%
    gather("sample", "expression", -genes) %>%
    right_join(distinct(res$df[,c("genes", "cluster")]),
               by = "genes") %>%
    left_join(rownames_to_column(as.data.frame(des), "sample"),
              by = "sample") %>% 
              as.data.frame()
degPlotCluster(table, "group", "other", process = TRUE)
data(humanGender)
library(SummarizedExperiment)
library(ggplot2)
ma <- assays(humanGender)[[1]][1:100,]
des <- colData(humanGender)
des[["other"]] <- sample(c("a", "b"), 85, replace = TRUE)
res <- degPatterns(ma, des, time="group", col = "other", plot = FALSE)
degPlotCluster(res$normalized, "group", "other")
degPlotCluster(res$normalized, "group", "other", lines = FALSE)

library(dplyr)
library(tidyr)
library(tibble)
table <- rownames_to_column(as.data.frame(ma), "genes") %>%
    gather("sample", "expression", -genes) %>%
    right_join(distinct(res$df[,c("genes", "cluster")]),
               by = "genes") %>%
    left_join(rownames_to_column(as.data.frame(des), "sample"),
              by = "sample") %>% 
              as.data.frame()
degPlotCluster(table, "group", "other", process = TRUE)

Plot selected genes on a wide format

Description

Plot selected genes on a wide format

Usage

degPlotWide(counts, genes, group, metadata = NULL, batch = NULL)
degPlotWide(counts, genes, group, metadata = NULL, batch = NULL)

Arguments

`counts`	DESeq2::DESeqDataSet object or expression matrix
`genes`	character genes to plot.
`group`	character, colname in colData to color points and add different lines for each level
`metadata`	data.frame, information for each sample. Not needed if DESeq2::DESeqDataSet given as counts.
`batch`	character, colname in colData to shape points, normally used by batch effect visualization

Value

ggplot showing the expresison of the genes on the x axis

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dse <- DESeq(dse)
degPlotWide(dse, rownames(dse)[1:10], group = "group")
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dse <- DESeq(dse)
degPlotWide(dse, rownames(dse)[1:10], group = "group")

Plot main figures showing p-values distribution and mean-variance correlation

Description

This function joins the output of degMean, degVar and degMV in a single plot. See these functions for further information.

Usage

degQC(counts, groups, object = NULL, pvalue = NULL)
degQC(counts, groups, object = NULL, pvalue = NULL)

Arguments

`counts`	Matrix with counts for each samples and each gene.
`groups`	Character vector with group name for each sample in the same order than counts column names.
`object`	DEGSet oobject.
`pvalue`	pvalues of DEG analysis.

Value

ggplot2 object

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degQC(counts(dds, normalized=TRUE), colData(dds)[["group"]],
  pvalue = res[["pvalue"]])
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degQC(counts(dds, normalized=TRUE), colData(dds)[["group"]],
  pvalue = res[["pvalue"]])

Complete report from DESeq2 analysis

Description

Complete report from DESeq2 analysis

Usage

degResults(
  res = NULL,
  dds,
  rlogMat = NULL,
  name,
  org = NULL,
  FDR = 0.05,
  do_go = FALSE,
  FC = 0.1,
  group = "condition",
  xs = "time",
  path_results = ".",
  contrast = NULL
)
degResults(
  res = NULL,
  dds,
  rlogMat = NULL,
  name,
  org = NULL,
  FDR = 0.05,
  do_go = FALSE,
  FC = 0.1,
  group = "condition",
  xs = "time",
  path_results = ".",
  contrast = NULL
)

Arguments

`res`	output from `DESeq2::results()` function.
`dds`	`DESeq2::DESeqDataSet()` object.
`rlogMat`	matrix from `DESeq2::rlog()` function.
`name`	string to identify results
`org`	an organism annotation object, like org.Mm.eg.db. NULL if you want to skip this step.
`FDR`	int cutoff for false discovery rate.
`do_go`	boolean if GO enrichment is done.
`FC`	int cutoff for log2 fold change.
`group`	string column name in colData(dds) that separates samples in meaninful groups.
`xs`	string column name in colData(dss) that will be used as X axes in plots (i.e time)
`path_results`	character path where files are stored. NULL if you don't want to save any file.
`contrast`	list with character vector indicating the fold change values from different comparisons to add to the output table.

Value

ggplot2 object

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dse <- DESeq(dse)
res <- degResults(dds = dse, name = "test", org = NULL,
  do_go = FALSE, group = "group", xs = "group", path_results = NULL)
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dse <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dse <- DESeq(dse)
res <- degResults(dds = dse, name = "test", org = NULL,
  do_go = FALSE, group = "group", xs = "group", path_results = NULL)

DEGSet

Description

S4 class to store data from differentially expression analysis. It should be compatible with different package and stores the information in a way the methods will work with all of them.

Usage

DEGSet(resList, default)

DEGSet(resList, default)

as.DEGSet(object, ...)

## S4 method for signature 'TopTags'
as.DEGSet(object, default = "raw", extras = NULL)

## S4 method for signature 'data.frame'
as.DEGSet(object, contrast, default = "raw", extras = NULL)

## S4 method for signature 'DESeqResults'
as.DEGSet(object, default = "shrunken", extras = NULL)
DEGSet(resList, default)

DEGSet(resList, default)

as.DEGSet(object, ...)

## S4 method for signature 'TopTags'
as.DEGSet(object, default = "raw", extras = NULL)

## S4 method for signature 'data.frame'
as.DEGSet(object, contrast, default = "raw", extras = NULL)

## S4 method for signature 'DESeqResults'
as.DEGSet(object, default = "shrunken", extras = NULL)

Arguments

`resList`	List with results as elements containing log2FoldChange, pvalues and padj as column. Rownames should be feature names. Elements should have names.
`default`	The name of the element to use by default.
`object`	Different objects to be transformed to DEGSet when using `as.DEGSet`.
`...`	Optional parameters of the generic.
`extras`	List of extra tables related to the same comparison when using `as.DEGSet`.
`contrast`	To name the comparison when using `as.DEGSet`.

Details

For now supporting only DESeq2::results() output. Use constructor degComps() to create the object.

The list will contain one element for each comparison done. Each element has the following structure:

DEG table
Optional table with shrunk Fold Change when it has been done.

To access the raw table use deg(dgs, "raw"), to access the shrunken table use deg(dgs, "shrunken") or just deg(dgs).

Author(s)

Lorena Pantano

Examples

library(DESeq2)
library(edgeR)
library(limma)
dds <- makeExampleDESeqDataSet(betaSD = 1)
colData(dds)[["treatment"]] <- sample(colData(dds)[["condition"]], 12)
design(dds) <-  ~ condition + treatment
dds <- DESeq(dds)
res <- degComps(dds, combs = c("condition"))
deg(res)
deg(res, tidy = "tibble")
# From edgeR
dge <- DGEList(counts=counts(dds), group=colData(dds)[["treatment"]])
dge <- estimateCommonDisp(dge)
res <- as.DEGSet(topTags(exactTest(dge)))
# From limma
v <- voom(counts(dds), model.matrix(~treatment, colData(dds)), plot=FALSE)
fit <- lmFit(v)
fit <- eBayes(fit, robust=TRUE)
res <- as.DEGSet(topTable(fit, n = "Inf"), "A_vs_B")
library(DESeq2)
library(edgeR)
library(limma)
dds <- makeExampleDESeqDataSet(betaSD = 1)
colData(dds)[["treatment"]] <- sample(colData(dds)[["condition"]], 12)
design(dds) <-  ~ condition + treatment
dds <- DESeq(dds)
res <- degComps(dds, combs = c("condition"))
deg(res)
deg(res, tidy = "tibble")
# From edgeR
dge <- DGEList(counts=counts(dds), group=colData(dds)[["treatment"]])
dge <- estimateCommonDisp(dge)
res <- as.DEGSet(topTags(exactTest(dge)))
# From limma
v <- voom(counts(dds), model.matrix(~treatment, colData(dds)), plot=FALSE)
fit <- lmFit(v)
fit <- eBayes(fit, robust=TRUE)
res <- as.DEGSet(topTable(fit, n = "Inf"), "A_vs_B")

Plot gene signature for each group and signature

Description

Given a list of genes beloging to a different classes, like markers, plot for each group, the expression values for all the samples.

Usage

degSignature(
  counts,
  signature,
  group = NULL,
  metadata = NULL,
  slot = 1,
  scale = FALSE
)
degSignature(
  counts,
  signature,
  group = NULL,
  metadata = NULL,
  slot = 1,
  scale = FALSE
)

Arguments

`counts`	expression data. It accepts bcbioRNASeq, DESeqDataSet and SummarizedExperiment. As well, data.frame or matrix is supported, but it requires metadata in that case.
`signature`	data.frame with two columns: a) genes that match row.names of counts, b) label to classify the gene inside a group. Normally, cell tissue name.
`group`	character in metadata used to split data into different groups.
`metadata`	data frame with sample information. Rownames should match `ma` column names row number should be the same length than p-values vector.
`slot`	slotName in the case of SummarizedExperiment objects.
`scale`	Whether to scale or not the expression.

Value

ggplot plot.

Examples

data(humanGender)
data(geneInfo)
degSignature(humanGender, geneInfo, group = "group")
data(humanGender)
data(geneInfo)
degSignature(humanGender, geneInfo, group = "group")

Print Summary Statistics of Alpha Level Cutoffs

Description

Print Summary Statistics of Alpha Level Cutoffs

Usage

degSummary(
  object,
  alpha = c(0.1, 0.05, 0.01),
  contrast = NULL,
  caption = "",
  kable = FALSE
)
degSummary(
  object,
  alpha = c(0.1, 0.05, 0.01),
  contrast = NULL,
  caption = "",
  kable = FALSE
)

Arguments

`object`	Can be DEGSet or DESeqDataSet or DESeqResults.
`alpha`	Numeric vector of desired alpha cutoffs.
`contrast`	Character vector to use with `results()` function.
`caption`	Character vector to add as caption to the table.
`kable`	Whether return a `knitr::kable()` output. Default is data.frame.

Value

data.frame or knitr::kable().

Author(s)

Lorena Pantano

References

original idea of multiple alpha values and code syntax from Michael Steinbaugh.

Examples

library(DESeq2)
data(humanGender)
idx <- c(1:5, 75:80)
counts <- assays(humanGender)[[1]]
dse <- DESeqDataSetFromMatrix(counts[1:1000, idx],
                              colData(humanGender)[idx,],
                              design = ~group)
dse <- DESeq(dse)
res1 <- results(dse)
res2 <- degComps(dse, contrast = c("group_Male_vs_Female"))
degSummary(dse, contrast = "group_Male_vs_Female")
degSummary(res1)
degSummary(res1, kable = TRUE)
degSummary(res2[[1]])
library(DESeq2)
data(humanGender)
idx <- c(1:5, 75:80)
counts <- assays(humanGender)[[1]]
dse <- DESeqDataSetFromMatrix(counts[1:1000, idx],
                              colData(humanGender)[idx,],
                              design = ~group)
dse <- DESeq(dse)
res1 <- results(dse)
res2 <- degComps(dse, contrast = c("group_Male_vs_Female"))
degSummary(dse, contrast = "group_Male_vs_Female")
degSummary(res1)
degSummary(res1, kable = TRUE)
degSummary(res2[[1]])

Distribution of pvalues by standard desviation range

Description

This function pot the p-valyes distribution colored by the quantiles of the standard desviation of count data.

Usage

degVar(pvalues, counts)
degVar(pvalues, counts)

Arguments

`pvalues`	pvalues of DEG analysis
`counts`	Matrix with counts for each samples and each gene. row number should be the same length than pvalues vector.

Value

ggplot2 object

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degVar(res[, 4], counts(dds))
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degVar(res[, 4], counts(dds))

Distribution of the standard desviation of DE genes compared to the background

Description

Distribution of the standard desviation of DE genes compared to the background

Usage

degVB(tags, group, counts, pop = 400)
degVB(tags, group, counts, pop = 400)

Arguments

`tags`	List of genes that are DE.
`group`	Character vector with group name for each sample in the same order than counts column names.
`counts`	matrix with counts for each samples and each gene. Should be same length than pvalues vector.
`pop`	Number of random samples taken for background comparison.

Value

ggplot2 object

Examples

data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degVB(row.names(res)[1:20], colData(dds)[["group"]],
  counts(dds, normalized = TRUE))
data(humanGender)
library(DESeq2)
idx <- c(1:10, 75:85)
dds <- DESeqDataSetFromMatrix(assays(humanGender)[[1]][1:1000, idx],
  colData(humanGender)[idx,], design=~group)
dds <- DESeq(dds)
res <- results(dds)
degVB(row.names(res)[1:20], colData(dds)[["group"]],
  counts(dds, normalized = TRUE))

Create volcano plot from log2FC and adjusted pvalues data frame

Description

Create volcano plot from log2FC and adjusted pvalues data frame

Usage

degVolcano(
  stats,
  side = "both",
  title = "Volcano Plot with Marginal Distributions",
  pval.cutoff = 0.05,
  lfc.cutoff = 1,
  shade.colour = "orange",
  shade.alpha = 0.25,
  point.colour = "gray",
  point.alpha = 0.75,
  point.outline.colour = "darkgray",
  line.colour = "gray",
  plot_text = NULL
)
degVolcano(
  stats,
  side = "both",
  title = "Volcano Plot with Marginal Distributions",
  pval.cutoff = 0.05,
  lfc.cutoff = 1,
  shade.colour = "orange",
  shade.alpha = 0.25,
  point.colour = "gray",
  point.alpha = 0.75,
  point.outline.colour = "darkgray",
  line.colour = "gray",
  plot_text = NULL
)

Arguments

`stats`	data.frame with two columns: logFC and Adjusted.Pvalue
`side`	plot UP, DOWN or BOTH de-regulated points
`title`	title for the figure
`pval.cutoff`	cutoff for the adjusted pvalue. Default 0.05
`lfc.cutoff`	cutoff for the log2FC. Default 1
`shade.colour`	background color. Default orange.
`shade.alpha`	transparency value. Default 0.25
`point.colour`	colours for points. Default gray
`point.alpha`	transparency for points. Default 0.75
`point.outline.colour`	Default darkgray
`line.colour`	Defaul gray
`plot_text`	data.frame with three columns: logFC, Pvalue, Gene name

Details

This function was mainly developed by @jnhutchinson.

Value

The function will plot volcano plot together with density of the fold change and p-values on the top and the right side of the volcano plot.

Author(s)

Lorena Pantano, John Hutchinson

Examples

library(DESeq2)
dds <- makeExampleDESeqDataSet(betaSD = 1)
dds <- DESeq(dds)
stats <- results(dds)[,c("log2FoldChange", "padj")]
stats[["name"]] <- row.names(stats)
degVolcano(stats, plot_text = stats[1:10,])
library(DESeq2)
dds <- makeExampleDESeqDataSet(betaSD = 1)
dds <- DESeq(dds)
stats <- results(dds)[,c("log2FoldChange", "padj")]
stats[["name"]] <- row.names(stats)
degVolcano(stats, plot_text = stats[1:10,])

data.frame with chromose information for each gene

Description

data.frame with chromose information for each gene

Usage

data(geneInfo)
data(geneInfo)

Format

data.frame

Author(s)

Lorena Pantano, 2014-08-14

Source

biomart

Add correlation and p-value to a ggplot2 plot

Description

geom_cor will add the correlatin, method and p-value to the plot automatically guessing the position if nothing else specidfied. family font, size and colour can be used to change the format.

Usage

geom_cor(
  mapping = NULL,
  data = NULL,
  method = "spearman",
  xpos = NULL,
  ypos = NULL,
  inherit.aes = TRUE,
  ...
)
geom_cor(
  mapping = NULL,
  data = NULL,
  method = "spearman",
  xpos = NULL,
  ypos = NULL,
  inherit.aes = TRUE,
  ...
)

Arguments

`mapping`	Set of aesthetic mappings created by `aes()` or `aes_()`. If specified and `inherit.aes = TRUE` (the default), it is combined with the default mapping at the top level of the plot. You must supply `mapping` if there is no plot mapping.
`data`	The data to be displayed in this layer. There are three options: If `NULL`, the default, the data is inherited from the plot data as specified in the call to `ggplot()`. A `data.frame`, or other object, will override the plot data. All objects will be fortified to produce a data frame. See `fortify()` for which variables will be created. A `function` will be called with a single argument, the plot data. The return value must be a `data.frame.`, and will be used as the layer data.
`method`	Method to calculate the correlation. Values are passed to `cor.test()`. (Spearman, Pearson, Kendall).
`xpos`	Locate text at that position on the x axis.
`ypos`	Locate text at that position on the y axis.
`inherit.aes`	If `FALSE`, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. `borders()`.
`...`	other arguments passed on to `layer()`. These are often aesthetics, used to set an aesthetic to a fixed value, like `color = "red"` or `size = 3`. They may also be parameters to the paired geom/stat.

Details

It was integrated after reading this tutorial to extend ggplot2 layers

Examples

data(humanGender)
library(SummarizedExperiment)
library(ggplot2)
ggplot(as.data.frame(assay(humanGender)[1:1000,]),
       aes(x = NA20502, y = NA20504)) +
  geom_point() +
  ylim(0,1.1e5) +
  geom_cor(method = "kendall", ypos = 1e5) 
data(humanGender)
library(SummarizedExperiment)
library(ggplot2)
ggplot(as.data.frame(assay(humanGender)[1:1000,]),
       aes(x = NA20502, y = NA20504)) +
  geom_point() +
  ylim(0,1.1e5) +
  geom_cor(method = "kendall", ypos = 1e5)

DGEList object for DE genes betwen Male and Females

Description

DGEList object for DE genes betwen Male and Females

Usage

data(humanGender)
data(humanGender)

Format

DGEList

Author(s)

Lorena Pantano, 2017-08-37

Source

gEUvadis

Method to get the significant genes

Description

Function to get the features that are significant according to some thresholds from a DEGSet, DESeq2::DESeqResults and edgeR::topTags.

Usage

significants(object, padj = 0.05, fc = 0, direction = NULL, full = FALSE, ...)

## S4 method for signature 'DEGSet'
significants(object, padj = 0.05, fc = 0, direction = NULL, full = FALSE, ...)

## S4 method for signature 'DESeqResults'
significants(object, padj = 0.05, fc = 0, direction = NULL, full = FALSE, ...)

## S4 method for signature 'TopTags'
significants(object, padj = 0.05, fc = 0, direction = NULL, full = FALSE, ...)

## S4 method for signature 'list'
significants(
  object,
  padj = 0.05,
  fc = 0,
  direction = NULL,
  full = FALSE,
  newFDR = FALSE,
  ...
)
significants(object, padj = 0.05, fc = 0, direction = NULL, full = FALSE, ...)

## S4 method for signature 'DEGSet'
significants(object, padj = 0.05, fc = 0, direction = NULL, full = FALSE, ...)

## S4 method for signature 'DESeqResults'
significants(object, padj = 0.05, fc = 0, direction = NULL, full = FALSE, ...)

## S4 method for signature 'TopTags'
significants(object, padj = 0.05, fc = 0, direction = NULL, full = FALSE, ...)

## S4 method for signature 'list'
significants(
  object,
  padj = 0.05,
  fc = 0,
  direction = NULL,
  full = FALSE,
  newFDR = FALSE,
  ...
)

Arguments

`object`	DEGSet
`padj`	Cutoff for the FDR column.
`fc`	Cutoff for the log2FC column.
`direction`	Whether to take down/up/ignore. Valid arguments are down, up and NULL.
`full`	Whether to return full table or not.
`...`	Passed to deg. Default: value = NULL. Value can be 'raw', 'shrunken'.
`newFDR`	Whether to recalculate the FDR or not. See https://support.bioconductor.org/p/104059/#104072. Only used when a list is giving to the method.

Value

a dplyr::tbl_df data frame. gene column has the feature name. In the case of using this method with the results from degComps, log2FoldChange has the higher foldChange from the comparisons, and padj has the padj associated to the previous column. Then, there is two columns for each comparison, one for the log2FoldChange and another for the padj.

Author(s)

Lorena Pantano

Examples

library(DESeq2)
dds <- makeExampleDESeqDataSet(betaSD=1)
colData(dds)[["treatment"]] <- sample(colData(dds)[["condition"]], 12)
  design(dds) <-  ~ condition + treatment
dds <- DESeq(dds)
res <- degComps(dds, contrast = list("treatment_B_vs_A",
                                     c("condition", "A", "B")))
significants(res, full = TRUE)
# significants(res, full = TRUE, padj = 1) # all genes
library(DESeq2)
dds <- makeExampleDESeqDataSet(betaSD=1)
colData(dds)[["treatment"]] <- sample(colData(dds)[["condition"]], 12)
  design(dds) <-  ~ condition + treatment
dds <- DESeq(dds)
res <- degComps(dds, contrast = list("treatment_B_vs_A",
                                     c("condition", "A", "B")))
significants(res, full = TRUE)
# significants(res, full = TRUE, padj = 1) # all genes

Package 'DEGreport'

Help Index

Deprecated functions in package DEGreport

Description

Details

Author(s)

See Also

Create report of RNAseq DEG anlaysis

Description

Usage

Arguments

Value

Method to get all table stored for an specific comparison

Description

Usage

Arguments

Author(s)

References

Distribution of gene ratios used to calculate Size Factors.

Description

Usage

Arguments

Details

Value

References

Examples

Make nice colors for metadata

Description

Usage

Arguments

Examples

Automatize the use of results() for multiple comparisons

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Calculate the correlation relationshipt among all covariates in the metadata table

Description

Usage

Arguments

Value

Author(s)

Examples

Find correlation between pcs and covariates

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Method to get the default table to use.

Description

Usage

Arguments

Author(s)

Examples

Filter genes by group

Description

Usage

Arguments

Value

Examples

MA-plot from base means and log fold changes

Description

Usage

Arguments

Value

Author(s)

Examples

Distribution of expression of DE genes compared to the background

Description

Usage

Arguments

Value

Examples

Automatize the use of `results()` for multiple comparisons