Title: | Interactive Visualization of RNA-seq Data Using a Principal Components Approach |
---|---|
Description: | This package provides functionality for interactive visualization of RNA-seq datasets based on Principal Components Analysis. The methods provided allow for quick information extraction and effective data exploration. A Shiny application encapsulates the whole analysis. |
Authors: | Federico Marini [aut, cre] |
Maintainer: | Federico Marini <[email protected]> |
License: | MIT + file LICENSE |
Version: | 3.1.0 |
Built: | 2024-10-30 09:27:51 UTC |
Source: | https://github.com/bioc/pcaExplorer |
Computes the significance of (cor)relations between PCA scores and the sample
experimental covariates, using Kruskal-Wallis test for categorial variables
and the cor.test
based on Spearman's correlation for continuous
variables
correlatePCs(pcaobj, coldata, pcs = 1:4)
correlatePCs(pcaobj, coldata, pcs = 1:4)
pcaobj |
A |
coldata |
A |
pcs |
A numeric vector, containing the corresponding PC number |
A data.frame
object with computed p values for each covariate
and for each principal component
library(DESeq2) dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaobj <- prcomp(t(assay(rlt))) correlatePCs(pcaobj, colData(dds))
library(DESeq2) dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaobj <- prcomp(t(assay(rlt))) correlatePCs(pcaobj, colData(dds))
Functions that are on their way to the function afterlife. Their successors are also listed.
... |
Ignored arguments. |
The successors of these functions are likely coming after the rework that
led to the creation of the mosdef
package. See more into its
documentation for more details.
All functions throw a warning, with a deprecation message pointing towards its descendent (if available).
topGOtable()
is now being replaced by the more flexible
mosdef::run_topGO()
function
Federico Marini
# try(topGOtable())
# try(topGOtable())
Plot distribution of expression values
distro_expr(rld, plot_type = "density")
distro_expr(rld, plot_type = "density")
rld |
A |
plot_type |
Character, choose one of |
A plot with the distribution of the expression values
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) distro_expr(rlt)
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) distro_expr(rlt)
Extract and plot the expression profile of genes
geneprofiler(se, genelist = NULL, intgroup = "condition", plotZ = FALSE)
geneprofiler(se, genelist = NULL, intgroup = "condition", plotZ = FALSE)
se |
A |
genelist |
An array of characters, including the names of the genes of interest of which the profile is to be plotted |
intgroup |
A factor, needs to be in the |
plotZ |
Logical, whether to plot the scaled expression values. Defaults to
|
A plot of the expression profile for the genes
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) geneprofiler(rlt, paste0("gene", sample(1:1000, 20))) geneprofiler(rlt, paste0("gene", sample(1:1000, 20)), plotZ = TRUE)
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) geneprofiler(rlt, paste0("gene", sample(1:1000, 20))) geneprofiler(rlt, paste0("gene", sample(1:1000, 20)), plotZ = TRUE)
Computes and plots the principal components of the genes, eventually displaying the samples as in a typical biplot visualization.
genespca( x, ntop, choices = c(1, 2), arrowColors = "steelblue", groupNames = "group", biplot = TRUE, scale = 1, pc.biplot = TRUE, obs.scale = 1 - scale, var.scale = scale, groups = NULL, ellipse = FALSE, ellipse.prob = 0.68, labels = NULL, labels.size = 3, alpha = 1, var.axes = TRUE, circle = FALSE, circle.prob = 0.69, varname.size = 4, varname.adjust = 1.5, varname.abbrev = FALSE, returnData = FALSE, coordEqual = FALSE, scaleArrow = 1, useRownamesAsLabels = TRUE, point_size = 2, annotation = NULL )
genespca( x, ntop, choices = c(1, 2), arrowColors = "steelblue", groupNames = "group", biplot = TRUE, scale = 1, pc.biplot = TRUE, obs.scale = 1 - scale, var.scale = scale, groups = NULL, ellipse = FALSE, ellipse.prob = 0.68, labels = NULL, labels.size = 3, alpha = 1, var.axes = TRUE, circle = FALSE, circle.prob = 0.69, varname.size = 4, varname.adjust = 1.5, varname.abbrev = FALSE, returnData = FALSE, coordEqual = FALSE, scaleArrow = 1, useRownamesAsLabels = TRUE, point_size = 2, annotation = NULL )
x |
A |
ntop |
Number of top genes to use for principal components, selected by highest row variance |
choices |
Vector of two numeric values, to select on which principal components to plot |
arrowColors |
Vector of character, either as long as the number of the samples, or one single value |
groupNames |
Factor containing the groupings for the input data. Is efficiently chosen as the (interaction of more) factors in the colData for the object provided |
biplot |
Logical, whether to additionally draw the samples labels as in a biplot representation |
scale |
Covariance biplot (scale = 1), form biplot (scale = 0). When scale = 1, the inner product between the variables approximates the covariance and the distance between the points approximates the Mahalanobis distance. |
pc.biplot |
Logical, for compatibility with biplot.princomp() |
obs.scale |
Scale factor to apply to observations |
var.scale |
Scale factor to apply to variables |
groups |
Optional factor variable indicating the groups that the observations belong to. If provided the points will be colored according to groups |
ellipse |
Logical, draw a normal data ellipse for each group |
ellipse.prob |
Size of the ellipse in Normal probability |
labels |
optional Vector of labels for the observations |
labels.size |
Size of the text used for the labels |
alpha |
Alpha transparency value for the points (0 = transparent, 1 = opaque) |
var.axes |
Logical, draw arrows for the variables? |
circle |
Logical, draw a correlation circle? (only applies when prcomp was called with scale = TRUE and when var.scale = 1) |
circle.prob |
Size of the correlation circle in Normal probability |
varname.size |
Size of the text for variable names |
varname.adjust |
Adjustment factor the placement of the variable names, '>= 1' means farther from the arrow |
varname.abbrev |
Logical, whether or not to abbreviate the variable names |
returnData |
Logical, if TRUE returns a data.frame for further use, containing the selected principal components for custom plotting |
coordEqual |
Logical, default FALSE, for allowing brushing. If TRUE, plot using equal scale cartesian coordinates |
scaleArrow |
Multiplicative factor, usually >=1, only for visualization purposes, to allow for distinguishing where the variables are plotted |
useRownamesAsLabels |
Logical, if TRUE uses the row names as labels for plotting |
point_size |
Size of the points to be plotted for the observations (genes) |
annotation |
A |
The implementation of this function is based on the beautiful ggbiplot
package developed by Vince Vu, available at https://github.com/vqv/ggbiplot.
The adaptation and additional parameters are tailored to display typical genomics data
such as the transformed counts of RNA-seq experiments
An object created by ggplot
, which can be assigned and further customized.
library(DESeq2) dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- rlogTransformation(dds) groups <- colData(dds)$condition groups <- factor(groups, levels = unique(groups)) cols <- scales::hue_pal()(2)[groups] genespca(rlt, ntop=100, arrowColors = cols, groupNames = groups) groups_multi <- interaction(as.data.frame(colData(rlt)[, c("condition", "tissue")])) groups_multi <- factor(groups_multi, levels = unique(groups_multi)) cols_multi <- scales::hue_pal()(length(levels(groups_multi)))[factor(groups_multi)] genespca(rlt, ntop = 100, arrowColors = cols_multi, groupNames = groups_multi)
library(DESeq2) dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- rlogTransformation(dds) groups <- colData(dds)$condition groups <- factor(groups, levels = unique(groups)) cols <- scales::hue_pal()(2)[groups] genespca(rlt, ntop=100, arrowColors = cols, groupNames = groups) groups_multi <- interaction(as.data.frame(colData(rlt)[, c("condition", "tissue")])) groups_multi <- factor(groups_multi, levels = unique(groups_multi)) cols_multi <- scales::hue_pal()(length(levels(groups_multi)))[factor(groups_multi)] genespca(rlt, ntop = 100, arrowColors = cols_multi, groupNames = groups_multi)
Get an annotation data frame from biomaRt
get_annotation(dds, biomart_dataset, idtype)
get_annotation(dds, biomart_dataset, idtype)
dds |
A |
biomart_dataset |
A biomaRt dataset to use. To see the list, type
|
idtype |
Character, the ID type of the genes as in the row names of
|
A data frame for ready use in pcaExplorer
, retrieved from biomaRt.
library("airway") data("airway", package = "airway") airway dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~dex+cell) ## Not run: get_annotation(dds_airway, "hsapiens_gene_ensembl", "ensembl_gene_id") ## End(Not run)
library("airway") data("airway", package = "airway") airway dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~dex+cell) ## Not run: get_annotation(dds_airway, "hsapiens_gene_ensembl", "ensembl_gene_id") ## End(Not run)
Get an annotation data frame from org db packages
get_annotation_orgdb(dds, orgdb_species, idtype, key_for_genenames = "SYMBOL")
get_annotation_orgdb(dds, orgdb_species, idtype, key_for_genenames = "SYMBOL")
dds |
A |
orgdb_species |
Character string, named as the |
idtype |
Character, the ID type of the genes as in the row names of
|
key_for_genenames |
Character, corresponding to the column name for the
key in the orgDb package containing the official gene name (often called
gene symbol).
This parameter defaults to "SYMBOL", but can be adjusted in case the key is not
found in the annotation package (e.g. for |
A data frame for ready use in pcaExplorer
, retrieved from the
org db packages
library("airway") data("airway", package = "airway") airway dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~dex+cell) anno_df <- get_annotation_orgdb(dds_airway, "org.Hs.eg.db", "ENSEMBL") head(anno_df)
library("airway") data("airway", package = "airway") airway dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~dex+cell) anno_df <- get_annotation_orgdb(dds_airway, "org.Hs.eg.db", "ENSEMBL") head(anno_df)
Extract genes with highest loadings
hi_loadings( pcaobj, whichpc = 1, topN = 10, exprTable = NULL, annotation = NULL, title = "Top/bottom loadings" )
hi_loadings( pcaobj, whichpc = 1, topN = 10, exprTable = NULL, annotation = NULL, title = "Top/bottom loadings" )
pcaobj |
A |
whichpc |
An integer number, corresponding to the principal component of interest |
topN |
Integer, number of genes with top and bottom loadings |
exprTable |
A |
annotation |
A |
title |
The title of the plot |
A ggplot2 object, or a matrix
, if exprTable
is not null
dds <- makeExampleDESeqDataSet_multifac(betaSD = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaobj <- prcomp(t(SummarizedExperiment::assay(rlt))) hi_loadings(pcaobj, topN = 20) hi_loadings(pcaobj, topN = 10, exprTable = dds) hi_loadings(pcaobj, topN = 10, exprTable = counts(dds))
dds <- makeExampleDESeqDataSet_multifac(betaSD = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaobj <- prcomp(t(SummarizedExperiment::assay(rlt))) hi_loadings(pcaobj, topN = 20) hi_loadings(pcaobj, topN = 10, exprTable = dds) hi_loadings(pcaobj, topN = 10, exprTable = counts(dds))
Extracts the genes with the highest loadings for each principal component, and
performs functional enrichment analysis on them using the simple and quick routine
provided by the limma
package
limmaquickpca2go( se, pca_ngenes = 10000, inputType = "ENSEMBL", organism = "Mm", loadings_ngenes = 500, background_genes = NULL, scale = FALSE, ... )
limmaquickpca2go( se, pca_ngenes = 10000, inputType = "ENSEMBL", organism = "Mm", loadings_ngenes = 500, background_genes = NULL, scale = FALSE, ... )
se |
A |
pca_ngenes |
Number of genes to use for the PCA |
inputType |
Input format type of the gene identifiers. Deafults to |
organism |
Character abbreviation for the species, using |
loadings_ngenes |
Number of genes to extract the loadings (in each direction) |
background_genes |
Which genes to consider as background. |
scale |
Logical, defaults to FALSE, scale values for the PCA |
... |
Further parameters to be passed to the goana routine |
A nested list object containing for each principal component the terms enriched
in each direction. This object is to be thought in combination with the displaying feature
of the main pcaExplorer()
function
library("airway") library("DESeq2") library("limma") data("airway", package = "airway") airway dds_airway <- DESeqDataSet(airway, design = ~ cell + dex) ## Not run: rld_airway <- rlogTransformation(dds_airway) goquick_airway <- limmaquickpca2go(rld_airway, pca_ngenes = 10000, inputType = "ENSEMBL", organism = "Hs") ## End(Not run)
library("airway") library("DESeq2") library("limma") data("airway", package = "airway") airway dds_airway <- DESeqDataSet(airway, design = ~ cell + dex) ## Not run: rld_airway <- rlogTransformation(dds_airway) goquick_airway <- limmaquickpca2go(rld_airway, pca_ngenes = 10000, inputType = "ENSEMBL", organism = "Hs") ## End(Not run)
Constructs a simulated dataset of Negative Binomial data from different conditions.
The fold changes between the conditions can be adjusted with the betaSD_condition
and the betaSD_tissue
arguments.
makeExampleDESeqDataSet_multifac( n = 1000, m = 12, betaSD_condition = 1, betaSD_tissue = 3, interceptMean = 4, interceptSD = 2, dispMeanRel = function(x) 4/x + 0.1, sizeFactors = rep(1, m) )
makeExampleDESeqDataSet_multifac( n = 1000, m = 12, betaSD_condition = 1, betaSD_tissue = 3, interceptMean = 4, interceptSD = 2, dispMeanRel = function(x) 4/x + 0.1, sizeFactors = rep(1, m) )
n |
number of rows (genes) |
m |
number of columns (samples) |
betaSD_condition |
the standard deviation for condition betas, i.e. beta ~ N(0,betaSD) |
betaSD_tissue |
the standard deviation for tissue betas, i.e. beta ~ N(0,betaSD) |
interceptMean |
the mean of the intercept betas (log2 scale) |
interceptSD |
the standard deviation of the intercept betas (log2 scale) |
dispMeanRel |
a function specifying the relationship of the dispersions on
|
sizeFactors |
multiplicative factors for each sample |
This function is designed and inspired following the proposal of
makeExampleDESeqDataSet()
from the DESeq2
package. Credits are given
to Mike Love for the nice initial implementation
a DESeqDataSet()
with true dispersion,
intercept for two factors (condition and tissue) and beta values in the
metadata columns. Note that the true betas are provided on the log2 scale.
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) dds dds2 <- makeExampleDESeqDataSet_multifac(betaSD_condition = 1, betaSD_tissue = 4) dds2
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) dds dds2 <- makeExampleDESeqDataSet_multifac(betaSD_condition = 1, betaSD_tissue = 4) dds2
Pairwise scatter and correlation plot of counts
pair_corr(df, log = FALSE, method = "pearson", use_subset = TRUE)
pair_corr(df, log = FALSE, method = "pearson", use_subset = TRUE)
df |
A data frame, containing the (raw/normalized/transformed) counts |
log |
Logical, whether to convert the input values to log2 (with addition of a pseudocount). Defaults to FALSE. |
method |
Character string, one of |
use_subset |
Logical value. If TRUE, only 1000 values per sample will be used to speed up the plotting operations. |
A plot with pairwise scatter plots and correlation coefficients
library("airway") data("airway", package = "airway") airway dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~dex+cell) pair_corr(counts(dds_airway)[1:100, ]) # use just a subset for the example
library("airway") data("airway", package = "airway") airway dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~dex+cell) pair_corr(counts(dds_airway)[1:100, ]) # use just a subset for the example
Extracts the genes with the highest loadings for each principal component, and
performs functional enrichment analysis on them using routines and algorithms from
the topGO
package
pca2go( se, pca_ngenes = 10000, annotation = NULL, inputType = "geneSymbol", organism = "Mm", ensToGeneSymbol = FALSE, loadings_ngenes = 500, background_genes = NULL, scale = FALSE, return_ranked_gene_loadings = FALSE, annopkg = NULL, ... )
pca2go( se, pca_ngenes = 10000, annotation = NULL, inputType = "geneSymbol", organism = "Mm", ensToGeneSymbol = FALSE, loadings_ngenes = 500, background_genes = NULL, scale = FALSE, return_ranked_gene_loadings = FALSE, annopkg = NULL, ... )
se |
A |
pca_ngenes |
Number of genes to use for the PCA |
annotation |
A |
inputType |
Input format type of the gene identifiers. Will be used by the routines of |
organism |
Character abbreviation for the species, using |
ensToGeneSymbol |
Logical, whether to expect ENSEMBL gene identifiers, to convert to gene symbols
with the |
loadings_ngenes |
Number of genes to extract the loadings (in each direction) |
background_genes |
Which genes to consider as background. |
scale |
Logical, defaults to FALSE, scale values for the PCA |
return_ranked_gene_loadings |
Logical, defaults to FALSE. If TRUE, simply returns a list containing the top ranked genes with hi loadings in each PC and in each direction |
annopkg |
String containing the name of the organism annotation package. Can be used to
override the |
... |
Further parameters to be passed to the topGO routine |
A nested list object containing for each principal component the terms enriched
in each direction. This object is to be thought in combination with the displaying feature
of the main pcaExplorer()
function
library("airway") library("DESeq2") data("airway", package = "airway") airway dds_airway <- DESeqDataSet(airway, design= ~ cell + dex) ## Not run: rld_airway <- rlogTransformation(dds_airway) # constructing the annotation object anno_df <- data.frame(gene_id = rownames(dds_airway), stringsAsFactors = FALSE) library("AnnotationDbi") library("org.Hs.eg.db") anno_df$gene_name <- mapIds(org.Hs.eg.db, keys = anno_df$gene_id, column = "SYMBOL", keytype = "ENSEMBL", multiVals = "first") rownames(anno_df) <- anno_df$gene_id bg_ids <- rownames(dds_airway)[rowSums(counts(dds_airway)) > 0] library(topGO) pca2go_airway <- pca2go(rld_airway, annotation = anno_df, organism = "Hs", ensToGeneSymbol = TRUE, background_genes = bg_ids) ## End(Not run)
library("airway") library("DESeq2") data("airway", package = "airway") airway dds_airway <- DESeqDataSet(airway, design= ~ cell + dex) ## Not run: rld_airway <- rlogTransformation(dds_airway) # constructing the annotation object anno_df <- data.frame(gene_id = rownames(dds_airway), stringsAsFactors = FALSE) library("AnnotationDbi") library("org.Hs.eg.db") anno_df$gene_name <- mapIds(org.Hs.eg.db, keys = anno_df$gene_id, column = "SYMBOL", keytype = "ENSEMBL", multiVals = "first") rownames(anno_df) <- anno_df$gene_id bg_ids <- rownames(dds_airway)[rowSums(counts(dds_airway)) > 0] library(topGO) pca2go_airway <- pca2go(rld_airway, annotation = anno_df, organism = "Hs", ensToGeneSymbol = TRUE, background_genes = bg_ids) ## End(Not run)
Launch a Shiny App for interactive exploration of a dataset from the perspective of Principal Components Analysis
pcaExplorer( dds = NULL, dst = NULL, countmatrix = NULL, coldata = NULL, pca2go = NULL, annotation = NULL, runLocal = TRUE )
pcaExplorer( dds = NULL, dst = NULL, countmatrix = NULL, coldata = NULL, pca2go = NULL, annotation = NULL, runLocal = TRUE )
dds |
A |
dst |
A |
countmatrix |
A count matrix, with genes as rows and samples as columns. If not provided, it is possible to upload the data during the execution of the Shiny App |
coldata |
A data.frame containing the info on the covariates of each sample. If not provided, it is possible to upload the data during the execution of the Shiny App |
pca2go |
An object generated by the |
annotation |
A |
runLocal |
A logical indicating whether the app is to be run locally or remotely on a server, which determines how documentation will be accessed. |
A Shiny App is launched for interactive data exploration
library("airway") data("airway", package = "airway") airway dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~dex+cell) ## Not run: rld_airway <- DESeq2::rlogTransformation(dds_airway) pcaExplorer(dds_airway, rld_airway) pcaExplorer(countmatrix = counts(dds_airway), coldata = colData(dds_airway)) pcaExplorer() # and then upload count matrix, covariate matrix (and eventual annotation) ## End(Not run)
library("airway") data("airway", package = "airway") airway dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~dex+cell) ## Not run: rld_airway <- DESeq2::rlogTransformation(dds_airway) pcaExplorer(dds_airway, rld_airway) pcaExplorer(countmatrix = counts(dds_airway), coldata = colData(dds_airway)) pcaExplorer() # and then upload count matrix, covariate matrix (and eventual annotation) ## End(Not run)
pcaExplorer provides functionality for interactive visualization of RNA-seq datasets based on Principal Components Analysis. The methods provided allow for quick information extraction and effective data exploration. A Shiny application encapsulates the whole analysis.
pcaExplorer provides functionality for interactive visualization of RNA-seq datasets based on Principal Components Analysis. The methods provided allow for quick information extraction and effective data exploration. A Shiny application encapsulates the whole analysis.
Federico Marini [email protected], 2016
Maintainer: Federico Marini [email protected]
Useful links:
Report bugs at https://github.com/federicomarini/pcaExplorer/issues
Plots the results of PCA on a 2-dimensional space
pcaplot( x, intgroup = NULL, ntop = 500, returnData = FALSE, title = NULL, pcX = 1, pcY = 2, text_labels = TRUE, point_size = 3, ellipse = TRUE, ellipse.prob = 0.95 )
pcaplot( x, intgroup = NULL, ntop = 500, returnData = FALSE, title = NULL, pcX = 1, pcY = 2, text_labels = TRUE, point_size = 3, ellipse = TRUE, ellipse.prob = 0.95 )
x |
A |
intgroup |
Interesting groups: a character vector of
names in |
ntop |
Number of top genes to use for principal components, selected by highest row variance |
returnData |
logical, if TRUE returns a data.frame for further use, containing the selected principal components and intgroup covariates for custom plotting |
title |
The plot title |
pcX |
The principal component to display on the x axis |
pcY |
The principal component to display on the y axis |
text_labels |
Logical, whether to display the labels with the sample identifiers |
point_size |
Integer, the size of the points for the samples |
ellipse |
Logical, whether to display the confidence ellipse for the selected groups |
ellipse.prob |
Numeric, a value in the interval [0;1) |
An object created by ggplot
, which can be assigned and further customized.
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaplot(rlt, ntop = 200)
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaplot(rlt, ntop = 200)
Plots the results of PCA on a 3-dimensional space, interactively
pcaplot3d( x, intgroup = "condition", ntop = 500, returnData = FALSE, title = NULL, pcX = 1, pcY = 2, pcZ = 3, text_labels = TRUE, point_size = 3 )
pcaplot3d( x, intgroup = "condition", ntop = 500, returnData = FALSE, title = NULL, pcX = 1, pcY = 2, pcZ = 3, text_labels = TRUE, point_size = 3 )
x |
A |
intgroup |
Interesting groups: a character vector of
names in |
ntop |
Number of top genes to use for principal components, selected by highest row variance |
returnData |
logical, if TRUE returns a data.frame for further use, containing the selected principal components and intgroup covariates for custom plotting |
title |
The plot title |
pcX |
The principal component to display on the x axis |
pcY |
The principal component to display on the y axis |
pcZ |
The principal component to display on the z axis |
text_labels |
Logical, whether to display the labels with the sample identifiers |
point_size |
Integer, the size of the points for the samples |
A html-based visualization of the 3d PCA plot
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaplot3d(rlt, ntop = 200)
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaplot3d(rlt, ntop = 200)
Produces a scree plot for investigating the proportion of explained variance, or alternatively the cumulative value
pcascree(obj, type = c("pev", "cev"), pc_nr = NULL, title = NULL)
pcascree(obj, type = c("pev", "cev"), pc_nr = NULL, title = NULL)
obj |
A |
type |
Display absolute proportions or cumulative proportion. Possible values: "pev" or "cev" |
pc_nr |
How many principal components to display max |
title |
Title of the plot |
An object created by ggplot
, which can be assigned and further customized.
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaobj <- prcomp(t(SummarizedExperiment::assay(rlt))) pcascree(pcaobj, type = "pev") pcascree(pcaobj, type = "cev", title = "Cumulative explained proportion of variance - Test dataset")
dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- DESeq2::rlogTransformation(dds) pcaobj <- prcomp(t(SummarizedExperiment::assay(rlt))) pcascree(pcaobj, type = "pev") pcascree(pcaobj, type = "cev", title = "Cumulative explained proportion of variance - Test dataset")
Plots the significance of the (cor)relation of each covariate vs a principal component
plotPCcorrs(pccorrs, pc = 1, logp = TRUE)
plotPCcorrs(pccorrs, pc = 1, logp = TRUE)
pccorrs |
A |
pc |
An integer number, corresponding to the principal component of interest |
logp |
Logical, defaults to |
A base plot object
library(DESeq2) dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- rlogTransformation(dds) pcaobj <- prcomp(t(assay(rlt))) res <- correlatePCs(pcaobj, colData(dds)) plotPCcorrs(res)
library(DESeq2) dds <- makeExampleDESeqDataSet_multifac(betaSD_condition = 3, betaSD_tissue = 1) rlt <- rlogTransformation(dds) pcaobj <- prcomp(t(assay(rlt))) res <- correlatePCs(pcaobj, colData(dds)) plotPCcorrs(res)
A wrapper for extracting functional GO terms enriched in the DE genes, based on the algorithm and the implementation in the topGO package
topGOtable( DEgenes, BGgenes, ontology = "BP", annot = annFUN.org, mapping = "org.Mm.eg.db", geneID = "symbol", topTablerows = 200, fullNamesInRows = TRUE, addGeneToTerms = TRUE, plotGraph = FALSE, plotNodes = 10, writeOutput = FALSE, outputFile = "", topGO_method2 = "elim", do_padj = FALSE )
topGOtable( DEgenes, BGgenes, ontology = "BP", annot = annFUN.org, mapping = "org.Mm.eg.db", geneID = "symbol", topTablerows = 200, fullNamesInRows = TRUE, addGeneToTerms = TRUE, plotGraph = FALSE, plotNodes = 10, writeOutput = FALSE, outputFile = "", topGO_method2 = "elim", do_padj = FALSE )
DEgenes |
A vector of (differentially expressed) genes |
BGgenes |
A vector of background genes, e.g. all (expressed) genes in the assays |
ontology |
Which Gene Ontology domain to analyze: |
annot |
Which function to use for annotating genes to GO terms. Defaults to |
mapping |
Which |
geneID |
Which format the genes are provided. Defaults to |
topTablerows |
How many rows to report before any filtering |
fullNamesInRows |
Logical, whether to display or not the full names for the GO terms |
addGeneToTerms |
Logical, whether to add a column with all genes annotated to each GO term |
plotGraph |
Logical, if TRUE additionally plots a graph on the identified GO terms |
plotNodes |
Number of nodes to plot |
writeOutput |
Logical, if TRUE additionally writes out the result to a file |
outputFile |
Name of the file the result should be written into |
topGO_method2 |
Character, specifying which of the methods implemented by |
do_padj |
Logical, whether to perform the adjustment on the p-values from the specific topGO method, based on the FDR correction. Defaults to FALSE, since the assumption of independent hypotheses is somewhat violated by the intrinsic DAG-structure of the Gene Ontology Terms |
Allowed values assumed by the topGO_method2
parameter are one of the
following: elim
, weight
, weight01
, lea
,
parentchild
. For more details on this, please refer to the original
documentation of the topGO
package itself
A table containing the computed GO Terms and related enrichment scores
library("airway") library("DESeq2") data("airway", package = "airway") airway dds_airway <- DESeqDataSet(airway, design= ~ cell + dex) # Example, performing extraction of enriched functional categories in # detected significantly expressed genes ## Not run: dds_airway <- DESeq(dds_airway) res_airway <- results(dds_airway) library("AnnotationDbi") library("org.Hs.eg.db") res_airway$symbol <- mapIds(org.Hs.eg.db, keys = row.names(res_airway), column = "SYMBOL", keytype = "ENSEMBL", multiVals = "first") res_airway$entrez <- mapIds(org.Hs.eg.db, keys = row.names(res_airway), column = "ENTREZID", keytype = "ENSEMBL", multiVals = "first") resOrdered <- as.data.frame(res_airway[order(res_airway$padj),]) de_df <- resOrdered[resOrdered$padj < .05 & !is.na(resOrdered$padj),] de_symbols <- de_df$symbol bg_ids <- rownames(dds_airway)[rowSums(counts(dds_airway)) > 0] bg_symbols <- mapIds(org.Hs.eg.db, keys = bg_ids, column = "SYMBOL", keytype = "ENSEMBL", multiVals = "first") library(topGO) topgoDE_airway <- topGOtable(de_symbols, bg_symbols, ontology = "BP", mapping = "org.Hs.eg.db", geneID = "symbol") ## End(Not run)
library("airway") library("DESeq2") data("airway", package = "airway") airway dds_airway <- DESeqDataSet(airway, design= ~ cell + dex) # Example, performing extraction of enriched functional categories in # detected significantly expressed genes ## Not run: dds_airway <- DESeq(dds_airway) res_airway <- results(dds_airway) library("AnnotationDbi") library("org.Hs.eg.db") res_airway$symbol <- mapIds(org.Hs.eg.db, keys = row.names(res_airway), column = "SYMBOL", keytype = "ENSEMBL", multiVals = "first") res_airway$entrez <- mapIds(org.Hs.eg.db, keys = row.names(res_airway), column = "ENTREZID", keytype = "ENSEMBL", multiVals = "first") resOrdered <- as.data.frame(res_airway[order(res_airway$padj),]) de_df <- resOrdered[resOrdered$padj < .05 & !is.na(resOrdered$padj),] de_symbols <- de_df$symbol bg_ids <- rownames(dds_airway)[rowSums(counts(dds_airway)) > 0] bg_symbols <- mapIds(org.Hs.eg.db, keys = bg_ids, column = "SYMBOL", keytype = "ENSEMBL", multiVals = "first") library(topGO) topgoDE_airway <- topGOtable(de_symbols, bg_symbols, ontology = "BP", mapping = "org.Hs.eg.db", geneID = "symbol") ## End(Not run)