daVis

Introduction

This document describes the functionality available in the daVis package.

daVis contains utility functions to visualize the output from differential expression analysis. The input data can be a model, a list of top tables, or a combination of these two. The model can be of class MArrayLM (limma), DGELRT (edgeR), or DESeqResults (DESeq2).

Installation

Download the package from Bioconductor:

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("daVis")

Load the package into the R session:

library(daVis)

Load data

Each visualization function takes one of the following as input:

  • model - an object of class MArrayLM, DGELRT or DESeqResults
  • topTables - a list of top tables for multiple coefficients
  • a combination of model(s) and/or top table(s)

tmpDir <- tempfile(); dir.create(tmpDir)
exampleData <- createExampleData(
    path = tmpDir, 
    output = c("limma", "edgeR", "deseq2", "topTable")
)
## calcNormFactors has been renamed to normLibSizes
res.limma <- exampleData$limma
res.edger <- exampleData$edgeR
res.deseq <- exampleData$DESeq2
topTableList <- exampleData$topTable

Top table

The user should provide a named list of top tables for different contrasts/coefficients. Here, the list contains 4 top tables from 4 comparisons. Each top table must contain the following columns: 'logFC', 'P.Value', 'adj.P.Val'. Additionally, each table can contain columns with gene identifiers and the average expression across all samples ('AveExpr' or 'logCPM').

length(topTableList)
## [1] 4

The names of the list are linked to the comparisons:

names(topTableList)
## [1] "B.LvsP" "L.LvsP" "B.PvsV" "L.PvsV"

Below is a subset of an example top table:

ENTREZID  SYMBOL    logFC  AveExpr    P.Value  adj.P.Val
24117     Wif1      -1.82    2.975  1.279e-10  9.387e-07
381290    Atp2b4    2.144    3.944  1.852e-10  9.387e-07
226101    Myof       2.33    6.223  2.827e-10  9.387e-07
78896     Ecrg4    -2.807    3.036  3.011e-10  9.387e-07
231830    Micall2  -2.253    4.761  3.527e-10  9.387e-07
16012     Igfbp6    2.896    1.978  2.931e-10  9.387e-07
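
The column requirements described above can be checked before plotting. The following is a minimal sketch in base R and is not part of daVis: the helper checkTopTables() and the toy list toyList are hypothetical names introduced for illustration only.

```r
# Columns that every top table must contain (as stated above)
requiredCols <- c("logFC", "P.Value", "adj.P.Val")

# Hypothetical helper: returns, per top table, the required columns
# that are missing (tables without missing columns are dropped)
checkTopTables <- function(topTables, required = requiredCols) {
  stopifnot(is.list(topTables), !is.null(names(topTables)))
  missingCols <- lapply(topTables, function(tt) setdiff(required, colnames(tt)))
  missingCols[lengths(missingCols) > 0]
}

# Toy input: the second table lacks the adjusted p-value column
toyList <- list(
  cmp1 = data.frame(logFC = c(1.2, -0.8), P.Value = c(0.001, 0.2),
                    adj.P.Val = c(0.01, 0.4)),
  cmp2 = data.frame(logFC = 0.5, P.Value = 0.03)
)

missingByTable <- checkTopTables(toyList)
missingByTable
```

An empty list means that every table provides the three required columns.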

Model

The object res.limma is of class:

class(res.limma)
## [1] "MArrayLM"
## attr(,"package")
## [1] "limma"

The object res.edger is of class:

class(res.edger)
## [1] "DGELRT"
## attr(,"package")
## [1] "edgeR"

The object res.deseq is of class:

class(res.deseq)
## [1] "DESeqResults"
## attr(,"package")
## [1] "DESeq2"

Visualizations

Volcano plot

A volcano plot enables quick visual identification of features with both a large and a significant expression effect (top left/right of the plot). The significance of the effect is represented by the raw p-value on the y-axis, so highly significant features are at the top of the plot. The size of the effect is represented by the log fold change on the x-axis (negative for down-regulation, positive for up-regulation), so features with large effects are at the far left or right of the plot. Below, the color scale represents adjusted p-values (corrected for multiple testing across genes).
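
As an illustration of the axes described above, the sketch below derives volcano-plot coordinates from a toy top table in base R. It assumes the common convention of plotting -log10 of the raw p-value on the y-axis; whether daVolcanoPlot() uses exactly this transformation is an assumption, and toyTable and volcanoCoords are hypothetical names.

```r
# Toy top table with the two columns needed for the axes
toyTable <- data.frame(
  logFC = c(-2.5, 0.1, 3.0),
  P.Value = c(1e-4, 0.5, 1e-6)
)

# x: effect size (log fold change)
# y: significance, assumed here to be -log10 of the raw p-value
volcanoCoords <- data.frame(
  x = toyTable$logFC,
  y = -log10(toyTable$P.Value)
)
volcanoCoords
```

With this convention, the feature with the smallest p-value gets the largest y value and therefore sits at the top of the plot.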

The function documentation, including a description of each parameter, can be obtained with:

help("daVolcanoPlot", "daVis")

Highlight top genes or genes of interest

The function daVolcanoPlot() creates a volcano plot for the provided model, list of top tables, or a combination of both. The feature identifier can be specified with the featuresIdVar parameter. If it is empty (the default), the row names of the input are used as feature identifiers. The colorVar, shapeVar, alphaVar and/or sizeVar parameters can be used to customize the plot. Here, the colorVar parameter is used to color the points by adjusted p-value. The topGenes parameter sets the number of top genes, ranked by logFC or p-value, to highlight in the plot for each considered coefficient (0 by default). These features are labeled using the column given by the topGenesVar parameter. If it is empty, the row names of the input are used.

Additionally, a set of genes of interest can be highlighted in red. These features are specified using the genesToHighlight parameter, and genesToHighlightVar indicates the identifier used to label them. The values in genesToHighlight should match the row names of the input data.

coefs <- c("B.LvsP", "L.LvsP")
genesOfInterest <- c("497097", "20671", "239273", "14862", "27395", "76408")

daVolcanoPlot(
  input = res.limma, 
  coef = coefs, 
  coefLabel = c("A", "B"),
  topGenes = 5,
  topGenesVar = "SYMBOL",
  genesToHighlight = genesOfInterest,
  genesToHighlightVar = "SYMBOL",
  colorVar = "adj.P.Val",
  facetNCol = 2
)
## Loading required namespace: ggrepel

Additional FDR threshold

coefs <- c("B.LvsP", "L.LvsP")

daVolcanoPlot(
  input = res.limma, 
  coef = coefs, 
  coefLabel = c("A", "B"),
  facetNCol = 2,
  additionalThresholdsAdjPValue = 0.1,
  colorVar = "adj.P.Val"
)

Interactive plot

An interactive volcano plot is created by setting the typePlot parameter to "interactive".

coefs <- c("B.LvsP", "L.LvsP")

daVolcanoPlot(
  input = res.limma, 
  coef = coefs, 
  coefLabel = c("A", "B"),
  typePlot = "interactive"
)

Log-ratio plot

A log-ratio plot represents the differential effect (e.g., treatment versus control) for several conditions (e.g., compounds or concentrations) of the experiment on the logFC scale. This makes it possible to visualize a larger subset of genes. The significance of genes can be represented via colored rows, e.g., red denotes significant genes, while grey indicates non-significant genes.

The function documentation, including a description of each parameter, can be obtained with:

help("daLogRatioPlot", "daVis")

Color feature labels

The function daLogRatioPlot() creates a log-ratio plot for the provided model or top table (or a list of those). The features should be specified using the features parameter, and the feature identifier can be specified using the featuresIdVar parameter. If the features parameter is not provided, the plot displays the top 20 features. The feature labels can be colored using the featuresColor parameter. Here, the features are colored based on their significance for the B.LvsP coefficient. Additionally, the features can be labeled by providing the featuresVar parameter, which indicates the column names in the top table.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

features <- c(
  "72243", "66704", "11781", "226101", "14620", "381290", "16012", 
  "100504225", "231830", "78896", "103889", "231991", "77034", "19687", 
  "100043805", "16669", "226162", "24117", "55987", "11419" 
)
dt <- topTableList[['B.LvsP']]
adjPVal <- dt[match(features, dt$ENTREZID), "adj.P.Val"]
signFeature <- ifelse(adjPVal <= 0.05, "red", "grey")

daLogRatioPlot(
  input = topTableList,
  features = features,
  featuresIdVar = "ENTREZID",
  featuresVar = c("SYMBOL", "GENENAME"),
  featuresMaxNChar = 35,
  coef = coefs,
  coefLabel = c("A", "B", "C", "D"),
  facetNCol = 4,
  featuresColor = signFeature,
  errorBars = FALSE
)

Facet by variable(s)

The coefficients can be grouped by specifying multiple sets of labels to the coefLabel parameter. The colors of the bars can be changed via the color parameter.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

coefsLabel <- list(
  sub("(.+)\\.(.+)", "\\2", coefs),
  sub("(.+)\\.(.+)", "\\1", coefs)
)

colorPalette <- c(
  `B.PvsV` = "darkgreen", `L.PvsV` = "lightgreen",
  `B.LvsP` = "darkblue", `L.LvsP` = "lightblue"
)

daLogRatioPlot(
  input = topTableList,
  featuresIdVar = "ENTREZID",
  coef = coefs,
  coefLabel = coefsLabel,
  facetNCol = 4,
  errorBars = FALSE,
  color = colorPalette
)

# Note: using coef labels as names of the color palette also works 
# (if coefLabel is NOT a list) - for back-compatibility
# colorPalette <- c(
#   C = "darkgreen", D = "lightgreen",
#   A = "darkblue", B = "lightblue"
# )
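
To make the grouping above easier to follow, this short base R sketch shows what the two sub() calls extract from the coefficient names. It only re-runs the expressions from the example; the variable names contrastPart and groupPart are introduced here for illustration.

```r
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

# Part after the dot: the contrast (first set of labels)
contrastPart <- sub("(.+)\\.(.+)", "\\2", coefs)
# Part before the dot: the group (second set of labels)
groupPart <- sub("(.+)\\.(.+)", "\\1", coefs)

contrastPart
## [1] "LvsP" "LvsP" "PvsV" "PvsV"
groupPart
## [1] "B" "L" "B" "L"
```

Each element of the coefLabel list therefore provides one label per coefficient, in the same order as coefs.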

Mixed input

A log-ratio plot can be created for a (mixed) list of top table(s) and model(s).

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV", "A")

daLogRatioPlot(
  input = list(res.limma, A = topTableList[["B.LvsP"]]),
  featuresIdVar = "ENTREZID",
  coef = coefs,
  facetNCol = 5,
  errorBars = TRUE
)

Sort features

The features can be sorted with the featuresOrder parameter.

Based on significance

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

daLogRatioPlot(
  input = res.limma,
  featuresIdVar = "ENTREZID",
  features = features, featuresOrder = "significance",
  coef = coefs,
  facetNCol = 4,
  errorBars = TRUE
)

Display text

Text can be displayed in the log-ratio plot via the text parameter. This can be a column of the top table or a function that formats such column(s).
If the text doesn’t fit within the axis limits, the x-axis can be expanded via the xexpand parameter.

Significance star

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
# Note: the format.pval function can be used to format the p-value as a text
getSignifStar <- function(topTable){
  with(topTable, as.character(
    stats::symnum(
      x = adj.P.Val, 
      cutpoints = c(0, .001, .01, .05, .1, 1),
      symbols = c("***","**","*","."," "),
      corr = FALSE
    )
  ))
}
daLogRatioPlot(
  input = res.limma,
  coef = coefs,
  coefLabel = c("A", "B", "C", "D"),
  color = colorPalette,
  facetNCol = 4,
  text = getSignifStar
)

Log fold change

daLogRatioPlot(
  input = res.limma,
  coef = coefs,
  coefLabel = c("A", "B", "C", "D"),
  color = colorPalette,
  facetNCol = 4,
  text = function(topTable)
    with(topTable, round(logFC, digits = 1)),
  textCex = 3
)
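
Following the note above that format.pval can be used to format p-values as text, here is a minimal sketch of such a formatting function. getAdjPValText() and toyTable are hypothetical names, and the choice of digits and eps is only an example.

```r
# Hypothetical formatter: takes a top table and returns one
# character string per feature, based on the adjusted p-value
getAdjPValText <- function(topTable) {
  format.pval(topTable$adj.P.Val, digits = 2, eps = 0.001)
}

# Toy top table to try the formatter outside of any plot
toyTable <- data.frame(adj.P.Val = c(0.00002, 0.013, 0.47))
pValText <- getAdjPValText(toyTable)
pValText
```

Such a function could presumably be passed as text = getAdjPValText, in the same way as getSignifStar in the example above.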

Heatmap

A heatmap represents the differential effect (e.g., treatment versus control) for several conditions (e.g., compounds or concentrations) of the experiment. It is a graphical representation of the individual values contained in a matrix as colors. This makes it possible to visualize a larger subset of genes. The gene labels can be colored to indicate, for example, the significance of genes, e.g., black denotes significant genes, while grey represents non-significant genes.

The function documentation, including a description of each parameter, can be obtained with:

help("daHeatmapLogFC", "daVis")

Color feature labels

The function daHeatmapLogFC() creates a heatmap for the provided model or list of top tables. The features should be specified using the features parameter, and the feature identifier can be specified using featuresIdVar. By default, the features are labeled by the row names of the input (or by featuresIdVar). Different labels can be used via the featuresVar parameter. Here, the feature labels are colored via the featuresColor parameter.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

daHeatmapLogFC(
  input = topTableList,
  features = features,
  featuresIdVar = "ENTREZID",
  featuresVar = c("SYMBOL", "GENENAME"),
  featuresMaxNChar = 35,
  coef = coefs,
  coefLabel = c("A", "B", "C", "D"),
  featuresColor = signFeature
)

Nested coefficient labels

The coefficients can be grouped by specifying multiple sets of labels to the coefLabel parameter.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

coefsLabel <- list(
  sub("(.+)\\.(.+)", "\\2", coefs),
  sub("(.+)\\.(.+)", "\\1", coefs)
)

daHeatmapLogFC(
  input = res.limma,
  features = features,
  featuresIdVar = "ENTREZID",
  featuresVar = c("SYMBOL", "GENENAME"),
  featuresMaxNChar = 35,
  coef = coefs,
  coefLabel = coefsLabel
)

Upset plot

An upset plot is used to represent the overlap (intersection) or difference of the sets of significant genes, down- or up-regulated separately, between different differential effects. The different shades of blue indicate the number of differential effects that share these up- or down-regulated genes.

The function documentation, including a description of each parameter, can be obtained with:

help("daUpset", "daVis")

Down-regulated genes

The function daUpset() creates an upset plot for the provided model or list of top tables. The sets are created based on the row names of input or feature identifiers specified using the featuresIdVar parameter.

daUpset(
  input = topTableList,
  coef = coefs,
  featuresIdVar = "SYMBOL",
  fdr = 0.05,
  dir = "down",
  ylab = paste("Number of (shared) significantly \n", 
               "down-regulated genes"),
  xlab = paste("Number of significantly \n", 
               "down-regulated genes")
)

Return overlapping sets

If returnAnalysis is set to TRUE, the output of the daUpset() function is a list. The slot sets contains all the overlapping sets between specified coefficients. The sets contain identifiers based on featuresIdVar. The slot plot contains the plot object.

out <- daUpset(
  input = topTableList,
  coef = coefs,
  featuresIdVar = "SYMBOL",
  fdr = 0.05,
  dir = "down",
  ylab = paste("Number of (shared) significantly \n", 
               "down-regulated genes"),
  xlab = paste("Number of significantly \n", 
               "down-regulated genes"),
  returnAnalysis = TRUE
)
out$sets
out$plot
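
To illustrate what the overlapping sets represent, the sketch below derives sets of significantly down-regulated genes from toy top tables using base R set operations. It is not the daUpset() implementation: the helper getDownSet(), the toy data, and the exact filtering rule (adj.P.Val <= fdr and logFC < 0) are assumptions made for illustration.

```r
# Two toy top tables sharing the same gene symbols
toyTables <- list(
  cmpA = data.frame(SYMBOL = c("g1", "g2", "g3", "g4"),
                    logFC = c(-2, -1.5, 1, -0.5),
                    adj.P.Val = c(0.01, 0.02, 0.01, 0.30),
                    stringsAsFactors = FALSE),
  cmpB = data.frame(SYMBOL = c("g1", "g2", "g3", "g4"),
                    logFC = c(-1, 0.8, -2, -1.2),
                    adj.P.Val = c(0.03, 0.01, 0.04, 0.01),
                    stringsAsFactors = FALSE)
)

# Hypothetical helper: significant and down-regulated genes of one table
getDownSet <- function(tt, fdr = 0.05) {
  tt$SYMBOL[tt$adj.P.Val <= fdr & tt$logFC < 0]
}

downSets <- lapply(toyTables, getDownSet)
sharedDown <- Reduce(intersect, downSets)             # down in all comparisons
onlyInA <- setdiff(downSets$cmpA, downSets$cmpB)      # down in cmpA only
sharedDown
onlyInA
```

In this toy example, g1 is down-regulated in both comparisons, while g2 is down-regulated in cmpA only.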

Scatter plot

A scatter plot visualizes the comparison of the logFC for different differential effects.

The function documentation, including a description of each parameter, can be obtained with:

help("daScatterPlot", "daVis")

Highlight genes of interest and top genes

The function daScatterPlot() creates a scatter plot for the provided model or list of top tables. The features of interest can be specified using the genesToHighlight and genesToHighlightVar parameters. The genes with the highest logFC and significance can also be colored, with their number set by topGenes. These are labeled using topGenesVar.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
genesOfInterest <- c("497097", "20671", "239273", "14862", "27395", "76408")

daScatterPlot(
  input = topTableList,
  coef = coefs,
  coefLabel = c("A", "B", "C", "D"),
  genesToHighlight = genesOfInterest,
  genesToHighlightVar = "SYMBOL",
  topGenes = 5,
  topGenesVar = "SYMBOL",
  facetNCol = 3
)

Interactive plot with feature annotation

An interactive scatter plot is created by setting the typePlot parameter to "interactive". Additional feature annotation can be shown when hovering over the points. The featuresVar parameter specifies the column names that should be used to annotate the features.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

daScatterPlot(
  input = topTableList,
  coef = coefs,
  coefLabel = c("A", "B", "C", "D"),
  featuresIdVar = "ENTREZID",
  featuresVar = c("SYMBOL", "GENENAME"),
  facetNCol = 3,
  typePlot = "interactive"
)

Add correlation

The Pearson correlation value is shown in the plot when the correlation parameter is set to TRUE.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

daScatterPlot(
  input = topTableList,
  coef = coefs[c(1, 2)],
  coefLabel = c("A", "B"),
  featuresIdVar = "ENTREZID",
  correlation = TRUE
)
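
For reference, the Pearson correlation between the logFC values of two comparisons can be computed by hand in base R, as sketched below. This is an illustration under stated assumptions rather than the code used by daScatterPlot(): the toy tables tabA and tabB are hypothetical, and features are matched on ENTREZID before the correlation is computed.

```r
# Toy top tables for two comparisons; note the different row order
tabA <- data.frame(ENTREZID = c("1", "2", "3", "4"),
                   logFC = c(-1, 0.5, 2, 3),
                   stringsAsFactors = FALSE)
tabB <- data.frame(ENTREZID = c("4", "3", "2", "1"),
                   logFC = c(7, 5, 2, -1),
                   stringsAsFactors = FALSE)

# Match features by identifier so that each pair of logFC values
# belongs to the same gene
merged <- merge(tabA, tabB, by = "ENTREZID", suffixes = c(".A", ".B"))

# Pearson correlation of the matched logFC values
pearsonR <- cor(merged$logFC.A, merged$logFC.B, method = "pearson")
pearsonR
```

Matching on the identifier matters here: correlating the two logFC columns in their original row order would pair values from different genes.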

Waterfall plot

A waterfall plot visualizes the logFC for different differential effects colored by adjusted p-value.

The function documentation, including a description of each parameter, can be obtained with:

help("daWaterfallPlot", "daVis")

Multiple coefficients

The function daWaterfallPlot() creates a bar plot for the provided model or list of top tables. When more than one coefficient is specified, multiple plots are created side by side.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
features <- c(
  "24117", "381290", "226101", "78896", "231830", "16012", "16669", "55987",
  "231991", "14620", "20317", "74747", "11636", "20482", "194126", "270150",
  "17131", "16878", "20564", "73847" 
)

daWaterfallPlot(
  input = res.limma,
  coef = coefs,
  coefLabel = c("A", "B", "C", "D"),
  featuresIdVar = "ENTREZID",
  featuresVar = "SYMBOL",
  features = features,
  facetNCol = 4,
  colorVar = "adj.P.Val",
  color = c("maroon", "orange"),
  fillVar = "adj.P.Val",
  fill = c("maroon", "orange"),
  typePlot = "static"
)

MA plot

An MA plot visualizes the logFC versus the log2 mean expression.

The function documentation, including a description of each parameter, can be obtained with:

help("daMAplot", "daVis")

Label top genes and genes of interest

The function daMAplot() creates a scatter plot (logFC versus mean expression) for the provided model or list of top tables. The top genes with the highest absolute logFC and significance can be labeled. The number of top genes can be specified using the topGenes parameter. Top genes are labeled by feature names by default; this can be changed with topGenesVar. The genes of interest are indicated in green. Additionally, if coef indicates only one coefficient, the legend shows the number of significantly up- and down-regulated genes and the number of non-significant genes. The points can be colored by direction (significantly up- and down-regulated genes) if direction is set to TRUE. Custom colors can be set with color.

coefs <- c("B.LvsP", "L.LvsP")

daMAplot(
  input = res.limma, 
  coef = coefs[1], 
  coefLabel = "A",
  featuresIdVar = "ENTREZID",
  topGenes = 5,
  topGenesVar = "SYMBOL",
  genesToHighlight = genesOfInterest,
  genesToHighlightVar = "SYMBOL",
  direction = TRUE,
  color = c("steelblue", "firebrick", "grey")
)

Significant genes barplot

A barplot visualizes the number of significant genes per comparison.

The function documentation, including a description of each parameter, can be obtained with:

help("daSignificantGenesBarplot", "daVis")

Nested coefficient labels

The function daSignificantGenesBarplot() creates a barplot indicating the number of significant (up- and down-regulated) genes for the provided model or list of top tables. The coefficients can be grouped by specifying multiple sets of labels to the coefLabel parameter.

coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

coefsLabel <- list(
  sub("(.+)\\.(.+)", "\\2", coefs),
  sub("(.+)\\.(.+)", "\\1", coefs)
)

daSignificantGenesBarplot(
  input = res.limma, 
  coef = coefs, 
  coefLabel = coefsLabel,
  addPercentage = TRUE
)
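
The quantities shown in such a barplot can be approximated by hand from a list of top tables. The base R sketch below counts significantly up- and down-regulated genes per comparison and expresses them as a percentage of all genes in the table. The helper countSignificant(), the toy tables, and the threshold rule (adj.P.Val <= 0.05) are assumptions for illustration and may differ from what daSignificantGenesBarplot() does internally.

```r
# Toy top tables with different numbers of genes
toyTables <- list(
  cmpA = data.frame(logFC = c(2.1, -1.4, 0.3, -0.2, 1.8),
                    adj.P.Val = c(0.001, 0.02, 0.6, 0.04, 0.2)),
  cmpB = data.frame(logFC = c(1, -1, 2, 0.5),
                    adj.P.Val = c(0.01, 0.5, 0.03, 0.04))
)

# Hypothetical helper: number of significant up/down genes in one table
countSignificant <- function(tt, fdr = 0.05) {
  isSig <- tt$adj.P.Val <= fdr
  c(up = sum(isSig & tt$logFC > 0), down = sum(isSig & tt$logFC < 0))
}

# Matrix with rows up/down and one column per comparison
counts <- sapply(toyTables, countSignificant)

# Percentage relative to the number of genes in each comparison;
# sweep() divides each column by its own table size
nGenes <- sapply(toyTables, nrow)
percent <- 100 * sweep(counts, 2, nGenes, "/")
counts
percent
```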

Session information

sessionInfo()
## R version 4.6.0 (2026-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] pander_0.6.6 daVis_0.99.3
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1            dplyr_1.2.1                
##  [3] farver_2.1.2                blob_1.3.0                 
##  [5] Biostrings_2.81.3           S7_0.2.2                   
##  [7] fastmap_1.2.0               digest_0.6.39              
##  [9] lifecycle_1.0.5             statmod_1.5.2              
## [11] KEGGREST_1.53.0             RSQLite_3.53.1             
## [13] magrittr_2.0.5              compiler_4.6.0             
## [15] rlang_1.2.0                 sass_0.4.10                
## [17] tools_4.6.0                 yaml_2.3.12                
## [19] knitr_1.51                  S4Arrays_1.13.0            
## [21] labeling_0.4.3              bit_4.6.0                  
## [23] DelayedArray_0.39.3         plyr_1.8.9                 
## [25] xml2_1.5.2                  RColorBrewer_1.1-3         
## [27] abind_1.4-8                 BiocParallel_1.47.0        
## [29] withr_3.0.2                 BiocGenerics_0.59.7        
## [31] sys_3.4.3                   grid_4.6.0                 
## [33] ggh4x_0.3.1                 stats4_4.6.0               
## [35] edgeR_4.11.1                ggplot2_4.0.3              
## [37] scales_1.4.0                SummarizedExperiment_1.43.0
## [39] cli_3.6.6                   UpSetR_1.4.1               
## [41] rmarkdown_2.31              crayon_1.5.3               
## [43] generics_0.1.4              otel_0.2.0                 
## [45] httr_1.4.8                  commonmark_2.0.0           
## [47] DBI_1.3.0                   cachem_1.1.0               
## [49] legendry_0.3.0              stringr_1.6.0              
## [51] parallel_4.6.0              AnnotationDbi_1.75.0       
## [53] XVector_0.53.0              matrixStats_1.5.0          
## [55] vctrs_0.7.3                 Matrix_1.7-5               
## [57] jsonlite_2.0.0              litedown_0.9               
## [59] IRanges_2.47.2              S4Vectors_0.51.3           
## [61] bit64_4.8.2                 ggrepel_0.9.8              
## [63] maketools_1.3.2             locfit_1.5-9.12            
## [65] limma_3.69.2                jquerylib_0.1.4            
## [67] glue_1.8.1                  org.Mm.eg.db_3.23.0        
## [69] codetools_0.2-20            ggtext_0.1.2               
## [71] stringi_1.8.7               gtable_0.3.6               
## [73] GenomicRanges_1.65.0        tibble_3.3.1               
## [75] pillar_1.11.1               htmltools_0.5.9            
## [77] Seqinfo_1.3.0               R6_2.6.1                   
## [79] evaluate_1.0.5              lattice_0.22-9             
## [81] Biobase_2.73.1              markdown_2.0               
## [83] png_0.1-9                   gridtext_0.1.6             
## [85] memoise_2.0.1               bslib_0.11.0               
## [87] Rcpp_1.1.1-1.1              gridExtra_2.3              
## [89] SparseArray_1.13.2          DESeq2_1.53.0              
## [91] xfun_0.58                   MatrixGenerics_1.25.0      
## [93] buildtools_1.0.0            pkgconfig_2.0.3