The sechm package is a wrapper around the ComplexHeatmap package to facilitate the creation of annotated heatmaps from objects of the Bioconductor class SummarizedExperiment (and extensions thereof).
To showcase the main functions, we will use an example object which contains (a subset of) RNAseq of mouse hippocampi after Forskolin-induced long-term potentiation:
suppressPackageStartupMessages({
library(SummarizedExperiment)
library(sechm)
})
data("Chen2017", package="sechm")
SE <- Chen2017
This is taken from Chen et al., 2017.
The sechm
function simplifies the generation of heatmaps
from SummarizedExperiment
. It minimally requires, as input,
a SummarizedExperiment
object and a set of genes (or
features, i.e. rows of sechm
) to plot:
g <- c("Egr1", "Nr4a1", "Fos", "Egr2", "Sgk1", "Arc", "Dusp1", "Fosb", "Sik1")
sechm(SE, features=g)
## Using assay 'logFC'
## Using assay 'logFC'
The assay can be selected, and any rowData
or
colData
columns can be specified as annotation:
rowData(SE)$meanLogCPM <- rowMeans(assays(SE)$logcpm)
sechm(SE, features=g, assayName="logFC", top_annotation=c("Condition","Time"), left_annotation=c("meanLogCPM"))
Column names are ommitted by default, but can be displayed:
## Using assay 'logFC'
Since sechm
uses the ComplexHeatmap
engine for plotting, any argument of
ComplexHeatmap::Heatmap
can be passed:
## Using assay 'logFC'
When plotting a lot of rows, by default row names are not shown (can be
overriden), but specific genes can be highlighted with the
mark
argument:
## Using assay 'logFC'
We can also add gaps using the same columns:
## Using assay 'logFC'
By default, rows are sorted using the MDS angle method (can
be altered with the sort.method
argument); this can be
disabled with:
## Using assay 'logFC'
# no reordering:
sechm(SE, features=row.names(SE), do.scale=TRUE, sortRowsOn=NULL,
cluster_rows=FALSE)
## Using assay 'logFC'
It is also possible to combine sorting with clusters using the
toporder
argument, or using gaps:.
# we first cluster rows, and save the clusters in the rowData:
rowData(SE)$cluster <- as.character(kmeans(t(scale(t(assay(SE)))),5)$cluster)
sechm(SE, features=1:30, do.scale=TRUE, toporder="cluster",
left_annotation="cluster", show_rownames=FALSE)
## Using assay 'logFC'
## Timing stopped at: 0 0 0.001
## Timing stopped at: 0.001 0 0
## Using assay 'logFC'
sechm
tries to guess whether the data plotted are
centered around zero, and adjusts the scale accordingly (this can be
disable with breaks=FALSE
). It also performs a quantile
capping to avoid extreme values taking most of the color scale, which is
especially relevant when plotting for instance fold-changes. This can be
controlled with the breaks
argument. Consider the three
following examples:
library(ComplexHeatmap)
g2 <- c(g,"Gm14288",tail(row.names(SE)))
draw(
sechm(SE, features=g2, assayName="logFC", breaks=1, column_title="breaks=1") +
sechm(SE, features=g2, assayName="logFC", breaks=0.995,
column_title="breaks=0.995", name="logFC(2)") +
sechm(SE, features=g2, assayName="logFC", breaks=0.985,
column_title="breaks=0.985", name="logFC(3)"),
merge_legends=TRUE)
With breaks=1
, the scale is made symmetric, but not
quantile capping is performed. In this way, most of the colorscale is
taken by the difference between one datapoint (first gene) and the rest,
making it difficult to distinguish patterns in the genes at the bottom.
Instead, with breaks=0.985
, the color scale is linear up
until the 0.985 quantile of the data, and ordinal after this. This
reduces our capacity to distinguish variations between the extreme
values, but enables us to visualize the others better.
Manual breaks can also be defined. The colors themselves can be passed as follows:
Annotation colors can be passed with the anno_colors
argument, but the simplest is to store them in the object’s
metadata:
## $Time
## 30 60 120 <NA>
## "#90EE90FF" "#65BE61FF" "#3A9034FF" "#006400FF"
##
## $Condition
## Control Forskolin
## "lightgrey" "Darkred"
metadata(SE)$anno_colors$Condition <- c(Control="white", Forskolin="black")
sechm(SE, features=g2, top_annotation="Condition")
## Using assay 'logFC'
Heatmap colors can be passed on in the same way:
## Using assay 'logFC'
The default assay to be displayed and the default annotation fields
to show can be specified in the default_view
metadata
element, as follows:
Finally, it is also possible to set colors as package-wide options:
At the moment, the following arguments can be set as global options:
assayName
, hmcols
,
left_annotation
, right_annotation
,
top_annotation
, bottom_annotation
,
anno_colors
, gaps_at
, breaks
.
To remove the predefined colors:
In order of priority, the arguments in the function call trump the object’s metadata, which trumps the global options.
Because sechm
produces a Heatmap
object
from ComplexHeatmap,
it is possible to combine them:
## Warning: Heatmap/annotation names are duplicated: logFC
However, doing so involves manual work to ensure that the labels and
colors are nice and coherent, and that the rows names match. As a
convenience, we provide the crossHm
function to handle
these issues. crossHm
works with a list of
SummarizedExperiment
objects:
# we build another SE object and introduce some variation in it:
SE2 <- SE
assays(SE2)$logcpm <- jitter(assays(SE2)$logcpm, factor=1000)
crossHm(list(SE1=SE, SE2=SE2), g, do.scale = TRUE,
top_annotation=c("Condition","Time"))
## Using assay 'logFC'
## Using assay 'logFC'
Scaling is applied to the datasets separately. A unique color scale can be enforced:
crossHm(list(SE1=SE, SE2=SE2), g, do.scale = TRUE,
top_annotation=c("Condition","Time"), uniqueScale = TRUE)
## Using assay 'logFC'
## Using assay 'logFC'
The package also includes a number of other convenience functions which we briefly describe here (see the functions’ help for more information):
log2FC()
adds two assays to an SE object, containing
per-sample log2-foldchanges, as well as scaledLFC (variance-scaled
log2-foldchanges, but without centering, so that the controls stay
around 0) relative to the mean of the (specified) controls.getDEA()
and getDEGs()
functions can
return a specific DEA or its set of differentially-expressed features,
provided that the DEA results tables are each saved in a column of
rowData (i.e. the whole table in one column), with a column name
starting with DEA.
.meltSE()
function can be used to extract a
dataframe (suitable for ggplot) containing colData, rowData, and assay
data for a given subset of features.## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] grid stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] sechm_1.15.0 ComplexHeatmap_2.23.0
## [3] SummarizedExperiment_1.36.0 Biobase_2.67.0
## [5] GenomicRanges_1.59.0 GenomeInfoDb_1.43.0
## [7] IRanges_2.41.0 S4Vectors_0.44.0
## [9] BiocGenerics_0.53.0 MatrixGenerics_1.19.0
## [11] matrixStats_1.4.1 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] circlize_0.4.16 shape_1.4.6.1 rjson_0.2.23
## [4] xfun_0.48 bslib_0.8.0 GlobalOptions_0.1.2
## [7] lattice_0.22-6 tools_4.4.1 curl_5.2.3
## [10] parallel_4.4.1 ca_0.71.1 highr_0.11
## [13] cluster_2.1.6 Matrix_1.7-1 randomcoloR_1.1.0.1
## [16] RColorBrewer_1.1-3 lifecycle_1.0.4 GenomeInfoDbData_1.2.13
## [19] compiler_4.4.1 stringr_1.5.1 munsell_0.5.1
## [22] codetools_0.2-20 clue_0.3-65 seriation_1.5.6
## [25] htmltools_0.5.8.1 sys_3.4.3 buildtools_1.0.0
## [28] sass_0.4.9 yaml_2.3.10 crayon_1.5.3
## [31] jquerylib_0.1.4 DelayedArray_0.33.1 cachem_1.1.0
## [34] iterators_1.0.14 TSP_1.2-4 abind_1.4-8
## [37] foreach_1.5.2 digest_0.6.37 stringi_1.8.4
## [40] Rtsne_0.17 maketools_1.3.1 fastmap_1.2.0
## [43] colorspace_2.1-1 cli_3.6.3 SparseArray_1.6.0
## [46] magrittr_2.0.3 S4Arrays_1.6.0 UCSC.utils_1.2.0
## [49] scales_1.3.0 registry_0.5-1 rmarkdown_2.28
## [52] XVector_0.46.0 httr_1.4.7 png_0.1-8
## [55] GetoptLong_1.0.5 evaluate_1.0.1 knitr_1.48
## [58] V8_6.0.0 doParallel_1.0.17 rlang_1.1.4
## [61] Rcpp_1.0.13 glue_1.8.0 BiocManager_1.30.25
## [64] jsonlite_1.8.9 R6_2.5.1 zlibbioc_1.52.0