Reduced dimension plotting is one of the essential tools for the
analysis of single cell data. However, as the number of cells/nuclei in
these plots increases, their usefulness decreases. Many cells are
plotted on top of each other, obscuring information, even when taking
advantage of transparency settings. This package provides strategies for
binning cells/nuclei into hexagon cells. Plotting summarized information
of all cells/nuclei in their respective hexagon cells presents
information without obstructions. The package works seamlessly with the
two most common object classes for the storage of single cell data:
SingleCellExperiment from the SingleCellExperiment package and Seurat
from the Seurat package. In this vignette I will present the use of
schex for SingleCellExperiment objects that are converted from Seurat
objects.
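The code in the rest of this vignette assumes that the packages below are attached. This is a minimal setup sketch; the session information at the end of the vignette lists all three among the attached packages.

```r
# Packages assumed by the code in this vignette.
library(schex)    # hexagon binning and plotting functions
library(Seurat)   # Seurat objects and the pbmc_small example dataset
library(ggplot2)  # theme_void() and ggsave() used further below
```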
In order to demonstrate the capabilities of the schex package, I will use a subsetted dataset of Peripheral Blood Mononuclear Cells (PBMC) freely available from 10x Genomics. There are 80 single cells with 230 features in this dataset.
pbmc_small
#> An object of class Seurat
#> 230 features across 80 samples within 1 assay
#> Active assay: RNA (230 features, 20 variable features)
#> 3 layers present: counts, data, scale.data
#> 2 dimensional reductions calculated: pca, tsne
The dataset already contains two dimension reductions (PCA and TSNE). We will now add UMAP. Since there is a random component in UMAP, we will set a seed. You can also add dimension reductions after conversion using packages that include functionality for SingleCellExperiment objects.
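A minimal sketch of this step, assuming Seurat's RunUMAP for the embedding and as.SingleCellExperiment for the conversion. The seed value and the use of the first 10 principal components are arbitrary illustrative settings, not recommendations.

```r
library(Seurat)

# Fix the random seed so the UMAP embedding is reproducible
# (the value 10 is an arbitrary choice).
set.seed(10)

# Add a UMAP embedding computed from the first 10 principal components
# (an illustrative choice for this small dataset).
pbmc_small <- RunUMAP(pbmc_small, dims = 1:10, verbose = FALSE)

# Convert the Seurat object to a SingleCellExperiment object.
pbmc.sce <- as.SingleCellExperiment(pbmc_small)
```

After the conversion, reducedDimNames(pbmc.sce) lists the reduced dimensions stored in the object, which should now include the UMAP embedding.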
At this stage in the workflow we usually would like to plot aspects of our data in one of the reduced dimension representations. Instead of plotting this in an ordinary fashion, I will demonstrate how schex can provide a better way of plotting this.
First, I will calculate the hexagon cell representation for each cell
for a specified dimension reduction representation. I decide to use
nbins=40, which specifies that I divide my x range into 40 bins. Note
that this is a parameter you may want to experiment with, depending on
the number of cells/nuclei in your dataset. Generally, for more
cells/nuclei, nbins should be increased.
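The binning step could look like the sketch below. It assumes that pbmc.sce is the converted SingleCellExperiment object and that the UMAP embedding is stored under the name "UMAP"; check reducedDimNames(pbmc.sce) if the name differs in your object.

```r
# Assign every cell to a hexagon cell in the UMAP embedding.
# nbins = 40 divides the x range into 40 bins.
pbmc.sce <- schex::make_hexbin(pbmc.sce,
    nbins = 40,
    dimension_reduction = "UMAP"
)
```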
First I plot how many cells are in each hexagon cell. This should be
relatively even; otherwise, change the nbins parameter in the previous
calculation.
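A short sketch of the density plot, assuming pbmc.sce already carries the hexagon cell assignment from the binning step:

```r
# Colour each hexagon cell by the number of cells/nuclei it contains.
gg_density <- schex::plot_hexbin_density(pbmc.sce)
gg_density
```

If a few hexagon cells hold most of the cells/nuclei, rerun the binning with a larger nbins and plot again.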
Next I colour the hexagon cells by some meta information, such as the median total count in each hexagon cell.
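As a sketch, this can be done with plot_hexbin_meta. The column name nCount_RNA is an assumption: it is the total count column that Seurat usually creates and that is carried over into the column data on conversion. Check colnames(colData(pbmc.sce)) for the columns available in your object.

```r
# Summarise a column of the cell-level metadata within each hexagon cell.
# "nCount_RNA" is assumed to hold the total count per cell.
gg_meta <- schex::plot_hexbin_meta(pbmc.sce,
    col = "nCount_RNA",
    action = "median"
)
gg_meta
```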
Finally, I will visualize the gene expression of the CD1C gene in the hexagon cell representation.
gene_id <- "CD1C"
schex::plot_hexbin_feature(pbmc.sce,
    type = "counts", feature = gene_id,
    action = "mean", xlab = "UMAP1", ylab = "UMAP2",
    title = paste0("Mean of ", gene_id)
)
schex output as ggplot objects
The schex package renders ordinary ggplot objects, and thus these can be
treated and manipulated using the ggplot grammar. For example, the
non-data components of the plots can be changed using the function
theme.
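gene_id <- "CD1C"
# Store the plot so that ggplot2 components can be added to it.
gg <- schex::plot_hexbin_feature(pbmc.sce,
    type = "counts", feature = gene_id,
    action = "mean", xlab = "UMAP1", ylab = "UMAP2",
    title = paste0("Mean of ", gene_id)
)
# Drop axes, grid lines and background with a complete ggplot2 theme.
gg + theme_void()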
The fact that schex renders ggplot objects can also be used to save
these plots. Simply use ggsave in order to save any created plot.
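For instance, the plot stored in gg above could be written to disk as sketched here. The file name and the dimensions are hypothetical; ggsave infers the output format from the file extension.

```r
# Save the stored plot as a PDF in the working directory.
# File name, width and height (in inches) are illustrative choices.
ggplot2::ggsave("schex_plot.pdf", plot = gg, width = 6, height = 5)
```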
To find the details of the session for reproducibility, use this:
sessionInfo()
#> R version 4.4.3 (2025-02-28)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] Seurat_5.2.1 SeuratObject_5.0.2
#> [3] sp_2.2-0 dplyr_1.1.4
#> [5] schex_1.21.0 SingleCellExperiment_1.29.2
#> [7] SummarizedExperiment_1.37.0 Biobase_2.67.0
#> [9] GenomicRanges_1.59.1 GenomeInfoDb_1.43.4
#> [11] IRanges_2.41.3 S4Vectors_0.45.4
#> [13] BiocGenerics_0.53.6 generics_0.1.3
#> [15] MatrixGenerics_1.19.1 matrixStats_1.5.0
#> [17] ggplot2_3.5.1 rmarkdown_2.29
#>
#> loaded via a namespace (and not attached):
#> [1] RColorBrewer_1.1-3 sys_3.4.3 jsonlite_1.9.1
#> [4] magrittr_2.0.3 spatstat.utils_3.1-2 farver_2.1.2
#> [7] vctrs_0.6.5 ROCR_1.0-11 spatstat.explore_3.3-4
#> [10] htmltools_0.5.8.1 S4Arrays_1.7.3 SparseArray_1.7.6
#> [13] sass_0.4.9 sctransform_0.4.1 parallelly_1.42.0
#> [16] KernSmooth_2.23-26 bslib_0.9.0 htmlwidgets_1.6.4
#> [19] ica_1.0-3 plyr_1.8.9 plotly_4.10.4
#> [22] zoo_1.8-13 cachem_1.1.0 buildtools_1.0.0
#> [25] igraph_2.1.4 mime_0.12 lifecycle_1.0.4
#> [28] pkgconfig_2.0.3 Matrix_1.7-3 R6_2.6.1
#> [31] fastmap_1.2.0 GenomeInfoDbData_1.2.13 fitdistrplus_1.2-2
#> [34] future_1.34.0 shiny_1.10.0 digest_0.6.37
#> [37] colorspace_2.1-1 patchwork_1.3.0 tensor_1.5
#> [40] RSpectra_0.16-2 irlba_2.3.5.1 labeling_0.4.3
#> [43] progressr_0.15.1 spatstat.sparse_3.1-0 httr_1.4.7
#> [46] polyclip_1.10-7 abind_1.4-8 compiler_4.4.3
#> [49] withr_3.0.2 fastDummies_1.7.5 hexbin_1.28.5
#> [52] ggforce_0.4.2 MASS_7.3-65 concaveman_1.1.0
#> [55] DelayedArray_0.33.6 tools_4.4.3 lmtest_0.9-40
#> [58] httpuv_1.6.15 future.apply_1.11.3 goftest_1.2-3
#> [61] glue_1.8.0 nlme_3.1-167 promises_1.3.2
#> [64] grid_4.4.3 Rtsne_0.17 cluster_2.1.8.1
#> [67] reshape2_1.4.4 spatstat.data_3.1-4 gtable_0.3.6
#> [70] tidyr_1.3.1 data.table_1.17.0 XVector_0.47.2
#> [73] spatstat.geom_3.3-5 RcppAnnoy_0.0.22 ggrepel_0.9.6
#> [76] RANN_2.6.2 pillar_1.10.1 stringr_1.5.1
#> [79] spam_2.11-1 RcppHNSW_0.6.0 later_1.4.1
#> [82] splines_4.4.3 tweenr_2.0.3 lattice_0.22-6
#> [85] deldir_2.0-4 survival_3.8-3 tidyselect_1.2.1
#> [88] maketools_1.3.2 miniUI_0.1.1.1 pbapply_1.7-2
#> [91] knitr_1.49 gridExtra_2.3 scattermore_1.2
#> [94] xfun_0.51 stringi_1.8.4 UCSC.utils_1.3.1
#> [97] lazyeval_0.2.2 yaml_2.3.10 evaluate_1.0.3
#> [100] codetools_0.2-20 entropy_1.3.1 tibble_3.2.1
#> [103] cli_3.6.4 uwot_0.2.3 xtable_1.8-4
#> [106] reticulate_1.41.0.1 munsell_0.5.1 jquerylib_0.1.4
#> [109] Rcpp_1.0.14 spatstat.random_3.3-2 globals_0.16.3
#> [112] png_0.1-8 spatstat.univar_3.1-2 parallel_4.4.3
#> [115] dotCall64_1.2 listenv_0.9.1 viridisLite_0.4.2
#> [118] scales_1.3.0 ggridges_0.5.6 purrr_1.0.4
#> [121] crayon_1.5.3 rlang_1.1.5 cowplot_1.1.3