Introduction to stJoincount

library(stJoincount)
library(pheatmap)
library(ggplot2)

v1.1.1

stJoincount: A tool for quantifying spatial correlation between clusters in preprocessed spatial transcriptomics data, using the join count statistic.

Introduction

Spatial dependency is the relationship between location and attribute similarity. The measure reflects whether an attribute of a variable observed at one location is independent of values observed at neighboring locations. Positive spatial dependency exists when neighboring attributes are more similar than what could be explained by chance. Likewise, a negative spatial dependency is reflected by a dissimilarity of neighboring attributes. Join count analysis allows for quantification of the spatial dependencies of nominal data in an arrangement of spatially adjacent polygons.
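To make the idea of positive and negative spatial dependency concrete, here is a minimal base R sketch. It is not part of stJoincount: the helper `sameLabelJoins` and both toy grids are invented for illustration, and only horizontally and vertically adjacent cells are treated as neighbors. It counts how many joins link two cells with the same label.

```r
# Hypothetical helper (not part of stJoincount): count joins between
# horizontally or vertically adjacent cells that carry the same label.
sameLabelJoins <- function(g) {
  horizontal <- sum(g[, -1] == g[, -ncol(g)])
  vertical   <- sum(g[-1, ] == g[-nrow(g), ])
  horizontal + vertical
}

# Toy 4 x 4 grid with two labels arranged in two solid blocks.
clustered <- matrix(c(1, 1, 0, 0,
                      1, 1, 0, 0,
                      1, 1, 0, 0,
                      1, 1, 0, 0), nrow = 4, byrow = TRUE)

# Toy 4 x 4 grid in which labels alternate like a checkerboard.
checkerboard <- (row(clustered) + col(clustered)) %% 2

totalJoins <- 2 * 4 * 3           # 24 such joins exist in a 4 x 4 grid

sameLabelJoins(clustered)         # 20 of 24 joins link equal labels
sameLabelJoins(checkerboard)      # 0 of 24 joins link equal labels
```

The clustered grid has far more same-label joins than a random arrangement would be expected to produce (positive dependency), while the checkerboard has none (negative dependency).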

This tool requires data produced with the 10X Genomics Visium Spatial Gene Expression platform with customized clusters. The purpose of this R package is to perform join count analysis for spatial correlation between clusters.

Installation

Users can install stJoincount with:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("stJoincount")

Examples of how to run this tool are below:

Preprocessing

In this vignette, we use a human breast cancer spatial transcriptomics sample.

fpath <- system.file("extdata", "dataframe.rda", package="stJoincount")
load(fpath)
head(humanBC)
#>                    imagecol imagerow Cluster
#> AATTGCAGCAATCGAC-1 431.2129 476.8069       4
#> ACCAGGAGTGTGATCT-1 273.0446 117.8218       9
#> ACCTCCGCCCTCGCTG-1 448.2178 423.9109       7
#> AGGTGTATCGCCATGA-1 144.2822 317.5000       1
#> ATAGTTCCACCCACTC-1 431.5099 323.9109       7
#> CCGTATTAGCGCAGTT-1 462.1535 200.4950       2

Within ‘extdata’, users can find the example data file “dataframe.rda”, which loads a data.frame named humanBC. This data.frame was derived from a Seurat object of a human breast cancer sample by combining its metadata with the spatial coordinates. It contains the information that is essential to the algorithm: the barcode (used as the index), the cluster label (either categorical or numerical), and the central pixel location of each spot (imagerow and imagecol). Besides the barcode index, at least the following three columns are required, and their names must match exactly:

imagerow: The row pixel coordinate of the center of the spot

imagecol: The column pixel coordinate of the center of the spot

Cluster: The label that corresponds to this barcode
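If the input comes from another source, a data.frame with this layout can also be assembled by hand. The following is a minimal sketch with invented barcodes, coordinates, and labels; `checkInput` is a hypothetical helper written for this example and is not a function of stJoincount.

```r
# Invented toy input: barcodes as row names plus the three required columns.
toy <- data.frame(
  imagecol = c(431.2, 273.0, 448.2, 144.3),
  imagerow = c(476.8, 117.8, 423.9, 317.5),
  Cluster  = c("4", "9", "7", "1"),
  row.names = c("BARCODE-1", "BARCODE-2", "BARCODE-3", "BARCODE-4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of stJoincount): verify that the required
# columns are present and that the coordinates are numeric.
checkInput <- function(df) {
  required <- c("imagerow", "imagecol", "Cluster")
  absent <- setdiff(required, colnames(df))
  if (length(absent) > 0) {
    stop("Missing column(s): ", paste(absent, collapse = ", "))
  }
  stopifnot(is.numeric(df$imagerow), is.numeric(df$imagecol))
  invisible(TRUE)
}

checkInput(toy)
```

A check like this fails early with a readable message when a column is misnamed, for example `cluster` instead of `Cluster`.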

The following code demonstrates how to generate the described data.frame from Seurat or SpatialExperiment objects.

Example data preparation from a Seurat object:

fpath <- system.file("extdata", "SeuratBC.rda", package="stJoincount")
load(fpath)
df <- dataPrepFromSeurat(seuratBC, "label")

Example data preparation from a SpatialExperiment object:

fpath <- system.file("extdata", "SpeBC.rda", package="stJoincount")
load(fpath)
df2 <- dataPrepFromSpE(SpeObjBC, "label")

Raster processing

This tool first converts a labeled spatial tissue map into a raster object, in which each spatial feature is represented by a pixel coded by label assignment. This process includes automatic calculation of optimal raster resolution and extent for the sample.

resolutionList <- resolutionCalc(humanBC)
resolutionList
#> [1] 152.89604  64.20792
mosaicIntegration <- rasterizeEachCluster(humanBC)
#> No optimal number found, using n = 110 instead.
#> In this case, there may be minor deviations in the subsequent calculation process.
#>         The results are for reference only
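The idea of representing each spot by one labeled pixel can be sketched in base R. This is a simplified illustration and not how `rasterizeEachCluster` is implemented: the spots, the labels, and the fixed `cellSize` are invented, and the spots are assumed to lie on a regular square grid, which real Visium data does not.

```r
# Invented spot centres and labels on a regular grid with spacing 10.
spots <- data.frame(
  imagecol = c(10, 20, 30, 10, 20, 30),
  imagerow = c(10, 10, 10, 20, 20, 20),
  Cluster  = c("A", "A", "B", "A", "B", "B"),
  stringsAsFactors = FALSE
)

cellSize <- 10  # assumed pixel size; the package derives its own resolution

# Convert pixel coordinates to 1-based grid indices.
colIdx <- as.integer(round((spots$imagecol - min(spots$imagecol)) / cellSize)) + 1L
rowIdx <- as.integer(round((spots$imagerow - min(spots$imagerow)) / cellSize)) + 1L

# Fill a character matrix in which each cell holds the label of its spot;
# cells without a spot would stay NA.
grid <- matrix(NA_character_, nrow = max(rowIdx), ncol = max(colIdx))
grid[cbind(rowIdx, colIdx)] <- spots$Cluster
grid
```

The resulting 2 x 3 matrix plays the role of the raster: position encodes location and the cell value encodes the cluster label.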

Visualization

After the labeled spatial sample has been converted, the raster map can be visualized with:

mosaicIntPlot(humanBC, mosaicIntegration)

Join count analysis

A neighbors list is then created from the rasterized sample, in which adjacent and diagonal neighbors for each pixel are identified. After adding binary spatial weights to the neighbors list, a multi-categorical join count analysis is performed to tabulate “joins” between all possible combinations of label pairs. The function returns the observed join counts, the expected count under conditions of spatial randomness, and the variance calculated under non-free sampling.

joincount.result <- joincountAnalysis(mosaicIntegration)
#> Warning in subset.nb(nbList, !(seq_len(length(nbList)) %in% emptyPos)):
#> subsetting caused increase in subgraph count
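The tabulation step described above can be illustrated on a toy label grid with base R. This sketch is an assumption-laden simplification, not the code behind `joincountAnalysis`: the grid is invented, every join has weight 1, and it only produces observed counts, not the expected counts or variances.

```r
# Invented 3 x 4 label raster with three labels.
labels <- matrix(c("A", "A", "B", "B",
                   "A", "C", "C", "B",
                   "C", "C", "B", "B"), nrow = 3, byrow = TRUE)

# Four "forward" offsets out of the eight adjacent and diagonal neighbors.
# Visiting only these directions counts every undirected join exactly once.
offsets <- list(c(0, 1), c(1, 0), c(1, 1), c(1, -1))

lev <- sort(unique(as.vector(labels)))
joins <- matrix(0L, length(lev), length(lev), dimnames = list(lev, lev))

for (i in seq_len(nrow(labels))) {
  for (j in seq_len(ncol(labels))) {
    for (off in offsets) {
      i2 <- i + off[1]
      j2 <- j + off[2]
      # Skip neighbors that fall outside the grid.
      if (i2 < 1 || i2 > nrow(labels) || j2 < 1 || j2 > ncol(labels)) next
      # Store each unordered label pair in the upper triangle.
      pair <- sort(c(labels[i, j], labels[i2, j2]))
      joins[pair[1], pair[2]] <- joins[pair[1], pair[2]] + 1L
    }
  }
}

joins
```

The diagonal holds same-label joins (for example, three A-A joins here), the upper triangle holds joins between different labels, and the entries sum to the 29 joins that exist in a 3 x 4 grid with adjacent and diagonal neighbors.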

The z-score is then calculated as the difference between observed and expected counts, divided by the square root of the variance. A heatmap of z-scores represents the result from the join count analysis for all possible label pairs.

matrix <- zscoreMatrix(humanBC, joincount.result)
zscorePlot(matrix)
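For reference, the z-score calculation described above amounts to a single line of arithmetic. The numbers below are made up purely to show the computation and are not results from the example sample.

```r
# Made-up observed counts, expected counts, and variances for three label pairs.
observed <- c(AA = 12, AB = 5,    BB = 9)
expected <- c(AA = 8,  AB = 10,   BB = 8)
variance <- c(AA = 4,  AB = 6.25, BB = 4)

# z = (observed - expected) / sqrt(variance)
z <- (observed - expected) / sqrt(variance)
z
```

With these toy values, AA gets z = 2 (more joins than expected), AB gets z = -2 (fewer joins than expected), and BB gets z = 0.5 (close to the expectation under spatial randomness).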

sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] ggplot2_3.5.1     pheatmap_1.0.12   stJoincount_1.9.0 BiocStyle_2.35.0 
#> 
#> loaded via a namespace (and not attached):
#>   [1] RcppAnnoy_0.0.22            splines_4.4.2              
#>   [3] later_1.3.2                 tibble_3.2.1               
#>   [5] polyclip_1.10-7             fastDummies_1.7.4          
#>   [7] lifecycle_1.0.4             sf_1.0-19                  
#>   [9] globals_0.16.3              lattice_0.22-6             
#>  [11] MASS_7.3-61                 magrittr_2.0.3             
#>  [13] plotly_4.10.4               sass_0.4.9                 
#>  [15] rmarkdown_2.29              jquerylib_0.1.4            
#>  [17] yaml_2.3.10                 httpuv_1.6.15              
#>  [19] Seurat_5.1.0                sctransform_0.4.1          
#>  [21] spam_2.11-0                 sp_2.1-4                   
#>  [23] spatstat.sparse_3.1-0       reticulate_1.40.0          
#>  [25] cowplot_1.1.3               pbapply_1.7-2              
#>  [27] DBI_1.2.3                   buildtools_1.0.0           
#>  [29] RColorBrewer_1.1-3          abind_1.4-8                
#>  [31] zlibbioc_1.52.0             Rtsne_0.17                 
#>  [33] GenomicRanges_1.59.1        purrr_1.0.2                
#>  [35] BiocGenerics_0.53.3         GenomeInfoDbData_1.2.13    
#>  [37] IRanges_2.41.1              S4Vectors_0.45.2           
#>  [39] ggrepel_0.9.6               irlba_2.3.5.1              
#>  [41] listenv_0.9.1               spatstat.utils_3.1-1       
#>  [43] maketools_1.3.1             terra_1.7-83               
#>  [45] units_0.8-5                 goftest_1.2-3              
#>  [47] RSpectra_0.16-2             spatstat.random_3.3-2      
#>  [49] fitdistrplus_1.2-1          parallelly_1.39.0          
#>  [51] leiden_0.4.3.1              codetools_0.2-20           
#>  [53] DelayedArray_0.33.2         tidyselect_1.2.1           
#>  [55] raster_3.6-30               UCSC.utils_1.3.0           
#>  [57] farver_2.1.2                matrixStats_1.4.1          
#>  [59] stats4_4.4.2                spatstat.explore_3.3-3     
#>  [61] jsonlite_1.8.9              e1071_1.7-16               
#>  [63] progressr_0.15.0            ggridges_0.5.6             
#>  [65] survival_3.7-0              tools_4.4.2                
#>  [67] ica_1.0-3                   Rcpp_1.0.13-1              
#>  [69] glue_1.8.0                  gridExtra_2.3              
#>  [71] SparseArray_1.7.2           xfun_0.49                  
#>  [73] MatrixGenerics_1.19.0       GenomeInfoDb_1.43.1        
#>  [75] dplyr_1.1.4                 withr_3.0.2                
#>  [77] BiocManager_1.30.25         fastmap_1.2.0              
#>  [79] boot_1.3-31                 fansi_1.0.6                
#>  [81] spData_2.3.3                digest_0.6.37              
#>  [83] R6_2.5.1                    mime_0.12                  
#>  [85] wk_0.9.4                    colorspace_2.1-1           
#>  [87] scattermore_1.2             tensor_1.5                 
#>  [89] spatstat.data_3.1-4         utf8_1.2.4                 
#>  [91] tidyr_1.3.1                 generics_0.1.3             
#>  [93] data.table_1.16.2           class_7.3-22               
#>  [95] httr_1.4.7                  htmlwidgets_1.6.4          
#>  [97] S4Arrays_1.7.1              spdep_1.3-6                
#>  [99] uwot_0.2.2                  pkgconfig_2.0.3            
#> [101] gtable_0.3.6                lmtest_0.9-40              
#> [103] SingleCellExperiment_1.29.1 XVector_0.47.0             
#> [105] sys_3.4.3                   htmltools_0.5.8.1          
#> [107] dotCall64_1.2               SeuratObject_5.0.2         
#> [109] scales_1.3.0                Biobase_2.67.0             
#> [111] png_0.1-8                   SpatialExperiment_1.17.0   
#> [113] spatstat.univar_3.1-1       knitr_1.49                 
#> [115] reshape2_1.4.4              rjson_0.2.23               
#> [117] nlme_3.1-166                proxy_0.4-27               
#> [119] cachem_1.1.0                zoo_1.8-12                 
#> [121] stringr_1.5.1               KernSmooth_2.23-24         
#> [123] parallel_4.4.2              miniUI_0.1.1.1             
#> [125] s2_1.1.7                    pillar_1.9.0               
#> [127] grid_4.4.2                  vctrs_0.6.5                
#> [129] RANN_2.6.2                  promises_1.3.0             
#> [131] xtable_1.8-4                cluster_2.1.6              
#> [133] evaluate_1.0.1              magick_2.8.5               
#> [135] cli_3.6.3                   compiler_4.4.2             
#> [137] rlang_1.1.4                 crayon_1.5.3               
#> [139] future.apply_1.11.3         labeling_0.4.3             
#> [141] classInt_0.4-10             plyr_1.8.9                 
#> [143] stringi_1.8.4               viridisLite_0.4.2          
#> [145] deldir_2.0-4                munsell_0.5.1              
#> [147] lazyeval_0.2.2              spatstat.geom_3.3-4        
#> [149] Matrix_1.7-1                RcppHNSW_0.6.0             
#> [151] patchwork_1.3.0             future_1.34.0              
#> [153] shiny_1.9.1                 SummarizedExperiment_1.37.0
#> [155] ROCR_1.0-11                 igraph_2.1.1               
#> [157] bslib_0.8.0