Spatial heterogeneity with SPIAT

library(SPIAT)

Cell colocalisation metrics allow capturing a dominant spatial pattern in an image. However, patterns are unlikely to be distributed evenly in a tissue, but rather there will be spatial heterogeneity of patterns. To measure this, SPIAT splits the image into smaller images (either using a grid or concentric circles around a reference cell population), followed by calculation of a spatial metric of a pattern of interest (e.g. cell colocalisation, entropy), and then measures the Prevalence and Distinctiveness of the pattern.

In this vignette we will use an inForm data file that’s already been formatted for SPIAT with format_image_to_spe(), which we can load with data(). We will use define_celltypes() to define the cells with certain combinations of markers.

data("simulated_image")

# define cell types
formatted_image <- define_celltypes(
    simulated_image, 
    categories = c("Tumour_marker","Immune_marker1,Immune_marker2", 
                   "Immune_marker1,Immune_marker3", 
                   "Immune_marker1,Immune_marker2,Immune_marker4", "OTHER"), 
    category_colname = "Phenotype", 
    names = c("Tumour", "Immune1", "Immune2", "Immune3", "Others"),
    new_colname = "Cell.Type")

Localised Entropy

Entropy in spatial analysis refers to the balance in the number of cells of distinct populations. An entropy score can be obtained for an entire image. However, the entropy of one image does not provide us spatial information of the image.

calculate_entropy(formatted_image, cell_types_of_interest = c("Immune1","Immune2"), 
                  feature_colname = "Cell.Type")
## [1] 0.9294873

We therefore propose the concept of Localised Entropy which calculates an entropy score for a predefined local region. These local regions can be calculated as defined in the next two sections.

Fishnet grid

One approach to calculate localised metric is to split the image into fishnet grid squares. For each grid square, grid_metrics() calculates the metric for that square and visualise the raster image. Users can choose any metric as the localised metric. Here we use entropy as an example.

For cases where the localised metric is not symmetrical (requires specifying a target and reference cell type), the first item in the vector used for cell_types_of_interest marks the reference cells and the second item the target cells. Here we are using Entropy, which is symmetrical, so we can use any order of cell types in the input.

data("defined_image")
grid <- grid_metrics(defined_image, FUN = calculate_entropy, n_split = 20,
                     cell_types_of_interest=c("Tumour","Immune3"), 
                     feature_colname = "Cell.Type")

After calculating the localised entropy for each of the grid squares, we can apply metrics like percentages of grid squares with patterns (Prevalence) and Moran’s I (Distinctiveness).

For the Prevalence, we need to select a threshold over which grid squares are considered ‘positive’ for the pattern. The selection of threshold depends on the pattern and metric the user chooses to find the localised pattern. Here we chose 0.75 for entropy because 0.75 is roughly the entropy of two cell types when their ratio is 1:5 or 5:1.

calculate_percentage_of_grids(grid, threshold = 0.75, above = TRUE)
## [1] 13
calculate_spatial_autocorrelation(grid, metric = "globalmoran")
## [1] 0.2197134

Gradients (based on concentric circles)

We can use the compute_gradient() function to calculate metrics (entropy, mixing score, percentage of cells within radius, marker intensity) for a range of radii from reference cells. Here, an increasing circle is drawn around each cell of the reference cell type and the desired score is calculated for cells within each circle.

The first item in the vector used for cell_types_of_interest marks the reference cells and the second item the target cells. Here, Immune1 cells are reference cells and Immune2 are target cells.

gradient_positions <- c(30, 50, 100)
gradient_entropy <- 
  compute_gradient(defined_image, radii = gradient_positions, 
                   FUN = calculate_entropy,  cell_types_of_interest = c("Immune1","Immune2"),
                   feature_colname = "Cell.Type")
length(gradient_entropy)
## [1] 3
head(gradient_entropy[[1]])
##         Cell.X.Position Cell.Y.Position Immune1 Immune2 total Immune1_log2
## Cell_15       109.67027        17.12956       1       0     1            0
## Cell_25       153.22795       128.29915       1       0     1            0
## Cell_30        57.29037        49.88533       1       0     1            0
## Cell_48        83.47798       295.75058       2       0     2            1
## Cell_56        35.24227       242.84862       1       0     1            0
## Cell_61       156.39943       349.08154       2       0     2            1
##         Immune2_log2 total_log2 Immune1ratio Immune1_entropy Immune2ratio
## Cell_15            0          0            1               0            0
## Cell_25            0          0            1               0            0
## Cell_30            0          0            1               0            0
## Cell_48            0          1            1               0            0
## Cell_56            0          0            1               0            0
## Cell_61            0          1            1               0            0
##         Immune2_entropy entropy
## Cell_15               0       0
## Cell_25               0       0
## Cell_30               0       0
## Cell_48               0       0
## Cell_56               0       0
## Cell_61               0       0

The compute_gradient() function outputs the numbers cells within each radii for each reference cell. The output is formatted as a list of data.frames, one for each specified radii. In each data.frame, the rows show the reference cells. The last column of the data.frame is the entropy calculated for cells in the circle of the reference cell. Users can then an average score or another aggregation metric to report the results.

An alternative approach is to combine the results of all the circles (rather than have one for each individual reference cell). Here, for each radii, we simultaneously identify all the cells in the circles surrounding each reference cell, and calculate a single entropy score. We have created a specific function for this - entropy_gradient_aggregated(). The output of this function is an overall entropy score for each radii.

gradient_pos <- seq(50, 500, 50) ##radii
gradient_results <- entropy_gradient_aggregated(defined_image, cell_types_of_interest = c("Immune3","Tumour"),
                                                feature_colname = "Cell.Type", radii = gradient_pos)
# plot the results
plot(1:10,gradient_results$gradient_df[1, 3:12])

Reproducibility

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] SPIAT_1.9.0                 SpatialExperiment_1.17.0   
##  [3] SingleCellExperiment_1.29.1 SummarizedExperiment_1.37.0
##  [5] Biobase_2.67.0              GenomicRanges_1.59.0       
##  [7] GenomeInfoDb_1.43.1         IRanges_2.41.1             
##  [9] S4Vectors_0.45.2            BiocGenerics_0.53.3        
## [11] generics_0.1.3              MatrixGenerics_1.19.0      
## [13] matrixStats_1.4.1           BiocStyle_2.35.0           
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3      sys_3.4.3               jsonlite_1.8.9         
##   [4] shape_1.4.6.1           magrittr_2.0.3          spatstat.utils_3.1-1   
##   [7] magick_2.8.5            farver_2.1.2            rmarkdown_2.29         
##  [10] GlobalOptions_0.1.2     zlibbioc_1.52.0         vctrs_0.6.5            
##  [13] spatstat.explore_3.3-3  elsa_1.1-28             terra_1.7-83           
##  [16] htmltools_0.5.8.1       S4Arrays_1.7.1          raster_3.6-30          
##  [19] SparseArray_1.7.2       sass_0.4.9              pracma_2.4.4           
##  [22] bslib_0.8.0             htmlwidgets_1.6.4       mmand_1.6.3            
##  [25] plyr_1.8.9              plotly_4.10.4           cachem_1.1.0           
##  [28] buildtools_1.0.0        lifecycle_1.0.4         iterators_1.0.14       
##  [31] pkgconfig_2.0.3         Matrix_1.7-1            R6_2.5.1               
##  [34] fastmap_1.2.0           GenomeInfoDbData_1.2.13 clue_0.3-66            
##  [37] digest_0.6.37           colorspace_2.1-1        tensor_1.5             
##  [40] crosstalk_1.2.1         labeling_0.4.3          fansi_1.0.6            
##  [43] spatstat.sparse_3.1-0   httr_1.4.7              polyclip_1.10-7        
##  [46] abind_1.4-8             compiler_4.4.2          bit64_4.5.2            
##  [49] withr_3.0.2             doParallel_1.0.17       apcluster_1.4.13       
##  [52] DelayedArray_0.33.2     rjson_0.2.23            gtools_3.9.5           
##  [55] tools_4.4.2             goftest_1.2-3           glue_1.8.0             
##  [58] dbscan_1.2-0            nlme_3.1-166            grid_4.4.2             
##  [61] Rtsne_0.17              cluster_2.1.6           reshape2_1.4.4         
##  [64] gtable_0.3.6            spatstat.data_3.1-4     tzdb_0.4.0             
##  [67] tidyr_1.3.1             data.table_1.16.2       sp_2.1-4               
##  [70] utf8_1.2.4              XVector_0.47.0          spatstat.geom_3.3-3    
##  [73] ggrepel_0.9.6           RANN_2.6.2              foreach_1.5.2          
##  [76] pillar_1.9.0            stringr_1.5.1           vroom_1.6.5            
##  [79] circlize_0.4.16         dplyr_1.1.4             lattice_0.22-6         
##  [82] bit_4.5.0               deldir_2.0-4            tidyselect_1.2.1       
##  [85] ComplexHeatmap_2.23.0   maketools_1.3.1         knitr_1.49             
##  [88] gridExtra_2.3           xfun_0.49               dittoSeq_1.19.0        
##  [91] pheatmap_1.0.12         stringi_1.8.4           UCSC.utils_1.3.0       
##  [94] lazyeval_0.2.2          yaml_2.3.10             evaluate_1.0.1         
##  [97] codetools_0.2-20        tibble_3.2.1            BiocManager_1.30.25    
## [100] cli_3.6.3               munsell_0.5.1           jquerylib_0.1.4        
## [103] Rcpp_1.0.13-1           spatstat.random_3.3-2   png_0.1-8              
## [106] spatstat.univar_3.1-1   parallel_4.4.2          ggplot2_3.5.1          
## [109] viridisLite_0.4.2       scales_1.3.0            ggridges_0.5.6         
## [112] purrr_1.0.2             crayon_1.5.3            GetoptLong_1.0.5       
## [115] rlang_1.1.4             cowplot_1.1.3

Author Contributions

AT, YF, TY, ML, JZ, VO, MD are authors of the package code. MD and YF wrote the vignette. AT, YF and TY designed the package.