Title: | smoothclust |
---|---|
Description: | Method for segmentation of spatial domains and spatially-aware clustering in spatial transcriptomics data. The method generates spatial domains with smooth boundaries by smoothing gene expression profiles across neighboring spatial locations, followed by unsupervised clustering. Spatial domains consisting of consistent mixtures of cell types may then be further investigated by applying cell type compositional analyses or differential analyses. |
Authors: | Lukas M. Weber [aut, cre] |
Maintainer: | Lukas M. Weber <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.3.0 |
Built: | 2024-12-04 07:03:21 UTC |
Source: | https://github.com/bioc/smoothclust |
Method for segmentation of spatial domains and spatially-aware clustering.
smoothclust( input, assay_name = "counts", spatial_coords = NULL, method = c("uniform", "kernel", "knn"), bandwidth = 0.05, k = 18, truncate = 0.05, sparse = TRUE )
smoothclust( input, assay_name = "counts", spatial_coords = NULL, method = c("uniform", "kernel", "knn"), bandwidth = 0.05, k = 18, truncate = 0.05, sparse = TRUE )
input |
Input data, which can be provided as either a
|
assay_name |
For a |
spatial_coords |
Numeric matrix of spatial coordinates, assumed to contain x coordinates in first column and y coordinates in second column. This argument is only used if the input is a numeric matrix. |
method |
Method used for smoothing. Options are |
bandwidth |
Bandwidth parameter for smoothing, expressed as proportion
of width or height (whichever is greater) of tissue area. Only used for
|
k |
Number of nearest neighbors parameter for |
truncate |
Truncation threshold parameter if |
sparse |
Whether to return output assay or numeric matrix as sparse
matrix. Default = TRUE. In most cases (e.g. if using
|
Method for segmentation of spatial domains and spatially-aware clustering in spatial transcriptomics data.
Method for segmentation of spatial domains and spatially-aware clustering in spatial transcriptomics data. The method generates spatial domains with smooth boundaries by smoothing gene expression profiles across neighboring spatial locations, followed by unsupervised clustering. Spatial domains consisting of consistent mixtures of cell types may then be further investigated by applying cell type compositional analyses or differential analyses.
Returns spatially smoothed expression values, which can then be used
as the input for further downstream analyses. Results are returned either
as a SpatialExperiment
object containing a new assay
named
<assay_name>_smooth
(e.g. counts_smooth
or
logcounts_smooth
), or as a numeric matrix, depending on the input
type.
library(STexampleData) # load data spe <- Visium_humanDLPFC() # keep spots over tissue spe <- spe[, colData(spe)$in_tissue == 1] # run smoothclust # using "knn" method for faster runtime in this example spe <- smoothclust(spe, method = "knn", k = 6) # see vignette for extended example using default method and including # downstream analysis steps
library(STexampleData) # load data spe <- Visium_humanDLPFC() # keep spots over tissue spe <- spe[, colData(spe)$in_tissue == 1] # run smoothclust # using "knn" method for faster runtime in this example spe <- smoothclust(spe, method = "knn", k = 6) # see vignette for extended example using default method and including # downstream analysis steps
Function for clustering smoothness evaluation metric
smoothness_metric(spatial_coords, labels, k = 6)
smoothness_metric(spatial_coords, labels, k = 6)
spatial_coords |
Numeric matrix containing spatial coordinates of
points, formatted as nrow = number of points, ncol = 2 (assuming x and y
dimensions). For example, 'spatial_coords = spatialCoords(spe)' if using a
|
labels |
Numeric vector of cluster labels for each point. For example,
'labels <- as.numeric(colData(spe)$label)' if using a
|
k |
Number of k nearest neighbors to use in calculation. Default = 6 (from 10x Genomics Visium platform). |
Function to calculate clustering smoothness evaluation metric, defined as the average number of nearest neighbors per point that are from a different cluster. This metric can be used to quantify and compare the relative smoothness of the boundaries of clusters or spatial domains.
Returns a list containing (i) a vector of values at each point (i.e. the number of nearest neighbors that are from a different cluster at each point) and (ii) the average value across all points.
library(STexampleData) library(scran) library(scater) # load data spe <- Visium_humanDLPFC() # keep spots over tissue spe <- spe[, colData(spe)$in_tissue == 1] # run smoothclust # using "knn" method for faster runtime in this example # see vignette for example using default method spe <- smoothclust(spe, method = "knn", k = 6) # calculate logcounts spe <- logNormCounts(spe, assay.type = "counts_smooth") # preprocessing steps for clustering # remove mitochondrial genes is_mito <- grepl("(^mt-)", rowData(spe)$gene_name, ignore.case = TRUE) spe <- spe[!is_mito, ] # select top highly variable genes (HVGs) dec <- modelGeneVar(spe) top_hvgs <- getTopHVGs(dec, prop = 0.1) spe <- spe[top_hvgs, ] # dimensionality reduction set.seed(123) spe <- runPCA(spe) # run k-means clustering set.seed(123) k <- 5 clus <- kmeans(reducedDim(spe, "PCA"), centers = k)$cluster colLabels(spe) <- factor(clus) # calculate smoothness metric res <- smoothness_metric(spatialCoords(spe), as.numeric(colData(spe)$label)) # results str(res) head(res$n_discordant) res$mean_discordant
library(STexampleData) library(scran) library(scater) # load data spe <- Visium_humanDLPFC() # keep spots over tissue spe <- spe[, colData(spe)$in_tissue == 1] # run smoothclust # using "knn" method for faster runtime in this example # see vignette for example using default method spe <- smoothclust(spe, method = "knn", k = 6) # calculate logcounts spe <- logNormCounts(spe, assay.type = "counts_smooth") # preprocessing steps for clustering # remove mitochondrial genes is_mito <- grepl("(^mt-)", rowData(spe)$gene_name, ignore.case = TRUE) spe <- spe[!is_mito, ] # select top highly variable genes (HVGs) dec <- modelGeneVar(spe) top_hvgs <- getTopHVGs(dec, prop = 0.1) spe <- spe[top_hvgs, ] # dimensionality reduction set.seed(123) spe <- runPCA(spe) # run k-means clustering set.seed(123) k <- 5 clus <- kmeans(reducedDim(spe, "PCA"), centers = k)$cluster colLabels(spe) <- factor(clus) # calculate smoothness metric res <- smoothness_metric(spatialCoords(spe), as.numeric(colData(spe)$label)) # results str(res) head(res$n_discordant) res$mean_discordant