| Title: | Identify Batch-Biased Spatially Variable Genes |
|---|---|
| Description: | BatchSVG is a method to identify batch-biased spatially variable genes (SVGs) in spatial transcriptomics data. The batch variable can be defined as sample, donor sex, or other batch effects of interest. The BatchSVG method is based on the binomial deviance model (Townes et al, 2019). |
| Authors: | Christine Hou [aut] (ORCID: <https://orcid.org/0009-0001-5350-0629>), Kinnary Shah [aut, cre], Jacqueline R. Thompson [aut], Stephanie C. Hicks [aut, fnd] (ORCID: <https://orcid.org/0000-0002-7858-0231>) |
| Maintainer: | Kinnary Shah <[email protected]> |
| License: | Artistic-2.0 |
| Version: | 1.5.2 |
| Built: | 2026-05-08 21:03:54 UTC |
| Source: | https://github.com/bioc/BatchSVG |
Function to identify the bias genes based on user-selected threshold of number of standard deviation in relative change in deviance and rank difference.
biasDetect( list_batch_df, threshold = c("both", "dev", "rank"), nSD_dev = NULL, nSD_rank = NULL, plot_point_size = 3, plot_point_shape = 16, plot_text_size = 3, plot_palette = "YlOrRd" )biasDetect( list_batch_df, threshold = c("both", "dev", "rank"), nSD_dev = NULL, nSD_rank = NULL, plot_point_size = 3, plot_point_shape = 16, plot_text_size = 3, plot_palette = "YlOrRd" )
list_batch_df |
|
threshold |
A character string specifying the filtering criterion. Must be one of:
|
nSD_dev |
|
nSD_rank |
|
plot_point_size |
|
plot_point_shape |
|
plot_text_size |
|
plot_palette |
|
A named list where each element corresponds to a batch and contains:
"Plot": A diagnostic plot (either deviance, rank, or both).
"Table": A filtered data frame containing outlier genes based on the
specified threshold.
# use the result generated from featureSelect() data(list_batch_df) biaGenes <- biasDetect(list_batch_df = list_batch_df, threshold = "both", nSD_dev = 3, nSD_rank = 3)# use the result generated from featureSelect() data(list_batch_df) biaGenes <- biasDetect(list_batch_df = list_batch_df, threshold = "both", nSD_dev = 3, nSD_rank = 3)
The function computes batch-adjusted deviance values, ranks the
genes accordingly, and quantifies batch effects in terms of standard
deviations from the mean difference. The list follows the order of batch
effects provided in batch_effects.
featureSelect(input, batch_effects = NULL, VGs = NULL, verbose = TRUE)featureSelect(input, batch_effects = NULL, VGs = NULL, verbose = TRUE)
input |
A |
batch_effects |
A character vector specifying column names in
|
VGs |
A character vector specifying the variable genes (VGs) to be analyzed. Only genes present in this vector will be retained for feature selection. |
verbose |
Logical (TRUE or FALSE). Default is TRUE. If TRUE, progress messages will be printed; If FALSE, messages will be suppressed. |
A named list where each element corresponds to a batch effect. Each batch contains a data frame with the following columns:
"gene_id": Gene identifier.
"gene_name": Gene name.
"dev_default": Deviance score without batch correction.
"dev_
"rank_default": Rank of the gene based on deviance without batch correction.
"rank_
"d_diff": Relative change in deviance between default and batch-corrected models.
"nSD_dev_
"r_diff": Rank difference between default and batch-corrected models.
"nSD_rank_
library(spatialLIBD) spatialLIBD_spe <- fetch_data(type = "spe") libd_svg <- read.csv( system.file("extdata","libd-all_nnSVG_p-05-features-df.csv", package = "BatchSVG"), row.names = 1, check.names = FALSE) list_batch_df <- featureSelect(input = spatialLIBD_spe, batch_effects = "subject", VGs = libd_svg$gene_id)library(spatialLIBD) spatialLIBD_spe <- fetch_data(type = "spe") libd_svg <- read.csv( system.file("extdata","libd-all_nnSVG_p-05-features-df.csv", package = "BatchSVG"), row.names = 1, check.names = FALSE) list_batch_df <- featureSelect(input = spatialLIBD_spe, batch_effects = "subject", VGs = libd_svg$gene_id)
The list_batch_df dataset contains results from featureSelect() applied
to the spatial transcriptomics data from the spatialLIBD package.
data(list_batch_df)data(list_batch_df)
A named list of data frames, where each element corresponds to a batch effect:
"gene_id": Gene identifier.
"gene_name": Gene name.
"dev_default": Deviance score without batch correction.
"dev_(batch name)": Deviance score with batch correction.
"rank_default": Rank of the gene based on deviance without batch correction.
"rank_(batch name)": Rank of the gene based on deviance with batch correction.
"d_diff": Relative change in deviance between default and batch-corrected models.
"nSD_dev_(batch name)": number of standard deviation of relative change in deviance for the batch.
"r_diff": Rank difference between default and batch-corrected models.
"nSD_rank_(batch name)": number of standard deviation of rank difference for the batch.
https://github.com/christinehou11/BatchSVG/blob/main/inst/scripts/make-list_batch_df.R
This function generates visualizations to assess the impact of batch effects on spatially variable genes (SVGs) by analyzing changes in deviance and rank. The function bins the deviations into normalized standard deviation (nSD) intervals and creates histograms and scatter plots to illustrate the distribution of batch effects.
svg_nSD(list_batch_df, sd_interval_dev, sd_interval_rank)svg_nSD(list_batch_df, sd_interval_dev, sd_interval_rank)
list_batch_df |
A named list of data frames, where each data frame corresponds to a batch effect and contains columns with deviance and rank differences. |
sd_interval_dev |
A numeric vector specifying the
interval widths for standard deviation bins for each batch when analyzing
the relative change in deviance. The order of values must correspond to
the order of batches in |
sd_interval_rank |
|
A combined ggplot object containing:
Deviance Plots:
Histogram of deviance differences across SVGs, colored by nSD intervals.
Scatter plot comparing deviance values before and after batch correction.
Rank Plots:
Histogram of rank differences across SVGs, colored by nSD intervals.
Scatter plot comparing ranks before and after batch correction.
The function arranges plots for each batch in a grid format for easy comparison.
# use the result generated from featureSelect() data(list_batch_df) plots <- svg_nSD(list_batch_df = list_batch_df, sd_interval_dev = 3, sd_interval_rank = 3)# use the result generated from featureSelect() data(list_batch_df) plots <- svg_nSD(list_batch_df = list_batch_df, sd_interval_dev = 3, sd_interval_rank = 3)