Title: | In-Silico Annotation of Doublets for Single Cell RNA Sequencing Data |
---|---|
Description: | In single cell RNA sequencing (scRNA-seq) data, combinations of cells are sometimes considered a single cell (doublets). The scds package provides methods to annotate doublets in scRNA-seq data computationally. |
Authors: | Dennis Kostka [aut, cre], Bais Abha [aut] |
Maintainer: | Dennis Kostka <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.23.0 |
Built: | 2024-11-27 05:12:28 UTC |
Source: | https://github.com/bioc/scds |
Annotates doublets/multiplets using a binary classification approach to discriminate artificial doublets from original data.
bcds(sce, ntop = 500, srat = 1, verb = FALSE, retRes = FALSE, nmax = "tune", varImp = FALSE, estNdbl = FALSE)
sce |
single cell experiment ( |
ntop |
integer, indicating number of top variance genes to consider. Default: 500 |
srat |
numeric, indicating ratio between original number of "cells" and simulated doublets. Default: 1 |
verb |
progress messages. Default: FALSE |
retRes |
logical, should the trained classifier be returned? Default: FALSE |
nmax |
maximum number of training rounds; integer or "tune". Default: "tune" |
varImp |
logical, should variable (i.e., gene) importance be returned? Default: FALSE |
estNdbl |
logical, should the number of doublets be estimated from the data? Enables doublet calls. Default: FALSE. Use with caution. |
sce |
input sce object (SingleCellExperiment) with doublet scores added to colData as "bcds_score" column, and possibly more (details) |
data("sce_chcl")
## create small data set using only 100 cells
sce_chcl_small = sce_chcl[, 1:100]
sce_chcl_small = bcds(sce_chcl_small)
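The artificial doublets that bcds discriminates from the original data can be pictured as sums of randomly paired cells. The following is a minimal base-R sketch of that idea, assuming plain addition of raw counts; it is not the package's implementation, and the names `simulate_doublets`, `counts`, `sim`, `combined`, and `labels` are hypothetical. With one simulated doublet per original cell it mirrors the default `srat = 1`.

```r
set.seed(1)

# Toy count matrix: 20 genes x 10 cells (made-up data for illustration).
counts <- matrix(rpois(200, lambda = 2), nrow = 20,
                 dimnames = list(paste0("gene", 1:20), paste0("cell", 1:10)))

# Hypothetical helper: build n_sim artificial doublets by adding the
# counts of two distinct, randomly chosen cells.
simulate_doublets <- function(counts, n_sim = ncol(counts)) {
  pairs <- replicate(n_sim, sample(ncol(counts), 2))  # 2 x n_sim index matrix
  sim <- counts[, pairs[1, ], drop = FALSE] + counts[, pairs[2, ], drop = FALSE]
  colnames(sim) <- paste0("sim", seq_len(n_sim))
  sim
}

sim <- simulate_doublets(counts)

# Training input for a binary classifier: original cells (label 0)
# versus artificial doublets (label 1).
combined <- cbind(counts, sim)
labels   <- c(rep(0L, ncol(counts)), rep(1L, ncol(sim)))
```

A classifier trained on `combined` and `labels` would then score the original cells; cells that look like artificial doublets receive high scores.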
Annotates doublets/multiplets using co-expression based approach
cxds(sce, ntop = 500, binThresh = 0, verb = FALSE, retRes = FALSE, estNdbl = FALSE)
sce |
single cell experiment ( |
ntop |
integer, indicating number of top variance genes to consider. Default: 500 |
binThresh |
integer, minimum counts to consider a gene "present" in a cell. Default: 0 |
verb |
progress messages. Default: FALSE |
retRes |
logical, whether to return gene pair scores & top-scoring gene pairs? Default: FALSE. |
estNdbl |
logical, should the number of doublets be estimated from the data? Enables doublet calls. Default: FALSE. Use with caution. |
sce |
input sce object (SingleCellExperiment) with doublet scores added to colData as "cxds_score" column. |
data("sce_chcl")
## create small data set using only 100 cells
sce_chcl_small = sce_chcl[, 1:100]
sce_chcl_small = cxds(sce_chcl_small)
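To illustrate what `binThresh` controls, the base-R sketch below binarizes a toy count matrix into present/absent calls and counts, for every gene pair, the cells in which both genes are present. It assumes "present" means strictly more than `binThresh` counts (the reading under which the default of 0 is informative). It only shows the binarization and a pairwise co-occurrence table, not the scoring that cxds applies; all object names are hypothetical.

```r
# Toy counts: 3 genes x 4 cells (made-up data for illustration).
counts <- matrix(c(0, 3, 0, 1,
                   2, 0, 0, 5,
                   0, 0, 4, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 paste0("cell", 1:4)))

binThresh <- 0

# Assumption: a gene is "present" in a cell if its count exceeds binThresh.
# Multiplying by 1 turns the logical matrix into a 0/1 numeric matrix.
present <- (counts > binThresh) * 1

# Gene-by-gene table: entry [i, j] is the number of cells in which
# genes i and j are both present (the diagonal counts cells per gene).
both <- tcrossprod(present)
both
```

Here each gene is present in two cells, and every pair of genes is co-present only in `cell4`.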
Annotates doublets/multiplets using the hybrid approach
cxds_bcds_hybrid(sce, cxdsArgs = NULL, bcdsArgs = NULL, verb = FALSE, estNdbl = FALSE, force = FALSE)
sce |
single cell experiment ( |
cxdsArgs |
list, arguments for cxds function in list form. Default: NULL |
bcdsArgs |
list, arguments for bcds function in list form. Default: NULL |
verb |
logical, switch on/off progress messages |
estNdbl |
logical, should the number of doublets be estimated from the data? Enables doublet calls. Default: FALSE. Use with caution. |
force |
logical, force a (re)run of |
sce |
input sce object (SingleCellExperiment) with doublet scores added to colData as "hybrid_score" column. |
data("sce_chcl")
## create small data set using only 100 cells
sce_chcl_small = sce_chcl[, 1:100]
sce_chcl_small = cxds_bcds_hybrid(sce_chcl_small)
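A usage sketch for passing per-method arguments through the hybrid call. It assumes that `cxdsArgs` and `bcdsArgs` accept named lists of the arguments documented for cxds and bcds (here `ntop`), and it reads the result with `SummarizedExperiment::colData`. The variable name `sce_small` and the value `ntop = 100` are illustrative choices, not recommendations.

```r
library(scds)

data("sce_chcl")

# Small data set using only 100 cells, as in the example above.
sce_small <- sce_chcl[, 1:100]

# Assumption: each list is forwarded to the corresponding function,
# so both methods use the 100 top variance genes instead of 500.
sce_small <- cxds_bcds_hybrid(sce_small,
                              cxdsArgs = list(ntop = 100),
                              bcdsArgs = list(ntop = 100))

# The hybrid score is stored per cell in colData.
hybrid <- SummarizedExperiment::colData(sce_small)$hybrid_score
head(hybrid)
```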
Extract top-scoring gene pairs from a SingleCellExperiment where cxds has been run
cxds_getTopPairs(sce, n = 100)
sce |
single cell experiment to analyze; needs "counts" in assays slot. |
n |
integer. The number of gene pairs to extract. Default: 100 |
matrix, with two columns, each containing gene indexes for gene pairs (rows).
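The return format (one row per gene pair, two columns of gene indexes) can be illustrated with a generic base-R sketch that picks the highest-scoring pairs from a symmetric gene-by-gene score matrix. This is not the package's implementation; the helper `top_pairs`, the matrix `S`, and the column names `gene1` and `gene2` are hypothetical.

```r
# Hypothetical helper: return the n highest-scoring gene pairs of a
# symmetric score matrix S as a two-column matrix of gene indexes.
top_pairs <- function(S, n = 100) {
  ut  <- which(upper.tri(S), arr.ind = TRUE)  # each unordered pair once
  ord <- order(S[ut], decreasing = TRUE)      # rank pairs by score
  pairs <- ut[head(ord, n), , drop = FALSE]
  dimnames(pairs) <- list(NULL, c("gene1", "gene2"))
  pairs
}

# Toy symmetric score matrix for 4 genes (made-up values).
S <- matrix(0, 4, 4)
S[1, 3] <- S[3, 1] <- 5
S[2, 4] <- S[4, 2] <- 9
S[1, 2] <- S[2, 1] <- 1

tp <- top_pairs(S, n = 2)
tp
```

The first row of `tp` holds the indexes of the best-scoring pair (genes 2 and 4), the second row the runner-up (genes 1 and 3).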
Wrapper for getting doublet calls
get_dblCalls_ALL(scrs_real, scrs_sim, rel_loss = 1)
scrs_real |
numeric vector, the scores for the real/original data |
scrs_sim |
numeric vector, the scores for the artificial doublets |
rel_loss |
numeric scalar, relative weight of a false positive classification compared with a false negative. Default: 1 (same loss for fp and fn). |
numeric, matrix containing the (estimated) number of doublets, the score threshold and the fraction of artificial doublets missed (false negative rate, of sorts) as columns, for four types of estimation: "youden", "balanced", and a false negative rate of artificial doublets of 0.1 and 0.01, respectively.
Given score vectors for real data and artificial doublets, derive doublet calls based on determining doublet score cutoffs.
get_dblCalls_dist(scrs_real, scrs_sim, type = "balanced")
scrs_real |
numeric vector, the scores for the real/original data |
scrs_sim |
numeric vector, the scores for the artificial doublets |
type |
character or numeric, describes how the score threshold for calling doublets is determined. Either |
numeric, vector containing the (estimated) number of doublets, the score threshold and the fraction of artificial doublets missed (false negative rate, of sorts)
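One way to picture a cutoff defined by the fraction of artificial doublets missed is the base-R sketch below. It assumes the threshold is the corresponding quantile of the artificial-doublet scores and that original cells scoring at or above it are called doublets; the package's rule may differ. The helper `call_by_fnr` and the simulated score vectors are hypothetical.

```r
# Hypothetical helper: choose the threshold so that a fraction `fnr` of
# the artificial doublets scores below it, then count original cells at
# or above that threshold.
call_by_fnr <- function(scrs_real, scrs_sim, fnr = 0.1) {
  thresh <- unname(quantile(scrs_sim, probs = fnr))
  c(number    = sum(scrs_real >= thresh),  # estimated number of doublets
    threshold = thresh,                    # score cutoff
    fnr       = mean(scrs_sim < thresh))   # artificial doublets missed
}

set.seed(42)
# Made-up scores: 90 singlet-like and 10 doublet-like original cells,
# plus 100 artificial doublets.
scrs_real <- c(rnorm(90, mean = 0), rnorm(10, mean = 3))
scrs_sim  <- rnorm(100, mean = 3)

res <- call_by_fnr(scrs_real, scrs_sim, fnr = 0.1)
res
```

The result has the same three components described above: number of doublets, score threshold, and fraction of artificial doublets missed.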
Given class probabilities (or scores) discriminating real data from artificial doublets, derive doublet calls. Based on selecting a ROC cutoff; see "The Inconsistency of 'Optimal' Cutpoints Obtained using Two Criteria based on the Receiver Operating Characteristic Curve" (doi).
get_dblCalls_ROC(scrs_real, scrs_sim, rel_loss = 1)
scrs_real |
numeric vector, the scores for the real/original data |
scrs_sim |
numeric vector, the scores for the artificial doublets |
rel_loss |
numeric scalar, relative weight of a false positive classification compared with a false negative. Default: 1 (same loss for fp and fn). |
numeric, vector containing the (estimated) number of doublets, the score threshold and the fraction of artificial doublets missed (false negative rate, of sorts)
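A common ROC-based criterion is Youden's index, which picks the cutoff maximizing the true positive rate minus the false positive rate. The base-R sketch below applies it with artificial doublets as positives and original cells as negatives. It ignores `rel_loss` and is not the package's implementation; the helper `youden_cutoff` and the toy scores are hypothetical.

```r
# Hypothetical helper: scan all observed scores as candidate cutoffs and
# keep the one maximizing Youden's J = TPR - FPR.
youden_cutoff <- function(scrs_real, scrs_sim) {
  cand <- sort(unique(c(scrs_real, scrs_sim)))
  tpr  <- vapply(cand, function(t) mean(scrs_sim  >= t), numeric(1))
  fpr  <- vapply(cand, function(t) mean(scrs_real >= t), numeric(1))
  best <- which.max(tpr - fpr)
  c(number    = sum(scrs_real >= cand[best]),  # original cells called doublets
    threshold = cand[best],                    # selected score cutoff
    fnr       = 1 - tpr[best])                 # artificial doublets missed
}

# Made-up scores: one original cell (0.9) looks like a doublet.
scrs_real <- c(0.1, 0.2, 0.3, 0.9)
scrs_sim  <- c(0.8, 0.85, 0.95, 0.7)

res <- youden_cutoff(scrs_real, scrs_sim)
res
```

In this toy case the cutoff lands at 0.7: all artificial doublets score at or above it, and one original cell is called a doublet. Because the original data contain real doublets, the "false positive rate" here is only a proxy.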
SingleCellExperiment object. Example data set, created by randomly sampling genes and cells from a real data set (ch_cl, i.e., the cell lines data from https://satijalab.org/seurat/hashing_vignette.html). Contains raw counts in the counts assay slot.
sce_chcl
a single cell experiment object (SingleCellExperiment) with raw counts in the counts assay, and colData with experimental annotations.