Package 'SplineDV' reference manual

Title:	Differential Variability (DV) analysis for single-cell RNA sequencing data. (e.g. Identify Differentially Variable Genes across two experimental conditions)
Description:	A spline based scRNA-seq method for identifying differentially variable (DV) genes across two experimental conditions. Spline-DV constructs a 3D spline from 3 key gene statistics: mean expression, coefficient of variance, and dropout rate. This is done for both conditions. The 3D spline provides the “expected” behavior of genes in each condition. The distance of the observed mean, CV and dropout rate of each gene from the expected 3D spline is used to measure variability. As the final step, the spline-DV method compares the variabilities of each condition to identify differentially variable (DV) genes.
Authors:	Shreyan Gupta [aut, cre] , James Cai [aut]
Maintainer:	Shreyan Gupta <[email protected]>
License:	GPL-2
Version:	0.99.14
Built:	2025-02-08 10:24:57 UTC
Source:	https://github.com/bioc/SplineDV

Plot 3D scatter plot in two conditions

Description

Plots 3D gene statistic scatter plot of target genes in two conditions.

Usage

DVPlot(df, targetgene = NULL, ptSize = 3, lwd = 5, dlwd = 7)
DVPlot(df, targetgene = NULL, ptSize = 3, lwd = 5, dlwd = 7)

Arguments

`df`	Resultant DataFrame after splineDV analysis
`targetgene`	An integer value. Defines the minimum reads required for a cell to be included in the analysis.
`ptSize`	An integer value. Defines point size for dots
`lwd`	An integer value. Defines line width for spline.
`dlwd`	An integer value. Defines Line width for target gene distance.

Value

3D plotly scatter plot

Author(s)

Shreyan Gupta <[email protected]>

Examples

# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5), 0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))
Y <- abs(matrix(round(rpois(2000*500, lambda=0.5), 0), nrow=2000, ncol=500))
rownames(Y) <- paste0('g', as.character(1:2000))

## Run splineDV
res <- splineDV(X, Y)
fig <- DVPlot(res)
# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5), 0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))
Y <- abs(matrix(round(rpois(2000*500, lambda=0.5), 0), nrow=2000, ncol=500))
rownames(Y) <- paste0('g', as.character(1:2000))

## Run splineDV
res <- splineDV(X, Y)
fig <- DVPlot(res)

Plot HVG 3D scatter plot

Description

Plots 3D scatter plot with HVGs or target gene(s) highlighted. The 3D spline represents the estimated expected behavior of genes in the sample.

Usage

HVGPlot(df, targetgene = NULL, ptSize = 3, lwd = 5, dlwd = 7)
HVGPlot(df, targetgene = NULL, ptSize = 3, lwd = 5, dlwd = 7)

Arguments

`df`	Resultant DataFrame after splineHVG analysis
`targetgene`	An integer value. Defines the minimum reads required for a cell to be included in the analysis.
`ptSize`	An integer value. Defines point size for dots
`lwd`	An integer value. Defines line width for spline.
`dlwd`	An integer value. Defines Line width for target gene distance.

Value

3D plotly scatter plot.

Author(s)

Shreyan Gupta <[email protected]>

Examples

# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5),0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))

## Run splineHVG
res <- splineHVG(X)
fig <- HVGPlot(res)
# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5),0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))

## Run splineHVG
res <- splineHVG(X)
fig <- HVGPlot(res)

Quality control

Description

QC filter scRNA-seq expression data. Removes lowly expressed genes and cells. Also removes cells with high mitochondrial gene expression.

Usage

hvgQC(X, ncounts = 1000, ncells = 15, mtPerc = 15)
hvgQC(X, ncounts = 1000, ncells = 15, mtPerc = 15)

Arguments

`X`	Count matrix or SingleCellExperiment.
`ncounts`	An integer value. Defines the minimum reads required for a cell to be included in the analysis.
`ncells`	An integer value. Defines the minimum cells required for a gene to be included in the analysis.
`mtPerc`	A double value. Defines the minimum percent mitochondrial genes expression required for a cell to be excluded from the analysis.

Value

QC filtered SingleCellExperiment.

Author(s)

Shreyan Gupta <[email protected]>

Examples

# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5),0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))

## Run splineHVG
QCres <- hvgQC(X)
# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5),0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))

## Run splineHVG
QCres <- hvgQC(X)

Differential Variability (DV) analysis. 3 gene statistics are used - Mean, CV and Dropout rate of each gene. A smooth.spline is modeled using these statistics. First, the distance vectors of each gene from spline is computed using the splineHVG function for each condition. Then the vector distance between the distance vectors are computed. A high

Usage

splineDV(
  X,
  Y,
  ncounts = 1000,
  ncells = 15,
  spar = 0.5,
  mtPerc = 15,
  detailed = FALSE,
  verbose = TRUE
)
splineDV(
  X,
  Y,
  ncounts = 1000,
  ncells = 15,
  spar = 0.5,
  mtPerc = 15,
  detailed = FALSE,
  verbose = TRUE
)

Arguments

`X`	Count matrix or SingleCellExperiment (Test sample)
`Y`	Count matrix or SingleCellExperiment (Control sample)
`ncounts`	An integer value. Defines the minimum reads required for a cell to be included in the analysis.
`ncells`	An integer value. Defines the minimum cells required for a gene to be included in the analysis.
`spar`	A double value. Smoothing parameter for Spline.
`mtPerc`	A double value. Defines the minimum percent mitochondrial genes expression required for a cell to be excluded from the analysis.
`detailed`	A boolean value. Defines whether to add individual splineHVG DataFrame to the output.
`verbose`	A Boolean value (TRUE/FALSE), if TRUE, prints messages.

Value

A DataFrame with DV Statistics. The statistics include log1p(mean), log1p(CV), dropout rates, nearest point on spline (splinex, spliney and splinez) and distance from spline for each sample. The distance across conditions is stored in "vectorDist". P values are computed assuming normal distribution of distance differences.

Author(s)

Shreyan Gupta <[email protected]>

Examples

# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5), 0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))
Y <- abs(matrix(round(rpois(2000*500, lambda=0.5), 0), nrow=2000, ncol=500))
rownames(Y) <- paste0('g', as.character(1:2000))

## Run splineDV
res <- splineDV(X, Y)
# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5), 0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))
Y <- abs(matrix(round(rpois(2000*500, lambda=0.5), 0), nrow=2000, ncol=500))
rownames(Y) <- paste0('g', as.character(1:2000))

## Run splineDV
res <- splineDV(X, Y)

Spline-HVG

Description

Compute Highly Variable Genes from an scRNAseq expression data. Uses 3 gene statstics - Mean, CV and Dropout rate to model a 3D spline to estimate the expected behavior of genes. A gene is considered highly variable if the actual gene expression is far from the estimated 3D spline.

Usage

splineHVG(
  X,
  QC = TRUE,
  ncounts = 1000,
  ncells = 15,
  mtPerc = 15,
  spar = 0.5,
  nHVGs = 2000,
  use.ndist = TRUE,
  verbose = TRUE
)
splineHVG(
  X,
  QC = TRUE,
  ncounts = 1000,
  ncells = 15,
  mtPerc = 15,
  spar = 0.5,
  nHVGs = 2000,
  use.ndist = TRUE,
  verbose = TRUE
)

Arguments

`X`	Count matrix or SingleCellExperiment
`QC`	A Boolean value (TRUE/FALSE), if TRUE, a quality control is applied over the data.
`ncounts`	An integer value. Defines the minimum reads required for a cell to be included in the analysis.
`ncells`	An integer value. Defines the minimum cells required for a gene to be included in the analysis.
`mtPerc`	A double value. Defines the minimum percent mitochondrial genes expression required for a cell to be excluded from the analysis.
`spar`	A double value. Smoothing parameter for Spline.
`nHVGs`	An integer value. Number of top Highly Variable Genes (HVGs) to select.
`use.ndist`	A Boolean value (TRUE/FALSE), if TRUE, computes the nearest point on spline by nearest-neighbor search (TRUE Recommended). Else, uses the position of the corresponding gene on the spline for distance computation.
`verbose`	A Boolean value (TRUE/FALSE), if TRUE, prints messages.

Value

A DataFrame with highly variable gene selection statistics. The statistics include log1p(mean), log1p(CV), dropout rates, nearest point on spline (splinex, spliney and splinez) and distance from spline. A higher distance signifies higher variability.

Author(s)

Shreyan Gupta <[email protected]>

Examples

# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5),0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))

## Run splineHVG
res <- splineHVG(X)
# example code
## Generate example count data
X <- abs(matrix(round(rpois(2000*500, lambda=0.5),0), nrow=2000, ncol=500))
rownames(X) <- paste0('g', as.character(1:2000))

## Run splineHVG
res <- splineHVG(X)

Package 'SplineDV'

Help Index

Plot 3D scatter plot in two conditions

Description

Usage

Arguments

Value

Author(s)

Examples

Plot HVG 3D scatter plot

Description

Usage

Arguments

Value

Author(s)

Examples

Quality control

Description

Usage

Arguments

Value

Author(s)

Examples

Spline-DV

Description

Usage

Arguments

Value

Author(s)

Examples

Spline-HVG

Description

Usage

Arguments

Value

Author(s)

Examples