Package 'scRecover'

Title: scRecover for imputation of single-cell RNA-seq data
Description: scRecover is an R package for imputation of single-cell RNA-seq (scRNA-seq) data. It will detect and impute dropout values in a scRNA-seq raw read counts matrix while keeping the real zeros unchanged, since there are both dropout zeros and real zeros in scRNA-seq data. By combination with scImpute, SAVER and MAGIC, scRecover not only detects dropout and real zeros at higher accuracy, but also improve the downstream clustering and visualization results.
Authors: Zhun Miao, Xuegong Zhang <[email protected]>
Maintainer: Zhun Miao <[email protected]>
License: GPL
Version: 1.23.0
Built: 2024-12-07 06:12:34 UTC
Source: https://github.com/bioc/scRecover

Help Index


countsSampling: Downsampling the read counts in a cell

Description

This function is used to downsample the read counts in a cell for single-cell RNA-seq (scRNA-seq) data. It takes a non-negative vector of scRNA-seq raw read counts of a cell as input.

Usage

countsSampling(counts, fraction = 0.1)

Arguments

counts

A cell's raw read counts for each gene, could be a vector or a SingleCellExperiment object.

fraction

Fraction of reads to be downsampled, should be between 0-1, default is 0.1.

Value

A vector of the downsampled read counts of each gene in the cell.

Author(s)

Zhun Miao.

See Also

scRecover, for imputation of single-cell RNA-seq data.

estDropoutNum, for estimating dropout gene number in a cell.

normalization, for normalization of single-cell RNA-seq data.

scRecoverTest, a test dataset for scRecover.

Examples

# Load test data
data(scRecoverTest)

# Downsample the read counts in oneCell
oneCell.down <- countsSampling(counts = oneCell, fraction = 0.1)

estDropoutNum: Estimate dropout gene number in a cell

Description

This function is used to estimate dropout gene number in a cell for single-cell RNA-seq (scRNA-seq) data. It takes a non-negative vector of scRNA-seq raw read counts of a cell as input.

Usage

estDropoutNum(sample = NULL, depth = 20, histCounts = NULL,
  return = "dropoutNum")

Arguments

sample

A cell's raw read counts for each gene, could be a vector or a SingleCellExperiment object.

depth

Relative sequencing depth to be predicted compared with initial sample depth, should between 0-100, default is 20.

histCounts

Optional. Only needed when sample is blank or sample = NULL. A histogram table of raw read counts for the cell.

return

A character for choosing the return value type of the function. "dropoutNum" (default) for dropout gene number, "geneNumPredict" for all expressed gene number predicted, "transcriptNum" for all transcript number predicted.

Value

The dropout gene number (or all expressed gene number) predicted in a cell.

Author(s)

Zhun Miao.

See Also

scRecover, for imputation of single-cell RNA-seq data.

countsSampling, for downsampling the read counts in a cell.

normalization, for normalization of single-cell RNA-seq data.

scRecoverTest, a test dataset for scRecover.

Examples

# Load test data
data(scRecoverTest)

# Estimate dropout gene number in a cell
estDropoutNum(sample = counts[,1], return = "dropoutNum")

# Estimate all expressed gene number in a cell
estDropoutNum(sample = counts[,1], return = "geneNumPredict")

normalization: Normalization for single-cell RNA-seq data

Description

This function is used to normalize single-cell RNA-seq (scRNA-seq) data. It takes a non-negative matrix of scRNA-seq raw read counts or a SingleCellExperiment object as input.

Usage

normalization(counts)

Arguments

counts

A non-negative integer matrix of scRNA-seq raw read counts or a SingleCellExperiment object which contains the read counts matrix. The rows of the matrix are genes and columns are samples/cells.

Value

A normalized scRNA-seq read counts matrix.

Author(s)

Zhun Miao.

See Also

scRecover, for imputation of single-cell RNA-seq data.

estDropoutNum, for estimating dropout gene number in a cell.

countsSampling, for downsampling the read counts in a cell.

scRecoverTest, a test dataset for scRecover.

Examples

# Load test data
data(scRecoverTest)

# Normalization of counts
counts.norm <- normalization(counts = counts)

scRecover: Imputation for single-cell RNA-seq data

Description

This function is used to impute missing values in single-cell RNA-seq (scRNA-seq) data. It takes a non-negative matrix of scRNA-seq raw read counts or a SingleCellExperiment object as input. So users should map the reads (obtained from sequencing libraries of the samples) to the corresponding genome and count the reads mapped to each gene according to the gene annotation to get the raw read counts matrix in advance.

Usage

scRecover(counts, Kcluster = NULL, labels = NULL, outputDir = NULL,
  depth = 20, SAVER = FALSE, MAGIC = FALSE, UMI = FALSE,
  hist_raw_counts = NULL, hist_RUG_counts = NULL, parallel = FALSE,
  BPPARAM = bpparam(), verbose = TRUE)

Arguments

counts

A non-negative integer matrix of scRNA-seq raw read counts or a SingleCellExperiment object which contains the read counts matrix. The rows of the matrix are genes and columns are samples/cells.

Kcluster

An integer specifying the number of cell subpopulations. This parameter can be determined based on prior knowledge or clustering of raw data. Kcluster is used to determine the candidate neighbors of each cell.

labels

Optional. Only needed when Kcluster is blank or Kcluster = NULL. A character/integer vector specifying the cell type of each column in the raw count matrix. Each cell type should have at least two cells.

outputDir

The path of the output directory. If not specified, a folder named with prefix 'outDir_scRecover_' under the temporary directory will be used.

depth

Relative sequencing depth to be predicted compared with initial sample depth, should between 2-100, default is 20.

SAVER

Whether use and improve SAVER in imputation, default is FALSE.

MAGIC

Whether use and improve MAGIC in imputation, default is FALSE.

UMI

Whether use full UMI data, default is FALSE. If TRUE, hist_raw_counts and hist_RUG_counts should be specified.

hist_raw_counts

A list contains the histogram table of raw read counts for each cell in counts.

hist_RUG_counts

A list contains the histogram table of raw UMI-gene counts for each cell in counts.

parallel

If FALSE (default), no parallel computation is used; if TRUE, parallel computation using BiocParallel, with argument BPPARAM.

BPPARAM

An optional parameter object passed internally to bplapply when parallel=TRUE. If not specified, bpparam() (default) will be used.

verbose

Whether to show specific calculation progress, default is TRUE.

Value

Imputed counts matrices will be saved in the output directory specified by outputDir.

Author(s)

Zhun Miao.

See Also

estDropoutNum, for estimating dropout gene number in a cell.

countsSampling, for downsampling the read counts in a cell.

normalization, for normalization of single-cell RNA-seq data.

scRecoverTest, a test dataset for scRecover.

Examples

# Load test data for scRecover
data(scRecoverTest)

# Run scRecover with Kcluster specified
scRecover(counts = counts, Kcluster = 2)

# Or run scRecover with labels specified
# scRecover(counts = counts, labels = labels)

scRecoverTest: A test dataset for scRecover

Description

A test dataset containing a single-cell RNA-seq (scRNA-seq) read counts matrix and its cell type information.

Usage

data(scRecoverTest)

Format

  • counts. A non-negative integer matrix of scRNA-seq raw read counts, rows are genes and columns are cells.

  • labels. A vector of integer specifying the cell types in the read counts matrix, corresponding to the columns of counts.

  • oneCell. A non-negative vector of scRNA-seq raw read counts of a cell for each gene.

Details

  • counts. A matrix of raw read counts of scRNA-seq data which has 200 genes (rows) and 150 cells (columns).

  • labels. A vector of integer specifying the two cell types in counts. Also could be generated by: labels <- c(rep(1,50), rep(2,100)).

  • oneCell. A vector of a cell's raw read counts for 24538 gene.

Source

Petropoulos S, et al. Cell, 2016, 165(4): 1012-1026.

See Also

scRecover, for imputation of single-cell RNA-seq data.

estDropoutNum, for estimating dropout gene number in a cell.

countsSampling, for downsampling the read counts in a cell.

normalization, for normalization of single-cell RNA-seq data.

Examples

# Load test data for scRecover
data(scRecoverTest)