Package 'MEB'

Title: A normalization-invariant minimum enclosing ball method to detect differentially expressed genes for RNA-seq and scRNA-seq data
Description: This package provides a method to identify differential expression genes in the same or different species. Given that non-DE genes have some similarities in features, a scaling-free minimum enclosing ball (SFMEB) model is built to cover those non-DE genes in feature space, then those DE genes, which are enormously different from non-DE genes, being regarded as outliers and rejected outside the ball. The method on this package is described in the article 'A minimum enclosing ball method to detect differential expression genes for RNA-seq data'. The SFMEB method is extended to the scMEB method that considering two or more potential types of cells or unknown labels scRNA-seq dataset DEGs identification.
Authors: Yan Zhou, Jiadi Zhu
Maintainer: Jiadi Zhu <[email protected]>, Yan Zhou <[email protected]>
License: GPL-2
Version: 1.19.0
Built: 2024-07-15 05:35:11 UTC
Source: https://github.com/bioc/MEB

Help Index


Detect differential expression genes for RNA-seq data

Description

Use a normalization-invariant minimum enclosing ball (NIMEB) method to discriminate differential expression (DE) genes in the same or different species.

Usage

NIMEB(countsTable, train_id, gamma, nu = 0.01, reject_rate = 0.1,
ds = FALSE)

Arguments

countsTable

Matrix or data.frame of short read counts for each genes in the same or different species.

train_id

A vector shows the position of housekeeping genes or conserved genes in countsTable.

gamma

A parameter needed for all kernels except linear.

nu

parameter needed for one-classification.

reject_rate

A value used in controling the scale of ball, default is 0.01.

ds

A value to show the data is for the same species or different species. If ds is FALSE, the data is the same species, else the data is the different species.

Value

list(.) A list of results, "model" represents the model of NIMEB, which could be used to discriminate a new gene, "gamma" represents the selected gamma parameters in model NIMEB, "train_error" represents the corresponding train_error when the value of gamma changed.

Examples

## Simulation data for the same species.
library(SummarizedExperiment)
data(sim_data_sp)
gamma <- seq(1e-06,5e-05,1e-06)
sim_model_sp <- NIMEB(countsTable = assay(sim_data_sp), 
train_id=seq_len(1000), gamma, nu = 0.01, reject_rate = 0.05, ds = FALSE)

## Real data for the same species.
data(real_data_sp)
gamma <- seq(1e-06,5e-05,1e-06)
real_model_sp <- NIMEB(countsTable = assay(real_data_sp), 
train_id=seq_len(530), gamma, nu = 0.01, reject_rate = 0.1, ds = FALSE)

## Simulation data for the different species.
data(sim_data_dsp)
gamma <- seq(1e-07,2e-05,1e-06)
sim_model_dsp <- NIMEB(countsTable = assay(sim_data_dsp), 
train_id=seq_len(1000), gamma, nu = 0.01, reject_rate = 0.1, ds = TRUE)

## Real data for the different species.
data(real_data_dsp)
gamma <- seq(5e-08,5e-07,1e-08)
real_model_dsp <- NIMEB(countsTable = assay(real_data_dsp), 
train_id=seq_len(143), gamma, nu = 0.01, reject_rate = 0.1, ds = TRUE)

A real dataset of genes between the different species.

Description

This data set includes two species and 19330 genes with corresponding short read counts, in which the first 143 genes are conserved genes.

Usage

real_data_dsp

Format

A data.frame contains two species and 19330 genes.

Source

Brawand, D., Soumillon, M., Necsulea, A. and Julien, P. et al. (2011). The evolution of gene expression levels in mammalian organs. Nature, 478, 343-348.


A real dataset of genes between the same species.

Description

This data set includes two samples and each sample with five biological replicates and 16519 genes with corresponding short read counts, in which the first 530 genes are housekeeping genes.

Usage

real_data_sp

Format

A data.frame contains two samples and each sample with five biological replicates and 16519 genes.

Source

Marioni J.C., Mason C.E., et al. (2008). RNA-seq: an assessment of technical reproducibility and comparisonwith gene expression arrays. Genome Res. 18(9), 1509–1517.


scMEB: A fast and clustering-independent method for detecting differentially expressed genes in single-cell RNA-seq data

Description

Using the Minimum Enclosing Ball (MEB) method to discriminate differential expression genes (DEGs) without prior cell clustering results.

Usage

scMEB(sce, stable_idx, filtered = FALSE, 
gamma = seq(1e-04,0.001,1e-05), nu = 0.01, reject_rate = 0.1)

Arguments

sce

A SingleCellExperiment class scRNA-seq data.

stable_idx

A vector shows the name of stable expressed gene in sce.

filtered

A logical value to show if the data have been filtered.

gamma

A parameter needed for all kernels except linear.

nu

A parameter needed for one-classification.

reject_rate

A value used in controling the scale of ball, default is 0.01.

Value

list(.) A list of results, "model" represents the model of scMEB, which could be used to discriminate a new gene, "dat_pca" represents the first 50 PCs of each genes, "gamma" represents the selected gamma parameters in model scMEB, "train_error" represents the corresponding train_error when the value of gamma changed, "dist" shows the distance between the points and the radius of the sphere in feature space.

Examples

## Simulation data for scRNA-seq data generated from splatter package.
library(SingleCellExperiment)
data(sim_scRNA_data)
data(stable_gene)
sim_scRNA <- scMEB(sce=sim_scRNA_data, stable_idx=stable_gene, 
filtered = FALSE, gamma = seq(1e-04,0.001,1e-05), nu = 0.01, 
reject_rate = 0.1)

A simulation dataset of genes between the differetn species.

Description

This data set includes two species and 18472 genes with corresponding short read counts, in which the first 1000 genes are conserved genes.

Usage

sim_data_dsp

Format

A data.frame contains two species and 18472 genes.

Source

Jiadi Zhu, Yan Zhou, Junhui Wang, Youlong Yang (2019, pending publication). A minimum enclosing ball method to detect differential expression genes for RNA-seq data.


A simulation dataset of genes between the same species.

Description

This data set includes two samples and 10623 genes with corresponding short read counts, in which the first 1000 genes are housekeeping genes.

Usage

sim_data_sp

Format

A data.frame contains two samples and 10623 genes.

Source

Jiadi Zhu, Yan Zhou, Junhui Wang, Youlong Yang (2019, pending publication). A minimum enclosing ball method to detect differential expression genes for RNA-seq data.


A simulation scRNA-seq dataset from splatter package.

Description

This dataset is a SingleCellExperiment class, it includes 5,000 genes and 100 cells.

Usage

sim_scRNA_data

Format

A SingleCellExperiment class data includes 5,000 genes and 100 cells.

Source

A simulation data generated from spaltter package.


The name of the stable genes in all dataset.

Description

The name of the stable genes in all dataset that used to build the model.

Usage

stable_gene

Format

A vector contains the name of the stable genes.

Source

The simulation data (sim_scRNA_data) generated from spaltter package.