Package 'EasyCellType' reference manual

Title:	Annotate cell types for scRNA-seq data
Description:	EasyCellType automatically examines input marker lists obtained from existing software, such as Seurat, against cell marker databases. Two quantification approaches are provided to annotate cell types: gene set enrichment analysis (GSEA) and a modified version of Fisher's exact test. The package presents annotation recommendations graphically: bar plots showing candidate cell types for each cluster, and a dot plot summarizing the top 5 significant annotations for each cluster.
Authors:	Ruoxing Li [aut, cre, ctb], Ziyi Li [ctb]
Maintainer:	Ruoxing Li <[email protected]>
License:	Artistic-2.0
Version:	1.9.0
Built:	2025-02-17 03:34:41 UTC
Source:	https://github.com/bioc/EasyCellType

Tissues in CellMarker database.

Description

A list containing 2 elements: Human tissues and Mouse tissues.

Usage

data(cellmarker_tissue)

Format

A list with 2 elements:

Human: Human tissue
Mouse: Mouse tissue

Tissues in Clustermole database.

Description

A list containing 2 elements: Human tissues and Mouse tissues.

Usage

data(clustermole_tissue)

Format

A list with 2 elements:

Human: Human tissue
Mouse: Mouse tissue

Summarize markers contributing to the cell type annotation

Description

Summarizes the markers contributing to the cell type annotation.

Usage

coremarkers(test, data, species)

Arguments

`test`	Test used to annotate cell types: "GSEA" or "fisher"
`data`	Annotation results.
`species`	"Human" or "Mouse"

Value

A data frame containing the genes that contributed to the cell type annotation.

Examples

## core_markers <- coremarkers("GSEA", data, species = "Human")

Annotate cell types for scRNA-seq data

Description

This function is used to run the annotation analysis using either GSEA or a modified Fisher's exact test. We expect users to input a data frame containing expressed markers, cluster information and the differential score (log fold change). The gene lists in that data frame should be sorted by their differential score.
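The input requirements above can be checked before calling easyct. The following is a minimal sketch in base R; `check_marker_input` is a hypothetical helper, not part of EasyCellType, and it assumes that "sorted" means decreasing score within each cluster.

```r
# Hypothetical helper (not part of EasyCellType): checks the shape that the
# input data frame is described as having.
check_marker_input <- function(d) {
  stopifnot(is.data.frame(d), ncol(d) >= 3)
  cluster <- d[[2]]  # second column: cluster
  score   <- d[[3]]  # third column: differential score
  stopifnot(is.numeric(score))
  # Assumption: within each cluster, scores must be in decreasing order.
  sorted_ok <- tapply(score, cluster, function(s) !is.unsorted(rev(s)))
  all(sorted_ok)
}

# Toy input for illustration only.
toy <- data.frame(
  gene    = c("916", "925", "929", "930"),
  cluster = c(0, 0, 1, 1),
  score   = c(2.1, 1.4, 3.0, 0.8),
  stringsAsFactors = FALSE
)
check_marker_input(toy)

# The same rows in a different order fail the check.
shuffled <- toy[c(2, 1, 4, 3), ]
check_marker_input(shuffled)
```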

Usage

easyct(
  data,
  db = "cellmarker",
  genetype = "Entrezid",
  species = "Human",
  tissue = NULL,
  p_cut = 0.5,
  test = "GSEA",
  scoretype = "std"
)

Arguments

`data`	A data frame containing the markers, clusters, and expression scores. Marker genes should be sorted within each cluster. The columns should be ordered as gene, cluster, and expression level score. An example data set can be loaded using 'data(gene_pbmc)'.
`db`	Name of the reference database: "cellmarker", "clustermole" or "panglaodb".
`genetype`	Gene identifier type used in the input data frame: "Entrezid" or "symbol".
`species`	"Human" or "Mouse". Default is "Human".
`tissue`	Tissue types to restrict the analysis to. More than one tissue can be specified. The available tissues can be listed using 'data(cellmarker_tissue)', 'data(clustermole_tissue)' and 'data(panglao_tissue)'.
`p_cut`	Cutoff of the P value for GSEA.
`test`	"GSEA" or "fisher". "GSEA" is used by default.
`scoretype`	Argument used for GSEA. Default value is "std". If all scores are positive, scoretype should be "pos".
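The rule for `scoretype` described above can be written as a small check on the score column. This is only a sketch; `choose_scoretype` is a hypothetical helper and is not provided by the package.

```r
# Hypothetical helper (not part of EasyCellType): suggest a value for the
# `scoretype` argument based on the sign of the scores.
choose_scoretype <- function(score) {
  stopifnot(is.numeric(score), !anyNA(score))
  if (all(score > 0)) "pos" else "std"
}

choose_scoretype(c(2.1, 1.4, 0.3))   # all scores positive
choose_scoretype(c(2.1, -0.7, 0.3))  # mixed signs
```

The returned string could then be passed as `scoretype` when test = "GSEA".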

Value

A list containing the test results for each cluster.

Examples

data(gene_pbmc)
result <- easyct(gene_pbmc, db="cellmarker", species="Human", 
tissue=c("Blood", "Peripheral blood", "Blood vessel",
"Umbilical cord blood", "Venous blood"), p_cut=0.3, test="GSEA", scoretype="pos")

Differentially expressed marker genes in 9 clusters.

Description

A data frame containing marker genes, clusters, and the average log2 fold changes. The original data set is from 10X Genomics; we processed it following the standard workflow provided by the Seurat package and then formatted the output to obtain this data frame.

Usage

data(gene_pbmc)

Format

A data frame with 727 rows and 3 variables:

gene: Entrez IDs of the marker genes
cluster: Cluster
score: Average log2 fold change obtained from the processing workflow
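A table with this three-column layout could be derived from a marker table along the following lines. This is a sketch in base R: the `markers` data frame is a toy stand-in with made-up values, its column names (`gene`, `cluster`, `avg_log2FC`) follow the usual Seurat marker output but are an assumption here, and decreasing order within each cluster is also assumed. Gene symbols are used for readability, whereas gene_pbmc itself stores Entrez IDs.

```r
# Toy stand-in for a marker table; the values are made up for illustration.
markers <- data.frame(
  gene       = c("CD3E", "CD8A", "CD14", "LYZ", "MS4A1"),
  cluster    = c(0, 0, 1, 1, 2),
  avg_log2FC = c(1.2, 2.5, 3.1, 2.0, 4.2),
  stringsAsFactors = FALSE
)

# Keep three columns in the order gene, cluster, score.
formatted <- data.frame(
  gene    = markers$gene,
  cluster = markers$cluster,
  score   = markers$avg_log2FC,
  stringsAsFactors = FALSE
)

# Sort by cluster, then by decreasing score within each cluster.
formatted <- formatted[order(formatted$cluster, -formatted$score), ]
rownames(formatted) <- NULL
formatted
```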

Source

https://cf.10xgenomics.com/samples/cell/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz

Convert gene symbol to Entrez ID

Description

This function converts gene symbols to Entrez IDs. It is used within the easyct function.

Usage

mapsymbol(d, species)

Arguments

`d`	A data frame where first column contains gene symbols.
`species`	"Human" or "Mouse".

Value

A data frame containing gene symbols and the corresponding Entrez ID
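To illustrate the kind of mapping this function returns, here is a sketch using a small hand-written lookup table in base R. It does not show how mapsymbol is implemented; the `lookup` table is a toy stand-in for an annotation database, and the output column names are assumptions.

```r
# Toy lookup table standing in for an annotation database.
lookup <- data.frame(
  symbol   = c("CD4", "CD8A", "CD14", "CD19"),
  entrezid = c("920", "925", "929", "930"),
  stringsAsFactors = FALSE
)

# Input with gene symbols in the first column.
d <- data.frame(
  gene  = c("CD19", "CD4", "NOT_A_GENE"),
  score = c(1.5, 0.9, 0.2),
  stringsAsFactors = FALSE
)

# Look up each symbol; the input order is kept and unmapped symbols give NA.
idx <- match(d[[1]], lookup$symbol)
mapped <- data.frame(
  symbol   = d[[1]],
  entrezid = lookup$entrezid[idx],
  stringsAsFactors = FALSE
)
mapped
```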

Tissues in Panglao database.

Description

A list containing 2 elements: Human tissues and Mouse tissues.

Usage

data(panglao_tissue)

Format

A list with 2 elements:

Human: Human tissue
Mouse: Mouse tissue

Peripheral Blood Mononuclear Cells (PBMC) data.

Description

Count matrix of Peripheral Blood Mononuclear Cells (PBMC). The original data set is from 10X genomics.

Usage

data(pbmc_data)

Format

A large dgCMatrix: 32378 * 2700

i: Row index of the non-zero values
p: Pointer vector used to recover the column index of the non-zero values
Dim: Dimension of the matrix
Dimnames: A list of length 2 containing the row names and column names of the matrix
x: Vector containing all the non-zero values
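The slots listed above can be explored on a small matrix. The sketch below builds a 3 x 2 sparse matrix with the Matrix package (assumed to be installed, as it is with standard R distributions) and inspects the same slots; the values are arbitrary.

```r
library(Matrix)

# Three non-zero values: (row 1, col 1) = 5, (row 2, col 2) = 7, (row 3, col 2) = 2.
m <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 2), x = c(5, 7, 2), dims = c(3, 2))

class(m)   # a dgCMatrix
m@i        # zero-based row indices of the non-zero values
m@p        # column pointers: column k spans entries p[k] + 1 to p[k + 1]
m@x        # the non-zero values, stored column by column
m@Dim      # dimensions of the matrix
diff(m@p)  # number of non-zero values in each column
```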

Source

https://cf.10xgenomics.com/samples/cell/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz

Create bar plots for each cluster

Description

This function generates a set of bar plots presenting up to 10 candidate cell types for each cluster.

Usage

plot_bar(test = "GSEA", data, cluster = NULL)

Arguments

`test`	"GSEA" or "fisher"
`data`	Annotation results
`cluster`	Cluster can be specified to print plots.

Value

Bar plots showing up to 10 candidate cell types for each cluster.

Examples

data(gene_pbmc)
result <- easyct(gene_pbmc, db="cellmarker", species="Human", 
tissue=c("Blood", "Peripheral blood", "Blood vessel",
"Umbilical cord blood", "Venous blood"), p_cut=0.3, test="GSEA", scoretype="pos")
plot_bar("GSEA", result)

Create dot plot for annotation results

Description

This function generates a dot plot presenting the top 5 candidate cell types for each cluster.

Usage

plot_dot(test = "GSEA", data)

Arguments

`test`	Test used to annotate cell types: "GSEA" or "fisher"
`data`	Annotation results

Value

A dot plot showing the top 5 significant cell types for each cluster.
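The "top 5 per cluster" selection can be sketched in base R as follows. This is not the package's plotting code: `top_n_per_cluster` is a hypothetical helper, the column names (`cluster`, `cell_type`, `pvalue`) are assumptions, and the p-values are made up.

```r
# Hypothetical helper: keep the n rows with the smallest p-value per cluster.
top_n_per_cluster <- function(res, n = 5) {
  res <- res[order(res$cluster, res$pvalue), ]
  # Rank of each row within its cluster after sorting by p-value.
  rank_in_cluster <- ave(seq_len(nrow(res)), res$cluster, FUN = seq_along)
  res[rank_in_cluster <= n, ]
}

# Toy results for two clusters with seven candidate cell types each.
toy <- data.frame(
  cluster   = rep(c(0, 1), each = 7),
  cell_type = paste0("type_", rep(1:7, times = 2)),
  pvalue    = c(0.20, 0.01, 0.03, 0.50, 0.04, 0.002, 0.10,
                0.30, 0.02, 0.05, 0.001, 0.40, 0.06, 0.07),
  stringsAsFactors = FALSE
)

top5 <- top_n_per_cluster(toy, n = 5)
top5
```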

Examples

data(gene_pbmc)
result <- easyct(gene_pbmc, db="cellmarker", species="Human", 
tissue=c("Blood", "Peripheral blood", "Blood vessel",
"Umbilical cord blood", "Venous blood"), p_cut=0.3, test="GSEA", scoretype="pos")
plot_dot("GSEA", result)

Annotate cell types for single cell RNA data

Description

This function is used to process the annotation test results. Processed data will be used to generate plots.

Usage

process_results(test, data)

Arguments

`test`	Test used to annotate cell types: "GSEA" or "fisher"
`data`	Annotation results.

Value

A data frame used to generate plots.

Print test results

Description

This function prints a summary table of the annotation results for a specific cluster.

Usage

summarycelltype(test, results, cluster)

Arguments

`test`	"GSEA" or "fisher".
`results`	Annotation results.
`cluster`	Cluster of interest.

Value

A summary table of the annotation results. "core_enrichment" contains the markers contributing to the annotation.
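If "core_enrichment" stores the contributing markers as a single string with gene IDs separated by "/" (an assumption here, based on the convention used by common GSEA tooling in R), the individual markers could be recovered as in this sketch. The example string is made up.

```r
# Assumed format: gene IDs joined by "/".
core_enrichment <- "916/925/920"

# Split into individual gene IDs.
core_genes <- strsplit(core_enrichment, "/", fixed = TRUE)[[1]]
core_genes
length(core_genes)
```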

Examples

data(gene_pbmc)
result <- easyct(gene_pbmc, db="cellmarker", species="Human", 
tissue=c("Blood", "Peripheral blood", "Blood vessel",
"Umbilical cord blood", "Venous blood"), p_cut=0.3, test="GSEA", scoretype="pos")
summarycelltype(test="GSEA", results=result, cluster=0)

Fisher exact test used in function 'easyct'

Description

This function is used to conduct the modified Fisher's exact test.

Usage

test_fisher(testgenes, ref, cols)

Arguments

`testgenes`	A data frame containing query genes and the expression scores.
`ref`	The reference database.
`cols`	Column names of the input data frame

Value

A data frame containing the results of Fisher's exact test.
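For orientation, the standard (unmodified) Fisher's exact test for marker enrichment can be sketched with `fisher.test` from base R. This does not reproduce the modification used by test_fisher; the gene identifiers, universe size, and set sizes below are hypothetical.

```r
# Hypothetical gene universe, query markers, and reference cell type markers.
universe <- paste0("g", 1:1000)
query    <- universe[1:10]   # 10 query genes
ref_set  <- universe[3:22]   # 20 reference markers, 8 shared with the query

in_query <- factor(universe %in% query,   levels = c(TRUE, FALSE))
in_set   <- factor(universe %in% ref_set, levels = c(TRUE, FALSE))

# 2 x 2 contingency table: query membership by reference set membership.
tab <- table(in_query = in_query, in_set = in_set)
tab

# One-sided test for over-representation of reference markers in the query.
res <- fisher.test(tab, alternative = "greater")
res$p.value
```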
