Package 'HIPPO' reference manual

Title:	Heterogeneity-Induced Pre-Processing tOol
Description:	For scRNA-seq data, it selects features and clusters the cells simultaneously for single-cell UMI data. It has a novel feature selection method using the zero inflation instead of gene variance, and computationally faster than other existing methods since it only relies on PCA+Kmeans rather than graph-clustering or consensus clustering.
Authors:	Tae Kim [aut, cre], Mengjie Chen [aut]
Maintainer:	Tae Kim <[email protected]>
License:	GPL (>=2)
Version:	1.19.0
Built:	2025-03-18 03:18:56 UTC
Source:	https://github.com/bioc/HIPPO

re-export magrittr pipe operator

Description

re-export magrittr pipe operator

A reference data frame that matches ENSG IDs to HGNC symbols

Description

A reference data frame that matches ENSG IDs to HGNC symbols

Usage

ensg_hgnc
ensg_hgnc

Format

A data frame with 46606 rows and 2 columns

ensg: Ensembl ENSG IDs
hgnc: HGNC symbols

Source

http://www.biomart.org/

Access data from SCE object

Description

Access data from SCE object

Usage

get_data_from_sce(sce)
get_data_from_sce(sce)

Arguments

sce

SingleCellExperiment object

Value

count matrix

Examples

data(toydata)
X = get_data_from_sce(toydata)
data(toydata)
X = get_data_from_sce(toydata)

Access hippo object from SingleCellExperiment object.

Description

Access hippo object from SingleCellExperiment object.

Usage

get_hippo(sce)
get_hippo(sce)

Arguments

sce

SingleCellExperiment object

Value

hippo object embedded in SingleCellExperiment object

Examples

data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
hippo_object = get_hippo(toydata)
data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
hippo_object = get_hippo(toydata)

Return hippo_diffexp object

Description

Return hippo_diffexp object

Usage

get_hippo_diffexp(sce, k = 1)
get_hippo_diffexp(sce, k = 1)

Arguments

`sce`	SingleCellExperiment object with hippo
`k`	integer round of result of interest

Value

data frame of differential expression test

Examples

data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
toydata = hippo_diffexp(toydata)
result1 = get_hippo_diffexp(toydata)
data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
toydata = hippo_diffexp(toydata)
result1 = get_hippo_diffexp(toydata)

HIPPO's hierarchical clustering

Description

HIPPO's hierarchical clustering

Usage

hippo(sce, K = 20, z_threshold = 2, outlier_proportion = 0.001, verbose = TRUE)
hippo(sce, K = 20, z_threshold = 2, outlier_proportion = 0.001, verbose = TRUE)

Arguments

`sce`	SingleCellExperiment object
`K`	number of clusters to ultimately get
`z_threshold`	numeric > 0 as a z-value threshold for selecting the features
`outlier_proportion`	numeric between 0 and 1, a cut-off so that when the proportion of important features reach this number, the clustering terminates
`verbose`	if set to TRUE, it shows progress of the algorithm

Value

a list of clustering result for each level of k=1, 2, ... K.

Examples

data(toydata)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
data(toydata)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)

Conduct feature selection by computing test statistics for each gene

Description

Conduct feature selection by computing test statistics for each gene

Usage

hippo_diagnostic_plot(sce, show_outliers = FALSE, zvalue_thresh = 10)
hippo_diagnostic_plot(sce, show_outliers = FALSE, zvalue_thresh = 10)

Arguments

`sce`	SingleCellExperiment object with count matrix
`show_outliers`	boolean to indicate whether to circle the outliers with given zvalue_thresh
`zvalue_thresh`	a numeric v for defining outliers

Value

a diagnostic plot that shows genes with zero inflation

Examples

data(toydata)
hippo_diagnostic_plot(toydata, show_outliers=TRUE, zvalue_thresh = 2)
data(toydata)
hippo_diagnostic_plot(toydata, show_outliers=TRUE, zvalue_thresh = 2)

HIPPO's differential expression

Description

HIPPO's differential expression

Usage

hippo_diffexp(
  sce,
  top.n = 5,
  switch_to_hgnc = FALSE,
  ref = NA,
  k = NA,
  plottitle = ""
)
hippo_diffexp(
  sce,
  top.n = 5,
  switch_to_hgnc = FALSE,
  ref = NA,
  k = NA,
  plottitle = ""
)

Arguments

`sce`	SingleCellExperiment object with hippo
`top.n`	number of markers to return
`switch_to_hgnc`	if the current gene names are ensemble ids, and would like to switch to hgnc
`ref`	a data frame with columns 'hgnc' and 'ensg' to match each other, only required when switch_to_hgnc is set to TRUE
`k`	number of rounds of clustering that you'd like to see result. Default is 1 to K
`plottitle`	title of the resulting plot

Value

list of differential expression result

Examples

data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
result = hippo_diffexp(toydata)
data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
result = hippo_diffexp(toydata)

compute t-SNE or umap of each round of HIPPO

Description

compute t-SNE or umap of each round of HIPPO

Usage

hippo_dimension_reduction(
  sce,
  method = c("umap", "tsne"),
  perplexity = 30,
  featurelevel = 1
)
hippo_dimension_reduction(
  sce,
  method = c("umap", "tsne"),
  perplexity = 30,
  featurelevel = 1
)

Arguments

`sce`	SingleCellExperiment object with hippo object in it.
`method`	a string that determines the method for dimension reduction: either 'umap' or 'tsne
`perplexity`	numeric perplexity parameter for Rtsne function
`featurelevel`	the round of clustering that you will extract features to reduce the dimension

Value

a data frame of dimension reduction result for each k in 1, ..., K

Examples

data(toydata)
set.seed(20200321)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
toydata = hippo_dimension_reduction(toydata, method="tsne")
hippo_tsne_plot(toydata)
data(toydata)
set.seed(20200321)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
toydata = hippo_dimension_reduction(toydata, method="tsne")
hippo_tsne_plot(toydata)

HIPPO's feature heatmap

Description

HIPPO's feature heatmap

Usage

hippo_feature_heatmap(
  sce,
  switch_to_hgnc = FALSE,
  ref = NA,
  top.n = 50,
  kk = 2,
  plottitle = ""
)
hippo_feature_heatmap(
  sce,
  switch_to_hgnc = FALSE,
  ref = NA,
  top.n = 50,
  kk = 2,
  plottitle = ""
)

Arguments

`sce`	SingleCellExperiment object with hippo
`switch_to_hgnc`	if the current gene names are ensemble ids, and would like to switch to hgnc
`ref`	a data frame with columns 'hgnc' and 'ensg' to match each other, only required when switch_to_hgnc is set to TRUE
`top.n`	number of markers to return
`kk`	integer for the round of clustering that you'd like to see result. Default is 2
`plottitle`	title for the plot

Value

list of differential expression result

Examples

data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
hippo_feature_heatmap(toydata)
data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
hippo_feature_heatmap(toydata)

visualize each round of hippo through t-SNE

Description

visualize each round of hippo through t-SNE

Usage

hippo_pca_plot(sce, k = NA, pointsize = 0.5, pointalpha = 0.5, plottitle = "")
hippo_pca_plot(sce, k = NA, pointsize = 0.5, pointalpha = 0.5, plottitle = "")

Arguments

`sce`	SincleCellExperiment object with hippo and t-SNE result in it
`k`	number of rounds of clustering that you'd like to see result. Default is 1 to K
`pointsize`	size of the point for the plot (default 0.5)
`pointalpha`	transparency level of points for the plot (default 0.5)
`plottitle`	title for the ggplot

Value

ggplot for pca in each round

Examples

data(toydata)
set.seed(20200321)
toydata = hippo(toydata, K = 10,z_threshold = 1)
hippo_pca_plot(toydata, k = 2:3)
data(toydata)
set.seed(20200321)
toydata = hippo(toydata, K = 10,z_threshold = 1)
hippo_pca_plot(toydata, k = 2:3)

visualize each round of hippo through t-SNE

Description

visualize each round of hippo through t-SNE

Usage

hippo_tsne_plot(sce, k = NA, pointsize = 0.5, pointalpha = 0.5, plottitle = "")
hippo_tsne_plot(sce, k = NA, pointsize = 0.5, pointalpha = 0.5, plottitle = "")

Arguments

`sce`	SincleCellExperiment object with hippo and t-SNE result in it
`k`	number of rounds of clustering that you'd like to see result. Default is 1 to K
`pointsize`	size of the point for the plot (default 0.5)
`pointalpha`	transparency level of points for the plot (default 0.5)
`plottitle`	title for the ggplot output

Value

ggplot object for t-SNE in each round

Examples

data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
toydata = hippo_dimension_reduction(toydata, method="tsne")
hippo_tsne_plot(toydata)
data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
toydata = hippo_dimension_reduction(toydata, method="tsne")
hippo_tsne_plot(toydata)

visualize each round of hippo through UMAP

Description

visualize each round of hippo through UMAP

Usage

hippo_umap_plot(sce, k = NA, pointsize = 0.5, pointalpha = 0.5, plottitle = "")
hippo_umap_plot(sce, k = NA, pointsize = 0.5, pointalpha = 0.5, plottitle = "")

Arguments

`sce`	SingleCellExperiment object with hippo and UMAP result in it
`k`	number of rounds of clustering that you'd like to see result. Default is 1 to K
`pointsize`	size of the point for the plot (default 0.5)
`pointalpha`	transparency level of points for the plot (default 0.5)
`plottitle`	title of the resulting plot

Value

ggplot object for umap in each round

Examples

data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
toydata = hippo_dimension_reduction(toydata, method="umap")
hippo_umap_plot(toydata)
data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
toydata = hippo_dimension_reduction(toydata, method="umap")
hippo_umap_plot(toydata)

Expected zero proportion under Negative Binomial

Description

Expected zero proportion under Negative Binomial

Usage

nb_prob_zero(lambda, theta)
nb_prob_zero(lambda, theta)

Arguments

`lambda`	numeric vector of means of negative binomial
`theta`	numeric vector of the dispersion parameter for negative binomial, 0 if poisson

Value

numeric vector of expected zero proportion under Negative Binomial

Examples

nb_prob_zero(3, 1.1)
nb_prob_zero(3, 1.1)

Expected zero proportion under Poisson

Description

Expected zero proportion under Poisson

Usage

pois_prob_zero(lambda)
pois_prob_zero(lambda)

Arguments

lambda

numeric vector of means of Poisson

Value

numeric vector of expected proportion of zeros for each lambda

Examples

pois_prob_zero(3)
pois_prob_zero(3)

Preprocess UMI data without cell label so that each row contains information about each gene

Description

Preprocess UMI data without cell label so that each row contains information about each gene

Usage

preprocess_heterogeneous(X)
preprocess_heterogeneous(X)

Arguments

`X`	a matrix object with counts data

Value

data frame with one row for each gene.

Examples

data(toydata)
df = preprocess_heterogeneous(get_data_from_sce(toydata))
data(toydata)
df = preprocess_heterogeneous(get_data_from_sce(toydata))

Preprocess UMI data with inferred or known labels

Description

Preprocess UMI data with inferred or known labels

Usage

preprocess_homogeneous(sce, label)
preprocess_homogeneous(sce, label)

Arguments

`sce`	SingleCellExperiment object with counts data
`label`	a numeric or character vector of inferred or known label

Value

data frame with one row for each gene.

Examples

data(toydata)
labels = SingleCellExperiment::colData(toydata)$phenoid
df = preprocess_homogeneous(toydata, label = labels)
data(toydata)
labels = SingleCellExperiment::colData(toydata)$phenoid
df = preprocess_homogeneous(toydata, label = labels)

A sample single cell sequencing data subsetted from Zheng2017

Description

A sample single cell sequencing data subsetted from Zheng2017

Usage

toydata
toydata

Format

Single Cell experiment object with 10,000 genes and 100 cells

Source

https://www.nature.com/articles/ncomms14049

visualize each round of hippo through zero proportion plot

Description

visualize each round of hippo through zero proportion plot

Usage

zero_proportion_plot(
  sce,
  switch_to_hgnc = FALSE,
  ref = NA,
  k = NA,
  plottitle = "",
  top.n = 5,
  pointsize = 0.5,
  pointalpha = 0.5,
  textsize = 3
)
zero_proportion_plot(
  sce,
  switch_to_hgnc = FALSE,
  ref = NA,
  k = NA,
  plottitle = "",
  top.n = 5,
  pointsize = 0.5,
  pointalpha = 0.5,
  textsize = 3
)

Arguments

`sce`	SingleCellExperiment object with hippo element in it
`switch_to_hgnc`	boolean argument to indicate whether to change the gene names from ENSG IDs to HGNC symbols
`ref`	a data frame with hgnc column and ensg column
`k`	select rounds of clustering that you would like to see result. Default is 1 to K
`plottitle`	Title of your plot output
`top.n`	number of top genes to show the name
`pointsize`	size of the ggplot point
`pointalpha`	transparency level of the ggplot point
`textsize`	text size of the resulting plot

Value

a ggplot object that shows the zero proportions for each round

Examples

data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
data(ensg_hgnc)
zero_proportion_plot(toydata, switch_to_hgnc = TRUE, ref = ensg_hgnc)
data(toydata)
set.seed(20200321)
toydata = hippo(toydata,K = 10,z_threshold = 1,outlier_proportion = 0.01)
data(ensg_hgnc)
zero_proportion_plot(toydata, switch_to_hgnc = TRUE, ref = ensg_hgnc)

Expected zero proportion under Negative Binomial

Description

Expected zero proportion under Negative Binomial

Usage

zinb_prob_zero(lambda, theta, pi)
zinb_prob_zero(lambda, theta, pi)

Arguments

`lambda`	gene mean
`theta`	dispersion parameter, 0 if zero-inflated poisson
`pi`	zero inflation, 0 if negative binomial

Value

Expected zero proportion under Zero-Inflated Negative Binomial

Examples

zinb_prob_zero(3, 1.1, 0.1)
zinb_prob_zero(3, 1.1, 0.1)

Package 'HIPPO'

Help Index

re-export magrittr pipe operator

Description

A reference data frame that matches ENSG IDs to HGNC symbols

Description

Usage

Format

Source

Access data from SCE object

Description

Usage

Arguments

Value

Examples

Access hippo object from SingleCellExperiment object.

Description

Usage

Arguments

Value

Examples

Return hippo_diffexp object

Description

Usage

Arguments

Value

Examples

HIPPO's hierarchical clustering

Description

Usage

Arguments

Value

Examples

Conduct feature selection by computing test statistics for each gene

Description

Usage

Arguments

Value

Examples

HIPPO's differential expression

Description

Usage

Arguments

Value

Examples

compute t-SNE or umap of each round of HIPPO

Description

Usage

Arguments

Value

Examples

HIPPO's feature heatmap

Description

Usage

Arguments

Value

Examples

visualize each round of hippo through t-SNE

Description

Usage

Arguments

Value

Examples

visualize each round of hippo through t-SNE

Description

Usage

Arguments

Value

Examples

visualize each round of hippo through UMAP

Description

Usage

Arguments

Value

Examples

Expected zero proportion under Negative Binomial

Description

Usage

Arguments

Value