Package 'airpart' reference manual

Title:	Differential cell-type-specific allelic imbalance
Description:	Airpart identifies sets of genes displaying differential cell-type-specific allelic imbalance across cell types or states, utilizing single-cell allelic counts. It makes use of a generalized fused lasso with binomial observations of allelic counts to partition cell types by their allelic imbalance. Alternatively, a nonparametric method for partitioning cell types is offered. The package includes a number of visualizations and quality control functions for examining single cell allelic imbalance datasets.
Authors:	Wancen Mu [aut, cre] , Michael Love [aut, ctb]
Maintainer:	Wancen Mu <[email protected]>
License:	GPL-2
Version:	1.15.0
Built:	2025-03-25 05:53:55 UTC
Source:	https://github.com/bioc/airpart

Fit beta-binomial across cell types

Description

This function performs additional inference on the allelic ratio across cell types, giving posterior mean and credible intervals per cell type. A Cauchy prior is centered for each cell type on the allelic ratio from the fused lasso across all genes in the gene cluster (or using a weighted means if the fused lasso is not provided).

Usage

allelicRatio(sce, formula, nogroup = FALSE, level = 0.95, DAItest = FALSE, ...)
allelicRatio(sce, formula, nogroup = FALSE, level = 0.95, DAItest = FALSE, ...)

Arguments

`sce`	A SingleCellExperiment containing assays (`"ratio"`, `"counts"`) and colData (`"x"`, `"part"`)
`formula`	The same `formula` object used in `fusedLasso` or used in `ns` when want to detect smooth continuous-axis estimates. eg. formula <- ratio ~ ns(x, df = 5)
`nogroup`	Indicate whether there is previous group step. Default is FALSE represents either Generalized fused lasso or nonparametric method is used to derive partition. nogroup == TRUE means the hypothesis that groups of cell types sharing a common regulatory context is not valid
`level`	the level of credible interval (default is 0.95)
`DAItest`	Indicate whether to do Likelihood Ratio test on differential allelic imbalance(DAI) or equivalent to heterogeneity.
`...`	arguments to pass to `apeglm` functions

Value

posterior mean ("ar") for allelic ratio estimate is returned in the rowData for each cell type, as well as the "s" value, "fsr" false sign rate and credible interval ("lower" and "upper"). One can use "fsr" < 0.005 or credible intervals contain 0.5 or not for AI test significance. "p.value" shows DAI test result and "adj.p.value" is false discovery rate corrected p values.

Examples


sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub, DAItest = TRUE)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub, DAItest = TRUE)

Quality control on cells

Description

Quality control on cells

Usage

cellQC(
  sce,
  spike,
  threshold = 0,
  mad_sum = 5,
  mad_detected = 3,
  mad_spikegenes = 5
)
cellQC(
  sce,
  spike,
  threshold = 0,
  mad_sum = 5,
  mad_detected = 3,
  mad_spikegenes = 5
)

Arguments

`sce`	SingleCellExperiment with `counts` and `ratio`
`spike`	the character name of spike genes. If missing, `spikePercent` will all be zero and `filter_spike` will be false.
`threshold`	A numeric scalar specifying the threshold above which a gene is considered to be detected.
`mad_sum`	A numeric scalar specifying exceed how many median absolute deviations from the median log10-counts a cell is considered to be filtered out. Default is 5.
`mad_detected`	A numeric scalar specifying exceed how many median absolute deviations from the median detected features a cell is considered to be filtered out. Default is 5.
`mad_spikegenes`	A numeric scalar specifying exceed how many median absolute deviations from the median spike genes expression percentage a cell is considered to be filtered out. Default is 5.

Value

A DataFrame of QC statistics includes

sum the sum of counts of each cell
detected the number of features above threshold
spikePercent the percentage of counts assignes to spike genes
filter_sum indicate whether log10-counts within mad_sum median absolute deviations from the median log10-counts for the dataset
filter_detected indicate whether features detected by this cell within mad_detected median absolute deviations from the median detected features for the dataset
filter_spike indicate whether percentage expressed by spike genes within mad_spikegenes median absolute deviations from the median spike genes expression percentage for the dataset

Examples

sce <- makeSimulatedData()
sce <- preprocess(sce)
cellQCmetrics <- cellQC(sce)
keep_cell <- (
  cellQCmetrics$filter_sum | # sufficient features (genes)
    # sufficient molecules counted
    cellQCmetrics$filter_detected |
    # sufficient features expressed compared to spike genes
    cellQCmetrics$filter_spike
)
sce <- sce[, keep_cell]

# or manually setting threshold
cellQCmetrics <- cellQC(sce,
  spike = "Ercc",
  mad_detected = 4, mad_spikegenes = 4
)
keep_cell <- (
  cellQCmetrics$sum > 2.4 |
    cellQCmetrics$detected > 110
)
sce <- makeSimulatedData()
sce <- preprocess(sce)
cellQCmetrics <- cellQC(sce)
keep_cell <- (
  cellQCmetrics$filter_sum | # sufficient features (genes)
    # sufficient molecules counted
    cellQCmetrics$filter_detected |
    # sufficient features expressed compared to spike genes
    cellQCmetrics$filter_spike
)
sce <- sce[, keep_cell]

# or manually setting threshold
cellQCmetrics <- cellQC(sce,
  spike = "Ercc",
  mad_detected = 4, mad_spikegenes = 4
)
keep_cell <- (
  cellQCmetrics$sum > 2.4 |
    cellQCmetrics$detected > 110
)

Consensus Partitions

Description

Derive consensus partitions of an ensemble fused lasso partitions.

Usage

consensusPart(sce)
consensusPart(sce)

Arguments

sce

SingleCellExperiment

Value

A matrix grouping factor partition is replaced in metadata. Consensus Partation also stored in colData"part".

Examples

library(smurf)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
f <- ratio ~ p(x, pen = "gflasso") # formula for the GFL
sce_sub <- fusedLasso(sce,
  formula = f, model = "binomial", genecluster = 1,
  niter = 2, ncores = 2, se.rule.nct = 3
)
sce_sub <- consensusPart(sce_sub)
library(smurf)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
f <- ratio ~ p(x, pen = "gflasso") # formula for the GFL
sce_sub <- fusedLasso(sce,
  formula = f, model = "binomial", genecluster = 1,
  niter = 2, ncores = 2, se.rule.nct = 3
)
sce_sub <- consensusPart(sce_sub)

Estimate overdispersion parameter of a beta-binomial

Description

Estimate overdispersion parameter of a beta-binomial

Usage

estDisp(sce, genecluster, type = c("plot", "values"))
estDisp(sce, genecluster, type = c("plot", "values"))

Arguments

`sce`	SingleCellExperiment with `a1` matrix and `counts`
`genecluster`	the gene cluster for which to estimate the over-dispersion parameter. Default is the cluster with the most cells
`type`	whether to output the over-dispersion estimates as a plot or a value

Value

A ggplot object of the dispersion estimates over the mean, or a data.frame of the mean and dispersion estimates (theta)

Examples


sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
estDisp(sce, genecluster = 1)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
estDisp(sce, genecluster = 1)

Extract results from an airpart analysis

Description

results extracts a result table from an airpart analysis giving posterior allelic ratio estimates, s values, false sign rate(fsr), upper confidence interval and lower confidence interval.

Usage

extractResult(sce, estimates = c("ar", "svalue", "fsr", "lower", "upper"))
extractResult(sce, estimates = c("ar", "svalue", "fsr", "lower", "upper"))

Arguments

`sce`	SingleCellExperiment
`estimates`	the estimates want to be extracted. Default is allelic ratio estimates, can be `"svalue"`, `"fsr"`, `"lower"`(credible interval) and `"upper"`(credible interval)

Value

a DataFrame of estimates

Examples


sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub)
ar <- extractResult(sce_sub)
ar
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub)
ar <- extractResult(sce_sub)
ar

Quality control on features

Description

Quality control on features

Usage

featureQC(sce, spike, detection_limit = 1, threshold = 0.25, sd = 0.03, pc = 2)
featureQC(sce, spike, detection_limit = 1, threshold = 0.25, sd = 0.03, pc = 2)

Arguments

`sce`	SingleCellExperiment with `counts` and `ratio`
`spike`	the character name of spike genes. The default is `Ercc`
`detection_limit`	Numeric scalar providing the value above which observations are deemed to be expressed.
`threshold`	A numeric scalar specifying the threshold above which percentage of cells expressed within each cell type. Default is 0.25
`sd`	A numeric scalar specifying the cell type weighted allelic ratio mean standard deviation threshold above which are interested features with highly variation. Default is 0.03
`pc`	pseudocount in the `preprocess` step

Value

A DataFrame of QC statistics includes

filter_celltype indicate whether genes expressed in more than threshold cells for all cell types
sd read counts standard deviation for each feature
filter_sd indicate whether gene standard deviation exceed sd
filter_spike indicate no spike genes

Examples


sce <- makeSimulatedData()
sce <- preprocess(sce)
featureQCmetric <- featureQC(sce)
keep_feature <- (featureQCmetric$filter_celltype &
  featureQCmetric$filter_sd &
  featureQCmetric$filter_spike)
sce <- sce[keep_feature, ]

# or manually setting threshold
featureQCmetric <- featureQC(sce,
  spike = "Ercc",
  threshold = 0.25, sd = 0.03, pc = 2
)
keep_feature <- (featureQCmetric$filter_celltype &
  featureQCmetric$sd > 0.02)
sce <- makeSimulatedData()
sce <- preprocess(sce)
featureQCmetric <- featureQC(sce)
keep_feature <- (featureQCmetric$filter_celltype &
  featureQCmetric$filter_sd &
  featureQCmetric$filter_spike)
sce <- sce[keep_feature, ]

# or manually setting threshold
featureQCmetric <- featureQC(sce,
  spike = "Ercc",
  threshold = 0.25, sd = 0.03, pc = 2
)
keep_feature <- (featureQCmetric$filter_celltype &
  featureQCmetric$sd > 0.02)

Generalized fused lasso to partition cell types by allelic imbalance

Description

Fits generalized fused lasso with either binomial(link="logit") or Gaussian likelihood, leveraging functions from the smurf package.

Usage

fusedLasso(
  sce,
  formula,
  model = c("binomial", "gaussian"),
  genecluster,
  niter = 1,
  pen.weights,
  lambda = "cv1se.dev",
  k = 5,
  adj.matrix,
  lambda.length = 25L,
  se.rule.nct = 8,
  se.rule.mult = 0.5,
  ...
)
fusedLasso(
  sce,
  formula,
  model = c("binomial", "gaussian"),
  genecluster,
  niter = 1,
  pen.weights,
  lambda = "cv1se.dev",
  k = 5,
  adj.matrix,
  lambda.length = 25L,
  se.rule.nct = 8,
  se.rule.mult = 0.5,
  ...
)

Arguments

`sce`	A SingleCellExperiment containing assays (`"ratio"`, `"counts"`) and colData `"x"`
`formula`	A `formula` object which will typically involve a fused lasso penalty: default is just using cell-type 'x': `ratio ~ p(x, pen="gflasso")`. Other possibilities would be to use the Graph-Guided Fused Lasso penalty, or add covariates want to be adjusted for, which can include a gene-level baseline 'gene' `ratio ~ p(x, pen = "ggflasso") + gene + batch` See `glmsmurf` for more details
`model`	Either `"binomial"` or `"gaussian"` used to fit the generalized fused lasso
`genecluster`	which gene cluster to run the fused lasso on. Usually one first identifies an interesting gene cluster pattern by `summaryAllelicRatio`
`niter`	number of iterations to run; recommended to run 5 times if allelic ratio differences across cell types are within [0.05, 0.1]
`pen.weights`	argument as described in `glmsmurf`
`lambda`	argument as described in `glmsmurf`. Default lambda is determined by `"cv1se.dev"` (cross-validation within 1 standard error rule(SE); deviance)
`k`	number of cross-validation folds
`adj.matrix`	argument as described in `glmsmurf`
`lambda.length`	argument as described in `glmsmurf`
`se.rule.nct`	the number of cell types to trigger a different SE-based rule than 1 SE (to prioritize larger models, less fusing, good for detecting smaller, e.g. 0.05, allelic ratio differences). When the number of cell types is less than or equal to this value, `se.rule.mult` SE rule is used. When the number of cell types is larger than this value, the standard 1 SE rule is used.
`se.rule.mult`	the multiplier of the SE in determining the lambda: the chosen lambda is within `se.rule.mult` x SE of the minimum deviance. Small values will prioritize larger models, less fusing. Only used when number of cell types is equal to or less than `se.rule.nct`
`...`	additional arguments passed to `glmsmurf`

Details

Usually, we used a Generalized Fused Lasso penalty for the cell states in order to regularize all possible coefficient differences. Another possibility would be to use the Graph-Guided Fused Lasso penalty to only regularize the differences of coefficients of neighboring cell states.

When using a Graph-Guided Fused Lasso penalty, the adjacency matrix corresponding to the graph needs to be provided. The elements of this matrix are zero when two levels are not connected, and one when they are adjacent.

See the package vignette for more details and a complete description of a use case.

Value

A SummarizedExperiment with attached metadata and colData: a matrix grouping factor partition and the penalized parameter lambda are returned in metadata "partition" and "lambda". Partition and logistic group allelic estimates are stored in colData: "part" and "coef".

References

This function leverages the glmsmurf function from the smurf package. For more details see the following manuscript:

Devriendt S, Antonio K, Reynkens T, et al. Sparse regression with multi-type regularized feature modeling[J]. Insurance: Mathematics and Economics, 2021, 96: 248-261.

Examples

library(S4Vectors)
library(smurf)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
f <- ratio ~ p(x, pen = "gflasso") # formula for the GFL
sce_sub <- fusedLasso(sce,
  formula = f, model = "binomial", genecluster = 1, ncores = 1)
metadata(sce_sub)$partition
metadata(sce_sub)$lambda

# can add covariates or `gene` to the formula
f2 <- ratio ~ p(x, pen = "gflasso") + gene
sce_sub <- fusedLasso(sce[1:5,],
  formula = f2, model = "binomial",
  genecluster = 1, ncores = 1)

# Suppose we have 4 cell states, if we only want neibouring cell states
# to be grouped together with other cell states. Note here the names of
# the cell states should be given as row and column names.
nct <- nlevels(sce$x)
adjmatrix <- makeOffByOneAdjMat(nct)
colnames(adjmatrix) <- rownames(adjmatrix) <- levels(sce$x)
f <- ratio ~ p(x, pen = "ggflasso") # use graph-guided fused lasso
sce_sub <- fusedLasso(sce,
  formula = f, model = "binomial", genecluster = 1,
  lambda = 0.5, ncores = 1,
  adj.matrix = list(x = adjmatrix)
)
metadata(sce_sub)$partition
library(S4Vectors)
library(smurf)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
f <- ratio ~ p(x, pen = "gflasso") # formula for the GFL
sce_sub <- fusedLasso(sce,
  formula = f, model = "binomial", genecluster = 1, ncores = 1)
metadata(sce_sub)$partition
metadata(sce_sub)$lambda

# can add covariates or `gene` to the formula
f2 <- ratio ~ p(x, pen = "gflasso") + gene
sce_sub <- fusedLasso(sce[1:5,],
  formula = f2, model = "binomial",
  genecluster = 1, ncores = 1)

# Suppose we have 4 cell states, if we only want neibouring cell states
# to be grouped together with other cell states. Note here the names of
# the cell states should be given as row and column names.
nct <- nlevels(sce$x)
adjmatrix <- makeOffByOneAdjMat(nct)
colnames(adjmatrix) <- rownames(adjmatrix) <- levels(sce$x)
f <- ratio ~ p(x, pen = "ggflasso") # use graph-guided fused lasso
sce_sub <- fusedLasso(sce,
  formula = f, model = "binomial", genecluster = 1,
  lambda = 0.5, ncores = 1,
  adj.matrix = list(x = adjmatrix)
)
metadata(sce_sub)$partition

Gene clustering based on allelic ratio matrix with pseudo-count

Description

Gene clustering based on allelic ratio matrix with pseudo-count

Usage

geneCluster(
  sce,
  G,
  method = c("GMM", "hierarchical"),
  minClusterSize = 3,
  plot = TRUE,
  ...
)
geneCluster(
  sce,
  G,
  method = c("GMM", "hierarchical"),
  minClusterSize = 3,
  plot = TRUE,
  ...
)

Arguments

`sce`	SingleCellExperiment containing assays `"ratio_pseudo"` and colData factor `"x"`
`G`	An integer vector specifying the numbers of clusters for which the BIC is to be calculated. The default is G=c(8, 12, 16, 20, 24).
`method`	the method to do gene clustering. The default is the Gaussian Mixture Modeling which is likely to be more accurate. `"hierarchical"` represents automatic hierarchical clustering which is faster to compute.
`minClusterSize`	Minimum cluster size of `"hierarchical"` method.
`plot`	logical, whether to make a PCA plot
`...`	Catches unused arguments in indirect or list calls via do.call as described in `Mclust`

Value

gene cluster IDs are stored in the rowData column cluster and a table of gene cluster is returned in metadata geneCluster

References

This function leverages Mclust from the mclust package, or hclust.

For mclust see: Luca Scrucca and Michael Fop and T. Brendan Murphy, Adrian E. Raftery "mclust 5: clustering, classification and density estimation using Gaussian finite mixture models" 2016. The R Journal. doi: 10.32614/RJ-2016-021

Examples


sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))

Plot allelic ratio result as forest

Description

Draw a forest plot to visualized cell type specific allelic ratio estimator and confidence interval. It is based on the forestplot-package's forestplot function.

Usage

makeForest(
  sce,
  genepoi,
  ctpoi = seq_len(nlevels(sce$x)),
  showtext = FALSE,
  xticks,
  boxsize = 0.25,
  xlab = "Allelic Ratio",
  col,
  grid = structure(seq(0.1, 0.9, 0.1), gp = gpar(lty = 2, col = "#CCCCFF")),
  ...
)
makeForest(
  sce,
  genepoi,
  ctpoi = seq_len(nlevels(sce$x)),
  showtext = FALSE,
  xticks,
  boxsize = 0.25,
  xlab = "Allelic Ratio",
  col,
  grid = structure(seq(0.1, 0.9, 0.1), gp = gpar(lty = 2, col = "#CCCCFF")),
  ...
)

Arguments

`sce`	A SingleCellExperiment containing colData allelic ratio estimator in the third column and last two column is the confidence interval.
`genepoi`	the gene position index or gene name vector that want to be plotted. Ordered by increased cell type svalue. Default is the top 40 genes that has minimum svalue in any cell type or all genes if number of genes smaller than 40.
`ctpoi`	the cell type position index that want to be plotted.
`showtext`	indicate whether show the svalue information along the forestplot.
`xticks`	argument as described in `forestplot`
`boxsize`	Override the default box size based on precision
`xlab`	x-axis label. Default is "Allelic Ratio"
`col`	Set the colors for all the elements. See `fpColors` for details
`grid`	If you want a discrete gray dashed grid at the level of the ticks you can set this parameter to TRUE. If you set the parameter to a vector of values lines will be drawn at the corresponding positions. If you want to specify the `gpar` of the lines then either directly pass a `gpar` object or set the gp attribute e.g. `attr(line_vector, "gp") <-` `gpar(lty=2, col = "red")`
`...`	Passsed on the other argument in `forestplot`.

Value

generates a forest plot

Examples

sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub)
makeForest(sce_sub, showtext = TRUE)

# if want to change some properties, like ticks position
library(forestplot)
xticks <- seq(from = 0, to = 1, by = 0.25)
xtlab <- rep(c(TRUE, FALSE), length.out = length(xticks))
attr(xticks, "labels") <- xtlab
genepoi <- paste0("gene", seq_len(5))
ctpoi <- c(1, 3)
makeForest(sce_sub, genepoi, ctpoi,
  xticks = xticks,
  col = fpColors(box = c("blue", "red", "black", "darkgreen"))
)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub)
makeForest(sce_sub, showtext = TRUE)

# if want to change some properties, like ticks position
library(forestplot)
xticks <- seq(from = 0, to = 1, by = 0.25)
xtlab <- rep(c(TRUE, FALSE), length.out = length(xticks))
attr(xticks, "labels") <- xtlab
genepoi <- paste0("gene", seq_len(5))
ctpoi <- c(1, 3)
makeForest(sce_sub, genepoi, ctpoi,
  xticks = xticks,
  col = fpColors(box = c("blue", "red", "black", "darkgreen"))
)

Plot allelic ratio as heatmap

Description

Plot allelic ratio as heatmap

Usage

makeHeatmap(
  sce,
  assay = c("ratio_pseudo", "ratio", "counts"),
  genecluster = NULL,
  show_row_names = FALSE,
  order_by_group = TRUE,
  ...
)
makeHeatmap(
  sce,
  assay = c("ratio_pseudo", "ratio", "counts"),
  genecluster = NULL,
  show_row_names = FALSE,
  order_by_group = TRUE,
  ...
)

Arguments

`sce`	SingleCellExperiment
`assay`	the assay to be plotted. Choices are `"ratio_pseudo"` which is the default, `"ratio"`, `"counts"`.
`genecluster`	an integer indicates which gene cluster heatmap want to be returned.
`show_row_names`	show row names or not
`order_by_group`	indicate whether order by group or order by cell types
`...`	Passsed on the other argument in `Heatmap`.

Value

generates a heatmap

Examples

set.seed(2021)
sce <- makeSimulatedData(p.vec = c(0.3, 0.5, 0.5, 0.3), ncl = 1)
sce <- preprocess(sce)
# display allelic ratio pattern in whole dataset
makeHeatmap(sce)

sce <- geneCluster(sce, G = seq_len(4), plot = FALSE)
sce_sub <- wilcoxExt(sce, genecluster = 1)
# display specific gene cluster partition result
makeHeatmap(sce_sub)
# display by cell type orders
makeHeatmap(sce_sub, order_by_group = FALSE)
set.seed(2021)
sce <- makeSimulatedData(p.vec = c(0.3, 0.5, 0.5, 0.3), ncl = 1)
sce <- preprocess(sce)
# display allelic ratio pattern in whole dataset
makeHeatmap(sce)

sce <- geneCluster(sce, G = seq_len(4), plot = FALSE)
sce_sub <- wilcoxExt(sce, genecluster = 1)
# display specific gene cluster partition result
makeHeatmap(sce_sub)
# display by cell type orders
makeHeatmap(sce_sub, order_by_group = FALSE)

Generating adjancy matrix for neighboring cell states.

Description

To use the Graph-Guided Fused Lasso penalty to only regularize the differences of coefficients of neighboring areas, suitable for time/spatial analysis. The adjacency matrix corresponding to the graph needs to be provided. The elements of this matrix are zero when two levels are not connected, and one when they are adjacent.

Usage

makeOffByOneAdjMat(nct)
makeOffByOneAdjMat(nct)

Arguments

nct

the number of cell types/states

Details

If manually input the adjacency matrix, this matrix has to be symmetric and the names of the cell states should be given as row and column names.

Examples

sce <- makeSimulatedData()
nct <- nlevels(sce$x)
adjmatrix <- makeOffByOneAdjMat(nct)
colnames(adjmatrix) <- rownames(adjmatrix) <- levels(sce$x)

sce <- makeSimulatedData()
nct <- nlevels(sce$x)
adjmatrix <- makeOffByOneAdjMat(nct)
colnames(adjmatrix) <- rownames(adjmatrix) <- levels(sce$x)

Make simulated data for airpart

Description

Make simulated data for airpart

Usage

makeSimulatedData(
  mu1 = 2,
  mu2 = 10,
  nct = 4,
  n = 30,
  ngenecl = 50,
  theta = 20,
  ncl = 3,
  p.vec = rep(c(0.2, 0.8, 0.5, 0.5, 0.7, 0.9), each = 2),
  totalClusters = FALSE
)
makeSimulatedData(
  mu1 = 2,
  mu2 = 10,
  nct = 4,
  n = 30,
  ngenecl = 50,
  theta = 20,
  ncl = 3,
  p.vec = rep(c(0.2, 0.8, 0.5, 0.5, 0.7, 0.9), each = 2),
  totalClusters = FALSE
)

Arguments

`mu1`	low count (typical of "noisy" ratio estimates)
`mu2`	high count
`nct`	number of cell types
`n`	number of cells per cell type
`ngenecl`	number of genes per cluster
`theta`	overdispersion parameter (higher is closer to binomial)
`ncl`	number of gene cluster
`p.vec`	the allelic ratio vector which follows gene cluster order. (length is nct * ncl)
`totalClusters`	logical, whether cell types should cluster by total count

Value

SingleCellExperiment with the following elements as assays

a1 allelic count matrix for the numerator/effect allele
a2 allelic count matrix for the denominator/non-effect allele
true.ratio a matrix of the true probabilities (allelic ratios) for the cell types

Also x in the colData is a vector of annotated cell types in the same order as cells in count matrix

Examples

library(SummarizedExperiment)
sce <- makeSimulatedData()
assayNames(sce)
library(SummarizedExperiment)
sce <- makeSimulatedData()
assayNames(sce)

Plot group partition and posterior allelic ratio estimates by step

Description

Plot group partition and posterior allelic ratio estimates by step

Usage

makeStep(sce, xlab = "cell type")
makeStep(sce, xlab = "cell type")

Arguments

`sce`	SingleCellExperiment
`xlab`	the x axis name.

Value

a ggplot2 object.

Examples


sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub)
makeStep(sce_sub)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub)
makeStep(sce_sub)

Posterior mean allelic ratio estimates in violin plots

Description

Posterior mean allelic ratio estimates in violin plots

Usage

makeViolin(sce, xlab = "cell type", ylim = c(0, 1))
makeViolin(sce, xlab = "cell type", ylim = c(0, 1))

Arguments

`sce`	SingleCellExperiment
`xlab`	the x axis name.
`ylim`	the y axis range

Value

a ggplot2 object, n represents number of cells in that cell type.

Examples


sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub)
makeViolin(sce_sub)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
sce_sub <- wilcoxExt(sce, genecluster = 1)
sce_sub <- allelicRatio(sce_sub)
makeViolin(sce_sub)

Preprocess the SingleCellExperiment

Description

Preprocess the SingleCellExperiment

Usage

preprocess(sce, pc = 2)
preprocess(sce, pc = 2)

Arguments

`sce`	SingleCellExperiment with `a1` (effect allele) and `a2` (non-effect allele). The allelic ratio will be calculated as a1 / (a1 + a2).
`pc`	pseudocount for calculating the smoothed ratio

Value

SingleCellExperiment with total count, allelic ratio = a1/(a1 + a2), and pseud-ocount-smoothed ratio

Examples

library(SummarizedExperiment)
sce <- makeSimulatedData()
sce <- preprocess(sce)
assayNames(sce)
library(SummarizedExperiment)
sce <- makeSimulatedData()
sce <- preprocess(sce)
assayNames(sce)

Allelic ratio summary

Description

Oroduce allelic ratio summaries for each gene cluster

Usage

summaryAllelicRatio(sce, genecluster)
summaryAllelicRatio(sce, genecluster)

Arguments

`sce`	SingleCellExperiment
`genecluster`	an optional vector of gene cluster IDs. if nothing is given, all cluster's summaries will be calculated

Value

a list of gene cluster summary tables containing:

weighted.mean weighted mean of allelic ratio for the cell types
mean mean allelic ratio for the cell types
var variance of allelic ratio for the cell types

Examples


library(S4Vectors)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
summary <- summaryAllelicRatio(sce, genecluster = c(1, 3))
summary
library(S4Vectors)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = 1:4)
summary <- summaryAllelicRatio(sce, genecluster = c(1, 3))
summary

Extension on Pairwise Mann Whitney Wilcoxon Test for partitioning

Description

Extends the Pairwise Mann Whitney Wilcoxon Test by combining hierarchical clustering for partition.

Usage

wilcoxExt(
  sce,
  genecluster,
  threshold,
  adj.matrix,
  p.adjust.method = "none",
  ncores = NULL,
  ...
)
wilcoxExt(
  sce,
  genecluster,
  threshold,
  adj.matrix,
  p.adjust.method = "none",
  ncores = NULL,
  ...
)

Arguments

`sce`	A SingleCellExperiment containing assays (`"ratio"`, `"counts"`) and colData `"x"`
`genecluster`	which gene cluster result want to be returned. Usually identified interesting gene cluster pattern by `summaryAllelicRatio`
`threshold`	a vector with candidate thresholds for raw p-value cut-off. Default is 10^seq(from=-2,to=-0.4,by=0.2). For details please see vignette
`adj.matrix`	an adjacency matrix with 1 indicates cell states allowed to be grouped together, 0 otherwise.
`p.adjust.method`	method for adjusting p-values (see `p.adjust`). Can be abbreviated
`ncores`	A cluster object created by `makeCluster`. Or an integer to indicate number of child-processes (integer values are ignored on Windows) for parallel evaluations
`...`	additional arguments to pass to `wilcox.test`.

Value

A matrix grouping factor partition and the significant cut-off threshold are returned in metadata "partition" and "threshold". Partation also stored in colData"part". Note we recommend the returned "threshold" is not at the ends of input "threshold".

Examples

library(S4Vectors)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
sce_sub <- wilcoxExt(sce, genecluster = 1)
metadata(sce_sub)$partition
metadata(sce_sub)$threshold

# Suppose we have 4 cell states, if we don't want cell state 1
# to be grouped together with other cell states
adj.matrix <- 1 - diag(4)
colnames(adj.matrix) <- rownames(adj.matrix) <- levels(sce$x)
adj.matrix[1, c(2, 3, 4)] <- 0
adj.matrix[c(2, 3, 4), 1] <- 0
thrs <- 10^seq(from = -2, to = -0.4, by = 0.1)
sce_sub <- wilcoxExt(sce,
  genecluster = 1, threshold = thrs,
  adj.matrix = adj.matrix
)
metadata(sce_sub)$partition
library(S4Vectors)
sce <- makeSimulatedData()
sce <- preprocess(sce)
sce <- geneCluster(sce, G = seq_len(4))
sce_sub <- wilcoxExt(sce, genecluster = 1)
metadata(sce_sub)$partition
metadata(sce_sub)$threshold

# Suppose we have 4 cell states, if we don't want cell state 1
# to be grouped together with other cell states
adj.matrix <- 1 - diag(4)
colnames(adj.matrix) <- rownames(adj.matrix) <- levels(sce$x)
adj.matrix[1, c(2, 3, 4)] <- 0
adj.matrix[c(2, 3, 4), 1] <- 0
thrs <- 10^seq(from = -2, to = -0.4, by = 0.1)
sce_sub <- wilcoxExt(sce,
  genecluster = 1, threshold = thrs,
  adj.matrix = adj.matrix
)
metadata(sce_sub)$partition

Package 'airpart'

Help Index

Fit beta-binomial across cell types

Description

Usage

Arguments

Value

Examples

Quality control on cells

Description

Usage

Arguments

Value

Examples

Consensus Partitions

Description

Usage

Arguments

Value

Examples

Estimate overdispersion parameter of a beta-binomial

Description

Usage

Arguments

Value

Examples

Extract results from an airpart analysis

Description

Usage

Arguments

Value

Examples

Quality control on features

Description

Usage

Arguments

Value

Examples

Generalized fused lasso to partition cell types by allelic imbalance

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Gene clustering based on allelic ratio matrix with pseudo-count

Description

Usage

Arguments

Value

References

See Also

Examples

Plot allelic ratio result as forest

Description

Usage

Arguments

Value

See Also

Examples

Plot allelic ratio as heatmap

Description

Usage

Arguments

Value

Examples

Generating adjancy matrix for neighboring cell states.

Description

Usage

Arguments

Details

Examples

Make simulated data for airpart

Description

Usage

Arguments

Value

Examples