Title: | Bayesian Analysis of Single-Cell Sequencing data |
---|---|
Description: | Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend. |
Authors: | Catalina Vallejos [aut, cre] , Nils Eling [aut], Alan O'Callaghan [aut], Sylvia Richardson [ctb], John Marioni [ctb] |
Maintainer: | Catalina Vallejos <[email protected]> |
License: | GPL-3 |
Version: | 2.19.0 |
Built: | 2024-12-29 04:08:33 UTC |
Source: | https://github.com/bioc/BASiCS |
Partitions data based on either cells or genes. Attempts to find a partitioning scheme which is "balanced" for either total reads per cell across all genes (partitioning by gene) or total expression per gene across all cells (partitioning by gene). When partitioning by cell, at least 20 cells must be in each partition or BASiCS_MCMC will fail. If this partitioning fails, it will continue recursively up to a maximum number of iterations (20 by default).
.generateSubsets( Data, NSubsets, SubsetBy = c("cell", "gene"), Alpha = 0.05, WithSpikes = FALSE, MaxDepth = 20, .Depth = 1 )
.generateSubsets( Data, NSubsets, SubsetBy = c("cell", "gene"), Alpha = 0.05, WithSpikes = FALSE, MaxDepth = 20, .Depth = 1 )
Data |
a SingleCellExperiment object |
NSubsets |
Integer specifying the number of batches into which to divide Data for divide and conquer inference. |
SubsetBy |
Partition by "cell" or by "gene". |
Alpha |
p-value threshold for ANOVA testing of "balance" |
WithSpikes |
Similar to argument for BASiCS_MCMC - do the Data contain spikes? |
MaxDepth |
Maximum number of recursive |
.Depth |
Internal parameter. Do not set. |
A list of SingleCellExperiment objects
Methods for subsetting BASiCS_Result and BASiCS_ResultsDE objects.
## S4 method for signature 'BASiCS_ResultsDE,ANY,ANY,ANY' x[i, j, drop = FALSE] ## S4 method for signature 'BASiCS_ResultsDE,ANY,ANY' x[[i]] ## S4 method for signature 'BASiCS_Result,ANY,ANY,ANY' x[i, j, drop = FALSE]
## S4 method for signature 'BASiCS_ResultsDE,ANY,ANY,ANY' x[i, j, drop = FALSE] ## S4 method for signature 'BASiCS_ResultsDE,ANY,ANY' x[[i]] ## S4 method for signature 'BASiCS_Result,ANY,ANY,ANY' x[i, j, drop = FALSE]
x |
Object being subsetted. |
i |
See |
j , drop
|
Ignored. |
An object of the same class as x
.
Converting BASiCS results objects to data.frames
## S4 method for signature 'BASiCS_ResultsDE' as.data.frame(x, Parameter, Filter = TRUE, ProbThreshold = NULL) ## S4 method for signature 'BASiCS_ResultDE' as.data.frame(x, Filter = TRUE, ProbThreshold = NULL) ## S4 method for signature 'BASiCS_ResultVG' as.data.frame(x, Filter = TRUE, ProbThreshold = NULL)
## S4 method for signature 'BASiCS_ResultsDE' as.data.frame(x, Parameter, Filter = TRUE, ProbThreshold = NULL) ## S4 method for signature 'BASiCS_ResultDE' as.data.frame(x, Filter = TRUE, ProbThreshold = NULL) ## S4 method for signature 'BASiCS_ResultVG' as.data.frame(x, Filter = TRUE, ProbThreshold = NULL)
x |
An object of class BASiCS_ResultVG, BASiCS_ResultDE, or BASiCS_ResultsDE. |
Parameter |
For BASiCS_ResultsDE objects only. Character scalar specifying which table of results to output. Available options are "Mean", (mu, mean expression), "Disp" (delta, overdispersion) and "ResDisp" (epsilon, residual overdispersion). |
Filter |
Logical scalar. If |
ProbThreshold |
Only used if |
A data.frame of test results.
Convert concentration in moles per microlitre to molecule counts
BASiCS_CalculateERCC(Mix, DilutionFactor, VolumePerCell)
BASiCS_CalculateERCC(Mix, DilutionFactor, VolumePerCell)
Mix |
The name of the spike-in mix to use. |
DilutionFactor |
The dilution factor applied to the spike-in mixture.
e.g., 1 microlitre per 50ml would be a 1/50000 |
VolumePerCell |
The volume of spike-in mixture added to each well, or to each cell. |
The molecule counts per well, or per cell, based on the input parameters.
Container of an MCMC sample of the BASiCS' model parameters
as generated by the function BASiCS_MCMC
.
parameters
List of matrices containing MCMC chains for each model parameter. Depending on the mode in which BASiCS was run, the following parameters can appear in the list:
mu
MCMC chain for gene-specific mean expression parameters
, biological genes only
(matrix with
q.bio
columns, all elements must be positive numbers)
MCMC chain for gene-specific biological over-dispersion
parameters , biological genes only
(matrix with
q.bio
columns, all elements must be positive numbers)
MCMC chain for cell-specific mRNA content normalisation parameters
(matrix with
n
columns, all elements must be positive
numbers and the sum of its elements must be equal to n
)
This parameter is only used when spike-in genes are available.
MCMC chain for cell-specific technical normalisation parameters
(matrix with
n
columns,
all elements must be positive numbers)
MCMC chain for cell-specific random effects
(matrix with
n
columns, all elements must be positive numbers)
MCMC chain for technical over-dispersion parameter(s)
(matrix, all elements must be positive,
each colum represents 1 batch)
beta
Only relevant for regression BASiCS model (Eling et al,
2017). MCMC chain for regression coefficients (matrix with k
columns,
where k
represent the number of chosen basis functions + 2)
sigma2
Only relevant for regression BASiCS model (Eling et al, 2017). MCMC chain for the residual variance (matrix with one column, sigma2 represents a global parameter)
epsilon
Only relevant for regression BASiCS model (Eling et al,
2017). MCMC chain for the gene-specific residual over-dispersion parameter
(matrix with q
columns)
RefFreq
Only relevant for no-spikes BASiCS model (Eling et al, 2017). For each biological gene, this vector displays the proportion of times for which each gene was used as a reference (within the MCMC algorithm), when using the stochastic reference choice described in (Eling et al, 2017). This information has been kept as it is useful for the developers of this library. However, we do not expect users to need it.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
# A BASiCS_Chain object created by the BASiCS_MCMC function. Data <- makeExampleBASiCS_Data() # To run the model without regression Chain <- BASiCS_MCMC(Data, N = 100, Thin = 2, Burn = 2, Regression = FALSE) # To run the model using the regression model ChainReg <- BASiCS_MCMC(Data, N = 100, Thin = 2, Burn = 2, Regression = TRUE)
# A BASiCS_Chain object created by the BASiCS_MCMC function. Data <- makeExampleBASiCS_Data() # To run the model without regression Chain <- BASiCS_MCMC(Data, N = 100, Thin = 2, Burn = 2, Regression = FALSE) # To run the model using the regression model ChainReg <- BASiCS_MCMC(Data, N = 100, Thin = 2, Burn = 2, Regression = TRUE)
'show' method for BASiCS_Chain
objects.
'updateObject' method for BASiCS_Chain
objects. It is used to convert outdated BASiCS_Chain
objects into a version that is compatible with the Bioconductor release
of BASiCS. Do not use this method is BASiCS_Chain
already contains a parameters
slot.
## S4 method for signature 'BASiCS_Chain' show(object) ## S4 method for signature 'BASiCS_Chain' updateObject(object, ..., verbose = FALSE)
## S4 method for signature 'BASiCS_Chain' show(object) ## S4 method for signature 'BASiCS_Chain' updateObject(object, ..., verbose = FALSE)
object |
A |
... |
Additional arguments of |
verbose |
Additional argument of
|
Prints a summary of the properties of object
.
Returns an updated BASiCS_Chain
object that
contains all model parameters in a single slot object (list).
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Data <- makeExampleBASiCS_Data() Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 2, Regression = FALSE) # Not run # New_Chain <- updateObject(Old_Chain)
Data <- makeExampleBASiCS_Data() Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 2, Regression = FALSE) # Not run # New_Chain <- updateObject(Old_Chain)
Remove global offset in mean expression between two
BASiCS_Chain
objects.
BASiCS_CorrectOffset(Chain, ChainRef, min.mean = 1)
BASiCS_CorrectOffset(Chain, ChainRef, min.mean = 1)
Chain |
a 'BASiCS_MCMC' object to which the offset correction should be applied (with respect to 'ChainRef'). |
ChainRef |
a 'BASiCS_MCMC' object to be used as the reference in the offset correction procedure. |
min.mean |
Minimum mean expression threshold required for inclusion in offset calculation. Similar to 'min.mean' in 'scran::computeSumFactors'. |
A list whose first element is an offset corrected version of 'Chain' (using 'ChainRef' as a reference), whose second element is the point estimate for the offset and whose third element contains iteration-specific offsets.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Alan O'Callaghan
# Loading two 'BASiCS_Chain' objects (obtained using 'BASiCS_MCMC') data(ChainSC) data(ChainRNA) A <- BASiCS_CorrectOffset(ChainSC, ChainRNA) # Offset corrected versions for ChainSC (with respect to ChainRNA). A$Chain A$Offset
# Loading two 'BASiCS_Chain' objects (obtained using 'BASiCS_MCMC') data(ChainSC) data(ChainRNA) A <- BASiCS_CorrectOffset(ChainSC, ChainRNA) # Offset corrected versions for ChainSC (with respect to ChainRNA). A$Chain A$Offset
Calculates denoised expression counts by removing cell-specific technical variation. The latter includes global-scaling normalisation and therefore no further normalisation is required.
BASiCS_DenoisedCounts(Data, Chain, WithSpikes = TRUE)
BASiCS_DenoisedCounts(Data, Chain, WithSpikes = TRUE)
Data |
An object of class |
Chain |
An object of class |
WithSpikes |
A logical scalar specifying whether denoised spike-in
genes should be generated as part of the output value. This only applies
when the |
See vignette browseVignettes("BASiCS")
A matrix of denoised expression counts. In line with global scaling
normalisation strategies, these are defined as
for biological genes and
for spike-in genes. For this
calculation
are estimated by their corresponding
posterior medians. If spike-ins are not used,
is set equal to 1.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Data <- makeExampleBASiCS_Data(WithSpikes = TRUE) ## The N and Burn parameters used here are optimised for speed ## and should not be used in regular use. ## For more useful parameters, ## see the vignette (\code{browseVignettes("BASiCS")}) Chain <- BASiCS_MCMC(Data, N = 1000, Thin = 10, Burn = 500, Regression = FALSE, PrintProgress = FALSE) DC <- BASiCS_DenoisedCounts(Data, Chain)
Data <- makeExampleBASiCS_Data(WithSpikes = TRUE) ## The N and Burn parameters used here are optimised for speed ## and should not be used in regular use. ## For more useful parameters, ## see the vignette (\code{browseVignettes("BASiCS")}) Chain <- BASiCS_MCMC(Data, N = 1000, Thin = 10, Burn = 500, Regression = FALSE, PrintProgress = FALSE) DC <- BASiCS_DenoisedCounts(Data, Chain)
Calculates normalised and denoised expression rates, by removing the effect of technical variation.
BASiCS_DenoisedRates(Data, Chain, Propensities = FALSE)
BASiCS_DenoisedRates(Data, Chain, Propensities = FALSE)
Data |
an object of class |
Chain |
an object of class |
Propensities |
If |
See vignette
A matrix of denoised expression rates (biological genes only)
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Data <- makeExampleBASiCS_Data(WithSpikes = TRUE) ## The N and Burn parameters used here are optimised for speed ## and should not be used in regular use. ## For more useful parameters, ## see the vignette (\code{browseVignettes("BASiCS")}) Chain <- BASiCS_MCMC(Data, N = 1000, Thin = 10, Burn = 500, Regression = FALSE, PrintProgress = FALSE) DR <- BASiCS_DenoisedRates(Data, Chain)
Data <- makeExampleBASiCS_Data(WithSpikes = TRUE) ## The N and Burn parameters used here are optimised for speed ## and should not be used in regular use. ## For more useful parameters, ## see the vignette (\code{browseVignettes("BASiCS")}) Chain <- BASiCS_MCMC(Data, N = 1000, Thin = 10, Burn = 500, Regression = FALSE, PrintProgress = FALSE) DR <- BASiCS_DenoisedRates(Data, Chain)
Functions to detect highly and lowly variable genes. If the BASiCS_Chain object was generated using the regression approach, BASiCS finds the top highly variable genes based on the posteriors of the epsilon parameters. Otherwise, the old approach is used, which initially performs a variance decomposition.
BASiCS_DetectVG( Chain, Task = c("HVG", "LVG"), PercentileThreshold = NULL, VarThreshold = NULL, ProbThreshold = 2/3, EpsilonThreshold = NULL, EFDR = 0.1, OrderVariable = c("Prob", "GeneIndex", "GeneName"), Plot = FALSE, MinESS = 100, ... ) BASiCS_DetectLVG(Chain, ...) BASiCS_DetectHVG(Chain, ...)
BASiCS_DetectVG( Chain, Task = c("HVG", "LVG"), PercentileThreshold = NULL, VarThreshold = NULL, ProbThreshold = 2/3, EpsilonThreshold = NULL, EFDR = 0.1, OrderVariable = c("Prob", "GeneIndex", "GeneName"), Plot = FALSE, MinESS = 100, ... ) BASiCS_DetectLVG(Chain, ...) BASiCS_DetectHVG(Chain, ...)
Chain |
an object of class |
Task |
Search for highly variable genes ( |
PercentileThreshold |
Threshold to detect a percentile of variable genes
(must be a positive value, between 0 and 1).
Default: |
VarThreshold |
Variance contribution threshold
(must be a positive value, between 0 and 1). This is only used when the
BASiCS non-regression model was used to generate the Chain object.
Default: |
ProbThreshold |
Optional parameter. Posterior probability threshold
(must be a positive value, between 0 and 1). If |
EpsilonThreshold |
Threshold for residual overdispersion above which |
EFDR |
Target for expected false discovery rate related
to HVG/LVG detection. If |
OrderVariable |
Ordering variable for output.
Possible values: |
Plot |
If |
MinESS |
The minimum effective sample size for a gene to be included in the HVG or LVG tests. This helps to remove genes with poor mixing from detection of HVGs/LVGs. Default is 100. If set to NA, genes are not checked for effective sample size the tests are performed. |
... |
Graphical parameters (see |
In some cases, the EFDR calibration step may fail to find probability threshold that controls the EFDR at the chosen level. In cases like
See vignette
An object of class BASiCS_ResultVG
.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Vallejos, Marioni and Richardson (2015). PLoS Computational Biology.
# Loads short example chain (non-regression implementation) data(ChainSC) # Highly and lowly variable genes detection (within a single group of cells) DetectHVG <- BASiCS_DetectHVG(ChainSC, VarThreshold = 0.60, EFDR = 0.10, Plot = TRUE) DetectLVG <- BASiCS_DetectLVG(ChainSC, VarThreshold = 0.40, EFDR = 0.10, Plot = TRUE) # Loads short example chain (regression implementation) data(ChainSCReg) # Highly and lowly variable genes detection (within a single group of cells) DetectHVG <- BASiCS_DetectHVG(ChainSCReg, PercentileThreshold = 0.90, EFDR = 0.10, Plot = TRUE) DetectLVG <- BASiCS_DetectLVG(ChainSCReg, PercentileThreshold = 0.10, EFDR = 0.10, Plot = TRUE) ## Highly and lowly variable genes detection based on residual overdispersion ## threshold DetectHVG <- BASiCS_DetectHVG(ChainSCReg, EpsilonThreshold = log(2), Plot = TRUE) DetectLVG <- BASiCS_DetectLVG(ChainSCReg, EpsilonThreshold = -log(2), Plot = TRUE)
# Loads short example chain (non-regression implementation) data(ChainSC) # Highly and lowly variable genes detection (within a single group of cells) DetectHVG <- BASiCS_DetectHVG(ChainSC, VarThreshold = 0.60, EFDR = 0.10, Plot = TRUE) DetectLVG <- BASiCS_DetectLVG(ChainSC, VarThreshold = 0.40, EFDR = 0.10, Plot = TRUE) # Loads short example chain (regression implementation) data(ChainSCReg) # Highly and lowly variable genes detection (within a single group of cells) DetectHVG <- BASiCS_DetectHVG(ChainSCReg, PercentileThreshold = 0.90, EFDR = 0.10, Plot = TRUE) DetectLVG <- BASiCS_DetectLVG(ChainSCReg, PercentileThreshold = 0.10, EFDR = 0.10, Plot = TRUE) ## Highly and lowly variable genes detection based on residual overdispersion ## threshold DetectHVG <- BASiCS_DetectHVG(ChainSCReg, EpsilonThreshold = log(2), Plot = TRUE) DetectLVG <- BASiCS_DetectLVG(ChainSCReg, EpsilonThreshold = -log(2), Plot = TRUE)
Plot a histogram of effective sample size or Geweke's diagnostic z-statistic. See effectiveSize and geweke.diag for more details.
BASiCS_DiagHist( object, Parameter = NULL, Measure = c("ess", "geweke.diag", "rhat"), VLine = TRUE, na.rm = TRUE ) BASiCS_diagHist(...)
BASiCS_DiagHist( object, Parameter = NULL, Measure = c("ess", "geweke.diag", "rhat"), VLine = TRUE, na.rm = TRUE ) BASiCS_diagHist(...)
object |
an object of class |
Parameter |
Optional name of a chain parameter to restrict the
histogram; if not supplied, all parameters will be assessed.
Default |
Measure |
Character scalar specifying the diagnostic measure to plot.
Current options are effective sample size, the Geweke diagnostic
criterion, and the |
VLine |
Numeric scalar indicating a threshold value to be displayed as
a dashed line on the plot.
Alternatively, can be set to |
na.rm |
Logical scalar indicating whether NA values should be removed before calculating effective sample size. |
... |
Unused. |
A ggplot object.
Alan O'Callaghan
Geweke, J. Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In _Bayesian Statistics 4_ (ed JM Bernado, JO Berger, AP Dawid and AFM Smith). Clarendon Press, Oxford, UK.
# Built-in example chain data(ChainSC) # See effective sample size distribution across all parameters BASiCS_DiagHist(ChainSC) # For mu only BASiCS_DiagHist(ChainSC, Parameter = "mu")
# Built-in example chain data(ChainSC) # See effective sample size distribution across all parameters BASiCS_DiagHist(ChainSC) # For mu only BASiCS_DiagHist(ChainSC, Parameter = "mu")
Plot parameter values and effective sample size. See effectiveSize for more details on this diagnostic measure.
BASiCS_DiagPlot( object, Parameter = "mu", Measure = c("ess", "geweke.diag", "rhat"), x = NULL, y = NULL, LogX = isTRUE(x %in% c("mu", "delta")), LogY = isTRUE(y %in% c("mu", "delta")), Smooth = TRUE, HLine = TRUE, na.rm = TRUE ) BASiCS_diagPlot(...)
BASiCS_DiagPlot( object, Parameter = "mu", Measure = c("ess", "geweke.diag", "rhat"), x = NULL, y = NULL, LogX = isTRUE(x %in% c("mu", "delta")), LogY = isTRUE(y %in% c("mu", "delta")), Smooth = TRUE, HLine = TRUE, na.rm = TRUE ) BASiCS_diagPlot(...)
object |
an object of class |
Parameter |
Name of the parameter to be plotted.
Default |
Measure |
Character scalar specifying the diagnostic measure to plot.
Current options are effective sample size, the Geweke diagnostic
criterion, and the |
x , y
|
Optional MCMC parameter values to be plotted on the x or y axis, respectively. If neither is supplied, Parameter will be plotted on the x axis and effective sample size will be plotted on the y axis as a density plot. |
LogX , LogY
|
A logical value indicating whether to use a log10 transformation for the x or y axis, respectively. |
Smooth |
A logical value indicating whether to use smoothing
(specifically hexagonal binning using |
HLine |
Numeric scalar or vector indicating threshold value(s) to be
displayed as a dashed line on the plot when |
na.rm |
Logical value indicating whether NA values should be removed before calculating effective sample size. |
... |
Unused. |
A ggplot object.
Alan O'Callaghan
# Built-in example chain data(ChainSC) # Point estimates versus effective sample size BASiCS_DiagPlot(ChainSC, Parameter = "mu") # Effective sample size as colour, mu as x, delta as y. BASiCS_DiagPlot(ChainSC, x = "mu", y = "delta") # Point estimates versus Geweke diagnostic BASiCS_DiagPlot(ChainSC, Parameter = "mu", Measure = "geweke.diag")
# Built-in example chain data(ChainSC) # Point estimates versus effective sample size BASiCS_DiagPlot(ChainSC, Parameter = "mu") # Effective sample size as colour, mu as x, delta as y. BASiCS_DiagPlot(ChainSC, x = "mu", y = "delta") # Point estimates versus Geweke diagnostic BASiCS_DiagPlot(ChainSC, Parameter = "mu", Measure = "geweke.diag")
Performs MCMC inference on batches of data. Data
is divided
into NSubsets
batches, and BASiCS_MCMC
is run on each
batch separately.
BASiCS_DivideAndConquer( Data, NSubsets = 5, SubsetBy = c("cell", "gene"), Alpha = 0.05, WithSpikes, Regression, BPPARAM = BiocParallel::bpparam(), PriorParam = BASiCS_PriorParam(Data, PriorMu = "EmpiricalBayes"), RunName, StoreChains, StoreDir, Start, ... )
BASiCS_DivideAndConquer( Data, NSubsets = 5, SubsetBy = c("cell", "gene"), Alpha = 0.05, WithSpikes, Regression, BPPARAM = BiocParallel::bpparam(), PriorParam = BASiCS_PriorParam(Data, PriorMu = "EmpiricalBayes"), RunName, StoreChains, StoreDir, Start, ... )
Data |
SingleCellExperiemnt object |
NSubsets |
The number of batches to create and perform MCMC inference with. |
SubsetBy |
A character value specifying whether batches should consist
of a subset of the cells in |
Alpha |
A numeric value specifying the statistical significance level used to determine whether the average library size or average count are significantly different between batches. |
WithSpikes , Regression , PriorParam
|
See |
BPPARAM |
A |
RunName , StoreChains , StoreDir , Start
|
Unused. If used when calling this function, they are likely to result in undefined behaviour. |
... |
Passed to |
Subsets are chosen such that the average library size (when partitioning
by cells) or average count (when partitioning by genes) is not significantly
different between batches, at a significance level Alpha
.
A list of BASiCS_Chain objects.
Simple, Scalable and Accurate Posterior Interval Estimation Cheng Li and Sanvesh Srivastava and David B. Dunson arXiv (2016)
bp <- BiocParallel::SnowParam() Data <- BASiCS_MockSCE() BASiCS_DivideAndConquer( Data, NSubsets = 2, SubsetBy = "gene", N = 8, Thin = 2, Burn = 4, WithSpikes = TRUE, Regression = TRUE, BPPARAM = bp )
bp <- BiocParallel::SnowParam() Data <- BASiCS_MockSCE() BASiCS_DivideAndConquer( Data, NSubsets = 2, SubsetBy = "gene", N = 8, Thin = 2, Burn = 4, WithSpikes = TRUE, Regression = TRUE, BPPARAM = bp )
BASiCS_Draw
creates a simulated dataset from the
posterior of a fitted model implemented in BASiCS.
BASiCS_Draw( Chain, BatchInfo = gsub(".*_Batch([0-9a-zA-Z])", "\\1", colnames(Chain@parameters[["nu"]])), N = sample(nrow(Chain@parameters[["nu"]]), 1) )
BASiCS_Draw( Chain, BatchInfo = gsub(".*_Batch([0-9a-zA-Z])", "\\1", colnames(Chain@parameters[["nu"]])), N = sample(nrow(Chain@parameters[["nu"]]), 1) )
Chain |
An object of class |
BatchInfo |
Vector of batch information from the SingleCellExperiment object used as input to BASiCS_MCMC. |
N |
The integer index for the draw to be used to sample from the posterior predictive distribution. If not supplied, a random value is chosen. |
An object of class SingleCellExperiment
,
including synthetic data generated by the model implemented in BASiCS.
Alan O'Callaghan
Vallejos, Marioni and Richardson (2015). PLoS Computational Biology.
data(ChainSC) BASiCS_Draw(ChainSC) data(ChainSC) BASiCS_Draw(ChainSC)
data(ChainSC) BASiCS_Draw(ChainSC) data(ChainSC) BASiCS_Draw(ChainSC)
A function to calculate effective sample size
BASiCS_Chain
objects.
BASiCS_EffectiveSize(object, Parameter, na.rm = TRUE) BASiCS_effectiveSize(...)
BASiCS_EffectiveSize(object, Parameter, na.rm = TRUE) BASiCS_effectiveSize(...)
object |
an object of class |
Parameter |
The parameter to use to calculate effective sample size.
Possible
values: |
na.rm |
Remove |
... |
Unused. |
A vector with effective sample sizes for all the elements
of Parameter
data(ChainSC) BASiCS_EffectiveSize(ChainSC, Parameter = "mu")
data(ChainSC) BASiCS_EffectiveSize(ChainSC, Parameter = "mu")
BASiCS_Filter
indicates which transcripts and
cells pass a pre-defined inclusion criteria. The output of this
function used to generate a
SingleCellExperiment
object required to run BASiCS.
For more systematic tools for quality control, please refer to the
scater
Bioconductor package.
BASiCS_Filter( Counts, Tech = rep(FALSE, nrow(Counts)), SpikeInput = NULL, BatchInfo = NULL, MinTotalCountsPerCell = 2, MinTotalCountsPerGene = 2, MinCellsWithExpression = 2, MinAvCountsPerCellsWithExpression = 2 )
BASiCS_Filter( Counts, Tech = rep(FALSE, nrow(Counts)), SpikeInput = NULL, BatchInfo = NULL, MinTotalCountsPerCell = 2, MinTotalCountsPerGene = 2, MinCellsWithExpression = 2, MinAvCountsPerCellsWithExpression = 2 )
Counts |
Matrix of dimensions |
Tech |
Logical vector of length |
SpikeInput |
Vector of length |
BatchInfo |
Vector of length |
MinTotalCountsPerCell |
Minimum value of total expression counts
required per cell (biological and technical).
Default: |
MinTotalCountsPerGene |
Minimum value of total expression counts
required per transcript (biological and technical).
Default: |
MinCellsWithExpression |
Minimum number of cells where expression
must be detected (positive count). Criteria applied to each transcript.
Default: |
MinAvCountsPerCellsWithExpression |
Minimum average number of
counts per cells where expression is detected. Criteria applied to
each transcript. Default value: |
A list of 2 elements
Counts
Filtered matrix of expression counts
Tech
Filtered vector of spike-in indicators
SpikeInput
Filtered vector of spike-in genes input molecules
BatchInfo
Filtered vector of the 'BatchInfo' argument
IncludeGenes
Inclusion indicators for transcripts
IncludeCells
Inclusion indicators for cells
Catalina A. Vallejos [email protected]
set.seed(1) Counts <- matrix(rpois(50*10, 2), ncol = 10) rownames(Counts) <- c(paste0('Gene', 1:40), paste0('Spike', 1:10)) Tech <- c(rep(FALSE,40),rep(TRUE,10)) set.seed(2) SpikeInput <- rgamma(10,1,1) SpikeInfo <- data.frame('SpikeID' = paste0('Spike', 1:10), 'SpikeInput' = SpikeInput) Filter <- BASiCS_Filter(Counts, Tech, SpikeInput, MinTotalCountsPerCell = 2, MinTotalCountsPerGene = 2, MinCellsWithExpression = 2, MinAvCountsPerCellsWithExpression = 2) SpikeInfoFilter <- SpikeInfo[SpikeInfo$SpikeID %in% rownames(Filter$Counts),]
set.seed(1) Counts <- matrix(rpois(50*10, 2), ncol = 10) rownames(Counts) <- c(paste0('Gene', 1:40), paste0('Spike', 1:10)) Tech <- c(rep(FALSE,40),rep(TRUE,10)) set.seed(2) SpikeInput <- rgamma(10,1,1) SpikeInfo <- data.frame('SpikeID' = paste0('Spike', 1:10), 'SpikeInput' = SpikeInput) Filter <- BASiCS_Filter(Counts, Tech, SpikeInput, MinTotalCountsPerCell = 2, MinTotalCountsPerGene = 2, MinCellsWithExpression = 2, MinAvCountsPerCellsWithExpression = 2) SpikeInfoFilter <- SpikeInfo[SpikeInfo$SpikeID %in% rownames(Filter$Counts),]
BASiCS_MCMC
functionLoads pre-computed MCMC chains generated by the
BASiCS_MCMC
function, creating
a BASiCS_Chain
object
BASiCS_LoadChain(RunName = "", StoreDir = getwd(), StoreUpdatedChain = FALSE)
BASiCS_LoadChain(RunName = "", StoreDir = getwd(), StoreUpdatedChain = FALSE)
RunName |
String used to index '.Rds' file containing the MCMC chain
(produced by the |
StoreDir |
Directory where '.Rds' file is stored.
Default: |
StoreUpdatedChain |
Only required when the input files contain an
outdated version of a |
An object of class BASiCS_Chain
.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Data <- makeExampleBASiCS_Data() Chain <- BASiCS_MCMC( Data, N = 50, Thin = 5, Burn = 5, Regression = FALSE, StoreChains = TRUE, StoreDir = tempdir(), RunName = "Test" ) ChainLoad <- BASiCS_LoadChain(RunName = "Test", StoreDir = tempdir())
Data <- makeExampleBASiCS_Data() Chain <- BASiCS_MCMC( Data, N = 50, Thin = 5, Burn = 5, Regression = FALSE, StoreChains = TRUE, StoreDir = tempdir(), RunName = "Test" ) ChainLoad <- BASiCS_LoadChain(RunName = "Test", StoreDir = tempdir())
MCMC sampler to perform Bayesian inference for single-cell mRNA sequencing datasets using the model described in Vallejos et al (2015).
BASiCS_MCMC( Data, N, Thin, Burn, Regression, WithSpikes = TRUE, PriorParam = BASiCS_PriorParam(Data, PriorMu = "EmpiricalBayes"), FixNu = FALSE, SubsetBy = c("none", "gene", "cell"), NSubsets = 1, CombineMethod = c("pie", "consensus"), Weighting = c("naive", "n_weight", "inverse_variance"), Threads = getOption("Ncpus", default = 1L), BPPARAM = BiocParallel::bpparam(), ... )
BASiCS_MCMC( Data, N, Thin, Burn, Regression, WithSpikes = TRUE, PriorParam = BASiCS_PriorParam(Data, PriorMu = "EmpiricalBayes"), FixNu = FALSE, SubsetBy = c("none", "gene", "cell"), NSubsets = 1, CombineMethod = c("pie", "consensus"), Weighting = c("naive", "n_weight", "inverse_variance"), Threads = getOption("Ncpus", default = 1L), BPPARAM = BiocParallel::bpparam(), ... )
Data |
A |
N |
Total number of iterations for the MCMC sampler.
Use |
Thin |
Thining period for the MCMC sampler. Use |
Burn |
Burn-in period for the MCMC sampler. Use |
Regression |
If |
WithSpikes |
If |
PriorParam |
List of prior parameters for BASiCS_MCMC.
Should be created using |
FixNu |
Should the scaling normalisation factor |
SubsetBy |
Character value specifying whether a divide and
conquer inference strategy should be used. When this is set to |
NSubsets |
If |
CombineMethod |
The method used to combine
subposteriors if |
Weighting |
The weighting method used in the weighted
average chosen using |
Threads |
Integer specifying the number of threads to be used to
parallelise parameter updates. Default value is the globally set
|
BPPARAM |
A |
... |
Optional parameters.
|
An object of class BASiCS_Chain
.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Vallejos, Marioni and Richardson (2015). PLoS Computational Biology.
Vallejos, Richardson and Marioni (2016). Genome Biology.
Eling et al (2018). Cell Systems
Simple, Scalable and Accurate Posterior Interval Estimation Cheng Li and Sanvesh Srivastava and David B. Dunson arXiv (2016)
Bayes and Big Data: The Consensus Monte Carlo Algorithm Steven L. Scott, Alexander W. Blocker, Fernando V. Bonassi, Hugh A. Chipman, Edward I. George and Robert E. McCulloch International Journal of Management Science and Engineering Management (2016)
# Built-in simulated dataset set.seed(1) Data <- makeExampleBASiCS_Data() # To analyse real data, please refer to the instructions in: # https://github.com/catavallejos/BASiCS/wiki/2.-Input-preparation # Only a short run of the MCMC algorithm for illustration purposes # Longer runs migth be required to reach convergence Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 10, Regression = FALSE, PrintProgress = FALSE, WithSpikes = TRUE) # To run the regression version of BASiCS, use: Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 10, Regression = TRUE, PrintProgress = FALSE, WithSpikes = TRUE) # To run the non-spike version BASiCS requires the data to contain at least # 2 batches: set.seed(2) Data <- makeExampleBASiCS_Data(WithBatch = TRUE) Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 10, Regression = TRUE, PrintProgress = FALSE, WithSpikes = FALSE) # For illustration purposes we load a built-in 'BASiCS_Chain' object # (obtained using the 'BASiCS_MCMC' function) data(ChainSC) # `displayChainBASiCS` can be used to extract information from this output. # For example: head(displayChainBASiCS(ChainSC, Param = 'mu')) # Traceplot (examples only) plot(ChainSC, Param = 'mu', Gene = 1) plot(ChainSC, Param = 'phi', Cell = 1) plot(ChainSC, Param = 'theta', Batch = 1) # Calculating posterior medians and 95% HPD intervals ChainSummary <- Summary(ChainSC) # `displaySummaryBASiCS` can be used to extract information from this output # For example: head(displaySummaryBASiCS(ChainSummary, Param = 'mu')) # Graphical display of posterior medians and 95% HPD intervals # For example: plot(ChainSummary, Param = 'mu', main = 'All genes') plot(ChainSummary, Param = 'mu', Genes = 1:10, main = 'First 10 genes') plot(ChainSummary, Param = 'phi', main = 'All cells') plot(ChainSummary, Param = 'phi', Cells = 1:5, main = 'First 5 cells') plot(ChainSummary, Param = 'theta') # To constrast posterior medians of cell-specific parameters # For example: par(mfrow = c(1,2)) plot(ChainSummary, Param = 'phi', Param2 = 's', SmoothPlot = FALSE) # Recommended for large numbers of cells plot(ChainSummary, Param = 'phi', Param2 = 's', SmoothPlot = TRUE) # To constrast posterior medians of gene-specific parameters par(mfrow = c(1,2)) plot(ChainSummary, Param = 'mu', Param2 = 'delta', log = 'x', SmoothPlot = FALSE) # Recommended plot(ChainSummary, Param = 'mu', Param2 = 'delta', log = 'x', SmoothPlot = TRUE) # To obtain denoised rates / counts, see: # help(BASiCS_DenoisedRates) # and # help(BASiCS_DenoisedCounts) # For examples of differential analyses between 2 populations of cells see: # help(BASiCS_TestDE)
# Built-in simulated dataset set.seed(1) Data <- makeExampleBASiCS_Data() # To analyse real data, please refer to the instructions in: # https://github.com/catavallejos/BASiCS/wiki/2.-Input-preparation # Only a short run of the MCMC algorithm for illustration purposes # Longer runs migth be required to reach convergence Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 10, Regression = FALSE, PrintProgress = FALSE, WithSpikes = TRUE) # To run the regression version of BASiCS, use: Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 10, Regression = TRUE, PrintProgress = FALSE, WithSpikes = TRUE) # To run the non-spike version BASiCS requires the data to contain at least # 2 batches: set.seed(2) Data <- makeExampleBASiCS_Data(WithBatch = TRUE) Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 10, Regression = TRUE, PrintProgress = FALSE, WithSpikes = FALSE) # For illustration purposes we load a built-in 'BASiCS_Chain' object # (obtained using the 'BASiCS_MCMC' function) data(ChainSC) # `displayChainBASiCS` can be used to extract information from this output. # For example: head(displayChainBASiCS(ChainSC, Param = 'mu')) # Traceplot (examples only) plot(ChainSC, Param = 'mu', Gene = 1) plot(ChainSC, Param = 'phi', Cell = 1) plot(ChainSC, Param = 'theta', Batch = 1) # Calculating posterior medians and 95% HPD intervals ChainSummary <- Summary(ChainSC) # `displaySummaryBASiCS` can be used to extract information from this output # For example: head(displaySummaryBASiCS(ChainSummary, Param = 'mu')) # Graphical display of posterior medians and 95% HPD intervals # For example: plot(ChainSummary, Param = 'mu', main = 'All genes') plot(ChainSummary, Param = 'mu', Genes = 1:10, main = 'First 10 genes') plot(ChainSummary, Param = 'phi', main = 'All cells') plot(ChainSummary, Param = 'phi', Cells = 1:5, main = 'First 5 cells') plot(ChainSummary, Param = 'theta') # To constrast posterior medians of cell-specific parameters # For example: par(mfrow = c(1,2)) plot(ChainSummary, Param = 'phi', Param2 = 's', SmoothPlot = FALSE) # Recommended for large numbers of cells plot(ChainSummary, Param = 'phi', Param2 = 's', SmoothPlot = TRUE) # To constrast posterior medians of gene-specific parameters par(mfrow = c(1,2)) plot(ChainSummary, Param = 'mu', Param2 = 'delta', log = 'x', SmoothPlot = FALSE) # Recommended plot(ChainSummary, Param = 'mu', Param2 = 'delta', log = 'x', SmoothPlot = TRUE) # To obtain denoised rates / counts, see: # help(BASiCS_DenoisedRates) # and # help(BASiCS_DenoisedCounts) # For examples of differential analyses between 2 populations of cells see: # help(BASiCS_TestDE)
Creates a SingleCellExperiment object of Poisson-distributed approximating a homogeneous cell population.
BASiCS_MockSCE( NGenes = 100, NCells = 100, NSpikes = 20, WithBatch = TRUE, MeanMu = 1 )
BASiCS_MockSCE( NGenes = 100, NCells = 100, NSpikes = 20, WithBatch = TRUE, MeanMu = 1 )
NGenes |
Integer value specifying the number of genes that will be present in the output. |
NCells |
Integer value specifying the number of cells that will be present in the output. |
NSpikes |
Integer value specifying the number of spike-in genes that will be present in the output. |
WithBatch |
Logical value specifying whether a dummy |
MeanMu |
The log mean used to generate per-gene mean expression levels. |
A SingleCellExperiment object.
BASiCS_MockSCE()
BASiCS_MockSCE()
Produce plots assessing differential expression results
BASiCS_PlotDE(object, ...) ## S4 method for signature 'BASiCS_ResultsDE' BASiCS_PlotDE( object, Plots = c("MA", "Volcano", "Grid"), Parameters = intersect(c("Mean", "Disp", "ResDisp"), names(object@Results)), MuX = TRUE, ... ) ## S4 method for signature 'BASiCS_ResultDE' BASiCS_PlotDE( object, Plots = c("Grid", "MA", "Volcano"), Mu = NULL, TransLogit = FALSE ) ## S4 method for signature 'missing' BASiCS_PlotDE( GroupLabel1, GroupLabel2, ProbThresholds = seq(0.5, 0.9995, by = 0.00025), Epsilon, EFDR, Table, Measure, EFDRgrid, EFNRgrid, ProbThreshold, Mu, TransLogit = FALSE, Plots = c("Grid", "MA", "Volcano") )
BASiCS_PlotDE(object, ...) ## S4 method for signature 'BASiCS_ResultsDE' BASiCS_PlotDE( object, Plots = c("MA", "Volcano", "Grid"), Parameters = intersect(c("Mean", "Disp", "ResDisp"), names(object@Results)), MuX = TRUE, ... ) ## S4 method for signature 'BASiCS_ResultDE' BASiCS_PlotDE( object, Plots = c("Grid", "MA", "Volcano"), Mu = NULL, TransLogit = FALSE ) ## S4 method for signature 'missing' BASiCS_PlotDE( GroupLabel1, GroupLabel2, ProbThresholds = seq(0.5, 0.9995, by = 0.00025), Epsilon, EFDR, Table, Measure, EFDRgrid, EFNRgrid, ProbThreshold, Mu, TransLogit = FALSE, Plots = c("Grid", "MA", "Volcano") )
object |
A BASiCS_ResultsDE or BASiCS_ResultDE object. |
... |
Passed to methods. |
Plots |
Plots plot to produce? Options: "MA", "Volcano", "Grid". |
Parameters |
Character vector specifying the parameter(s) to produce plots for, Available options are "Mean", (mu, mean expression), "Disp" (delta, overdispersion) and "ResDisp" (epsilon, residual overdispersion). |
MuX |
Use Mu (mean expression across both chains) as the X-axis for all MA plots? Default: TRUE. |
Mu , GroupLabel1 , GroupLabel2 , ProbThresholds , Epsilon , EFDR , Table , Measure , EFDRgrid , EFNRgrid , ProbThreshold
|
Internal arguments. |
TransLogit |
Logical scalar controlling whether a logit transform is applied to the posterior probability in the y-axis of volcano plots. As logit(0) and logit(1) are undefined, we clip these values near the range of the data excluding 0 and 1. |
A plot (possibly several combined using
plot_grid
).
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Alan O'Callaghan
data(ChainSC) data(ChainRNA) Test <- BASiCS_TestDE(Chain1 = ChainSC, Chain2 = ChainRNA, GroupLabel1 = 'SC', GroupLabel2 = 'P&S', EpsilonM = log2(1.5), EpsilonD = log2(1.5), OffSet = TRUE) BASiCS_PlotDE(Test)
data(ChainSC) data(ChainRNA) Test <- BASiCS_TestDE(Chain1 = ChainSC, Chain2 = ChainRNA, GroupLabel1 = 'SC', GroupLabel2 = 'P&S', EpsilonM = log2(1.5), EpsilonD = log2(1.5), OffSet = TRUE) BASiCS_PlotDE(Test)
Visualise global offset in mean expression between two
BASiCS_Chain
objects.
BASiCS_PlotOffset( Chain1, Chain2, Type = c("offset estimate", "before-after", "MAPlot"), GroupLabel1 = "Group 1", GroupLabel2 = "Group 2" )
BASiCS_PlotOffset( Chain1, Chain2, Type = c("offset estimate", "before-after", "MAPlot"), GroupLabel1 = "Group 1", GroupLabel2 = "Group 2" )
Chain1 , Chain2
|
BASiCS_Chain objects to be plotted. |
Type |
The type of plot generated.
|
GroupLabel1 , GroupLabel2
|
Labels for Chain1 and Chain2 in the resulting plot(s). |
Plot objects.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Alan O'Callaghan
# Loading two 'BASiCS_Chain' objects (obtained using 'BASiCS_MCMC') data("ChainSC") data("ChainRNA") BASiCS_PlotOffset(ChainSC, ChainRNA)
# Loading two 'BASiCS_Chain' objects (obtained using 'BASiCS_MCMC') data("ChainSC") data("ChainRNA") BASiCS_PlotOffset(ChainSC, ChainRNA)
Plot variance decomposition results.
BASiCS_PlotVarianceDecomp( Decomp, beside = FALSE, nBatch = ((ncol(Decomp) - 2)/3) - 1, main = "Overall variance decomposition", xlabs = if (nBatch == 1) "Overall" else c("Overall", paste("Batch", seq_len(nBatch))), ylab = "% of variance" )
BASiCS_PlotVarianceDecomp( Decomp, beside = FALSE, nBatch = ((ncol(Decomp) - 2)/3) - 1, main = "Overall variance decomposition", xlabs = if (nBatch == 1) "Overall" else c("Overall", paste("Batch", seq_len(nBatch))), ylab = "% of variance" )
Decomp |
The output of |
beside |
If |
nBatch |
Number of batches. |
main |
Plot title. |
xlabs |
x-axis labels. Defaults to "Batch 1", "Batch 2", etc. |
ylab |
y axis label. |
A ggplot object.
Plots of HVG/LVG search.
BASiCS_PlotVG(object, Plot = c("Grid", "VG"), ...)
BASiCS_PlotVG(object, Plot = c("Grid", "VG"), ...)
object |
BASiCS_ResultVG object. |
Plot |
Character scalar specifying the type of plot to be made. Options are "Grid" and "VG". |
... |
Optional graphical parameters passed to |
A plot.
data(ChainSC) # Highly and lowly variable genes detection (within a single group of cells) DetectHVG <- BASiCS_DetectHVG(ChainSC, VarThreshold = 0.60, EFDR = 0.10, Plot = TRUE) BASiCS_PlotVG(DetectHVG)
data(ChainSC) # Highly and lowly variable genes detection (within a single group of cells) DetectHVG <- BASiCS_DetectHVG(ChainSC, VarThreshold = 0.60, EFDR = 0.10, Plot = TRUE) BASiCS_PlotVG(DetectHVG)
This is a convenience function to allow partial specification of prior parameters, and to ensure default parameters are consistent across usage within the package.
BASiCS_PriorParam( Data, k = 12, mu.mu = NULL, s2.mu = 0.5, s2.delta = 0.5, a.delta = 1, b.delta = 1, p.phi = rep(1, times = ncol(Data)), a.s = 1, b.s = 1, a.theta = 1, b.theta = 1, RBFMinMax = TRUE, FixLocations = !is.null(RBFLocations) | !is.na(MinGenesPerRBF), RBFLocations = NULL, MinGenesPerRBF = NA, variance = 1.2, m = numeric(k), V = diag(k), a.sigma2 = 2, b.sigma2 = 2, eta = 5, PriorMu = c("default", "EmpiricalBayes"), PriorDelta = c("log-normal", "gamma"), StochasticRef = TRUE, ConstrainProp = 0.2, GeneExponent = 1, CellExponent = 1 )
BASiCS_PriorParam( Data, k = 12, mu.mu = NULL, s2.mu = 0.5, s2.delta = 0.5, a.delta = 1, b.delta = 1, p.phi = rep(1, times = ncol(Data)), a.s = 1, b.s = 1, a.theta = 1, b.theta = 1, RBFMinMax = TRUE, FixLocations = !is.null(RBFLocations) | !is.na(MinGenesPerRBF), RBFLocations = NULL, MinGenesPerRBF = NA, variance = 1.2, m = numeric(k), V = diag(k), a.sigma2 = 2, b.sigma2 = 2, eta = 5, PriorMu = c("default", "EmpiricalBayes"), PriorDelta = c("log-normal", "gamma"), StochasticRef = TRUE, ConstrainProp = 0.2, GeneExponent = 1, CellExponent = 1 )
Data |
SingleCellExperiment object (required). |
k |
Number of regression terms, including k - 2 Gaussian radial basis functions (GRBFs). |
mu.mu , s2.mu
|
Mean and variance parameters for lognormal prior on mu. |
s2.delta |
Variance parameter for lognormal prior on delta when
|
a.delta , b.delta
|
Parameters for gamma prior on delta when
|
p.phi |
Parameter for dirichlet prior on phi. |
a.s , b.s
|
Parameters for gamma prior on s. |
a.theta , b.theta
|
Parameters for gamma prior on theta. |
RBFMinMax |
Should GRBFs be placed at the minimum and maximum of
|
FixLocations |
Should RBFLocations be fixed throughout MCMC, or adaptive
during burn-in? By default this is |
RBFLocations |
Numeric vector specifying locations of GRBFs in units
of |
MinGenesPerRBF |
Numeric scalar specifying the minimum number of genes
for GRBFs to be retained. If fewer than |
variance |
Variance of the GRBFs. |
m , V
|
Mean and (co)variance priors for regression coefficients. |
a.sigma2 , b.sigma2
|
Priors for inverse gamma prior on regression scale. |
eta |
Degrees of freedom for t distribution of regresion errors. |
PriorMu |
Indicates if the original prior ( |
PriorDelta |
Scalar character specifying the prior type to use for delta overdispersion parameter. Options are "log-normal" (recommended) and "gamma" (used in Vallejos et al. (2015)). |
StochasticRef |
Logical scalar specifying whether the reference gene for the no-spikes version should be chosen randomly at MCMC iterations. |
ConstrainProp |
Proportion of genes to be considered as reference genes
if |
GeneExponent , CellExponent
|
Exponents for gene and cell-specific parameters. These should not be outside of divide and conquer MCMC applications. |
A list containing the prior hyper-parameters that are required to
run the algoritm implemented in BASiCS_MCMC
.
BASiCS_PriorParam(makeExampleBASiCS_Data(), k = 12)
BASiCS_PriorParam(makeExampleBASiCS_Data(), k = 12)
Container of results for a single test (HVG/LVG/DE). This should be an abstract class (but this is R so no) and shouldn't be directly instantiated. Defines a very small amount of common behaviour for BASiCS_ResultDE and BASiCS_ResultVG.
Table
Tabular results for each gene.
Name
The name of the test performed (typically "Mean", "Disp" or "ResDisp")
ProbThreshold
Posterior probability threshold used in differential test.
EFDR,EFNR
Expected false discovery and expected false negative rates for differential test.
Extra
Additional objects for class flexibility.
Container of results for a single differential test.
Table
Tabular results for each gene.
Name
The name of the test performed (typically "Mean", "Disp" or "ResDisp")
GroupLabel1,GroupLabel2
Group labels.
ProbThreshold
Posterior probability threshold used in differential test.
EFDR,EFNR
Expected false discovery and expected false negative rates for differential test.
EFDRgrid,EFNRgrid
Grid of EFDR and EFNR values calculated before thresholds were fixed.
Epsilon
Minimum fold change or difference threshold.
Extra
objects for class flexibility.
Results of BASiCS_TestDE
Results
BASiCS_ResultDE
objects
Chain1,Chain2
BASiCS_Chain
objects.
GroupLabel1,GroupLabel2
Labels for Chain1 and Chain2
Offset
Ratio between median of chains
RowData
Annotation for genes
Extras
Slot for extra information to be added later
Container of results for a single HVG/LVG test.
Method
Character value detailing whether the test performed using
a threshold directly on epsilon values (Method="Epsilon"
),
variance decomposition (Method="Variance"
) or percentiles of epsilon
(Method="Percentile"
).
RowData
Optional DataFrame containing additional information about genes used in the test.
EFDRgrid,EFNRgrid
Grid of EFDR and EFNR values calculated before thresholds were fixed.
Threshold
Threshold used to calculate tail posterior probabilities for the HVG or LVG decision rule.
ProbThresholds
Probability thresholds used to calculate
EFDRGrid
and EFNRGrid
.
ProbThreshold
Posterior probability threshold used in the HVG/LVG decision rule.
Plotting the trend after Bayesian regression using a
BASiCS_Chain
object
BASiCS_ShowFit( object, xlab = "log(mu)", ylab = "log(delta)", pch = 16, smooth = TRUE, variance = 1.2, colour = "dark blue", markExcludedGenes = TRUE, GenesSel = NULL, colourGenesSel = "dark red", Uncertainty = TRUE )
BASiCS_ShowFit( object, xlab = "log(mu)", ylab = "log(delta)", pch = 16, smooth = TRUE, variance = 1.2, colour = "dark blue", markExcludedGenes = TRUE, GenesSel = NULL, colourGenesSel = "dark red", Uncertainty = TRUE )
object |
an object of class |
xlab |
As in |
ylab |
As in |
pch |
As in |
smooth |
Logical to indicate wether the smoothScatter function is used
to plot the scatter plot. Default value |
variance |
Variance used to build GRBFs for regression. Default value
|
colour |
colour used to denote genes within the scatterplot. Only used
when |
markExcludedGenes |
Whether or not lowly expressed genes that were
excluded from the regression fit are included in the scatterplot.
Default value |
GenesSel |
Vector of gene names to be highlighted in the scatterplot.
Only used when |
colourGenesSel |
colour used to denote the genes listed in
|
Uncertainty |
logical indicator. If true, statistical uncertainty around the regression fit is shown in the plot. |
A ggplot2 object
Nils Eling [email protected]
Catalina Vallejos [email protected]
Eling et al (2018). Cell Systems https://doi.org/10.1016/j.cels.2018.06.011
data(ChainRNAReg) BASiCS_ShowFit(ChainRNAReg)
data(ChainRNAReg) BASiCS_ShowFit(ChainRNAReg)
BASiCS_Sim
creates a simulated dataset from the model
implemented in BASiCS.
BASiCS_Sim(Mu, Mu_spikes = NULL, Delta, Phi = NULL, S, Theta, BatchInfo = NULL)
BASiCS_Sim(Mu, Mu_spikes = NULL, Delta, Phi = NULL, S, Theta, BatchInfo = NULL)
Mu |
Gene-specific mean expression parameters |
Mu_spikes |
|
Delta |
Gene-specific biological over-dispersion parameters
|
Phi |
Cell-specific mRNA content normalising parameters |
S |
Cell-specific technical normalising parameters |
Theta |
Technical variability parameter |
BatchInfo |
Vector detailing which batch each cell should be simulated
from. If spike-ins are not in use, the number of unique values contained in
|
An object of class SingleCellExperiment
,
including synthetic data generated by the model implemented in BASiCS.
Catalina A. Vallejos [email protected], Nils Eling
Vallejos, Marioni and Richardson (2015). PLoS Computational Biology.
# Simulated parameter values for 10 genes # (7 biogical and 3 spike-in) measured in 5 cells Mu <- c(8.36, 10.65, 4.88, 6.29, 21.72, 12.93, 30.19) Mu_spikes <- c(1010.72, 7.90, 31.59) Delta <- c(1.29, 0.88, 1.51, 1.49, 0.54, 0.40, 0.85) Phi <- c(1.00, 1.06, 1.09, 1.05, 0.80) S <- c(0.38, 0.40, 0.38, 0.39, 0.34) Theta <- 0.39 # Data with spike-ins, single batch Data <- BASiCS_Sim(Mu, Mu_spikes, Delta, Phi, S, Theta) head(SingleCellExperiment::counts(Data)) dim(SingleCellExperiment::counts(Data)) altExp(Data) rowData(altExp(Data)) # Data with spike-ins, multiple batches BatchInfo <- c(1,1,1,2,2) Theta2 <- rep(Theta, times = 2) Data <- BASiCS_Sim(Mu, Mu_spikes, Delta, Phi, S, Theta2, BatchInfo) # Data without spike-ins, multiple batches Data <- BASiCS_Sim(Mu, Mu_spikes = NULL, Delta, Phi = NULL, S, Theta2, BatchInfo)
# Simulated parameter values for 10 genes # (7 biogical and 3 spike-in) measured in 5 cells Mu <- c(8.36, 10.65, 4.88, 6.29, 21.72, 12.93, 30.19) Mu_spikes <- c(1010.72, 7.90, 31.59) Delta <- c(1.29, 0.88, 1.51, 1.49, 0.54, 0.40, 0.85) Phi <- c(1.00, 1.06, 1.09, 1.05, 0.80) S <- c(0.38, 0.40, 0.38, 0.39, 0.34) Theta <- 0.39 # Data with spike-ins, single batch Data <- BASiCS_Sim(Mu, Mu_spikes, Delta, Phi, S, Theta) head(SingleCellExperiment::counts(Data)) dim(SingleCellExperiment::counts(Data)) altExp(Data) rowData(altExp(Data)) # Data with spike-ins, multiple batches BatchInfo <- c(1,1,1,2,2) Theta2 <- rep(Theta, times = 2) Data <- BASiCS_Sim(Mu, Mu_spikes, Delta, Phi, S, Theta2, BatchInfo) # Data without spike-ins, multiple batches Data <- BASiCS_Sim(Mu, Mu_spikes = NULL, Delta, Phi = NULL, S, Theta2, BatchInfo)
Container of a summary of a BASiCS_Chain
object. In each element of the parameters
slot, first column contains
posterior medians; second and third columns respectively contain the lower
and upper limits of an high posterior density interval (for a given
probability).
parameters
List of parameters in which each entry contains a matrix: first column contains posterior medians, second column contains the lower limits of an high posterior density interval and third column contains the upper limits of high posterior density intervals.
Posterior medians (1st column), lower (2nd column) and upper
(3rd column) limits of gene-specific mean expression parameters .
Posterior medians (1st column), lower (2nd column) and upper
(3rd column) limits of gene-specific biological over-dispersion parameters
, biological genes only
Posterior medians (1st column), lower (2nd column) and upper
(3rd column) limits of cell-specific mRNA content normalisation
parameters
Posterior medians (1st column), lower (2nd column) and upper
(3rd column) limits of cell-specific technical normalisation
parameters
Posterior medians (1st column), lower (2nd column) and upper
(3rd column) limits of cell-specific random effects
Posterior median (1st column), lower (2nd column) and upper
(3rd column) limits of technical over-dispersion parameter(s)
(each row represents one batch)
beta
Posterior median (first column), lower (second column)
and upper (third column) limits of regression coefficients
sigma2
Posterior median (first column), lower (second column)
and upper (third column) limits of residual variance
epsilon
Posterior median (first column), lower (second column)
and upper (third column) limits of gene-specific residual over-dispersion
parameter
# A BASiCS_Summary object created by the Summary method. Data <- makeExampleBASiCS_Data() Chain <- BASiCS_MCMC(Data, N = 100, Thin = 2, Burn = 2, Regression = FALSE) ChainSummary <- Summary(Chain)
# A BASiCS_Summary object created by the Summary method. Data <- makeExampleBASiCS_Data() Chain <- BASiCS_MCMC(Data, N = 100, Thin = 2, Burn = 2, Regression = FALSE) ChainSummary <- Summary(Chain)
'show' method for BASiCS_Summary
objects.
## S4 method for signature 'BASiCS_Summary' show(object)
## S4 method for signature 'BASiCS_Summary' show(object)
object |
A |
Prints a summary of the properties of object
.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
data(ChainSC) show(ChainSC)
data(ChainSC) show(ChainSC)
Function to assess changes in expression between two groups of cells (mean and over-dispersion)
BASiCS_TestDE( Chain1, Chain2, EpsilonM = log2(1.5), EpsilonD = log2(1.5), EpsilonR = log2(1.5)/log2(exp(1)), ProbThresholdM = 2/3, ProbThresholdD = 2/3, ProbThresholdR = 2/3, OrderVariable = c("GeneIndex", "GeneName", "Mu"), GroupLabel1 = "Group1", GroupLabel2 = "Group2", Plot = TRUE, PlotOffset = TRUE, PlotOffsetType = c("offset estimate", "before-after", "MA plot"), Offset = TRUE, EFDR_M = 0.05, EFDR_D = 0.05, EFDR_R = 0.05, GenesSelect = rep(TRUE, ncol(Chain1@parameters[["mu"]])), min.mean = 1, MinESS = 100, ... )
BASiCS_TestDE( Chain1, Chain2, EpsilonM = log2(1.5), EpsilonD = log2(1.5), EpsilonR = log2(1.5)/log2(exp(1)), ProbThresholdM = 2/3, ProbThresholdD = 2/3, ProbThresholdR = 2/3, OrderVariable = c("GeneIndex", "GeneName", "Mu"), GroupLabel1 = "Group1", GroupLabel2 = "Group2", Plot = TRUE, PlotOffset = TRUE, PlotOffsetType = c("offset estimate", "before-after", "MA plot"), Offset = TRUE, EFDR_M = 0.05, EFDR_D = 0.05, EFDR_R = 0.05, GenesSelect = rep(TRUE, ncol(Chain1@parameters[["mu"]])), min.mean = 1, MinESS = 100, ... )
Chain1 |
an object of class |
Chain2 |
an object of class |
EpsilonM |
Minimum fold change tolerance threshold for detecting
changes in overall expression (must be a positive real number).
Default value: |
EpsilonD |
Minimum fold change tolerance threshold for detecting
changes in biological over-dispersion (must be a positive real number).
Default value: |
EpsilonR |
Minimum distance threshold for detecting
changes in residual over-dispersion (must be a positive real number).
Default value: |
ProbThresholdM |
Optional parameter. Probability threshold for detecting
changes in overall expression (must be a positive value, between 0 and 1).
If |
ProbThresholdD |
Optional parameter. Probability threshold for detecting
changes in cell-to-cell biological over-dispersion (must be a positive value,
between 0 and 1). Same usage as |
ProbThresholdR |
Optional parameter. Probability threshold for detecting
changes in residual over-dispersion (must be a positive value, between 0 and
1). Same usage as |
OrderVariable |
Ordering variable for output.
Possible values: |
GroupLabel1 |
Label assigned to reference group.
Default: |
GroupLabel2 |
Label assigned to reference group.
Default: |
Plot |
If |
PlotOffset |
If |
PlotOffsetType |
See argument |
Offset |
Optional argument to remove a fix offset effect (if not
previously removed from the MCMC chains). Default: |
EFDR_M |
Target for expected false discovery rate related to
the comparison of means. If |
EFDR_D |
Target for expected false discovery rate related to
the comparison of dispersions. If |
EFDR_R |
Target for expected false discovery rate related to
the comparison of residual over-dispersions. If |
GenesSelect |
Optional argument to provide a user-defined list
of genes to be considered for the comparison.
Default: |
min.mean |
Minimum mean expression threshold required for inclusion in offset calculation. Similar to 'min.mean' in 'scran::computeSumFactors'. This parameter is only relevant with 'Offset = TRUE'. |
MinESS |
The minimum effective sample size for a gene to be included in the tests for differential expression. This helps to remove genes with poor mixing from differential expression tests. Default is 100. If set to NA, genes are not checked for effective sample size before differential expression tests are performed. |
... |
Optional parameters. |
BASiCS_TestDE
returns an object of class
BASiCS_ResultsDE
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
# Loading two 'BASiCS_Chain' objects (obtained using 'BASiCS_MCMC') data(ChainSC) data(ChainRNA) Test <- BASiCS_TestDE( Chain1 = ChainSC, Chain2 = ChainRNA, GroupLabel1 = "SC", GroupLabel2 = "P&S", EpsilonM = log2(1.5), EpsilonD = log2(1.5), OffSet = TRUE ) # Results for the differential mean test head(as.data.frame(Test, Parameter = "Mean")) # Results for the differential over-dispersion test # This only includes genes marked as 'NoDiff' in Test$TableMean head(as.data.frame(Test, Parameter = "Disp")) # For testing differences in residual over-dispersion, two chains obtained # via 'BASiCS_MCMC(Data, N, Thin, Burn, Regression=TRUE)' need to be provided data(ChainSCReg) data(ChainRNAReg) Test <- BASiCS_TestDE( Chain1 = ChainSCReg, Chain2 = ChainRNAReg, GroupLabel1 = 'SC', GroupLabel2 = 'P&S', EpsilonM = log2(1.5), EpsilonD = log2(1.5), EpsilonR = log2(1.5)/log2(exp(1)), OffSet = TRUE ) ## Plotting the results of these tests BASiCS_PlotDE(Test)
# Loading two 'BASiCS_Chain' objects (obtained using 'BASiCS_MCMC') data(ChainSC) data(ChainRNA) Test <- BASiCS_TestDE( Chain1 = ChainSC, Chain2 = ChainRNA, GroupLabel1 = "SC", GroupLabel2 = "P&S", EpsilonM = log2(1.5), EpsilonD = log2(1.5), OffSet = TRUE ) # Results for the differential mean test head(as.data.frame(Test, Parameter = "Mean")) # Results for the differential over-dispersion test # This only includes genes marked as 'NoDiff' in Test$TableMean head(as.data.frame(Test, Parameter = "Disp")) # For testing differences in residual over-dispersion, two chains obtained # via 'BASiCS_MCMC(Data, N, Thin, Burn, Regression=TRUE)' need to be provided data(ChainSCReg) data(ChainRNAReg) Test <- BASiCS_TestDE( Chain1 = ChainSCReg, Chain2 = ChainRNAReg, GroupLabel1 = 'SC', GroupLabel2 = 'P&S', EpsilonM = log2(1.5), EpsilonD = log2(1.5), EpsilonR = log2(1.5)/log2(exp(1)), OffSet = TRUE ) ## Plotting the results of these tests BASiCS_PlotDE(Test)
Function to decompose total variability of gene expression into biological and technical components.
BASiCS_VarianceDecomp( Chain, OrderVariable = c("BioVarGlobal", "GeneName", "TechVarGlobal", "ShotNoiseGlobal"), Plot = TRUE, main = "Overall variance decomposition", ylab = "% of variance", beside = FALSE, palette = "Set1", legend = c("Biological", "Technical", "Shot noise"), names.arg = if (nBatch == 1) "Overall" else c("Overall", paste("Batch", seq_len(nBatch))) )
BASiCS_VarianceDecomp( Chain, OrderVariable = c("BioVarGlobal", "GeneName", "TechVarGlobal", "ShotNoiseGlobal"), Plot = TRUE, main = "Overall variance decomposition", ylab = "% of variance", beside = FALSE, palette = "Set1", legend = c("Biological", "Technical", "Shot noise"), names.arg = if (nBatch == 1) "Overall" else c("Overall", paste("Batch", seq_len(nBatch))) )
Chain |
an object of class |
OrderVariable |
Ordering variable for output.
Possible values: |
Plot |
If |
main |
Plot title. |
ylab |
y axis label. |
beside |
If |
palette |
Palette to be passed to |
legend |
Labels for variance components. |
names.arg |
X axis labels. |
See vignette
A data.frame
whose first 4 columns correspond to
GeneName
Gene name (as indicated by user)
BioVarGlobal
Percentage of variance explained by a biological component (overall across all cells)
TechVarGlobal
Percentage of variance explained by the technical component (overall across all cells)
ShotNoiseGlobal
Percentage of variance explained by the shot noise component (baseline Poisson noise, overall across all cells)
If more than 1 batch of cells are being analysed, the remaining columns contain the corresponding variance decomposition calculated within each batch.
Catalina A. Vallejos [email protected]
Vallejos, Marioni and Richardson (2015). PLoS Computational Biology.
# For illustration purposes we load a built-in 'BASiCS_Chain' object # (obtained using the 'BASiCS_MCMC' function) data(ChainSC) VD <- BASiCS_VarianceDecomp(ChainSC)
# For illustration purposes we load a built-in 'BASiCS_Chain' object # (obtained using the 'BASiCS_MCMC' function) data(ChainSC) VD <- BASiCS_VarianceDecomp(ChainSC)
Detection method for highly and lowly variable genes using a grid of variance contribution thresholds. Only used when HVG/LVG are found based on the variance decomposition.
BASiCS_VarThresholdSearchVG( Chain, Task = c("HVG", "LVG"), VarThresholdsGrid, EFDR = 0.1, Progress = TRUE ) BASiCS_VarThresholdSearchHVG(...) BASiCS_VarThresholdSearchLVG(...)
BASiCS_VarThresholdSearchVG( Chain, Task = c("HVG", "LVG"), VarThresholdsGrid, EFDR = 0.1, Progress = TRUE ) BASiCS_VarThresholdSearchHVG(...) BASiCS_VarThresholdSearchLVG(...)
Chain |
an object of class |
Task |
See |
VarThresholdsGrid |
Grid of values for the variance contribution threshold (they must be contained in (0,1)) |
EFDR |
Target for expected false discovery rate related to
HVG/LVG detection. Default: |
Progress |
If |
... |
Passed to methods. |
See vignette
BASiCS_VarThresholdSearchHVG
A table displaying the results of highly variable genes detection for different variance contribution thresholds.
BASiCS_VarThresholdSearchLVG
A table displaying the results of lowly variable genes detection for different variance contribution thresholds.
Catalina A. Vallejos [email protected]
Vallejos, Marioni and Richardson (2015). PLoS Computational Biology.
data(ChainSC) BASiCS_VarThresholdSearchHVG(ChainSC, VarThresholdsGrid = seq(0.55,0.65,by=0.01), EFDR = 0.10) BASiCS_VarThresholdSearchLVG(ChainSC, VarThresholdsGrid = seq(0.35,0.45,by=0.01), EFDR = 0.10)
data(ChainSC) BASiCS_VarThresholdSearchHVG(ChainSC, VarThresholdsGrid = seq(0.55,0.65,by=0.01), EFDR = 0.10) BASiCS_VarThresholdSearchLVG(ChainSC, VarThresholdsGrid = seq(0.35,0.45,by=0.01), EFDR = 0.10)
The functions listed here are no longer part of BASiCS
.
## Removed
BASiCS_D_TestDE
has been replaced by BASiCS_TestDE
.
## Removed
BASiCS_D_TestDE()
Catalina A. Vallejos [email protected]
Small extract (75 MCMC iterations, 350 randomly selected genes) from the chain obtained for the pool-and-split samples (this corresponds to the RNA 2i samples in Grun et al, 2014).
ChainRNA
ChainRNA
An object of class BASiCS_Chain
containing 75 MCMC iterations.
Grun, Kester and van Oudenaarden (2014). Nature Methods.
Small extract (75 MCMC iterations, 350 randomly selected genes) from the chain obtained for the pool-and-split samples (this corresponds to the RNA 2i samples in Grun et al, 2014).
ChainRNAReg
ChainRNAReg
An object of class BASiCS_Chain
containing 75 MCMC iterations.
Grun, Kester and van Oudenaarden (2014). Nature Methods.
Small extract (75 MCMC iterations, 350 randomly selected genes) from the chain obtained for the pool-and-split samples (this corresponds to the SC 2i samples in Grun et al, 2014).
ChainSC
ChainSC
An object of class BASiCS_Chain
containing 75 MCMC iterations.
Grun, Kester and van Oudenaarden (2014). Nature Methods.
Small extract (75 MCMC iterations, 350 randomly selected genes) from the chain obtained for the pool-and-split samples (this corresponds to the SC 2i samples in Grun et al, 2014).
ChainSCReg
ChainSCReg
An object of class BASiCS_Chain
containing 75 MCMC iterations.
Grun, Kester and van Oudenaarden (2014). Nature Methods.
Returns the dimensions (genes x cells) of a
BASiCS_Chain
## S4 method for signature 'BASiCS_Chain' dim(x)
## S4 method for signature 'BASiCS_Chain' dim(x)
x |
A |
An vector of dimensions
Catalina A. Vallejos [email protected]
data(ChainSC) dimnames(ChainSC)
data(ChainSC) dimnames(ChainSC)
Returns the dimension names (genes x cells) of a
BASiCS_Chain
## S4 method for signature 'BASiCS_Chain' dimnames(x)
## S4 method for signature 'BASiCS_Chain' dimnames(x)
x |
A |
A list of two elements: (1) a vector of gene names and (2) a vector of cell names.
Catalina A. Vallejos [email protected]
data(ChainSC) dimnames(ChainSC)
data(ChainSC) dimnames(ChainSC)
Accessors for the slots of a BASiCS_Chain
## S4 method for signature 'BASiCS_Chain' displayChainBASiCS(object, Parameter = "mu")
## S4 method for signature 'BASiCS_Chain' displayChainBASiCS(object, Parameter = "mu")
object |
an object of class |
Parameter |
Name of the slot to be used for the accessed.
Possible values: |
The requested slot of a BASiCS_Chain
object
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
help(BASiCS_MCMC)
help(BASiCS_MCMC)
BASiCS_Summary
objectAccessors for the slots of a BASiCS_Summary
object
## S4 method for signature 'BASiCS_Summary' displaySummaryBASiCS(object, Parameter = "mu")
## S4 method for signature 'BASiCS_Summary' displaySummaryBASiCS(object, Parameter = "mu")
object |
an object of class |
Parameter |
Name of the slot to be used for the accessed.
Possible values: |
The requested slot of a BASiCS_Summary
object
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
help(BASiCS_MCMC)
help(BASiCS_MCMC)
Methods for formatting BASiCS_Result and BASiCS_ResultsDE objects.
## S4 method for signature 'BASiCS_ResultsDE' format(x, Parameter, Filter = TRUE, ProbThreshold = NULL, ...) ## S4 method for signature 'BASiCS_ResultDE' format(x, Filter = TRUE, ProbThreshold = NULL, ...) ## S4 method for signature 'BASiCS_ResultVG' format(x, Filter = TRUE, ProbThreshold = NULL, ...)
## S4 method for signature 'BASiCS_ResultsDE' format(x, Parameter, Filter = TRUE, ProbThreshold = NULL, ...) ## S4 method for signature 'BASiCS_ResultDE' format(x, Filter = TRUE, ProbThreshold = NULL, ...) ## S4 method for signature 'BASiCS_ResultVG' format(x, Filter = TRUE, ProbThreshold = NULL, ...)
x |
Object being subsetted. |
Parameter |
Character scalar indicating which of the BASiCS_Result should be formatted. |
Filter |
Logical scalar indicating whether results should be
filtered based on differential expression or HVG/LVG status if
|
ProbThreshold |
Probability threshold to be used if |
... |
Passed to |
A data.frame
.
A synthetic SingleCellExperiment
object is
generated by simulating a dataset from the model underlying BASiCS. This is
used to illustrate BASiCS in some of the package and vignette examples.
makeExampleBASiCS_Data(WithBatch = FALSE, WithSpikes = TRUE)
makeExampleBASiCS_Data(WithBatch = FALSE, WithSpikes = TRUE)
WithBatch |
If |
WithSpikes |
If |
Note: In BASiCS versions < 1.5.22, makeExampleBASiCS_Data
used a fixed seed within the function. This has been removed to comply with
Bioconductor policies. If a reproducible example is required, please use
set.seed
prior to calling makeExampleBASiCS_Data
.
An object of class SingleCellExperiment
, with
synthetic data simulated from the model implemented in BASiCS.
If WithSpikes = TRUE
, it contains 70 genes (50 biological and
20 spike-in) and 30 cells. Alternatively, it contains 50 biological
genes and 30 cells.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Data <- makeExampleBASiCS_Data() is(Data, 'SingleCellExperiment')
Data <- makeExampleBASiCS_Data() is(Data, 'SingleCellExperiment')
BASiCS_Chain
creates a BASiCS_Chain
object from pre-computed MCMC chains.
newBASiCS_Chain(parameters)
newBASiCS_Chain(parameters)
parameters |
List of matrices containing MCMC chains for each model parameter.
. This parameter is only used when spike-in genes are available.
|
An object of class BASiCS_Chain
.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
Data <- makeExampleBASiCS_Data() # No regression model Chain <- BASiCS_MCMC(Data, N = 50, Thin = 5, Burn = 5, Regression = FALSE) ChainMu <- displayChainBASiCS(Chain, 'mu') ChainDelta <- displayChainBASiCS(Chain, 'delta') ChainPhi <- displayChainBASiCS(Chain, 'phi') ChainS <- displayChainBASiCS(Chain, 's') ChainNu <- displayChainBASiCS(Chain, 'nu') ChainTheta <- displayChainBASiCS(Chain, 'theta') ChainNew <- newBASiCS_Chain(parameters = list(mu = ChainMu, delta = ChainDelta, phi = ChainPhi, s = ChainS, nu = ChainNu, theta = ChainTheta)) # No regression model Chain <- BASiCS_MCMC(Data, N = 50, Thin = 5, Burn = 5, Regression = TRUE) ChainMu <- displayChainBASiCS(Chain, 'mu') ChainDelta <- displayChainBASiCS(Chain, 'delta') ChainPhi <- displayChainBASiCS(Chain, 'phi') ChainS <- displayChainBASiCS(Chain, 's') ChainNu <- displayChainBASiCS(Chain, 'nu') ChainTheta <- displayChainBASiCS(Chain, 'theta') ChainBeta <- displayChainBASiCS(Chain, 'beta') ChainSigma2 <- displayChainBASiCS(Chain, 'sigma2') ChainEpsilon <- displayChainBASiCS(Chain, 'epsilon') ChainNew <- newBASiCS_Chain(parameters = list(mu = ChainMu, delta = ChainDelta, phi = ChainPhi, s = ChainS, nu = ChainNu, theta = ChainTheta, beta = ChainBeta, sigma2 = ChainSigma2, epsilon = ChainEpsilon))
Data <- makeExampleBASiCS_Data() # No regression model Chain <- BASiCS_MCMC(Data, N = 50, Thin = 5, Burn = 5, Regression = FALSE) ChainMu <- displayChainBASiCS(Chain, 'mu') ChainDelta <- displayChainBASiCS(Chain, 'delta') ChainPhi <- displayChainBASiCS(Chain, 'phi') ChainS <- displayChainBASiCS(Chain, 's') ChainNu <- displayChainBASiCS(Chain, 'nu') ChainTheta <- displayChainBASiCS(Chain, 'theta') ChainNew <- newBASiCS_Chain(parameters = list(mu = ChainMu, delta = ChainDelta, phi = ChainPhi, s = ChainS, nu = ChainNu, theta = ChainTheta)) # No regression model Chain <- BASiCS_MCMC(Data, N = 50, Thin = 5, Burn = 5, Regression = TRUE) ChainMu <- displayChainBASiCS(Chain, 'mu') ChainDelta <- displayChainBASiCS(Chain, 'delta') ChainPhi <- displayChainBASiCS(Chain, 'phi') ChainS <- displayChainBASiCS(Chain, 's') ChainNu <- displayChainBASiCS(Chain, 'nu') ChainTheta <- displayChainBASiCS(Chain, 'theta') ChainBeta <- displayChainBASiCS(Chain, 'beta') ChainSigma2 <- displayChainBASiCS(Chain, 'sigma2') ChainEpsilon <- displayChainBASiCS(Chain, 'epsilon') ChainNew <- newBASiCS_Chain(parameters = list(mu = ChainMu, delta = ChainDelta, phi = ChainPhi, s = ChainS, nu = ChainNu, theta = ChainTheta, beta = ChainBeta, sigma2 = ChainSigma2, epsilon = ChainEpsilon))
newBASiCS_Data
creates a
SingleCellExperiment
object from a matrix of expression
counts and experimental information about spike-in genes.
newBASiCS_Data( Counts, Tech = rep(FALSE, nrow(Counts)), SpikeInfo = NULL, BatchInfo = NULL, SpikeType = "ERCC" )
newBASiCS_Data( Counts, Tech = rep(FALSE, nrow(Counts)), SpikeInfo = NULL, BatchInfo = NULL, SpikeType = "ERCC" )
Counts |
Matrix of dimensions |
Tech |
Logical vector of length |
SpikeInfo |
|
BatchInfo |
Vector of length |
SpikeType |
Character to indicate what type of spike-ins are in use.
Default value: |
An object of class SingleCellExperiment
.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
'plot' method for BASiCS_Chain
objects
## S4 method for signature 'BASiCS_Chain,ANY' plot( x, Parameter = "mu", Gene = NULL, Cell = NULL, Batch = 1, RegressionTerm = NULL, ... )
## S4 method for signature 'BASiCS_Chain,ANY' plot( x, Parameter = "mu", Gene = NULL, Cell = NULL, Batch = 1, RegressionTerm = NULL, ... )
x |
A |
Parameter |
Name of the slot to be used for the plot.
Possible values: |
Gene |
Specifies which gene is requested.
Required only if |
Cell |
Specifies which cell is requested.
Required only if |
Batch |
Specifies which batch is requested.
Required only if |
RegressionTerm |
Specifies which regression coefficient is requested.
Required only if |
... |
Unused. |
A plot object
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
help(BASiCS_MCMC)
help(BASiCS_MCMC)
'plot' method for BASiCS_Summary
objects
## S4 method for signature 'BASiCS_Summary,ANY' plot( x, Param = "mu", Param2 = NULL, Genes = NULL, Cells = NULL, Batches = NULL, RegressionTerms = NULL, xlab = "", ylab = "", xlim = "", ylim = NULL, pch = 16, col = "blue", bty = "n", SmoothPlot = TRUE, ... )
## S4 method for signature 'BASiCS_Summary,ANY' plot( x, Param = "mu", Param2 = NULL, Genes = NULL, Cells = NULL, Batches = NULL, RegressionTerms = NULL, xlab = "", ylab = "", xlim = "", ylim = NULL, pch = 16, col = "blue", bty = "n", SmoothPlot = TRUE, ... )
x |
A |
Param |
Name of the slot to be used for the plot.
Possible values: |
Param2 |
Name of the second slot to be used for the plot.
Possible values: |
Genes |
Specifies which genes are requested.
Required only if |
Cells |
Specifies which cells are requested.
Required only if |
Batches |
Specifies which batches are requested.
Required only if |
RegressionTerms |
Specifies which regression coefficients are requested.
Required only if |
xlab |
As in |
ylab |
As in |
xlim |
As in |
ylim |
As in |
pch |
As in |
col |
As in |
bty |
As in |
SmoothPlot |
Logical parameter. If |
... |
Other graphical parameters (see |
A plot object
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
help(BASiCS_MCMC)
help(BASiCS_MCMC)
rowData getter and setter for BASiCS_ResultsDE and BASiCS_ResultVG objects.
## S4 method for signature 'BASiCS_ResultsDE' rowData(x) ## S4 replacement method for signature 'BASiCS_ResultsDE' rowData(x) <- value ## S4 method for signature 'BASiCS_ResultVG' rowData(x) ## S4 replacement method for signature 'BASiCS_ResultVG' rowData(x) <- value
## S4 method for signature 'BASiCS_ResultsDE' rowData(x) ## S4 replacement method for signature 'BASiCS_ResultsDE' rowData(x) <- value ## S4 method for signature 'BASiCS_ResultVG' rowData(x) ## S4 replacement method for signature 'BASiCS_ResultVG' rowData(x) <- value
x |
BASiCS_ResultVG or BASiCS_ResultsDE object. |
value |
New |
For the getter, a DFrame. For setter, the modified
x
.
BASiCS_ResultDE
objectAccessors for the slots of a BASiCS_ResultDE
object
## S4 method for signature 'BASiCS_ResultDE' show(object)
## S4 method for signature 'BASiCS_ResultDE' show(object)
object |
an object of class |
Prints a summary of the properties of object
.
help(BASiCS_MCMC)
help(BASiCS_MCMC)
BASiCS_ResultsDE
objectAccessors for the slots of a BASiCS_ResultsDE
object
## S4 method for signature 'BASiCS_ResultsDE' show(object)
## S4 method for signature 'BASiCS_ResultsDE' show(object)
object |
an object of class |
Prints a summary of the properties of object
.
help(BASiCS_MCMC)
help(BASiCS_MCMC)
BASiCS_ResultVG
objectAccessors for the slots of a BASiCS_ResultVG
object
## S4 method for signature 'BASiCS_ResultVG' show(object)
## S4 method for signature 'BASiCS_ResultVG' show(object)
object |
an object of class |
Prints a summary of the properties of object
.
help(BASiCS_MCMC)
help(BASiCS_MCMC)
This can be used to extract a subset of a 'BASiCS_Chain' object. The subset can contain specific genes, cells or MCMC iterations
## S4 method for signature 'BASiCS_Chain' subset(x, Genes = NULL, Cells = NULL, Iterations = NULL)
## S4 method for signature 'BASiCS_Chain' subset(x, Genes = NULL, Cells = NULL, Iterations = NULL)
x |
A |
Genes , Cells
|
A vector of characters, logical values, or numbers, indicating which cells or genes will be extracted. |
Iterations |
Numeric vector of positive integers indicating which MCMC
iterations will be extracted. The maximum value in |
An object of class BASiCS_Chain
.
Catalina A. Vallejos [email protected]
data(ChainSC) # Extracts 3 first genes ChainSC1 <- subset(ChainSC, Genes = rownames(ChainSC)[1:3]) # Extracts 3 first cells ChainSC2 <- subset(ChainSC, Cells = colnames(ChainSC)[1:3]) # Extracts 10 first iterations ChainSC3 <- subset(ChainSC, Iterations = 1:10)
data(ChainSC) # Extracts 3 first genes ChainSC1 <- subset(ChainSC, Genes = rownames(ChainSC)[1:3]) # Extracts 3 first cells ChainSC2 <- subset(ChainSC, Cells = colnames(ChainSC)[1:3]) # Extracts 10 first iterations ChainSC3 <- subset(ChainSC, Iterations = 1:10)
For each of the BASiCS parameters (see Vallejos et al 2015),
Summary
returns the corresponding postior medians and limits of
the high posterior density interval (probabilty equal to prob
)
## S4 method for signature 'BASiCS_Chain' Summary(x, ..., prob = 0.95, na.rm = FALSE)
## S4 method for signature 'BASiCS_Chain' Summary(x, ..., prob = 0.95, na.rm = FALSE)
x |
A |
... |
Unused, only included for consistency with the generic. |
prob |
|
na.rm |
Unused, only included for consistency with the generic. |
An object of class BASiCS_Summary
.
Catalina A. Vallejos [email protected]
Nils Eling [email protected]
data(ChainSC) SummarySC <- Summary(ChainSC)
data(ChainSC) SummarySC <- Summary(ChainSC)