Title: | trajectory-based differential expression analysis for sequencing data |
---|---|
Description: | tradeSeq provides a flexible method for fitting regression models that can be used to find genes that are differentially expressed along one or multiple lineages in a trajectory. Based on the fitted models, it uses a variety of tests suited to answer different questions of interest, e.g. the discovery of genes for which expression is associated with pseudotime, or which are differentially expressed (in a specific region) along the trajectory. It fits a negative binomial generalized additive model (GAM) for each gene, and performs inference on the parameters of the GAM. |
Authors: | Koen Van den Berge [aut], Hector Roux de Bezieux [aut, cre] , Kelly Street [aut, ctb], Lieven Clement [aut, ctb], Sandrine Dudoit [ctb] |
Maintainer: | Hector Roux de Bezieux <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.21.0 |
Built: | 2024-10-31 06:33:32 UTC |
Source: | https://github.com/bioc/tradeSeq |
This test assesses whether average gene expression is associated with pseudotime.
associationTest(models, ...) ## S4 method for signature 'SingleCellExperiment' associationTest( models, global = TRUE, lineages = FALSE, l2fc = 0, nPoints = 2 * tradeSeq::nknots(models), contrastType = "start", inverse = ifelse(l2fc == 0, "Chol", "eigen") ) ## S4 method for signature 'list' associationTest(models, global = TRUE, lineages = FALSE, l2fc = 0)
associationTest(models, ...) ## S4 method for signature 'SingleCellExperiment' associationTest( models, global = TRUE, lineages = FALSE, l2fc = 0, nPoints = 2 * tradeSeq::nknots(models), contrastType = "start", inverse = ifelse(l2fc == 0, "Chol", "eigen") ) ## S4 method for signature 'list' associationTest(models, global = TRUE, lineages = FALSE, l2fc = 0)
models |
The fitted GAMs, typically the output from
|
... |
parameters including: |
global |
If TRUE, test for all lineages simultaneously. |
lineages |
If TRUE, test for all lineages independently. |
l2fc |
The log2 fold change threshold to test against. Note, that this will affect both the global test and the pairwise comparisons. |
nPoints |
The number of points used per lineage to set up the contrast. Defaults to 2 times the number of knots. Note that not all points may end up being actually used in the inference; only linearly independent contrasts will be used. |
contrastType |
The contrast used to impose the log2 fold-change
threshold. Defaults to |
inverse |
Method to use to invert variance-covariance matrix of contrasts.
Usually you do not want to change this. Options are
- |
If a log2 fold-change threshold has not been set, we use the QR decompositon
through qr.solve
to invert the variance-covariance matrix of the
contrasts. If instead a log2 fold chalnge-threshold has been set, we invert
that matrix using the Cholesky decomposition through chol2inv
.
A matrix with the wald statistic, the
degrees of freedom and the (unadjusted) p-value
associated with each gene for all the tests performed. If the testing
procedure was unsuccessful for a particular gene, NA
values will be
returned for that gene.
set.seed(8) data(crv, package="tradeSeq") data(countMatrix, package="tradeSeq") sce <- fitGAM(counts = as.matrix(countMatrix), sds = crv, nknots = 5) assocRes <- associationTest(sce)
set.seed(8) data(crv, package="tradeSeq") data(countMatrix, package="tradeSeq") sce <- fitGAM(counts = as.matrix(countMatrix), sds = crv, nknots = 5) assocRes <- associationTest(sce)
This object contains a vector that define the cell type for each cell in the data described in Paul et al. (2015).
data(celltype)
data(celltype)
An object of class character
of length 2660.
#' @references Franziska Paul, Yaara Arkin, Amir Giladi, DiegoAdhemar Jaitin, Ephraim Kenigsberg, Hadas KerenShaul, Deborah Winter, David Lara-Astiaso, Meital Gury, Assaf Weiner, Eyal David, Nadav Cohen, FeliciaKathrineBratt Lauridsen, Simon Haas, Andreas Schlitzer, Alexander Mildner, Florent Ginhoux, Steen Jung, Andreas Trumpp, BoTorben Porse, Amos Tanay, and Ido Amit. Transcriptional Heterogeneity and Lineage Commitment in Myeloid Progenitors. Cell, 163(7):1663–1677, 12 2015. ISSN 0092- 8674. doi: 10.1016/J.CELL.2015.11.013. URL https://www.sciencedirect.com/science/article/ ii/S0092867415014932?via
Cluster genes in clusters that have similar expression patterns
along all lineages in the trajectory. By default, this function uses the
clusterExperiment
package to do the clustering. If another clustering
method is of interest, one can extract fitted values to use for clustering,
see details in the vignette.
## S4 method for signature 'SingleCellExperiment' clusterExpressionPatterns( models, nPoints, genes, reduceMethod = "PCA", nReducedDims = 10, minSizes = 6, ncores = 1, random.seed = 176201, verbose = TRUE, ... ) ## S4 method for signature 'list' clusterExpressionPatterns( models, nPoints, genes, reduceMethod = "PCA", nReducedDims = 10, minSizes = 6, ncores = 1, random.seed = 176201, verbose = TRUE, ... )
## S4 method for signature 'SingleCellExperiment' clusterExpressionPatterns( models, nPoints, genes, reduceMethod = "PCA", nReducedDims = 10, minSizes = 6, ncores = 1, random.seed = 176201, verbose = TRUE, ... ) ## S4 method for signature 'list' clusterExpressionPatterns( models, nPoints, genes, reduceMethod = "PCA", nReducedDims = 10, minSizes = 6, ncores = 1, random.seed = 176201, verbose = TRUE, ... )
models |
The fitted GAMs, typically the output from
|
nPoints |
The number of points to use for clustering the expression patterns. |
genes |
A numerical or character vector specifying the genes from
|
reduceMethod |
Dimensionality reduction method used before running the
clustering methods. Passed to |
nReducedDims |
Number of dimensions kept after |
minSizes |
Minimum size of clusters.
Passed to |
ncores |
Number of cores to use. Passed to
|
random.seed |
Passed to |
verbose |
Passed to |
... |
Additional arguments to be passed to
|
This method adopts the RSEC
function from the clusterExperiment package to perform consensus clustering.
A list containing the scaled fitted values yhatScaled
(for
plotting) and a clusterExperiment
object, containing the
clustering results.
## Not run: data(gamList, package = "tradeSeq") clusterExpressionPatterns(models = gamList, nPoints = 200, genes = seq_len(11), verbose = FALSE) ## End(Not run)
## Not run: data(gamList, package = "tradeSeq") clusterExpressionPatterns(models = gamList, nPoints = 200, genes = seq_len(11), verbose = FALSE) ## End(Not run)
Assess differential expression patterns between conditions within a lineage.
Assess differential expression patterns between conditions within a lineage.
conditionTest(models, ...) ## S4 method for signature 'SingleCellExperiment' conditionTest( models, global = TRUE, pairwise = FALSE, lineages = FALSE, knots = NULL, l2fc = 0, eigenThresh = 0.01 )
conditionTest(models, ...) ## S4 method for signature 'SingleCellExperiment' conditionTest( models, global = TRUE, pairwise = FALSE, lineages = FALSE, knots = NULL, l2fc = 0, eigenThresh = 0.01 )
models |
The fitted GAMs, typically the output from
|
... |
parameters including: |
global |
If TRUE, test for all pairwise comparisons simultaneously, i.e. test for DE between all conditions in all lineages. |
pairwise |
If TRUE, return output for all comparisons between pairs of conditions.
Both |
lineages |
If TRUE, return output for all comparisons within each lineage.
Both |
knots |
Default to NULL. Otherwise, a vector of length 2 specifying the smallest and largest knots that are contrasted between conditions. |
l2fc |
The log2 fold change threshold to test against. Note, that this will affect both the global test and the pairwise comparisons. |
eigenThresh |
Eigenvalue threshold for inverting the variance-covariance matrix
of the coefficients to use for calculating the Wald test statistics. Lower values
are more lenient to adding more information but also decrease computational stability.
This argument should in general not be changed by the user but is provided
for back-compatability. Set to |
A matrix with the wald statistic, the number of degrees of freedom and the p-value associated with each gene for all the tests performed.
## artificial example data(crv, package = "tradeSeq") data("countMatrix", package = "tradeSeq") conditions <- factor(sample(1:2, size = ncol(countMatrix), replace = TRUE)) sce <- fitGAM(as.matrix(countMatrix), sds = crv, conditions = conditions) res <- conditionTest(sce)
## artificial example data(crv, package = "tradeSeq") data("countMatrix", package = "tradeSeq") conditions <- factor(sample(1:2, size = ncol(countMatrix), replace = TRUE)) sce <- fitGAM(as.matrix(countMatrix), sds = crv, conditions = conditions) res <- conditionTest(sce)
This object contains the gene expression counts from the data described in Paul et al. (2015).
data(countMatrix)
data(countMatrix)
An object of class dgCMatrix
with 240 rows and 2660 columns.
#' @references Franziska Paul, Yaara Arkin, Amir Giladi, DiegoAdhemar Jaitin, Ephraim Kenigsberg, Hadas KerenShaul, Deborah Winter, David Lara-Astiaso, Meital Gury, Assaf Weiner, Eyal David, Nadav Cohen, FeliciaKathrineBratt Lauridsen, Simon Haas, Andreas Schlitzer, Alexander Mildner, Florent Ginhoux, Steen Jung, Andreas Trumpp, BoTorben Porse, Amos Tanay, and Ido Amit. Transcriptional Heterogeneity and Lineage Commitment in Myeloid Progenitors. Cell, 163(7):1663–1677, 12 2015. ISSN 0092- 8674. doi: 10.1016/J.CELL.2015.11.013. URL https://www.sciencedirect.com/science/article/ ii/S0092867415014932?via
This dataset contains the Slingshot trajectory from the data described in Paul et al. (2015).
data(crv)
data(crv)
An object of class SlingshotDataSet
of length 1.
Franziska Paul, Yaara Arkin, Amir Giladi, DiegoAdhemar Jaitin, Ephraim Kenigsberg, Hadas KerenShaul, Deborah Winter, David Lara-Astiaso, Meital Gury, Assaf Weiner, Eyal David, Nadav Cohen, FeliciaKathrineBratt Lauridsen, Simon Haas, Andreas Schlitzer, Alexander Mildner, Florent Ginhoux, Steen Jung, Andreas Trumpp, BoTorben Porse, Amos Tanay, and Ido Amit. Transcriptional Heterogeneity and Lineage Commitment in Myeloid Progenitors. Cell, 163(7):1663–1677, 12 2015. ISSN 0092- 8674. doi: 10.1016/J.CELL.2015.11.013. URL https://www.sciencedirect.com/science/article/ ii/S0092867415014932?via
Assess differential expression between the average expression at the end points of lineages of a trajectory.
diffEndTest(models, ...) ## S4 method for signature 'SingleCellExperiment' diffEndTest(models, global = TRUE, pairwise = FALSE, l2fc = 0) ## S4 method for signature 'list' diffEndTest(models, global = TRUE, pairwise = FALSE, l2fc = 0)
diffEndTest(models, ...) ## S4 method for signature 'SingleCellExperiment' diffEndTest(models, global = TRUE, pairwise = FALSE, l2fc = 0) ## S4 method for signature 'list' diffEndTest(models, global = TRUE, pairwise = FALSE, l2fc = 0)
models |
The fitted GAMs, typically the output from
|
... |
parameters including: |
global |
If TRUE, test for all pairwise comparisons simultaneously. |
pairwise |
If TRUE, test for all pairwise comparisons independently. |
l2fc |
The log2 fold change threshold to test against. Note, that this will affect both the global test and the pairwise comparisons. |
The l2fc
argument allows to test against a particular fold change
threshold. For example, if the interest lies in discovering genes that are
differentially expressed with an absolute log2 fold change cut off above 1,
i.e. a fold change of at least 2, then one can test for this by setting
l2fc=1
as argument to the function.
A matrix with the wald statistic, the number of df and the p-value associated with each gene for all the tests performed. Also, for each possible pairwise comparision, the observed log fold changes. If the testing procedure was unsuccessful, the procedure will return NA test statistics, fold changes and p-values.
data(gamList, package = "tradeSeq") diffEndTest(gamList, global = TRUE, pairwise = TRUE)
data(gamList, package = "tradeSeq") diffEndTest(gamList, global = TRUE, pairwise = TRUE)
Perform test of differential expression patterns between lineages in a user-defined region based on the knots of the smoothers.
earlyDETest(models, ...) ## S4 method for signature 'SingleCellExperiment' earlyDETest( models, global = TRUE, pairwise = FALSE, knots = NULL, nPoints = 2 * nknots(models), l2fc = 0, eigenThresh = 0.01 ) ## S4 method for signature 'list' earlyDETest( models, global = TRUE, pairwise = FALSE, knots = NULL, nPoints = 2 * nknots(models), l2fc = 0, eigenThresh = 0.01 )
earlyDETest(models, ...) ## S4 method for signature 'SingleCellExperiment' earlyDETest( models, global = TRUE, pairwise = FALSE, knots = NULL, nPoints = 2 * nknots(models), l2fc = 0, eigenThresh = 0.01 ) ## S4 method for signature 'list' earlyDETest( models, global = TRUE, pairwise = FALSE, knots = NULL, nPoints = 2 * nknots(models), l2fc = 0, eigenThresh = 0.01 )
models |
The fitted GAMs, typically the output from
|
... |
parameters including: |
global |
If TRUE, test for all pairwise comparisons simultaneously. |
pairwise |
If TRUE, test for all pairwise comparisons independently. |
knots |
A vector of length 2 specifying the knots at the start and end of the region of interest. |
nPoints |
The number of points to be compared between lineages. Defaults to twice the number of knots |
l2fc |
The log2 fold change threshold to test against. Note, that this will affect both the global test and the pairwise comparisons. |
eigenThresh |
Eigenvalue threshold for inverting the variance-covariance matrix
of the coefficients to use for calculating the Wald test statistics. Lower values
are more lenient to adding more information but also decrease computational stability.
This argument should in general not be changed by the user but is provided
for back-compatability. Set to |
To help the user in choosing which knots to use when defining the
branching, the plotGeneCount
function has a models optional
parameter that can be used to visualize where the knots are.
A matrix with the wald statistic, the number of df and the p-value associated with each gene for all the tests performed. Also, for each possible pairwise comparision, the observed log fold changes. If the testing procedure was unsuccessful, the procedure will return NA test statistics, fold changes and p-values.
data(gamList, package = "tradeSeq") earlyDETest(gamList, knots = c(1, 2), global = TRUE, pairwise = TRUE)
data(gamList, package = "tradeSeq") earlyDETest(gamList, knots = c(1, 2), global = TRUE, pairwise = TRUE)
Evaluate the optimal number of knots required for fitGAM.
Evaluate an appropriate number of knots.
evaluateK(counts, ...) ## S4 method for signature 'matrix' evaluateK( counts, k = 3:10, nGenes = 500, sds = NULL, pseudotime = NULL, cellWeights = NULL, U = NULL, conditions = NULL, plot = TRUE, weights = NULL, offset = NULL, aicDiff = 2, verbose = TRUE, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), family = "nb", gcv = FALSE, ... ) ## S4 method for signature 'dgCMatrix' evaluateK( counts, k = 3:10, nGenes = 500, sds = NULL, pseudotime = NULL, cellWeights = NULL, plot = TRUE, U = NULL, weights = NULL, offset = NULL, aicDiff = 2, verbose = TRUE, conditions = NULL, control = mgcv::gam.control(), parallel = FALSE, BPPARAM = BiocParallel::bpparam(), family = "nb", gcv = FALSE, ... ) ## S4 method for signature 'SingleCellExperiment' evaluateK( counts, k = 3:10, nGenes = 500, sds = NULL, pseudotime = NULL, cellWeights = NULL, plot = TRUE, U = NULL, weights = NULL, offset = NULL, aicDiff = 2, verbose = TRUE, conditions = NULL, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), family = "nb", gcv = FALSE, ... )
evaluateK(counts, ...) ## S4 method for signature 'matrix' evaluateK( counts, k = 3:10, nGenes = 500, sds = NULL, pseudotime = NULL, cellWeights = NULL, U = NULL, conditions = NULL, plot = TRUE, weights = NULL, offset = NULL, aicDiff = 2, verbose = TRUE, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), family = "nb", gcv = FALSE, ... ) ## S4 method for signature 'dgCMatrix' evaluateK( counts, k = 3:10, nGenes = 500, sds = NULL, pseudotime = NULL, cellWeights = NULL, plot = TRUE, U = NULL, weights = NULL, offset = NULL, aicDiff = 2, verbose = TRUE, conditions = NULL, control = mgcv::gam.control(), parallel = FALSE, BPPARAM = BiocParallel::bpparam(), family = "nb", gcv = FALSE, ... ) ## S4 method for signature 'SingleCellExperiment' evaluateK( counts, k = 3:10, nGenes = 500, sds = NULL, pseudotime = NULL, cellWeights = NULL, plot = TRUE, U = NULL, weights = NULL, offset = NULL, aicDiff = 2, verbose = TRUE, conditions = NULL, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), family = "nb", gcv = FALSE, ... )
counts |
The count matrix, genes in rows and cells in columns. |
... |
parameters including: |
k |
The range of knots to evaluate. '3:10' by default. See details. |
nGenes |
The number of genes to use in the evaluation. Genes will be randomly selected. 500 by default. |
sds |
Slingshot object containing the lineages. |
pseudotime |
a matrix of pseudotime values, each row represents a cell and each column represents a lineage. |
cellWeights |
a matrix of cell weights defining the probability that a cell belongs to a particular lineage. Each row represents a cell and each column represents a lineage. |
U |
The design matrix of fixed effects. The design matrix should not contain an intercept to ensure identifiability. |
conditions |
This argument is in beta phase and should be used carefully. If each lineage consists of multiple conditions, this argument can be used to specify the conditions. tradeSeq will then fit a condition-specific smoother for every lineage. |
plot |
Whether to display diagnostic plots. Default to |
weights |
Optional: a matrix of weights with identical dimensions
as the |
offset |
Optional: the offset, on log-scale. If NULL, TMM is used to
account for differences in sequencing depth, see |
aicDiff |
Used for selecting genes with significantly varying AIC values over the range of evaluated knots to make the barplot output. Default is set to 2, meaning that only genes whose AIC range is larger than 2 will be used to check for the optimal number of knots through the barplot visualization that is part of the output of this function. |
verbose |
logical, should progress be verbose? |
parallel |
Logical, defaults to FALSE. Set to TRUE if you want to parallellize the fitting. |
BPPARAM |
object of class |
control |
Control object for GAM fitting, see |
family |
The distribution assumed, currently only |
gcv |
(In development). Logical, should a GCV score also be returned? |
The number of parameter to evaluate (and therefore the runtime) scales in
k
* the number of lineages. Moreover, we have found that, in practice,
values of k above 12-15 rarely lead to improved result, not matter the
complexity of the trajectory being considered. As such, we recommand that user
proceed with care when setting k to value higher than 15.
A plot of average AIC value over the range of selected knots, and a matrix of AIC and GCV values for the selected genes (rows) and the range of knots (columns).
## This is an artificial example, please check the vignette for a realistic one. set.seed(8) library(slingshot) data(sds, package="tradeSeq") loadings <- matrix(runif(2000*2, -2, 2), nrow = 2, ncol = 2000) counts <- round(abs(t(slingshot:::slingReducedDim(sds) %*% loadings))) + 100 aicK <- evaluateK(counts = counts, sds = sds, nGenes = 100, k = 3:5, verbose = FALSE)
## This is an artificial example, please check the vignette for a realistic one. set.seed(8) library(slingshot) data(sds, package="tradeSeq") loadings <- matrix(runif(2000*2, -2, 2), nrow = 2, ncol = 2000) counts <- round(abs(t(slingshot:::slingReducedDim(sds) %*% loadings))) + 100 aicK <- evaluateK(counts = counts, sds = sds, nGenes = 100, k = 3:5, verbose = FALSE)
This fits the NB-GAM model as described in
Van den Berge et al.[2019].
There are two ways to provide the required input in fitGAM
.
See Details and the vignette.
fitGAM(counts, ...) ## S4 method for signature 'matrix' fitGAM( counts, sds = NULL, pseudotime = NULL, cellWeights = NULL, conditions = NULL, U = NULL, genes = seq_len(nrow(counts)), weights = NULL, offset = NULL, nknots = 6, verbose = TRUE, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), sce = TRUE, family = "nb", gcv = FALSE ) ## S4 method for signature 'dgCMatrix' fitGAM( counts, sds = NULL, pseudotime = NULL, cellWeights = NULL, conditions = NULL, U = NULL, genes = seq_len(nrow(counts)), weights = NULL, offset = NULL, nknots = 6, verbose = TRUE, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), sce = TRUE, family = "nb", gcv = FALSE ) ## S4 method for signature 'SingleCellExperiment' fitGAM( counts, U = NULL, genes = seq_len(nrow(counts)), conditions = NULL, weights = NULL, offset = NULL, nknots = 6, verbose = TRUE, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), sce = TRUE, family = "nb", gcv = FALSE )
fitGAM(counts, ...) ## S4 method for signature 'matrix' fitGAM( counts, sds = NULL, pseudotime = NULL, cellWeights = NULL, conditions = NULL, U = NULL, genes = seq_len(nrow(counts)), weights = NULL, offset = NULL, nknots = 6, verbose = TRUE, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), sce = TRUE, family = "nb", gcv = FALSE ) ## S4 method for signature 'dgCMatrix' fitGAM( counts, sds = NULL, pseudotime = NULL, cellWeights = NULL, conditions = NULL, U = NULL, genes = seq_len(nrow(counts)), weights = NULL, offset = NULL, nknots = 6, verbose = TRUE, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), sce = TRUE, family = "nb", gcv = FALSE ) ## S4 method for signature 'SingleCellExperiment' fitGAM( counts, U = NULL, genes = seq_len(nrow(counts)), conditions = NULL, weights = NULL, offset = NULL, nknots = 6, verbose = TRUE, parallel = FALSE, BPPARAM = BiocParallel::bpparam(), control = mgcv::gam.control(), sce = TRUE, family = "nb", gcv = FALSE )
counts |
The count matrix of expression values, with genes in rows and cells in columns. Can be a matrix or a sparse matrix. |
... |
parameters including: |
sds |
an object of class |
pseudotime |
A matrix of pseudotime values, each row represents a cell and each column represents a lineage. |
cellWeights |
A matrix of cell weights defining the probability that a cell belongs to a particular lineage. Each row represents a cell and each column represents a lineage. If only a single lineage, provide a matrix with one column containing all values of 1. |
conditions |
This argument is in beta phase and should be used carefully. If each lineage consists of multiple conditions, this argument can be used to specify the conditions. tradeSeq will then fit a condition-specific smoother for every lineage. |
U |
The design matrix of fixed effects. The design matrix should not contain an intercept to ensure identifiability. |
genes |
The genes on which to run |
weights |
A matrix of weights with identical dimensions
as the |
offset |
The offset, on log-scale. If NULL, TMM is used to account for
differences in sequencing depth., see |
nknots |
Number of knots used to fit the GAM. Defaults to 6. It is recommended to use the 'evaluateK' function to guide in selecting an appropriate number of knots. |
verbose |
Logical, should progress be printed? |
parallel |
Logical, defaults to FALSE. Set to TRUE if you want to parallellize the fitting. |
BPPARAM |
object of class |
control |
Variables to control fitting of the GAM, see
|
sce |
Logical: should output be of SingleCellExperiment class? This is
recommended to be TRUE. If |
family |
The assumed distribution for the response. Is set to |
gcv |
(In development). Logical, should a GCV score also be returned? |
fitGAM
supports four different ways to input the required objects:
"Count matrix, matrix of pseudotime and matrix of cellWeights."
Input count matrix using counts
argument and
pseudotimes and cellWeights as a matrix, with number of rows equal to
number of cells, and number of columns equal to number of lineages.
"Count matrix and Slingshot input."Input count matrix using
counts
argument and Slingshot object using sds
argument.
"SingleCellExperiment Object after running slingshot on the object."
Input SingleCellExperiment Object using counts
argument.
"CellDataSet object after running the orderCells
function."
Input CellDataSet Object using counts
argument.
If sce=FALSE
, returns a list of length equal to the number of genes
(number of rows of counts
). Each element of the list is either a
gamObject
if the fiting procedure converged, or an error
message.
If sce=TRUE
, returns a singleCellExperiment
object with
the tradeSeq
results stored in the rowData
,
colData
and metadata
.
set.seed(8) data(crv, package="tradeSeq") data(countMatrix, package="tradeSeq") sceGAM <- fitGAM(counts = as.matrix(countMatrix), sds = crv, nknots = 5)
set.seed(8) data(crv, package="tradeSeq") data(countMatrix, package="tradeSeq") sceGAM <- fitGAM(counts = as.matrix(countMatrix), sds = crv, nknots = 5)
A list of 11 gamObject
obtained by fitting 10 genes on 15
cells randomly assigned to lineages with random pseudotimes.
data(gamList)
data(gamList)
Can be re-obtained by runing the code in the example section
of fitGAM
.
mgcv
.Return smoother p-values from the mgcv
package.
getSmootherPvalues(models)
getSmootherPvalues(models)
models |
the GAM models, typically the output from |
a matrix with the p-value associated with each lineage's smoother. The matrix has one row per gene where the fitting procedure converged.
data(gamList, package = "tradeSeq") getSmootherPvalues(gamList)
data(gamList, package = "tradeSeq") getSmootherPvalues(gamList)
Return test statistics from the mgcv
package.
getSmootherTestStats(models)
getSmootherTestStats(models)
models |
the GAM models, typically the output from |
a matrix with the wald statistics associated with each lineage's smoother. The matrix has one row per gene where the fitting procedure converged.
data(gamList, package = "tradeSeq") getSmootherPvalues(gamList)
data(gamList, package = "tradeSeq") getSmootherPvalues(gamList)
Get the number of knots used for the fit
nknots(models, ...) ## S4 method for signature 'SingleCellExperiment' nknots(models) ## S4 method for signature 'list' nknots(models)
nknots(models, ...) ## S4 method for signature 'SingleCellExperiment' nknots(models) ## S4 method for signature 'list' nknots(models)
models |
The fitted GAMs, typically the output from
|
... |
parameters including: |
A numeric, the number of nknots
data(gamList, package = "tradeSeq") nknots(gamList)
data(gamList, package = "tradeSeq") nknots(gamList)
Assess differences in expression patterns between lineages.
patternTest(models, ...) ## S4 method for signature 'list' patternTest( models, global = TRUE, pairwise = FALSE, nPoints = 2 * nknots(models), l2fc = 0, eigenThresh = 0.01 ) ## S4 method for signature 'SingleCellExperiment' patternTest( models, global = TRUE, pairwise = FALSE, nPoints = 2 * nknots(models), l2fc = 0, eigenThresh = 0.01 )
patternTest(models, ...) ## S4 method for signature 'list' patternTest( models, global = TRUE, pairwise = FALSE, nPoints = 2 * nknots(models), l2fc = 0, eigenThresh = 0.01 ) ## S4 method for signature 'SingleCellExperiment' patternTest( models, global = TRUE, pairwise = FALSE, nPoints = 2 * nknots(models), l2fc = 0, eigenThresh = 0.01 )
models |
The fitted GAMs, typically the output from
|
... |
parameters including: |
global |
If TRUE, test for all pairwise comparisons simultaneously.
If |
pairwise |
If TRUE, test for all pairwise comparisons, between lineages. |
nPoints |
The number of points to be compared between lineages. Defaults to twice the number of knots |
l2fc |
The log2 fold change threshold to test against. Note, that this will affect both the global test and the pairwise comparisons. |
eigenThresh |
Eigenvalue threshold for deciding on the rank of the
variance-covariance matrix of the contrasts defined by 'patternTest', and
to use for calculating the Wald test statistics. Lower values
are more lenient to adding more information but also decrease computational stability.
This argument should in general not be changed by the user but is provided
for back-compatability. Set to |
A matrix with the wald statistic, the number of df and the p-value associated with each gene for all the tests performed. Also, for each possible pairwise comparision, the observed log fold changes. If the testing procedure was unsuccessful, the procedure will return NA test statistics, fold changes and p-values.
data(gamList, package = "tradeSeq") patternTest(gamList, global = TRUE, pairwise = TRUE)
data(gamList, package = "tradeSeq") patternTest(gamList, global = TRUE, pairwise = TRUE)
Evaluate an appropriate number of knots.
plot_evalutateK_results(aicMat, k = NULL, aicDiff = 2)
plot_evalutateK_results(aicMat, k = NULL, aicDiff = 2)
aicMat |
The output from |
k |
The range of knots to evaluate. '3:10' by default. Extracted from the column names by default |
aicDiff |
Used for selecting genes with significantly varying AIC values over the range of evaluated knots to make the barplot output. Default is set to 2, meaning that only genes whose AIC range is larger than 2 will be used to check for the optimal number of knots through the barplot visualization that is part of the output of this function. |
## This is an artificial example, please check the vignette for a realistic one. set.seed(8) data(sds, package="tradeSeq") library(slingshot) loadings <- matrix(runif(2000*2, -2, 2), nrow = 2, ncol = 2000) counts <- round(abs(t(slingshot:::slingReducedDim(sds) %*% loadings))) + 100 aicK <- evaluateK(counts = counts, sds = sds, nGenes = 100, k = 3:5, verbose = FALSE, plot = FALSE) plot_evalutateK_results(aicK, k = 3:5)
## This is an artificial example, please check the vignette for a realistic one. set.seed(8) data(sds, package="tradeSeq") library(slingshot) loadings <- matrix(runif(2000*2, -2, 2), nrow = 2, ncol = 2000) counts <- round(abs(t(slingshot:::slingReducedDim(sds) %*% loadings))) + 100 aicK <- evaluateK(counts = counts, sds = sds, nGenes = 100, k = 3:5, verbose = FALSE, plot = FALSE) plot_evalutateK_results(aicK, k = 3:5)
Plot the gene in reduced dimensional space.
plotGeneCount(curve, ...) ## S4 method for signature 'SlingshotDataSet' plotGeneCount( curve, counts = NULL, gene = NULL, clusters = NULL, models = NULL, title = NULL ) ## S4 method for signature 'PseudotimeOrdering' plotGeneCount( curve, counts = NULL, gene = NULL, clusters = NULL, models = NULL, title = NULL ) ## S4 method for signature 'SingleCellExperiment' plotGeneCount( curve, counts = NULL, gene = NULL, clusters = NULL, models = NULL, title = NULL )
plotGeneCount(curve, ...) ## S4 method for signature 'SlingshotDataSet' plotGeneCount( curve, counts = NULL, gene = NULL, clusters = NULL, models = NULL, title = NULL ) ## S4 method for signature 'PseudotimeOrdering' plotGeneCount( curve, counts = NULL, gene = NULL, clusters = NULL, models = NULL, title = NULL ) ## S4 method for signature 'SingleCellExperiment' plotGeneCount( curve, counts = NULL, gene = NULL, clusters = NULL, models = NULL, title = NULL )
curve |
One of the following:
|
... |
parameters including: |
counts |
The count matrix, genes in rows and cells in columns. Only
needed if the input is of the type |
gene |
The name of gene for which you want to plot the count or the row
number of that gene in the count matrix. Alternatively, one can specify
the |
clusters |
The assignation of each cell to a cluster. Used to color the
plot. Either |
models |
The fitted GAMs, typically the output from
|
title |
Title for the plot. |
If both gene
and clusters
arguments are supplied, the
plot will be colored according to gene count level. If none are provided, the
function will fail.
A ggplot
object
set.seed(97) library(slingshot) data(crv, package="tradeSeq") data(countMatrix, package="tradeSeq") rd <- slingReducedDim(crv) cl <- kmeans(rd, centers = 7)$cluster lin <- getLineages(rd, clusterLabels = cl, start.clus = 4) crv <- getCurves(lin) counts <- as.matrix(countMatrix) gamList <- fitGAM(counts = counts, pseudotime = slingPseudotime(crv, na = FALSE), cellWeights = slingCurveWeights(crv)) plotGeneCount(crv, counts, gene = "Mpo")
set.seed(97) library(slingshot) data(crv, package="tradeSeq") data(countMatrix, package="tradeSeq") rd <- slingReducedDim(crv) cl <- kmeans(rd, centers = 7)$cluster lin <- getLineages(rd, clusterLabels = cl, start.clus = 4) crv <- getCurves(lin) counts <- as.matrix(countMatrix) gamList <- fitGAM(counts = counts, pseudotime = slingPseudotime(crv, na = FALSE), cellWeights = slingCurveWeights(crv)) plotGeneCount(crv, counts, gene = "Mpo")
Plot the smoothers estimated by tradeSeq
.
plotSmoothers(models, ...) ## S4 method for signature 'gam' plotSmoothers( models, nPoints = 100, lwd = 2, size = 2/3, xlab = "Pseudotime", ylab = "Log(expression + 1)", border = TRUE, alpha = 1, sample = 1 ) ## S4 method for signature 'SingleCellExperiment' plotSmoothers( models, counts, gene, nPoints = 100, lwd = 2, size = 2/3, xlab = "Pseudotime", ylab = "Log(expression + 1)", border = TRUE, alpha = 1, sample = 1, pointCol = NULL, curvesCols = NULL, plotLineages = TRUE, lineagesToPlot = NULL )
plotSmoothers(models, ...) ## S4 method for signature 'gam' plotSmoothers( models, nPoints = 100, lwd = 2, size = 2/3, xlab = "Pseudotime", ylab = "Log(expression + 1)", border = TRUE, alpha = 1, sample = 1 ) ## S4 method for signature 'SingleCellExperiment' plotSmoothers( models, counts, gene, nPoints = 100, lwd = 2, size = 2/3, xlab = "Pseudotime", ylab = "Log(expression + 1)", border = TRUE, alpha = 1, sample = 1, pointCol = NULL, curvesCols = NULL, plotLineages = TRUE, lineagesToPlot = NULL )
models |
Either the |
... |
parameters including: |
nPoints |
The number of points used to extrapolate the fit. Defaults to 100. |
lwd |
Line width of the smoother. Passed to |
size |
Character expansion of the data points. Passed to |
xlab |
x-axis label. Passed to |
ylab |
y-axis label. Passed to |
border |
Logical: should a white border be drawn around the mean smoother. |
alpha |
Numeric between 0 and 1, determines the transparency of data points,
see |
sample |
Numeric between 0 and 1, use to subsample the cells when there are too many so that it can plot faster. |
counts |
The matrix of gene expression counts. |
gene |
Gene name or row in count matrix of gene to plot. |
pointCol |
Plotting colors for each cell. Can be either character vector of
length 1, denoting a variable in the |
curvesCols |
Plotting colors for each curve Should be a list of colors of the exact same length as the number of curves, i.e. the number of lineages (if there is no conditions) or the number of lineages by the number of conditions. In the second case, the colors are grouped by condition (lineage 1 - condition 1, lineage 1 - condition 2,...). |
plotLineages |
Logical, should the mean smoothers for each lineage be plotted? |
lineagesToPlot |
A vector of integers referring to the lineages to be plotted. |
A ggplot
object
set.seed(8) data(crv, package="tradeSeq") data(countMatrix, package="tradeSeq") counts <- as.matrix(countMatrix) sce <- fitGAM(counts = counts, sds = crv, nknots = 5) plotSmoothers(sce, counts, rownames(counts)[1]) # Show only first lineage curve curvesCols <- c("#440154FF", "transparent") plotSmoothers(sce, counts, rownames(counts)[1], curvesCols = curvesCols, border = FALSE) # Show only first curve and cells assigned to first lineage plotSmoothers(sce, counts, rownames(counts)[1], curvesCols = curvesCols, border = FALSE) + ggplot2::scale_color_manual(values = curvesCols)
set.seed(8) data(crv, package="tradeSeq") data(countMatrix, package="tradeSeq") counts <- as.matrix(countMatrix) sce <- fitGAM(counts = counts, sds = crv, nknots = 5) plotSmoothers(sce, counts, rownames(counts)[1]) # Show only first lineage curve curvesCols <- c("#440154FF", "transparent") plotSmoothers(sce, counts, rownames(counts)[1], curvesCols = curvesCols, border = FALSE) # Show only first curve and cells assigned to first lineage plotSmoothers(sce, counts, rownames(counts)[1], curvesCols = curvesCols, border = FALSE) + ggplot2::scale_color_manual(values = curvesCols)
Get fitted values for each cell.
predictCells(models, ...) ## S4 method for signature 'SingleCellExperiment' predictCells(models, gene) ## S4 method for signature 'list' predictCells(models, gene)
predictCells(models, ...) ## S4 method for signature 'SingleCellExperiment' predictCells(models, gene) ## S4 method for signature 'list' predictCells(models, gene)
models |
Either the |
... |
parameters including: |
gene |
Gene name of gene for which to extract fitted values. |
Using the gene expression model of tradeSeq
available at
https://www.nature.com/articles/s41467-020-14766-3#Sec2.
the output of predictCells
returns the value for each cell.
A vector of fitted values.
data(gamList, package = "tradeSeq") predictCells(models = gamList, gene = 1)
data(gamList, package = "tradeSeq") predictCells(models = gamList, gene = 1)
Get smoothers estimated by tradeSeq
along a
grid. This function does not return fitted values but rather the predicted
mean smoother, for a user-defined grid of points.
predictSmooth(models, ...) ## S4 method for signature 'SingleCellExperiment' predictSmooth(models, gene, nPoints = 100, tidy = TRUE) ## S4 method for signature 'list' predictSmooth(models, gene, nPoints = 100)
predictSmooth(models, ...) ## S4 method for signature 'SingleCellExperiment' predictSmooth(models, gene, nPoints = 100, tidy = TRUE) ## S4 method for signature 'list' predictSmooth(models, gene, nPoints = 100)
models |
Either the |
... |
parameters including: |
gene |
Either a vector of gene names or an integer vector, corresponding to the row(s) of the gene(s). |
nPoints |
The number of points used to create the grid along the smoother for each lineage. Defaults to 100. |
tidy |
Logical: return tidy output. If TRUE, returns a |
Using the gene expression model of tradeSeq
available at
https://www.nature.com/articles/s41467-020-14766-3#Sec2.
the output of predictSmooth
returns the value for
equally space values of pseudotimes, and a constant value for
and
(arbitraly, we select the values of
).
A matrix
with estimated averages.
A vector of fitted values.
data(gamList, package = "tradeSeq") predictSmooth(models = gamList, gene = 1)
data(gamList, package = "tradeSeq") predictSmooth(models = gamList, gene = 1)
This dataset contains the toy example from the Slingshot R package vignette.
data(sds)
data(sds)
An object of class SlingshotDataSet
of length 1.
https://bioconductor.org/packages/release/bioc/html/slingshot.html
This function assesses differential expression between the average expression of the start and end points of a lineage.
startVsEndTest(models, ...) ## S4 method for signature 'SingleCellExperiment' startVsEndTest( models, global = TRUE, lineages = FALSE, pseudotimeValues = NULL, l2fc = 0 ) ## S4 method for signature 'list' startVsEndTest( models, global = TRUE, lineages = FALSE, pseudotimeValues = NULL, l2fc = 0 )
startVsEndTest(models, ...) ## S4 method for signature 'SingleCellExperiment' startVsEndTest( models, global = TRUE, lineages = FALSE, pseudotimeValues = NULL, l2fc = 0 ) ## S4 method for signature 'list' startVsEndTest( models, global = TRUE, lineages = FALSE, pseudotimeValues = NULL, l2fc = 0 )
models |
The fitted GAMs, typically the output from
|
... |
parameters including: |
global |
If TRUE, test for all lineages simultaneously. |
lineages |
If TRUE, test for all lineages independently. |
pseudotimeValues |
A vector of length 2, specifying two pseudotime values to be compared against each other, for every lineage of the trajectory. @details Note that this test assumes that all lineages start at a pseudotime value of zero, which is the starting point against which the end point is compared. |
l2fc |
The log2 fold change threshold to test against. Note, that this will affect both the global test and the pairwise comparisons. |
A matrix with the wald statistic, the number of df and the p-value associated with each gene for all the tests performed. Also, for each possible pairwise comparision, the observed log fold changes. If the testing procedure was unsuccessful, the procedure will return NA test statistics, fold changes and p-values.
data(gamList, package = "tradeSeq") startVsEndTest(gamList, global = TRUE, lineages = TRUE)
data(gamList, package = "tradeSeq") startVsEndTest(gamList, global = TRUE, lineages = TRUE)