Title: | Genomic trajectories with heterogeneous genetic and environmental backgrounds |
---|---|
Description: | PhenoPath infers genomic trajectories (pseudotimes) in the presence of heterogeneous genetic and environmental backgrounds and tests for interactions between them. |
Authors: | Kieran Campbell |
Maintainer: | Kieran Campbell <[email protected]> |
License: | Apache License (== 2.0) |
Version: | 1.31.0 |
Built: | 2024-12-02 06:15:36 UTC |
Source: | https://github.com/bioc/phenopath |
Fit a covariate latent variable model using coordinate ascent variational inference.
clvm(y, x, maxiter = 10000, elbo_tol = 1e-05, thin = 1, verbose = TRUE, z_init = 1, tau_q = 1, tau_mu = 1, tau_c = 1, a = 2, b = 2, tau_alpha = 1, a_beta = 10, b_beta = 1, q = rep(0, nrow(y)), model_mu = FALSE, scale_y = TRUE)
clvm(y, x, maxiter = 10000, elbo_tol = 1e-05, thin = 1, verbose = TRUE, z_init = 1, tau_q = 1, tau_mu = 1, tau_c = 1, a = 2, b = 2, tau_alpha = 1, a_beta = 10, b_beta = 1, q = rep(0, nrow(y)), model_mu = FALSE, scale_y = TRUE)
y |
A N-by-G (dynamic) input matrix |
x |
A N-by-P (static) input matrix |
maxiter |
Maximum number of CAVI iterations |
elbo_tol |
The (percent) change in the ELBO below which it is considered converged |
thin |
The number of iterations to wait each time before re-calculating the elbo |
verbose |
Print convergence messages |
z_init |
The initialisation of the latent trajectory. Should be one of
|
tau_q |
Hyperparameter tau_q |
tau_mu |
Hyperparameter tau_mu |
tau_c |
Hyperparameter tau_c |
a |
Hyperparameter a |
b |
Hyperparameter b |
tau_alpha |
Hyperparameter tau_alpha |
a_beta |
Hyperparameter a_beta |
b_beta |
Hyperparameter b_beta |
q |
Priors on the latent variables |
model_mu |
Logical - should a gene-specific intercept term be modelled? |
scale_y |
Logical - should the expression matrix be centre scaled? |
A list whose entries correspond to the converged values of the variational parameters along with the ELBO.
sim <- simulate_phenopath() fit <- clvm(sim$y, matrix(sim$x))
sim <- simulate_phenopath() fit <- clvm(sim$y, matrix(sim$x))
Get the interaction effect sizes
interaction_effects(phenopath_fit)
interaction_effects(phenopath_fit)
phenopath_fit |
An object of class |
TODO
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) beta <- interaction_effects(fit)
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) beta <- interaction_effects(fit)
Get the interaction standard deviations
interaction_sds(phenopath_fit)
interaction_sds(phenopath_fit)
phenopath_fit |
An object of class |
TODO
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) beta_sd <- interaction_sds(fit)
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) beta_sd <- interaction_sds(fit)
Computes a tidy data frame of the interaction effects, pathway scores, and significance for each gene and covariate interaction.
interactions(phenopath_fit, n = 3)
interactions(phenopath_fit, n = 3)
phenopath_fit |
An object returned by a call to |
n |
The number of standard deviations away from 0 the posterior of the interaction effect sizes should be to be designated as significant |
A data frame with the following columns:
feature
The feature (usually gene)
covariate
The covariate, specified from the order originally supplied to
the call to phenopath
interaction_effect_size
The effect size of the interaction (
from the statistical model)
significant
Boolean for whether the interaction effect is significantly
different from 0
chi
The precision of the ARD prior on
pathway_loading
The pathway loading , showing the overall
effect for each gene marginalised over the covariate
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) interactions(fit)
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) interactions(fit)
PhenoPath learns genomic trajectories in the presence of heterogenous environmental and genetic backgrounds. It takes input gene expression measurements that are modelled by a single unobserved factor (the "trajectory"). The regulation of genes along the trajectory is perturbed by an additional set of covariates (such as genetic or environmental status) allowing for the identification of covariate-trajectory interactions. The model is fitted using mean-field co-ordinate ascent variational inference.
phenopath(exprs_obj, x, sce_assay = "exprs", elbo_tol = 1e-05, z_init = 1, ...)
phenopath(exprs_obj, x, sce_assay = "exprs", elbo_tol = 1e-05, z_init = 1, ...)
exprs_obj |
Input gene expression, either
|
x |
The covariate vector, either
|
sce_assay |
The assay from |
elbo_tol |
The relative pct change in the ELBO below which is considered converged. See convergence section in details below. |
z_init |
The initialisation of the latent trajectory. Should be one of
|
... |
Additional arguments to be passed to |
Input expression
If an SummarizedExperiment
is provided, assay(exprs_obj, sce_assay)
is used.
This is assumed to be in
a form that is suitably normalised and approximately normal, such as
the log of TPM values (plus a suitable offset) or
similar.
Encoding covariates
See the vignette.
Convergence of variational inference
It is strongly recommended to call plot_elbo(phenopath_fit)
after the fitting procedure to
ensure the ELBO has approximately converged (though convergence metrics are printed during the
fitting process). For a good introduction to variational inference see Blei, D.M., Kucukelbir, A. & McAuliffe, J.D., 2017. Variational Inference: A Review for Statisticians. Journal of the American Statistical Association.
Additional arguments
Addition arguments to clvm
are passed via ...
. For full documentation, call ?clvm
.
Some notable options:
thin
- The ELBO is expensive to compute for larger datasets. The model will
compute the ELBO and compare convergence every thin
iterations.
q
and tau_q
- Priors (such as capture times) for the latent space. Note that
model_mu
should be true if q
is non-zero.
scale_y
By default the input expression is centre-scaled for each gene. If scale_y
is FALSE
this does not happen - but note that model_mu
should be TRUE
in such a case.
An S3 structure with the following entries:
m_z
The converged mean estimates of the trajectory
s_z
The converged standard deviation estimates of z
m_beta
A P-by-G matrix of interaction coefficients
s_beta
A P-by-G matrix of interaction standard deviations
clvm
for the underlying CAVI function, trajectory
to extract the latent trajectory, interaction_effects
for the interaction effect
sizes, significant_interactions
for the results of Bayesian significance testing.
sim <- simulate_phenopath() # returns a list with gene expression in y and covariates in x fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) # Extract the trajectory z <- trajectory(fit)
sim <- simulate_phenopath() # returns a list with gene expression in y and covariates in x fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) # Extract the trajectory z <- trajectory(fit)
Plots the evidence lower bound (ELBO) as a function of iterations
plot_elbo(fit)
plot_elbo(fit)
fit |
An object returned by a call to |
A ggplot2
object of the ELBO against the number of iterations
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x) plot_elbo(fit)
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x) plot_elbo(fit)
Print a PhenoPath fit
## S3 method for class 'phenopath_fit' print(x, ...)
## S3 method for class 'phenopath_fit' print(x, ...)
x |
A |
... |
Additional arguments |
A string representation of a phenopath_fit
object.
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) print(fit)
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) print(fit)
Given the results of clvm
, decide which features show significant
iteractions between the latent trajectory and covariates. Significant
features are designated as those where the variational mean of the interaction
coefficient falls outside the interval of 0.
significant_interactions(phenopath_fit, n = 3)
significant_interactions(phenopath_fit, n = 3)
phenopath_fit |
The results of a call to |
n |
The number of standard deviations away from 0 the posterior estimate of beta should be to be designated significant. |
A logical vector describing whether each feature passes the significance test.
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) signints <- significant_interactions(fit)
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) signints <- significant_interactions(fit)
Simulate synthetic data from the phenopath model, where each gene is sampled from one of four types (see details).
simulate_phenopath(N = 100, G = 40, G_de = NULL, G_pst = NULL, G_pst_beta = NULL, G_de_pst_beta = NULL)
simulate_phenopath(N = 100, G = 40, G_de = NULL, G_pst = NULL, G_pst_beta = NULL, G_de_pst_beta = NULL)
N |
Number of samples to simulate |
G |
Number of genes to simulate. Should be divisible by 4 |
G_de |
Number of genes to simulate from the differential expression regime |
G_pst |
Number of genes to simulate from the pseudotime regime |
G_pst_beta |
Number of genes to simulate from the pseudotime + beta interactions regime |
G_de_pst_beta |
Number of genes to simulate from the differential expression + pseudotime + interactions regime |
Will simulate data for a number of genes corresponding to one of four regimes:
de
('differential expression'), where the gene has no association to the latent
trajectory and exhibits differential expression only
pst
('pseudotime'), the gene shows pseudotemporal regulation but no
differential regulation
pst_beta
('pseudotime + beta interactions'), the gene shows pseudotemporal regulation
that is modulated by covariate interactions
de_pst_beta
('differential expression + pseudotime + interactions), all of the above
A list with four entries:
parameters
A tibble
with an entry for each gene and a column
for each parameter value and simulation regime (see details).
y
The N-by-G simulated gene expression matrix.
x
The N-length vector of covariates.
z
The N-length vector of pseudotimes.
sim <- simulate_phenopath()
sim <- simulate_phenopath()
Get the latent PhenoPath trajectory
trajectory(phenopath_fit)
trajectory(phenopath_fit)
phenopath_fit |
An object of class |
A vector of latent trajectory (pseudotime) values
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) z <- trajectory(fit)
sim <- simulate_phenopath() fit <- phenopath(sim$y, sim$x, elbo_tol = 1e-2) z <- trajectory(fit)