Title: | Functional Network Analysis |
---|---|
Description: | Algorithms for functional network analysis. Includes an implementation of a variational Dirichlet process Gaussian mixture model for nonparametric mixture modeling. |
Authors: | Leo Lahti, Olli-Pekka Huovilainen, Antonio Gusmao and Juuso Parkkinen |
Maintainer: | Leo Lahti <[email protected]> |
License: | GPL (>=2) |
Version: | 1.67.0 |
Built: | 2024-10-30 09:02:20 UTC |
Source: | https://github.com/bioc/netresponse |
Global modeling of transcriptional responses in interaction networks.
Package: | netresponse |
Type: | Package |
Version: | See sessionInfo() or DESCRIPTION file |
Date: | 2011-02-03 |
License: | GNU GPL >=2 |
LazyLoad: | yes |
Leo Lahti, Olli-Pekka Huovilainen, Antonio Gusmao and Juuso Parkkinen. Maintainer: Leo Lahti [email protected]
Leo Lahti et al.: Global modeling of transcriptional responses in interaction networks. Bioinformatics (2010). See citation('netresponse') for details.
## Not run:
# Define parameters for toy data
Ns <- 200 # number of samples (conditions)
Nf <- 10  # number of features (nodes)
feature.names <- paste('feat', seq(Nf), sep='')
sample.names <- paste('sample', seq(Ns), sep='')
# random seed
set.seed( 123 )
# Random network
netw <- pmax(array(sign(rnorm(Nf^2)), dim = c(Nf, Nf)), 0)
# in pathway analysis nodes correspond to genes
rownames(netw) <- colnames(netw) <- feature.names
# Random responses of the nodes across conditions
D <- array(rnorm(Ns*Nf), dim = c(Ns,Nf), dimnames = list(sample.names, feature.names))
D[1:100, 4:6] <- t(sapply(1:(Ns/2),function(x){rnorm(3, mean = 1:3)}))
D[101:Ns, 4:6] <- t(sapply(1:(Ns/2),function(x){rnorm(3, mean = 7:9)}))
# Calculate the model
#model <- detect.responses(D, netw)
## Subnets (each is a list of nodes)
#get.subnets( model )
## End(Not run)
Calculates and plots the ellipse corresponding to a specified confidence level in a 2-dimensional plot.
add.ellipse( centroid, covmat, confidence = 0.95, npoints = 100, col = "black", ... )
centroid |
Vector with two elements defining the ellipse centroid. |
covmat |
Covariance matrix for the investigated data. Only diagonal covariances supported. |
confidence |
Confidence level determining the ellipse borders based on the covariance matrix. |
npoints |
Number of plotting points. |
col |
Color. |
... |
Other arguments to be passed. |
Used for plotting side effects.
Leo Lahti [email protected]
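As an illustration of the geometry described above, the following base R sketch computes boundary points of a confidence ellipse for a diagonal covariance matrix. The helper name `ellipse_points` is hypothetical, and the chi-squared scaling is one common convention; it is an assumption here, not necessarily what add.ellipse does internally.

```r
# Hypothetical helper (not the package's add.ellipse): boundary points of a
# confidence ellipse for a 2-D Gaussian with a diagonal covariance matrix.
ellipse_points <- function(centroid, covmat, confidence = 0.95, npoints = 100) {
  # Squared Mahalanobis radius that encloses the requested probability mass
  # for a bivariate Gaussian (chi-squared with 2 degrees of freedom)
  r2 <- qchisq(confidence, df = 2)
  theta <- seq(0, 2 * pi, length.out = npoints)
  # With a diagonal covariance the ellipse axes are parallel to the plot axes
  half.axes <- sqrt(r2 * diag(covmat))
  cbind(x = centroid[1] + half.axes[1] * cos(theta),
        y = centroid[2] + half.axes[2] * sin(theta))
}

pts <- ellipse_points(centroid = c(0, 0), covmat = diag(c(1, 4)))
# plot(pts, type = "l", asp = 1)  # draw the outline
```

Every returned point lies at the same Mahalanobis distance from the centroid, which is what makes the outline a constant-density contour.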
Latent class analysis based on (infinite) Gaussian mixture model. If the input is a data matrix, a multivariate model is fitted; if the input is a vector, a univariate model is fitted.
bic.mixture(x, max.modes, bic.threshold = 0, min.modes = 1, ...)
x |
samples x features matrix for multivariate analysis, or a vector for univariate analysis |
max.modes |
Maximum number of modes to be checked for mixture model selection |
bic.threshold |
BIC threshold which needs to be exceeded before a new mode is added to the mixture. |
min.modes |
minimum number of modes |
... |
Further optional arguments to be passed |
Fitted latent class model (parameters and free energy)
Contact: Leo Lahti [email protected]
See citation('netresponse')
Latent class analysis based on (infinite) Gaussian mixture model. The input (x) is a data matrix, and a multivariate model is fitted.
bic.mixture.multivariate(x, max.modes, bic.threshold = 0, min.modes = 1, ...)
x |
matrix (for multivariate analysis) |
max.modes |
Maximum number of modes to be checked for mixture model selection |
bic.threshold |
BIC threshold which needs to be exceeded before a new mode is added to the mixture. |
min.modes |
Minimum number of modes to be checked for mixture model selection |
... |
Further optional arguments to be passed |
Fitted latent class model (parameters and free energy)
Contact: Leo Lahti [email protected]
See citation('netresponse')
Latent class analysis based on (infinite) Gaussian mixture model. If the input (x) is a data matrix, a multivariate model is fitted. If the input is a vector or a 1-dimensional matrix, a univariate model is fitted.
bic.mixture.univariate(x, max.modes, bic.threshold = 0, min.modes = 1, ...)
x |
data vector (for univariate analysis) or a matrix (for multivariate analysis) |
max.modes |
Maximum number of modes to be checked for mixture model selection |
bic.threshold |
BIC threshold which needs to be exceeded before a new mode is added to the mixture. |
min.modes |
minimum number of modes |
... |
Further optional arguments to be passed |
Fitted latent class model (parameters and free energy)
Contact: Leo Lahti [email protected]
See citation('netresponse')
Select the optimal number of mixture components by adding components until the increase in the objective function falls below the threshold.
bic.select.best.mode(x, max.modes = 1, bic.threshold = 1, min.modes = 1)
x |
data vector (for univariate analysis) or a matrix (for multivariate analysis) |
max.modes |
Maximum number of modes to be checked for mixture model selection |
bic.threshold |
BIC threshold which needs to be exceeded before a new mode is added to the mixture. |
min.modes |
Optional. Minimum number of modes. |
Fitted latent class model (parameters and free energy)
Contact: Leo Lahti [email protected]
See citation('netresponse')
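The stopping rule described for bic.select.best.mode can be sketched in a few lines of base R. This is only an illustration: the function `select_modes` is hypothetical, the scores are made up, and the sketch assumes a sign convention in which larger scores are better. How the package computes and signs its BIC values is not stated in this documentation.

```r
# Hypothetical sketch of a greedy stopping rule (not the package's code).
# bic.values[k] is assumed to be the score of the k-component model,
# with larger values indicating a better model.
select_modes <- function(bic.values, bic.threshold = 0, min.modes = 1) {
  max.modes <- length(bic.values)
  best <- min.modes
  # Keep adding a component while the improvement exceeds the threshold
  while (best < max.modes &&
         bic.values[best + 1] - bic.values[best] > bic.threshold) {
    best <- best + 1
  }
  best
}

scores <- c(-520, -410, -405, -409)      # made-up scores for 1 to 4 modes
select_modes(scores, bic.threshold = 1)  # improvements: 110, 5, -4
select_modes(scores, bic.threshold = 10) # a stricter threshold stops earlier
```

A larger threshold makes the procedure more conservative, because each additional mode has to improve the score by a wider margin.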
Center data matrix to 0 for each variable by removing the means.
centerData(X, rm.na = TRUE, meanvalue = NULL)
X |
The data set: samples x features. Each feature will be centered. |
rm.na |
Ignore NAs. |
meanvalue |
Can be used to set a desired center value. The default is 0. |
Centered data matrix.
Note that the model assumes samples x features matrix, and centers each feature.
Leo Lahti [email protected]
See citation('netresponse')
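For readers who want to see the operation spelled out, feature-wise centering of a samples x features matrix can be written in base R as below. This is an equivalent sketch under the default settings described above, not the package's implementation.

```r
set.seed(1)
X <- matrix(rnorm(50, mean = 5), nrow = 10, ncol = 5)  # samples x features

# Subtract the mean of each column (feature), ignoring missing values
X.centered <- sweep(X, 2, colMeans(X, na.rm = TRUE), "-")

colMeans(X.centered)  # numerically zero for every feature
```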
centerData(matrix(rnorm(100), 10, 10))
Internal function that checks the input network and formats it for detect.responses.
check.network(network, datamatrix, verbose = FALSE)
network |
Input network, see detect.responses |
datamatrix |
Input datamatrix, see detect.responses |
verbose |
Print intermediate messages |
formatted |
Formatted network (self-links removed) |
original |
Original network (possibly in another representation format) |
delta |
Cost function changes corresponding to the 'formatted' network. |
nodes |
Nodes corresponding to the 'formatted' network. |
Maintainer: Leo Lahti [email protected]
See citation('netresponse')
detect.responses
# check.network(network, datamatrix, verbose = FALSE)
Quantify the association between modes and a continuous variable.
continuous.responses( annotation.vector, model, method = "t-test", min.size = 2, data = NULL )
annotation.vector |
annotation vector with discrete factor levels, and named by the samples |
model |
NetResponse model object |
method |
method for enrichment calculation |
min.size |
minimum sample size for a response |
data |
data matrix (samples x features) |
List with each element corresponding to one variable and listing the responses according to association strength
Contact: Leo Lahti [email protected]
See citation('netresponse')
res <- continuous.responses(annotation.vector = NULL, model = NULL)
Main function of the NetResponse algorithm. Detects condition-specific network responses, given a network and a set of measurements of node activity across a set of conditions. Returns a set of subnetworks and their estimated context-specific responses.
detect.responses( datamatrix, network = NULL, initial.responses = 1, max.responses = 10, max.subnet.size = 10, verbose = TRUE, prior.alpha = 1, prior.alphaKsi = 0.01, prior.betaKsi = 0.01, update.hyperparams = 0, implicit.noise = 0, vdp.threshold = 1e-05, merging.threshold = 0, ite = Inf, information.criterion = "BIC", speedup = TRUE, speedup.max.edges = 10, positive.edges = FALSE, mc.cores = 1, mixture.method = "vdp", bic.threshold = 0, pca.basis = FALSE, ... )
datamatrix |
Matrix of samples x features. For example, a gene expression matrix with conditions on the rows and genes on the columns. The matrix contains the same features as the 'network' object, characterizing the network states across the different samples. |
network |
Binary network describing undirected pairwise interactions between features of 'datamatrix'. The following formats are supported: binary matrix, graphNEL, igraph, graphAM, Matrix, dgCMatrix, dgeMatrix |
initial.responses |
Initial number of components for each subnetwork model. Used to initialize calculations. |
max.responses |
Maximum number of responses for each subnetwork. Can be used to limit the potential number of network states. |
max.subnet.size |
Numeric. Maximum allowed subnetwork size. |
verbose |
Logical. Verbose parameter. |
prior.alpha, prior.alphaKsi, prior.betaKsi |
Prior parameters for Gaussian mixture model that is calculated for each subnetwork (normal-inverse-Gamma prior). alpha tunes the mean; alphaKsi and betaKsi are the shape and scale parameters of the inverse Gamma function, respectively. |
update.hyperparams |
Logical. Indicate whether to update hyperparameters during modeling. |
implicit.noise |
Implicit noise parameter. Add implicit noise to vdp mixture model. Can help to avoid overfitting to local optima, if this appears to be a problem. |
vdp.threshold |
Minimal free energy improvement after which the variational Gaussian mixture algorithm is deemed converged. |
merging.threshold |
Minimal cost value improvement required for merging two subnetworks. |
ite |
Defines maximum number of iterations on posterior update (updatePosterior). Increasing this can potentially lead to more accurate results, but computation may take longer. |
information.criterion |
Information criterion for model selection. Default is BIC (Bayesian Information Criterion); other options include AIC and AICc. |
speedup |
Takes advantage of approximations to PCA, mutual information etc in various places to speed up calculations. Particularly useful with large and densely connected networks and/or large sample size. |
speedup.max.edges |
Used if speedup = TRUE. Applies prefiltering of edges for calculating new joint models between subnetwork pairs when potential cost changes (delta) are updated for a newly merged subnetwork and its neighbors. Empirical mutual information between each such subnetwork pair is calculated based on their first principal components, and joint models will be calculated only for the top candidates up to the number specified by speedup.max.edges. It is expected that the subnetwork pair that will benefit most from joint modeling will be among the top mutual information candidates. This way it is possible to avoid calculating exhaustively many models on the network hubs. |
positive.edges |
Consider only the edges with positive association. Currently measured with Spearman correlation. |
mc.cores |
Number of cores to be used in parallelization. See help(mclapply) for details. |
mixture.method |
Specify the approach to use in mixture modeling. Options: vdp (nonparametric variational Dirichlet process mixture model); bic (Gaussian mixture modeling with EM, using BIC to select the optimal number of components). |
bic.threshold |
BIC threshold which needs to be exceeded before a new mode is added to the mixture with mixture.method = "bic" |
pca.basis |
Transform data first onto PCA basis to try to avoid problems with non-diagonal covariances. |
... |
Further optional arguments to be passed. |
NetResponseModel object.
Maintainer: Leo Lahti [email protected]
See citation("netresponse").
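The positive.edges argument is described above as keeping only edges with a positive association, measured with Spearman correlation. A rough base R sketch of such a filter on toy data follows; the data, the feature names, and the exact filtering rule (correlation greater than zero) are assumptions made for illustration and may differ from what detect.responses does internally.

```r
set.seed(1)
# Toy data: 50 samples x 4 features (hypothetical names)
D <- matrix(rnorm(200), nrow = 50, ncol = 4,
            dimnames = list(NULL, paste0("feat", 1:4)))
D[, 2] <-  D[, 1] + rnorm(50, sd = 0.1)  # feat2 rises with feat1
D[, 3] <- -D[, 1] + rnorm(50, sd = 0.1)  # feat3 falls as feat1 rises

# Fully connected binary network without self-links
netw <- matrix(1, 4, 4, dimnames = list(colnames(D), colnames(D)))
diag(netw) <- 0

# Keep an edge only if the Spearman correlation of its end nodes is positive
rho <- cor(D, method = "spearman")
netw.positive <- netw * (rho > 0)

netw.positive["feat1", c("feat2", "feat3")]
```

In this toy case the edge between feat1 and feat2 survives the filter, while the edge between feat1 and feat3 is dropped.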
## Not run:
#data(toydata)        # Load toy data set
#D <- toydata$emat    # Response matrix (for example, gene expression)
#netw <- toydata$netw # Network
# Run NetResponse algorithm
# model <- detect.responses(D, netw, verbose = FALSE)
## End(Not run)
A combined yeast data set with protein-protein interactions and gene expression (dna damage). Gene expression profiles are transformed into links by computing a Pearson correlation for all pairs of genes and treating all correlations above 0.85 as additional links. Number of genes: 1823, number of interactions: 12382, number of gene expression observations: 52, number of total links with PPI and expression links: 15547.
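The link construction described above (thresholding pairwise Pearson correlations at 0.85 and pooling the result with the PPI links) can be sketched in base R. The matrices and gene names below are invented toy data, not the contents of the dna object, and the pooling step via an element-wise maximum is an assumption about how the two link sets were combined.

```r
set.seed(42)
n.obs <- 52
genes <- paste0("g", 1:6)  # hypothetical gene names

# Toy expression matrix: observations x genes
expr <- matrix(rnorm(n.obs * length(genes)), nrow = n.obs,
               dimnames = list(NULL, genes))
expr[, "g2"] <- expr[, "g1"] + rnorm(n.obs, sd = 0.1)  # g1, g2 co-expressed

# Toy PPI matrix with a single undirected interaction g3 - g4
ppi <- matrix(0, length(genes), length(genes), dimnames = list(genes, genes))
ppi["g3", "g4"] <- ppi["g4", "g3"] <- 1

# Expression links: Pearson correlation above 0.85, self-links removed
expr.links <- (cor(expr, method = "pearson") > 0.85) * 1
diag(expr.links) <- 0

# Pool PPI and expression links into one binary matrix
combined <- pmax(ppi, expr.links)
sum(combined) / 2  # number of undirected links
```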
data(dna)
List of following objects:
PPI data matrix
gene expression profiles data matrix
Vector of gene ids corresponding to indices used in data matrices
Gene expression observation details
pooled matrix of PPI and expression links
PPI data pooled from yeast data sets of [1] and [2]. DNA damage expression set of [3].
[1] Ulitsky, I. and Shamir, R. Identification of functional modules using network topology and high-throughput data. BMC Systems Biology 2007, 1:8.
[2] Nariai, N., Kolaczyk, E. D. and Kasif, S. Probabilistic Protein Function Prediction from Heterogeneous Genome-Wide Data. PLoS ONE 2007, 2(3):e337.
[3] Gasch, A., Huang, M., Metzner, S., Botstein, D. and Elledge, S. Genomic expression responses to DNA-damaging agents and the regulatory role of the yeast ATR homolog Mec1p. Molecular Biology of the Cell 2001, 12:2987-3003.
data(dna)
Orders the responses by association strength (enrichment score) to a given sample set. For instance, if the samples correspond to a particular experimental factor, this function can be used to prioritize the responses according to their association strength to this factor.
enrichment.list.factor(models, level.samples, method, verbose = FALSE)
models |
List of models. Each model should have a sample-cluster assignment matrix qofz. |
level.samples |
Measure enrichment of this sample (set) across the observed responses. |
method |
'hypergeometric' measures enrichment of factor levels in this response; 'precision' measures response purity for each factor level; 'dependency' measures logarithm of the joint density between response and factor level vs. their marginal densities: log(P(r,s)/(P(r)P(s))) |
verbose |
Follow progress by intermediate messages. |
A data frame of responses ordered by enrichment score for the investigated sample. The model, response id and enrichment score are shown. The method field indicates the enrichment calculation method. The sample field lists the sample set for which the enrichments were calculated. The info field lists additional information on enrichment statistics.
Leo Lahti [email protected]
See citation('netresponse') for citation details.
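To make the three enrichment measures listed for the method argument more concrete, the sketch below computes each of them in base R for one response and one factor level. The sample names and set sizes are invented, and the exact definitions (in particular the one-sided hypergeometric tail and the reading of 'precision' as the fraction of response samples carrying the level) are assumptions made for illustration rather than a description of the package internals.

```r
# Toy sample sets (hypothetical)
all.samples      <- paste0("s", 1:20)
response.samples <- all.samples[1:8]         # samples assigned to one response
level.samples    <- all.samples[c(1:6, 15)]  # samples with the factor level

N <- length(all.samples)
overlap <- length(intersect(response.samples, level.samples))

# Hypergeometric: probability of an overlap at least this large by chance
p.hyper <- phyper(overlap - 1,
                  m = length(level.samples),
                  n = N - length(level.samples),
                  k = length(response.samples),
                  lower.tail = FALSE)

# Precision: share of the response's samples that carry the factor level
precision <- overlap / length(response.samples)

# Dependency: log(P(r,s) / (P(r) P(s))) from empirical frequencies
dependency <- log((overlap / N) /
                  ((length(response.samples) / N) * (length(level.samples) / N)))

c(p.hyper = p.hyper, precision = precision, dependency = dependency)
```

A positive dependency value means the response and the factor level co-occur more often than expected if they were independent.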
Orders the responses by association strength (enrichment score) to a given sample set. For instance, if the samples correspond to a particular experimental factor, this function can be used to prioritize the responses according to their association strength to this factor.
enrichment.list.factor.minimal( groupings, method, verbose = FALSE, annotation.vector, level )
groupings |
List of groupings. Each model should have a sample-cluster assignment matrix qofz. |
method |
'hypergeometric' measures enrichment of factor levels in this response; 'precision' measures response purity for each factor level; 'dependency' measures logarithm of the joint density between response and factor level vs. their marginal densities: log(P(r,s)/(P(r)P(s))) |
verbose |
Follow progress by intermediate messages. |
annotation.vector |
annotation vector |
level |
level |
A data frame of responses ordered by enrichment score for the investigated sample. The model, response id and enrichment score are shown. The method field indicates the enrichment calculation method. The sample field lists the sample set for which the enrichments were calculated. The info field lists additional information on enrichment statistics.
Leo Lahti [email protected]
See citation('netresponse') for citation details.
res <- enrichment.list.factor.minimal(groupings = NULL, method = NULL, annotation.vector = NULL, level = NULL)
List responses for each level of the given factor
factor.responses( annotation.vector, groupings, method = "hypergeometric", min.size = 2, data = NULL )
annotation.vector |
annotation vector with discrete factor levels, and named by the samples |
groupings |
List of groupings. Each model should have a sample-cluster assignment matrix qofz, or a vector of cluster indices named by the samples. |
method |
method for enrichment calculation |
min.size |
minimum sample size for a response |
data |
data (samples x features; or a vector in univariate case) |
List with each element corresponding to one factor level and listing the responses according to association strength
Contact: Leo Lahti [email protected]
See citation('netresponse')
res <- factor.responses(annotation.vector = NULL, groupings = NULL)
List responses for each level of the given factor
factor.responses.minimal( annotation.vector, groupings, method = "hypergeometric", min.size = 2, data = NULL )
annotation.vector |
annotation vector with discrete factor levels, and named by the samples |
groupings |
List of groupings. Each model should have a sample-cluster assignment matrix qofz, or a vector of cluster indices named by the samples. |
method |
method for enrichment calculation |
min.size |
minimum sample size for a response |
data |
data (samples x features; or a vector in univariate case) |
List with each element corresponding to one factor level and listing the responses according to association strength
Contact: Leo Lahti [email protected]
See citation('netresponse')
res <- factor.responses.minimal(annotation.vector = NULL, groupings = NULL)
Given a subnetwork, orders the remaining features (genes) in the input data based on their similarity with the subnetwork. Allows the identification of similar features that are not directly connected in the input network.
find.similar.features(model, subnet.id, datamatrix = NULL, verbose = FALSE, information.criterion = NULL)
model |
NetResponseModel object. |
subnet.id |
Investigated subnetwork. |
datamatrix |
Optional. Can be used to compare subnetwork similarity with new data which was not used for learning the subnetworks. |
verbose |
Logical indicating whether progress of the algorithm should be indicated on the screen. |
information.criterion |
Information criterion for model selection. By default uses the same criterion as the 'model' object. |
The same similarity measure is used as when agglomerating the subnetworks: the features are ordered by delta (change) in the cost function, assuming that the feature would be merged in the subnetwork. The smaller the change, the more similar the feature is (change would minimize the new cost function value). Negative values of delta mean that the cost function would be improved by merging the new feature in the subnetwork, indicating features having coordinated response.
A data frame with elements feature.names (e.g. gene IDs) and delta, which indicates the similarity level (see Details): the smaller the delta, the more similar the feature. The data frame is ordered by decreasing similarity.
Leo Lahti [email protected]
See citation('netresponse') for reference details.
data(toydata)
model <- toydata$model
subnet.id <- 'Subnet-1'
# g <- find.similar.features(model, subnet.id)
# List features that are similar to this subnetwork (delta < 0)
# (ordered by decreasing similarity)
# subset(g, delta < 0)
Get subnetwork data
## S4 method for signature 'NetResponseModel'
get.dat(model, subnet.id, sample = NULL)
model |
Result from NetResponse (detect.responses function). |
subnet.id |
Subnet identifier. A natural number which specifies one of the subnetworks within the 'model' object. |
sample |
Define the retrieved samples |
Subnet data matrix
Leo Lahti [email protected]
See citation('netresponse')
## Load a pre-calculated netresponse model obtained with
# model <- detect.responses(toydata$emat, toydata$netw, verbose = FALSE)
# data( toydata ); get.dat(toydata$model)
Estimate mutual information for node pairs based on the first principal components.
get.mis(datamatrix, network, delta, network.nodes, G, params)
datamatrix |
datamatrix |
network |
network |
delta |
delta |
network.nodes |
network.nodes |
G |
G |
params |
params |
mutual information matrix
Maintainer: Leo Lahti [email protected]
See citation('netresponse')
Retrieve the mixture model parameters of the NetResponse algorithm for a given subnetwork.
get.model.parameters(model, subnet.id = NULL)
model |
Result from NetResponse (detect.responses function). |
subnet.id |
Subnet identifier. A natural number which specifies one of the subnetworks within the 'model' object. |
Only the non-empty components are returned. Note: the original data matrix needs to be provided separately in the function call.
A list with the following elements:
mu |
Centroids for the mixture components. Components x nodes. |
sd |
Standard deviations for the mixture components. A vector over the nodes for each component, implying the diagonal covariance matrix of the model (i.e. diag(sd^2)). Components x nodes. |
w |
Vector of component weights. |
nodes |
List of nodes in the subnetwork. |
K |
Number of mixture components. |
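The elements listed above are enough to evaluate the fitted mixture at a new data point. The sketch below uses a hand-made parameter list laid out as described (mu and sd as components x nodes matrices, w as a weight vector); the values and the helper functions are hypothetical and only illustrate how a diagonal-covariance Gaussian mixture is evaluated.

```r
# Hypothetical parameters in the layout described above (2 components, 2 nodes)
pars <- list(
  mu = rbind(c(0, 0), c(5, 5)),  # centroids, components x nodes
  sd = rbind(c(1, 1), c(1, 2)),  # standard deviations, components x nodes
  w  = c(0.4, 0.6)               # component weights
)

# Density of x under each component; with a diagonal covariance this is the
# product of univariate normal densities over the nodes
component_density <- function(x, pars) {
  sapply(seq_along(pars$w), function(k) {
    prod(dnorm(x, mean = pars$mu[k, ], sd = pars$sd[k, ]))
  })
}

# Mixture density and soft assignment of x to the components
mixture_density <- function(x, pars) sum(pars$w * component_density(x, pars))
responsibilities <- function(x, pars) {
  weighted <- pars$w * component_density(x, pars)
  weighted / sum(weighted)
}

resp <- responsibilities(c(4.5, 5.5), pars)
which.max(resp)  # index of the most likely component for this point
```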
Leo Lahti [email protected]
Leo Lahti et al.: Global modeling of transcriptional responses in interaction networks. Bioinformatics (2010). See citation('netresponse') for details.
# Load toy data
data( toydata )        # Load toy data set
D <- toydata$emat      # Response matrix (for example, gene expression)
model <- toydata$model # Pre-calculated model
# Get model parameters for a given subnet
# (Gaussian mixture: mean, covariance diagonal, mixture proportions)
get.model.parameters(model, subnet.id = 1)
List the detected subnetworks (each is a list of nodes in the corresponding subnetwork).
## S4 method for signature 'NetResponseModel'
get.subnets( model, get.names = TRUE, min.size = 2, max.size = Inf, min.responses = 2 )
model |
Output from the detect.responses function. An object of NetResponseModel class. |
get.names |
Logical. Indicate whether to return subnetwork nodes using node names (TRUE) or node indices (FALSE). |
min.size, max.size |
Numeric. Filter out subnetworks whose size is not within the limits specified here. |
min.responses |
Numeric. Filter out subnetworks with fewer responses (mixture components) than specified here. |
A list of subnetworks.
Leo Lahti [email protected]
Leo Lahti et al.: Global modeling of transcriptional responses in interaction networks. Bioinformatics (2010). See citation('netresponse') for details.
## Load a pre-calculated netresponse model obtained with
# model <- detect.responses(toydata$emat, toydata$netw, verbose = FALSE)
# data( toydata ); get.subnets(toydata$model)
Investigate the association between a continuous variable and the modes, given a list of groupings.
list.responses.continuous.multi( annotation.df, groupings, method = "t-test", pth = Inf, verbose = TRUE, rounding = NULL )
annotation.df |
annotation data.frame with discrete factor levels, rows named by the samples |
groupings |
Sample mode information. Each element corresponds to one grouping; each grouping lists samples for the modes within that grouping. |
method |
method for quantifying the association |
pth |
p-value threshold applied to adjusted p-values |
verbose |
verbose |
rounding |
rounding digits |
Table listing all associations between the factor levels and responses
Contact: Leo Lahti [email protected]
See citation('netresponse')
res <- list.responses.continuous.multi(annotation.df = NULL, groupings = NULL)
Investigate association of a continuous variable and the modes.
list.responses.continuous.single( annotation.df, groupings, method = "t-test", pth = Inf, verbose = TRUE, rounding = NULL, adjust.p = TRUE )
annotation.df |
annotation data.frame with discrete factor levels, rows named by the samples |
groupings |
Sample mode information. Each element corresponds to one of the modes and lists the samples assignment matrix qofz. Alternatively, a vector of mode indices named by the samples can be given. |
method |
method for quantifying the association |
pth |
p-value threshold (for adjusted p-values) |
verbose |
verbose |
rounding |
rounding digits |
adjust.p |
Adjust p-values (this will add p.adj column and remove pvalue column in the output table) |
Table listing all associations between the factor levels and responses
Contact: Leo Lahti [email protected]
See citation('netresponse')
res <- list.responses.continuous.single(annotation.df = NULL, groupings = NULL)
List significantly associated responses for all factors and levels in the given annotation matrix
list.responses.factor( annotation.df, models, method = "hypergeometric", min.size = 2, qth = Inf, verbose = TRUE, data = NULL, rounding = NULL )
annotation.df |
annotation data.frame with discrete factor levels, rows named by the samples |
models |
List of models. Each model should have a sample-cluster assignment matrix qofz, or a vector of cluster indices named by the samples. |
method |
method for enrichment calculation |
min.size |
minimum sample size for a response |
qth |
q-value threshold |
verbose |
verbose |
data |
data (samples x features; or a vector in univariate case) |
rounding |
rounding digits |
Table listing all associations between the factor levels and responses
Contact: Leo Lahti [email protected]
See citation('netresponse')
List significantly associated responses for all factors and levels in the given annotation matrix
list.responses.factor.minimal( annotation.df, groupings, method = "hypergeometric", min.size = 2, pth = Inf, verbose = TRUE, data = NULL, rounding = NULL )
annotation.df |
annotation data.frame with discrete factor levels, rows named by the samples |
groupings |
List of groupings. Each model should have a sample-cluster assignment matrix qofz, or a vector of cluster indices named by the samples. |
method |
method for enrichment calculation |
min.size |
minimum sample size for a response |
pth |
p-value threshold; applied to adjusted p-value |
verbose |
verbose |
data |
data (samples x features; or a vector in univariate case) |
rounding |
rounding digits |
A list with two elements: Table listing all associations between the factor levels and responses; multiple p-value adjustment method
Contact: Leo Lahti [email protected]
See citation('netresponse')
List responses with significant associations to a given sample group.
list.significant.responses(model, sample, qth = 1, method = "hypergeometric")
model |
NetResponseModel object. |
sample |
User-specified sample group for which the enrichments are calculated. For instance, an annotation category. |
qth |
q-value threshold for enrichments |
method |
Enrichment method. |
Statistics of the significantly associated responses.
Leo Lahti [email protected]
See citation('netresponse')
response.enrichment
Convert grouping info into a list; each element corresponds to a group and lists samples in that group.
listify.groupings(groupings, verbose = FALSE)
groupings |
a list, a vector, or a samples x modes assignment matrix |
verbose |
verbose |
Group list
Leo Lahti [email protected]
See citation('netresponse')
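The conversion described above can be imitated in base R for the two non-list input types. This is a sketch under assumptions: the sample names and values are invented, and assigning each sample to the mode with the largest entry in the assignment matrix is a guess at a reasonable rule, not a statement of what listify.groupings does.

```r
# Case 1: a vector of mode indices named by the samples
modes <- c(s1 = 1, s2 = 2, s3 = 1, s4 = 2)
groups.from.vector <- split(names(modes), modes)

# Case 2: a samples x modes assignment matrix (rows sum to 1)
qofz <- rbind(s1 = c(0.9, 0.1),
              s2 = c(0.2, 0.8),
              s3 = c(0.7, 0.3))
hard <- apply(qofz, 1, which.max)  # most probable mode for each sample
groups.from.matrix <- split(names(hard), hard)

groups.from.vector
groups.from.matrix
```

In both cases the result is a list with one element per group, each listing the names of the samples in that group.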
res <- listify.groupings(groupings = NULL)
Fit a Gaussian mixture model.
mixture.model( x, mixture.method = "vdp", max.responses = 10, implicit.noise = 0, prior.alpha = 1, prior.alphaKsi = 0.01, prior.betaKsi = 0.01, vdp.threshold = 1e-05, initial.responses = 1, ite = Inf, speedup = TRUE, bic.threshold = 0, pca.basis = FALSE, min.responses = 1, ... )
x |
data matrix (samples x features, for multivariate analysis) or a vector (for univariate analysis) |
mixture.method |
Specify the approach to use in mixture modeling. Options: vdp (nonparametric variational Dirichlet process mixture model); bic (Gaussian mixture modeling with EM, using BIC to select the optimal number of components) |
max.responses |
Maximum number of responses for each subnetwork. Can be used to limit the potential number of network states. |
implicit.noise |
Implicit noise parameter. Add implicit noise to vdp mixture model. Can help to avoid overfitting to local optima, if this appears to be a problem. |
prior.alpha , prior.alphaKsi , prior.betaKsi
|
Prior parameters for the Gaussian mixture model that is calculated for each subnetwork (normal-inverse-Gamma prior). alpha tunes the mean; alphaKsi and betaKsi are the shape and scale parameters of the inverse Gamma distribution, respectively. |
vdp.threshold |
Minimal free energy improvement after which the variational Gaussian mixture algorithm is deemed converged. |
initial.responses |
Initial number of components for each subnetwork model. Used to initialize calculations. |
ite |
Maximum number of iterations on posterior update (updatePosterior). Increasing this can potentially lead to more accurate results, but computation may take longer. |
speedup |
Takes advantage of approximations to PCA, mutual information etc in various places to speed up calculations. Particularly useful with large and densely connected networks and/or large sample size. |
bic.threshold |
BIC threshold which needs to be exceeded before a new mode is added to the mixture with mixture.method = "bic" |
pca.basis |
pca.basis |
min.responses |
minimum number of responses |
... |
Further optional arguments to be passed. |
List with two elements: model: fitted mixture model (parameters and free energy); model.params: model parameters
Contact: Leo Lahti [email protected]
See citation("netresponse")
res <- mixture.model(NULL)
Subnetwork statistics: size and number of distinct responses for each subnet.
model.stats(models)
models |
NetResponse object or list of models |
A 'subnetworks x properties' data frame containing the following elements.
subnet.size: |
Vector of subnetwork sizes. |
subnet.responses: |
Vector giving the number of responses in each subnetwork. |
Leo Lahti <[email protected]>
Leo Lahti et al.: Global modeling of transcriptional responses in interaction networks. Bioinformatics (2010). See citation('netresponse') for reference details.
# Load a pre-calculated netresponse model obtained with
# model <- detect.responses(toydata$emat, toydata$netw, verbose = FALSE)
data(toydata)
# Calculate summary statistics for the model
stat <- model.stats(toydata$model)
A NetResponse model. Returned by the detect.responses function.
Leo Lahti [email protected]
showClass('NetResponseModel')
Orders the responses by association strength (enrichment score) to a given sample set. For instance, if the samples correspond to a particular experimental factor, this function can be used to prioritize the responses according to their association strength to this factor.
order.responses( models, sample, method = "hypergeometric", min.size = 2, max.size = Inf, min.responses = 2, subnet.ids = NULL, verbose = FALSE, data = NULL )
models |
List of models. Each model should have a sample-cluster assignment matrix qofz. |
sample |
Measure enrichment of this sample (set) across the observed responses. |
method |
'hypergeometric' measures enrichment of factor levels in this response; 'precision' measures response purity for each factor level; 'dependency' measures logarithm of the joint density between response and factor level vs. their marginal densities: log(P(r,s)/(P(r)P(s))) |
min.size , max.size , min.responses
|
Optional parameters to filter the results based on subnet size and number of responses. |
subnet.ids |
Specify subnets for which the responses shall be ordered. By default, use all subnets. |
verbose |
Follow progress by intermediate messages. |
data |
data (samples x features; or a vector in univariate case) |
A data frame with elements 'ordered.responses', which gives a data frame of responses ordered by enrichment score for the investigated sample. The subnetwork, response id and enrichment score are shown. The method field indicates the enrichment calculation method. The sample field lists the sample set for which the enrichments were calculated. The info field lists additional information on enrichment statistics.
Tools for analyzing end results of the model.
Leo Lahti [email protected]
See citation('netresponse') for citation details.
res <- order.responses(models = NULL, sample = NULL)
# - for given sample/s (factor level),
# order responses (across all subnets) by association strength
# (enrichment score); overrepresentation
# order.responses(model, sample, method = 'hypergeometric')
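The 'precision' and 'dependency' scores described for the method argument can be illustrated with plain set arithmetic. This is a minimal sketch under the assumption that the probabilities are estimated as sample fractions; the function name `association_scores` and the toy sample names are hypothetical, and the package may compute these quantities differently.

```r
# Hypothetical helper (not part of netresponse): precision and dependency
# scores for one response and one sample group, using sample fractions.
association_scores <- function(total, response, annotated) {
  n    <- length(total)
  both <- intersect(response, annotated)
  p_r  <- length(response) / n    # P(r): fraction of samples in the response
  p_s  <- length(annotated) / n   # P(s): fraction of samples in the group
  p_rs <- length(both) / n        # P(r, s): fraction in both
  list(
    precision  = length(both) / length(response),  # purity of the response
    dependency = log(p_rs / (p_r * p_s))           # log(P(r,s) / (P(r) P(s)))
  )
}

total     <- paste0("s", 1:20)
response  <- total[1:5]
annotated <- total[c(1:4, 10)]
sc <- association_scores(total, response, annotated)
sc$precision
sc$dependency
```

A dependency score above zero indicates that the response and the sample group co-occur more often than expected under independence.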
A combined yeast data set with protein-protein interactions and gene expression (osmotic shock response). Gene expression profiles are transformed into links by computing a Pearson correlation for all pairs of genes and treating all correlations above 0.85 as additional links. Number of genes: 1711, number of interactions: 10250, number of gene expression observations: 133, number of total links with PPI and expression links: 14256.
data(osmo)
List of following objects:
PPI data matrix
gene expression profiles data matrix
Vector of gene ids corresponding to indices used in data matrices
Gene expression observation details
pooled matrix of PPI and expression links
PPI data pooled from yeast data sets of [1] and [2]. DNA damage expression set of [3].
[1] Ulitsky, I. and Shamir, R. Identification of functional modules using network topology and high-throughput data. BMC Systems Biology 2007, 1:8.
[2] Nariai, N., Kolaczyk, E. D. and Kasif, S. Probabilistic Protein Function Prediction from Heterogeneous Genome-Wide Data. PLoS ONE 2007, 2(3):e337.
[3] O'Rourke, S. and Herskowitz, I. Unique and redundant roles for Hog MAPK pathway components as revealed by whole-genome expression analysis. Molecular Biology of the Cell 2004, 15:532-42.
data(osmo)
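The way expression profiles are turned into links (Pearson correlation above 0.85) and pooled with PPI links can be sketched on simulated data. The gene names, the toy PPI matrix and all variable names below are made up for illustration; this does not reproduce how the osmo data set itself was built.

```r
set.seed(1)
n_obs <- 50
base  <- rnorm(n_obs)
# Toy expression matrix (observations x genes): g1 and g2 are strongly
# correlated, g3 is independent noise
expr <- cbind(g1 = base,
              g2 = base + rnorm(n_obs, sd = 0.1),
              g3 = rnorm(n_obs))

# Toy PPI adjacency matrix with a single g1 - g3 interaction
ppi <- matrix(0, 3, 3, dimnames = list(colnames(expr), colnames(expr)))
ppi["g1", "g3"] <- ppi["g3", "g1"] <- 1

# Expression links: Pearson correlation above 0.85 between gene pairs
co <- cor(expr, method = "pearson")
expr_links <- (co > 0.85) * 1
diag(expr_links) <- 0  # no self-links

# Pooled network: a link is present if either source supports it
pooled <- pmax(ppi, expr_links)
pooled
```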
Plot association strength between user-defined category labels and responses in a selected subnetwork. Associations are shown as -log10(p) enrichment values for the annotation categories for the responses within the specified subnetwork. No correction for multiple testing is applied.
plot_associations( x, subnet.id, labels, method = "hypergeometric", mode = "group.by.classes", ... )
x |
NetResponseModel object |
subnet.id |
Subnetwork. |
labels |
Factor. Labels for the data samples. Name by samples, or provide in the same order as in the original data. |
method |
Method to calculate association strength. |
mode |
group.by.responses or group.by.classes: indicate barplot grouping type. |
... |
Other arguments to be passed to the plot function. |
Used for side effect (plotting).
Leo Lahti [email protected]
See citation('netresponse').
plot_responses
Plotting tool for measurement data. Produces boxplot for each feature in each annotation category for the selected subnetwork.
plot_data(x, subnet.id, labels, ...)
x |
NetResponseModel object. |
subnet.id |
Specify the subnetwork. |
labels |
Annotation categories. |
... |
Further arguments for plot function. |
ggplot2 plot object
Leo Lahti <[email protected]>
See citation('netresponse')
plot_responses
Plot expression matrix in color scale. For one-channel data; plot expression of each gene relative to its mean expression level over all samples. Blue indicates decreased expression and red indicates increased expression. Brightness of the color indicates magnitude of the change. Black denotes no change.
plot_expression(x, maintext, ...)
x |
samples x features matrix |
maintext |
main title |
... |
optional arguments |
Used for its side effects.
Leo Lahti [email protected]
See citation('netresponse').
#plot_expression(x)
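The description above shows each gene relative to its mean expression over all samples. A minimal base R sketch of that centering step follows; the matrix `x` is simulated, and the actual plot_expression function may preprocess the data differently.

```r
set.seed(42)
# Toy samples x features matrix
x <- matrix(rnorm(20, mean = 5), nrow = 5,
            dimnames = list(paste0("sample", 1:5), paste0("gene", 1:4)))

# Subtract each gene's mean over all samples (columns are genes)
centered <- sweep(x, 2, colMeans(x), FUN = "-")

# Negative entries correspond to decreased expression (blue in the plot),
# positive entries to increased expression (red)
round(centered, 2)
```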
Fast investigation of matrix objects; standard visualization choices are made automatically; fast and easy-to-use but does not necessarily provide optimal visualization.
plot_matrix( mat, type = "twoway", midpoint = 0, palette = NULL, colors = NULL, col.breaks = NULL, interval = 0.1, plot_axes = "both", row.tick = 1, col.tick = 1, cex.xlab = 0.9, cex.ylab = 0.9, xlab = NULL, ylab = NULL, limit.trunc = 0, mar = c(5, 4, 4, 2), ... )
mat |
matrix |
type |
String. Specifies visualization type. Options: 'oneway' (color scale ranges from white to dark red; the color can be changed if needed); 'twoway' (color scale ranges from dark blue through white to dark red; colors can be changed if needed) |
midpoint |
middle point for the color plot: smaller values are shown with blue, larger are shown with red in type = 'twoway' |
palette |
Optional. Color palette. |
colors |
Optional. Colors. |
col.breaks |
breakpoints for the color palette |
interval |
interval for palette color switches |
plot_axes |
String. Indicates whether to plot x-axis ('x'), y-axis ('y'), or both ('both'). |
row.tick |
interval for plotting row axis texts |
col.tick |
interval for plotting column axis texts |
cex.xlab |
use this to specify distinct font size for the x axis |
cex.ylab |
use this to specify distinct font size for the y axis |
xlab |
optional x axis labels |
ylab |
optional y axis labels |
limit.trunc |
color scale limit breakpoint |
mar |
image margins |
... |
optional parameters to be passed to function 'image', see help(image) for further details |
A list with the color palette (colors), color breakpoints (breaks), and palette function (palette.function)
Leo Lahti [email protected]
See citation('microbiome')
mat <- rbind(c(1,2,3,4,5), c(1, 3, 1), c(4,2,2))
plot_matrix(mat, 'twoway', midpoint = 3)
Plot a specific transcriptional response for a given subnetwork.
plot_response( x, mynet, mybreaks, mypalette, plot_names = TRUE, colors = TRUE, plot_type = "twopi", ... )
x |
A numerical vector, or NULL. |
mynet |
Binary matrix specifying the interactions between nodes. |
mybreaks |
Specify breakpoints for the color plot. |
mypalette |
Specify palette for the color plot. |
plot_names |
Plot node names (TRUE) or indices (FALSE). |
colors |
Plot colors. Logical. |
plot_type |
Network plot mode. For instance, 'neato' or 'twopi'. |
... |
Further arguments for plot function. |
Used for its side-effects.
Leo Lahti, Olli-Pekka Huovilainen and Antonio Gusmao. Maintainer: Leo Lahti <[email protected]>
L. Lahti et al.: Global modeling of transcriptional responses in interaction networks. Submitted.
#tmp <- plot_response(model, mynet,
#    maintext = paste('Subnetwork', subnet.id))
Plot the detected transcriptional responses for a given subnetwork.
plot_responses( x, subnet.id, nc = 3, plot_names = TRUE, plot_mode = "network", xaxis = TRUE, yaxis = TRUE, plot_type = "twopi", mar = c(5, 4, 4, 2), horiz = TRUE, datamatrix = NULL, scale = FALSE, ... )
x |
Result from NetResponse (detect.responses function). |
subnet.id |
Subnet id. |
nc |
Number of columns for an array of images. |
plot_names |
Plot node names (TRUE) or indices (FALSE). |
plot_mode |
network: plot responses as a subnetwork graph; matrix, heatmap: plot subnetwork expression matrix. For both, expression of each gene is shown relative to the mean expression level of the gene; boxplot_data: feature-wise boxplots for hard sample-to-response assignments; response.barplot: estimated response centroids as barplot including 95% confidence intervals for the means; pca: PCA projection with estimated centroids and 95% confidence intervals; in the two-dimensional case the original coordinates are used. |
xaxis , yaxis
|
Logical. Plot row/column names. |
plot_type |
Network plot mode. For instance, 'neato' or 'twopi'. |
mar |
Figure margins. |
horiz |
Logical. Horizontal barplot. |
datamatrix |
datamatrix |
scale |
scale the phylotypes to unit length (only implemented for plot_mode = 'matrix') |
... |
Further arguments for plot function. |
Used for its side-effects.
Leo Lahti [email protected]
See citation('netresponse')
#res <- detect.responses(D, netw)
#vis <- plot_responses(res, subnet.id)
Plot the color scale used in visualization.
plot_scale( x, y, m = NULL, cex.axis = 1.5, label.step = 2, interval = 0.1, two.sided = TRUE, label.start = NULL, Nlab = 3, ... )
x |
Breakpoints for the plot. |
y |
Color palette. |
m |
Breakpoints' upper limit. |
cex.axis |
Axis scale. |
label.step |
Density of the labels. |
interval |
Interval. |
two.sided |
Plot two-sided (TRUE) or one-sided (FALSE) visualization. |
label.start |
Label starting point. |
Nlab |
Number of labels to plot. |
... |
Further arguments for plot function. |
Used for its side-effects.
Leo Lahti <[email protected]>
See citation('netresponse')
#res <- detect.responses(D, netw, verbose = FALSE)
#vis <- plot_responses(res, subnet.idx)
#plot_scale(vis$breaks, vis$palette)
Plot the given subnetwork.
plot_subnet(x, subnet.id, network, plot_names = TRUE, ...)
x |
Result from NetResponse (detect.responses function). |
subnet.id |
Subnet id. |
network |
Original network used in the modelling. |
plot_names |
Plot node names (TRUE) or indices (FALSE). |
... |
Further arguments for plot function. |
Used for its side-effects. Returns a matrix that describes the investigated subnetwork.
Leo Lahti, Olli-Pekka Huovilainen and Antonio Gusmao. Maintainer: Leo Lahti <[email protected]>
L. Lahti et al.: Global modeling of transcriptional responses in interaction networks. Submitted.
# res <- detect.responses(D, netw, verbose = FALSE)
# net <- plot_subnet(res, subnet.id = 1, network = netw)
Plot mixtures.
PlotMixture( x, qofz, binwidth = 0.05, xlab.text = NULL, ylab.text = NULL, title.text = NULL )
x |
data vector |
qofz |
Mode assignment probabilities for each sample. Samples x modes. |
binwidth |
binwidth for histogram |
xlab.text |
xlab.text |
ylab.text |
ylab.text |
title.text |
title.text |
Used for its side-effects
Leo Lahti [email protected]
See citation('netresponse') for citation details.
# PlotMixture(x, qofz)
Visualize data, centroids and response confidence intervals for a given Gaussian mixture model in the two-dimensional (bivariate) case. Optionally, color the samples according to annotation labels.
PlotMixtureBivariate( x, means, sds, ws, labels = NULL, confidence = 0.95, main = "", ... )
x |
data matrix (samples x features) |
means |
mode centroids (modes x features) |
sds |
mode standard deviations, assuming diagonal covariance matrices (modes x features, each row giving the sqrt of covariance diagonal for the corresponding mode) |
ws |
weight for each mode |
labels |
Optional: sample class labels to be indicated in colors. |
confidence |
Confidence interval for the responses based on the covariances of each response. If NULL, no plotting. |
main |
title text |
... |
Further arguments for plot function. |
Used for its side-effects.
Leo Lahti [email protected]
See citation('netresponse') for citation details.
#PlotMixtureBivariate(dat, means, sds, ws)
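To make the confidence argument more concrete, the sketch below computes the boundary of a confidence region for one mixture component with diagonal covariance, as assumed for the sds argument. The helper `confidence_ellipse` is hypothetical and uses only base R; the package's plotting code may rely on a dedicated ellipse function instead.

```r
# Hypothetical helper: boundary points of the confidence ellipse of a
# bivariate Gaussian with diagonal covariance (axis-aligned ellipse).
confidence_ellipse <- function(centroid, sds, confidence = 0.95, npoints = 100) {
  # Squared Mahalanobis radius follows a chi-squared distribution with 2 df
  r     <- sqrt(qchisq(confidence, df = 2))
  theta <- seq(0, 2 * pi, length.out = npoints)
  cbind(x = centroid[1] + r * sds[1] * cos(theta),
        y = centroid[2] + r * sds[2] * sin(theta))
}

ell <- confidence_ellipse(centroid = c(0, 5), sds = c(1, 2))
# plot(ell, type = "l", asp = 1)  # draw the 95% region around the centroid
```

Because the covariance is diagonal, the ellipse axes are parallel to the coordinate axes and scale with the per-feature standard deviations.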
Visualize data, centroids and response confidence intervals for a given Gaussian mixture model with PCA. Optionally, color the samples according to annotation labels.
PlotMixtureMultivariate( x, means, sds, ws, labels = NULL, title = NULL, modes = NULL, pca = FALSE, qofz = NULL, ... )
x |
data matrix (samples x features) |
means |
mode centroids (modes x features) |
sds |
mode standard deviations, assuming diagonal covariance matrices (modes x features, each row giving the sqrt of covariance diagonal for the corresponding mode) |
ws |
weight for each mode |
labels |
Optional: sample class labels to be indicated in colors. |
title |
title |
modes |
Optional: provide sample modes for visualization already in the input |
pca |
Logical. If TRUE, the data is projected onto the PCA plane. With pca = FALSE (the default in the usage above), two-dimensional data can be visualized in the original domain. |
qofz |
Sample-response probabilistic assignments matrix (samples x responses) |
... |
Further arguments for plot function. |
Used for its side-effects.
Leo Lahti [email protected]
See citation('netresponse') for citation details.
#PlotMixtureMultivariate(dat, means, sds, ws)
Visualize data, centroids and standard deviations for a given univariate Gaussian mixture model.
PlotMixtureUnivariate( x, means = NULL, sds = NULL, ws = NULL, title.text = NULL, xlab.text = NULL, ylab.text = NULL, binwidth = 0.05, qofz = NULL, density.color = "darkgray", cluster.assignments = NULL, ... )
x |
data vector |
means |
mode centroids |
sds |
mode standard deviations |
ws |
weight for each mode |
title.text |
Plot title |
xlab.text |
xlab.text |
ylab.text |
ylab.text |
binwidth |
binwidth for histogram |
qofz |
Mode assignment probabilities for each sample. Samples x modes. |
density.color |
Color for density lines |
cluster.assignments |
Vector of cluster indices, indicating cluster for each data point |
... |
Further arguments for plot function. |
Used for its side-effects
Leo Lahti [email protected]
See citation('netresponse') for citation details.
# PlotMixtureUnivariate(dat, means, sds, ws)
Visualize data, centroids and response confidence intervals for a given subnetwork with PCA. Optionally, color the samples according to annotation labels.
plotPCA(x, subnet.id, labels = NULL, confidence = 0.95, npoints = NULL, ...)
x |
NetResponseModel object. Output from the detect.responses function. |
subnet.id |
Subnetwork id. Either character as 'Subnetwork-2' or numeric as 2, which is then converted to character. |
labels |
Optional: sample class labels to be indicated in colors. |
confidence |
Confidence interval for the responses based on the covariances of each response. If NULL, no plotting. |
npoints |
Argument to the ellipse function |
... |
Further arguments for plot function. |
Used for its side-effects.
Leo Lahti [email protected]
See citation('netresponse') for citation details.
#plotPCA(x, subnet.id)
Function to read network files.
read.sif(sif.file, format = 'graphNEL', directed = FALSE, header = TRUE, sep = '\t', ...)
sif.file |
Name of network file in SIF format. |
format |
Output format: igraph or graphNEL |
directed |
Logical. Directed/undirected graph. Not used in the current model. |
header |
Logical. Indicate whether the SIF file has header or not. |
sep |
Field separator. |
... |
Further optional arguments to be passed for file reading. |
Read in SIF network file, return R graph object in igraph or graphNEL format.
R graph object in igraph or graphNEL format.
Leo Lahti [email protected]
#net <- read.sif('network.sif')
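For readers unfamiliar with the SIF format, the following base R sketch reads a small three-column SIF file (source node, interaction type, target node) into an undirected adjacency matrix. It is only an illustration of the file layout: the file contents and variable names are made up, the toy file has no header line, and read.sif itself returns an igraph or graphNEL object rather than a matrix.

```r
# Write a toy SIF file: source <tab> interaction type <tab> target
sif_file <- tempfile(fileext = ".sif")
writeLines(c("A\tpp\tB", "B\tpp\tC", "A\tpp\tC"), sif_file)

# Read the edge list; all columns are kept as character strings
edges <- read.table(sif_file, header = FALSE, sep = "\t",
                    col.names = c("from", "type", "to"),
                    colClasses = "character")

# Build a symmetric (undirected) binary adjacency matrix
nodes <- sort(unique(c(edges$from, edges$to)))
adj <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
adj[cbind(edges$from, edges$to)] <- 1
adj[cbind(edges$to, edges$from)] <- 1

unlink(sif_file)
adj
```

Lines that list several targets after one interaction type, which the SIF format also allows, are not handled by this sketch.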
Calculate enrichment values for a specified sample group in the given response.
response.enrichment( total.samples, response.samples, annotated.samples, method = "hypergeometric" )
total.samples |
All samples in the data |
response.samples |
Samples in the investigated subset |
annotated.samples |
Samples at the investigated annotation level for enrichment calculation |
method |
Enrichment method. |
List with enrichment statistics, depending on enrichment method.
Leo Lahti [email protected]
See citation('netresponse')
order.responses
#enr <- response.enrichment(total.samples, response.samples, annotated.samples, method = 'hypergeometric')
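The hypergeometric enrichment of an annotated sample group within a response can be sketched with base R as follows. This is a minimal illustration assuming a one-sided overrepresentation test; the helper `hypergeo_enrichment` and the toy sample names are hypothetical, and the statistics returned by response.enrichment may be organized differently.

```r
# Hypothetical helper: one-sided hypergeometric test for overrepresentation
# of annotated samples among the samples of a response.
hypergeo_enrichment <- function(total.samples, response.samples, annotated.samples) {
  N <- length(total.samples)                                 # all samples
  K <- length(intersect(annotated.samples, total.samples))   # annotated samples
  n <- length(response.samples)                              # samples in response
  k <- length(intersect(response.samples, annotated.samples))  # overlap
  # P(X >= k) when drawing n samples out of N, of which K are annotated
  p <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
  list(overlap = k, p.value = p)
}

total     <- paste0("s", 1:20)
response  <- total[1:5]
annotated <- total[c(1:4, 10)]
enr <- hypergeo_enrichment(total, response, annotated)
enr$p.value
-log10(enr$p.value)  # scale used when plotting association strength
```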
List the most strongly associated response of a given subnetwork for each sample.
response2sample( model, subnet.id = NULL, component.list = TRUE, verbose = FALSE, data = NULL )
model |
A NetResponseModel object or list. |
subnet.id |
Subnet id. A natural number which specifies one of the subnetworks within the 'model' object. |
component.list |
List samples separately for each mixture component (TRUE). Else list the most strongly associated component for each sample (FALSE). |
verbose |
Follow progress by intermediate messages. |
data |
Data (features x samples; or a vector for univariate case) to predict response for given data points (currently implemented only for mixture.model output). |
A list. Each element corresponds to one subnetwork response, and contains a list of samples that are associated with the response (samples for which this response has the highest probability P(response | sample)).
Leo Lahti [email protected]
Leo Lahti et al.: Global modeling of transcriptional responses in interaction networks. Bioinformatics (2010). See citation('netresponse') for citation details.
# Load example data
data( toydata ) # Load toy data set
D <- toydata$emat # Response matrix (for example, gene expression)
model <- toydata$model # Pre-calculated model
# Find the samples for each response (for a given subnetwork)
response2sample(model, subnet.id = 1)
Probabilistic sample-response assignments for given subnet.
sample2response(model, subnet.id, mode = 'soft')
model |
Result from NetResponse (detect.responses function). |
subnet.id |
Subnet identifier. A natural number which specifies one of the subnetworks within the 'model' object. |
mode |
soft: gives samples x responses probabilistic assignment matrix; hard: gives the most likely response for each sample |
A matrix of probabilities. Sample-response assignments for given subnet, listing the probability of each response, given a sample.
Leo Lahti [email protected]
Leo Lahti et al.: Global modeling of transcriptional responses in interaction networks. Bioinformatics (2010). See citation('netresponse') for citation details.
data( toydata ) # Load toy data set
D <- toydata$emat # Response matrix (for example, gene expression)
netw <- toydata$netw # Network
# Detect network responses
#model <- detect.responses(D, netw, verbose = FALSE)
# Assign samples to responses (soft, probabilistic assignments sum to 1)
#response.probabilities <- sample2response(model, subnet.id = 'Subnet-1')
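The difference between the soft and hard modes can be shown on a small hand-made assignment matrix. The matrix `qofz` and the response names below are invented for illustration; in practice the soft matrix would come from sample2response.

```r
# Toy soft assignment matrix (samples x responses); each row sums to 1
qofz <- matrix(c(0.9, 0.1, 0.0,
                 0.2, 0.7, 0.1,
                 0.3, 0.3, 0.4),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("sample", 1:3), paste0("Mode-", 1:3)))
stopifnot(all(abs(rowSums(qofz) - 1) < 1e-12))

# Hard assignment: the most likely response for each sample
hard <- colnames(qofz)[apply(qofz, 1, which.max)]
names(hard) <- rownames(qofz)
hard
```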
Set breakpoints for two-way color palette.
set.breaks(mat, interval = 0.1)
mat |
Matrix to visualize. |
interval |
Density of color breakpoints. |
A vector listing the color breakpoints.
Leo Lahti, Olli-Pekka Huovilainen and Antonio Gusmao. Maintainer: Leo Lahti <[email protected]>
L. Lahti et al.: Global modeling of transcriptional responses in interaction networks. Submitted.
set.breaks(array(rnorm(100), dim = c(10, 10)), interval = .1)
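One plausible way to construct breakpoints for a two-way palette is to make them symmetric around zero and spaced by interval. The sketch below shows that idea only; the helper `symmetric_breaks` is hypothetical, and set.breaks may choose the range or the end points differently.

```r
# Hypothetical helper: symmetric color breakpoints around zero
symmetric_breaks <- function(mat, interval = 0.1) {
  m <- max(abs(mat), na.rm = TRUE)
  # Round the limit up to the next multiple of the interval
  m <- ceiling(m / interval) * interval
  seq(-m, m, by = interval)
}

brk <- symmetric_breaks(matrix(c(-0.25, 0.1, 0.42, 0), nrow = 2), interval = 0.1)
range(brk)
length(brk)
```

Symmetric breakpoints keep zero in the middle of the palette, so that negative and positive values receive colors of matching intensity.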
Split q of z.
Main function of the NetResponse algorithm. Detect condition-specific network responses, given network and a set of measurements of node activity in a set of conditions. Returns a set of subnetworks and their estimated context-specific responses.
## S3 method for class 'qofz'
split(qOFz, c, new.c, dat, speedup = TRUE, min.size = 4)
qOFz |
qOFz |
c |
c |
new.c |
new.c |
dat |
dat |
speedup |
speedup |
min.size |
min.size |
INPUT: data, qOFz, hp_posterior, hp_prior, opts
OUTPUT: list(new.qOFz, new.c)
* new.qOFz: posterior over labels including the split clusters.
* new.c: index of the newly created cluster.
DESCRIPTION: Implements the VDP algorithm step 3a.
A component must have at least min.size samples to be split.
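The splitting step can be sketched in base R: the samples of a component are divided by the sign of their score on the first principal component, which gives initial hard responsibilities for the two new clusters. The helper `split_by_pc1` is hypothetical and simplified; it ignores the speedup approximation and operates on a single component only.

```r
# Hypothetical helper: split one component into two along its first
# principal component; returns a samples x 2 matrix of hard responsibilities.
split_by_pc1 <- function(dat, min.size = 4) {
  if (nrow(dat) < min.size) return(NULL)  # too small to be split
  pc1   <- prcomp(dat, center = TRUE, scale. = FALSE)$x[, 1]
  in_c1 <- pc1 >= 0                       # side of the bisector
  cbind(c1 = as.numeric(in_c1), c2 = as.numeric(!in_c1))
}

set.seed(7)
# Two well separated groups of 50 samples each
dat <- rbind(matrix(rnorm(100, mean = -5), ncol = 2),
             matrix(rnorm(100, mean =  5), ncol = 2))
new_qofz <- split_by_pc1(dat)
colSums(new_qofz)  # sizes of the two new clusters
```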
Toy data for NetResponse examples.
data(toydata)
Toy data: a list with three elements:
emat: Data matrix (samples x features). This contains the same features that are provided in the network (toydata$netw). The matrix characterizes measurements of network states across different conditions.
netw: Binary matrix that describes pairwise interactions between features. This defines an undirected network over the features. A link between two nodes is denoted by 1.
model: A pre-calculated model. Object of NetResponseModel class, resulting from applying the netresponse algorithm on the toydata with model <- detect.responses(D, netw).
Leo Lahti et al.: Global modeling of transcriptional responses in interaction networks. Bioinformatics (2010).
data(toydata)
D <- toydata$emat # Response matrix (samples x features)
netw <- toydata$netw # Network between the features
model <- toydata$model # Pre-calculated NetResponseModel obtained with
# model <- detect.responses(D, netw)
Accelerated variational Dirichlet process Gaussian mixture.
vdp.mixt( dat, prior.alpha = 1, prior.alphaKsi = 0.01, prior.betaKsi = 0.01, do.sort = TRUE, threshold = 1e-05, initial.K = 1, ite = Inf, implicit.noise = 0, c.max = 10, speedup = TRUE, min.size = 5 )
dat |
Data matrix (samples x features). |
prior.alpha , prior.alphaKsi , prior.betaKsi
|
Prior parameters for the Gaussian mixture model (normal-inverse-Gamma prior). alpha tunes the mean; alphaKsi and betaKsi are the shape and scale parameters of the inverse Gamma distribution, respectively. |
do.sort |
When TRUE, qOFz will be sorted in decreasing order by component size, based on colSums(qOFz). The qOFz matrix describes the sample-component assignments in the mixture model. |
threshold |
Defines the minimal free energy improvement that stops the algorithm: used to define convergence limit. |
initial.K |
Initial number of mixture components. |
ite |
Defines maximum number of iterations on posterior update (updatePosterior). Increasing this can potentially lead to more accurate results, but computation may take longer. |
implicit.noise |
Adds implicit noise; used by vdp.mk.log.lambda.so and vdp.mk.hp.posterior.so. By adding noise (positive values), one can avoid overfitting to local optima in some cases, if this happens to be a problem. |
c.max |
Maximum number of candidates to consider in find.best.splitting. During mixture model calculations new mixture components can be created until this upper limit has been reached. Defines the level of truncation for a truncated stick-breaking process. |
speedup |
When learning the number of components, each component is split based on its first PCA component. To speed up, approximate by using only a subset of the data to calculate PCA. |
min.size |
Minimum size for a component required for potential splitting during mixture estimation. |
Implementation of the Accelerated variational Dirichlet process Gaussian mixture model algorithm by Kenichi Kurihara et al., 2007.
ALGORITHM SUMMARY This code implements Gaussian mixture models with diagonal covariance matrices. The following greedy iterative approach is taken in order to obtain the number of mixture components and their corresponding parameters:
1. Start from one cluster, $T = 1$.
2. Select a number of candidate clusters according to their values of $N_c = \sum_{n=1}^{N} q_{z_n}(z_n = c)$ (larger is better).
3. For each of the candidate clusters, $c$:
3a. Split $c$ into two clusters, $c_1$ and $c_2$, through the bisector of its principal component. Initialise the responsibilities $q_{z_n}(z_n = c_1)$ and $q_{z_n}(z_n = c_2)$.
3b. Update only the parameters of $c_1$ and $c_2$ using the observations that belonged to $c$, and determine the new value for the free energy, $F_{T+1}$.
3c. Reassign cluster labels so that cluster 1 corresponds to the largest cluster, cluster 2 to the second largest, and so on.
4. Select the split that leads to the maximal reduction of free energy, $F_{T+1}$.
5. Update the posterior using the newly split data.
6. If $F_T - F_{T+1} < \epsilon$ then halt, else set $T := T + 1$ and go to step 2.
The loop is implemented in the function greedy(...)
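The component sizes used to rank candidate clusters in step 2, and the relabeling by size in step 3c, reduce to simple column operations on the responsibility matrix. The sketch below uses a hand-made `qOFz` matrix for illustration and is not taken from the package's greedy(...) implementation.

```r
# Toy responsibility matrix (samples x components); each row sums to 1
qOFz <- matrix(c(0.1, 0.8, 0.1,
                 0.2, 0.7, 0.1,
                 0.1, 0.1, 0.8,
                 0.0, 0.9, 0.1),
               ncol = 3, byrow = TRUE)

# Component sizes: sum of responsibilities over the samples
Nc <- colSums(qOFz)

# Rank components by size (larger is better) and relabel so that
# component 1 is the largest, component 2 the second largest, and so on
ord <- order(Nc, decreasing = TRUE)
qOFz_sorted <- qOFz[, ord, drop = FALSE]
colSums(qOFz_sorted)
```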
prior |
Prior parameters of the vdp-gm model (qofz: priors on observation labels; Mu: centroids; S2: variance). |
posterior |
Posterior estimates for the model parameters and statistics. |
weights |
Mixture proportions, or weights, for the Gaussian mixture components. |
centroids |
Centroids of the mixture components. |
sds |
Standard deviations for the mixture model components (square roots of the posterior modes of the covariance diagonals). Calculated as sqrt(invgam.scale/(invgam.shape + 1)). |
qOFz |
Sample-to-cluster assignments (soft probabilistic associations). |
Nc |
Component sizes |
invgam.shape |
Shape parameter (alpha) of the inverse Gamma distribution |
invgam.scale |
Scale parameter (beta) of the inverse Gamma distribution |
Nparams |
Number of model parameters |
K |
Number of components in the mixture model |
opts |
Model parameters that were used. |
free.energy |
Free energy of the model. |
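As a small worked example of how some of these quantities relate to each other, the sketch below uses made-up stand-in values rather than output from vdp.mixt. The matrix layouts (samples x components for qOFz, components x features for the inverse Gamma parameters) are assumptions made for illustration.

```r
# Toy stand-ins for posterior fields; all numbers are made up for illustration
# and the matrix layouts are assumptions, not taken from the package.
qOFz <- rbind(c(0.9, 0.1), c(0.8, 0.2), c(0.3, 0.7), c(0.1, 0.9))
invgam.shape <- matrix(3, nrow = 2, ncol = 2)
invgam.scale <- matrix(c(4, 16, 4, 16), nrow = 2)

# Component sizes as soft counts over samples: Nc = sum_n q(z_n = c)
Nc <- colSums(qOFz)

# Standard deviations, using the formula given for 'sds'
sds <- sqrt(invgam.scale / (invgam.shape + 1))

# Hard sample-to-component assignments derived from the soft ones
hard <- apply(qOFz, 1, which.max)
```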
This implementation is based on the Variational Dirichlet Process Gaussian Mixture Model implementation, Copyright (C) 2007 Kenichi Kurihara (all rights reserved) and the Agglomerative Independent Variable Group Analysis package (in Matlab): Copyright (C) 2001-2007 Esa Alhoniemi, Antti Honkela, Krista Lagus, Jeremias Seppa, Harri Valpola, and Paul Wagner.
Maintainer: Leo Lahti [email protected]
Kenichi Kurihara, Max Welling and Nikos Vlassis: Accelerated Variational Dirichlet Process Mixtures. In B. Schölkopf, J. Platt and T. Hoffman (eds.), Advances in Neural Information Processing Systems 19, 761–768. MIT Press, Cambridge, MA 2007.
set.seed(123)
# Generate toy data with two Gaussian components
dat <- rbind(array(rnorm(400), dim = c(200,2)) + 5,
             array(rnorm(400), dim = c(200,2)))
# Infinite Gaussian mixture model with
# Variational Dirichlet Process approximation
mixt <- vdp.mixt( dat )
# Centroids of the detected Gaussian components
mixt$posterior$centroids
# Hard mixture component assignments for the samples
apply(mixt$posterior$qOFz, 1, which.max)
Convert grouping info into a vector; each element corresponds to a group and lists samples in that group.
vectorize.groupings(groupings, verbose = FALSE)
groupings |
a list, a vector, or a samples x modes assignment matrix |
verbose |
Logical; whether to print progress messages. |
Indicator vector
Leo Lahti [email protected]
See citation('netresponse')
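The relationship between a samples x modes assignment matrix, a per-sample indicator vector, and a per-group list of samples can be sketched in base R as follows. This is a hypothetical illustration only, not the implementation of vectorize.groupings; the object names and toy values are assumptions.

```r
# Hypothetical sketch; NOT the package's vectorize.groupings implementation.
# A samples x modes assignment matrix with soft memberships (toy values)
assignment <- rbind(s1 = c(0.9, 0.1), s2 = c(0.2, 0.8), s3 = c(0.6, 0.4))
colnames(assignment) <- c("mode1", "mode2")

# Indicator vector: index of the most probable mode for each sample
mode.index <- apply(assignment, 1, which.max)

# List form: each element names the samples that fall in one group
groups <- split(names(mode.index), colnames(assignment)[mode.index])
```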
Experimental version.
write.netresponse.results(x, subnet.ids = NULL, filename)
x |
NetResponseModel |
subnet.ids |
List of subnet ids to consider. By default, all subnets. |
filename |
Output file name. |
Used for side effects.
Leo Lahti [email protected]
See citation('netresponse')