Title: | Gene Set Analysis Exploiting Pathway Topology |
---|---|
Description: | Implements topological gene set analysis using a two-step empirical approach. It exploits graph decomposition theory to create a junction tree and reconstruct the most relevant signal path. In the first step clipper selects significant pathways according to statistical tests on the means and the concentration matrices of the graphs derived from pathway topologies. Then, it "clips" the whole pathway identifying the signal paths having the greatest association with a specific phenotype. |
Authors: | Paolo Martini <[email protected]>, Gabriele Sales <[email protected]>, Chiara Romualdi <[email protected]> |
Maintainer: | Paolo Martini <[email protected]> |
License: | AGPL-3 |
Version: | 1.47.0 |
Built: | 2024-10-30 04:43:09 UTC |
Source: | https://github.com/bioc/clipper |
Basing on either variance or mean clique test, this function identifies the paths that are mostly related with the phenotype under study.
clipper(expr, classes, graph, method=c("variance","mean", "both", "paired"), nperm=100, alphaV=0.05, b=100, root=NULL, trZero=0.001, signThr=0.05, maxGap=1, permute=TRUE, alwaysShrink=FALSE)
clipper(expr, classes, graph, method=c("variance","mean", "both", "paired"), nperm=100, alphaV=0.05, b=100, root=NULL, trZero=0.001, signThr=0.05, maxGap=1, permute=TRUE, alwaysShrink=FALSE)
expr |
an expression matrix or ExpressionSet with colnames for samples and row name for genes. |
classes |
vector of 1,2 indicating the classes of samples (columns). |
graph |
a |
method |
the kind of test to perform on the cliques. It could be mean, variance, mixed (the best between variance and mean) or paired mean. |
nperm |
number of permutations. Default = 100. |
alphaV |
pvalue threshold for variance test to be used during mean test. Default = 0.05. |
b |
number of permutations for mean analysis. Default = 100. |
root |
nodes by which rip ordering is performed (as far as possible) on the variables using the maximum cardinality search algorithm. |
trZero |
lowest pvalue detectable. This threshold avoids that -log(p) goes infinite. |
signThr |
significance threshold for clique pvalues. |
maxGap |
allow up to maxGap gaps in the best path computation. Default = 1. |
permute |
always performs permutations in the concentration matrix test. If FALSE, the test is made using the asymptotic distribution of the log-likelihood ratio. This option should be use only if samples size is >=40 per class. |
alwaysShrink |
always perform the shrinkage estimates of variance. |
The both method combines the results obtained from the mean and variance test. In particular it assign to the cliques the minimum of mean and variance p-values.
A matrix with a row for each paths. Columns are organized as follows:
Index of the starting clique
Index of the ending clique
Index of the clique where the maximum value is reached
Length of the path
Maximum score of the path
Average score along the path
Percentage of path activation
Impact of the path on the entire pathway
Cliques involved and significant
Cliques forming the path
Genes forming the significant cliques
Genes forming the path
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
cliqueVarianceTest
, cliqueMeanTest
,
getJunctionTreePaths
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) clipped <- clipper(all, classes, graph, "var", trZero=0.01, permute=FALSE) clipped[,1:5] }
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) clipped <- clipper(all, classes, graph, "var", trZero=0.01, permute=FALSE) clipped[,1:5] }
Basing on either variance or mean clique test, this function identifies the paths that are mostly related with the phenotype under study.
clipperAllRoots(expr, classes, graph, method=c("variance","mean", "both", "paired"), nperm=100, alphaV=0.05, b=100, trZero=0.001, signThr=0.05, maxGap=1, permute=TRUE, alwaysShrink=FALSE)
clipperAllRoots(expr, classes, graph, method=c("variance","mean", "both", "paired"), nperm=100, alphaV=0.05, b=100, trZero=0.001, signThr=0.05, maxGap=1, permute=TRUE, alwaysShrink=FALSE)
expr |
an expression matrix or ExpressionSet with colnames for samples and row name for genes. |
classes |
vector of 1,2 indicating the classes of samples (columns). |
graph |
a |
method |
the kind of test to perform on the cliques. It could be mean, variance, mixed (the best between variance and mean) or paired mean. |
nperm |
number of permutations. Default = 100. |
alphaV |
pvalue threshold for variance test to be used during mean test. Default = 0.05. |
b |
number of permutations for mean analysis. Default = 100. |
trZero |
lowest pvalue detectable. This threshold avoids that -log(p) goes infinite. |
signThr |
significance threshold for clique pvalues. |
maxGap |
allow up to maxGap gaps in the best path computation. Default = 1. |
permute |
always performs permutations in the concentration matrix test. If FALSE, the test is made using the asymptotic distribution of the log-likelihood ratio. This option should be use only if samples size is >=40 per class. |
alwaysShrink |
always perform the shrinkage estimates of variance. |
The both method combines the results obtained from the mean and variance test. In particular it assign to the cliques the minimum of mean and variance p-values.
A matrix with a row for each paths. Rownames have the form:
roots-paths.
Columns are organized as follows:
Index of the starting clique
Index of the ending clique
Index of the clique where the maximum value is reached
Length of the path
Maximum score of the path
Average score along the path
Percentage of path activation
Impact of the path on the entire pathway
Cliques involved and significant
Cliques forming the path
Genes forming the significant cliques
Genes forming the path
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
cliqueVarianceTest
, cliqueMeanTest
,
getJunctionTreePaths
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) clipped <- clipperAllRoots(all, classes, graph, "var", trZero=0.01, permute=FALSE) clipped[,1:5] }
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) clipped <- clipperAllRoots(all, classes, graph, "var", trZero=0.01, permute=FALSE) clipped[,1:5] }
It decomposes the graph in cliques and performs the mean test in every one.
cliqueMeanTest(expr, classes, graph, nperm, alphaV=0.05, b=100, root=NULL, permute=TRUE, alwaysShrink=FALSE)
cliqueMeanTest(expr, classes, graph, nperm, alphaV=0.05, b=100, root=NULL, permute=TRUE, alwaysShrink=FALSE)
expr |
an expression matrix or ExpressionSet with colnames for samples and row name for genes. |
classes |
vector of 1,2 indicating the classes of samples (columns). |
graph |
a |
nperm |
number of permutations. |
alphaV |
pvalue threshold for variance test to be used during mean test. |
b |
number of permutations for mean analysis. |
root |
nodes by which rip ordering is performed (as far as possible) on the variables using the maximum cardinality search algorithm. |
permute |
always performs permutations in the concentration matrix test. If FALSE, the test is made using the asymptotic distribution of the log-likelihood ratio. This option should be use only if samples size is >=40 per class. |
alwaysShrink |
always perform the shrinkage estimates of variance. |
a list with alphas (vector of cliques pvalues based on the mean test) and cliques (list of the cliques and related elements).
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) cliqueMeanTest(all, classes, graph, nperm=100, permute=FALSE)$alpha }
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) cliqueMeanTest(all, classes, graph, nperm=100, permute=FALSE)$alpha }
It decomposes the graph in cliques and performs the combination of mean e variance test in every one.
cliqueMixedTest(expr, classes, graph, nperm, alphaV=0.05, b=100, root=NULL, permute=TRUE, alwaysShrink=FALSE)
cliqueMixedTest(expr, classes, graph, nperm, alphaV=0.05, b=100, root=NULL, permute=TRUE, alwaysShrink=FALSE)
expr |
an expression matrix or ExpressionSet with colnames for samples and row name for genes. |
classes |
vector of 1,2 indicating the classes of samples (columns). |
graph |
a |
nperm |
number of permutations. |
alphaV |
pvalue threshold for variance test to be used during mean test. |
b |
number of permutations for mean analysis. |
root |
nodes by which rip ordering is performed (as far as possible) on the variables using the maximum cardinality search algorithm. |
permute |
always performs permutations in the concentration matrix test. If FALSE, the test is made using the asymptotic distribution of the log-likelihood ratio. This option should be use only if samples size is >=40 per class. |
alwaysShrink |
always perform the shrinkage estimates of variance. |
The method combines the results obtained from the mean and variance test. In particular it assign to the cliques the minimum of mean and variance p-values.
a list with alphas (vector of cliques pvalues based on the variance test) and cliques (list of the cliques and related elements).
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) cliqueMeanTest(all, classes, graph, nperm=100, permute=FALSE)$alpha }
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) cliqueMeanTest(all, classes, graph, nperm=100, permute=FALSE)$alpha }
It decomposes the graph in cliques and performs the paired mean test in every one.
cliquePairedTest(expr, classes, graph, nperm, alphaV=0.05, b=100, root=NULL, permute=TRUE, alwaysShrink=FALSE)
cliquePairedTest(expr, classes, graph, nperm, alphaV=0.05, b=100, root=NULL, permute=TRUE, alwaysShrink=FALSE)
expr |
an expression matrix or ExpressionSet with colnames for samples and row name for genes. |
classes |
vector of 1,2 indicating the classes of samples (columns). It is assumed that class labels are ordered so that the first occurrence of class 2 is paired with the first occurrence of class 1 and so on. |
graph |
a |
nperm |
number of permutations. |
alphaV |
pvalue threshold for variance test to be used during mean test. |
b |
number of permutations for mean analysis. |
root |
nodes by which rip ordering is performed (as far as possible) on the variables using the maximum cardinality search algorithm. |
permute |
always performs permutations in the concentration matrix test. If FALSE, the test is made using the asymptotic distribution of the log-likelihood ratio. This option should be use only if samples size is >=40 per class. |
alwaysShrink |
always perform the shrinkage estimates of variance. |
a list with alphas (vector of cliques pvalues based on the variance test) and cliques (list of the cliques and related elements).
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) cliquePairedTest(all, classes, graph, nperm=100, permute=FALSE)$alpha }
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) cliquePairedTest(all, classes, graph, nperm=100, permute=FALSE)$alpha }
It decomposes the graph in cliques and performs the variance test in every one.
cliqueVarianceTest(expr, classes, graph, nperm, alphaV=0.05, b=100, root=NULL, permute=TRUE, alwaysShrink=FALSE)
cliqueVarianceTest(expr, classes, graph, nperm, alphaV=0.05, b=100, root=NULL, permute=TRUE, alwaysShrink=FALSE)
expr |
an expression matrix or ExpressionSet with colnames for samples and row name for genes. |
classes |
vector of 1,2 indicating the classes of samples (columns). |
graph |
a |
nperm |
number of permutations. |
alphaV |
pvalue threshold for variance test to be used during mean test. |
b |
number of permutations for mean analysis. |
root |
nodes by which rip ordering is performed (as far as possible) on the variables using the maximum cardinality search algorithm. |
permute |
always performs permutations in the concentration matrix test. If FALSE, the test is made using the asymptotic distribution of the log-likelihood ratio. This option should be use only if samples size is >=40 per class. |
alwaysShrink |
always perform the shrinkage estimates of variance. |
a list with alphas (vector of cliques pvalues based on the variance test) and cliques (list of the cliques and related elements).
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) cliqueVarianceTest(all, classes, graph, nperm=100, permute=FALSE)$alpha }
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:20] classes <- c(rep(1,10), rep(2,10)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) cliqueVarianceTest(all, classes, graph, nperm=100, permute=FALSE)$alpha }
graphNel
object.Remove from a graphNEL
object the edge specified.
deleteEdge(graph, from, to)
deleteEdge(graph, from, to)
graph |
a graphNEL object. |
from |
a string with the name of the node where the edge start. |
to |
a string with the name of the node where the edge end. |
a graphNEL
object.
if (require(graphite)) { kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) head(edges(graph)) ## We are going to remove the edge 1026-1019 head(edges(deleteEdge(graph, "ENTREZID:1026", "ENTREZID:1019"))) }
if (require(graphite)) { kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) head(edges(graph)) ## We are going to remove the edge 1026-1019 head(edges(deleteEdge(graph, "ENTREZID:1026", "ENTREZID:1019"))) }
Easy clip function allows the full exploitation of Clipper Package features in a unique and easy to use function. Starting from an expression matrix and a pathway, these function extact the most transcriptionally altered portions of the graph.
easyClip(expr, classes, graph, method=c("variance","mean"), pathThr=0.05, pruneLevel=0.2, nperm=100, alphaV=0.05, b=100, root=NULL, trZero=0.001, signThr=0.05, maxGap=1, permute=TRUE)
easyClip(expr, classes, graph, method=c("variance","mean"), pathThr=0.05, pruneLevel=0.2, nperm=100, alphaV=0.05, b=100, root=NULL, trZero=0.001, signThr=0.05, maxGap=1, permute=TRUE)
expr |
an expression matrix or ExpressionSet with colnames for samples and row name for genes. |
classes |
vector of 1,2 indicating the classes of samples (columns). |
graph |
a |
method |
the kind of test to perform on the cliques. It could be either mean or variance. |
pathThr |
The significance threshold of the whole pathway test. Deafault = 0.05 |
pruneLevel |
a dissimilarity threshold. NULL means no pruning. |
nperm |
number of permutations. Default = 100. |
alphaV |
pvalue threshold for variance test to be used during mean test. Default = 0.05. |
b |
number of permutations for mean analysis. Default = 100. |
root |
nodes by which rip ordering is performed (as far as possible) on the variables using the maximum cardinality search algorithm. |
trZero |
lowest pvalue detectable. This threshold avoids that -log(p) goes infinite. |
signThr |
significance threshold for clique pvalues. |
maxGap |
allow up to maxGap gaps in the best path computation. Default = 1. |
permute |
always performs permutations in the concentration matrix test. If FALSE, the test is made using the asymptotic distribution of the log-likelihood ratio. This option should be use only if samples size is >=40 per class. |
a matrix with row as the different paths. Columns are organized as follwes: 1 - Index of the starting clique 2 - Index of the ending clique 3 - Index of the clique where the maximum value is reached 4 - length of the path 5 - maximum score of the path 6 - average score along the path 7 - percentage of path activation 8 - impact of the path on the entire pathway 9 - clique involved and significant 10 - clique forming the path 11 - genes forming the significant cliques 12 - genes forming the path)
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
cliqueVarianceTest
, cliqueMeanTest
,
getJunctionTreePaths
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:24] classes <- c(rep(1,12), rep(2,12)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) easyClip(all, classes, graph, nperm=10) }
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:24] classes <- c(rep(1,12), rep(2,12)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) easyClip(all, classes, graph, nperm=10) }
Summarization of the result for a quick look of clipper function.
easyLook(clipped)
easyLook(clipped)
clipped |
the output of either clipper o easyClip. |
Nice formatted output of clipper function.
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
It extracts the possible entry point of the graph. Entry points are defined as nodes with no entering edges.
getGraphEntryGenes(graph, byCliques=FALSE, root=NULL)
getGraphEntryGenes(graph, byCliques=FALSE, root=NULL)
graph |
a |
byCliques |
when TRUE it returns a list where entry point are organized by cliques. |
root |
nodes by which rip ordering is performed (as far as possible) on the variables using the maximum cardinality search algorithm. |
a vector of gene names representing the entry point of graph.
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
if (require(graphite)) { kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) getGraphEntryGenes(graph) }
if (require(graphite)) { kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) getGraphEntryGenes(graph) }
Find the shortest paths in the Junction tree designed with the cliques of the graph.
getJunctionTreePaths(graph, root=NULL)
getJunctionTreePaths(graph, root=NULL)
graph |
a |
root |
nodes by which rip ordering is performed (as far as possible) on the variables using the maximum cardinality search algorithm. |
list of clique indices representing the shortest paths of the graph.
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
if (require(graphite)) { kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) getJunctionTreePaths(graph) }
if (require(graphite)) { kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) getJunctionTreePaths(graph) }
Starting from the sorted elements of each clique of the list, this function generates names fusing in a string the element names.
nameCliques(cliques)
nameCliques(cliques)
cliques |
a list where each element is a clique. |
vector of strings
toyCliques <- list(c(45,36,90), c(36,1000,35)) nameCliques(toyCliques)
toyCliques <- list(c(45,36,90), c(36,1000,35)) nameCliques(toyCliques)
Performs variance and mean test using qpipf on the whole pathway.
pathQ(expr, classes, graph, nperm=100, alphaV=0.05, b=100, permute=TRUE, paired=FALSE, alwaysShrink=FALSE)
pathQ(expr, classes, graph, nperm=100, alphaV=0.05, b=100, permute=TRUE, paired=FALSE, alwaysShrink=FALSE)
expr |
an expression matrix or ExpressionSet with colnames for samples and rownames for expression features. |
classes |
vector of 1,2 indicating the classes of the samples (columns). |
graph |
a |
nperm |
number of permutations. Default = 100. |
alphaV |
pvalue significance threshold for variance test to be used during mean test. Default = 0.05. |
b |
number of permutations for mean analysis. Default = 100. |
permute |
always performs permutations in the concentration matrix test. If FALSE, the test is made using the asymptotic distribution of the log-likelihood ratio. This option should be use only if samples size is >=40 per class. |
paired |
perform the test for paired sample. It assumes that class labels are ordered so that the first occurrence of class 2 is paired with the first occurrence of class 1 and so on. |
alwaysShrink |
always perform the shrinkage estimates of variance. |
a list with alphaVar (pvalue for the variance test) and alphaMean (pvalue for mean test).
This function is based on the Gaussian Graphical Models and to use it
in a proper way it is necessary that the graph is an Direct Acyclic
Graph. Please check any graph in input using isAcyclic
from ggm package.
Martini P, Sales G, Massa MS, Chiogna M, Romualdi C. Along signal paths: an empirical gene set approach exploiting pathway topology. NAR. 2012 Sep.
Massa MS, Chiogna M, Romualdi C. Gene set analysis exploiting the topology of a pathway. BMC System Biol. 2010 Sep 1;4:121.
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:24] classes <- c(rep(1,12), rep(2,12)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) pathQ(all, classes, graph, nperm=100, permute=FALSE) }
if (require(graphite) & require(ALL)){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) genes <- nodes(graph) data(ALL) all <- ALL[1:length(genes),1:24] classes <- c(rep(1,12), rep(2,12)) featureNames(all@assayData)<- genes graph <- subGraph(genes, graph) pathQ(all, classes, graph, nperm=100, permute=FALSE) }
Renders the topology of a pathway as a Cytoscape graph and marks the genes of the selected path.
plotInCytoscape(graph, path, color="#6699FF", main="graph")
plotInCytoscape(graph, path, color="#6699FF", main="graph")
graph |
a |
path |
vector summarizing a path (a rows of clipper output matrix). |
color |
color code string: genes of the most involved fragment will be colored using color. Deafult = "#6699FF" |
main |
a graph name to be used in Cytoscape. Default = 'graph' |
Requires the RCy3
package.
## Not run: if (require(graphite)) { if (requireNamespace("RCy3")){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) } path <- c(3,17,5,9,13.04,2.60,0.209,0.321,"6,7,8,9,10", "3,5,6,7,8,9,10,14,17", "ENTREZID:1029;ENTREZID:4193;ENTREZID:7157", "ENTREZID:1019;ENTREZID:1021;ENTREZID:1026;ENTREZID:1029;ENTREZID:595") plotInCytoscape(graph,path) } ## End(Not run)
## Not run: if (require(graphite)) { if (requireNamespace("RCy3")){ kegg <- pathways("hsapiens", "kegg") graph <- pathwayGraph(convertIdentifiers(kegg$'Chronic myeloid leukemia', "entrez")) } path <- c(3,17,5,9,13.04,2.60,0.209,0.321,"6,7,8,9,10", "3,5,6,7,8,9,10,14,17", "ENTREZID:1029;ENTREZID:4193;ENTREZID:7157", "ENTREZID:1019;ENTREZID:1021;ENTREZID:1026;ENTREZID:1029;ENTREZID:595") plotInCytoscape(graph,path) } ## End(Not run)
This function allows the user to chose only one representant of those paths that have more than 1-thr similarity. The best scoring path is choosen.
prunePaths(pathSummary, thr=NULL, clust=NULL, sep=";")
prunePaths(pathSummary, thr=NULL, clust=NULL, sep=";")
pathSummary |
a matrix resulting from clipper function. |
thr |
a dissimilarity threshold. NULL means no pruning. |
clust |
filename where path-cluster is saved. NULL means no cluster saved. |
sep |
the separator to split genes for similarity computation. Default = ; |
a matrix
toyEx <- matrix(c(1,1,5,3,5,2,5,3,8.2,3,2,1,0.3,0.1,2,1,"1;2;3;4;5","1;2;3", "1;2;3;4;5","1;2;3","1;2;3;4;5","1;2;3","1;2;3;4;5","1;2;3"),2,12) row.names(toyEx) <- c("1;5","1;3") toyEx prunePaths(toyEx, thr=0.1)
toyEx <- matrix(c(1,1,5,3,5,2,5,3,8.2,3,2,1,0.3,0.1,2,1,"1;2;3;4;5","1;2;3", "1;2;3;4;5","1;2;3","1;2;3;4;5","1;2;3","1;2;3;4;5","1;2;3"),2,12) row.names(toyEx) <- c("1;5","1;3") toyEx prunePaths(toyEx, thr=0.1)