Title: | MODA: MOdule Differential Analysis for weighted gene co-expression network |
---|---|
Description: | MODA can be used to estimate and construct condition-specific gene co-expression networks, and identify differentially expressed subnetworks as conserved or condition specific modules which are potentially associated with relevant biological processes. |
Authors: | Dong Li, James B. Brown, Luisa Orsini, Zhisong Pan, Guyu Hu and Shan He |
Maintainer: | Dong Li <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.33.0 |
Built: | 2024-11-18 03:40:57 UTC |
Source: | https://github.com/bioc/MODA |
Compare the background network with a set of condition-specific networks. Conserved and condition-specific modules are reported in plain-text files, based on the computed statistics.
CompareAllNets(ResultFolder, intModules, indicator, intconditionModules, conditionNames, specificTheta, conservedTheta)
ResultFolder | where to store results |
---|---|
intModules | number of modules in the background network |
indicator | identifier of the current profile, used as a tag in names |
intconditionModules | a numeric vector giving the number of modules in each condition-specific network, or a single number |
conditionNames | a character vector giving the name of each condition, or a single name |
specificTheta | the threshold that defines min(s)+specificTheta; a module whose value is less than this is considered condition specific. s is the vector of row sums of the Jaccard index matrix. See supplementary file. |
conservedTheta | the threshold that defines max(s)-conservedTheta; a module whose value is greater than this is considered conserved. s is the vector of row sums of the Jaccard index matrix. See supplementary file. |
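The two thresholds can be illustrated on a toy Jaccard matrix. This is a minimal sketch and not MODA's internal code: the matrix `J` is made up, and the strict inequalities simply follow the wording of the argument descriptions.

```r
# Toy Jaccard matrix (hypothetical values): rows are background modules,
# columns are modules of one condition-specific network.
J <- rbind(c(0.90, 0.05, 0.00),
           c(0.10, 0.50, 0.10),
           c(0.00, 0.05, 0.10))

specificTheta  <- 0.1
conservedTheta <- 0.1

s <- rowSums(J)  # one similarity score per background module

# Scores near the minimum are flagged as condition specific
specific  <- which(s < min(s) + specificTheta)
# Scores near the maximum are flagged as conserved
conserved <- which(s > max(s) - conservedTheta)
```

Larger theta values widen both bands, so more modules are labelled as specific or conserved.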
None
Dong Li, [email protected]
WeightedModulePartitionHierarchical, comparemodulestwonets
    data(synthetic)
    ResultFolder = 'ForSynthetic' # where intermediate files are stored
    CuttingCriterion = 'Density' # could be Density or Modularity
    indicator1 = 'X' # indicator for data profile 1
    indicator2 = 'Y' # indicator for data profile 2
    specificTheta = 0.1 # threshold to define condition specific modules
    conservedTheta = 0.1 # threshold to define conserved modules
    intModules1 <- WeightedModulePartitionHierarchical(datExpr1, ResultFolder,
                                                       indicator1, CuttingCriterion)
    intModules2 <- WeightedModulePartitionHierarchical(datExpr2, ResultFolder,
                                                       indicator2, CuttingCriterion)
    CompareAllNets(ResultFolder, intModules1, indicator1, intModules2, indicator2,
                   specificTheta, conservedTheta)
Compare the background network with a condition-specific network. The Jaccard index, a measure of the similarity of two sets, is used to quantify the similarity of each pair of modules from the two networks.
comparemodulestwonets(sourcehead, nm1, nm2, ind1, ind2)
sourcehead | prefix of the path where results are stored |
---|---|
nm1 | number of modules in the background network |
nm2 | number of modules in the condition-specific network |
ind1 | indicator of the background network |
ind2 | indicator of the condition-specific network |
A matrix where each entry is the Jaccard index of corresponding modules from two networks
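As a rough illustration of what such a matrix contains, the sketch below computes pairwise Jaccard indices for two small sets of modules. The module contents and the helper name `jaccard` are hypothetical, and the function reads nothing from disk, unlike comparemodulestwonets.

```r
# Jaccard index of two gene sets: size of the intersection over size of the union
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Hypothetical modules, each a character vector of gene names
modulesBackground <- list(c("g1", "g2", "g3"), c("g4", "g5"))
modulesCondition  <- list(c("g1", "g2"), c("g3", "g4", "g5"), c("g6"))

# Rows index background modules, columns index condition-specific modules
JaccardMatrix <- sapply(modulesCondition, function(b)
  sapply(modulesBackground, function(a) jaccard(a, b)))
```

A row whose entries are all close to zero marks a background module without a counterpart in the condition-specific network.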
Dong Li, [email protected]
    data(synthetic)
    ResultFolder = 'ForSynthetic' # where intermediate files are stored
    CuttingCriterion = 'Density' # could be Density or Modularity
    indicator1 = 'X' # indicator for data profile 1
    indicator2 = 'Y' # indicator for data profile 2
    intModules1 <- WeightedModulePartitionHierarchical(datExpr1, ResultFolder,
                                                       indicator1, CuttingCriterion)
    intModules2 <- WeightedModulePartitionHierarchical(datExpr2, ResultFolder,
                                                       indicator2, CuttingCriterion)
    JaccardMatrix <- comparemodulestwonets(ResultFolder, intModules1, intModules2,
                                           paste('/DenseModuleGene_', indicator1, sep=''),
                                           paste('/DenseModuleGene_', indicator2, sep=''))
Synthetic gene expression profile with 20 samples and 500 genes.
A matrix with 20 rows and 500 columns.
Dong Li, [email protected]
    data(synthetic)
    ## plot the heatmap of the correlation matrix ...
    ## Not run: heatmap(cor(as.matrix(datExpr1)))
Synthetic gene expression profile with 25 samples and 500 genes.
A matrix with 25 rows and 500 columns.
Dong Li, [email protected]
    data(synthetic)
    ## plot the heatmap of the correlation matrix ...
    ## Not run: heatmap(cor(as.matrix(datExpr2)))
Get the identified partition assignment. Works only for synthetic data, where gene names are numbers.
getPartition(ResultFolder)
ResultFolder | folder used to save modules |
---|---|
Number of partitions
Module detection on each condition-specific network, which is constructed from all samples except those belonging to that condition.
MIcondition(datExpr, conditionNames, ResultFolder, GeneNames, maxsize = 100, minsize = 30)
datExpr | gene expression profile; rows are samples and columns are genes; row names should contain the condition specifier |
---|---|
conditionNames | character vector of condition names |
ResultFolder | where to store the clusters |
GeneNames | normally the official gene names, used to replace the column names of datExpr |
maxsize | the maximal number of nodes allowed in one module |
minsize | the minimal number of nodes allowed in one module |
A numeric vector in which each entry is the number of modules in the corresponding condition-specific network.
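The leave-one-condition-out construction can be sketched in base R as follows. The data are simulated, the condition labels are invented, and matching a condition by substring of the row name is an assumption about how the condition specifier is used; module detection itself is not shown.

```r
set.seed(1)
# Hypothetical expression profile: 9 samples (rows) x 6 genes (columns).
# Row names carry the condition label.
datExpr <- matrix(rnorm(9 * 6), nrow = 9,
                  dimnames = list(paste0(rep(c("ctrl", "heat", "cold"), each = 3), "_", 1:3),
                                  paste0("gene", 1:6)))
conditionNames <- c("heat", "cold")

# For each condition, drop its samples and build the network from the rest.
conditionNets <- lapply(conditionNames, function(cond) {
  keep <- !grepl(cond, rownames(datExpr), fixed = TRUE)  # assumed matching rule
  abs(cor(datExpr[keep, , drop = FALSE]))^6              # power = 6 chosen for illustration
})
names(conditionNets) <- conditionNames
```

Each element of `conditionNets` is a gene-by-gene adjacency matrix built without the samples of the named condition.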
Dong Li, [email protected]
Statistics over all conditions. Highlights conserved or condition-specific modules by counting how frequently each module is labelled as either, and visualizes the frequencies as a bar plot.
ModuleFrequency(ResultFolder, intModules, conditionNames, legendNames, indicator)
ResultFolder | where to store results |
---|---|
intModules | number of modules in the background network |
conditionNames | a character vector giving the name of each condition |
legendNames | a character vector giving the condition names shown in the frequency bar plot, or a single name |
indicator | identifier of the current profile, used as a tag in names |
None
Dong Li, [email protected]
WeightedModulePartitionHierarchical, WeightedModulePartitionLouvain, WeightedModulePartitionSpectral, WeightedModulePartitionAmoutain, CompareAllNets
Assign the module scores by weights, and rank them from highest to lowest
modulesRank(foldername, indicator, GeneNames)
foldername | folder used to save modules |
---|---|
indicator | normally a specific tag of the condition |
GeneNames | gene symbols, which are sometimes needed instead of probe IDs |
The number of modules
Dong Li, [email protected]
Compare the background network with a set of condition-specific networks, returning a pairwise matrix of the normalized mutual information (NMI) between each pair of networks in terms of their partitions.
NMImatrix(ResultFolder, intModules, indicator, intconditionModules, conditionNames, Nsize, legendNames = NULL, plt = FALSE)
ResultFolder | where to store results |
---|---|
intModules | number of modules in the background network |
indicator | identifier of the current profile, used as a tag in names |
intconditionModules | a numeric vector giving the number of modules in each condition-specific network, or a single number |
conditionNames | a character vector giving the name of each condition, or a single name |
Nsize | the total number of genes |
legendNames | a character vector giving the condition names shown in the similarity matrix plot, if applicable |
plt | a boolean value indicating whether to plot the similarity matrix |
NMI matrix indicating the similarity between each two networks
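For readers unfamiliar with NMI, the sketch below computes it for two label vectors over the same genes. The helper name `nmi` is hypothetical, and the normalization used here (twice the mutual information divided by the sum of the two entropies) is one common variant; whether NMImatrix uses this exact form is an assumption.

```r
# Normalized mutual information between two partitions of the same genes.
# x and y are label vectors of equal length; the label values are arbitrary.
nmi <- function(x, y) {
  joint <- unclass(table(x, y)) / length(x)  # joint distribution of the two labelings
  px <- rowSums(joint)
  py <- colSums(joint)
  nz <- joint > 0                            # skip empty cells to avoid log(0)
  mi <- sum(joint[nz] * log(joint[nz] / outer(px, py)[nz]))
  hx <- -sum(px * log(px))
  hy <- -sum(py * log(py))
  2 * mi / (hx + hy)                         # undefined if both partitions have one module
}

# Same grouping under different label names
nmi_same  <- nmi(c(1, 1, 2, 2, 3, 3), c("z", "z", "x", "x", "y", "y"))
# Two groupings that carry no information about each other
nmi_indep <- nmi(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

Values close to 1 indicate nearly identical partitions, and values close to 0 indicate unrelated ones.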
Dong Li, [email protected]
Calculate the average density of all resulting modules from a partition. The density of each module is defined as the average adjacency of the module genes.
PartitionDensity(ADJ, PartitionSet)
ADJ | gene similarity matrix |
---|---|
PartitionSet | vector giving the partition label of each gene |
partition density, defined as average density of all modules
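The definition can be made concrete with a small base R sketch. It is not the package implementation: the helper name `partition_density` is hypothetical, and excluding the diagonal and scoring single-gene modules as 0 are assumptions.

```r
# Average within-module adjacency, averaged over all modules
partition_density <- function(ADJ, labels) {
  dens <- sapply(unique(labels), function(m) {
    sub <- ADJ[labels == m, labels == m, drop = FALSE]
    if (nrow(sub) < 2) return(0)   # assumption: a single gene has density 0
    mean(sub[upper.tri(sub)])      # mean pairwise adjacency, diagonal excluded
  })
  mean(dens)
}

# Toy similarity matrix: genes 1-2 are tightly linked, genes 3-4 moderately
ADJ <- matrix(0.1, 4, 4)
ADJ[1, 2] <- ADJ[2, 1] <- 0.8
ADJ[3, 4] <- ADJ[4, 3] <- 0.4
diag(ADJ) <- 1

pd <- partition_density(ADJ, c(1, 1, 2, 2))
```

Here the two module densities are 0.8 and 0.4, so the partition density is their mean.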
Dong Li, [email protected]
Langfelder, Peter, and Steve Horvath. "WGCNA: an R package for weighted correlation network analysis." BMC bioinformatics 9.1 (2008): 1.
    data(synthetic)
    ADJ1 = abs(cor(datExpr1, use = "p"))^10
    dissADJ = 1 - ADJ1
    hierADJ = hclust(as.dist(dissADJ), method = "average")
    groups <- cutree(hierADJ, h = 0.8)
    pDensity <- PartitionDensity(ADJ1, groups)
Calculate the average modularity of a partition. The modularity of each module is defined as a natural generalization of the unweighted case.
PartitionModularity(ADJ, PartitionSet)
ADJ | gene similarity matrix |
---|---|
PartitionSet | vector giving the partition label of each gene |
partition modularity, defined as average modularity of all modules
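One way to read "average modularity" is to take each module's term in Newman's weighted modularity and average those terms over modules. The sketch below follows that reading on a toy network; it is an assumption about the definition, not the package source.

```r
# Toy weighted network: two pairs of genes, each pair linked with weight 1
ADJ <- matrix(0, 4, 4)
ADJ[1, 2] <- ADJ[2, 1] <- 1
ADJ[3, 4] <- ADJ[4, 3] <- 1
PartitionSet <- c(1, 1, 2, 2)

total <- sum(ADJ)  # equals 2m, twice the total edge weight
moduleQ <- sapply(unique(PartitionSet), function(m) {
  idx <- PartitionSet == m
  within   <- sum(ADJ[idx, idx]) / total     # share of weight inside the module
  expected <- (sum(ADJ[idx, ]) / total)^2    # share expected under random wiring
  within - expected
})
avgQ <- mean(moduleQ)  # averaged over modules; sum(moduleQ) is Newman's overall Q
```

Each module contributes 0.25 in this example, so the overall modularity is 0.5 and the per-module average is 0.25.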
Dong Li, [email protected]
Newman, Mark EJ. "Analysis of weighted networks." Physical review E 70.5 (2004): 056131.
    data(synthetic)
    ADJ1 = abs(cor(datExpr1, use = "p"))^10
    dissADJ = 1 - ADJ1
    hierADJ = hclust(as.dist(dissADJ), method = "average")
    groups <- cutree(hierADJ, h = 0.8)
    pModularity <- PartitionModularity(ADJ1, groups)
Module detection using igraph's community detection algorithms. When a resulting module is larger than expected, it is further divided by the same procedure.
recursiveigraph(g, savefile, method = c("fastgreedy", "louvain"), maxsize = 200, minsize = 30)
g | igraph object, the network to be partitioned |
---|---|
savefile | plain-text file used to store the modules, one module per line |
method | the community detection algorithm to use |
maxsize | maximal module size |
minsize | minimal module size |
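The recursive control flow can be sketched without igraph. In the base R example below, a two-way cut of an average-linkage tree stands in for the community detection call, the function name `split_recursive` is hypothetical, and discarding groups smaller than minsize is an assumption about how that argument is applied.

```r
# Recursively split any group larger than maxsize; drop groups smaller than minsize.
# The two-way hclust cut is a stand-in for a community detection call.
split_recursive <- function(nodes, ADJ, maxsize, minsize) {
  if (length(nodes) <= maxsize) {
    return(if (length(nodes) >= minsize) list(nodes) else list())
  }
  tree   <- hclust(as.dist(1 - ADJ[nodes, nodes]), method = "average")
  halves <- cutree(tree, k = 2)
  c(split_recursive(nodes[halves == 1], ADJ, maxsize, minsize),
    split_recursive(nodes[halves == 2], ADJ, maxsize, minsize))
}

# Toy adjacency: three blocks of four genes; blocks 1 and 2 are weakly linked
block <- rep(1:3, each = 4)
ADJ <- matrix(0.1, 12, 12)
ADJ[outer(block, block, "==")] <- 0.9
ADJ[block == 1, block == 2] <- ADJ[block == 2, block == 1] <- 0.3
diag(ADJ) <- 1

modules <- split_recursive(seq_len(12), ADJ, maxsize = 4, minsize = 2)
```

The first cut separates block 3 from the rest; the remaining group of eight genes exceeds maxsize and is split once more.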
None
Dong Li, [email protected]
Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of statistical mechanics: theory and experiment 2008.10 (2008): P10008.
Module detection based on the AMOUNTAIN algorithm, which searches for the optimal module at each step and obtains the partition by extracting modules one after another.
WeightedModulePartitionAmoutain(datExpr, Nmodule, foldername, indicatename, GeneNames, maxsize = 200, minsize = 3, power = 6, tao = 0.2)
datExpr | gene expression profile; rows are samples and columns are genes |
---|---|
Nmodule | the number of clusters (modules) |
foldername | where to store the clusters |
indicatename | normally a specific tag of the condition |
GeneNames | normally the official gene names, used to replace the column names of datExpr |
maxsize | the maximal number of nodes allowed in one module |
minsize | the minimal number of nodes allowed in one module |
power | the power parameter of WGCNA, W_ij=|cor(x_i,x_j)|^power |
tao | the threshold used to cut the adjacency matrix |
None
Dong Li, [email protected]
Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of statistical mechanics: theory and experiment 2008.10 (2008): P10008.
    data(synthetic)
    ResultFolder <- 'ForSynthetic' # where intermediate files are stored
    GeneNames <- colnames(datExpr1)
    intModules1 <- WeightedModulePartitionAmoutain(datExpr1, 5, ResultFolder, 'X',
                                                   GeneNames, maxsize = 100, minsize = 50)
    truemodule <- c(rep(1,100), rep(2,100), rep(3,100), rep(4,100), rep(5,100))
    #mymodule <- getPartition(ResultFolder)
    #randIndex(table(mymodule,truemodule),adjust=F)
Module detection based on the optimal cutting height of the dendrogram, which is selected to maximize the average density or modularity of the resulting partition. The clustering and visualization functions are from WGCNA.
WeightedModulePartitionHierarchical(datExpr, foldername, indicatename, cutmethod = c("Density", "Modularity"), power = 10)
datExpr | gene expression profile; rows are samples and columns are genes |
---|---|
foldername | where to store the clusters |
indicatename | normally a specific tag of the condition |
cutmethod | criterion for cutting the dendrogram: maximal average Density or Modularity |
power | the power parameter of WGCNA, W_ij=|cor(x_i,x_j)|^power |
The number of clusters
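The idea of choosing the cutting height by a partition score can be sketched with base R alone. The example uses a made-up adjacency matrix, a hypothetical helper `avg_density`, and an arbitrary grid of candidate heights; the grid and the scoring details used by the package may differ.

```r
# Toy adjacency: two blocks of three genes (hypothetical values)
block <- rep(1:2, each = 3)
ADJ <- matrix(0.1, 6, 6)
ADJ[outer(block, block, "==")] <- 0.9
diag(ADJ) <- 1

# Average within-module adjacency, averaged over modules (singletons count as 0)
avg_density <- function(ADJ, groups) {
  mean(sapply(unique(groups), function(m) {
    sub <- ADJ[groups == m, groups == m, drop = FALSE]
    if (nrow(sub) < 2) 0 else mean(sub[upper.tri(sub)])
  }))
}

tree <- hclust(as.dist(1 - ADJ), method = "average")
heights <- seq(0.05, 0.95, by = 0.1)   # candidate cutting heights
scores <- sapply(heights, function(h) avg_density(ADJ, cutree(tree, h = h)))
best_h <- heights[which.max(scores)]   # first height with the maximal score
groups <- cutree(tree, h = best_h)
```

Cutting too low leaves only singletons and cutting too high merges both blocks, so the score peaks at heights that recover the two blocks.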
Dong Li, [email protected]
Langfelder, Peter, and Steve Horvath. "WGCNA: an R package for weighted correlation network analysis." BMC bioinformatics 9.1 (2008): 1.
    data(synthetic)
    ResultFolder = 'ForSynthetic' # where intermediate files are stored
    CuttingCriterion = 'Density' # could be Density or Modularity
    indicator1 = 'X' # indicator for data profile 1
    indicator2 = 'Y' # indicator for data profile 2
    specificTheta = 0.1 # threshold to define condition specific modules
    conservedTheta = 0.1 # threshold to define conserved modules
    intModules1 <- WeightedModulePartitionHierarchical(datExpr1, ResultFolder,
                                                       indicator1, CuttingCriterion)
    #mymodule <- getPartition(ResultFolder)
    #randIndex(table(mymodule,truemodule),adjust=F)
Module detection based on the Louvain algorithm, which tries to maximize the overall modularity of the resulting partition.
WeightedModulePartitionLouvain(datExpr, foldername, indicatename, GeneNames, maxsize = 200, minsize = 30, power = 6, tao = 0.2)
datExpr | gene expression profile; rows are samples and columns are genes |
---|---|
foldername | where to store the clusters |
indicatename | normally a specific tag of the condition |
GeneNames | normally the official gene names, used to replace the column names of datExpr |
maxsize | the maximal number of nodes allowed in one module |
minsize | the minimal number of nodes allowed in one module |
power | the power parameter of WGCNA, W_ij=|cor(x_i,x_j)|^power |
tao | the threshold used to cut the adjacency matrix |
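The roles of power and tao can be illustrated on simulated data. The power formula comes from the argument description; reading "cut" as setting weights below tao to zero is an assumption, as is zeroing the diagonal.

```r
set.seed(1)
# Hypothetical expression profile: 20 samples (rows) x 8 genes (columns)
datExpr <- matrix(rnorm(20 * 8), nrow = 20,
                  dimnames = list(NULL, paste0("gene", 1:8)))
power <- 6
tao   <- 0.2

W <- abs(cor(datExpr))^power   # W_ij = |cor(x_i, x_j)|^power
diag(W) <- 0                   # ignore self-similarity
W[W < tao] <- 0                # assumed meaning of "cut": drop weights below tao
```

The resulting sparse weighted matrix is the kind of input a graph-based community detection method would take.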
The number of clusters
Dong Li, [email protected]
Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of statistical mechanics: theory and experiment 2008.10 (2008): P10008.
    data(synthetic)
    ResultFolder <- 'ForSynthetic' # where intermediate files are stored
    indicator <- 'X' # indicator for data profile 1
    GeneNames <- colnames(datExpr1)
    intModules1 <- WeightedModulePartitionLouvain(datExpr1, ResultFolder, indicator, GeneNames)
    truemodule <- c(rep(1,100), rep(2,100), rep(3,100), rep(4,100), rep(5,100))
    #mymodule <- getPartition(ResultFolder)
    #randIndex(table(mymodule,truemodule),adjust=F)
Module detection based on the spectral clustering algorithm, which mainly solves the eigendecomposition of the Laplacian matrix.
WeightedModulePartitionSpectral(datExpr, foldername, indicatename, GeneNames, power = 6, nn = 10, k = 2)
datExpr | gene expression profile; rows are samples and columns are genes |
---|---|
foldername | where to store the clusters |
indicatename | normally a specific tag of the condition |
GeneNames | normally the official gene names, used to replace the column names of datExpr |
power | the power parameter of WGCNA, W_ij=|cor(x_i,x_j)|^power |
nn | the number of nearest neighbors used to construct the affinity matrix |
k | the number of clusters (modules) |
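For the default k = 2, the core of spectral clustering reduces to a sign split on one eigenvector of the graph Laplacian. The sketch below shows that case on a toy affinity matrix; it uses the unnormalized Laplacian and omits the nearest-neighbor sparsification controlled by nn, both of which are simplifying assumptions rather than a description of the package internals.

```r
# Toy affinity: two blocks of five genes, strong within and weak between (hypothetical)
block <- rep(1:2, each = 5)
W <- matrix(0.05, 10, 10)
W[outer(block, block, "==")] <- 0.9
diag(W) <- 0

L <- diag(rowSums(W)) - W               # unnormalized graph Laplacian
eig <- eigen(L, symmetric = TRUE)       # eigenvalues come in decreasing order
fiedler <- eig$vectors[, nrow(L) - 1]   # eigenvector of the second-smallest eigenvalue
labels <- ifelse(fiedler > 0, 1, 2)     # its sign pattern gives a two-way split
```

For k greater than 2, the usual approach is to run k-means on the rows of the eigenvectors belonging to the k smallest eigenvalues.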
None
Dong Li, [email protected]
Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17.4 (2007): 395-416.
    data(synthetic)
    ResultFolder <- 'ForSynthetic' # where intermediate files are stored
    indicator <- 'X' # indicator for data profile 1
    GeneNames <- colnames(datExpr1)
    WeightedModulePartitionSpectral(datExpr1, ResultFolder, indicator,
                                    GeneNames, k = 5)
    truemodule <- c(rep(1,100), rep(2,100), rep(3,100), rep(4,100), rep(5,100))
    #mymodule <- getPartition(ResultFolder)
    #randIndex(table(mymodule,truemodule),adjust=F)