Title: | Differential Co-expression Analysis |
---|---|
Description: | A tool for the identification of differentially coexpressed links (DCLs) and differentially coexpressed genes (DCGs). DCLs are gene pairs with significantly different correlation coefficients under two conditions. DCGs are genes with significantly more DCLs than by chance. |
Authors: | Wenbin Wei, Sandeep Amberkar, Winston Hide |
Maintainer: | Wenbin Wei <[email protected]> |
License: | GPL (>2) |
Version: | 1.27.0 |
Built: | 2024-11-29 07:28:31 UTC |
Source: | https://github.com/bioc/diffcoexp |
This function identifies gene pairs coexpressed in at least one of two conditions.
coexpr(exprs.1, exprs.2, r.method = c("pearson", "spearman")[1], q.method = c("BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr", "none")[1], rth = 0.5, qth = 0.1)
exprs.1 | a SummarizedExperiment, data frame or matrix for condition 1, with gene IDs as rownames and sample IDs as column names. |
---|---|
exprs.2 | a SummarizedExperiment, data frame or matrix for condition 2, with gene IDs as rownames and sample IDs as column names. |
r.method | a character string specifying the method to be used to calculate correlation coefficients. It is passed to the cor function of the WGCNA package. |
q.method | a character string specifying the method for adjusting p values. It is passed to the p.adjust function of the stats package. |
rth | the cutoff of absolute value of correlation coefficients; must be within [0,1]. |
qth | the cutoff of q-value (adjusted p value); must be within [0,1]. |
a data frame containing gene pairs that are coexpressed in at least one of the conditions with the criteria that absolute value of correlation coefficient is greater than rth and q value less than qth. It has the following columns:
Gene.1 | Gene ID |
---|---|
Gene.2 | Gene ID |
cor.1 | correlation coefficient under condition 1 |
cor.2 | correlation coefficient under condition 2 |
cor.diff | difference between correlation coefficients under condition 2 and condition 1 |
p.1 | p value under the null hypothesis that the correlation coefficient under condition 1 equals zero |
p.2 | p value under the null hypothesis that the correlation coefficient under condition 2 equals zero |
p.diffcor | p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation |
q.1 | adjusted p value under the null hypothesis that the correlation coefficient under condition 1 equals zero |
q.2 | adjusted p value under the null hypothesis that the correlation coefficient under condition 2 equals zero |
q.diffcor | adjusted p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation |
data(gse4158part)
allowWGCNAThreads()
res <- coexpr(exprs.1 = exprs.1, exprs.2 = exprs.2, r.method = "spearman")
# The result is a data frame.
str(res)
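The selection rule described for coexpr (absolute correlation greater than rth and q value less than qth in at least one condition) can be illustrated on a small hand-made table. This is only a base R sketch of the stated criterion, not the package's implementation; the toy values and the object names `pairs`, `coexpressed.1`, `coexpressed.2` and `cls` are made up for illustration.

```r
# Toy table using the documented column names (values are invented).
pairs <- data.frame(
  Gene.1 = c("g1", "g1", "g2", "g3"),
  Gene.2 = c("g2", "g3", "g3", "g4"),
  cor.1  = c(0.82, 0.30, -0.65, 0.55),
  cor.2  = c(0.10, 0.45, -0.70, 0.20),
  q.1    = c(0.01, 0.40, 0.03, 0.20),
  q.2    = c(0.60, 0.08, 0.02, 0.50),
  stringsAsFactors = FALSE
)

rth <- 0.5  # cutoff on the absolute correlation coefficient
qth <- 0.1  # cutoff on the adjusted p value

# A pair counts as coexpressed in a condition only if BOTH cutoffs are met.
coexpressed.1 <- abs(pairs$cor.1) > rth & pairs$q.1 < qth
coexpressed.2 <- abs(pairs$cor.2) > rth & pairs$q.2 < qth

# Keep pairs coexpressed in at least one of the two conditions.
cls <- pairs[coexpressed.1 | coexpressed.2, ]
print(cls)
```

Note that the fourth pair is dropped even though its cor.1 exceeds rth, because its q.1 does not pass qth.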
This function calculates the correlation coefficients of all gene pairs under two conditions and compares them using Fisher's Z-transformation.
comparecor(exprs.1, exprs.2, r.method = c("pearson", "spearman")[1], q.method = c("BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr", "none")[1])
exprs.1 | a SummarizedExperiment, data frame or matrix for condition 1, with gene IDs as rownames and sample IDs as column names. |
---|---|
exprs.2 | a SummarizedExperiment, data frame or matrix for condition 2, with gene IDs as rownames and sample IDs as column names. |
r.method | a character string specifying the method to be used to calculate correlation coefficients. It is passed to the cor function of the WGCNA package. |
q.method | a character string specifying the method for adjusting p values. It is passed to the p.adjust function of the stats package. |
a data frame containing the differences between the correlation coefficients under two conditions and their p values. It has the following columns:
Gene.1 | Gene ID |
---|---|
Gene.2 | Gene ID |
cor.1 | correlation coefficient under condition 1 |
cor.2 | correlation coefficient under condition 2 |
cor.diff | difference between correlation coefficients under condition 2 and condition 1 |
p.1 | p value under the null hypothesis that the correlation coefficient under condition 1 equals zero |
p.2 | p value under the null hypothesis that the correlation coefficient under condition 2 equals zero |
p.diffcor | p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation |
q.1 | adjusted p value under the null hypothesis that the correlation coefficient under condition 1 equals zero |
q.2 | adjusted p value under the null hypothesis that the correlation coefficient under condition 2 equals zero |
q.diffcor | adjusted p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation |
data(gse4158part)
allowWGCNAThreads()
res <- comparecor(exprs.1 = exprs.1, exprs.2 = exprs.2, r.method = "spearman")
# The result is a data frame.
str(res)
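For readers unfamiliar with Fisher's r-to-Z comparison, the following base R sketch shows the textbook form of the test for two independent correlation coefficients, followed by p value adjustment with p.adjust. It assumes Pearson correlations and the usual 1/(n - 3) variance approximation; the helper name `compare_two_cors`, the input values, and the sign convention (condition 2 minus condition 1) are assumptions for illustration, and the package's internal computation may differ in detail.

```r
# Textbook Fisher r-to-Z test for the difference between two independent
# correlation coefficients (hypothetical helper, not part of the package).
compare_two_cors <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                          # Fisher transformation of r1
  z2 <- atanh(r2)                          # Fisher transformation of r2
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of z2 - z1
  z  <- (z2 - z1) / se
  p  <- 2 * pnorm(-abs(z))                 # two-sided p value
  data.frame(cor.diff = r2 - r1, z = z, p.diffcor = p)
}

# Three invented gene pairs, 14 samples in condition 1 and 12 in condition 2.
res <- compare_two_cors(r1 = c(0.8, 0.5, -0.6), n1 = 14,
                        r2 = c(0.1, 0.5, 0.7),  n2 = 12)

# Adjust for multiple testing, here with the Benjamini-Hochberg method.
res$q.diffcor <- p.adjust(res$p.diffcor, method = "BH")
print(res)
```

With small sample sizes the standard error is large, so even a sizeable difference between two coefficients may not reach significance.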
This function identifies differentially coexpressed links (DCLs) and differentially coexpressed genes (DCGs).
diffcoexp(exprs.1, exprs.2, r.method = c("pearson", "kendall", "spearman")[1], q.method = c("BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr", "none")[1], rth = 0.5, qth = 0.1, r.diffth = 0.5, q.diffth = 0.1, q.dcgth = 0.1)
exprs.1 | a SummarizedExperiment, data frame or matrix for condition 1, with gene IDs as rownames and sample IDs as column names. |
---|---|
exprs.2 | a SummarizedExperiment, data frame or matrix for condition 2, with gene IDs as rownames and sample IDs as column names. |
r.method | a character string specifying the method to be used to calculate correlation coefficients. It is passed to the cor function of the WGCNA package. |
q.method | a character string specifying the method for adjusting p values. It is passed to the p.adjust function of the stats package. |
rth | the cutoff of absolute value of correlation coefficients; must be within [0,1]. |
qth | the cutoff of q-value (adjusted p value); must be within [0,1]. |
r.diffth | the cutoff of absolute value of the difference between the correlation coefficients of the two conditions; must be within [0,1]. |
q.diffth | the cutoff of q-value (adjusted p value) of the difference between the correlation coefficients of the two conditions; must be within [0,1]. |
q.dcgth | the cutoff of q-value (adjusted p value) of the genes enriched in the differentially correlated gene pairs between the two conditions; must be within [0,1]. |
The diffcoexp function identifies differentially coexpressed links (DCLs) and differentially coexpressed genes (DCGs). DCLs are gene pairs with significantly different correlation coefficients under two conditions (de la Fuente, 2010; Jiang et al., 2016). DCGs are genes with significantly more DCLs than expected by chance (Yu et al., 2011; Jiang et al., 2016). The function takes two gene expression matrices, data frames or SummarizedExperiment objects, one per condition, as input. It calculates gene-gene correlations under each condition, compares them using Fisher's Z-transformation, filters the correlations with rth and qth, and filters the correlation changes with r.diffth and q.diffth. It then identifies DCGs using a binomial probability model (Jiang et al., 2016).
The main steps are as follows:
a). Correlation coefficients and p values of all gene pairs under two conditions are calculated.
b). The difference between the correlation coefficients under two conditions is calculated and its p value is calculated using Fisher's Z-transformation.
c). p values are adjusted.
d). Gene pairs (links) coexpressed in at least one condition are identified using the criteria that at least one of the correlation coefficients under two conditions has absolute value greater than the threshold rth and adjusted p value less than the threshold qth. The links that meet the criteria are included in CLs.
e). Differentially coexpressed gene pairs (links) are identified from CLs using the criteria that the absolute value of the difference between the two correlation coefficients is greater than the threshold r.diffth and the adjusted p value is less than the threshold q.diffth. The links that meet the criteria are included in DCLs.
f). The DCLs are classified into three categories: "same signed", "diff signed", or "switched opposites". "same signed" indicates that the gene pair has same signed correlation coefficients under both conditions. "diff signed" indicates that the gene pair has oppositely signed correlation coefficients under two conditions and only one of them meets the criteria that absolute correlation coefficient is greater than the threshold rth and adjusted p value less than the threshold qth. "switched opposites" indicates that the gene pair has oppositely signed correlation coefficients under two conditions and both of them meet the criteria that absolute correlation coefficient is greater than the threshold rth and adjusted p value less than the threshold qth.
g). All the genes in DCLs are tested for their enrichment of DCLs, i.e., whether they have more DCLs than expected by chance, using a binomial probability model (Jiang et al., 2016). Those with adjusted p value less than the threshold q.dcgth are included in DCGs.
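Step g) can be sketched with base R as an upper-tail binomial test. The sketch assumes that the per-link probability of being a DCL is estimated as the overall ratio of DCLs to CLs, which is a common reading of the link-based binomial model but is not confirmed by this page; the gene names, counts, and totals below are invented.

```r
# Invented per-gene link counts, using the documented column names.
genes <- data.frame(
  Gene = c("geneA", "geneB", "geneC"),
  CLs  = c(40, 25, 10),   # coexpressed links involving the gene
  DCLs = c(12, 1, 1),     # differentially coexpressed links among them
  stringsAsFactors = FALSE
)

total.cls  <- 1000  # hypothetical number of CLs in the whole data set
total.dcls <- 50    # hypothetical number of DCLs in the whole data set
p0 <- total.dcls / total.cls  # assumed chance that a given CL is a DCL

# P(X >= DCLs) for X ~ Binomial(CLs, p0); pbinom(k - 1, ..., lower.tail = FALSE)
# gives the probability of observing k or more successes.
genes$p <- pbinom(genes$DCLs - 1, size = genes$CLs, prob = p0,
                  lower.tail = FALSE)
genes$q <- p.adjust(genes$p, method = "BH")

q.dcgth <- 0.1
dcgs <- genes$Gene[genes$q < q.dcgth]
print(genes)
print(dcgs)
```

Only geneA, which has far more DCLs than the 5% background rate would predict for 40 links, passes the q.dcgth cutoff in this toy example.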
a list of two data frames.
The DCGs data frame contains genes that contribute to differentially correlated links (gene pairs) with q value less than q.dcgth. It has the following columns:
Gene | Gene ID |
---|---|
CLs | Number of links with absolute correlation coefficient greater than rth and q value less than qth in at least one condition |
DCLs | Number of links that meet the criteria for CLs and the criteria that the absolute difference between the correlation coefficients of the two conditions is greater than r.diffth and q value less than q.diffth |
DCL.same | Number of DCLs with same signed correlation coefficients in both conditions |
DCL.diff | Number of DCLs with oppositely signed correlation coefficients under two conditions where only one of them has absolute correlation coefficient greater than rth and q value less than qth |
DCL.switch | Number of DCLs with oppositely signed correlation coefficients under two conditions where both of them have absolute correlation coefficient greater than rth and q value less than qth |
p | p value of having at least the observed number of DCLs given the number of CLs |
q | adjusted p value |
The DCLs data frame contains the differentially correlated links (gene pairs) that meet two criteria: at least one of their correlation coefficients (cor.1 and/or cor.2) has absolute value greater than rth with q value (q.1 and/or q.2) less than qth, and the absolute value of the difference between the correlation coefficients under two conditions (cor.diff) is greater than r.diffth with q.diffcor less than q.diffth. It has the following columns:
Gene.1 | Gene ID |
---|---|
Gene.2 | Gene ID |
cor.1 | correlation coefficient under condition 1 |
cor.2 | correlation coefficient under condition 2 |
cor.diff | difference between correlation coefficients under condition 2 and condition 1 |
p.1 | p value under the null hypothesis that the correlation coefficient under condition 1 equals zero |
p.2 | p value under the null hypothesis that the correlation coefficient under condition 2 equals zero |
p.diffcor | p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation |
q.1 | adjusted p value under the null hypothesis that the correlation coefficient under condition 1 equals zero |
q.2 | adjusted p value under the null hypothesis that the correlation coefficient under condition 2 equals zero |
q.diffcor | adjusted p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation |
type | can have value "same signed", "diff signed", or "switched opposites". "same signed" indicates that the gene pair has same signed correlation coefficients under both conditions. "diff signed" indicates that the gene pair has oppositely signed correlation coefficients under two conditions and only one of them meets the criteria that absolute correlation coefficient is greater than rth and q value less than qth. "switched opposites" indicates that the gene pair has oppositely signed correlation coefficients under two conditions and both of them meet the criteria that absolute correlation coefficient is greater than rth and q value less than qth. |
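The three values of the type column follow directly from the signs of cor.1 and cor.2 and from which conditions pass the rth/qth cutoffs. The base R sketch below encodes that description; the helper `classify_dcl` is hypothetical, the inputs are invented, and treating a correlation of exactly zero as "same signed" is an assumption of this sketch rather than documented behaviour.

```r
# Hypothetical helper that labels DCLs as described for the type column.
classify_dcl <- function(cor.1, cor.2, q.1, q.2, rth = 0.5, qth = 0.1) {
  pass.1   <- abs(cor.1) > rth & q.1 < qth    # coexpressed in condition 1
  pass.2   <- abs(cor.2) > rth & q.2 < qth    # coexpressed in condition 2
  opposite <- sign(cor.1) * sign(cor.2) < 0   # strictly opposite signs
  ifelse(!opposite, "same signed",
         ifelse(pass.1 & pass.2, "switched opposites", "diff signed"))
}

# Three invented links, each already assumed to qualify as a DCL.
type <- classify_dcl(cor.1 = c(0.9, 0.8, 0.7),
                     cor.2 = c(0.2, -0.1, -0.8),
                     q.1   = c(0.01, 0.01, 0.02),
                     q.2   = c(0.50, 0.70, 0.01))
print(type)
```

Because every DCL is also a CL, at least one condition always passes the cutoffs, so an oppositely signed link that is not "switched opposites" has exactly one passing condition.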
Wenbin Wei
1. de la Fuente A. From "differential expression" to "differential networking" - identification of dysfunctional regulatory networks in diseases. Trends in Genetics. 2010 Jul;26(7):326-33.
2. Jiang Z, Dong X, Li Z-G, He F, Zhang Z. Differential Coexpression Analysis Reveals Extensive Rewiring of Arabidopsis Gene Coexpression in Response to Pseudomonas syringae Infection. Scientific Reports. 2016 Dec;6(1):35064.
3. Yu H, Liu B-H, Ye Z-Q, Li C, Li Y-X, Li Y-Y. Link-based quantitative methods to identify differentially coexpressed genes and gene pairs. BMC Bioinformatics. 2011;12(1):315.
data(gse4158part)
allowWGCNAThreads()
res <- diffcoexp(exprs.1 = exprs.1, exprs.2 = exprs.2, r.method = "spearman")
# The result is a list of two data frames: one for differentially co-expressed
# links (DCLs, gene pairs) and one for differentially co-expressed genes (DCGs).
str(res)
Expression of 400 genes in 14 yeast samples (GSM94988 to GSM95001) after pulses of 2 g/l glucose; see https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE4158.
exprs.1
A matrix with 400 genes and 14 samples.
expression of 400 genes in 14 samples (GSM94988 to GSM95001) of yeast after pulses 0.2 g/l glucose, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE4158.
exprs.2
A matrix with 400 genes and 12 samples.