Package 'diffcoexp'

Title: Differential Co-expression Analysis
Description: A tool for the identification of differentially coexpressed links (DCLs) and differentially coexpressed genes (DCGs). DCLs are gene pairs with significantly different correlation coefficients under two conditions. DCGs are genes with significantly more DCLs than expected by chance.
Authors: Wenbin Wei, Sandeep Amberkar, Winston Hide
Maintainer: Wenbin Wei <[email protected]>
License: GPL (>2)
Version: 1.27.0
Built: 2024-11-29 07:28:31 UTC
Source: https://github.com/bioc/diffcoexp

Identification of gene pairs coexpressed in at least one of two conditions

Description

This function identifies gene pairs coexpressed in at least one of two conditions.

Usage

coexpr(exprs.1, exprs.2, r.method = c("pearson", "spearman")[1],
  q.method = c("BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr",
  "none")[1], rth = 0.5, qth = 0.1)

Arguments

exprs.1

a SummarizedExperiment, data frame or matrix for condition 1, with gene IDs as rownames and sample IDs as column names.

exprs.2

a SummarizedExperiment, data frame or matrix for condition 2, with gene IDs as rownames and sample IDs as column names.

r.method

a character string specifying the method to be used to calculate correlation coefficients. It is passed to the cor function of the WGCNA package.

q.method

a character string specifying the method for adjusting p values. It is passed to the p.adjust function of the stats package.

rth

the cutoff of absolute value of correlation coefficients; must be within [0,1].

qth

the cutoff of q-value (adjusted p value); must be within [0,1].

Value

a data frame containing gene pairs that are coexpressed in at least one of the conditions, meaning that in that condition the absolute value of the correlation coefficient is greater than rth and the q value is less than qth. It has the following columns:

Gene.1

Gene ID

Gene.2

Gene ID

cor.1

correlation coefficients under condition 1

cor.2

correlation coefficients under condition 2

cor.diff

difference between correlation coefficients under condition 2 and condition 1

p.1

p value under the null hypothesis that the correlation coefficient under condition 1 equals zero

p.2

p value under the null hypothesis that the correlation coefficient under condition 2 equals zero

p.diffcor

p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation

q.1

adjusted p value under the null hypothesis that the correlation coefficient under condition 1 equals zero

q.2

adjusted p value under the null hypothesis that the correlation coefficient under condition 2 equals zero

q.diffcor

adjusted p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation

Examples

library(diffcoexp)
data(gse4158part)
allowWGCNAThreads()
res <- coexpr(exprs.1 = exprs.1, exprs.2 = exprs.2, r.method = "spearman")
# The result is a data frame.
str(res)
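
The filtering rule described under Value can be illustrated in base R without the package. The sketch below is not the package's implementation: it uses a small hand-made table whose column names follow the Value section, and the numbers are invented purely for illustration. The thresholds are the documented defaults, rth = 0.5 and qth = 0.1.

```r
# Toy table using the documented column names; all values are made up.
pairs <- data.frame(
  Gene.1 = c("g1", "g1", "g2", "g3"),
  Gene.2 = c("g2", "g3", "g3", "g4"),
  cor.1  = c(0.82, 0.10, -0.65, 0.55),
  cor.2  = c(0.15, 0.20, -0.70, 0.58),
  q.1    = c(0.01, 0.80, 0.04, 0.30),
  q.2    = c(0.60, 0.70, 0.02, 0.25),
  stringsAsFactors = FALSE
)

rth <- 0.5  # cutoff on |correlation|
qth <- 0.1  # cutoff on adjusted p value

# A pair passes in a condition when |r| > rth and q < qth in that condition.
pass.1 <- abs(pairs$cor.1) > rth & pairs$q.1 < qth
pass.2 <- abs(pairs$cor.2) > rth & pairs$q.2 < qth

# Keep pairs that pass in at least one condition.
coexpressed <- pairs[pass.1 | pass.2, ]
print(coexpressed)
```

In this toy table the pair g3/g4 has a large enough correlation in both conditions but is dropped because neither q value is below qth, which shows that both criteria must hold within the same condition.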

Compare gene-gene correlation coefficients under two conditions

Description

This function calculates correlation coefficients of all gene pairs under two conditions and compares them using Fisher's Z-transformation.

Usage

comparecor(exprs.1, exprs.2, r.method = c("pearson", "spearman")[1],
  q.method = c("BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr",
  "none")[1])

Arguments

exprs.1

a SummarizedExperiment, data frame or matrix for condition 1, with gene IDs as rownames and sample IDs as column names.

exprs.2

a SummarizedExperiment, data frame or matrix for condition 2, with gene IDs as rownames and sample IDs as column names.

r.method

a character string specifying the method to be used to calculate correlation coefficients. It is passed to the cor function of the WGCNA package.

q.method

a character string specifying the method for adjusting p values. It is passed to the p.adjust function of the stats package.

Value

a data frame containing the differences between the correlation coefficients under two conditions and their p values. It has the following columns:

Gene.1

Gene ID

Gene.2

Gene ID

cor.1

correlation coefficients under condition 1

cor.2

correlation coefficients under condition 2

cor.diff

difference between correlation coefficients under condition 2 and condition 1

p.1

p value under the null hypothesis that the correlation coefficient under condition 1 equals zero

p.2

p value under the null hypothesis that the correlation coefficient under condition 2 equals zero

p.diffcor

p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation

q.1

adjusted p value under the null hypothesis that the correlation coefficient under condition 1 equals zero

q.2

adjusted p value under the null hypothesis that the correlation coefficient under condition 2 equals zero

q.diffcor

adjusted p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation

Examples

library(diffcoexp)
data(gse4158part)
allowWGCNAThreads()
res <- comparecor(exprs.1 = exprs.1, exprs.2 = exprs.2, r.method = "spearman")
# The result is a data frame.
str(res)
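
For readers unfamiliar with Fisher's r-to-Z transformation, the following sketch shows the textbook two-sided test for a difference between two independent correlation coefficients. It is an illustration only: the helper name fisher_z_diff is hypothetical, and the package's own computation may differ in detail (for example in how it treats rank-based correlations). The sample sizes 14 and 12 match the example data set; the correlation values are invented.

```r
# Textbook test for a difference between two independent correlations.
# fisher_z_diff is a hypothetical helper, not a function of the package.
fisher_z_diff <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                          # Fisher transform of r1
  z2 <- atanh(r2)                          # Fisher transform of r2
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of z2 - z1
  z  <- (z2 - z1) / se                     # approximately N(0, 1) under H0
  p  <- 2 * pnorm(-abs(z))                 # two-sided p value
  c(cor.diff = r2 - r1, z = z, p.diffcor = p)
}

# Correlation drops from 0.8 (14 samples) to -0.2 (12 samples).
res.toy <- fisher_z_diff(r1 = 0.8, n1 = 14, r2 = -0.2, n2 = 12)
print(res.toy)
```

The difference is taken as condition 2 minus condition 1, matching the definition of cor.diff above. With small sample sizes the standard error is large, so only substantial changes in correlation reach significance.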

Differential co-expression analysis

Description

This function identifies differentially coexpressed links (DCLs) and differentially coexpressed genes (DCGs).

Usage

diffcoexp(exprs.1, exprs.2, r.method = c("pearson", "kendall", "spearman")[1],
  q.method = c("BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr",
  "none")[1], rth = 0.5, qth = 0.1, r.diffth = 0.5, q.diffth = 0.1,
  q.dcgth = 0.1)

Arguments

exprs.1

a SummarizedExperiment, data frame or matrix for condition 1, with gene IDs as rownames and sample IDs as column names.

exprs.2

a SummarizedExperiment, data frame or matrix for condition 2, with gene IDs as rownames and sample IDs as column names.

r.method

a character string specifying the method to be used to calculate correlation coefficients. It is passed to the cor function of the WGCNA package.

q.method

a character string specifying the method for adjusting p values. It is passed to the p.adjust function of the stats package.

rth

the cutoff of absolute value of correlation coefficients; must be within [0,1].

qth

the cutoff of q-value (adjusted p value); must be within [0,1].

r.diffth

the cutoff of absolute value of the difference between the correlation coefficients of the two conditions; must be within [0,1].

q.diffth

the cutoff of q-value (adjusted p value) of the difference between the correlation coefficients of the two conditions; must be within [0,1].

q.dcgth

the cutoff of q-value (adjusted p value) of the genes enriched in the differentially correlated gene pairs between the two conditions; must be within [0,1].

Details

The diffcoexp function identifies differentially coexpressed links (DCLs) and differentially coexpressed genes (DCGs). DCLs are gene pairs with significantly different correlation coefficients under two conditions (de la Fuente 2010, Jiang et al., 2016). DCGs are genes with significantly more DCLs than expected by chance (Yu et al., 2011, Jiang et al., 2016). The function takes two gene expression matrices or data frames, one per condition, as input. It calculates gene-gene correlations under each condition, compares them using Fisher's Z-transformation, filters the correlations with rth and qth, and filters the correlation changes with r.diffth and q.diffth. It then identifies DCGs using a binomial probability model (Jiang et al., 2016).

The main steps are as follows:

a). Correlation coefficients and p values of all gene pairs under two conditions are calculated.

b). The difference between the correlation coefficients under two conditions is calculated and its p value is calculated using Fisher's Z-transformation.

c). p values are adjusted.

d). Gene pairs (links) coexpressed in at least one condition are identified using the criteria that at least one of the correlation coefficients under two conditions has absolute value greater than the threshold rth and adjusted p value less than the threshold qth. The links that meet the criteria are included in CLs.

e). Differentially coexpressed gene pairs (links) are identified from CLs using the criteria that the absolute value of the difference between the two correlation coefficients is greater than the threshold r.diffth and the adjusted p value is less than the threshold q.diffth. The links that meet the criteria are included in DCLs.

f). The DCLs are classified into three categories: "same signed", "diff signed", or "switched opposites". "same signed" indicates that the gene pair has same signed correlation coefficients under both conditions. "diff signed" indicates that the gene pair has oppositely signed correlation coefficients under two conditions and only one of them meets the criteria that absolute correlation coefficient is greater than the threshold rth and adjusted p value less than the threshold qth. "switched opposites" indicates that the gene pair has oppositely signed correlation coefficients under two conditions and both of them meet the criteria that absolute correlation coefficient is greater than the threshold rth and adjusted p value less than the threshold qth.

g). All the genes in DCLs are tested for their enrichment of DCLs, i.e., whether they have more DCLs than expected by chance, using a binomial probability model (Jiang et al., 2016). Those with adjusted p value less than the threshold q.dcgth are included in DCGs.
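
Steps a) and c) can be sketched for a single condition in base R. This is an approximation under stated assumptions, not the package's code: it uses stats::cor as a stand-in for the WGCNA cor function, a Student t test for the null hypothesis of zero correlation, and a random toy matrix in place of real expression data.

```r
set.seed(1)
# Toy expression matrix: 5 genes (rows) x 14 samples (columns), random values.
exprs <- matrix(rnorm(5 * 14), nrow = 5,
                dimnames = list(paste0("g", 1:5), paste0("s", 1:14)))

n <- ncol(exprs)
r <- cor(t(exprs), method = "pearson")  # gene-by-gene correlation matrix

# Keep each unordered gene pair once by taking the upper triangle.
idx <- which(upper.tri(r), arr.ind = TRUE)
r.pair <- r[idx]

# Student t test of H0: correlation equals zero (n - 2 degrees of freedom).
t.stat <- r.pair * sqrt((n - 2) / (1 - r.pair^2))
p.pair <- 2 * pt(-abs(t.stat), df = n - 2)

pairs <- data.frame(
  Gene.1 = rownames(r)[idx[, "row"]],
  Gene.2 = colnames(r)[idx[, "col"]],
  cor.1  = r.pair,
  p.1    = p.pair,
  stringsAsFactors = FALSE
)

# Step c): adjust the p values, here with the default "BH" method.
pairs$q.1 <- p.adjust(pairs$p.1, method = "BH")
print(pairs)
```

Repeating the same computation for the second condition and joining the two tables by gene pair would give the cor.2, p.2 and q.2 columns that the later filtering steps operate on.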

Value

a list of two data frames.

The DCGs data frame contains genes that contribute to differentially correlated links (gene pairs) with q value less than q.dcgth. It has the following columns:

Gene

Gene ID

CLs

Number of links with absolute correlation coefficient greater than rth and q value less than qth in at least one condition

DCLs

Number of links that meet the criteria for CLs and the criteria that the absolute difference between the correlation coefficients of the two conditions is greater than r.diffth and q value less than q.diffth

DCL.same

Number of DCLs with same signed correlation coefficients in both conditions

DCL.diff

Number of DCLs with oppositely signed correlation coefficients under two conditions where only one of them has absolute correlation coefficient greater than rth and q value less than qth

DCL.switch

Number of DCLs with oppositely signed correlation coefficients under two conditions where both of them have absolute correlation coefficient greater than rth and q value less than qth

p

p value of having at least the observed number of DCLs given the number of CLs

q

adjusted p value
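
The p and q columns can be understood through a small binomial tail calculation. The sketch below assumes that the background probability of a coexpressed link being a DCL is the overall fraction of DCLs among CLs. That choice is an assumption made for illustration, and the package may estimate it differently. The per-gene counts are invented.

```r
# Toy per-gene counts (made up): coexpressed links (CLs) and DCLs per gene.
genes <- data.frame(
  Gene = c("g1", "g2", "g3", "g4"),
  CLs  = c(40, 25, 60, 10),
  DCLs = c(12, 2, 5, 1),
  stringsAsFactors = FALSE
)

# Assumed background rate: overall fraction of CLs that are DCLs.
p0 <- sum(genes$DCLs) / sum(genes$CLs)

# P(X >= DCLs) for X ~ Binomial(CLs, p0); pbinom gives P(X > k - 1).
genes$p <- pbinom(genes$DCLs - 1, size = genes$CLs, prob = p0,
                  lower.tail = FALSE)

# Adjust across genes and apply the q.dcgth cutoff (default 0.1).
genes$q <- p.adjust(genes$p, method = "BH")
q.dcgth <- 0.1
DCGs <- genes[genes$q < q.dcgth, ]
print(DCGs)
```

Only g1, whose share of DCLs is well above the background rate, survives the cutoff in this toy example. A gene with many DCLs but proportionally many CLs, such as g3, does not.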

The DCLs data frame contains the differentially correlated links (gene pairs) that meet two criteria: at least one of their correlation coefficients (cor.1 and/or cor.2) has absolute value greater than rth with the corresponding q value (q.1 and/or q.2) less than qth, and the absolute value of the difference between the correlation coefficients under the two conditions (cor.diff) is greater than r.diffth with q.diffcor less than q.diffth. It has the following columns:

Gene.1

Gene ID

Gene.2

Gene ID

cor.1

correlation coefficients under condition 1

cor.2

correlation coefficients under condition 2

cor.diff

difference between correlation coefficients under condition 2 and condition 1

p.1

p value under the null hypothesis that the correlation coefficient under condition 1 equals zero

p.2

p value under the null hypothesis that the correlation coefficient under condition 2 equals zero

p.diffcor

p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation

q.1

adjusted p value under the null hypothesis that the correlation coefficient under condition 1 equals zero

q.2

adjusted p value under the null hypothesis that the correlation coefficient under condition 2 equals zero

q.diffcor

adjusted p value under the null hypothesis that the difference between the correlation coefficients under the two conditions equals zero, using Fisher's r-to-Z transformation

type

can have value "same signed", "diff signed", or "switched opposites". "same signed" indicates that the gene pair has same signed correlation coefficients under both conditions. "diff signed" indicates that the gene pair has oppositely signed correlation coefficients under two conditions and only one of them meets the criteria that absolute correlation coefficient is greater than rth and q value less than qth. "switched opposites" indicates that the gene pair has oppositely signed correlation coefficients under two conditions and both of them meet the criteria that absolute correlation coefficient is greater than rth and q value less than qth.
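
The three values of type follow directly from the signs of the two correlation coefficients and from which conditions pass the rth and qth cutoffs. The sketch below encodes that rule as described in the text. The helper name classify_dcl is hypothetical, the input is assumed to contain only pairs already identified as DCLs, and treating a correlation of exactly zero as "same signed" is an assumption of this sketch.

```r
# classify_dcl is a hypothetical helper that mirrors the description of 'type'.
classify_dcl <- function(cor.1, cor.2, q.1, q.2, rth = 0.5, qth = 0.1) {
  pass.1 <- abs(cor.1) > rth & q.1 < qth  # coexpressed under condition 1
  pass.2 <- abs(cor.2) > rth & q.2 < qth  # coexpressed under condition 2
  opposite <- cor.1 * cor.2 < 0           # strictly opposite signs
  ifelse(!opposite, "same signed",
         ifelse(pass.1 & pass.2, "switched opposites", "diff signed"))
}

# Three invented DCLs, one per category.
type <- classify_dcl(
  cor.1 = c(0.90,  0.70,  0.80),
  cor.2 = c(0.20, -0.10, -0.75),
  q.1   = c(0.01,  0.02,  0.01),
  q.2   = c(0.50,  0.60,  0.03)
)
print(type)
# "same signed" "diff signed" "switched opposites"
```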

Author(s)

Wenbin Wei

References

1. de la Fuente A. From "differential expression" to "differential networking" - identification of dysfunctional regulatory networks in diseases. Trends in Genetics. 2010 Jul;26(7):326-33.

2. Jiang Z, Dong X, Li Z-G, He F, Zhang Z. Differential Coexpression Analysis Reveals Extensive Rewiring of Arabidopsis Gene Coexpression in Response to Pseudomonas syringae Infection. Scientific Reports. 2016 Dec;6(1):35064.

3. Yu H, Liu B-H, Ye Z-Q, Li C, Li Y-X, Li Y-Y. Link-based quantitative methods to identify differentially coexpressed genes and gene pairs. BMC bioinformatics. 2011;12(1):315.

Examples

library(diffcoexp)
data(gse4158part)
allowWGCNAThreads()
res <- diffcoexp(exprs.1 = exprs.1, exprs.2 = exprs.2, r.method = "spearman")
#The results are a list of two data frames, one for differentially co-expressed
#links (DCLs, gene pairs) and one for differentially co-expressed genes (DCGs).
str(res)

exprs.1

Description

Expression of 400 genes in 14 samples (GSM94988 to GSM95001) of yeast after pulses of 2 g/l glucose, from GEO series GSE4158: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE4158.

Usage

exprs.1

Format

A matrix with 400 genes and 14 samples.


exprs.2

Description

Expression of 400 genes in 14 samples (GSM94988 to GSM95001) of yeast after pulses of 0.2 g/l glucose, from GEO series GSE4158: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE4158.

Usage

exprs.2

Format

A matrix with 400 genes and 12 samples.