Package 'QuaternaryProd' reference manual

Title:	Computes the Quaternary Dot Product Scoring Statistic for Signed and Unsigned Causal Graphs
Description:	QuaternaryProd is an R package that performs causal reasoning on biological networks, including publicly available networks such as STRINGdb. QuaternaryProd is an open-source alternative to commercial products such as Ingenuity Pathway Analysis. For a given set of differentially expressed genes, QuaternaryProd computes the significance of upstream regulators in the network by performing causal reasoning using the Quaternary Dot Product Scoring Statistic (Quaternary Statistic), the Ternary Dot Product Scoring Statistic (Ternary Statistic) and Fisher's exact test (Enrichment test). The Quaternary Statistic handles signed, unsigned and ambiguous edges in the network. Ambiguity arises when the direction of causality is unknown, or when the source node (e.g., a protein) has edges with conflicting signs for the same target gene. The Ternary Statistic, by contrast, performs causal reasoning using only the signed and unambiguous edges. The vignette provides more details on the Quaternary Statistic and walks through an example of causal reasoning using STRINGdb.
Authors:	Carl Tony Fakhry [cre, aut], Ping Chen [ths], Kourosh Zarringhalam [aut, ths]
Maintainer:	Carl Tony Fakhry <[email protected]>
License:	GPL (>=3)
Version:	1.41.0
Built:	2025-03-23 06:28:37 UTC
Source:	https://github.com/bioc/QuaternaryProd

Computes the Quaternary Dot Product Scoring Statistic for Signed and Unsigned Causal Graphs

Description

QuaternaryProd is an R package that performs causal reasoning on biological networks, including publicly available networks such as STRINGdb. QuaternaryProd is an open-source alternative to commercial products such as Ingenuity Pathway Analysis. For a given set of differentially expressed genes, QuaternaryProd computes the significance of upstream regulators in the network by performing causal reasoning using the Quaternary Dot Product Scoring Statistic (Quaternary Statistic), the Ternary Dot Product Scoring Statistic (Ternary Statistic) and Fisher's exact test (Enrichment test). The Quaternary Statistic handles signed, unsigned and ambiguous edges in the network. Ambiguity arises when the direction of causality is unknown, or when the source node (e.g., a protein) has edges with conflicting signs for the same target gene. The Ternary Statistic, by contrast, performs causal reasoning using only the signed and unambiguous edges. The vignette provides more details on the Quaternary Statistic and walks through an example of causal reasoning using STRINGdb.

Details

Package:	QuaternaryProd
Type:	Package
Version:	1.15.3
Date:	2015-10-22
License:	GPL (>= 2)

Author(s)

Carl Tony Fakhry, Ping Chen and Kourosh Zarringhalam

Maintainer: Carl Tony Fakhry <[email protected]>

References

Carl Tony Fakhry, Parul Choudhary, Alex Gutteridge, Ben Sidders, Ping Chen, Daniel Ziemek, and Kourosh Zarringhalam. Interpreting transcriptional changes using causal graphs: new methods and their practical utility on public networks. BMC Bioinformatics, 17:318, 2016. ISSN 1471-2105. doi: 10.1186/s12859-016-1181-8.

Franceschini, A. et al. STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Research, 41(Database issue):D808-D815, 2013. doi: 10.1093/nar/gks1094.

Computes the probability mass function of the scores.

Description

This function computes the probability mass function of the Quaternary Dot Product Scoring Statistic for signed causal graphs. Only scores with probability strictly greater than zero are included.

Usage

QP_Pmf(q_p, q_m, q_z, q_r, n_p, n_m, n_z, epsilon = 1e-16)

Arguments

`q_p`	Expected number of positive predictions.
`q_m`	Expected number of negative predictions.
`q_z`	Expected number of nil predictions.
`q_r`	Expected number of regulated predictions.
`n_p`	Number of positive predictions from experiments.
`n_m`	Number of negative predictions from experiments.
`n_z`	Number of nil predictions from experiments.
`epsilon`	Threshold for probabilities of matrices. Default value is 1e-16.

Details

This function computes the probability of each score in the support of the distribution. It returns a vector of probabilities whose names are the corresponding scores.

Matrices with probabilities smaller than epsilon*D_max are ignored, where D_max is the numerator associated with the matrix of highest probability for the given constraints. Setting epsilon to zero computes the probability mass function without ignoring any matrices. The default value of 1e-16 has been experimentally validated as a reasonable threshold. Setting the threshold to a higher value (still smaller than 1) leads to underestimating the probability of each score, since more tables are ignored.

Value

Vector of probabilities for scores in the support.
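
The named-vector return format described above can be handled with base R alone. The following is a minimal sketch that uses a made-up vector, `toy_pmf`, as a stand-in for the output of `QP_Pmf`; the scores and probabilities are illustrative assumptions, not values produced by the package.

```r
# Toy stand-in for the named vector that QP_Pmf is described as returning.
# The scores and probabilities below are made up for illustration only.
toy_pmf <- c("-2" = 0.05, "-1" = 0.20, "0" = 0.50, "1" = 0.20, "2" = 0.05)

# The names carry the scores, so convert them back to integers.
scores <- as.integer(names(toy_pmf))

# Probability of one particular score, looked up by its name.
p_zero <- toy_pmf[["0"]]

# Most probable score and the total probability mass.
mode_score <- scores[which.max(toy_pmf)]
total_mass <- sum(toy_pmf)

# Tabulate scores next to their probabilities.
print(data.frame(score = scores, probability = unname(toy_pmf)))
```

Looking a score up by name (as a character string) avoids confusing a score such as 0 with a vector position.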

Author(s)

Carl Tony Fakhry, Ping Chen and Kourosh Zarringhalam

Examples

# Compute the probability mass function of the Quaternary Dot
# Product Scoring Statistic for the given table margins.
pmf <- QP_Pmf(50,50,50,0,50,50,50)

Computes the probability of a score.

Description

This function computes the probability of a score in the Quaternary Dot Product scoring distribution.

Usage

QP_Probability(score, q_p, q_m, q_z, q_r, n_p, n_m, n_z, epsilon = 1e-16)

Arguments

`score`	The score for which the probability will be computed.
`q_p`	Expected number of positive predictions.
`q_m`	Expected number of negative predictions.
`q_z`	Expected number of nil predictions.
`q_r`	Expected number of regulated predictions.
`n_p`	Number of positive predictions from experiments.
`n_m`	Number of negative predictions from experiments.
`n_z`	Number of nil predictions from experiments.
`epsilon`	Threshold for probabilities of matrices. Default value is 1e-16.

Details

To compute p-values, use QP_Pvalue instead, which is optimized for that purpose.

Value

A numeric value: the probability of the score.

Author(s)

Carl Tony Fakhry, Ping Chen and Kourosh Zarringhalam

Examples

# Computing the probability of score 0
# for the given table margins.
prob <- QP_Probability(0,50,50,50,0,50,50,50)

Computes the p-value of a score.

Description

This function computes the right-sided p-value for the Quaternary Dot Product Scoring Statistic.

Usage

QP_Pvalue(score, q_p, q_m, q_z, q_r, n_p, n_m, n_z, epsilon = 1e-16)

Arguments

`score`	The score for which the p-value will be computed.
`q_p`	Expected number of positive predictions.
`q_m`	Expected number of negative predictions.
`q_z`	Expected number of nil predictions.
`q_r`	Expected number of regulated predictions.
`n_p`	Number of positive predictions from experiments.
`n_m`	Number of negative predictions from experiments.
`n_z`	Number of nil predictions from experiments.
`epsilon`	Threshold for probabilities of matrices. Default value is 1e-16.

Value

A numeric value: the p-value of the score.
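
A right-sided p-value is, under the usual definition, the total probability of all scores at least as large as the observed one. The sketch below illustrates that definition on a made-up probability mass function. The helper `right_sided_pvalue` and the vector `toy_pmf` are hypothetical names introduced only for illustration; this is not how `QP_Pvalue` is implemented.

```r
# Toy probability mass function (made-up values), named by score.
toy_pmf <- c("-2" = 0.05, "-1" = 0.20, "0" = 0.50, "1" = 0.20, "2" = 0.05)

# Hypothetical helper: sum the probabilities of all scores that are
# greater than or equal to the observed score.
right_sided_pvalue <- function(pmf, observed) {
  scores <- as.integer(names(pmf))
  sum(pmf[scores >= observed])
}

# Right tail starting at score 1, i.e. P(score = 1) + P(score = 2).
pval_toy <- right_sided_pvalue(toy_pmf, 1)
print(pval_toy)
```

An observed score at or below the smallest score in the support gives a p-value of 1, and an observed score above the largest gives 0.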

Author(s)

Carl Tony Fakhry, Ping Chen and Kourosh Zarringhalam

Examples

# Computing The p-value of score 50 
# for the given table margins. 
pval <- QP_Pvalue(50,50,50,50,0,50,50,50)

Computes the p-value for a statistically significant score.

Description

This function computes the right-sided p-value for the Quaternary Dot Product Scoring Statistic, but only for statistically significant scores.

Usage

QP_SigPvalue(score, q_p, q_m, q_z, q_r, n_p, n_m, n_z, epsilon = 1e-16, sig_level = 0.05)

Arguments

`score`	The score for which the p-value will be computed.
`q_p`	Expected number of positive predictions.
`q_m`	Expected number of negative predictions.
`q_z`	Expected number of nil predictions.
`q_r`	Expected number of regulated predictions.
`n_p`	Number of positive predictions from experiments.
`n_m`	Number of negative predictions from experiments.
`n_z`	Number of nil predictions from experiments.
`epsilon`	Threshold for probabilities of matrices. Default value is 1e-16.
`sig_level`	Significance level of test hypothesis. Default value is 0.05.

Value

A numeric value: the p-value of the score if the score is statistically significant, and -1 otherwise.

Author(s)

Carl Tony Fakhry, Ping Chen and Kourosh Zarringhalam

Examples

# Computing The p-value of score 50 
# for the given table margins. 
pval <- QP_SigPvalue(50,50,50,50,0,50,50,50)

Computes the support for the scores.

Description

This function computes the support of the Quaternary Dot Product Scoring distribution for signed causal graphs. This includes all scores which have probabilities strictly greater than 0.

Usage

QP_Support(q_p, q_m, q_z, q_r, n_p, n_m, n_z)

Arguments

`q_p`	Expected number of positive predictions.
`q_m`	Expected number of negative predictions.
`q_z`	Expected number of nil predictions.
`q_r`	Expected number of regulated predictions.
`n_p`	Number of positive predictions from experiments.
`n_m`	Number of negative predictions from experiments.
`n_z`	Number of nil predictions from experiments.

Value

Integer vector of support.

Author(s)

Carl Tony Fakhry, Ping Chen and Kourosh Zarringhalam

Examples

# Compute the support of the Quaternary Dot Product Scoring distribution with the given margins.
QP_Support(50,50,50,0,50,50,50)

Runs a causal relation engine over the Homo sapiens STRINGdb causal network.

Description

This function runs a causal relation engine by computing the Quaternary Dot Product Scoring Statistic, the Ternary Dot Product Scoring Statistic or the Enrichment test over the Homo sapiens STRINGdb causal network (version 10, provided under the Creative Commons license: https://creativecommons.org/licenses/by/3.0/). The user can also supply a different causal network through the `relations` and `entities` arguments.

Usage

RunCRE_HSAStringDB(gene_expression_data, method = "Quaternary", 
                    fc.thresh = log2(1.3), pval.thresh = 0.05, 
                    only.significant.pvalues = FALSE, 
                    significance.level = 0.05,
                    epsilon = 1e-16, progressBar = TRUE, 
                    relations = NULL, entities = NULL)

Arguments

`gene_expression_data`	A data frame for gene expression data. The `gene_expression_data` data frame must have three columns `entrez`, `fc` and `pvalue`. `entrez` denotes the entrez id of a given gene, `fc` denotes the fold change of a gene, and `pvalue` denotes the p-value. The `entrez` column must be of type integer or character, and the `fc` and `pvalue` columns must be numeric values.
`method`	Choose one of `Quaternary`, `Ternary` or `Enrichment`. Default is `Quaternary`.
`fc.thresh`	Threshold for fold change in the `gene_expression_data` data frame. Any row in `gene_expression_data` with an absolute value of `fc` smaller than `fc.thresh` will be ignored. Default value is `fc.thresh = log2(1.3)`.
`pval.thresh`	Threshold for p-values in the `gene_expression_data` data frame. All rows in `gene_expression_data` with p-values greater than `pval.thresh` will be ignored. Default value is `pval.thresh = 0.05`.
`only.significant.pvalues`	If `only.significant.pvalues = TRUE`, only p-values for statistically significant regulators are computed; p-values that are not computed are set to -1. The default value is `only.significant.pvalues = FALSE`.
`significance.level`	When `only.significant.pvalues = TRUE`, only p-values which are less than or equal to `significance.level` are computed. The default value is `significance.level = 0.05`.
`epsilon`	Threshold for probabilities of matrices. Default value is `epsilon = 1e-16`.
`progressBar`	Whether to show a progress bar for the percentage of computed p-values for the regulators in the network. Default value is `progressBar = TRUE`.
`relations`	A data frame containing pairs of connected entities in a causal network, and the type of causal relation between them. The data frame must have three columns named `srcuid`, `trguid` and `mode`, in that order. `srcuid` is the source entity, `trguid` is the target entity and `mode` is the type of relation between `srcuid` and `trguid`. The relation must be one of +1 for upregulation, -1 for downregulation or 0 for regulation without a specified direction. All three columns must be of type integer. Default value is `relations = NULL`.
`entities`	A data frame of mappings for all entities present in the `relations` data frame. `entities` must contain four columns: `uid`, `id`, `symbol` and `type`, in that order. `uid` must be of type integer, and `id`, `symbol` and `type` must be of type character. `uid` includes every source and target node in the network (i.e., in `relations`), `id` is the id of `uid` (e.g., the Entrez id of an mRNA), `symbol` is the symbol of `id` and `type` is the type of entity of `id` (e.g., mRNA, protein, drug or compound). Default value is `entities = NULL`.
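
The column and type requirements listed above can be sketched in base R as follows. All identifiers, symbols and numeric values are made-up placeholders, and the `keep` filter only mimics the thresholding rules as worded in the argument descriptions; it is an assumption about their effect, not a copy of the package's internal code.

```r
# Hypothetical expression table; the Entrez ids and values are placeholders.
gene_expression_data <- data.frame(
  entrez = c("1001", "1002", "1003", "1004"),
  fc     = c(1.20, -0.90, 0.10, 0.80),
  pvalue = c(0.001, 0.020, 0.300, 0.200),
  stringsAsFactors = FALSE
)

# Rows that would survive the default thresholds as described above:
# |fc| at least fc.thresh and pvalue at most pval.thresh.
fc.thresh   <- log2(1.3)
pval.thresh <- 0.05
keep <- abs(gene_expression_data$fc) >= fc.thresh &
  gene_expression_data$pvalue <= pval.thresh
print(gene_expression_data[keep, ])

# Hypothetical custom network: three integer columns, in this order.
relations <- data.frame(
  srcuid = c(1L, 1L, 2L),
  trguid = c(3L, 4L, 4L),
  mode   = c(1L, -1L, 0L)  # +1 up, -1 down, 0 direction unknown
)

# One row per node in relations; uid is integer, the rest are character.
entities <- data.frame(
  uid    = 1:4,
  id     = c("P1", "P2", "1001", "1002"),
  symbol = c("REG1", "REG2", "GENE1", "GENE2"),
  type   = c("protein", "protein", "mRNA", "mRNA"),
  stringsAsFactors = FALSE
)

# Sanity check: every source and target uid is listed in entities.
all_mapped <- all(c(relations$srcuid, relations$trguid) %in% entities$uid)
print(all_mapped)
```

Data frames shaped like these would then be passed as the `gene_expression_data`, `relations` and `entities` arguments.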

Value

This function returns a data frame containing parameters concerning the method used, together with the p-value of each regulator. The data frame is sorted in increasing order of the p-values of the goodness-of-fit scores of the regulators. The columns of the data frame are:

`uid`	The regulator in the causal network.
`symbol`	Symbol of the regulator.
`regulation`	Direction of regulation of the regulator.
`correct.pred`	Number of correct predictions in `gene_expression_data` when compared to predictions made by the network.
`incorrect.pred`	Number of incorrect predictions in `gene_expression_data` when compared to predictions made by the network.
`score`	The number of correct predictions minus the number of incorrect predictions.
`total.reachable`	Total number of children of the given regulator.
`significant.reachable`	Number of children of the given regulator that are also present in `gene_expression_data`.
`total.ambiguous`	Total number of children of the given regulator that it regulates without a known direction of regulation.
`significant.ambiguous`	Number of children of the given regulator that it regulates without a known direction of regulation and that are also present in `gene_expression_data`.
`unknown`	Number of target nodes in the causal network which do not interact with the given regulator.
`pvalue`	P-value of the score computed according to the selected method. If `only.significant.pvalues = TRUE` and the p-value of the regulator is greater than `significance.level`, then the p-value is not computed and is set to -1.
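
Because -1 is used as a placeholder for p-values that were not computed, it is worth filtering those rows out before ranking regulators. The sketch below does this on a mock table, `results`, with made-up values and only a subset of the columns listed above; a real table would come from `RunCRE_HSAStringDB`.

```r
# Mock result table with made-up values and a subset of the columns.
results <- data.frame(
  uid            = c(11L, 12L, 13L),
  symbol         = c("REG1", "REG2", "REG3"),
  correct.pred   = c(12L, 5L, 3L),
  incorrect.pred = c(2L, 4L, 3L),
  pvalue         = c(0.001, 0.030, -1),
  stringsAsFactors = FALSE
)

# score is the number of correct predictions minus the incorrect ones.
results$score <- results$correct.pred - results$incorrect.pred

# A pvalue of -1 marks a p-value that was not computed, so drop those
# rows and then sort the remaining regulators by increasing p-value.
computed <- results[results$pvalue != -1, ]
computed <- computed[order(computed$pvalue), ]
print(computed)
```

Treating -1 as an ordinary number would wrongly place the uncomputed regulators at the top of an ascending sort.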

Author(s)

Carl Tony Fakhry, Ping Chen and Kourosh Zarringhalam

Examples


# Get gene expression data
e2f3 <- system.file("extdata", "e2f3_sig.txt", package = "QuaternaryProd")
e2f3 <- read.table(e2f3, sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# Rename column names appropriately and remove duplicated entrez ids
names(e2f3) <- c("entrez", "pvalue", "fc")
e2f3 <- e2f3[!duplicated(e2f3$entrez),]

# Run the Enrichment test, computing p-values only for statistically
# significant regulators in the STRINGdb network
enrichment_results <- RunCRE_HSAStringDB(e2f3, method = "Enrichment",
                             fc.thresh = log2(1.3), pval.thresh = 0.05,
                             only.significant.pvalues = TRUE)
enrichment_results[1:4, c("uid","symbol","regulation","pvalue")]
