Title: R implementation of information measures
Description: This package consolidates a comprehensive set of information measures, encompassing mutual information, conditional mutual information, interaction information, partial information decomposition and part mutual information.
Authors: Chu Pan [aut, cre]
Maintainer: Chu Pan <[email protected]>
License: Artistic-2.0
Version: 1.17.0
Built: 2024-10-30 07:32:43 UTC
Source: https://github.com/bioc/Informeasure
The CMI.measure function is used to calculate the expected mutual information between two random variables conditioned on the third one from the joint count table.
CMI.measure(
  XYZ,
  method = c("ML", "Jeffreys", "Laplace", "SG", "minimax", "shrink"),
  lambda.probs,
  unit = c("log", "log2", "log10"),
  verbose = TRUE
)
XYZ: a joint count distribution table of three random variables.
method: six probability estimation algorithms are available; "ML" is the default.
lambda.probs: the shrinkage intensity, only used when the probability estimator is "shrink".
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
verbose: a logical variable; if TRUE, the shrinkage intensity is reported.
Six probability estimation methods are available to estimate the underlying bin probabilities from the observed counts (a minimal sketch of these estimators follows the list):
method = "ML": maximum likelihood estimator, also referred to as the empirical probability,
method = "Jeffreys": Dirichlet distribution estimator with prior a = 0.5,
method = "Laplace": Dirichlet distribution estimator with prior a = 1,
method = "SG": Dirichlet distribution estimator with prior a = 1/length(XYZ),
method = "minimax": Dirichlet distribution estimator with prior a = sqrt(sum(XYZ))/length(XYZ),
method = "shrink": shrinkage estimator.
CMI.measure returns the conditional mutual information.
Hausser, J., & Strimmer, K. (2009). Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks. Journal of Machine Learning Research, 10, 1469-1484.
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
XYZ <- discretize3D(x, y, z, "uniform_width")

# corresponding conditional mutual information
CMI.measure(XYZ)
CMI.plugin measures the expected mutual information between two random variables conditioned on the third one from the joint probability distribution table.
CMI.plugin(probs, unit = c("log", "log2", "log10"))
probs: the joint probability distribution table of three random variables.
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
CMI.plugin returns the conditional mutual information.
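As a point of reference, the plug-in estimate can be written out directly from the entropy identity CMI(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z). The sketch below, with the hypothetical helper cmi.from.probs, applies that identity to a 3-dimensional probability array; it should agree with CMI.plugin up to floating-point error, but it is an illustration, not the package's code.

# a minimal sketch of the plug-in conditional mutual information
# (cmi.from.probs is a hypothetical helper, not the package's code)
cmi.from.probs <- function(probs, unit = "log") {
  log.fun <- switch(unit, log = log, log2 = log2, log10 = log10)
  H <- function(p) { p <- p[p > 0]; -sum(p * log.fun(p)) }  # plug-in entropy
  # CMI(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z)
  H(apply(probs, c(1, 3), sum)) + H(apply(probs, c(2, 3), sum)) -
    H(apply(probs, 3, sum)) - H(probs)
}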
Wyner, A. D. (1978). A definition of conditional mutual information for arbitrary ensembles. Information and Control, 38(1), 51-59.
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
count_xyz <- discretize3D(x, y, z, "uniform_width")

# the joint probability distribution table of the count data
library("entropy")
probs_xyz <- freqs.empirical(count_xyz)

# corresponding conditional mutual information
CMI.plugin(probs_xyz)
The discretize1D function assigns the observations of a continuous random variable to bins and returns the corresponding one-dimensional count table. Two of the most common discretization methods are available: "uniform width" and "uniform frequency".
discretize1D(x, algorithm = c("uniform_width", "uniform_frequency"))
x: a numeric vector of the random variable x.
algorithm: two discretization algorithms are available; "uniform_width" is the default.
The uniform-width method ("uniform_width") divides the continuous data into N bins of equal width, while the uniform-frequency method ("uniform_frequency") divides the data into N bins with (approximately) equal counts. In both methods, the number of bins N defaults to the rounded square root of the sample size.
discretize1D returns a one-dimensional count table.
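For intuition, here is a minimal sketch of the two binning rules in base R; it is illustrative only and need not match the package's internal breakpoints exactly.

# a minimal base-R sketch of the two binning rules (illustrative only)
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
N <- round(sqrt(length(x)))  # default number of bins: 3 for 9 observations

# "uniform_width": split the range of x into N intervals of equal width
table(cut(x, breaks = N))

# "uniform_frequency": place breaks at quantiles so bins hold roughly equal counts
breaks <- quantile(x, probs = seq(0, 1, length.out = N + 1))
table(cut(x, breaks = unique(breaks), include.lowest = TRUE))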
# a numeric vector corresponding to a continuous random variable
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)

# corresponding count table estimated by the "uniform width" algorithm
discretize1D(x, "uniform_width")

# corresponding count table estimated by the "uniform frequency" algorithm
discretize1D(x, "uniform_frequency")
discretize1d.uniform_frequency assigns the observations of a continuous random variable to bins according to the "uniform frequency" method and returns the corresponding count table.
discretize1d.uniform_frequency(x)
x: a numeric vector of a random variable.
The uniform-frequency method ("uniform_frequency") divides the continuous data into N bins with (approximately) equal counts. The number of bins N defaults to the rounded square root of the sample size.
discretize1d.uniform_frequency returns a one-dimensional count table.
# a numeric vector corresponding to a continuous random variable
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)

# corresponding count table estimated by the "uniform frequency" algorithm
discretize1d.uniform_frequency(x)
discretize1d.uniform_width assigns the observations of a continuous random variable to bins according to the "uniform width" method and returns the corresponding count table.
discretize1d.uniform_width(x)
x: a numeric vector of a random variable.
The uniform-width method ("uniform_width") divides the continuous data into N bins of equal width. The number of bins N defaults to the rounded square root of the sample size.
discretize1d.uniform_width returns a count table.
# a numeric vector corresponding to a continuous random variable
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)

# corresponding count table estimated by the "uniform width" algorithm
discretize1d.uniform_width(x)
The discretize2D function assigns the observations of two continuous random variables to bins and returns the corresponding two-dimensional count table. Two of the most common discretization methods are available: "uniform width" and "uniform frequency".
discretize2D(x, y, algorithm = c("uniform_width", "uniform_frequency"))
x: a numeric vector of the random variable x.
y: a numeric vector of the random variable y.
algorithm: two discretization algorithms are available; "uniform_width" is the default.
The uniform-width method ("uniform_width") divides the continuous data into N bins of equal width, while the uniform-frequency method ("uniform_frequency") divides the data into N bins with (approximately) equal counts. In both methods, the number of bins N defaults to the rounded square root of the sample size.
discretize2D returns a 2-dimensional count table.
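For intuition, a joint count table of this kind can be sketched in base R by binning each axis separately and cross-tabulating; this is illustrative only and need not reproduce the package's breakpoints exactly.

# a minimal base-R sketch of a 2-dimensional joint count table (illustrative only)
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
N <- round(sqrt(length(x)))  # 3 bins per axis for 9 observations

# bin each variable into N equal-width intervals, then cross-tabulate
table(cut(x, breaks = N), cut(y, breaks = N))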
# two numeric vectors that correspond to two continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)

# corresponding count table estimated by the "uniform width" algorithm
discretize2D(x, y, "uniform_width")

# corresponding count table estimated by the "uniform frequency" algorithm
discretize2D(x, y, "uniform_frequency")
discretize2d.uniform_frequency assigns the observations of two continuous random variables to bins according to the "uniform frequency" method, and returns a corresponding 2-dimensional count table.
discretize2d.uniform_frequency(x, y)
x: a numeric vector of the first random variable.
y: a numeric vector of the second random variable.
The uniform-frequency method ("uniform_frequency") divides the continuous data into N bins with (approximately) equal counts. The number of bins N defaults to the rounded square root of the sample size.
discretize2d.uniform_frequency returns a 2-dimensional count table.
# two numeric vectors corresponding to two continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)

# corresponding joint count table estimated by the "uniform frequency" algorithm
discretize2d.uniform_frequency(x, y)
discretize2d.uniform_width assigns the observations of two continuous random variables to bins according to the "uniform width" method, and returns a corresponding 2-dimensional count table.
discretize2d.uniform_width(x, y)
x: a numeric vector of the first random variable.
y: a numeric vector of the second random variable.
The uniform-width method ("uniform_width") divides the continuous data into N bins of equal width. The number of bins N defaults to the rounded square root of the sample size.
discretize2d.uniform_width returns a 2-dimensional count table.
# two numeric vectors corresponding to two continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)

# corresponding joint count table estimated by the "uniform width" algorithm
discretize2d.uniform_width(x, y)
The discretize3D function assigns the observations of three continuous random variables to bins and returns the corresponding three-dimensional count table. Two of the most common discretization methods are available: "uniform width" and "uniform frequency".
discretize3D(x, y, z, algorithm = c("uniform_width", "uniform_frequency"))
x: a numeric vector of the random variable x.
y: a numeric vector of the random variable y.
z: a numeric vector of the random variable z.
algorithm: two discretization algorithms are available; "uniform_width" is the default.
The uniform-width method ("uniform_width") divides the continuous data into N bins of equal width, while the uniform-frequency method ("uniform_frequency") divides the data into N bins with (approximately) equal counts. In both methods, the number of bins N defaults to the rounded square root of the sample size.
discretize3D returns a 3-dimensional count table.
# three vectors that correspond to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding count table estimated by the "uniform width" algorithm
discretize3D(x, y, z, "uniform_width")

# corresponding count table estimated by the "uniform frequency" algorithm
discretize3D(x, y, z, "uniform_frequency")
discretize3d.uniform_frequency assigns the observations of three continuous random variables to bins according to the "uniform frequency" method, and returns a corresponding 3-dimensional count table.
discretize3d.uniform_frequency(x, y, z)
x: a numeric vector of the first random variable.
y: a numeric vector of the second random variable.
z: a numeric vector of the third random variable.
The uniform-frequency method ("uniform_frequency") divides the continuous data into N bins with (approximately) equal counts. The number of bins N defaults to the rounded square root of the sample size.
discretize3d.uniform_frequency returns a 3-dimensional count table.
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform frequency" algorithm
discretize3d.uniform_frequency(x, y, z)
discretize3d.uniform_width assigns the observations of three continuous random variables to bins according to the "uniform width" method, and returns a corresponding 3-dimensional count table.
discretize3d.uniform_width(x, y, z)
x: a numeric vector of the first random variable.
y: a numeric vector of the second random variable.
z: a numeric vector of the third random variable.
The uniform-width method ("uniform_width") divides the continuous data into N bins of equal width. The number of bins N defaults to the rounded square root of the sample size.
discretize3d.uniform_width returns a 3-dimensional count table.
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
discretize3d.uniform_width(x, y, z)
The II.measure function is used to calculate the amount of information contained in a set of variables from the joint count table. The number of variables here is limited to three.
II.measure(
  XYZ,
  method = c("ML", "Jeffreys", "Laplace", "SG", "minimax", "shrink"),
  lambda.probs,
  unit = c("log", "log2", "log10"),
  verbose = TRUE
)
XYZ: a joint count distribution table of three random variables.
method: six probability estimation algorithms are available; "ML" is the default.
lambda.probs: the shrinkage intensity, only used when the probability estimator is "shrink".
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
verbose: a logical variable; if TRUE, the shrinkage intensity is reported.
Six probability estimation methods are available to estimate the underlying bin probabilities from the observed counts (see the estimator sketch under CMI.measure):
method = "ML": maximum likelihood estimator, also referred to as the empirical probability,
method = "Jeffreys": Dirichlet distribution estimator with prior a = 0.5,
method = "Laplace": Dirichlet distribution estimator with prior a = 1,
method = "SG": Dirichlet distribution estimator with prior a = 1/length(XYZ),
method = "minimax": Dirichlet distribution estimator with prior a = sqrt(sum(XYZ))/length(XYZ),
method = "shrink": shrinkage estimator.
II.measure returns the interaction information.
Hausser, J., & Strimmer, K. (2009). Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks. Journal of Machine Learning Research, 10, 1469-1484.
McGill, W. J. (1954). Multivariate information transmission. Psychometrika, 19(2), 97-116.
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
XYZ <- discretize3D(x, y, z, "uniform_width")

# corresponding interaction information
II.measure(XYZ)
II.plugin measures the amount of information contained in a set of variables from the joint probability distribution table. The number of variables here is limited to three.
II.plugin(probs, unit = c("log", "log2", "log10"))
probs: the joint probability distribution table of three random variables.
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
II.plugin returns the interaction information.
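For reference, one common sign convention writes the interaction information as II(X;Y;Z) = I(X;Y|Z) - I(X;Y), which expands into a sum of entropies. The sketch below applies that expansion to a 3-dimensional probability array; the helper ii.from.probs is hypothetical, and II.plugin may adopt the opposite sign convention.

# a minimal sketch of interaction information under the convention
# II(X;Y;Z) = I(X;Y|Z) - I(X;Y); ii.from.probs is a hypothetical helper
ii.from.probs <- function(probs, unit = "log") {
  log.fun <- switch(unit, log = log, log2 = log2, log10 = log10)
  H <- function(p) { p <- p[p > 0]; -sum(p * log.fun(p)) }  # plug-in entropy
  # entropy expansion: H(XY) + H(XZ) + H(YZ) - H(X) - H(Y) - H(Z) - H(XYZ)
  H(apply(probs, c(1, 2), sum)) + H(apply(probs, c(1, 3), sum)) +
    H(apply(probs, c(2, 3), sum)) -
    H(apply(probs, 1, sum)) - H(apply(probs, 2, sum)) -
    H(apply(probs, 3, sum)) - H(probs)
}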
McGill, W. J. (1954). Multivariate information transmission. Psychometrika, 19(2), 97-116.
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
count_xyz <- discretize3D(x, y, z, "uniform_width")

# the joint probability distribution table of the count data
library("entropy")
probs_xyz <- freqs.empirical(count_xyz)

# corresponding interaction information
II.plugin(probs_xyz)
The MI.measure function is used to calculate the mutual information between two random variables from the joint count table.
MI.measure(
  XY,
  method = c("ML", "Jeffreys", "Laplace", "SG", "minimax", "shrink"),
  lambda.probs,
  unit = c("log", "log2", "log10"),
  verbose = TRUE
)
XY: a joint count distribution table of two random variables.
method: six probability estimation algorithms are available; "ML" is the default.
lambda.probs: the shrinkage intensity, only used when the probability estimator is "shrink".
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
verbose: a logical variable; if TRUE, the shrinkage intensity is reported.
Six probability estimation methods are available to estimate the underlying bin probabilities from the observed counts (see the estimator sketch under CMI.measure):
method = "ML": maximum likelihood estimator, also referred to as the empirical probability,
method = "Jeffreys": Dirichlet distribution estimator with prior a = 0.5,
method = "Laplace": Dirichlet distribution estimator with prior a = 1,
method = "SG": Dirichlet distribution estimator with prior a = 1/length(XY),
method = "minimax": Dirichlet distribution estimator with prior a = sqrt(sum(XY))/length(XY),
method = "shrink": shrinkage estimator.
MI.measure returns the mutual information.
Hausser, J., & Strimmer, K. (2009). Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks. Journal of Machine Learning Research, 10, 1469-1484.
Wyner, A. D. (1978). A definition of conditional mutual information for arbitrary ensembles. Information and Control, 38(1), 51-59.
# two numeric vectors corresponding to two continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)

# corresponding joint count table estimated by the "uniform width" algorithm
XY <- discretize2D(x, y, "uniform_width")

# corresponding mutual information
MI.measure(XY)
MI.plugin measures the mutual information between two random variables from the joint probability distribution table.
MI.plugin(probs, unit = c("log", "log2", "log10"))
probs: the joint probability distribution table of two random variables.
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
MI.plugin returns the mutual information.
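For reference, the plug-in mutual information is I(X;Y) = sum over (x, y) of p(x,y) log[p(x,y) / (p(x) p(y))]. A minimal sketch over a 2-dimensional probability table follows; mi.from.probs is a hypothetical helper, not the package's code.

# a minimal sketch of the plug-in mutual information
# (mi.from.probs is a hypothetical helper, not the package's code)
mi.from.probs <- function(probs, unit = "log") {
  log.fun <- switch(unit, log = log, log2 = log2, log10 = log10)
  px <- rowSums(probs)    # marginal distribution p(x)
  py <- colSums(probs)    # marginal distribution p(y)
  indep <- outer(px, py)  # product of marginals p(x) * p(y)
  nz <- probs > 0         # 0 * log 0 is treated as 0
  sum(probs[nz] * log.fun(probs[nz] / indep[nz]))
}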
# two numeric vectors corresponding to two continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)

# corresponding joint count table estimated by the "uniform width" algorithm
count_xy <- discretize2D(x, y, "uniform_width")

# the joint probability distribution table of the count data
library("entropy")
probs_xy <- freqs.empirical(count_xy)

# corresponding mutual information
MI.plugin(probs_xy)
The PID.measure function is used to decompose the information that two source variables carry about a common target into four parts: joint information (synergy), unique information from source x, unique information from source y, and shared information (redundancy). The input of PID.measure is the joint count table.
PID.measure(
  XYZ,
  method = c("ML", "Jeffreys", "Laplace", "SG", "minimax", "shrink"),
  lambda.probs,
  unit = c("log", "log2", "log10"),
  verbose = TRUE
)
XYZ: a joint count distribution table of three random variables.
method: six probability estimation algorithms are available; "ML" is the default.
lambda.probs: the shrinkage intensity, only used when the probability estimator is "shrink".
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
verbose: a logical variable; if TRUE, the shrinkage intensity is reported.
Six probability estimation methods are available to estimate the underlying bin probabilities from the observed counts (see the estimator sketch under CMI.measure):
method = "ML": maximum likelihood estimator, also referred to as the empirical probability,
method = "Jeffreys": Dirichlet distribution estimator with prior a = 0.5,
method = "Laplace": Dirichlet distribution estimator with prior a = 1,
method = "SG": Dirichlet distribution estimator with prior a = 1/length(XYZ),
method = "minimax": Dirichlet distribution estimator with prior a = sqrt(sum(XYZ))/length(XYZ),
method = "shrink": shrinkage estimator.
PID.measure returns a list that includes the synergistic information, the unique information from x, the unique information from y, the redundant information, and the sum of the four parts.
Hausser, J., & Strimmer, K. (2009). Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks. Journal of Machine Learning Research, 10, 1469-1484.
Williams, P. L., & Beer, R. D. (2010). Nonnegative Decomposition of Multivariate Information. arXiv:1004.2515.
Chan, T. E., Stumpf, M. P., & Babtie, A. C. (2017). Gene Regulatory Network Inference from Single-Cell Data Using Multivariate Information Measures. Cell Systems, 5(3).
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
XYZ <- discretize3D(x, y, z, "uniform_width")

# corresponding partial information decomposition
PID.measure(XYZ)

# corresponding count table estimated by the "uniform frequency" algorithm
XYZ <- discretize3D(x, y, z, "uniform_frequency")

# corresponding partial information decomposition
PID.measure(XYZ)
PID.plugin decomposes the information that two source variables carry about a common target into four parts: joint information (synergy), unique information from source x, unique information from source y, and shared information (redundancy). The input of PID.plugin is the joint probability distribution table.
PID.plugin(probs, unit = c("log", "log2", "log10"))
probs: the joint probability distribution table of three random variables.
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
PID.plugin returns a list that includes synergistic information, unique information from source x, unique information from source y, redundant information and the sum of the four parts of information.
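For orientation, with z as the target the four parts are tied to ordinary mutual information by the Williams-Beer accounting identities, so the four returned components should sum to the joint mutual information I(X,Y;Z):

\begin{aligned}
I(X,Y;Z) &= \mathrm{Red} + \mathrm{Unq}_X + \mathrm{Unq}_Y + \mathrm{Syn},\\
I(X;Z)   &= \mathrm{Red} + \mathrm{Unq}_X,\\
I(Y;Z)   &= \mathrm{Red} + \mathrm{Unq}_Y.
\end{aligned}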
Williams, P. L., & Beer, R. D. (2010). Nonnegative Decomposition of Multivariate Information. arXiv:1004.2515.
Chan, T. E., Stumpf, M. P., & Babtie, A. C. (2017). Gene Regulatory Network Inference from Single-Cell Data Using Multivariate Information Measures. Cell Systems, 5(3).
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
count_xyz <- discretize3D(x, y, z, "uniform_width")

# the joint probability distribution table of the count data
library("entropy")
probs_xyz <- freqs.empirical(count_xyz)

# corresponding partial information decomposition
PID.plugin(probs_xyz)
The PMI.measure function is used to calculate the nonlinear direct dependency between two variables conditioned on a third one from the joint count table.
PMI.measure(
  XYZ,
  method = c("ML", "Jeffreys", "Laplace", "SG", "minimax", "shrink"),
  lambda.probs,
  unit = c("log", "log2", "log10"),
  verbose = TRUE
)
XYZ: a joint count distribution table of three random variables.
method: six probability estimation algorithms are available; "ML" is the default.
lambda.probs: the shrinkage intensity, only used when the probability estimator is "shrink".
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
verbose: a logical variable; if TRUE, the shrinkage intensity is reported.
Six probability estimation methods are available to estimate the underlying bin probabilities from the observed counts (see the estimator sketch under CMI.measure):
method = "ML": maximum likelihood estimator, also referred to as the empirical probability,
method = "Jeffreys": Dirichlet distribution estimator with prior a = 0.5,
method = "Laplace": Dirichlet distribution estimator with prior a = 1,
method = "SG": Dirichlet distribution estimator with prior a = 1/length(XYZ),
method = "minimax": Dirichlet distribution estimator with prior a = sqrt(sum(XYZ))/length(XYZ),
method = "shrink": shrinkage estimator.
PMI.measure returns the part mutual information.
Hausser, J., & Strimmer, K. (2009). Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks. Journal of Machine Learning Research, 10, 1469-1484.
Zhao, J., Zhou, Y., Zhang, X., & Chen, L. (2016). Part mutual information for quantifying direct associations in networks. Proceedings of the National Academy of Sciences of the United States of America, 113(18), 5130-5135.
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
XYZ <- discretize3D(x, y, z, "uniform_width")

# corresponding part mutual information
PMI.measure(XYZ)
PMI.plugin measures the nonlinear direct dependency between two variables conditioned on a third one from the joint probability distribution table.
PMI.plugin(probs, unit = c("log", "log2", "log10"))
probs: the joint probability distribution table of three random variables.
unit: the base of the logarithm. The default is the natural logarithm ("log"). To evaluate entropy in bits, set the unit to "log2".
PMI.plugin returns the part mutual information.
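For orientation, Zhao et al. (2016) define part mutual information by replacing the conditional marginals in CMI with partially conditioned distributions; schematically, and deferring to the reference below for the precise definition, it reads:

\operatorname{PMI}(X;Y\mid Z)
  = \sum_{x,y,z} p(x,y,z)\,
    \log\frac{p(x,y\mid z)}{p^{*}(x\mid z)\, p^{*}(y\mid z)},
\qquad
p^{*}(x\mid z) = \sum_{y} p(x\mid y,z)\, p(y).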
Zhao, J., Zhou, Y., Zhang, X., & Chen, L. (2016). Part mutual information for quantifying direct associations in networks. Proceedings of the National Academy of Sciences of the United States of America, 113(18), 5130-5135.
# three numeric vectors corresponding to three continuous random variables
x <- c(0.0, 0.2, 0.2, 0.7, 0.9, 0.9, 0.9, 0.9, 1.0)
y <- c(1.0, 2.0, 12, 8.0, 1.0, 9.0, 0.0, 3.0, 9.0)
z <- c(3.0, 7.0, 2.0, 11, 10, 10, 14, 2.0, 11)

# corresponding joint count table estimated by the "uniform width" algorithm
count_xyz <- discretize3D(x, y, z, "uniform_width")

# the joint probability distribution table of the count data
library("entropy")
probs_xyz <- freqs.empirical(count_xyz)

# corresponding part mutual information
PMI.plugin(probs_xyz)