Exact Common Information

The exact common information [KLEG14] is the entropy of the smallest variable \(V\) which renders all variables of interest independent:

\[\G{X_{0:n} | Y_{0:m}} = \min_{\ind X_{0:n} \mid Y_{0:m}, V} \H{V | Y_{0:m}}\]

Subadditivity of Independent Variables

Kumar et. al. [KLEG14] have shown that the exact common information of a pair of independent pairs of variables can be less than the sum of their individual exact common informations. Here we verify this claim:

In [1]: from dit.multivariate import exact_common_information as G

In [2]: d = dit.Distribution([(0,0), (0,1), (1,0)], [1/3]*3)

In [3]: d2 = d @ d  # python 3.6 syntax for d.__matmul__(d)

In [4]: print(d2)
Class:    Distribution
Alphabet: ((0, 1), (0, 1), (0, 1), (0, 1))
Base:     linear

x              p(X0,X1,X2,X3)
(0, 0, 0, 0)   1/9
(0, 0, 0, 1)   1/9
(0, 0, 1, 0)   1/9
(0, 1, 0, 0)   1/9
(0, 1, 0, 1)   1/9
(0, 1, 1, 0)   1/9
(1, 0, 0, 0)   1/9
(1, 0, 0, 1)   1/9
(1, 0, 1, 0)   1/9

In [5]: 2*G(d)
Out[5]: 1.8365163897965393

In [6]: G(d2, [[0, 2], [1, 3]])
Out[6]: 1.7524300900019965

API

exact_common_information(dist, rvs=None, crvs=None, niter=None, maxiter=1000, polish=1e-06, bound=None, backend='numpy')

Computes the exact common information, min H[V] where V renders all rvs independent.

Parameters:

dist (Distribution) – The distribution for which the exact common information will be computed.
rvs (list, None) – A list of lists. Each inner list specifies the indexes of the random variables used to calculate the exact common information. If None, then it is calculated over all random variables, which is equivalent to passing rvs=dist.rvs.
crvs (list, None) – A single list of indexes specifying the random variables to condition on. If None, then no variables are conditioned on.
niter (int > 0) – Number of basin hoppings to perform during the optimization.
maxiter (int > 0) – The number of iterations of the optimization subroutine to perform.
polish (False, float) – Whether to polish the result or not. If a float, this will perform a second optimization seeded with the result of the first, but with smaller tolerances and probabilities below polish set to 0. If False, don’t polish.
bound (int) – Bound the size of the Markov variable.
backend (str) – The optimization backend to use. One of 'numpy' (default), 'jax', or 'torch'.

Returns:

ci – The exact common information.

Return type:

float