# Optimization

It is often useful to construct a distribution $$d^\prime$$ that is consistent with some marginal aspects of $$d$$ but otherwise optimizes some information measure. For example, we might want a distribution that matches the pairwise marginals of another distribution but otherwise has maximum entropy:

In [1]: from dit.algorithms.scipy_optimizers import MaxEntOptimizer

In [2]: xor = dit.example_dists.Xor()

In [3]: meo = MaxEntOptimizer(xor, [[0,1], [0,2], [1,2]])

In [4]: meo.optimize()

In [5]: dp = meo.construct_dist()

In [6]: print(dp)


## Helper Functions

There are three special functions to handle common optimization problems:

In [7]: from dit.algorithms import maxent_dist, marginal_maxent_dists, pid_broja


The first, maxent_dist, constructs the maximum entropy distribution with a specified set of marginals held fixed. It encapsulates the steps run above:

In [8]: print(maxent_dist(xor, [[0,1], [0,2], [1,2]]))
Class:          Distribution
Alphabet:       ('0', '1') for all rvs
Base:           linear
Outcome Class:  str
Outcome Length: 3
RV Names:       None

x     p(x)
000   2444573/19556583
001   2048462/16387697
010   1/8
011   3274257/26194055
100   3976016/31808129
101   1795124/14360991
110   1927555/15420441
111   1/8
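The result above can be sanity-checked by hand. The following is a minimal sketch in plain Python that does not use dit: the dictionary representation and the helper names `marginal` and `entropy` are assumptions made for illustration. It shows that the uniform distribution over three bits has the same pairwise marginals as xor, and since 3 bits is the largest entropy any distribution over eight outcomes can have, the uniform distribution is the maximum entropy solution. The printed fractions each differ from 1/8 by less than 1e-8, presumably because they are rational approximations of a numerical result.

```python
from fractions import Fraction
from itertools import product
from math import log2

def marginal(dist, idxs):
    """Marginalize a dict {outcome: probability} onto the given variable indices."""
    out = {}
    for outcome, p in dist.items():
        key = tuple(outcome[i] for i in idxs)
        out[key] = out.get(key, 0.0) + p
    return out

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# xor: X2 = X0 xor X1, with X0 and X1 independent fair bits.
xor = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
# Candidate solution: uniform over all eight three-bit outcomes.
uniform = {o: 0.125 for o in product((0, 1), repeat=3)}

pairs = [(0, 1), (0, 2), (1, 2)]
marginals_match = all(marginal(xor, ij) == marginal(uniform, ij) for ij in pairs)

# The six non-trivial fractions copied from the printed output.
printed = [
    Fraction(2444573, 19556583), Fraction(2048462, 16387697),
    Fraction(3274257, 26194055), Fraction(3976016, 31808129),
    Fraction(1795124, 14360991), Fraction(1927555, 15420441),
]
max_dev = max(abs(p - Fraction(1, 8)) for p in printed)

print(marginals_match)                   # True
print(entropy(xor), entropy(uniform))    # 2.0 3.0
print(float(max_dev) < 1e-8)             # True
```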


The second, marginal_maxent_dists, constructs a sequence of maximum entropy distributions, where the k-th distribution fixes the marginals over all subsets of k variables:

In [9]: k0, k1, k2, k3 = marginal_maxent_dists(xor)


where k0 is the maxent dist constrained only to have the same alphabets as xor; k1 fixes $$p(x_0)$$, $$p(x_1)$$, and $$p(x_2)$$; k2 fixes $$p(x_0, x_1)$$, $$p(x_0, x_2)$$, and $$p(x_1, x_2)$$ (as in the maxent_dist example above); and finally k3 fixes $$p(x_0, x_1, x_2)$$ (i.e., it is the distribution we started with).
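To make the hierarchy concrete, here is a small sketch in plain Python (no dit calls; the helper names are hypothetical) that builds the four distributions for xor by hand. It relies on two assumptions: that the maximum entropy distribution with only single-variable marginals fixed is the product of those marginals, which is a standard result, and that k2 for xor is uniform, as suggested by the maxent_dist output shown earlier.

```python
from itertools import product
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a dict {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def product_of_marginals(dist, n):
    """Product of the single-variable marginals of an n-variable distribution."""
    singles = []
    for i in range(n):
        m = {}
        for outcome, p in dist.items():
            m[outcome[i]] = m.get(outcome[i], 0.0) + p
        singles.append(m)
    out = {}
    for outcome in product(*(sorted(m) for m in singles)):
        p = 1.0
        for i, symbol in enumerate(outcome):
            p *= singles[i][symbol]
        out[outcome] = p
    return out

xor = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}

k0 = {o: 0.125 for o in product((0, 1), repeat=3)}  # only the alphabets are fixed
k1 = product_of_marginals(xor, 3)                    # single-variable marginals fixed
k2 = dict(k0)                                        # assumption: uniform, per the output above
k3 = xor                                             # full joint fixed

entropies = [entropy(k) for k in (k0, k1, k2, k3)]
print(entropies)  # [3.0, 3.0, 3.0, 2.0]
```

For xor the entropy only drops at the last step, which reflects that all of its structure lives in the three-way interaction rather than in any single variable or pair.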

### Partial Information Decomposition

Finally, we have pid_broja(). This computes the two-input, one-output partial information decomposition as defined in [BRO+14]. We can compute the partial information decomposition where $$X_0$$ and $$X_1$$ are interpreted as inputs, and $$X_2$$ as the output, with the following code:

In [10]: sources = [[0], [1]]

In [11]: target = [2]

In [12]: pid_broja(xor, sources, target)

For xor, the decomposition assigns zero redundancy (R), no unique information to either input (U0, U1), and 1 bit of synergy (S).
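These values can be cross-checked without running the optimizer. The sketch below is plain Python, not the pid_broja implementation, and its helper names are hypothetical. It assumes the usual two-input consistency equations $$I(X_0; X_2) = R + U_0$$, $$I(X_1; X_2) = R + U_1$$, and $$I(X_0 X_1; X_2) = R + U_0 + U_1 + S$$, together with nonnegativity of all four terms.

```python
from itertools import product
from math import log2

def marginal(dist, idxs):
    """Marginalize a dict {outcome: probability} onto the given variable indices."""
    out = {}
    for outcome, p in dist.items():
        key = tuple(outcome[i] for i in idxs)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_information(dist, xs, ys):
    """I(X; Y) in bits, where xs and ys are tuples of variable indices."""
    px, py, pxy = marginal(dist, xs), marginal(dist, ys), marginal(dist, xs + ys)
    n = len(xs)
    return sum(
        p * log2(p / (px[key[:n]] * py[key[n:]]))
        for key, p in pxy.items() if p > 0
    )

xor = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}

i0 = mutual_information(xor, (0,), (2,))      # I(X0; X2)
i1 = mutual_information(xor, (1,), (2,))      # I(X1; X2)
i01 = mutual_information(xor, (0, 1), (2,))   # I(X0 X1; X2)

# Because i0 = i1 = 0 and every term is nonnegative, R + U0 = 0 and
# R + U1 = 0 force R = U0 = U1 = 0. This shortcut is specific to xor.
R = U0 = U1 = 0.0
S = i01 - (R + U0 + U1)

print(i0, i1, i01)  # 0.0 0.0 1.0
print(S)            # 1.0
```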

dit.algorithms.scipy_optimizers provides two optimization classes for optimizing some quantity while matching arbitrary marginals from a reference distribution. The first, dit.algorithms.scipy_optimizers.BaseConvexOptimizer, is for use when the objective is convex, while the second, dit.algorithms.scipy_optimizers.BaseNonConvexOptimizer, is for use when the objective is non-convex. To define a new optimization, subclass one of these two and implement the objective method.