Operations
There are several operations possible on joint random variables. Let’s consider the standard xor
distribution:
In [1]: d = dit.Distribution(['000', '011', '101', '110'], [1/4]*4)
In [2]: d.set_rv_names('XYZ')
Marginal
dit supports two ways of selecting only a subset of random variables. marginal()
returns a distribution containing only the random variables specified, whereas marginalize()
returns a distribution containing all random variables except the ones specified:
In [3]: print(d.marginal('XY'))
Class: Distribution
Alphabet: ('0', '1') for all rvs
Base: linear
Outcome Class: str
Outcome Length: 2
RV Names: ('X', 'Y')
x p(x)
00 1/4
01 1/4
10 1/4
11 1/4
In [4]: print(d.marginalize('XY'))
Class: Distribution
Alphabet: ('0', '1') for all rvs
Base: linear
Outcome Class: str
Outcome Length: 1
RV Names: ('Z',)
x p(x)
0 1/2
1 1/2
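
As a rough, library-free illustration of what the two calls compute, the sketch below stores the xor distribution as a plain dictionary and sums probabilities over the dropped positions. The helpers `marginal_sketch` and `marginalize_sketch` are hypothetical names introduced here for illustration; they are not part of dit, and they use positional indices rather than names.

```python
from collections import defaultdict
from fractions import Fraction

# Joint pmf of the xor distribution; position i of each outcome is variable i.
xor = {o: Fraction(1, 4) for o in ('000', '011', '101', '110')}

def marginal_sketch(pmf, keep):
    """Keep only the variable positions in `keep`; sum out all others."""
    out = defaultdict(Fraction)
    for outcome, p in pmf.items():
        out[''.join(outcome[i] for i in keep)] += p
    return dict(out)

def marginalize_sketch(pmf, drop):
    """Sum out the variable positions in `drop`; keep all others."""
    n = len(next(iter(pmf)))
    return marginal_sketch(pmf, [i for i in range(n) if i not in drop])

p_xy = marginal_sketch(xor, [0, 1])    # analogous to d.marginal('XY')
p_z = marginalize_sketch(xor, [0, 1])  # analogous to d.marginalize('XY')
```

The two helpers are complements of each other: keeping positions 0 and 1 is the same as dropping position 2.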

Distribution.marginal(rvs, rv_mode=None)
Returns a marginal distribution.
Parameters:  rvs (list) – The random variables to keep. All others are marginalized.
 rv_mode (str, None) – Specifies how to interpret the elements of rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of self._rv_mode is consulted.
Returns: d – A new joint distribution with the random variables in rvs kept and all others marginalized.
Return type: joint distribution

Distribution.marginalize(rvs, rv_mode=None)
Returns a new distribution after marginalizing random variables.
Parameters:  rvs (list) – The random variables to marginalize. All others are kept.
 rv_mode (str, None) – Specifies how to interpret the elements of rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of self._rv_mode is consulted.
Returns: d – A new joint distribution with the random variables in rvs marginalized and all others kept.
Return type: joint distribution
Conditional
We can also condition on a subset of random variables:
In [5]: marginal, cdists = d.condition_on('XY')
In [6]: print(marginal)
Class: Distribution
Alphabet: ('0', '1') for all rvs
Base: linear
Outcome Class: str
Outcome Length: 2
RV Names: ('X', 'Y')
x p(x)
00 1/4
01 1/4
10 1/4
11 1/4
In [7]: print(cdists[0]) # XY = 00
Class: Distribution
Alphabet: ('0', '1') for all rvs
Base: linear
Outcome Class: str
Outcome Length: 1
RV Names: ('Z',)
x p(x)
0 1
In [8]: print(cdists[1]) # XY = 01
Class: Distribution
Alphabet: ('0', '1') for all rvs
Base: linear
Outcome Class: str
Outcome Length: 1
RV Names: ('Z',)
x p(x)
1 1
In [9]: print(cdists[2]) # XY = 10
Class: Distribution
Alphabet: ('0', '1') for all rvs
Base: linear
Outcome Class: str
Outcome Length: 1
RV Names: ('Z',)
x p(x)
1 1
In [10]: print(cdists[3]) # XY = 11
Class: Distribution
Alphabet: ('0', '1') for all rvs
Base: linear
Outcome Class: str
Outcome Length: 1
RV Names: ('Z',)
x p(x)
0 1
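
The conditioning step can be sketched without the library as follows. This is only an illustration under simplifying assumptions (string outcomes, positional indices, conditional distributions listed in sorted order of the conditioning outcome); `condition_sketch` is a hypothetical helper, not dit's implementation.

```python
from collections import defaultdict
from fractions import Fraction

xor = {o: Fraction(1, 4) for o in ('000', '011', '101', '110')}

def condition_sketch(pmf, crvs):
    """Return (marginal of crvs, list of conditional pmfs of the other variables)."""
    n = len(next(iter(pmf)))
    rest = [i for i in range(n) if i not in crvs]
    marg = defaultdict(Fraction)
    joint = defaultdict(dict)
    for outcome, p in pmf.items():
        c = ''.join(outcome[i] for i in crvs)
        r = ''.join(outcome[i] for i in rest)
        marg[c] += p
        joint[c][r] = joint[c].get(r, Fraction(0)) + p
    keys = sorted(marg)
    # Divide each joint probability by the probability of the conditioning outcome.
    cdists = [{r: p / marg[c] for r, p in joint[c].items()} for c in keys]
    return {c: marg[c] for c in keys}, cdists

p_xy, p_z_given_xy = condition_sketch(xor, [0, 1])
```

For the xor distribution, each conditional distribution puts all of its mass on a single value of Z, since Z is determined by X and Y.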

Distribution.condition_on(crvs, rvs=None, rv_mode=None, extract=False)
Returns distributions conditioned on random variables crvs. Optionally, rvs specifies which random variables should remain.
NOTE: Eventually this will return a conditional distribution.
Parameters:  crvs (list) – The random variables to condition on.
 rvs (list, None) – The random variables for the resulting conditional distributions. Any random variable not represented in the union of crvs and rvs will be marginalized. If None, then every random variable not appearing in crvs is used.
 rv_mode (str, None) – Specifies how to interpret crvs and rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of crvs and rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of self._rv_mode is consulted, which defaults to ‘indices’.
 extract (bool) – If the length of either crvs or rvs is 1 and extract is True, then instead of the new outcomes being 1-tuples, we extract the sole element to create scalar distributions.
Returns:  cdist (dist) – The distribution of the conditioned random variables.
 dists (list of distributions) – The conditional distributions for each outcome in cdist.
Examples
First we build a distribution P(X,Y,Z) representing the XOR logic gate.
>>> pXYZ = dit.example_dists.Xor()
>>> pXYZ.set_rv_names('XYZ')
We can obtain the conditional distributions P(X,Z|Y) and the marginal of the conditioned variable P(Y) as follows:
>>> pY, pXZgY = pXYZ.condition_on('Y')
If we specify rvs='Z', then only ‘Z’ is kept and thus, ‘X’ is marginalized out:
>>> pY, pZgY = pXYZ.condition_on('Y', rvs='Z')
We can condition on two random variables:
>>> pXY, pZgXY = pXYZ.condition_on('XY')
The equivalent call using indices is:
>>> pXY, pZgXY = pXYZ.condition_on([0, 1], rv_mode='indices')
Join
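We can construct the join of two random variables:
\[ X \curlyvee Y = \min \{ V \mid V \succeq X \land V \succeq Y \} \]
Here \(V \succeq X\) means that \(X\) is a function of \(V\), and \(\min\) is understood to be minimizing with respect to the entropy.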
In [11]: from dit.algorithms.lattice import join
In [12]: print(join(d, ['XY']))
Class: ScalarDistribution
Alphabet: (0, 1, 2, 3)
Base: linear
x p(x)
0 1/4
1 1/4
2 1/4
3 1/4
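
One way to picture the join, assuming it can be identified (up to relabeling) with the joint variable formed from the listed random variables, is to group outcomes by the tuple of values those variables take and relabel the groups as integers. The helper `join_sketch` below is hypothetical and library-free; the integer labels it assigns need not match the ones dit produces.

```python
from collections import defaultdict
from fractions import Fraction

xor = {o: Fraction(1, 4) for o in ('000', '011', '101', '110')}

def join_sketch(pmf, rvs):
    """rvs is a list of lists of positions, e.g. [[0], [1]] for X and Y."""
    atoms = defaultdict(Fraction)
    for outcome, p in pmf.items():
        # Two outcomes fall in the same atom only if every listed variable agrees.
        key = tuple(tuple(outcome[i] for i in rv) for rv in rvs)
        atoms[key] += p
    # Relabel the atoms as consecutive integers.
    return {label: atoms[key] for label, key in enumerate(sorted(atoms))}

j = join_sketch(xor, [[0], [1]])
```

For the xor distribution, X and Y together distinguish all four outcomes, so the sketch yields four atoms of probability 1/4 each.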

join(dist, rvs, rv_mode=None, int_outcomes=True)
Returns the distribution of the join of random variables defined by rvs.
Parameters:  dist (Distribution) – The distribution which defines the base sigma-algebra.
 rvs (list) – A list of lists. Each list specifies a random variable to be joined with the other lists. Each random variable can be defined as a series of unique indexes. Multiple random variables can use the same index. For example, [[0,1],[1,2]].
 rv_mode (str, None) – Specifies how to interpret the elements of rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of dist._rv_mode is consulted.
 int_outcomes (bool) – If True, then the outcomes of the join are relabeled as integers instead of as the atoms of the induced sigma-algebra.
Returns: d – The distribution of the join.
Return type: ScalarDistribution

insert_join(dist, idx, rvs, rv_mode=None)
Returns a new distribution with the join inserted at index idx.
The join of the random variables in rvs is constructed and then inserted at index idx.
Parameters:  dist (Distribution) – The distribution which defines the base sigma-algebra.
 idx (int) – The index at which to insert the join. To append the join, set idx to be equal to -1 or dist.outcome_length().
 rvs (list) – A list of lists. Each list specifies a random variable to be joined with the other lists. Each random variable can be defined as a series of unique indexes. Multiple random variables can use the same index. For example, [[0,1],[1,2]].
 rv_mode (str, None) – Specifies how to interpret the elements of rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of dist._rv_mode is consulted.
Returns: d – The new distribution with the join at index idx.
Return type: Distribution
Meet
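We can construct the meet of two random variables:
\[ X \curlywedge Y = \max \{ V \mid V \preceq X \land V \preceq Y \} \]
Here \(V \preceq X\) means that \(V\) is a function of \(X\), and \(\max\) is understood to be maximizing with respect to the entropy.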
In [13]: from dit.algorithms.lattice import meet
In [14]: outcomes = ['00', '01', '10', '11', '22', '33']
In [15]: d2 = dit.Distribution(outcomes, [1/8]*4 + [1/4]*2, sample_space=outcomes)
In [16]: d2.set_rv_names('XY')
In [17]: print(meet(d2, ['X', 'Y']))
Class: ScalarDistribution
Alphabet: (0, 1, 2)
Base: linear
x p(x)
0 1/4
1 1/4
2 1/2
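
A library-free way to see where these probabilities come from, assuming the meet corresponds to the finest partition of outcomes that each variable can determine on its own, is to merge any two outcomes that share a value of X or a value of Y and then sum probabilities within each merged block. The helper `meet_sketch` is hypothetical and uses a small union-find structure; its integer labels are arbitrary and may differ from dit's.

```python
from collections import defaultdict
from fractions import Fraction

outcomes = ['00', '01', '10', '11', '22', '33']
probs = [Fraction(1, 8)] * 4 + [Fraction(1, 4)] * 2
d2 = dict(zip(outcomes, probs))

def meet_sketch(pmf, rvs):
    """rvs is a list of lists of positions, e.g. [[0], [1]] for X and Y."""
    keys = sorted(pmf)
    parent = {o: o for o in keys}

    def find(o):
        # Follow parent links to the block representative, compressing the path.
        while parent[o] != o:
            parent[o] = parent[parent[o]]
            o = parent[o]
        return o

    for rv in rvs:
        first_seen = {}
        for o in keys:
            value = tuple(o[i] for i in rv)
            if value in first_seen:
                # Same value of this variable: the outcomes must share a block.
                parent[find(o)] = find(first_seen[value])
            else:
                first_seen[value] = o

    blocks = defaultdict(Fraction)
    for o in keys:
        blocks[find(o)] += pmf[o]
    return {label: blocks[root] for label, root in enumerate(sorted(blocks))}

m = meet_sketch(d2, [[0], [1]])
```

The outcomes 00, 01, 10, and 11 collapse into one block of probability 1/2, while 22 and 33 remain separate blocks of probability 1/4 each, which agrees with the probabilities printed above.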

meet(dist, rvs, rv_mode=None, int_outcomes=True)
Returns the distribution of the meet of random variables defined by rvs.
Parameters:  dist (Distribution) – The distribution which defines the base sigma-algebra.
 rvs (list) – A list of lists. Each list specifies a random variable to be met with the other lists. Each random variable can be defined as a series of unique indexes. Multiple random variables can use the same index. For example, [[0,1],[1,2]].
 rv_mode (str, None) – Specifies how to interpret the elements of rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of dist._rv_mode is consulted.
 int_outcomes (bool) – If True, then the outcomes of the meet are relabeled as integers instead of as the atoms of the induced sigma-algebra.
Returns: d – The distribution of the meet.
Return type: ScalarDistribution

insert_meet(dist, idx, rvs, rv_mode=None)
Returns a new distribution with the meet inserted at index idx.
The meet of the random variables in rvs is constructed and then inserted at index idx.
Parameters:  dist (Distribution) – The distribution which defines the base sigma-algebra.
 idx (int) – The index at which to insert the meet. To append the meet, set idx to be equal to -1 or dist.outcome_length().
 rvs (list) – A list of lists. Each list specifies a random variable to be met with the other lists. Each random variable can be defined as a series of unique indexes. Multiple random variables can use the same index. For example, [[0,1],[1,2]].
 rv_mode (str, None) – Specifies how to interpret the elements of rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of dist._rv_mode is consulted.
Returns: d – The new distribution with the meet at index idx.
Return type: Distribution
Minimal Sufficient Statistic
This method constructs the minimal sufficient statistic of \(X\) about \(Y\), written \(X \mss Y\):
In [18]: from dit.algorithms import insert_mss
In [19]: d2 = dit.Distribution(['00', '01', '10', '11', '22', '33'], [1/8]*4 + [1/4]*2)
In [20]: print(insert_mss(d2, -1, [0], [1]))
Class: Distribution
Alphabet: (('0', '1', '2', '3'), ('0', '1', '2', '3'), ('2', '0', '1'))
Base: linear
Outcome Class: str
Outcome Length: 3
RV Names: None
x p(x)
002 1/8
012 1/8
102 1/8
112 1/8
220 1/4
331 1/4
Again, \(\min\) is understood to be over entropies.
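
The construction can be sketched without the library by grouping the values of X that induce the same conditional distribution over Y; each group becomes one value of the statistic. This is an illustration under that assumption only. The helper `mss_sketch` is hypothetical, and its integer labels follow order of first appearance, which need not match dit's labeling.

```python
from collections import defaultdict
from fractions import Fraction

outcomes = ['00', '01', '10', '11', '22', '33']
probs = [Fraction(1, 8)] * 4 + [Fraction(1, 4)] * 2
d2 = dict(zip(outcomes, probs))

def mss_sketch(pmf, rvs, about):
    """Distribution of the statistic of the `rvs` positions about the `about` positions."""
    p_x = defaultdict(Fraction)
    p_xy = defaultdict(lambda: defaultdict(Fraction))
    for o, p in pmf.items():
        x = tuple(o[i] for i in rvs)
        y = tuple(o[i] for i in about)
        p_x[x] += p
        p_xy[x][y] += p
    classes = defaultdict(Fraction)
    for x, px in p_x.items():
        # Values of x with identical p(y | x) are merged into one class.
        cond = frozenset((y, p / px) for y, p in p_xy[x].items())
        classes[cond] += px
    return {label: p for label, p in enumerate(classes.values())}

s = mss_sketch(d2, [0], [1])
```

Here X = 0 and X = 1 induce the same uniform distribution over Y in {0, 1} and are merged, while X = 2 and X = 3 each determine Y exactly and stay separate. This mirrors the third column of the output above, where the first four outcomes share one label.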

mss(dist, rvs, about=None, rv_mode=None, int_outcomes=True)
Parameters:  dist (Distribution) – The distribution which defines the base sigma-algebra.
 rvs (list) – A list of random variables to be compressed into a minimal sufficient statistic.
 about (list) – A list of random variables about which the minimal sufficient statistic will retain all information.
 rv_mode (str, None) – Specifies how to interpret the elements of rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of dist._rv_mode is consulted.
 int_outcomes (bool) – If True, then the outcomes of the minimal sufficient statistic are relabeled as integers instead of as the atoms of the induced sigma-algebra.
Returns: d – The distribution of the minimal sufficient statistic.
Return type: ScalarDistribution
Examples
>>> d = Xor()
>>> print(mss(d, [0], [1, 2]))
Class: ScalarDistribution
Alphabet: (0, 1)
Base: linear
x p(x)
0 0.5
1 0.5

insert_mss(dist, idx, rvs, about=None, rv_mode=None)
Inserts the minimal sufficient statistic of rvs about about into dist at index idx.
Parameters:  dist (Distribution) – The distribution which defines the base sigma-algebra.
 idx (int) – The location in the distribution to insert the minimal sufficient statistic.
 rvs (list) – A list of random variables to be compressed into a minimal sufficient statistic.
 about (list) – A list of random variables about which the minimal sufficient statistic will retain all information.
 rv_mode (str, None) – Specifies how to interpret the elements of rvs. Valid options are: {‘indices’, ‘names’}. If equal to ‘indices’, then the elements of rvs are interpreted as random variable indices. If equal to ‘names’, then the elements are interpreted as random variable names. If None, then the value of dist._rv_mode is consulted.
Returns: d – The distribution dist modified to contain the minimal sufficient statistic.
Return type: Distribution
Examples
>>> d = Xor()
>>> print(insert_mss(d, -1, [0], [1, 2]))
Class: Distribution
Alphabet: ('0', '1') for all rvs
Base: linear
Outcome Class: str
Outcome Length: 4
RV Names: None
x p(x)
0000 0.25
0110 0.25
1011 0.25
1101 0.25