Joint Distributions

The primary way to construct a distribution is to supply both the outcomes and the probability mass function (pmf):

In [1]: from dit import Distribution

In [2]: outcomes = ['000', '011', '101', '110']

In [3]: pmf = [1/4]*4

In [4]: xor = Distribution(outcomes, pmf)

In [5]: print(xor)
Class:    Distribution
Alphabet: (('0', '1'), ('0', '1'), ('0', '1'))
Base:     linear

x                 p(X0,X1,X2)
('0', '0', '0')   0.25
('0', '1', '1')   0.25
('1', '0', '1')   0.25
('1', '1', '0')   0.25
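The four outcomes above are exactly the strings whose third symbol is the exclusive-or of the first two. As a minimal sketch using only the standard library, the same outcomes and pmf lists could be generated rather than typed by hand (the comprehension is illustrative and not part of dit):

```python
from itertools import product

# Enumerate both input bits; the third symbol is their exclusive-or.
outcomes = [f"{x}{y}{x ^ y}" for x, y in product((0, 1), repeat=2)]

# Uniform probability over the generated outcomes.
pmf = [1 / len(outcomes)] * len(outcomes)

print(outcomes)  # ['000', '011', '101', '110']
print(pmf)       # [0.25, 0.25, 0.25, 0.25]
```

These lists have the same contents as the outcomes and pmf passed to Distribution in the session above.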

Another way to construct a distribution is by supplying a dictionary mapping outcomes to probabilities:

In [6]: outcomes_probs = {'000': 1/4, '011': 1/4, '101': 1/4, '110': 1/4}

In [7]: xor2 = Distribution(outcomes_probs)

In [8]: print(xor2)
Class:    Distribution
Alphabet: (('0', '1'), ('0', '1'), ('0', '1'))
Base:     linear

x                 p(X0,X1,X2)
('0', '0', '0')   0.25
('0', '1', '1')   0.25
('1', '0', '1')   0.25
('1', '1', '0')   0.25
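The dictionary form carries the same information as the separate outcomes and pmf lists. A short plain-Python sketch of the correspondence, assuming the two lists are aligned element by element:

```python
outcomes = ['000', '011', '101', '110']
pmf = [1 / 4] * 4

# Pair each outcome with the probability at the same position.
outcomes_probs = dict(zip(outcomes, pmf))

print(outcomes_probs)
# {'000': 0.25, '011': 0.25, '101': 0.25, '110': 0.25}
```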

A third method is Distribution.from_ndarray, which takes an array of probabilities and uses each index tuple as an outcome:

In [9]: pmf = [[0.5, 0.25], [0.25, 0]]

In [10]: d = Distribution.from_ndarray(pmf)

In [11]: print(d)
Class:    Distribution
Alphabet: ((0, 1), (0, 1))
Base:     linear

x        p(X0,X1)
(0, 0)   0.5
(0, 1)   0.25
(1, 0)   0.25

Distribution.__init__(data, pmf=None, rv_names=None, free_vars=None, given_vars=None, base='linear', sample_space=None, sparse=True, trim=True, sort=True, validate=True, prng=None)

Initialize a Distribution.

There are three construction modes:

DataArray – pass an xr.DataArray directly (original API).
Outcomes + pmf – pass a sequence of outcomes and a sequence of probabilities, matching the dit.Distribution signature.
Dict – pass a dict mapping outcomes to probabilities.

Parameters:

data (xr.DataArray, sequence, or dict) – If an xr.DataArray, used directly as the probability data. If a dict, keys are outcomes and values are probabilities. Otherwise, treated as a sequence of outcomes (each outcome is an indexable container whose length equals the number of random variables).
pmf (sequence of float, optional) – Probability values corresponding to data when data is a sequence of outcomes. Ignored when data is a DataArray or dict.
rv_names (list of str, optional) – Names for each random variable. Only used when data is outcomes or a dict. Defaults to 'X0', 'X1', …
free_vars (set-like of str, optional) – Names of the free (joint) variables. If both free_vars and given_vars are None, all dimensions are treated as free.
given_vars (set-like of str, optional) – Names of the conditioned variables.
base (str, float, or None) – The probability base. 'linear' (default) for raw probabilities, 2, 'e', or any positive float for log probabilities. If None, auto-detected (linear if the pmf sums to ~1, else ditParams['base']).
sample_space (sequence or CartesianProduct, optional) – Explicit sample space. If provided, used to determine the full set of possible outcomes.
sparse (bool) – If True, outcomes and pmf only report non-zero entries.
trim (bool) – Ignored (kept for API compatibility).
sort (bool) – Ignored (alphabets are always sorted).
validate (bool) – If True, validate normalisation after construction.
prng (random state, optional) – Pseudo-random number generator. Defaults to dit.math.prng.
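To make the base and validate parameters concrete, here is a plain-Python sketch. It assumes that "log probabilities" in base b means storing log_b(p) for each probability p, and that validating normalisation means checking that the linear probabilities sum to 1 within floating-point tolerance. Neither detail is confirmed by the parameter list above.

```python
import math

linear_pmf = [0.25, 0.25, 0.25, 0.25]

# Assumed normalisation check: linear probabilities sum to 1.
assert math.isclose(sum(linear_pmf), 1.0)

# Assumed base-2 representation: store log2(p) instead of p.
log2_pmf = [math.log2(p) for p in linear_pmf]
print(log2_pmf)  # [-2.0, -2.0, -2.0, -2.0]

# Exponentiating with the same base recovers the linear values.
recovered = [2 ** lp for lp in log2_pmf]
print(recovered)  # [0.25, 0.25, 0.25, 0.25]
```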

Examples

From outcomes and pmf (like dit.Distribution):

>>> xrd = Distribution(['00', '01', '10', '11'],
...                    [.25, .25, .25, .25],
...                    rv_names=['X', 'Y'])

From a dict:

>>> xrd = Distribution({'00': .5, '11': .5}, rv_names=['X', 'Y'])

From a DataArray (original API):

>>> xrd = Distribution(my_dataarray, free_vars={'X', 'Y'})

To verify that xor and xor2 describe the same distribution, use the is_approx_equal method:

In [12]: xor.is_approx_equal(xor2)
Out[12]: True

Distribution.is_approx_equal(other, atol=1e-09, rtol=None)

Check approximate equality of two distributions.

Compares by sample space and per-outcome probabilities, ignoring dimension names. This matches the old dit.Distribution behavior.

Parameters:

other (Distribution) – Distribution to compare against.
atol (float, optional) – Absolute tolerance for value comparison (default: 1e-9).
rtol (float, optional) – Ignored (kept for signature compatibility).

Returns:

eq – True if the two distributions are approximately equal.

Return type:

bool
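The comparison described above can be illustrated with a small plain-Python sketch. The helper approx_equal is hypothetical and is not dit's implementation. It assumes that distributions are given as outcome-to-probability dicts, that an outcome missing from one dict has probability 0 there, and that two probabilities match when they differ by at most atol.

```python
def approx_equal(p, q, atol=1e-9):
    """Hypothetical helper: compare two outcome -> probability dicts."""
    # Assumption: an outcome missing from one dict has probability 0 there.
    outcomes = set(p) | set(q)
    return all(abs(p.get(o, 0.0) - q.get(o, 0.0)) <= atol for o in outcomes)


a = {'000': 0.25, '011': 0.25, '101': 0.25, '110': 0.25}

# Same distribution up to a perturbation far smaller than atol.
b = dict(a)
b['000'] += 1e-12
b['011'] -= 1e-12

# A genuinely different distribution.
c = {'000': 0.5, '011': 0.5}

print(approx_equal(a, b))  # True
print(approx_equal(a, c))  # False
```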