MixtureScore#

class bayesflow.scoring_rules.MixtureScore(components: dict[str, ParametricDistributionScore] | None = None, weight_head: str = 'mixture_logits', temperature: float = 1.0, **kwargs)[source]#

Bases: ParametricDistributionScore

\(S(\hat p_{\phi_{1\ldots K},w_{1\ldots K}},\theta)=-\log\sum_{k=1}^{K} w_k\,\hat p_{\phi_k}(\theta)\)

Log-score of a mixture of parametric distribution components.

Parameters:
components : dict[str, ParametricDistributionScore]

Mixture components. Dict order defines component ordering.

weight_head : str, optional

Name of the mixture logits head. Defaults to "mixture_logits".

temperature : float, optional

Initial mixture temperature. Defaults to 1.0.

**kwargs

Passed to ParametricDistributionScore and ScoringRule.

Notes

The score exposes a flat set of estimation heads so that a ScoringRuleNetwork can build all required heads automatically.

The exposed heads are:

  • weight_head: mixture logits of shape (K,), where K = len(components)

  • f"{c}__{h}": for each component c and component head h

Mixture weights are represented as logits for numerical stability:

log w = log_softmax(logits / temperature)

where temperature is a non-trainable keras.Variable (default: 1.0) that can be updated externally with set_temperature().
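The weight computation above can be sketched in NumPy; the logit values are illustrative, and this is a standalone sketch rather than the library's implementation:

```python
import numpy as np

def log_softmax(logits, temperature=1.0):
    """log w = log_softmax(logits / temperature)."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # shift by the max for numerical stability
    return z - np.log(np.exp(z).sum())

logits = np.array([2.0, 0.0, -1.0])  # hypothetical mixture logits, K = 3
log_w = log_softmax(logits)
w = np.exp(log_w)  # a valid probability vector: w > 0, w.sum() == 1
```

Raising the temperature flattens the weights toward uniform; lowering it sharpens them toward the largest logit.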

Examples

>>> # A network representing a mixture density of three MVN distributions
>>> from bayesflow.networks import ScoringRuleNetwork
>>> from bayesflow.scoring_rules import MvNormalScore, MixtureScore
>>> inference_network = ScoringRuleNetwork(
...     mix=MixtureScore(
...         components=dict(
...             mvn1=MvNormalScore(),
...             mvn2=MvNormalScore(),
...             mvn3=MvNormalScore(),
...         )
...     )
... )
TRANSFORMATION_TYPE: dict[str, str] = {}#

Defines nonstandard transformation behaviour for de-standardization.

The standard transformation

x_i = x_i’ * sigma_i + mu_i

is referred to as "location_scale". Keys not specified here fall back to that default.
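As a concrete sketch of the default "location_scale" behaviour (the mu and sigma values are illustrative, not taken from the library):

```python
import numpy as np

mu = np.array([1.0, -2.0])    # per-dimension means (illustrative)
sigma = np.array([0.5, 3.0])  # per-dimension standard deviations (illustrative)
x_std = np.array([2.0, 1.0])  # standardized estimate x'

# "location_scale" de-standardization: x_i = x_i' * sigma_i + mu_i
x = x_std * sigma + mu  # → array([2., 1.])
```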

NOT_TRANSFORMING_LIKE_VECTOR_WARNING: tuple[str] = ()#

Names of prediction heads for which to warn if the adapter is applied to their estimates in the inverse direction.

Prediction heads can output estimates in spaces other than the target distribution space. The adapter cannot straightforwardly be applied to such estimates in the inverse direction, because it is built to map vectors from the inference variable space. When subclassing ScoringRule, add the names of such heads to this list to warn users whenever the adapter is applied to their estimates in the inverse direction.

get_config()[source]#
classmethod from_config(config, custom_objects=None)[source]#
get_head_shapes_from_target_shape(target_shape: tuple[int, ...]) → dict[str, tuple[int, ...]][source]#

Return the head shapes required to parameterize the mixture.
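The flat naming convention can be illustrated with a small sketch. The per-component head names and shapes below are made up; in practice each component reports its own head shapes:

```python
def mixture_head_shapes(component_shapes, weight_head="mixture_logits"):
    """Flatten per-component head shapes into the mixture's head dict (sketch)."""
    shapes = {weight_head: (len(component_shapes),)}  # one logit per component
    for c, heads in component_shapes.items():
        for h, shape in heads.items():
            shapes[f"{c}__{h}"] = shape  # flat "component__head" key
    return shapes

shapes = mixture_head_shapes({
    "mvn1": {"mean": (4,), "cov": (4, 4)},  # hypothetical component heads
    "mvn2": {"mean": (4,), "cov": (4, 4)},
})
# shapes["mixture_logits"] == (2,); shapes["mvn1__mean"] == (4,)
```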

get_head(key: str, output_shape: tuple[int, ...]) → Sequential[source]#

Construct a head for the given key.

  • Mixture logits head is constructed via the base ScoringRule logic.

  • Component heads are delegated to the respective component score, preserving its links/subnets.

See also ScoringRule.get_head().

set_temperature(value)[source]#
log_prob(x: Tensor, **estimates: Tensor) → Tensor[source]#

Compute log p(x) under the mixture:

log p(x) = logsumexp_k( log w_k + log p_k(x) )

Parameters:
x : Tensor

Targets of shape (batch, …event…).

**estimates : dict[str, Tensor]

Flat dict containing mixture logits and all component parameter heads.

Returns:
Tensor

Log-probabilities of shape (batch_size,).
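The log-sum-exp combination can be checked with a toy two-component Gaussian mixture in NumPy (weights, means, and scales are illustrative; this is a numerical sketch, not the library's implementation):

```python
import numpy as np

def logsumexp(a):
    m = a.max()
    return m + np.log(np.exp(a - m).sum())  # numerically stable log-sum-exp

def normal_logpdf(x, mean, std):
    return -0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2

x = 0.3
log_w = np.log([0.7, 0.3])                       # illustrative mixture weights
log_p_k = np.array([normal_logpdf(x, 0.0, 1.0),  # component log-densities
                    normal_logpdf(x, 2.0, 0.5)])

# log p(x) = logsumexp_k(log w_k + log p_k(x))
log_p = logsumexp(log_w + log_p_k)
```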

sample(batch_shape: tuple[int, ...], **estimates: Tensor) → Tensor[source]#

Draw samples from the mixture.

Parameters:
batch_shape : Shape

A tuple (batch_size, num_samples).

**estimates : dict[str, Tensor]

Flat dict containing mixture logits and all component parameter heads.

Returns:
Tensor

Samples with shape (batch_size, num_samples, …).
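Mixture sampling typically follows the standard two-stage recipe, sketched below with 1-D Gaussian components (a sketch under assumed parameters, not the library's exact implementation): draw a component index from Categorical(w), then sample from the selected component.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([0.7, 0.3])      # mixture weights (illustrative)
means = np.array([0.0, 5.0])  # component means (illustrative)
stds = np.array([1.0, 0.5])   # component standard deviations (illustrative)

num_samples = 1000
k = rng.choice(len(w), size=num_samples, p=w)  # 1) pick a component per sample
samples = rng.normal(means[k], stds[k])        # 2) sample the chosen component
# roughly 70% of the indices in k equal 0
```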

get_link(key: str) → Layer#

For a specified key, request a link from network output to estimation target.

If no link was specified for the key (e.g. upon initialization), return a linear activation.

Parameters:
key : str

Name of head for which to request a link.

Returns:
link : keras.Layer

Activation function linking network output to estimation target.

get_subnet(key: str) → Layer#

For a specified key, request a subnet to be used for projecting the shared condition embedding before further projection and reshaping to the head's output shape.

If no subnet was specified for the key (e.g. upon initialization), return just an instance of keras.layers.Identity.

Parameters:
key : str

Name of head for which to request a subnet.

Returns:
subnet : keras.Layer

Subnet projecting the shared condition embedding.

score(estimates: dict[str, Tensor], targets: Tensor, weights: Tensor = None) → Tensor#

Computes the log-score for a predicted parametric probability distribution given realized targets.

\(S(\hat p_\phi, \theta; k) = -\log(\hat p_\phi(\theta))\)