bayesflow.inference_networks module#

class bayesflow.inference_networks.InvertibleNetwork(*args, **kwargs)[source]#

Bases: Model

Implements a chain of conditional invertible coupling layers for conditional density estimation.

available_designs = ('affine', 'spline', 'interleaved')#
__init__(num_params, num_coupling_layers=6, coupling_design='affine', coupling_settings=None, permutation='fixed', use_act_norm=True, act_norm_init=None, use_soft_flow=False, soft_flow_bounds=(0.001, 0.05), **kwargs)[source]#

Creates a chain of coupling layers with optional ActNorm layers in-between. Implements ideas from:

[1] Radev, S. T., Mertens, U. K., Voss, A., Ardizzone, L., & Köthe, U. (2020). BayesFlow: Learning complex stochastic models with invertible neural networks. IEEE Transactions on Neural Networks and Learning Systems.

[2] Kim, H., Lee, H., Kang, W. H., Lee, J. Y., & Kim, N. S. (2020). Softflow: Probabilistic framework for normalizing flow on manifolds. Advances in Neural Information Processing Systems, 33, 16388-16397.

[3] Ardizzone, L., Kruse, J., Lüth, C., Bracher, N., Rother, C., & Köthe, U. (2020). Conditional invertible neural networks for diverse image-to-image translation. In DAGM German Conference on Pattern Recognition (pp. 373-387). Springer, Cham.

[4] Durkan, C., Bekasov, A., Murray, I., & Papamakarios, G. (2019). Neural spline flows. Advances in Neural Information Processing Systems, 32.

[5] Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative flow with invertible 1x1 convolutions. Advances in Neural Information Processing Systems, 31.

Parameters:
num_params : int

The number of parameters to perform inference on. Equivalently, the dimensionality of the latent space.

num_coupling_layers : int, optional, default: 6

The number of coupling layers to use as defined in [1] and [2]. In general, more coupling layers will give you more expressive power, but will be slower and may need more simulations to train. Typically, between 4 and 10 coupling layers should suffice for most applications.

coupling_design : str or callable, optional, default: ‘affine’

The type of internal coupling network to use. Must be in [‘affine’, ‘spline’, ‘interleaved’]. The first corresponds to the architecture in [3, 5], the second corresponds to a modified version of [4]. The third option will alternate between affine and spline layers, for example, if num_coupling_layers == 3, the chain will consist of [“affine”, “spline”, “affine”] layers.

In general, spline couplings run slower than affine couplings, but require fewer coupling layers. Spline couplings may work best with complex (e.g., multimodal) low-dimensional problems. The difference will become less and less pronounced as we move to higher dimensions.

Note: This is the first setting you may want to change if inference does not work as expected!

coupling_settings : dict or None, optional, default: None

The coupling network settings to pass to the internal coupling layers. See default_settings for possible settings. Below are two examples.

Examples:

1. If using coupling_design='affine', you may want to turn on Monte Carlo Dropout and use an ELU activation function for the internal networks. You can do this by providing:

coupling_settings={
    'mc_dropout'  : True,
    'dense_args'  : dict(units=128, activation='elu')
}

2. If using coupling_design='spline', you may want to change the number of learnable bins and increase the dropout probability (i.e., more regularization to guard against overfitting):

coupling_settings={
    'dropout_prob': 0.2,
    'bins'        : 32,
}

permutation : str or None, optional, default: ‘fixed’

Whether to use permutations between coupling layers. Highly recommended if num_coupling_layers > 1. Important: Must be in [‘fixed’, ‘learnable’, None].

use_act_norm : bool, optional, default: True

Whether to use activation normalization after each coupling layer, as used in [5]. Recommended to keep default.

act_norm_init : np.ndarray of shape (num_simulations, num_params) or None, optional, default: None

Optional data-dependent initialization for the internal ActNorm layers, as done in [5]. Could be helpful for deep invertible networks.

use_soft_flow : bool, optional, default: False

Whether to perturb the target distribution (i.e., parameters) with a small amount of independent noise, as done in [2]. Could be helpful for degenerate distributions.

soft_flow_bounds : tuple(float, float), optional, default: (1e-3, 5e-2)

The bounds of the continuous uniform distribution from which the noise scale is sampled at each iteration. Only relevant when use_soft_flow=True.

**kwargs : dict

Optional keyword arguments (e.g., name) passed to the tf.keras.Model __init__ method.
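A minimal construction sketch based on the signature above; the dimensionality and settings are illustrative assumptions, not recommendations:

from bayesflow.inference_networks import InvertibleNetwork

# Chain of 6 coupling layers for a 5-dimensional parameter space,
# using spline couplings with mild extra regularization (illustrative).
inference_net = InvertibleNetwork(
    num_params=5,
    num_coupling_layers=6,
    coupling_design='spline',
    coupling_settings={'dropout_prob': 0.1},
)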

call(targets, condition, inverse=False, **kwargs)[source]#

Performs one pass through an invertible chain (either inverse or forward).

Parameters:
targets : tf.Tensor

The estimation quantities of interest, shape (batch_size, …)

condition : tf.Tensor

The conditional data x, shape (batch_size, summary_dim)

inverse : bool, optional, default: False

Flag indicating whether to run the chain forward or backwards

Returns:
(z, log_det_J) : tuple(tf.Tensor, tf.Tensor)

If inverse=False: The transformed input and the corresponding log determinant of the Jacobian of the transformation, z shape: (batch_size, …), log_det_J shape: (batch_size, …)

target : tf.Tensor

If inverse=True: The transformed output, shape (batch_size, …)

Notes

If inverse=False, the return is (z, log_det_J).

If inverse=True, the return is target.
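A short usage sketch for both directions, reusing the illustrative inference_net from the sketch above (all shapes are assumptions):

import tensorflow as tf

batch_size, num_params, summary_dim = 32, 5, 16
targets = tf.random.normal((batch_size, num_params))
condition = tf.random.normal((batch_size, summary_dim))

# Forward: parameters -> latent variables plus log-Jacobian determinant
z, log_det_J = inference_net(targets, condition)

# Inverse: latent draws -> parameter space (e.g., for posterior sampling)
z_draws = tf.random.normal((batch_size, num_params))
post_samples = inference_net(z_draws, condition, inverse=True)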

forward(targets, condition, **kwargs)[source]#

Performs a forward pass through the chain.

inverse(z, condition, **kwargs)[source]#

Performs a reverse pass through the chain. Assumes that it is only used in inference mode, so **kwargs contains training=False.

classmethod create_config(**kwargs)[source]#

Used to create the settings dictionary for the internal networks of the invertible network. Will fill in missing settings with defaults.

class bayesflow.inference_networks.EvidentialNetwork(*args, **kwargs)[source]#

Bases: Model

Implements a network whose outputs are the concentration parameters of a Dirichlet density.

Follows ideas from:

[1] Radev, S. T., D’Alessandro, M., Mertens, U. K., Voss, A., Köthe, U., & Bürkner, P. C. (2021). Amortized Bayesian model comparison with evidential deep learning. IEEE Transactions on Neural Networks and Learning Systems.

[2] Sensoy, M., Kaplan, L., & Kandemir, M. (2018). Evidential deep learning to quantify classification uncertainty. Advances in neural information processing systems, 31.

__init__(num_models, dense_args=None, num_dense=3, output_activation='softplus', **kwargs)[source]#

Creates an instance of an evidential network for amortized model comparison.

Parameters:
num_models : int

The number of candidate (competing) models for the comparison scenario.

dense_args : dict or None, optional, default: None

The arguments for a tf.keras.layers.Dense layer. If None, defaults will be used.

num_dense : int, optional, default: 3

The number of dense layers for the main network part.

output_activation : str or callable, optional, default: ‘softplus’

The activation function to use for the network outputs. Important: needs to have positive outputs.

**kwargs : dict, optional, default: {}

Optional keyword arguments (e.g., name) passed to the tf.keras.Model __init__ method.
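A minimal sketch, assuming 3 candidate models and illustrative summary shapes:

from bayesflow.inference_networks import EvidentialNetwork
import tensorflow as tf

evidential_net = EvidentialNetwork(num_models=3)

# e.g., outputs of a summary network, shape (batch_size, summary_dim)
summaries = tf.random.normal((32, 16))
evidences = evidential_net(summaries)  # Dirichlet concentrations, shape (32, 3)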

call(condition, **kwargs)[source]#

Computes evidences for model comparison given a batch of data and optional concatenated context, typically passed through a summary network.

Parameters:
condition : tf.Tensor of shape (batch_size, …)

The input variables used for determining p(model | condition)

Returns:
evidence : tf.Tensor of shape (batch_size, num_models)

The learned model evidences

evidence(condition, **kwargs)[source]#
sample(condition, n_samples, **kwargs)[source]#

Samples posterior model probabilities from the higher-order Dirichlet density.

Parameters:
condition : tf.Tensor

The summary of the observed (or simulated) data, shape (n_data_sets, …)

n_samples : int

Number of samples to obtain from the approximate posterior

Returns:
pm_samples : tf.Tensor or np.array

The posterior draws from the Dirichlet distribution, shape (num_samples, num_batch, num_models)
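For instance, reusing the illustrative evidential_net and summaries from the sketch above:

# 500 posterior draws of model probabilities per data set
pm_samples = evidential_net.sample(summaries, n_samples=500)
# pm_samples shape: (500, 32, 3)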

classmethod create_config(**kwargs)[source]#

Used to create the settings dictionary for the internal networks of the evidential network. Will fill in missing settings with defaults.

class bayesflow.inference_networks.PMPNetwork(*args, **kwargs)[source]#

Bases: Model

Implements a network that approximates posterior model probabilities (PMPs) as employed in [1].

[1] Elsemüller, L., Schnuerch, M., Bürkner, P. C., & Radev, S. T. (2023). A deep learning method for comparing Bayesian hierarchical models. arXiv preprint arXiv:2301.11873.

__init__(num_models, dense_args=None, num_dense=3, dropout=True, mc_dropout=False, dropout_prob=0.05, output_activation=tf.nn.softmax, **kwargs)[source]#

Creates an instance of a PMP network for amortized model comparison.

Parameters:
num_models : int

The number of candidate (competing) models for the comparison scenario.

dense_args : dict or None, optional, default: None

The arguments for a tf.keras.layers.Dense layer. If None, defaults will be used.

num_dense : int, optional, default: 3

The number of dense layers for the main network part.

dropout : bool, optional, default: True

Whether to use dropout in-between the hidden layers.

mc_dropout : bool, optional, default: False

Whether to use Monte Carlo dropout (i.e., Bayesian approximation) during inference

dropout_prob : float in (0, 1), optional, default: 0.05

The dropout probability. Only has an effect if dropout=True or mc_dropout=True

output_activation : callable, optional, default: tf.nn.softmax

The activation function to apply to the network outputs. Important: Needs to have positive outputs and be bounded between 0 and 1.

**kwargs : dict, optional, default: {}

Optional keyword arguments (e.g., name) passed to the tf.keras.Model __init__ method.
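A minimal sketch, assuming 3 candidate models; the dropout settings are illustrative:

from bayesflow.inference_networks import PMPNetwork

# PMP network with Monte Carlo dropout kept active at inference time
pmp_net = PMPNetwork(num_models=3, mc_dropout=True, dropout_prob=0.1)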

call(condition, return_probs=True, **kwargs)[source]#

Forward pass through the network. Computes approximated PMPs given a batch of data and optional concatenated context, typically passed through a summary network.

Parameters:
condition : tf.Tensor of shape (batch_size, …)

The input variables used for determining p(model | condition)

return_probs : bool, optional, default: True

Whether to return probabilities or logits (pre-activation, unnormalized)

Returns:
out : tf.Tensor of shape (batch_size, …, num_models)

The approximated PMPs (post-activation) or logits (pre-activation)

posterior_probs(condition, **kwargs)[source]#

Shortcut function to obtain posterior probabilities given a condition tensor (e.g., summary statistics of data sets).

Parameters:
condition : tf.Tensor of shape (batch_size, …)

The input variables used for determining p(model | condition)

Returns:
out : tf.Tensor of shape (batch_size, …, num_models)

The approximated PMPs

logits(condition, **kwargs)[source]#

Shortcut function to obtain logits given a condition tensor (e.g., summary statistics of data sets).

Parameters:
condition : tf.Tensor of shape (batch_size, …)

The input variables used for determining p(model | condition)

Returns:
out : tf.Tensor of shape (batch_size, …, num_models)

The approximated logits (pre-activation)
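For instance, reusing the illustrative pmp_net from the sketch above with assumed summary shapes:

import tensorflow as tf

summaries = tf.random.normal((32, 16))      # e.g., summary network outputs
probs = pmp_net.posterior_probs(summaries)  # shape (32, 3), rows sum to 1
raw = pmp_net.logits(summaries)             # shape (32, 3), pre-activation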

classmethod create_config(**kwargs)[source]#

Used to create the settings dictionary for the internal networks of the network. Will fill in missing settings with defaults.