bayesflow.inference_networks module#
- class bayesflow.inference_networks.InvertibleNetwork(*args, **kwargs)[source]#
Bases: Model
Implements a chain of conditional invertible coupling layers for conditional density estimation.
- available_designs = ('affine', 'spline', 'interleaved')#
- __init__(num_params, num_coupling_layers=6, coupling_design='affine', coupling_settings=None, permutation='fixed', use_act_norm=True, act_norm_init=None, use_soft_flow=False, soft_flow_bounds=(0.001, 0.05), **kwargs)[source]#
Creates a chain of coupling layers with optional ActNorm layers in-between. Implements ideas from:
[1] Radev, S. T., Mertens, U. K., Voss, A., Ardizzone, L., & Köthe, U. (2020). BayesFlow: Learning complex stochastic models with invertible neural networks. IEEE Transactions on Neural Networks and Learning Systems.
[2] Kim, H., Lee, H., Kang, W. H., Lee, J. Y., & Kim, N. S. (2020). Softflow: Probabilistic framework for normalizing flow on manifolds. Advances in Neural Information Processing Systems, 33, 16388-16397.
[3] Ardizzone, L., Kruse, J., Lüth, C., Bracher, N., Rother, C., & Köthe, U. (2020). Conditional invertible neural networks for diverse image-to-image translation. In DAGM German Conference on Pattern Recognition (pp. 373-387). Springer, Cham.
[4] Durkan, C., Bekasov, A., Murray, I., & Papamakarios, G. (2019). Neural spline flows. Advances in Neural Information Processing Systems, 32.
[5] Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative flow with invertible 1x1 convolutions. Advances in Neural Information Processing Systems, 31.
- Parameters:
- num_params : int
The number of parameters to perform inference on. Equivalently, the dimensionality of the latent space.
- num_coupling_layers : int, optional, default: 6
The number of coupling layers to use as defined in [1] and [2]. In general, more coupling layers will give you more expressive power, but will be slower and may need more simulations to train. Typically, between 4 and 10 coupling layers should suffice for most applications.
- coupling_design : str or callable, optional, default: 'affine'
The type of internal coupling network to use. Must be in ['affine', 'spline', 'interleaved']. The first corresponds to the architecture in [3, 5], the second corresponds to a modified version of [4]. The third option alternates between affine and spline layers; for example, if num_coupling_layers == 3, the chain will consist of ['affine', 'spline', 'affine'] layers.
In general, spline couplings run slower than affine couplings but require fewer coupling layers. Spline couplings may work best for complex (e.g., multimodal) low-dimensional problems; the difference becomes less pronounced in higher dimensions.
Note: This is the first setting you may want to change if inference does not work as expected!
- coupling_settings : dict or None, optional, default: None
The coupling network settings to pass to the internal coupling layers. See default_settings for possible settings. Below are two examples (see also the instantiation sketch after this parameter list).
1. If using coupling_design='affine', you may want to turn on Monte Carlo Dropout and use an ELU activation function for the internal networks. You can do this by providing coupling_settings={'mc_dropout': True, 'dense_args': dict(units=128, activation='elu')}.
2. If using coupling_design='spline', you may want to change the number of learnable bins and increase the dropout probability (i.e., more regularization to guard against overfitting): coupling_settings={'dropout_prob': 0.2, 'bins': 32}.
- permutation : str or None, optional, default: 'fixed'
Whether to use permutations between coupling layers. Highly recommended if num_coupling_layers > 1. Important: Must be in ['fixed', 'learnable', None].
- use_act_norm : bool, optional, default: True
Whether to use activation normalization after each coupling layer, as used in [5]. Recommended to keep default.
- act_norm_init : np.ndarray of shape (num_simulations, num_params) or None, optional, default: None
Optional data-dependent initialization for the internal ActNorm layers, as done in [5]. Could be helpful for deep invertible networks.
- use_soft_flow : bool, optional, default: False
Whether to perturb the target distribution (i.e., parameters) with a small amount of independent noise, as done in [2]. Could be helpful for degenerate distributions.
- soft_flow_bounds : tuple(float, float), optional, default: (1e-3, 5e-2)
The bounds of the continuous uniform distribution from which the noise scale is sampled at each iteration. Only relevant when use_soft_flow=True.
- **kwargs : dict
Optional keyword arguments (e.g., name) passed to the tf.keras.Model __init__ method.
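For illustration, a minimal instantiation sketch covering the two coupling_settings examples above; num_params=5 and all concrete values are hypothetical placeholders, not recommended defaults:

    from bayesflow.inference_networks import InvertibleNetwork

    # Example 1: affine couplings with Monte Carlo Dropout and ELU internal activations
    affine_net = InvertibleNetwork(
        num_params=5,  # hypothetical 5-dimensional parameter space
        coupling_design='affine',
        coupling_settings={'mc_dropout': True, 'dense_args': dict(units=128, activation='elu')},
    )

    # Example 2: spline couplings with more learnable bins and stronger dropout
    spline_net = InvertibleNetwork(
        num_params=5,
        coupling_design='spline',
        coupling_settings={'dropout_prob': 0.2, 'bins': 32},
    )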
- call(targets, condition, inverse=False, **kwargs)[source]#
Performs one pass through an invertible chain (either inverse or forward).
- Parameters:
- targets : tf.Tensor
The estimation quantities of interest, shape (batch_size, …)
- condition : tf.Tensor
The conditional data x, shape (batch_size, summary_dim)
- inverse : bool, optional, default: False
Flag indicating whether to run the chain forward or backward
- Returns:
- (z, log_det_J) : tuple(tf.Tensor, tf.Tensor)
If inverse=False: The transformed input and the corresponding log determinant of the Jacobian of the transformation, z shape: (batch_size, …), log_det_J shape: (batch_size, …)
- target : tf.Tensor
If inverse=True: The transformed output, shape (batch_size, …)
Notes
If inverse=False, the return is (z, log_det_J). If inverse=True, the return is target.
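A usage sketch of the forward and inverse passes, assuming a freshly built network and randomly generated tensors with hypothetical shapes:

    import tensorflow as tf
    from bayesflow.inference_networks import InvertibleNetwork

    net = InvertibleNetwork(num_params=5)   # hypothetical 5-dimensional target
    targets = tf.random.normal((32, 5))     # batch of 32 parameter vectors
    condition = tf.random.normal((32, 10))  # 10 hypothetical summary statistics

    # Forward: parameters -> latent space, plus the log-Jacobian determinant
    z, log_det_J = net(targets, condition)

    # Inverse: latent draws -> parameter space (e.g., for posterior sampling)
    z_draws = tf.random.normal((32, 5))
    params = net(z_draws, condition, inverse=True)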
- class bayesflow.inference_networks.EvidentialNetwork(*args, **kwargs)[source]#
Bases: Model
Implements a network whose outputs are the concentration parameters of a Dirichlet density.
Follows ideas from:
[1] Radev, S. T., D’Alessandro, M., Mertens, U. K., Voss, A., Köthe, U., & Bürkner, P. C. (2021). Amortized Bayesian model comparison with evidential deep learning. IEEE Transactions on Neural Networks and Learning Systems.
[2] Sensoy, M., Kaplan, L., & Kandemir, M. (2018). Evidential deep learning to quantify classification uncertainty. Advances in neural information processing systems, 31.
- __init__(num_models, dense_args=None, num_dense=3, output_activation='softplus', **kwargs)[source]#
Creates an instance of an evidential network for amortized model comparison.
- Parameters:
- num_models : int
The number of candidate (competing) models for the comparison scenario.
- dense_args : dict or None, optional, default: None
The arguments for a tf.keras.layers.Dense layer. If None, defaults will be used.
- num_dense : int, optional, default: 3
The number of dense layers for the main network part.
- output_activation : str or callable, optional, default: 'softplus'
The activation function to use for the network outputs. Important: needs to have positive outputs.
- **kwargs : dict, optional, default: {}
Optional keyword arguments (e.g., name) passed to the tf.keras.Model __init__ method.
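A minimal instantiation sketch; the number of models and the dense_args values are hypothetical:

    from bayesflow.inference_networks import EvidentialNetwork

    evidential_net = EvidentialNetwork(
        num_models=3,
        dense_args=dict(units=256, activation='relu'),  # passed to each tf.keras.layers.Dense
        num_dense=3,
        output_activation='softplus',  # must produce positive outputs
    )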
- call(condition, **kwargs)[source]#
Computes evidences for model comparison given a batch of data and optional concatenated context, typically passed through a summary network.
- Parameters:
- condition : tf.Tensor of shape (batch_size, …)
The input variables used for determining p(model | condition)
- Returns:
- evidence : tf.Tensor of shape (batch_size, num_models)
The learned model evidences
- sample(condition, n_samples, **kwargs)[source]#
Samples posterior model probabilities from the higher-order Dirichlet density.
- Parameters:
- condition : tf.Tensor
The summary of the observed (or simulated) data, shape (n_data_sets, …)
- n_samples : int
Number of samples to obtain from the approximate posterior
- Returns:
- pm_samples : tf.Tensor or np.ndarray
The posterior draws from the Dirichlet distribution, shape (num_samples, num_batch, num_models)
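A usage sketch combining call() and sample(), assuming hypothetical batch and summary dimensions:

    import tensorflow as tf
    from bayesflow.inference_networks import EvidentialNetwork

    net = EvidentialNetwork(num_models=3)
    condition = tf.random.normal((8, 10))  # 8 hypothetical data sets, 10 summary statistics

    alphas = net(condition)                # Dirichlet concentration parameters, shape (8, 3)
    pm_samples = net.sample(condition, n_samples=100)  # shape (100, 8, 3)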
- class bayesflow.inference_networks.PMPNetwork(*args, **kwargs)[source]#
Bases: Model
Implements a network that approximates posterior model probabilities (PMPs) as employed in [1].
[1] Elsemüller, L., Schnuerch, M., Bürkner, P. C., & Radev, S. T. (2023). A Deep Learning Method for Comparing Bayesian Hierarchical Models. arXiv preprint arXiv:2301.11873.
- __init__(num_models, dense_args=None, num_dense=3, dropout=True, mc_dropout=False, dropout_prob=0.05, output_activation=tf.nn.softmax, **kwargs)[source]#
Creates an instance of a PMP network for amortized model comparison.
- Parameters:
- num_models : int
The number of candidate (competing) models for the comparison scenario.
- dense_args : dict or None, optional, default: None
The arguments for a tf.keras.layers.Dense layer. If None, defaults will be used.
- num_dense : int, optional, default: 3
The number of dense layers for the main network part.
- dropout : bool, optional, default: True
Whether to use dropout in-between the hidden layers.
- mc_dropout : bool, optional, default: False
Whether to use Monte Carlo dropout (i.e., Bayesian approximation) during inference
- dropout_prob : float in (0, 1), optional, default: 0.05
The dropout probability. Only has an effect if dropout=True or mc_dropout=True
- output_activation : callable, optional, default: tf.nn.softmax
The activation function to apply to the network outputs. Important: Needs to have positive outputs and be bounded between 0 and 1.
- **kwargs : dict, optional, default: {}
Optional keyword arguments (e.g., name) passed to the tf.keras.Model __init__ method.
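A minimal instantiation sketch with hypothetical settings:

    from bayesflow.inference_networks import PMPNetwork

    pmp_net = PMPNetwork(
        num_models=3,
        dropout=True,
        dropout_prob=0.1,  # hypothetical, slightly stronger regularization than the 0.05 default
    )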
- call(condition, return_probs=True, **kwargs)[source]#
Forward pass through the network. Computes approximated PMPs given a batch of data and optional concatenated context, typically passed through a summary network.
- Parameters:
- condition : tf.Tensor of shape (batch_size, …)
The input variables used for determining p(model | condition)
- return_probs : bool, optional, default: True
Whether to return probabilities or logits (pre-activation, unnormalized)
- Returns:
- out : tf.Tensor of shape (batch_size, …, num_models)
The approximated PMPs (post-activation) or logits (pre-activation)
- posterior_probs(condition, **kwargs)[source]#
Shortcut function to obtain posterior probabilities given a condition tensor (e.g., summary statistics of data sets).
- Parameters:
- condition : tf.Tensor of shape (batch_size, …)
The input variables used for determining p(model | condition)
- Returns:
- out : tf.Tensor of shape (batch_size, …, num_models)
The approximated PMPs
- logits(condition, **kwargs)[source]#
Shortcut function to obtain logits given a condition tensor (e.g., summary statistics of data sets).
- Parameters:
- condition : tf.Tensor of shape (batch_size, …)
The input variables used for determining p(model | condition)
- Returns:
- out : tf.Tensor of shape (batch_size, …, num_models)
The approximated logits (pre-activation)
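A usage sketch of the three access paths (call(), posterior_probs(), and logits()), assuming hypothetical batch and summary dimensions:

    import tensorflow as tf
    from bayesflow.inference_networks import PMPNetwork

    net = PMPNetwork(num_models=3)
    condition = tf.random.normal((8, 10))       # 8 hypothetical data sets, 10 summary statistics

    probs = net(condition)                      # PMPs; each row sums to 1, shape (8, 3)
    probs_alt = net.posterior_probs(condition)  # equivalent shortcut
    raw = net.logits(condition)                 # pre-activation, unnormalized scores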