bayesflow.simulation module#

class bayesflow.simulation.ContextGenerator(batchable_context_fun: callable = None, non_batchable_context_fun: callable = None, use_non_batchable_for_batchable: bool = False)[source]#

Bases: object

Basic interface for a simulation module responsible for generating variables over which we want to amortize during simulation-based training, but do not want to perform inference on. Both priors and simulators in a generative framework can have their own context generators, depending on the particular modeling goals.

The interface distinguishes between two types of context: batchable and non-batchable.

  • Batchable context variables differ for each simulation in each training batch

  • Non-batchable context variables stay the same for each simulation in a batch, but differ across batches

Examples for batchable context variables include experimental design variables, design matrices, etc. Examples for non-batchable context variables include the number of observations in an experiment, positional encodings, time indices, etc.

While the latter can also be considered batchable in principle, batching them would require non-Tensor (i.e., non-rectangular) data structures, which usually means inefficient computations.

Examples

Example for a simulation context which will generate a random number of observations between 1 and 100 for each training batch:

>>> gen = ContextGenerator(non_batchable_context_fun=lambda : np.random.randint(1, 101))
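
A further minimal sketch (not part of the original example) combining both context types; the uniform design variable is purely illustrative, and the dictionary keys follow the documentation of __call__ below:

>>> import numpy as np
>>> from bayesflow.simulation import ContextGenerator
>>> gen = ContextGenerator(
...     batchable_context_fun=lambda: np.random.uniform(-1, 1, size=5),
...     non_batchable_context_fun=lambda: np.random.randint(1, 101),
... )
>>> context_dict = gen(batch_size=32)
>>> # context_dict["batchable_context"] should hold one draw per simulation,
>>> # context_dict["non_batchable_context"] a single integer shared by the batch.
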
__init__(batchable_context_fun: callable = None, non_batchable_context_fun: callable = None, use_non_batchable_for_batchable: bool = False)[source]#

Instantiates a context generator responsible for the random generation of variables which vary from data set to data set but cannot be considered data or parameters, e.g., time indices, number of observations, etc. A batchable context function, a non-batchable context function, or both should be provided to the constructor. An optional argument dictates whether the output of the non-batchable context function should be used as input to the batchable context function.

Parameters:
batchable_context_fun : callable

A function with optional control arguments responsible for generating per-simulation context variables.

non_batchable_context_fun : callable

A function with optional control arguments responsible for generating per-batch-of-simulations context variables.

use_non_batchable_for_batchable : bool, optional, default: False

Determines whether to use the output of non_batchable_context_fun as input to batchable_context_fun. Only relevant when both context types are provided.

__call__(batch_size, *args, **kwargs)[source]#

Wraps the method generate_context, which returns a dictionary with batchable and non-batchable context.

Optional positional and keyword arguments are passed to the internal context-generating functions or ignored if the latter are None.

Parameters:
batch_size : int

The batch_size argument used for batchable context.

Returns:
context_dict : dictionary

A dictionary of context variables with the following keys:

  • batchable_context : value

  • non_batchable_context : value

Note that the values of the context variables will be None if the corresponding context-generating functions have not been provided when initializing this object.

batchable_context(batch_size, *args, **kwargs)[source]#

Generates ‘batch_size’ context variables given optional arguments. Return type is a list of context variables.

non_batchable_context(*args, **kwargs)[source]#

Generates a context variable shared across simulations in a given batch, given optional arguments.

generate_context(batch_size, *args, **kwargs)[source]#

Creates a dictionary with batchable and non-batchable context.

Parameters:
batch_size : int

The batch_size argument used for batchable context.

Returns:
context_dict : dictionary

A dictionary of context variables with the following keys, if the default keys have not been changed:

  • batchable_context : value

  • non_batchable_context : value

Note that the values of the context variables will be None if the corresponding context-generating functions have not been provided when initializing this object.
class bayesflow.simulation.Prior(batch_prior_fun: callable = None, prior_fun: callable = None, context_generator: callable = None, param_names: list = None)[source]#

Bases: object

Basic interface for a simulation module responsible for generating random draws from a prior distribution.

The prior functions should return a np.array of simulation parameters which will be internally used by the GenerativeModel interface for simulations.

An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided:

  • context_generator.batchable_context(batch_size)

  • context_generator.non_batchable_context()
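
A minimal usage sketch (not part of the original reference); the two-parameter Gaussian prior and the parameter names are purely illustrative:

import numpy as np
from bayesflow.simulation import Prior

def prior_fun():
    # Illustrative 2D prior over a location and a log-scale parameter
    return np.random.normal(loc=[0.0, 0.0], scale=[1.0, 0.5])

prior = Prior(prior_fun=prior_fun, param_names=[r"$\mu$", r"$\log\sigma$"])
prior_dict = prior(batch_size=64)  # dictionary with the prior (and context) draws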

__init__(batch_prior_fun: callable = None, prior_fun: callable = None, context_generator: callable = None, param_names: list = None)[source]#

Instantiates a prior generator which will draw random parameter configurations from a user-informed prior distribution. No improper priors are allowed, as these may render the generative scope of a model undefined.

Parameters:
batch_prior_fun : callable

A function (callable object) with optional control arguments responsible for generating batches of per-simulation parameters.

prior_fun : callable

A function (callable object) with optional control arguments responsible for generating per-simulation parameters.

context_generator : callable, optional, default: None (recommended: instance of ContextGenerator)

An optional function (ideally an instance of ContextGenerator) for generating prior context variables.

param_names : list of str, optional, default: None

A list of strings representing the names of the parameters.

__call__(batch_size, *args, **kwargs)[source]#

Generates batch_size draws from the prior given optional context generator.

Parameters:
batch_size : int

The number of draws to obtain from the prior + context generator functions.

*args : tuple

Optional positional arguments passed to the generator functions.

**kwargs : dict

Optional keyword arguments passed to the generator functions.

Returns:
out_dict : dictionary

A dictionary with the quantities generated from the prior + context functions.

plot_prior2d(**kwargs)[source]#

Generates a 2D plot representing bivariate prior distributions. Uses the function bayesflow.diagnostics.plot_prior2d() internally for generating the plot.

Parameters:
**kwargs : dict

Optional keyword arguments passed to the plot_prior2d function.

Returns:
f : plt.Figure

The figure instance for optional saving.

estimate_means_and_stds(n_draws=1000, *args, **kwargs)[source]#

Estimates prior means and stds given n_draws from the prior, useful for z-standardization of the prior draws.

Parameters:
n_draws : int, optional, default: 1000

The number of random draws to obtain from the joint prior.

*args : tuple

Optional positional arguments passed to the generator functions.

**kwargs : dict

Optional keyword arguments passed to the generator functions.

Returns:
(prior_means, prior_stds) : tuple of np.ndarrays

The estimated means and stds of the joint prior.
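
For instance, reusing the Prior instance named prior from the sketch above (a usage sketch, not part of the original reference):

prior_means, prior_stds = prior.estimate_means_and_stds(n_draws=2000)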

logpdf(prior_draws)[source]#

Computes the log-density of the prior distribution evaluated at the passed prior_draws.

class bayesflow.simulation.TwoLevelPrior(hyper_prior_fun: callable, local_prior_fun: callable, shared_prior_fun: callable = None, local_context_generator: callable = None)[source]#

Bases: object

Basic interface for a simulation module responsible for generating random draws from a two-level prior distribution.

The prior functions should return a np.array of simulation parameters which will be internally used by the TwoLevelGenerativeModel interface for simulations.

An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided:

  • context_generator.batchable_context(batch_size)

  • context_generator.non_batchable_context()

__init__(hyper_prior_fun: callable, local_prior_fun: callable, shared_prior_fun: callable = None, local_context_generator: callable = None)[source]#

Instantiates a prior generator which will draw random parameter configurations from a joint prior having the general form:

p(local | hyper) p(hyper) p(shared)

Such priors are often encountered in two-level hierarchical Bayesian models and allow for modeling nested data. No improper priors are allowed, as these may render the generative scope of a model undefined.

Parameters:
hyper_prior_fun : callable

A function (callable object) which generates random draws from a hyperprior (unconditional).

local_prior_fun : callable

A function (callable object) which generates random draws from a conditional prior given hyperparameters sampled from the hyperprior and optional context (e.g., a variable number of groups).

shared_prior_fun : callable or None, optional, default: None

A function (callable object) which generates random draws from an unconditional prior. Represents optional shared parameters.

local_context_generator : callable or None, optional, default: None

An optional function (ideally an instance of ContextGenerator) for generating control variables for the local_prior_fun.

Examples

Varying number of local factors (e.g., groups, participants) between 1 and 100:

def draw_hyper():
    # Draw location for 2D conditional prior
    return np.random.normal(size=2)

def draw_prior(means, num_groups, sigma=1.):
    # Draw parameter given location from hyperprior
    dim = means.shape[0]
    return np.random.normal(means, sigma, size=(num_groups, dim))

context = ContextGenerator(non_batchable_context_fun=lambda : np.random.randint(1, 101))
prior = TwoLevelPrior(draw_hyper, draw_prior, local_context_generator=context)
prior_dict = prior(batch_size=32)

__call__(batch_size, **kwargs)[source]#

Generates batch_size draws from the hierarchical prior.

draw_hyper_parameters(**kwargs)[source]#

Draws a set of hyperparameters from the hyperprior function, passing along optional keyword arguments.

draw_local_parameters(hypers, batchable_context=None, non_batchable_context=None, **kwargs)[source]#

Draws local parameters from the conditional local prior given the hyperparameters and optional batchable / non-batchable context.

draw_shared_parameters(**kwargs)[source]#

Draws shared parameters from the (unconditional) shared prior function, if one was provided.

class bayesflow.simulation.Simulator(batch_simulator_fun=None, simulator_fun=None, context_generator=None)[source]#

Bases: object

Basic interface for a simulation module responsible for generating randomized simulations from a user-provided simulation function, given draws from a prior parameter distribution and optional context variables.

The user-provided simulator functions should return a np.array of synthetic data which will be used internally by the GenerativeModel interface for simulations.

An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided:

  • context_generator.batchable_context(batch_size)

  • context_generator.non_batchable_context()
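
A minimal sketch of wrapping a user-defined simulation function (not part of the original reference); the Gaussian toy simulator is purely illustrative:

import numpy as np
from bayesflow.simulation import Simulator

def simulator_fun(theta):
    # Simulate 50 observations given a single parameter vector theta = (mu, log_sigma)
    mu, log_sigma = theta
    return np.random.normal(mu, np.exp(log_sigma), size=(50, 1))

simulator = Simulator(simulator_fun=simulator_fun)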

__init__(batch_simulator_fun=None, simulator_fun=None, context_generator=None)[source]#

Instantiates a data generator which will perform randomized simulations given a set of parameters and optional context. Either a batch_simulator_fun or simulator_fun, but not both, should be provided to instantiate a Simulator object.

If a batch_simulator_fun is provided, the interface will assume that the function operates on batches of parameter vectors and context variables and will pass the latter directly to the function. Power users should attempt to provide optimized batched simulators.

If a simulator_fun is provided, the interface will assume that the function operates on single parameter vectors and context variables and will wrap the simulator internally to allow batched functionality.

Parameters:
batch_simulator_fun : callable

A function (callable object) with optional control arguments responsible for generating a batch of simulations given a batch of parameters and optional context variables.

simulator_fun : callable

A function (callable object) with optional control arguments responsible for generating a simulation given a single parameter vector and optional context variables.

context_generator : callable, optional, default: None (recommended: instance of ContextGenerator)

An optional function (ideally an instance of ContextGenerator) for generating simulation context variables.

__call__(params, *args, **kwargs)[source]#

Generates simulated data given param draws and optional context variables generated internally.

Parameters:
params : np.ndarray of shape (n_sim, …)

The parameter draws obtained from the prior.

Returns:
out_dict : dictionary

An output dictionary with randomly simulated variables. The following keys are mandatory, if the default keys have not been modified:

  • sim_data : value

  • non_batchable_context : value

  • batchable_context : value

class bayesflow.simulation.GenerativeModel(prior: callable, simulator: callable, skip_test: bool = False, prior_is_batched: bool = False, simulator_is_batched: bool = None, name: str = 'anonymous')[source]#

Bases: object

Basic interface for a generative model in a simulation-based context. Generally, a generative model consists of two mandatory components:

  • Prior : A randomized function returning random parameter draws from a prior distribution;

  • Simulator : A function which transforms the parameters into observables in a non-deterministic manner.
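
A minimal end-to-end sketch composing the two components (not part of the original reference); the toy Gaussian model and all variable names are purely illustrative:

import numpy as np
from bayesflow.simulation import GenerativeModel, Prior, Simulator

prior = Prior(prior_fun=lambda: np.random.normal(size=2), param_names=["mu", "log_sigma"])

def simulator_fun(theta):
    # 50 Gaussian observations given a single parameter vector
    return np.random.normal(theta[0], np.exp(theta[1]), size=(50, 1))

simulator = Simulator(simulator_fun=simulator_fun)
model = GenerativeModel(prior, simulator, name="toy_gaussian")
forward_dict = model(batch_size=8)  # parameters, simulated data, and optional context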

__init__(prior: callable, simulator: callable, skip_test: bool = False, prior_is_batched: bool = False, simulator_is_batched: bool = None, name: str = 'anonymous')[source]#

Instantiates a generative model responsible for generating parameters, data, and optional context.

Parameters:
prior : callable or bayesflow.simulation.Prior

A function returning random draws from the prior parameter distribution. Should encode prior knowledge about plausible parameter ranges.

simulator : callable or bayesflow.simulation.Simulator

A function accepting parameter draws, optional context, and optional arguments as input and returning observable data.

skip_test : bool, optional, default: False

If True, the test forward pass performed upon initialization will be skipped.

prior_is_batched : bool, optional, default: False

Only relevant and mandatory if providing a custom prior without the Prior wrapper.

simulator_is_batched : bool or None, optional, default: None

Only relevant and mandatory if providing a custom simulator without the Simulator wrapper.

name : str, optional, default: 'anonymous'

An optional name for the generative model.

Notes

If you are not using the provided Prior and Simulator wrappers for your prior and data generator, only functions returning a np.ndarray in the correct format will be accepted, since these will be wrapped internally. In addition, you need to indicate whether your simulator operates on batches of parameters or on single parameter vectors via the simulator_is_batched argument.

__call__(batch_size, **kwargs)[source]#

Carries out forward inference batch_size times.

plot_pushforward(parameter_draws=None, funcs_list=None, funcs_labels=None, batch_size=1000, show_raw_sims=True)[source]#

Creates simulations from parameter_draws (generated from self.prior if they are not passed as an argument) and plots visualizations for them.

Parameters:
parameter_draws : np.ndarray of shape (batch_size, num_parameters)

A sample of parameters. May be drawn from either the prior (which is also the default behavior if no input is specified) or from the posterior to do a prior/posterior pushforward.

funcs_list : list of callable

A list of functions that can be used to aggregate simulation data (map a single simulation to a single real value). The default behavior without user input is to use numpy's mean and standard deviation functions.

funcs_labels : list of str

A list of labels for the functions in funcs_list. The default behavior without user input is to call the functions "Aggregator function 1", "Aggregator function 2", etc.

batch_size : int, optional, default: 1000

The number of prior draws to generate (and then create and visualize simulations from).

show_raw_sims : bool, optional, default: True

Flag determining whether or not a plot of 49 raw (i.e., unaggregated) simulations is generated. Useful for very general data exploration.

Returns:
A dictionary with the following keys:
  • parameter_draws : np.ndarray

    The parameters provided by the user or generated internally.

  • simulations : np.ndarray

    The simulations generated from parameter_draws (or prior draws generated on the fly).

  • aggregated_data : list of np.ndarray

    Arrays generated from the simulations with the functions in funcs_list.
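
A usage sketch (not part of the original reference), assuming numpy is imported as np and a GenerativeModel instance named model exists as in the sketch above; the aggregators mirror the documented defaults:

pushforward_dict = model.plot_pushforward(
    funcs_list=[np.mean, np.std],
    funcs_labels=["Mean", "Standard deviation"],
    batch_size=500,
)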

presimulate_and_save(batch_size, folder_path, total_iterations=None, memory_limit=None, iterations_per_epoch=None, epochs=None, extend_from=0, disable_user_input=False)[source]#

Simulates a dataset for single-pass offline training (called via the train_from_presimulation method of the Trainer class in the trainers.py script).

Parameters:
batch_size : int

Number of simulations which will be used in each backprop step of training.

folder_path : str

The folder in which to save the presimulated data.

total_iterations : int or None, optional, default: None

Total number of iterations to perform during training. If total_iterations divided by epochs is not an integer, it will be increased so that said division does result in an integer.

memory_limit : int or None, optional, default: None

Upper bound on the size of individual files (in MB); can be useful to avoid running out of RAM during training.

iterations_per_epoch : int or None, optional, default: None

Number of batch simulations to perform per epoch file. If iterations_per_epoch batches per file lead to files exceeding the memory_limit, iterations_per_epoch will be lowered so that the memory_limit can be enforced.

epochs : int or None, optional, default: None

Number of epoch files to generate. A higher number will be generated if the memory_limit for individual files requires it.

extend_from : int, optional, default: 0

If folder_path already contains simulations and the user wishes to add further simulations to these, extend_from must provide the number of the last presimulation file in folder_path.

disable_user_input : bool, optional, default: False

If True, the user will not be asked whether the available memory is sufficient for presimulation.

Notes

One of the following pairs of parameters has to be provided:

  • (iterations_per_epoch, epochs),

  • (total_iterations, iterations_per_epoch)

  • (total_iterations, epochs)

Providing all three of the parameters in these pairs leads to a consistency check, since incompatible combinations are possible.
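
A usage sketch (not part of the original reference), again assuming a GenerativeModel instance named model; the folder path is hypothetical and (iterations_per_epoch, epochs) is one of the valid pairs listed above:

model.presimulate_and_save(
    batch_size=32,
    folder_path="./presimulations/toy_gaussian",
    iterations_per_epoch=10,
    epochs=10,
    disable_user_input=True,
)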

class bayesflow.simulation.TwoLevelGenerativeModel(prior: callable, simulator: callable, skip_test: bool = False, simulator_is_batched: bool = None, name: str = 'anonymous')[source]#

Bases: object

Basic interface for a generative model in a simulation-based context.

Generally, a generative model consists of two mandatory components:

  • TwoLevelPrior : A randomized function returning random parameter draws from a two-level prior distribution;

  • Simulator : A function which transforms the parameters into observables in a non-deterministic manner.

__init__(prior: callable, simulator: callable, skip_test: bool = False, simulator_is_batched: bool = None, name: str = 'anonymous')[source]#

Instantiates a generative model responsible for generating parameters, data, and optional context.

Parameters:
prior : callable

A function returning random draws from the two-level prior parameter distribution. Should encode prior knowledge about plausible parameter ranges.

simulator : callable or bayesflow.simulation.Simulator

A function accepting parameter draws, shared parameters, optional context, and optional arguments as input and returning observable data.

skip_test : bool, optional, default: False

If True, the test forward pass performed upon initialization will be skipped.

simulator_is_batched : bool or None, optional, default: None

Only relevant and mandatory if providing a custom simulator without the Simulator wrapper.

name : str, optional, default: 'anonymous'

An optional name for the generative model.

Notes

If you are not using the provided TwoLevelPrior and Simulator wrappers for your prior and data generator, only functions returning a np.ndarray in the correct format will be accepted, since these will be wrapped internally. In addition, you need to indicate whether your simulator operates on batches of parameters or on single parameter vectors via the simulator_is_batched argument.

__call__(batch_size, **kwargs)[source]#

Carries out forward inference batch_size times.

class bayesflow.simulation.MultiGenerativeModel(generative_models: list, model_probs='equal', shared_context_gen=None)[source]#

Bases: object

Basic interface for multiple generative models in a simulation-based context. A MultiGenerativeModel instance consists of a list of GenerativeModel instances and a prior distribution over candidate models defined by a list of probabilities.

__init__(generative_models: list, model_probs='equal', shared_context_gen=None)[source]#

Instantiates a multi-generative model responsible for generating parameters, data, and optional context from a list of models according to specified prior model probabilities (PMPs).

Parameters:
generative_models : list of GenerativeModel instances

The list of candidate generative models.

model_probs : string (default: 'equal') or list of floats with sum(model_probs) == 1.

The list of model probabilities; should have the same length as the list of generative models. Note that the probabilities should sum to one.

shared_context_gen : callable or None, optional, default: None

An optional function to generate context variables shared across all models and simulations in a given batch.

For instance, if the number of observations in a data set should vary during training, you need to pass the shared context to the MultiGenerativeModel, and not the individual GenerativeModels, as the latter will result in unequal numbers of observations across the models in a single batch.

Important: This function should return a dictionary with keys corresponding to the function arguments expected by the simulators.
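
A minimal sketch of a model-comparison setup (not part of the original reference); the two toy candidate models are purely illustrative:

import numpy as np
from bayesflow.simulation import GenerativeModel, MultiGenerativeModel, Prior, Simulator

# Two candidate models sharing the same prior but differing in their likelihood
prior = Prior(prior_fun=lambda: np.random.normal(size=1), param_names=["mu"])
sim_gauss = Simulator(simulator_fun=lambda theta: np.random.normal(theta[0], 1.0, size=(50, 1)))
sim_laplace = Simulator(simulator_fun=lambda theta: np.random.laplace(theta[0], 1.0, size=(50, 1)))

model_1 = GenerativeModel(prior, sim_gauss, name="gaussian")
model_2 = GenerativeModel(prior, sim_laplace, name="laplace")

meta_model = MultiGenerativeModel([model_1, model_2], model_probs="equal")
forward_dict = meta_model(batch_size=16)  # batch_size simulations drawn across the candidate models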

__call__(batch_size, **kwargs)[source]#

Generates a total of batch_size simulations from all models.