bayesflow.simulation module
- class bayesflow.simulation.ContextGenerator(batchable_context_fun: callable = None, non_batchable_context_fun: callable = None, use_non_batchable_for_batchable: bool = False)
Bases: object
Basic interface for a simulation module responsible for generating variables over which we want to amortize during simulation-based training, but do not want to perform inference on. Both priors and simulators in a generative framework can have their own context generators, depending on the particular modeling goals.
The interface distinguishes between two types of context: batchable and non-batchable.
- Batchable context variables differ for each simulation in each training batch.
- Non-batchable context variables stay the same for each simulation in a batch, but differ across batches.
Examples for batchable context variables include experimental design variables, design matrices, etc. Examples for non-batchable context variables include the number of observations in an experiment, positional encodings, time indices, etc.
While the latter can also be considered batchable in principle, batching them would require non-Tensor (i.e., non-rectangular) data structures, which usually means inefficient computations.
Examples
Example for a simulation context which will generate a random number of observations between 1 and 100 for each training batch:
>>> gen = ContextGenerator(non_batchable_context_fun=lambda : np.random.randint(1, 101))
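The distinction between the two context types can be illustrated without the library. The following is a minimal NumPy-only sketch, not the BayesFlow implementation: the function `simulate_batch` and the `design` variable are hypothetical names chosen for illustration. It shows why sharing the number of observations within a batch keeps the simulated data rectangular.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def simulate_batch(batch_size):
    # Non-batchable context: one value shared by every simulation in the batch.
    n_obs = int(rng.integers(1, 101))
    # Batchable context: one value per simulation (a hypothetical design variable).
    design = rng.uniform(0.0, 1.0, size=batch_size)
    # Because n_obs is shared, the batch is a rectangular (batch_size, n_obs) array.
    data = rng.normal(loc=design[:, None], scale=1.0, size=(batch_size, n_obs))
    return n_obs, design, data

# n_obs changes from batch to batch, but every batch is a regular array.
for _ in range(3):
    n_obs, design, data = simulate_batch(batch_size=4)
    print(n_obs, design.shape, data.shape)
```

If `n_obs` differed per simulation instead, the rows of `data` would have unequal lengths and could not be stored in a single array.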
- __init__(batchable_context_fun: callable = None, non_batchable_context_fun: callable = None, use_non_batchable_for_batchable: bool = False)
Instantiates a context generator responsible for random generation of variables which vary from data set to data set but cannot be considered data or parameters, e.g., time indices, number of observations, etc. A batchable context function, a non-batchable context function, or both should be provided to the constructor. An optional argument dictates whether the outputs of the non-batchable context function should be used as inputs to the batchable context function.
- Parameters:
- batchable_context_fun : callable
A function with optional control arguments responsible for generating per-simulation context variables.
- non_batchable_context_fun : callable
A function with optional control arguments responsible for generating per-batch-of-simulations context variables.
- use_non_batchable_for_batchable : bool, optional, default: False
Determines whether to use the output of non_batchable_context_fun as input to batchable_context_fun. Only relevant when both context types are provided.
- __call__(batch_size, *args, **kwargs)
Wraps the method generate_context, which returns a dictionary with batchable and non-batchable context.
Optional positional and keyword arguments are passed to the internal context-generating functions or ignored if the latter are None.
- Parameters:
- batch_size : int
The batch_size argument used for batchable context.
- Returns:
- context_dict : dictionary
A dictionary with context variables with the following keys:
batchable_context : value
non_batchable_context : value
Note that the values of the context variables will be None if the corresponding context-generating functions have not been provided when initializing this object.
- batchable_context(batch_size, *args, **kwargs)
Generates ‘batch_size’ context variables given optional arguments. Return type is a list of context variables.
- non_batchable_context(*args, **kwargs)
Generates a context variable shared across simulations in a given batch, given optional arguments.
- generate_context(batch_size, *args, **kwargs)
Creates a dictionary with batchable and non-batchable context.
- Parameters:
- batch_size : int
The batch_size argument used for batchable context.
- Returns:
- context_dict : dictionary
A dictionary with context variables with the following keys, if the default keys are not changed:
batchable_context : value
non_batchable_context : value
Note that the values of the context variables will be None if the corresponding context-generating functions have not been provided when initializing this object.
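The documented output format can be mimicked with a small stand-in. This is a hedged sketch, not the library code: `generate_context_sketch` is a hypothetical function, and the way the shared value is handed to the batchable function (as its first positional argument) is an assumption made for illustration.

```python
import numpy as np

def generate_context_sketch(batch_size, batchable_fun=None, non_batchable_fun=None,
                            use_non_batchable_for_batchable=False):
    """Hypothetical stand-in that mirrors the documented dictionary format."""
    out = {"batchable_context": None, "non_batchable_context": None}
    if non_batchable_fun is not None:
        # Called once per batch.
        out["non_batchable_context"] = non_batchable_fun()
    if batchable_fun is not None:
        if use_non_batchable_for_batchable and non_batchable_fun is not None:
            # Assumption: the shared value is passed as the first positional argument.
            shared = out["non_batchable_context"]
            out["batchable_context"] = [batchable_fun(shared) for _ in range(batch_size)]
        else:
            # Called once per simulation; the result is a list of length batch_size.
            out["batchable_context"] = [batchable_fun() for _ in range(batch_size)]
    return out

rng = np.random.default_rng(0)
ctx = generate_context_sketch(
    batch_size=4,
    batchable_fun=lambda n_obs: rng.normal(size=n_obs),   # per-simulation covariate vector
    non_batchable_fun=lambda: int(rng.integers(1, 101)),  # per-batch number of observations
    use_non_batchable_for_batchable=True,
)
# A missing context function leaves the corresponding value at None.
only_shared = generate_context_sketch(batch_size=4, non_batchable_fun=lambda: 10)
```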
- class bayesflow.simulation.Prior(batch_prior_fun: callable = None, prior_fun: callable = None, context_generator: callable = None, param_names: list = None)
Bases: object
Basic interface for a simulation module responsible for generating random draws from a prior distribution.
The prior functions should return a np.array of simulation parameters which will be internally used by the GenerativeModel interface for simulations.
An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided:
- context_generator.batchable_context(batch_size)
- context_generator.non_batchable_context()
- __init__(batch_prior_fun: callable = None, prior_fun: callable = None, context_generator: callable = None, param_names: list = None)
Instantiates a prior generator which will draw random parameter configurations from a user-informed prior distribution. No improper priors are allowed, as these may render the generative scope of a model undefined.
- Parameters:
- batch_prior_fun : callable
A function (callable object) with optional control arguments responsible for generating batches of per-simulation parameters.
- prior_fun : callable
A function (callable object) with optional control arguments responsible for generating per-simulation parameters.
- context_generator : callable, optional (default None, recommended instance of ContextGenerator)
An optional function (ideally an instance of ContextGenerator) for generating prior context variables.
- param_names : list of str, optional (default None)
A list with strings representing the names of the parameters.
- __call__(batch_size, *args, **kwargs)
Generates batch_size draws from the prior given optional context generator.
- Parameters:
- batch_size : int
The number of draws to obtain from the prior + context generator functions.
- *args : tuple
Optional positional arguments passed to the generator functions.
- **kwargs : dict
Optional keyword arguments passed to the generator functions.
- Returns:
- out_dict : dictionary
A dictionary with the quantities generated from the prior + context functions.
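The difference between a per-simulation prior function and a batched prior function can be sketched in plain NumPy. The two functions below are illustrative only; the parameter names `mu` and `sigma` and the chosen distributions are assumptions, not part of the library.

```python
import numpy as np

rng = np.random.default_rng(1)
param_names = ["mu", "sigma"]  # hypothetical parameter names

def prior_fun():
    # Per-simulation draw: a proper prior over a location and a positive scale.
    mu = rng.normal(0.0, 1.0)
    sigma = rng.gamma(shape=2.0, scale=1.0)
    return np.array([mu, sigma])

def batch_prior_fun(batch_size):
    # Batched draw: returns all parameter vectors at once.
    mu = rng.normal(0.0, 1.0, size=batch_size)
    sigma = rng.gamma(shape=2.0, scale=1.0, size=batch_size)
    return np.stack([mu, sigma], axis=1)

# Calling the per-simulation function repeatedly yields the same array layout
# as the batched version: one row per draw, one column per parameter.
looped = np.array([prior_fun() for _ in range(16)])
batched = batch_prior_fun(16)
```

Both arrays have shape `(16, len(param_names))`; the batched version avoids the Python-level loop.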
- plot_prior2d(**kwargs)
Generates a 2D plot representing bivariate prior distributions. Uses the function bayesflow.diagnostics.plot_prior2d() internally for generating the plot.
- Parameters:
- **kwargs : dict
Optional keyword arguments passed to the plot_prior2d function.
- Returns:
- f : plt.Figure
The figure instance for optional saving.
- estimate_means_and_stds(n_draws=1000, *args, **kwargs)
Estimates prior means and standard deviations from n_draws draws from the prior, which is useful for z-standardization of the prior draws.
- Parameters:
- n_draws : int, optional (default = 1000)
The number of random draws to obtain from the joint prior.
- *args : tuple
Optional positional arguments passed to the generator functions.
- **kwargs : dict
Optional keyword arguments passed to the generator functions.
- Returns:
- (prior_means, prior_stds) : tuple of np.ndarrays
The estimated means and stds of the joint prior.
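Z-standardization with estimated prior moments can be sketched as follows. This is a NumPy-only illustration under assumptions: `draw_prior` is a hypothetical two-parameter prior, and the `(1, num_parameters)` shape of the moments is a choice made here for broadcasting, not a statement about the library's return shape.

```python
import numpy as np

rng = np.random.default_rng(2)

def draw_prior(n_draws):
    # Hypothetical joint prior over two parameters on very different scales.
    return np.stack(
        [rng.normal(100.0, 15.0, n_draws), rng.uniform(0.0, 0.1, n_draws)], axis=1
    )

# Estimate the moments from a large sample of prior draws.
draws = draw_prior(1000)
prior_means = draws.mean(axis=0, keepdims=True)
prior_stds = draws.std(axis=0, ddof=1, keepdims=True)

# Standardize new draws, and invert the transformation afterwards.
new_draws = draw_prior(32)
z = (new_draws - prior_means) / prior_stds
restored = z * prior_stds + prior_means
```

Standardizing puts all parameters on a comparable scale, and the stored moments allow estimates to be mapped back to the original units.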
- class bayesflow.simulation.TwoLevelPrior(hyper_prior_fun: callable, local_prior_fun: callable, shared_prior_fun: callable = None, local_context_generator: callable = None)
Bases: object
Basic interface for a simulation module responsible for generating random draws from a two-level prior distribution.
The prior functions should return a np.array of simulation parameters which will be internally used by the TwoLevelGenerativeModel interface for simulations.
An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided:
- context_generator.batchable_context(batch_size)
- context_generator.non_batchable_context()
- __init__(hyper_prior_fun: callable, local_prior_fun: callable, shared_prior_fun: callable = None, local_context_generator: callable = None)
Instantiates a prior generator which will draw random parameter configurations from a joint prior having the general form:
p(local | hyper) p(hyper) p(shared)
Such priors are often encountered in two-level hierarchical Bayesian models and allow for modeling nested data. No improper priors are allowed, as these may render the generative scope of a model undefined.
- Parameters:
- hyper_prior_fun : callable
A function (callable object) which generates random draws from a hyperprior (unconditional).
- local_prior_fun : callable
A function (callable object) which generates random draws from a conditional prior given hyperparameters sampled from the hyperprior and optional context (e.g., variable number of groups).
- shared_prior_fun : callable or None, optional, default: None
A function (callable object) which generates random draws from an unconditional prior. Represents optional shared parameters.
- local_context_generator : callable or None, optional, default: None
An optional function (ideally an instance of ContextGenerator) for generating control variables for the local_prior_fun.
Examples
Varying number of local factors (e.g., groups, participants) between 1 and 100:
import numpy as np
from bayesflow.simulation import ContextGenerator, TwoLevelPrior

def draw_hyper():
    # Draw location for 2D conditional prior
    return np.random.normal(size=2)

def draw_prior(means, num_groups, sigma=1.):
    # Draw parameter given location from hyperprior
    dim = means.shape[0]
    return np.random.normal(means, sigma, size=(num_groups, dim))

context = ContextGenerator(non_batchable_context_fun=lambda: np.random.randint(1, 101))
prior = TwoLevelPrior(draw_hyper, draw_prior, local_context_generator=context)
prior_dict = prior(batch_size=32)
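The factorization p(local | hyper) p(hyper) p(shared) corresponds to ancestral sampling: hyperparameters first, then local parameters conditional on them, with shared parameters drawn independently. The following NumPy-only sketch illustrates that order of operations; all function names, distributions, and output shapes are assumptions for illustration and do not reproduce the library's internals.

```python
import numpy as np

rng = np.random.default_rng(3)

def draw_hyper():
    # p(hyper): a group-level mean and a strictly positive standard deviation.
    return np.array([rng.normal(0.0, 1.0), abs(rng.normal(0.0, 1.0)) + 0.1])

def draw_local(hyper, num_groups):
    # p(local | hyper): one parameter per group, centered on the hyper mean.
    return rng.normal(hyper[0], hyper[1], size=(num_groups, 1))

def draw_shared():
    # p(shared): drawn independently of the hyper and local parameters.
    return rng.uniform(0.0, 1.0, size=1)

def draw_joint(batch_size):
    # The number of groups is shared within the batch (non-batchable context),
    # so the local parameters form a rectangular array.
    num_groups = int(rng.integers(1, 101))
    hypers = np.stack([draw_hyper() for _ in range(batch_size)])
    local_params = np.stack([draw_local(h, num_groups) for h in hypers])
    shared = np.stack([draw_shared() for _ in range(batch_size)])
    return hypers, local_params, shared

hypers, local_params, shared = draw_joint(batch_size=8)
```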
- draw_local_parameters(hypers, batchable_context=None, non_batchable_context=None, **kwargs)
TODO
- class bayesflow.simulation.Simulator(batch_simulator_fun=None, simulator_fun=None, context_generator=None)
Bases: object
Basic interface for a simulation module responsible for generating randomized simulations from prior parameter draws and optional context variables, using a user-provided simulation function.
The user-provided simulator functions should return a np.array of synthetic data which will be used internally by the GenerativeModel interface for simulations.
An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided:
- context_generator.batchable_context(batch_size)
- context_generator.non_batchable_context()
- __init__(batch_simulator_fun=None, simulator_fun=None, context_generator=None)
Instantiates a data generator which will perform randomized simulations given a set of parameters and optional context. Either a batch_simulator_fun or a simulator_fun, but not both, should be provided to instantiate a Simulator object.
If a batch_simulator_fun is provided, the interface will assume that the function operates on batches of parameter vectors and context variables and will pass the latter directly to the function. Power users should attempt to provide optimized batched simulators.
If a simulator_fun is provided, the interface will assume that the function operates on single parameter vectors and context variables and will wrap the simulator internally to allow batched functionality.
- Parameters:
- batch_simulator_fun : callable
A function (callable object) with optional control arguments responsible for generating a batch of simulations given a batch of parameters and optional context variables.
- simulator_fun : callable
A function (callable object) with optional control arguments responsible for generating a simulation given a single parameter vector and optional variables.
- context_generator : callable (default None, recommended instance of ContextGenerator)
An optional function (ideally an instance of ContextGenerator) for generating context variables.
- __call__(params, *args, **kwargs)
Generates simulated data given parameter draws and optional context variables generated internally.
- Parameters:
- params : np.ndarray of shape (n_sim, …)
The parameter draws obtained from the prior.
- Returns:
- out_dict : dictionary
An output dictionary with randomly simulated variables. Unless the default keys are modified, it contains the following keys:
sim_data : value
non_batchable_context : value
batchable_context : value
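The two simulator styles can be contrasted in plain NumPy. The sketch below is illustrative only: `wrap_as_batched` is a hypothetical helper showing one way a single-simulation function could be lifted to batches, and it is not claimed to match how the library wraps `simulator_fun` internally.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulator_fun(theta, n_obs=50):
    # Single simulation: n_obs Gaussian observations with mean theta[0] and sd theta[1].
    return rng.normal(theta[0], theta[1], size=(n_obs, 1))

def batch_simulator_fun(params, n_obs=50):
    # Vectorized simulation for all parameter vectors at once via broadcasting.
    mu = params[:, 0][:, None, None]
    sd = params[:, 1][:, None, None]
    return rng.normal(mu, sd, size=(params.shape[0], n_obs, 1))

def wrap_as_batched(fun):
    # Hypothetical helper: loop over parameter vectors and stack the results.
    def batched(params, *args, **kwargs):
        return np.stack([fun(theta, *args, **kwargs) for theta in params])
    return batched

params = np.column_stack([rng.normal(size=8), rng.uniform(0.5, 2.0, size=8)])
sims_looped = wrap_as_batched(simulator_fun)(params)
sims_vectorized = batch_simulator_fun(params)
```

Both calls return an array of shape `(8, 50, 1)`; the vectorized variant avoids the Python loop, which is the motivation for providing an optimized batched simulator.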
- class bayesflow.simulation.GenerativeModel(prior: callable, simulator: callable, skip_test: bool = False, prior_is_batched: bool = False, simulator_is_batched: bool = None, name: str = 'anonymous')
Bases: object
Basic interface for a generative model in a simulation-based context. Generally, a generative model consists of two mandatory components:
- Prior : A randomized function returning random parameter draws from a prior distribution;
- Simulator : A function which transforms the parameters into observables in a non-deterministic manner.
- __init__(prior: callable, simulator: callable, skip_test: bool = False, prior_is_batched: bool = False, simulator_is_batched: bool = None, name: str = 'anonymous')
Instantiates a generative model responsible for generating parameters, data, and optional context.
- Parameters:
- prior : callable or bayesflow.simulation.Prior
A function returning random draws from the prior parameter distribution. Should encode prior knowledge about plausible parameter ranges.
- simulator : callable or bayesflow.simulation.Simulator
A function accepting parameter draws, optional context, and optional arguments as input and returning observable data.
- skip_test : bool, optional, default: False
If True, the forward inference pass used to test the model is skipped.
- prior_is_batched : bool, optional, default: False
Only relevant and mandatory if providing a custom prior without the Prior wrapper.
- simulator_is_batched : bool or None, optional, default: None
Only relevant and mandatory if providing a custom simulator without the Simulator wrapper.
- name : str, optional, default: 'anonymous'
An optional name for the generative model.
Notes
If you are not using the provided Prior and Simulator wrappers for your prior and data generator, only functions returning a np.ndarray in the correct format will be accepted, since these will be wrapped internally. In addition, you need to indicate whether your simulator operates on batches of parameters or on single parameter vectors via the simulator_is_batched argument.
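The composition of the two mandatory components can be sketched with plain functions that return np.ndarray objects. This is a hedged illustration, not the library's forward pass: `generative_model` is a hypothetical function, and the dictionary key `prior_draws` is an assumption (only `sim_data` appears in the Simulator documentation above). Both functions here operate on whole batches, which is the situation the `prior_is_batched` and `simulator_is_batched` flags describe.

```python
import numpy as np

rng = np.random.default_rng(5)

def prior(batch_size):
    # Batched prior: one row of two parameters per simulation.
    return rng.normal(0.0, 1.0, size=(batch_size, 2))

def simulator(params, n_obs=20):
    # Batched simulator: n_obs noisy observations of each parameter vector.
    return rng.normal(params[:, None, :], 1.0, size=(params.shape[0], n_obs, params.shape[1]))

def generative_model(batch_size):
    # Forward pass: draw parameters, then push them through the simulator.
    params = prior(batch_size)
    return {"prior_draws": params, "sim_data": simulator(params)}

out = generative_model(16)
```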
- plot_pushforward(parameter_draws=None, funcs_list=None, funcs_labels=None, batch_size=1000, show_raw_sims=True)
Creates simulations from parameter_draws (generated from self.prior if they are not passed as an argument) and plots visualizations for them.
- Parameters:
- parameter_draws : np.ndarray of shape (batch_size, num_parameters)
A sample of parameters. May be drawn from either the prior (which is also the default behavior if no input is specified) or from the posterior to do a prior/posterior pushforward.
- funcs_list : list of callable
A list of functions that can be used to aggregate simulation data (map a single simulation to a single real value). The default behavior without user input is to use numpy’s mean and standard deviation functions.
- funcs_labels : list of str
A list of labels for the functions in funcs_list. The default behavior without user input is to call the functions “Aggregator function 1”, “Aggregator function 2”, etc.
- batch_size : int, optional, default: 1000
The number of prior draws to generate (and then create and visualize simulations from).
- show_raw_sims : bool, optional, default: True
Flag determining whether or not a plot of 49 raw (i.e., unaggregated) simulations is generated. Useful for very general data exploration.
- Returns:
- A dictionary with the following keys:
- parameter_draws : np.ndarray
The parameters provided by the user or generated internally.
- simulations : np.ndarray
The simulations generated from parameter_draws (or prior draws generated on the fly).
- aggregated_data : list of np.ndarray
Arrays generated from the simulations with the functions in funcs_list.
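The aggregation step (mapping each simulation to a single real value) can be sketched in NumPy. The simulated data below are made up for illustration, and the loop is one straightforward way to apply the aggregators; it is not claimed to be how the method computes `aggregated_data` internally.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical pushforward: 1000 parameter draws, 30 observations per simulation.
parameter_draws = rng.normal(size=(1000, 2))
simulations = rng.normal(parameter_draws[:, None, :1], 1.0, size=(1000, 30, 1))

# Each aggregator maps one simulation to one real number.
funcs_list = [np.mean, np.std]
funcs_labels = ["Mean", "Standard deviation"]

aggregated_data = [np.array([f(sim) for sim in simulations]) for f in funcs_list]

for label, values in zip(funcs_labels, aggregated_data):
    print(label, values.shape)
```

Each entry of `aggregated_data` holds one value per simulation, so its distribution can be compared against the same statistic computed on observed data.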
- presimulate_and_save(batch_size, folder_path, total_iterations=None, memory_limit=None, iterations_per_epoch=None, epochs=None, extend_from=0, disable_user_input=False)
Simulates a dataset for single-pass offline training (called via the train_from_presimulation method of the Trainer class in the trainers.py script).
- Parameters:
- batch_size : int
Number of simulations which will be used in each backprop step of training.
- folder_path : str
The folder in which to save the presimulated data.
- total_iterations : int or None, optional, default: None
Total number of iterations to perform during training. If total_iterations divided by epochs is not an integer, it will be increased so that said division does result in an integer.
- memory_limit : int or None, optional, default: None
Upper bound on the size of individual files (in MB); can be useful to avoid running out of RAM during training.
- iterations_per_epoch : int or None, optional, default: None
Number of batch simulations to perform per epoch file. If iterations_per_epoch batches per file lead to files exceeding the memory_limit, iterations_per_epoch will be lowered so that the memory_limit can be enforced.
- epochs : int or None, optional, default: None
Number of epoch files to generate. A higher number will be generated if the memory_limit for individual files requires it.
- extend_from : int, optional, default: 0
If folder_path already contains simulations and the user wishes to add further simulations to these, extend_from must provide the number of the last presimulation file in folder_path.
- disable_user_input : bool, optional, default: False
If True, user will not be asked if memory space is sufficient for presimulation.
Notes
One of the following pairs of parameters has to be provided:
- (iterations_per_epoch, epochs)
- (total_iterations, iterations_per_epoch)
- (total_iterations, epochs)
Providing all three of the parameters in these pairs leads to a consistency check, since incompatible combinations are possible.
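The relationship between the three arguments is total_iterations = iterations_per_epoch × epochs. The following standard-library sketch shows how any two of them determine the third and what a consistency check could look like. `resolve_schedule` is a hypothetical helper; rounding up in the (total_iterations, iterations_per_epoch) case is an assumption, and the memory_limit adjustment is deliberately left out.

```python
import math

def resolve_schedule(total_iterations=None, iterations_per_epoch=None, epochs=None):
    """Hypothetical helper: derive the missing quantity from any two of the three."""
    provided = [x is not None for x in (total_iterations, iterations_per_epoch, epochs)]
    if sum(provided) < 2:
        raise ValueError("Provide at least two of the three arguments.")
    if total_iterations is None:
        total_iterations = iterations_per_epoch * epochs
    elif iterations_per_epoch is None:
        # Round up so that total_iterations becomes divisible by epochs.
        iterations_per_epoch = math.ceil(total_iterations / epochs)
        total_iterations = iterations_per_epoch * epochs
    elif epochs is None:
        # Assumption: round up here as well so that no iteration is lost.
        epochs = math.ceil(total_iterations / iterations_per_epoch)
        total_iterations = iterations_per_epoch * epochs
    elif total_iterations != iterations_per_epoch * epochs:
        # All three were given: they must agree with each other.
        raise ValueError("total_iterations must equal iterations_per_epoch * epochs.")
    return total_iterations, iterations_per_epoch, epochs

print(resolve_schedule(iterations_per_epoch=100, epochs=10))  # (1000, 100, 10)
print(resolve_schedule(total_iterations=1000, epochs=3))      # (1002, 334, 3)
```

In the second call, 1000 is not divisible by 3, so the total is raised to 1002 to give 334 iterations in each of the 3 epoch files.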
- class bayesflow.simulation.TwoLevelGenerativeModel(prior: callable, simulator: callable, skip_test: bool = False, simulator_is_batched: bool = None, name: str = 'anonymous')
Bases: object
Basic interface for a generative model in a simulation-based context.
Generally, a generative model consists of two mandatory components:
- TwoLevelPrior : A randomized function returning random parameter draws from a two-level prior distribution;
- Simulator : A function which transforms the parameters into observables in a non-deterministic manner.
- __init__(prior: callable, simulator: callable, skip_test: bool = False, simulator_is_batched: bool = None, name: str = 'anonymous')
Instantiates a generative model responsible for generating parameters, data, and optional context.
- Parameters:
- prior : callable
A function returning random draws from the two-level prior parameter distribution. Should encode prior knowledge about plausible parameter ranges.
- simulator : callable or bayesflow.simulation.Simulator
A function accepting parameter draws, shared parameters, optional context, and optional arguments as input and returning observable data.
- skip_test : bool, optional, default: False
If True, the forward inference pass used to test the model is skipped.
- simulator_is_batched : bool or None, optional, default: None
Only relevant and mandatory if providing a custom simulator without the Simulator wrapper.
- name : str, optional, default: 'anonymous'
An optional name for the generative model.
Notes
If you are not using the provided TwoLevelPrior and Simulator wrappers for your prior and data generator, only functions returning a np.ndarray in the correct format will be accepted, since these will be wrapped internally. In addition, you need to indicate whether your simulator operates on batches of parameters or on single parameter vectors via the simulator_is_batched argument.
- class bayesflow.simulation.MultiGenerativeModel(generative_models: list, model_probs='equal', shared_context_gen=None)
Bases: object
Basic interface for multiple generative models in a simulation-based context. A MultiGenerativeModel instance consists of a list of GenerativeModel instances and a prior distribution over candidate models defined by a list of probabilities.
- __init__(generative_models: list, model_probs='equal', shared_context_gen=None)
Instantiates a multi-generative model responsible for generating parameters, data, and optional context from a list of models according to specified prior model probabilities (PMPs).
- Parameters:
- generative_models : list of GenerativeModel instances
The list of candidate generative models.
- model_probs : str (default: 'equal') or list of floats with sum(model_probs) == 1
The list of model probabilities, which should have the same length as the list of generative models. Note that the probabilities should sum to one.
- shared_context_gen : callable or None, optional, default: None
An optional function to generate context variables shared across all models and simulations in a given batch.
For instance, if the number of observations in a data set should vary during training, you need to pass the shared context to the MultiGenerativeModel, and not the individual GenerativeModels, as the latter will result in unequal numbers of observations across the models in a single batch.
Important: This function should return a dictionary with keys corresponding to the function arguments expected by the simulators.
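The roles of the prior model probabilities and the shared context can be sketched in NumPy. Everything below is an illustration under assumptions: `resolve_model_probs`, `split_batch`, and the three toy simulators are hypothetical, and drawing one model index per simulation is one plausible way to split a batch, not a description of the library's internals.

```python
import numpy as np

rng = np.random.default_rng(7)

def resolve_model_probs(num_models, model_probs="equal"):
    # 'equal' expands to a uniform prior over the candidate models.
    if isinstance(model_probs, str):
        if model_probs != "equal":
            raise ValueError("The only supported string is 'equal'.")
        return np.full(num_models, 1.0 / num_models)
    probs = np.asarray(model_probs, dtype=float)
    if probs.shape != (num_models,) or not np.isclose(probs.sum(), 1.0):
        raise ValueError("model_probs needs one entry per model and must sum to one.")
    return probs

def split_batch(batch_size, probs):
    # Assumption: draw one model index per simulation, then count per model.
    indices = rng.choice(len(probs), size=batch_size, p=probs)
    return np.bincount(indices, minlength=len(probs))

def shared_context_gen():
    # Returns a dictionary whose keys match the simulators' argument names.
    return {"n_obs": int(rng.integers(1, 101))}

# Three toy simulators that all accept the shared n_obs argument.
simulators = [
    lambda n, n_obs: rng.normal(size=(n, n_obs)),
    lambda n, n_obs: rng.standard_t(df=3, size=(n, n_obs)),
    lambda n, n_obs: rng.laplace(size=(n, n_obs)),
]

probs = resolve_model_probs(len(simulators))
counts = split_batch(64, probs)
ctx = shared_context_gen()  # drawn once per batch, reused by every model
data = [sim(c, **ctx) for sim, c in zip(simulators, counts)]
```

Because `ctx` is generated once and passed to every simulator, all models in the batch produce the same number of observations, which is the reason for handing the shared context to the multi-model wrapper.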