OnlineDataset

class bayesflow.datasets.OnlineDataset(simulator: Simulator, batch_size: int, num_batches: int, adapter: Adapter | None, *, stage: str = 'training', augmentations: Callable | Mapping[str, Callable] | Sequence[Callable] = None, **kwargs)

Bases: PyDataset

A dataset that generates simulations on-the-fly.

Initialize an OnlineDataset instance for infinite stream training.

Parameters:
simulator : Simulator

A simulator object with a .sample(batch_shape) method to generate data.

batch_size : int

Number of samples per batch.

num_batches : int

Total number of batches in the dataset.

adapter : Adapter or None

Optional adapter to transform the simulated batch.

stage : str, default="training"

Current stage (e.g., "training" or "validation") used by the adapter.

augmentations : Callable or Mapping[str, Callable] or Sequence[Callable], optional

A single augmentation function, dictionary of augmentation functions, or sequence of augmentation functions to apply to the batch.

If you provide a dictionary of functions, each function should accept one element of your output batch and return the corresponding transformed element.

Otherwise, each function should accept the entire batch dictionary and return a dictionary.

Note: augmentations are applied before the adapter is called and are generally transforms that you want to apply only during training.

**kwargs

Additional keyword arguments passed to the base PyDataset.
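The three accepted forms of augmentations can be sketched in plain Python. This is an illustration of the behavior described above, not the bayesflow implementation: apply_augmentations, add_count, and negate_theta are hypothetical names, the batch uses lists instead of arrays, and the sketch assumes that functions in a sequence are applied in order to the whole batch.

```python
from collections.abc import Mapping, Sequence


def apply_augmentations(batch, augmentations=None):
    """Hypothetical helper mirroring the three documented augmentation forms."""
    if augmentations is None:
        return batch
    if isinstance(augmentations, Mapping):
        # Dictionary form: each function receives only the element stored under its key.
        out = dict(batch)
        for key, fn in augmentations.items():
            out[key] = fn(out[key])
        return out
    if isinstance(augmentations, Sequence):
        # Sequence form (assumed): whole-batch functions applied one after another.
        for fn in augmentations:
            batch = fn(batch)
        return batch
    if callable(augmentations):
        # Single function: receives and returns the entire batch dictionary.
        return augmentations(batch)
    raise TypeError("augmentations must be a callable, a mapping, or a sequence")


def add_count(batch):
    # Whole-batch augmentation that adds a new key.
    return {**batch, "n": len(batch["x"])}


def negate_theta(batch):
    # Whole-batch augmentation that rewrites an existing key.
    return {**batch, "theta": [-v for v in batch["theta"]]}


batch = {"theta": [0.5, 1.5], "x": [1.0, 2.0]}

scaled = apply_augmentations(batch, {"x": lambda x: [2.0 * v for v in x]})
counted = apply_augmentations(batch, add_count)
chained = apply_augmentations(batch, [add_count, negate_theta])
```

In the dictionary form only the "x" entry is touched, while the callable and sequence forms may add or rewrite any key because they see the full batch.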

property num_batches: int

Number of batches in the PyDataset.

Returns:

The number of batches in the PyDataset or None to indicate that the dataset is infinite.
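To make the relationship between on-the-fly simulation and num_batches concrete, here is a minimal stand-in written with the standard library only. ToySimulator and ToyOnlineDataset are hypothetical classes for illustration, not bayesflow code; the sketch assumes batch_shape is a one-element tuple and that each indexed access draws a fresh batch regardless of the index.

```python
import random


class ToySimulator:
    """Hypothetical simulator exposing a .sample(batch_shape) method."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def sample(self, batch_shape):
        (n,) = batch_shape  # assumed: a one-element tuple such as (batch_size,)
        theta = [self.rng.gauss(0.0, 1.0) for _ in range(n)]
        x = [t + self.rng.gauss(0.0, 0.1) for t in theta]
        return {"theta": theta, "x": x}


class ToyOnlineDataset:
    """Stand-in for illustration only; not the bayesflow implementation."""

    def __init__(self, simulator, batch_size, num_batches):
        self.simulator = simulator
        self.batch_size = batch_size
        self._num_batches = num_batches

    @property
    def num_batches(self):
        # Defines the length of one epoch; no data is stored.
        return self._num_batches

    def __len__(self):
        return self._num_batches

    def __getitem__(self, index):
        if not 0 <= index < self._num_batches:
            raise IndexError(index)
        # The index only bounds the epoch: every access simulates a new batch.
        return self.simulator.sample((self.batch_size,))


dataset = ToyOnlineDataset(ToySimulator(seed=1), batch_size=4, num_batches=3)
first = dataset[0]
again = dataset[0]  # same index, freshly simulated data
```

Under these assumptions num_batches only determines how many batches make up an epoch, and requesting the same index twice yields different simulations.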

property max_queue_size

on_epoch_begin()

Method called at the beginning of every epoch.

on_epoch_end()

Method called at the end of every epoch.

property use_multiprocessing

property workers