coverage

bayesflow.diagnostics.coverage(estimates: Mapping[str, ndarray] | ndarray, targets: Mapping[str, ndarray] | ndarray, difference: bool = False, variable_keys: Sequence[str] = None, variable_names: Sequence[str] = None, test_quantities: dict[str, Callable] = None, figsize: Sequence[int] = None, label_fontsize: int = 16, legend_fontsize: int = 14, title_fontsize: int = 18, tick_fontsize: int = 12, legend_location: str = 'lower right', color: str = '#132a70', num_col: int = None, num_row: int = None) -> Figure

Creates coverage plots showing empirical coverage of posterior credible intervals.

The empirical coverage is the proportion of true variable values that fall within the central posterior credible interval of a given width. For a well-calibrated model, the coverage matches the interval width (e.g., a 95% credible interval contains the true value 95% of the time), which corresponds to the diagonal line.
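The quantity described above can be sketched with plain NumPy. This is an illustration under stated assumptions, not the library's implementation: the helper name `empirical_coverage`, the grid of widths, and the toy data (targets and draws taken from the same standard normal, so the toy "posterior" is calibrated by construction) are all hypothetical.

```python
import numpy as np


def empirical_coverage(estimates, targets, widths):
    """Hypothetical helper: coverage of central credible intervals.

    estimates: (num_datasets, num_post_draws, num_params)
    targets:   (num_datasets, num_params)
    widths:    1D array of interval widths in (0, 1)
    Returns an array of shape (len(widths), num_params).
    """
    coverage = np.empty((len(widths), targets.shape[1]))
    for i, width in enumerate(widths):
        tail = (1.0 - width) / 2.0
        # Central interval bounds per dataset and parameter
        lower = np.quantile(estimates, tail, axis=1)
        upper = np.quantile(estimates, 1.0 - tail, axis=1)
        inside = (targets >= lower) & (targets <= upper)
        # Proportion of datasets whose true value lies in the interval
        coverage[i] = inside.mean(axis=0)
    return coverage


rng = np.random.default_rng(0)
num_datasets, num_post_draws, num_params = 500, 200, 2

# Toy data (assumption): targets and draws share one distribution
targets = rng.normal(size=(num_datasets, num_params))
estimates = rng.normal(size=(num_datasets, num_post_draws, num_params))

widths = np.linspace(0.05, 0.95, 19)
cov = empirical_coverage(estimates, targets, widths)

# Deviation from ideal calibration (coverage - width)
diff = cov - widths[:, None]
```

For calibrated draws, `cov` should stay close to `widths` and `diff` close to zero, up to sampling noise from the finite number of datasets.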

The coverage is accompanied by credible intervals for the coverage (gray ribbon). These are computed via the (conjugate) Beta-Binomial model for binomial proportions with a uniform prior. For more details on the Beta-Binomial model, see Chapter 2 of Bayesian Data Analysis (2013, 3rd ed.) by Gelman A., Carlin J., Stern H., Dunson D., Vehtari A., & Rubin D.
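As a rough sketch of the Beta-Binomial idea: with a uniform Beta(1, 1) prior and k of n true values covered, the posterior for the coverage proportion is Beta(1 + k, 1 + n - k). The helper below approximates its central interval by Monte Carlo using NumPy only. The function name, the 95% level, and the number of draws are assumptions made for illustration; the library may compute the ribbon differently (for example, from exact Beta quantiles).

```python
import numpy as np


def coverage_interval(num_covered, num_datasets, prob=0.95,
                      num_draws=100_000, seed=0):
    """Hypothetical helper: credible interval for a coverage proportion.

    Uses the conjugate posterior Beta(1 + k, 1 + n - k) that results
    from a uniform prior and k successes in n binomial trials.
    """
    rng = np.random.default_rng(seed)
    draws = rng.beta(1 + num_covered,
                     1 + num_datasets - num_covered,
                     size=num_draws)
    tail = (1.0 - prob) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return float(lower), float(upper)


# Example: 470 of 500 true values fell inside the credible interval
lower, upper = coverage_interval(num_covered=470, num_datasets=500)
```

The interval narrows as the number of datasets grows, which is why a coverage plot based on few datasets shows a wide ribbon.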

Parameters:
estimates : dict[str, np.ndarray] or np.ndarray of shape (num_datasets, num_post_draws, num_params)

The posterior draws obtained for each of the num_datasets datasets.

targets : dict[str, np.ndarray] or np.ndarray of shape (num_datasets, num_params)

The true parameter values used to generate the num_datasets datasets.

difference : bool, optional, default: False

If True, plots the difference between empirical coverage and ideal coverage (coverage - width), making deviations from ideal calibration more visible. If False, plots the standard coverage plot.

variable_keys : list or None, optional, default: None

Select keys from the dictionaries provided in estimates and targets. By default, select all keys.

variable_names : list or None, optional, default: None

The parameter names for nice plot titles. Inferred if None.

test_quantities : dict or None, optional, default: None

A dict that maps plot titles to functions that compute test quantities based on estimate/target draws.

The dict keys are automatically added to variable_keys and variable_names. Test quantity functions are expected to accept a dict of draws with shape (batch_size, ...) as the first (typically only) positional argument and return a NumPy array of shape (batch_size,). The functions do not have to deal with an additional sample dimension, as appropriate reshaping is done internally.

figsize : tuple or None, optional, default: None

The figure size passed to the matplotlib constructor. Inferred if None.

label_fontsize : int, optional, default: 16

The font size of the y-label and x-label text.

legend_fontsize : int, optional, default: 14

The font size of the legend text.

title_fontsize : int, optional, default: 18

The font size of the title text.

tick_fontsize : int, optional, default: 12

The font size of the axis tick labels.

legend_location : str, optional, default: 'lower right'

The location of the legend.

color : str, optional, default: '#132a70'

The color for the coverage line.

num_row : int, optional, default: None

The number of rows for the subplots. Dynamically determined if None.

num_col : int, optional, default: None

The number of columns for the subplots. Dynamically determined if None.

Returns:
f : plt.Figure

The figure instance for optional saving.

Raises:
ShapeError

If there is a deviation from the expected shapes of estimates and targets.
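To illustrate the contract that the test_quantities parameter describes, here is a minimal sketch of one test quantity function. The key "theta", the function name, and the plot title are hypothetical. Only the calling convention (a dict of draws with leading dimension batch_size in, an array of shape (batch_size,) out) comes from the description above.

```python
import numpy as np


def theta_sum(draws):
    """Hypothetical test quantity: sum over the parameter dimension.

    draws["theta"] is assumed to have shape (batch_size, num_params);
    the result has shape (batch_size,).
    """
    return np.sum(draws["theta"], axis=-1)


# Maps a plot title to the function that computes the quantity
test_quantities = {r"$\sum_i \theta_i$": theta_sum}

# Stand-alone check of the expected shapes on a small batch
batch = {"theta": np.arange(12.0).reshape(4, 3)}
values = theta_sum(batch)
```

A dict like this would then be passed as the `test_quantities` argument of `coverage`, alongside `estimates` and `targets` given as dicts that contain the assumed key.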