calibration_ecdf
- bayesflow.diagnostics.calibration_ecdf(estimates: dict[str, ndarray] | ndarray, targets: dict[str, ndarray] | ndarray, variable_keys: Sequence[str] = None, variable_names: Sequence[str] = None, difference: bool = False, stacked: bool = False, rank_type: str | ndarray = 'fractional', figsize: Sequence[float] = None, label_fontsize: int = 16, legend_fontsize: int = 14, title_fontsize: int = 18, tick_fontsize: int = 12, rank_ecdf_color: str = '#132a70', fill_color: str = 'grey', num_row: int = None, num_col: int = None, **kwargs) → Figure
Creates the empirical CDF (ECDF) of each marginal rank distribution and plots it against the ECDF expected under a uniform distribution. Simultaneous ECDF bands are drawn using simulations from the uniform distribution, as proposed by [1].
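To illustrate what "bands from simulations of the uniform" means, the following is a minimal NumPy sketch under simplifying assumptions. It builds a constant-width band from the simulated maximum deviation of uniform ECDFs, which is simpler than the binomial-based bands of [1] and is not the implementation used by simultaneous_ecdf_bands. The function name simulated_uniform_band and all of its parameters are hypothetical.

```python
import numpy as np

def simulated_uniform_band(n_ranks, grid, alpha=0.05, n_sim=1000, seed=0):
    """Constant-width simultaneous band for the ECDF of n_ranks uniform draws.

    Hypothetical helper for illustration only: the band width is the
    (1 - alpha) quantile of the largest ECDF deviation over the grid,
    estimated from n_sim simulated uniform samples.
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=(n_sim, n_ranks))
    # ECDF of each simulated sample evaluated on the grid: (n_sim, len(grid))
    ecdfs = np.mean(u[:, :, None] <= grid[None, None, :], axis=1)
    # Largest absolute deviation from the uniform CDF per simulation
    max_dev = np.max(np.abs(ecdfs - grid), axis=1)
    width = np.quantile(max_dev, 1 - alpha)
    lower = np.clip(grid - width, 0.0, 1.0)
    upper = np.clip(grid + width, 0.0, 1.0)
    return lower, upper

grid = np.linspace(0.0, 1.0, 51)
lower, upper = simulated_uniform_band(n_ranks=200, grid=grid)
```

A rank ECDF that leaves the region between lower and upper anywhere on the grid would, under these assumptions, indicate miscalibration at roughly the chosen alpha level.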
For models with many parameters, use stacked=True to obtain an idea of the overall calibration of a posterior approximator.
To compute ranks based on the Euclidean distance to the origin or to a reference, use rank_type='distance' (and, for a custom reference, pass a reference array). This checks the joint calibration of the posterior approximator and can reveal biases in the posterior approximation that the fractional ranks do not detect (e.g., when the posterior equals the prior). This is motivated by [2].
[1] Säilynoja, T., Bürkner, P. C., & Vehtari, A. (2022). Graphical test for discrete uniformity and its applications in goodness-of-fit evaluation and multiple sample comparison. Statistics and Computing, 32(2), 1-21. https://arxiv.org/abs/2103.10522
[2] Lemos, P., et al. (2023). Sampling-based accuracy testing of posterior estimators for general inference. International Conference on Machine Learning. PMLR. https://proceedings.mlr.press/v202/lemos23a.html
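The two rank types described above can be sketched in plain NumPy. This is an illustration of the idea, not BayesFlow's internal code: the helper names fractional_ranks, distance_ranks, and ecdf are hypothetical, the reference is simplified to a single point of shape (n_params,), and the toy data come from a conjugate Gaussian model chosen so that the posterior draws are calibrated.

```python
import numpy as np

def fractional_ranks(estimates, targets):
    # Fraction of posterior draws smaller than the target, per parameter.
    # estimates: (n_data_sets, n_post_draws, n_params)
    # targets:   (n_data_sets, n_params)
    return np.mean(estimates < targets[:, np.newaxis, :], axis=1)

def distance_ranks(estimates, targets, reference=None):
    # Fraction of posterior draws closer to the reference than the target is.
    # reference: (n_params,), defaults to the origin (illustrative simplification)
    if reference is None:
        reference = np.zeros(targets.shape[-1])
    draw_dist = np.linalg.norm(estimates - reference, axis=-1)
    target_dist = np.linalg.norm(targets - reference, axis=-1)
    return np.mean(draw_dist < target_dist[:, np.newaxis], axis=1)

def ecdf(ranks, grid):
    # Empirical CDF of a 1D array of ranks evaluated on a grid in [0, 1]
    return np.mean(ranks[:, np.newaxis] <= grid[np.newaxis, :], axis=0)

# Toy model: theta ~ N(0, 1), x ~ N(theta, 1), so theta | x ~ N(x / 2, 1 / 2)
rng = np.random.default_rng(0)
targets = rng.normal(size=(200, 2))
x = targets + rng.normal(size=targets.shape)
estimates = x[:, None, :] / 2 + np.sqrt(0.5) * rng.normal(size=(200, 100, 2))

frac = fractional_ranks(estimates, targets)   # shape (200, 2), one rank per parameter
dist = distance_ranks(estimates, targets)     # shape (200,), one joint rank

grid = np.linspace(0.0, 1.0, 101)
ecdf_vals = ecdf(frac[:, 0], grid)
ecdf_diff = ecdf_vals - grid                  # the quantity shown when difference=True
```

For a well-calibrated approximator, both kinds of ranks should be approximately uniform on [0, 1], so ecdf_diff should stay close to zero.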
- Parameters:
- estimates : np.ndarray of shape (n_data_sets, n_post_draws, n_params)
The posterior draws obtained for each of the n_data_sets data sets.
- targets : np.ndarray of shape (n_data_sets, n_params)
The prior draws used to generate the n_data_sets data sets.
- difference : bool, optional, default: False
If True, plots the ECDF difference. Enables a more dynamic visualization range.
- stacked : bool, optional, default: False
If True, all ECDFs will be plotted on the same plot. If False, each ECDF will have its own subplot, similar to the behavior of calibration_histogram.
- rank_type : str, optional, default: 'fractional'
If 'fractional' (default), the ranks are computed as the fraction of posterior samples that are smaller than the corresponding prior draw. If 'distance', the ranks are computed as the fraction of posterior samples that are closer to a reference point (by default, the origin) than the prior draw is. You can pass a reference array in the same shape as the estimates array by setting targets in the ranks_kwargs dictionary. This is motivated by [2].
- variable_keys : list or None, optional, default: None
Select keys from the dictionaries provided in estimates and targets. By default, select all keys.
- variable_names : list or None, optional, default: None
The parameter names for nice plot titles. Inferred if None. Only relevant if stacked=False.
- figsize : tuple or None, optional, default: None
The figure size passed to the matplotlib constructor. Inferred if None.
- label_fontsize : int, optional, default: 16
The font size of the x-label and y-label texts.
- legend_fontsize : int, optional, default: 14
The font size of the legend text
- title_fontsize : int, optional, default: 18
The font size of the title text. Only relevant if stacked=False
- tick_fontsize : int, optional, default: 12
The font size of the axis ticklabels
- rank_ecdf_color : str, optional, default: '#132a70'
The color to use for the rank ECDFs
- fill_color : str, optional, default: 'grey'
The fill color of the simultaneous ECDF bands.
- num_row : int, optional, default: None
The number of rows for the subplots. Dynamically determined if None.
- num_col : int, optional, default: None
The number of columns for the subplots. Dynamically determined if None.
- **kwargs : dict, optional, default: {}
Keyword arguments can be passed to control the behavior of ECDF simultaneous band computation through the ecdf_bands_kwargs dictionary. See simultaneous_ecdf_bands for keyword arguments. Moreover, additional keyword arguments can be passed to control the behavior of the rank computation through the ranks_kwargs dictionary.
- Returns:
- f : plt.Figure
The figure instance for optional saving.
- Raises:
- ShapeError
If the shapes of estimates and targets deviate from the expected shapes.
- ValueError
If an unknown rank_type is passed.
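A possible way to call the function, assuming BayesFlow is installed and exposes it under bayesflow.diagnostics as in the signature above. The helpers make_toy_draws and plot_toy_calibration, the toy Gaussian model, and the variable names are hypothetical and serve only to produce arrays of the documented shapes.

```python
import numpy as np

def make_toy_draws(n_data_sets=200, n_post_draws=100, n_params=2, seed=0):
    # Toy conjugate model (an assumption for illustration):
    # theta ~ N(0, 1), x ~ N(theta, 1), hence theta | x ~ N(x / 2, 1 / 2)
    rng = np.random.default_rng(seed)
    targets = rng.normal(size=(n_data_sets, n_params))
    x = targets + rng.normal(size=targets.shape)
    noise = rng.normal(size=(n_data_sets, n_post_draws, n_params))
    estimates = x[:, None, :] / 2 + np.sqrt(0.5) * noise
    return estimates, targets

def plot_toy_calibration(estimates, targets):
    # Imported lazily so the data helper above works without BayesFlow
    import bayesflow as bf

    return bf.diagnostics.calibration_ecdf(
        estimates=estimates,
        targets=targets,
        variable_names=["theta_1", "theta_2"],  # hypothetical parameter names
        difference=True,   # plot the ECDF difference
        stacked=False,     # one subplot per parameter
    )

estimates, targets = make_toy_draws()
```

Calling plot_toy_calibration(estimates, targets) should return the figure described under Returns, which can then be saved, for example with its savefig method.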