calibration_error
- bayesflow.diagnostics.calibration_error(estimates: Mapping[str, np.ndarray] | np.ndarray, targets: Mapping[str, np.ndarray] | np.ndarray, variable_keys: Sequence[str] = None, variable_names: Sequence[str] = None, resolution: int = 20, aggregation: Callable = np.median, min_quantile: float = 0.005, max_quantile: float = 0.995) -> Mapping[str, Any]
Computes an aggregate score for the marginal calibration error over an ensemble of approximate posteriors. The calibration error is the aggregate (e.g., median), taken over multiple alphas in (0, 1), of the absolute deviation between the nominal level of an alpha-CI and the relative number of inliers, i.e., the fraction of targets that fall inside the alpha-CI computed from estimates.
- Parameters:
- estimates : np.ndarray of shape (num_datasets, num_draws, num_variables)
The random draws from the approximate posteriors over num_datasets.
- targets : np.ndarray of shape (num_datasets, num_variables)
The corresponding ground-truth values sampled from the prior.
- variable_keys : Sequence[str], optional (default = None)
Select keys from the dictionaries provided in estimates and targets. By default, select all keys.
- variable_names : Sequence[str], optional (default = None)
Optional variable names to show in the output.
- resolution : int, optional, default: 20
The number of credibility intervals (CIs) to consider.
- aggregation : callable or None, optional, default: np.median
The function used to aggregate the marginal calibration errors. If None is provided, the per-alpha calibration errors are returned.
- min_quantile : float in (0, 1), optional, default: 0.005
The minimum posterior quantile to consider.
- max_quantile : float in (0, 1), optional, default: 0.995
The maximum posterior quantile to consider.
- Returns:
- result : dict
Dictionary containing:
- “values” : float or np.ndarray
The aggregated calibration error per variable.
- “metric_name” : str
The name of the metric (“Calibration Error”).
- “variable_names” : str
The (inferred) variable names.
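
The quantity described above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helper name `calibration_error_sketch` is hypothetical, it accepts plain arrays only (no dictionaries, `variable_keys`, or `variable_names`), it returns a bare array rather than the result dictionary, and it assumes that `min_quantile` and `max_quantile` bound the grid of nominal levels (alphas) and that each alpha-CI is a central interval. The toy data come from a conjugate Gaussian model chosen here purely for illustration.

```python
import numpy as np


def calibration_error_sketch(estimates, targets, resolution=20,
                             aggregation=np.median,
                             min_quantile=0.005, max_quantile=0.995):
    """Hypothetical sketch of a marginal calibration error (arrays only)."""
    # Assumption: the nominal levels form an even grid between the two bounds.
    alphas = np.linspace(min_quantile, max_quantile, resolution)
    # Assumption: central intervals, so each tail holds (1 - alpha) / 2.
    tail = (1.0 - alphas) / 2.0
    # Interval bounds per alpha: (resolution, num_datasets, num_variables).
    lower = np.quantile(estimates, tail, axis=1)
    upper = np.quantile(estimates, 1.0 - tail, axis=1)
    # A target is an inlier if it falls inside the interval for that alpha.
    inliers = (lower <= targets[None]) & (targets[None] <= upper)
    # Relative number of inliers per alpha and variable.
    coverage = inliers.mean(axis=1)
    # Absolute deviation between empirical coverage and nominal level.
    abs_err = np.abs(coverage - alphas[:, None])
    if aggregation is None:
        return abs_err  # per-alpha errors, shape (resolution, num_variables)
    return aggregation(abs_err, axis=0)  # shape (num_variables,)


# Toy model: theta ~ N(0, 1), x | theta ~ N(theta, 1),
# so the exact posterior is N(x / 2, 1 / 2).
rng = np.random.default_rng(0)
num_datasets, num_draws, num_variables = 500, 1000, 2
targets = rng.normal(size=(num_datasets, num_variables))
obs = targets + rng.normal(size=targets.shape)
post_mean, post_std = obs / 2.0, np.sqrt(0.5)
noise = rng.normal(size=(num_datasets, num_draws, num_variables))

# Draws from the exact posterior, and from a deliberately too-narrow one.
calibrated = post_mean[:, None, :] + post_std * noise
overconfident = post_mean[:, None, :] + 0.3 * post_std * noise

err_good = calibration_error_sketch(calibrated, targets)
err_bad = calibration_error_sketch(overconfident, targets)
per_alpha = calibration_error_sketch(calibrated, targets, aggregation=None)
print("calibrated:", err_good)
print("overconfident:", err_bad)
```

In this sketch the exact posterior should give errors close to zero, while the too-narrow posterior covers the targets less often than its nominal levels claim and so gives a clearly larger error. Passing `aggregation=None` returns one error per alpha and variable instead of one per variable.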