ModelComparisonApproximator#
- class bayesflow.approximators.ModelComparisonApproximator(*args, **kwargs)[source]#
Bases: Approximator
Defines an approximator for model (simulator) comparison, where the (discrete) posterior model probabilities are learned with a classifier.
Uses a ScoringRuleNetwork with a CrossEntropyScore to map summary/condition inputs to class logits and train via categorical cross-entropy.
- Parameters:
- num_models : int
Number of models (simulators) that the approximator will compare.
- classifier_network : keras.Layer
The network backbone (e.g., an MLP) that is used for model classification. Internally wrapped in a ScoringRuleNetwork with a CrossEntropyScore. The input to the classifier network is created by concatenating inference_conditions and the (optional) output of the summary_network.
- adapter : bf.adapters.Adapter, optional
Adapter for data pre-processing. If None (default), an identity adapter is used that makes a shallow copy and passes data through unchanged.
- summary_network : bf.networks.SummaryNetwork, optional
The summary network used for data summarization (default is None). The input of the summary network is summary_variables.
- standardize : str | Sequence[str] | None
The variables to standardize before passing to the networks. Can be any subset of [“inference_conditions”, “summary_variables”] (default is None, since model indices are one-hot encoded and should not be standardized).
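Example:
A minimal construction sketch; the MLP backbone and the DeepSet summary network are illustrative assumptions, not requirements of the API:
import bayesflow as bf
import keras

approximator = bf.approximators.ModelComparisonApproximator(
    num_models=3,
    classifier_network=keras.Sequential([
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(64, activation="relu"),
    ]),
    summary_network=bf.networks.DeepSet(),
)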
- build_dataset(*, dataset: PyDataset = None, simulator: ModelComparisonSimulator = None, simulators: Sequence[Simulator] = None, **kwargs) OnlineDataset[source]#
- compute_metrics(inference_variables: Tensor, inference_conditions: Tensor = None, summary_variables: Tensor = None, sample_weight: Tensor = None, summary_attention_mask: Tensor = None, summary_mask: Tensor = None, inference_attention_mask: Tensor = None, inference_mask: Tensor = None, stage: str = 'training') dict[str, Tensor][source]#
Computes loss and tracks metrics for the classifier and summary networks.
This method coordinates summary metric computation (if present), combines summary outputs with inference conditions, computes classifier logits and cross-entropy loss via the CrossEntropyScore, and aggregates all tracked metrics into a single dictionary.
- Parameters:
- inference_variables : Tensor
One-hot encoded model indices (targets for classification).
- inference_conditions : Tensor, optional
Conditioning variables for the classifier network (default is None). May be combined with summary network outputs if present.
- summary_variables : Tensor, optional
Input tensor(s) for the summary network (default is None). Required if a summary network is present.
- sample_weight : Tensor, optional
Weighting tensor for metric computation (default is None).
- summary_attention_mask : Tensor, optional
Attention mask forwarded to the summary network (default is None).
- summary_mask : Tensor, optional
Padding / key mask forwarded to the summary network (default is None).
- inference_attention_mask : Tensor, optional
Accepted for API consistency but unused (model comparison uses an MLP classifier).
- inference_mask : Tensor, optional
Padding / key mask forwarded to the classifier network (default is None).
- stage : str, optional
Current training stage (e.g., “training”, “validation”, “inference”). Controls certain metric computations (default is “training”).
- Returns:
- metrics : dict[str, Tensor]
Dictionary containing the total loss under the key “loss”, as well as all tracked metrics for the classifier and summary networks. Each metric key is prefixed to indicate its source.
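Example:
A hedged sketch of calling this method directly (e.g., for debugging); it assumes the approximator has already been built, and all shapes (64 draws, 3 models, 5-dimensional conditions) are illustrative:
import numpy as np
import keras

# One-hot targets for 64 draws from 3 candidate models.
targets = keras.ops.convert_to_tensor(
    np.eye(3)[np.random.randint(0, 3, size=64)].astype("float32")
)
conditions = keras.ops.convert_to_tensor(
    np.random.normal(size=(64, 5)).astype("float32")
)

metrics = approximator.compute_metrics(
    inference_variables=targets,
    inference_conditions=conditions,
    stage="validation",
)
print(metrics["loss"])  # scalar categorical cross-entropy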
- fit(*, adapter: Adapter = 'auto', dataset: PyDataset = None, simulator: ModelComparisonSimulator = None, simulators: Sequence[Simulator] = None, **kwargs)[source]#
Trains the approximator on the provided dataset, or on data generated on demand from the given (multi-model) simulator. If dataset is not provided, a dataset is built from the simulator. If simulator is not provided, it will be built from the list of simulators. If the model has not been built, it will be built using a batch from the dataset.
- Parameters:
- adapter : Adapter or ‘auto’, optional
The data adapter that will make the simulated / real outputs neural-network friendly.
- dataset : keras.utils.PyDataset, optional
A dataset containing simulations for training. If provided, simulator must be None.
- simulator : ModelComparisonSimulator, optional
A simulator used to generate a dataset. If provided, dataset must be None.
- simulators : Sequence[Simulator], optional
A list of simulators (one simulator per model). If provided, dataset must be None.
- **kwargs
Additional keyword arguments passed to keras.Model.fit(), as described in:
- https://github.com/keras-team/keras/blob/v3.13.2/keras/src/backend/tensorflow/trainer.py#L314
- Returns:
- keras.callbacks.History
A history object containing the training loss and metrics values.
- Raises:
- ValueError
If dataset is provided together with simulator or simulators, or if none of them is provided.
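Example:
A usage sketch, assuming simulator_a, simulator_b, and simulator_c are user-defined Simulator objects (one per candidate model); epochs and batch_size are illustrative keyword arguments forwarded to dataset construction and keras.Model.fit():
history = approximator.fit(
    simulators=[simulator_a, simulator_b, simulator_c],
    epochs=10,
    batch_size=64,
)
print(history.history["loss"][-1])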
- get_config()[source]#
Returns the config of the object.
An object config is a Python dictionary (serializable) containing the information needed to re-instantiate it.
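Example:
A serialization round-trip sketch, assuming all constituent networks are themselves serializable:
config = approximator.get_config()
clone = bf.approximators.ModelComparisonApproximator.from_config(config)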
- predict(*, conditions: Mapping[str, ndarray], probs: bool = True, **kwargs) ndarray[source]#
Predicts posterior model probabilities given input conditions. The conditions dictionary is preprocessed using the adapter. The output is converted to a NumPy array after inference.
- Parameters:
- conditions : Mapping[str, np.ndarray]
Dictionary of conditioning variables as NumPy arrays.
- probs : bool, optional
A flag indicating whether model probabilities (True) or logits (False) are returned. Default is True.
- **kwargs : dict
Additional keyword arguments for the adapter and classifier.
- Returns:
- outputs : np.ndarray
Predicted posterior model probabilities given conditions.
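Example:
A sketch assuming the adapter expects a key “observables” holding 200 data sets of 50 bivariate observations each; the key and shapes are illustrative:
import numpy as np

conditions = {"observables": np.random.normal(size=(200, 50, 2))}
probs = approximator.predict(conditions=conditions)  # shape (200, num_models)
logits = approximator.predict(conditions=conditions, probs=False)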
- __call__(*args, **kwargs)#
Call self as a function.
- add_loss(loss)#
Can be called inside of the call() method to add a scalar loss.
Example:
class MyLayer(Layer):
    def call(self, x):
        self.add_loss(ops.sum(x))
        return x
- add_metric(*args, **kwargs)#
- add_variable(shape, initializer, dtype=None, trainable=True, autocast=True, regularizer=None, constraint=None, name=None)#
Add a weight variable to the layer.
Alias of add_weight().
- add_weight(shape=None, initializer=None, dtype=None, trainable=True, autocast=True, regularizer=None, constraint=None, aggregation='none', overwrite_with_gradient=False, name=None)#
Add a weight variable to the layer.
- Args:
- shape: Shape tuple for the variable. Must be fully-defined
(no None entries). Defaults to () (scalar) if unspecified.
- initializer: Initializer object to use to populate the initial
variable value, or string name of a built-in initializer (e.g. “random_normal”). If unspecified, defaults to “glorot_uniform” for floating-point variables and to “zeros” for all other types (e.g. int, bool).
- dtype: Dtype of the variable to create, e.g. “float32”. If
unspecified, defaults to the layer’s variable dtype (which itself defaults to “float32” if unspecified).
- trainable: Boolean, whether the variable should be trainable via
backprop or whether its updates are managed manually. Defaults to True.
- autocast: Boolean, whether to autocast the layer’s variables when
accessing them. Defaults to True.
- regularizer: Regularizer object to call to apply penalty on the
weight. These penalties are summed into the loss function during optimization. Defaults to None.
- constraint: Constraint object to call on the variable after any
optimizer update, or string name of a built-in constraint. Defaults to None.
- aggregation: Optional string, one of None, “none”, “mean”,
“sum” or “only_first_replica”. Annotates the variable with the type of multi-replica aggregation to be used for this variable when writing custom data parallel training loops. Defaults to “none”.
- overwrite_with_gradient: Boolean, whether to overwrite the variable
with the computed gradient. This is useful for float8 training. Defaults to False.
- name: String name of the variable. Useful for debugging purposes.
- build(data_shapes: Mapping[str, tuple[int] | Mapping[str, Mapping]])#
Template method for building all network components.
This method orchestrates the build process by:
1. Building the summary network (if present) and caching its output shape
2. Enriching data_shapes with computed values for hooks to access
3. Calling hook methods in the proper sequence
4. Marking as built
Hooks receive an enriched data_shapes dict that includes “_summary_outputs” if a summary network was built, so they don’t need to recompute this value.
- classmethod build_adapter(inference_variables: str | Sequence[str], inference_conditions: str | Sequence[str] = None, summary_variables: str | Sequence[str] = None, sample_weight: str = None, summary_attention_mask: str = None, summary_mask: str = None, inference_attention_mask: str = None, inference_mask: str = None) Adapter#
Create a default Adapter for the approximator.
Handles the common pipeline shared by all approximators: to_array -> convert_dtype -> concatenate -> keep. Subclasses can call super().build_adapter(...) and apply additional steps to the returned adapter.
- Parameters:
- inference_variables : str or Sequence[str]
Names of the inference variables in the data dict.
- inference_conditions : str or Sequence[str], optional
Names of the inference conditions in the data dict.
- summary_variables : str or Sequence[str], optional
Names of the summary variables in the data dict.
- sample_weight : str, optional
Name of the sample weight variable.
- summary_attention_mask : str, optional
Name of the attention mask for the summary network. Forwarded as attention_mask to the summary network.
- summary_mask : str, optional
Name of the padding/key mask for the summary network. Forwarded as mask to the summary network.
- inference_attention_mask : str, optional
Name of the attention mask for the inference network. Forwarded as attention_mask to the inference network.
- inference_mask : str, optional
Name of the padding/key mask for the inference network. Forwarded as mask to the inference network.
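Example:
A sketch of assembling the default pipeline; the data-dictionary keys are illustrative:
adapter = bf.approximators.ModelComparisonApproximator.build_adapter(
    inference_variables="model_indices",
    inference_conditions="design_variables",
    summary_variables="observables",
)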
- build_from_config(config)#
Builds the layer’s states with the supplied config dict.
By default, this method calls the build(config[“input_shape”]) method, which creates weights based on the layer’s input shape in the supplied config. If your config contains other information needed to load the layer’s state, you should override this method.
- Args:
config: Dict containing the input shape associated with this layer.
- build_from_data(adapted_data: Mapping[str, Any])#
Build the approximator from adapted data by extracting shapes.
- call(*args, **kwargs)#
- compile(*args, inference_metrics: Any = None, summary_metrics: Any = None, **kwargs)#
Compile the approximator, setting metrics on inference and summary networks if provided.
- Parameters:
- inference_metrics : keras.Metric or Sequence[keras.Metric], optional
Metric(s) to set on the inference_network.
- summary_metrics : keras.Metric or Sequence[keras.Metric], optional
Metric(s) to set on the summary_network (if present).
- *args, **kwargs
Additional arguments passed to the parent compile method.
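Example:
A sketch that attaches a classification metric to the classifier; the optimizer and metric choices are illustrative:
import keras

approximator.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    inference_metrics=[keras.metrics.CategoricalAccuracy(name="accuracy")],
)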
- compile_from_config(config)#
Compile the approximator from a saved configuration.
- property compute_dtype#
The dtype of the computations performed by the layer.
- compute_loss(x=None, y=None, y_pred=None, sample_weight=None, training=True)#
Compute the total loss, validate it, and return it.
Subclasses can optionally override this method to provide custom loss computation logic.
Example:
class MyModel(Model):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.loss_tracker = metrics.Mean(name='loss')

    def compute_loss(self, x, y, y_pred, sample_weight, training=True):
        loss = ops.mean((y_pred - y) ** 2)
        loss += ops.sum(self.losses)
        self.loss_tracker.update_state(loss)
        return loss

    def reset_metrics(self):
        self.loss_tracker.reset_state()

    @property
    def metrics(self):
        return [self.loss_tracker]

inputs = layers.Input(shape=(10,), name='my_input')
outputs = layers.Dense(10)(inputs)
model = MyModel(inputs, outputs)
model.add_loss(ops.sum(outputs))

optimizer = SGD()
model.compile(optimizer, loss='mse', steps_per_execution=10)
dataset = ...
model.fit(dataset, epochs=2, steps_per_epoch=10)
print(f"Custom loss: {model.loss_tracker.result()}")
- Args:
- x: Input data.
- y: Target data.
- y_pred: Predictions returned by the model (output of model(x)).
- sample_weight: Sample weights for weighting the loss function.
- training: Whether we are training or evaluating the model.
- Returns:
The total loss as a scalar tensor, or None if no loss results (which is the case when called by Model.test_step).
- compute_loss_and_updates(trainable_variables, non_trainable_variables, metrics_variables, x, y, sample_weight, training=False, optimizer_variables=None)#
This method is stateless and is intended for use with jax.grad.
- compute_mask(inputs, previous_mask)#
- compute_output_shape(*args, **kwargs)#
- compute_output_spec(*args, **kwargs)#
- count_params()#
Count the total number of scalars composing the weights.
- Returns:
An integer count.
- property dtype#
Alias of layer.variable_dtype.
- property dtype_policy#
- evaluate(x=None, y=None, batch_size=None, verbose='auto', sample_weight=None, steps=None, callbacks=None, return_dict=False, **kwargs)#
Returns the loss value & metrics values for the model in test mode.
Computation is done in batches (see the batch_size arg.)
- Args:
- x: Input data. It can be:
- A NumPy array (or array-like), or a list of arrays (in case the model has multiple inputs).
- A backend-native tensor, or a list of tensors (in case the model has multiple inputs).
- A dict mapping input names to the corresponding array/tensors, if the model has named inputs.
- A keras.utils.PyDataset returning (inputs, targets) or (inputs, targets, sample_weights).
- A tf.data.Dataset yielding (inputs, targets) or (inputs, targets, sample_weights).
- A torch.utils.data.DataLoader yielding (inputs, targets) or (inputs, targets, sample_weights).
- A Python generator function yielding (inputs, targets) or (inputs, targets, sample_weights).
- y: Target data. Like the input data x, it can be either NumPy
array(s) or backend-native tensor(s). If x is a keras.utils.PyDataset, tf.data.Dataset, torch.utils.data.DataLoader or a Python generator function, y should not be specified since targets will be obtained from x.
- batch_size: Integer or None.
Number of samples per batch of computation. If unspecified, batch_size will default to 32. Do not specify the batch_size if your input data x is a keras.utils.PyDataset, tf.data.Dataset, torch.utils.data.DataLoader or Python generator function since they generate batches.
- verbose: “auto”, 0, 1, or 2. Verbosity mode.
0 = silent, 1 = progress bar, 2 = single line. “auto” becomes 1 for most cases. Note that the progress bar is not particularly useful when logged to a file, so verbose=2 is recommended when not running interactively (e.g. in a production environment). Defaults to “auto”.
- sample_weight: Optional NumPy array or tensor of weights for
the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) NumPy array or tensor with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D NumPy array or tensor with shape (samples, sequence_length) to apply a different weight to every timestep of every sample. This argument is not supported when x is a keras.utils.PyDataset, tf.data.Dataset, torch.utils.data.DataLoader or Python generator function. Instead, provide sample_weights as the third element of x. Note that sample weighting does not apply to metrics specified via the metrics argument in compile(). To apply sample weighting to your metrics, you can specify them via the weighted_metrics in compile() instead.
- steps: Integer or None.
Total number of steps (batches of samples) to draw before declaring the evaluation round finished. If steps is None, it will run until x is exhausted. In the case of an infinitely repeating dataset, it will run indefinitely.
- callbacks: List of keras.callbacks.Callback instances.
List of callbacks to apply during evaluation.
- return_dict: If True, loss and metric results are returned as a
dict, with each key being the name of the metric. If False, they are returned as a list.
- Returns:
Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics).
Note: When using compiled metrics, evaluate() may return multiple submetric values, while model.metrics_names often lists only top-level names (e.g., ‘loss’, ‘compile_metrics’), leading to a length mismatch. The order of the evaluate() output corresponds to the order of metrics specified during model.compile(). You can use this order to map the evaluate() results to the intended metric. model.metrics_names itself will still return only the top-level names.
- export(filepath, format='tf_saved_model', verbose=None, input_signature=None, **kwargs)#
Export the model as an artifact for inference.
- Args:
- filepath: str or pathlib.Path object. The path to save the
artifact.
- format: str. The export format. Supported values:
“tf_saved_model”, “onnx”, “openvino”, and “litert”. Defaults to “tf_saved_model”.
- verbose: bool. Whether to print a message during export. Defaults
to None, which uses the default value set by different backends and formats.
- input_signature: Optional. Specifies the shape and dtype of the
model inputs. Can be a structure of keras.InputSpec, tf.TensorSpec, backend.KerasTensor, or backend tensor. If not provided, it will be automatically computed. Defaults to None.
- **kwargs: Additional keyword arguments.
- is_static: Optional bool. Specific to the JAX backend and
format=”tf_saved_model”. Indicates whether fn is static. Set to False if fn involves state updates (e.g., RNG seeds and counters).
- jax2tf_kwargs: Optional dict. Specific to the JAX backend
and format=”tf_saved_model”. Arguments for jax2tf.convert; see the jax2tf.convert documentation. If native_serialization and polymorphic_shapes are not provided, they will be automatically computed.
- opset_version: Optional int. Specific to format=”onnx”.
An integer value that specifies the ONNX opset version.
- LiteRT-specific options: Optional keyword arguments specific
to format=”litert”. These are passed directly to the TensorFlow Lite converter and include options like optimizations, representative_dataset, experimental_new_quantizer, allow_custom_ops, enable_select_tf_ops, etc. See TensorFlow Lite documentation for all available options.
Note: This feature is currently supported only with TensorFlow, JAX and Torch backends.
Note: Be aware that the exported artifact may contain information from the local file system when using format=”onnx”, verbose=True and Torch backend.
Examples:
Here’s how to export a TensorFlow SavedModel for inference.
# Export the model as a TensorFlow SavedModel artifact
model.export("path/to/location", format="tf_saved_model")

# Load the artifact in a different process/environment
reloaded_artifact = tf.saved_model.load("path/to/location")
predictions = reloaded_artifact.serve(input_data)
Here’s how to export an ONNX artifact for inference.
# Export the model as an ONNX artifact
model.export("path/to/location", format="onnx")

# Load the artifact in a different process/environment
ort_session = onnxruntime.InferenceSession("path/to/location")
ort_inputs = {
    k.name: v for k, v in zip(ort_session.get_inputs(), input_data)
}
predictions = ort_session.run(None, ort_inputs)
Here’s how to export a LiteRT (TFLite) artifact for inference.
# Export the model as a LiteRT artifact
model.export("path/to/location", format="litert")

# Load the artifact in a different process/environment
interpreter = tf.lite.Interpreter(model_path="path/to/location")
interpreter.allocate_tensors()
interpreter.set_tensor(
    interpreter.get_input_details()[0]['index'], input_data
)
interpreter.invoke()
output_data = interpreter.get_tensor(
    interpreter.get_output_details()[0]['index']
)
- classmethod from_config(config, custom_objects=None)#
Deserialize and instantiate an approximator from configuration.
- get_build_config()#
Returns a dictionary with the layer’s input shape.
This method returns a config dict that can be used by build_from_config(config) to create all states (e.g. Variables and Lookup tables) needed by the layer.
By default, the config only contains the input shape that the layer was built with. If you’re writing a custom layer that creates state in an unusual way, you should override this method to make sure this state is already created when Keras attempts to load its value upon model loading.
- Returns:
A dict containing the input shape associated with the layer.
- get_compile_config()#
Serialize compile configuration for all network metrics.
Collects metrics from inference_network and summary_network (if present), serializes them, and merges with parent class config.
- Returns:
- dict
Configuration dictionary with serialized metrics.
- get_layer(name=None, index=None)#
Retrieves a layer based on either its name (unique) or index.
If name and index are both provided, index will take precedence. Indices are based on order of horizontal graph traversal (bottom-up).
- Args:
- name: String, name of layer.
- index: Integer, index of layer.
- Returns:
A layer instance.
- get_metrics_result()#
Returns the model’s metrics values as a dict.
If any of the metric results is a dict (containing multiple metrics), each of them gets added to the top level returned dict of this method.
- Returns:
A dict containing values of the metrics listed in self.metrics. Example: {‘loss’: 0.2, ‘accuracy’: 0.7}.
- get_quantization_layer_structure(mode=None)#
Returns the quantization structure for the model.
This method is intended to be overridden by model authors to provide topology information required for structure-aware quantization modes like ‘gptq’.
- Args:
mode: The quantization mode.
- Returns:
A dictionary describing the topology, e.g.: {‘pre_block_layers’: [list], ‘sequential_blocks’: [list]} or None if the mode does not require structure or is not supported. ‘pre_block_layers’ is a list of layers that the inputs should be passed through, before being passed to the sequential blocks. For example, inputs to an LLM must first be passed through an embedding layer, followed by the transformer.
- get_state_tree(value_format='backend_tensor')#
Retrieves tree-like structure of model variables.
This method allows retrieval of different model variables (trainable, non-trainable, optimizer, and metrics). The variables are returned in a nested dictionary format, where the keys correspond to the variable names and the values are the nested representations of the variables.
- Args:
- value_format: One of “backend_tensor”, “numpy_array”. The kind of array to return as the leaves of the nested state tree.
- Returns:
- dict: A dictionary containing the nested representations of the requested variables. The keys are the variable names, and the values are the corresponding nested dictionaries.
Example:
model = keras.Sequential([
    keras.Input(shape=(1,), name="my_input"),
    keras.layers.Dense(1, activation="sigmoid", name="my_dense"),
], name="my_sequential")
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(np.array([[1.0]]), np.array([[1.0]]))
state_tree = model.get_state_tree()
The state_tree dictionary returned looks like:
{
    'metrics_variables': {
        'loss': {
            'count': ...,
            'total': ...,
        },
        'mean_absolute_error': {
            'count': ...,
            'total': ...,
        }
    },
    'trainable_variables': {
        'my_sequential': {
            'my_dense': {
                'bias': ...,
                'kernel': ...,
            }
        }
    },
    'non_trainable_variables': {},
    'optimizer_variables': {
        'adam': {
            'iteration': ...,
            'learning_rate': ...,
            'my_sequential_my_dense_bias_momentum': ...,
            'my_sequential_my_dense_bias_velocity': ...,
            'my_sequential_my_dense_kernel_momentum': ...,
            'my_sequential_my_dense_kernel_velocity': ...,
        }
    }
}
- get_weights()#
Return the values of layer.weights as a list of NumPy arrays.
- property input#
Retrieves the input tensor(s) of a symbolic operation.
Only returns the tensor(s) corresponding to the first time the operation was called.
- Returns:
Input tensor or list of input tensors.
- property input_dtype#
The dtype layer inputs should be converted to.
- property input_spec#
- jax_state_sync()#
- property jit_compile#
- property layers#
- load_own_variables(store)#
Loads the state of the layer.
You can override this method to take full control of how the state of the layer is loaded upon calling keras.models.load_model().
- Args:
store: Dict from which the state of the model will be loaded.
- load_weights(filepath, skip_mismatch=False, **kwargs)#
Load the weights from a single file or sharded files.
Weights are loaded based on the network’s topology. This means the architecture should be the same as when the weights were saved. Note that layers that don’t have weights are not taken into account in the topological ordering, so adding or removing layers is fine as long as they don’t have weights.
Partial weight loading
If you have modified your model, for instance by adding a new layer (with weights) or by changing the shape of the weights of a layer, you can choose to ignore errors and continue loading by setting skip_mismatch=True. In this case any layer with mismatching weights will be skipped. A warning will be displayed for each skipped layer.
Sharding
When loading sharded weights, it is important to specify a filepath that ends with *.weights.json, which is used as the configuration file. Additionally, the sharded files *_xxxxx.weights.h5 must be in the same directory as the configuration file.
- Args:
- filepath: str or pathlib.Path object. Path from which the weights
will be loaded. When sharding, the filepath must end in .weights.json.
- skip_mismatch: Boolean, whether to skip loading of layers where
there is a mismatch in the number of weights, or a mismatch in the shape of the weights.
Example:
# Load the weights in a single file.
model.load_weights("model.weights.h5")

# Load the weights in sharded files.
model.load_weights("model.weights.json")
- property losses#
List of scalar losses from add_loss, regularizers and sublayers.
- make_predict_function(force=False)#
- make_test_function(force=False)#
- make_train_function(force=False)#
- property metrics#
List of all metrics.
- property metrics_names#
- property metrics_variables#
List of all metric variables.
- property non_trainable_variables#
List of all non-trainable layer state.
This extends layer.non_trainable_weights to include all state used by the layer including state for metrics and `SeedGenerator`s.
- property non_trainable_weights#
List of all non-trainable weight variables of the layer.
These are the weights that should not be updated by the optimizer during training. Unlike layer.non_trainable_variables, this excludes metric state and random seeds.
- property output#
Retrieves the output tensor(s) of a layer.
Only returns the tensor(s) corresponding to the first time the operation was called.
- Returns:
Output tensor or list of output tensors.
- property path#
The path of the layer.
If the layer has not been built yet, it will be None.
- predict_on_batch(x)#
Returns predictions for a single batch of samples.
- Args:
x: Input data. It must be array-like.
- Returns:
NumPy array(s) of predictions.
- predict_step(state, data)#
- property quantization_mode#
The quantization mode of this layer, None if not quantized.
- quantize(mode=None, config=None, filters=None, **kwargs)#
Quantize the weights of the model.
Note that the model must be built first before calling this method. quantize will recursively call quantize(…) in all layers and will be skipped if the layer doesn’t implement the function.
This method can be called by passing a mode string, which uses the default configuration for that mode. Alternatively, a config object can be passed to customize the behavior of the quantization (e.g. to use specific quantizers for weights or activations).
- Args:
- mode: The mode of the quantization. Supported modes are:
“int8”, “int4”, “float8”, “gptq”. This is optional if config is provided.
- config: The configuration object specifying additional
quantization options. This argument allows configuring the weight and activation quantizers and must be an instance of keras.quantizers.QuantizationConfig.
- filters: Optional filters to apply to the quantization. Can be a
regex string, a list of regex strings, or a callable. Only the layers which match the filter conditions will be quantized.
- **kwargs: Additional keyword arguments.
Example:
Quantize a model to int8 with default configuration:
# Build the model
model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(10),
])
model.build((None, 10))

# Quantize with default int8 config
model.quantize("int8")
Quantize a model to int8 with a custom configuration:
from keras.quantizers import Int8QuantizationConfig
from keras.quantizers import AbsMaxQuantizer

# Build the model
model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(10),
])
model.build((None, 10))

# Create a custom config
config = Int8QuantizationConfig(
    weight_quantizer=AbsMaxQuantizer(
        axis=0, value_range=(-127, 127)
    ),
    activation_quantizer=AbsMaxQuantizer(
        axis=-1, value_range=(-127, 127)
    ),
)

# Quantize with custom config
model.quantize(config=config)
- quantized_build(input_shape, mode)#
- quantized_call(*args, **kwargs)#
- rematerialized_call(layer_call, *args, **kwargs)#
Enable rematerialization dynamically for layer’s call method.
- Args:
layer_call: The original call method of a layer.
- Returns:
Rematerialized layer’s call method.
- reset_metrics()#
- property run_eagerly#
- save(filepath, overwrite=True, zipped=None, **kwargs)#
Saves a model as a .keras file.
Note that model.save() is an alias for keras.saving.save_model().
The saved .keras file contains:
- The model’s configuration (architecture)
- The model’s weights
- The model’s optimizer’s state (if any)
Thus models can be reinstantiated in the exact same state.
- Args:
- filepath: str or pathlib.Path object.
The path where to save the model. Must end in .keras (unless saving the model as an unzipped directory via zipped=False).
- overwrite: Whether we should overwrite any existing model at
the target location, or instead ask the user via an interactive prompt.
- zipped: Whether to save the model as a zipped .keras
archive (default when saving locally), or as an unzipped directory (default when saving on the Hugging Face Hub).
Example:
model = keras.Sequential([
    keras.layers.Dense(5, input_shape=(3,)),
    keras.layers.Softmax(),
])
model.save("model.keras")
loaded_model = keras.saving.load_model("model.keras")
x = keras.random.uniform((10, 3))
assert np.allclose(model.predict(x), loaded_model.predict(x))
- save_own_variables(store)#
Saves the state of the layer.
You can override this method to take full control of how the state of the layer is saved upon calling model.save().
- Args:
store: Dict where the state of the model will be saved.
- save_weights(filepath, overwrite=True, max_shard_size=None)#
Saves all weights to a single file or sharded files.
By default, the weights will be saved in a single .weights.h5 file. If sharding is enabled (max_shard_size is not None), the weights will be saved in multiple files, each with a size at most max_shard_size (in GB). Additionally, a configuration file .weights.json will contain the metadata for the sharded files.
The saved sharded files contain:
- *.weights.json: The configuration file containing ‘metadata’ and
‘weight_map’.
- *_xxxxxx.weights.h5: The sharded files containing only the
weights.
- Args:
- filepath: str or pathlib.Path object. Path where the weights
will be saved. When sharding, the filepath must end in .weights.json. If .weights.h5 is provided, it will be overridden.
- overwrite: Whether to overwrite any existing weights at the target
location or instead ask the user via an interactive prompt.
- max_shard_size: int or float. Maximum size in GB for each
sharded file. If None, no sharding will be done. Defaults to None.
Example:
# Instantiate an EfficientNetV2L model with about 454MB of weights.
model = keras.applications.EfficientNetV2L(weights=None)

# Save the weights in a single file.
model.save_weights("model.weights.h5")

# Save the weights in sharded files. Using `max_shard_size=0.25` means
# each sharded file will be at most ~250MB.
model.save_weights("model.weights.json", max_shard_size=0.25)

# Load the weights in a new model with the same architecture.
loaded_model = keras.applications.EfficientNetV2L(weights=None)
loaded_model.load_weights("model.weights.h5")
x = keras.random.uniform((1, 480, 480, 3))
assert np.allclose(model.predict(x), loaded_model.predict(x))

# Load the sharded weights in a new model with the same architecture.
loaded_model = keras.applications.EfficientNetV2L(weights=None)
loaded_model.load_weights("model.weights.json")
x = keras.random.uniform((1, 480, 480, 3))
assert np.allclose(model.predict(x), loaded_model.predict(x))
- set_state_tree(state_tree)#
Assigns values to variables of the model.
This method takes a dictionary of nested variable values, which represents the state tree of the model, and assigns them to the corresponding variables of the model. The dictionary keys represent the variable names (e.g., ‘trainable_variables’, ‘optimizer_variables’), and the values are nested dictionaries containing the variable paths and their corresponding values.
- Args:
- state_tree: A dictionary representing the state tree of the model.
The keys are the variable names, and the values are nested dictionaries representing the variable paths and their values.
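Example:
A round-trip sketch; model and clone are placeholders for two models with identical architectures:
state_tree = model.get_state_tree(value_format="numpy_array")
clone.set_state_tree(state_tree)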
- set_weights(weights)#
Sets the values of layer.weights from a list of NumPy arrays.
- property standardize_layers#
Shortcut to the standardizer’s per-variable layers.
- stateless_call(trainable_variables, non_trainable_variables, *args, return_losses=False, **kwargs)#
Call the layer without any side effects.
- Args:
- trainable_variables: List of trainable variables of the model.
- non_trainable_variables: List of non-trainable variables of the model.
- *args: Positional arguments to be passed to call().
- return_losses: If True, stateless_call() will return the list of losses created during call() as part of its return values.
- **kwargs: Keyword arguments to be passed to call().
- Returns:
- A tuple. By default, returns (outputs, non_trainable_variables).
If return_losses = True, then returns (outputs, non_trainable_variables, losses).
Note: non_trainable_variables include not only non-trainable weights such as BatchNormalization statistics, but also RNG seed state (if there are any random operations part of the layer, such as dropout), and Metric state (if there are any metrics attached to the layer). These are all elements of state of the layer.
Example:
model = ...
data = ...
trainable_variables = model.trainable_variables
non_trainable_variables = model.non_trainable_variables

# Call the model with zero side effects
outputs, non_trainable_variables = model.stateless_call(
    trainable_variables,
    non_trainable_variables,
    data,
)

# Attach the updated state to the model
# (until you do this, the model is still in its pre-call state).
for ref_var, value in zip(
    model.non_trainable_variables, non_trainable_variables
):
    ref_var.assign(value)
- stateless_compute_loss(trainable_variables, non_trainable_variables, metrics_variables, x=None, y=None, y_pred=None, sample_weight=None, training=True)#
- stateless_compute_metrics(trainable_variables: Any, non_trainable_variables: Any, metrics_variables: Any, data: dict[str, Any], stage: str = 'training') tuple[Array, tuple]#
Stateless forward pass used as the jax.value_and_grad target.
All model state is injected via keras.StatelessScope so that JAX can differentiate through the computation.
- Parameters:
- trainable_variables : Any
Current trainable weight values.
- non_trainable_variables : Any
Current non-trainable variable values (e.g. batch-norm statistics).
- metrics_variables : Any
Current metric tracking variable values.
- data : dict[str, Any]
Input data dictionary passed to compute_metrics().
- stage : str, default "training"
"training" or "validation".
- Returns:
- loss : jax.Array
Scalar loss for gradient computation.
- aux : tuple
(metrics_dict, updated_non_trainable_variables, updated_metrics_variables).
- stateless_test_step(state: tuple, data: dict[str, Any]) tuple[dict[str, Array], tuple]#
Stateless validation step.
- Parameters:
- state : tuple
(trainable_variables, non_trainable_variables, metrics_variables).
- data : dict[str, Any]
Input data for validation.
- Returns:
- metrics : dict[str, jax.Array]
Computed evaluation metrics.
- state : tuple
Updated state tuple.
- stateless_train_step(state: tuple, data: dict[str, Any]) tuple[dict[str, Array], tuple]#
Stateless training step with jax.value_and_grad.
Computes gradients via jax.value_and_grad on stateless_compute_metrics() and applies the optimizer update statelessly.
- Parameters:
- state : tuple
(trainable_variables, non_trainable_variables, optimizer_variables, metrics_variables).
- data : dict[str, Any]
Input data for training.
- Returns:
- metrics : dict[str, jax.Array]
Computed training metrics.
- state : tuple
Updated state tuple.
- summarize(conditions: Mapping[str, ndarray], **kwargs) ndarray#
Computes the learned summary statistics of given summary variables.
The conditions dictionary is preprocessed using the adapter and passed through the summary network.
- Parameters:
- conditions : Mapping[str, np.ndarray]
Dictionary of simulated or real quantities as NumPy arrays.
- **kwargs : dict
Additional keyword arguments for the adapter and the summary network.
- Returns:
- summaries : np.ndarray
The learned summary statistics. Returns None if no summary network is present.
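Example:
A sketch assuming the adapter maps a key “observables” to summary_variables; the key and shapes are illustrative:
import numpy as np

summaries = approximator.summarize(
    {"observables": np.random.normal(size=(10, 50, 2))}
)
print(summaries.shape)  # (10, summary_dim), set by the summary network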
- summary(line_length=None, positions=None, print_fn=None, expand_nested=False, show_trainable=False, layer_range=None)#
Prints a string summary of the network.
- Args:
- line_length: Total length of printed lines
(e.g. set this to adapt the display to different terminal window sizes).
- positions: Relative or absolute positions of log elements
in each line. If not provided, becomes [0.3, 0.6, 0.70, 1.]. Defaults to None.
- print_fn: Print function to use. By default, prints to stdout.
If stdout doesn’t work in your environment, change to print. It will be called on each line of the summary. You can set it to a custom function in order to capture the string summary.
- expand_nested: Whether to expand the nested models.
Defaults to False.
- show_trainable: Whether to show if a layer is trainable.
Defaults to False.
- layer_range: a list or tuple of 2 strings,
which is the starting layer name and ending layer name (both inclusive) indicating the range of layers to be printed in summary. It also accepts regex patterns instead of exact names. In this case, the start predicate will be the first element that matches layer_range[0] and the end predicate will be the last element that matches layer_range[1]. By default None considers all layers of the model.
- Raises:
ValueError: if summary() is called before the model is built.
- property supports_masking#
Whether this layer supports computing a mask using compute_mask.
- symbolic_call(*args, **kwargs)#
- test_on_batch(x, y=None, sample_weight=None, return_dict=False)#
Test the model on a single batch of samples.
- Args:
- x: Input data. Must be array-like.
- y: Target data. Must be array-like.
- sample_weight: Optional array of the same length as x, containing weights to apply to the model’s loss for each sample. In the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample.
- return_dict: If True, loss and metric results are returned as a
dict, with each key being the name of the metric. If False, they are returned as a list.
- Returns:
A scalar loss value (when no metrics and return_dict=False), a list of loss and metric values (if there are metrics and return_dict=False), or a dict of metric and loss values (if return_dict=True).
- test_step(*args, **kwargs)#
Alias for stateless_test_step() (required by keras.Model.fit()).
- to_json(**kwargs)#
Returns a JSON string containing the network configuration.
To load a network from a JSON save file, use keras.models.model_from_json(json_string, custom_objects={…}).
- Args:
- **kwargs: Additional keyword arguments to be passed to
json.dumps().
- Returns:
A JSON string.
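Example:
A round-trip sketch; note that the JSON string captures only the configuration, not the weights:
import keras

json_string = model.to_json()
reloaded = keras.models.model_from_json(json_string)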
- train_on_batch(x, y=None, sample_weight=None, class_weight=None, return_dict=False)#
Runs a single gradient update on a single batch of data.
- Args:
- x: Input data. Must be array-like.
- y: Target data. Must be array-like.
- sample_weight: Optional array of the same length as x, containing weights to apply to the model’s loss for each sample. In the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample.
- class_weight: Optional dictionary mapping class indices (integers)
to a weight (float) to apply to the model’s loss for the samples from this class during training. This can be useful to tell the model to “pay more attention” to samples from an under-represented class. When class_weight is specified and targets have a rank of 2 or greater, either y must be one-hot encoded, or an explicit final dimension of 1 must be included for sparse class labels.
- return_dict: If True, loss and metric results are returned as a
dict, with each key being the name of the metric. If False, they are returned as a list.
- Returns:
A scalar loss value (when no metrics and return_dict=False), a list of loss and metric values (if there are metrics and return_dict=False), or a dict of metric and loss values (if return_dict=True).
- train_step(*args, **kwargs)#
Alias for stateless_train_step() (required by keras.Model.fit()).
- property trainable#
Settable boolean, whether this layer should be trainable or not.
- property trainable_variables#
List of all trainable layer state.
This is equivalent to layer.trainable_weights.
- property trainable_weights#
List of all trainable weight variables of the layer.
These are the weights that get updated by the optimizer during training.
- property variable_dtype#
The dtype of the state (weights) of the layer.
- property variables#
List of all layer state, including random seeds.
This extends layer.weights to include all state used by the layer including `SeedGenerator`s.
Note that metrics variables are not included here, use metrics_variables to visit all the metric variables.
- property weights#
List of all weight variables of the layer.
Unlike layer.variables, this excludes metric state and random seeds.