Introduce weights sharding #21022

james77777778 · 2025-03-13T08:28:19Z

Continuing the work based on #19286

This PR introduces max_shard_size in Model.save_weights.

Behind the scenes, this PR refactors H5IOStore by combining it with H5Entry. This change allows for more fine-grained control when storing weights. Specifically, it enables the creation of a new shard file once the current shard file reaches its capacity due to incoming weights.

Compatibility has been verified, but please let me know if anything was overlooked.
The test for LoRA weights saving/loading in KerasHub has been included.

Ping @mattdangerw and @divyashreepathihalli for requesting this feature.

A simple demo script:

import os

import numpy as np

from keras import applications

# EfficientNetB0 is about 20.3MB.
model = applications.EfficientNetB0(weights=None, input_shape=(224, 224, 3))
ref_input = np.random.random((1, 224, 224, 3)).astype("float32")
ref_output = model.predict(ref_input)

# `max_shard_size` is in GB. 0.015 means about 15MB per shard.
model.save_weights("model.weights.json", max_shard_size=0.015)
files = [x for x in os.listdir(".")]
assert "model.weights.json" in files
assert "model_00000.weighs.h5" in files
assert "model_00001.weighs.h5" in files
print("Sharded weights saved successfully!")

# Load the sharded weights with the new instance.
model = applications.EfficientNetB0(weights=None, input_shape=(224, 224, 3))
model.load_weights("model.weights.json", sharded=True)
np.testing.assert_allclose(model.predict(ref_input), ref_output, atol=1e-6)
print("Passed!")

The outputs:

1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step
Sharded weights saved successfully!
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 996ms/step
Passed!

The format of .weights.json (similar to HF's format):

{
    "metadata": {
        "total_size": 476111392.0
    },
    "weight_map": {
        "/vars": "model_00000.weighs.h5",
        "/layers/input_layer/vars": "model_00000.weights.h5",
        "/layers/rescaling/vars": "model_00000.weights.h5",
        "/layers/conv2d/vars": "model_00000.weights.h5",
        "/layers/batch_normalization/vars": "model_00000.weights.h5",
...

codecov-commenter · 2025-03-13T08:38:05Z

Codecov Report

Attention: Patch coverage is 85.14286% with 26 lines in your changes missing coverage. Please review.

Project coverage is 82.65%. Comparing base (624c00b) to head (7464cdc).

Files with missing lines	Patch %	Lines
keras/src/saving/saving_lib.py	86.41%	12 Missing and 10 partials ⚠️
keras/src/saving/saving_api.py	63.63%	2 Missing and 2 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #21022      +/-   ##
==========================================
+ Coverage   82.47%   82.65%   +0.17%     
==========================================
  Files         563      563              
  Lines       53861    53975     +114     
  Branches     8361     8386      +25     
==========================================
+ Hits        44424    44615     +191     
+ Misses       7395     7291     -104     
- Partials     2042     2069      +27

Flag	Coverage Δ
keras	`82.47% <85.14%> (+0.17%)`	⬆️
keras-jax	`64.01% <85.14%> (+0.20%)`	⬆️
keras-numpy	`59.07% <84.57%> (+0.31%)`	⬆️
keras-openvino	`32.67% <11.42%> (-0.05%)`	⬇️
keras-tensorflow	`64.33% <85.14%> (+0.20%)`	⬆️
keras-torch	`64.05% <85.14%> (+0.20%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

mattdangerw

This looks great to me! @fchollet probably worth looking at this one!

My main question was on the loading side. Do we need to ask people to pass sharded=True, can we remove the arg or leave it none and infer by default?

keras/src/models/model.py

mattdangerw · 2025-03-13T21:24:25Z

keras/src/saving/saving_lib.py

-        if "dtype" in value.attrs and value.attrs["dtype"] == "bfloat16":
-            value = np.array(value, dtype=ml_dtypes.bfloat16)
-        return value
+    def __delitem__(self, key):


Is this new? Why do we need del?

I just added it because it might be helpful. I'm considering making the store object more dict-like, but it's subject to change.

Got it. I'd probably ditch any code paths that we don't actively use or test, as they might easily break with some future changes. But no strong opinion

Tests have been added, which should now guarantee the code quality.

mattdangerw · 2025-03-13T21:27:38Z

keras/src/models/model.py

+                location or instead ask the user via an interactive prompt.
+            max_shard_size: `int` or `float`. Maximum size in GB for each
+                sharded file. If `None`, no sharding will be done. Defaults to
+                `None`.
        """


Might be worth adding some code examples, for those who don't want to read and just want to see how to shard :)

Sounds good!

I have added an example:

# Instantiate a EfficientNetV2L model with about 454MB of weights. model = keras.applications.EfficientNetV2L(weights=None) # Save the weights in a single file. model.save_weights("model.weights.h5") # Save the weights in sharded files. Use `max_shard_size=0.25` means # each sharded file will be at most ~250MB. model.save_weights("model.weights.json", max_shard_size=0.25) # Load the weights in a new model with the same architecture. loaded_model = keras.applications.EfficientNetV2L(weights=None) loaded_model.load_weights("model.weights.h5") x = keras.random.uniform((1, 480, 480, 3)) assert np.allclose(model.predict(x), loaded_model.predict(x)) # Load the sharded weights in a new model with the same architecture. loaded_model = keras.applications.EfficientNetV2L(weights=None) loaded_model.load_weights("model.weights.json") x = keras.random.uniform((1, 480, 480, 3)) assert np.allclose(model.predict(x), loaded_model.predict(x))

Instead of

# each sharded file will be at most ~250MB model.save_weights("model.weights.json", max_shard_size=0.25)

How about for better readability.

# each sharded file will be at most ~250MB model.save_weights("model.weights.json", max_shard_size='250MB')

I agree that using a string has better readability but I think @fchollet wanted it to be an int.
Maybe we can support both int and string?
#19286 (comment)

I think we can start with a int and extend later? Let's stick with what we got!

keras/src/saving/saving_lib.py

…-sharding

fchollet

Great work -- LGTM

mattdangerw

Thanks! This looks great to me

keras/src/saving/saving_lib.py

keras/src/models/model.py

mattdangerw · 2025-03-18T16:26:09Z

keras/src/models/model.py

+                location or instead ask the user via an interactive prompt.
+            max_shard_size: `int` or `float`. Maximum size in GB for each
+                sharded file. If `None`, no sharding will be done. Defaults to
+                `None`.
        """


I think we can start with a int and extend later? Let's stick with what we got!

fchollet · 2025-03-18T21:55:34Z

The test failure is unrelated. Will wait for Matt's comments to be addressed before merging.

…-sharding

…dH5IOStore`.

james77777778 · 2025-03-19T02:31:59Z

@mattdangerw @fchollet
I have resolved all the comments and added tests for the low-level operations of H5IOStore and ShardedH5IOStore.

* Add random_posterization processing layer (#20688) * Add random_posterization processing layer * Add test cases * correct failed case * Fix torch gpu CI (#20696) * Add random_sharpness processing layer (#20697) * Add random_sharpness.py * Update random_sharpness * Add test cases * Fix failed test case * Add random_shear processing layer (#20702) * Add random_shear processing layer * Update method name * Fix failed test case * Fix failed test case * Fix failed test case * Fix the aggregation in the codebase (#20703) * Bump the github-actions group with 2 updates (#20707) Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action). Updates `actions/upload-artifact` from 4.4.3 to 4.5.0 - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882...6f51ac03b9356f520e9adb1b1b7802705f340c2b) Updates `github/codeql-action` from 3.27.5 to 3.28.0 - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/f09c1c0a94de965c15400f5634aa42fac8fb8f88...48ab28a6f5dbc2a99bf1e0131198dd8f1df78169) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-minor dependency-group: github-actions - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-minor dependency-group: github-actions ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix: Torch MPS backend failing test (#20709) * implement transform_bounding_boxes for random_shear (#20704) * Fix torch GPU CI * Update BackupAndRestore class example (#20714) * Update BackupAndRestore class example * Update backup_and_restore.py --------- Co-authored-by: François Chollet <[email protected]> * Update version number * Refactor `keras/src/export/export_lib` and add `export_onnx` (#20710) * Refactor export_lib and add export_onnx Add tf2onnx requirements * Add onnxruntime dep * Update numpy dep * Resolve comments * Patch `tf2onnx` to ensure compatibility with `numpy>=2.0.0` (#20725) * Patch tf2onnx to support numpy 2 * Fix warnings * Update export_onnx * Add build method to supress warning (#20729) * Specify window_length dtype requirement in tf.keras.ops.istft in math.py (#20728) The `window_length` parameter in `tf.keras.ops.istft` requires `tf.int32` dtype, but this isn't documented. This can cause unexpected `ValueError` when using `tf.int64` and `tf.int16` Here is the Example case: ``` import tensorflow as tf input_dict = { 'stfts': tf.constant([[-0.87817144+1.14583987j, -0.32066484+0.25565411j]], dtype=tf.complex128), 'frame_length': tf.constant(256, dtype=tf.int16), 'frame_step': tf.constant(5120,dtype=tf.int64) } result = tf.signal.inverse_stft(**input_dict) print(result) ``` The code throws the following error: ``` ValueError: window_length: Tensor conversion requested dtype int32 for Tensor with dtype int64 ``` * Add rand_augment processing layer (#20716) * Add rand_augment init * Update rand_augment init * Add rand_augment * Add NotImplementedError * Add some test cases * Fix failed test case * Update rand_augment * Update rand_augment test * Fix random_rotation bug * Add build method to supress warning. * Add implementation for transform_bboxes * Fixing batch_dim_name attribute (#20674) * fixing wrong trainer assumption that batch dim is always the first one in the mesh * need functools partial * lint * fix test failure when distribution=None * lint2 * fix for test failure * added data sharding for 3D+ meshes * lint3 * added @property for batch_dim_name + refactoring * fix typo * Add support for `dtype` / `DTypePolicy` to `JaxLayer` and `FlaxLayer`. (#20732) The `dtype` / `DTypePolicy` is applied to all float variables. * Allow dynamic shape in `STFTSpectrogram` layer. (#20736) by simply using `ops.shape(x)` instead of `x.shape`. * Remove duplicate export tests in `model_test`. (#20735) The same tests exist at: - https://github.com/keras-team/keras/blob/master/keras/src/export/saved_model_test.py#L66 - https://github.com/keras-team/keras/blob/master/keras/src/export/onnx_test.py#L62 The goal is to isolate the use of `onnxruntime` to a single file, `onnx_test.py`. * Add OpenVINO into README.md (#20739) * Add OpenVINO into README.md Signed-off-by: Kazantsev, Roman <[email protected]> * Update README.md --------- Signed-off-by: Kazantsev, Roman <[email protected]> * Multiple Example Title has removed in metrics.MeanIoU method (#20738) Multiple Example Title has removed in metrics.MeanIoU method * Fix JAX GPU CI and make formatter happy (#20749) * Fix JAX GPU CI * Makes formatter happy * Makes formatter happy - 2 * Add checks to deserialization. (#20751) In particular for functional models. * feat(ops): Add keras.ops.numpy.rot90 operation (#20723) (#20745) * feat(ops): Add keras.ops.image.rot90 operation Adds a new operation to rotate tensors by 90 degrees in the specified plane: - Implements rot90 operation in keras.ops.image module - Adds support for multiple rotations (k parameter) and custom axes - Matches numpy.rot90 behavior and API for consistency - Adds comprehensive test coverage including batch images support - Handles input validation for tensor dimensions and axes - Supports symbolic tensor execution The operation follows the same interface as numpy.rot90 and tf.image.rot90: rot90(array, k=1, axes=(0, 1)) * feat: add JAX, NumPy and PyTorch backends for rot90 Add implementations of rot90() for multiple backend frameworks: - JAX backend implementation - NumPy backend implementation - PyTorch backend implementation * Move rot90 from image to numpy ops Move rot90 operation to numpy.py files in backend implementations since it's a numpy op (https://numpy.org/doc/stable/reference/generated/numpy.rot90.html). Now exported as both keras.ops.rot90 and keras.ops.numpy.rot90. * Fix dtype conflict in PyTorch backend's rot90 function Resolved the 'Invalid dtype: object' error by explicitly using to avoid naming conflicts with the custom function. * Replace experimental NumPy rot90 with core TF ops Replace tf.experimental.numpy.rot90 with core TF ops for XLA compatibility. Use convert_to_tensor for input handling. * Fix code format * Fix code format following ruff update * Fix Torch GPU CI * Update API ref * Fix flaky `JaxLayer` test. (#20756) The `DTypePolicy` test produces lower precision results. * Fix serialization of domain packages. (#20755) Not all of their symbols are exported. * Preliminary parts needed for ragged support, including densification. (#20757) Added `ragged` option to `KerasTensor`, `InputLayer` and `convert_to_tensor`. The logic is the same as for sparse tensors. Fixes https://github.com/keras-team/keras/issues/20731 * Disallow pickle loading in npz files * Implemented more generic asset tracking mechanism in saved model export. (#20758) This new implementation is in line with what was done in Keras 2. It tracks all `TrackableResource`s, and lookup tables and hashmaps are subclasses of `TrackableResource`. This allows users to attach preprocessing functions that are not solely based on Keras preprocessing layers. * [Keras Ops] Add einops-style `rearrange()` to `keras.ops` (#20733) * Add einops-style rearrange to keras.ops.einops * Address PR comments * Add any_symbolic_tensors() check on call * Pass all arguments in symbolic_call * Remove constructor and fix call * Add basic couple of tests * Add more tests * Add examples to docstring * Skip tests if backend is openvino * Remove numpy from tests in lieu of keras.ops * Skip tests for openvino when the testing operation isn't supported * Remove all type annotations for consistency. (#20762) Some tools don't like the mix of code with and without type hints. * Porting TF fake_quant_with_min_max functions (#20641) * QAT (squashed this time) (#1) * adds fake_quant_with_min_max functions from TF to keras3 * Addresses PR review comments * drops another type hint * swaps out if statements, change float() to ops.cast and adds fake_quant_with_min_max_vars function * fix missed if statement, adds gradient tests via main function for tf and torch * fix unbound variable error when not using torch or tf backend (#2) Refactor to use backend specific gradient functions in tests and merges logic into single function * More QAT function revisions (#3) This PR addresses review feedback to fix implementation and to move tests to using named_parameters rather than individual functions. * Qat revisions (#4) Adds axis param and fixes logic for per channel function * updated implementation * removed redundant functions * Add aug_mix processing layer (#20759) * Add implementation for AugMix * Update implementation for aug_mix * Update description for aug_mix * Fix some issues that was from review * `JaxLayer` now uses the global dtype policy by default. (#20767) All floats will now follow the global dtype policy unless a specific dtype policy is passed to the layer. * fix(ops): Fix issue with map_coordinates for uint8 dtype (#20768) The issue arose from improper handling of out-of-bound coordinates, causing invalid indexing when using dtype='uint8' with TensorFlow backend. Changes made: - Improved processing of coordinates to handle all fill_mode cases, including 'reflect', correctly. - Simplified the logic for gathering and applying fill values, ensuring consistent behavior across data types. - Added test cases for uint8, float32, and various fill_mode settings to validate the fix. Tests for uint8 and float32 now succeed, and the logic for nearest fill_mode and manual casting is also fixed. Fixes #20608 * Multiple Example Title has been removed in metrics.BinaryIoU (#20775) * fix(ops): Fix inconsistent padding calculation in PyTorch backend ops (#20774) * Fix "same" padding torch issue * format * fix type * add condition for channels first and last * fix(ops): Fix inconsistent padding calculation in PyTorch backend ops Was able to still reproduce the error, the PyTorch backend had inconsistent behavior between static shape inference and dynamic execution for pooling operations, particularly with 'same' padding and non-unit strides, figured that the root cause was by incorrect padding calculation logic that didn't properly handle asymmetric padding cases. Key changes: - Rewrote _compute_padding_length() to handle stride-based padding - Fixed padding calculation to properly support asymmetric padding cases - Standardize channels_first/channels_last conversion in pooling ops - Cleaned up padding application in _apply_same_padding() - Added proper handling of data_format throughout pooling pipeline This fixes the issue where MaxPooling2D with 'same' padding would produce different shapes between compute_output_shape() and actual execution (e.g. (1,5,2,2) vs (1,5,2,1)). Rebased on top of Sachin's September 2024 PR to incorporate latest keras:master changes. --------- Co-authored-by: sachin prasad <[email protected]> * Improve `fake_quant_with_min_max_vars` (#20772) * Fix fake_quant_with_min_max_vars * Add FakeQuantWithMinMaxVars operation and use shortcut for TF backend. * Fix memory leaks in `model.evaluate`. (#20779) The history is only used in `model.fit`, no need to create it for `evaluate` and `predict`. The history is attached to the model and therefore lives for as long as the model is around. The executor used in `CallbackList` was never shut down, causing it to keep a thread around, which in turn had thread locals that were leaked. * fix(applications): Improve validation and error handling for ConvNeXt weights and fix broadcasting in EfficientNetV2 (#20785) * fix(applications): Improve validation and error handling for ConvNeXt weights - Validate architecture and weights compatibility before API request. - Enhance error messages for mismatched model name and weights. * fix: Correct spurious change, and fix mean/variance shapes for channels_first preprocessing in EfficientNetV2 - Reshaped mean and variance tensors to [1,3,1,1] for proper broadcasting in channels_first mode. - Ensured compatibility with channels_last format while addressing broadcasting errors. * fix ciou implementation bug (#20784) * Add cut_mix processing layer (#20776) * Add cut_mix processing layer * Update implementation * Update logic and refactoring * correct test case failed. * Update cut_mix.py * correct gpu test case failed. --------- Co-authored-by: François Chollet <[email protected]> * Add random_invert layer (#20787) * fix(metrics): Fix BinaryAccuracy metric to handle boolean inputs (#20782) * Fix BinaryAccuracy metric to handle boolean inputs Previously, BinaryAccuracy would return incorrect results when given boolean inputs in JAX backend, and would raise errors in TensorFlow backend. This was because the metric expects numerical values (floats/integers) but wasn't properly handling boolean array inputs. Fix by casting y_true and y_pred to floatx() in MeanMetricWrapper.update_state(). This ensures consistent behavior across backends and proper handling of boolean inputs. * fix: Make the linter happy :) * fix: Align update_state casting with metric's data type * Fix issue with Masking layer with Tensor as `mask_value` (#20791) * Fix issue with Masking layer with Tensor as `mask_value` * fix formatting * Fix reference to nonexistent namespace (#20810) The error message produced when using, for example, a tensorflow math operation in a layer referenced a nonexistent keras.operations namespace (which makes fixing the issue a lot more difficult for newcomers, given that they will encounter it while following examples from the book Deep Learning with Python, 2nd edition). The correct name of the implied namespace is keras.ops. * extract metrics update logic into a helper method (#20805) this change will allow users to customize what happens in the step function while being able to use existing metrics update logic without needing to duplicate it Co-authored-by: Zoe Kendall <[email protected]> * Turn the attribute `_return_attention_scores` into an argument (#20803) * Add random_erasing layer (#20798) * Add initial random_erasing * Update random_erasing logic * Update description and add test case * fix value range bug * add seed for random fill_value * fix torch backend resize with `pad_to_aspectio_ratio` is set to `True` (#20797) * fix torch backend resize with `pad_to_aspectio_ratio` is set to `True` * fix axis for single image * Fix issue for when running gpu * add missing device type * add unit test when pad_to_aspect_ratio set to True * fix numpy backend * nit * fix api method * fix if condition for channels_first * Update fill_mode argument default value in RandomZoom class (#20796) * Update fill_mode argument default value in RansdomZoom class * Update fill_mode argument default value in RansdomZoom document * fix(ops): Handle floating-point edge cases in ops.argmax() (#20808) * fix(ops): Handle floating-point edge cases in argmax - Adjust input for negative zero values in argmax. - Modify implementation to use core ops with floating-point handling. * fix: Make the linter happy :) * fix: Resolve spurious change with TensorFlow graph mode compatibility issues - Improved negative zero handling and axis resolution with graph-compatible tensor ops. * test: Add negative zero handling test for backends (not supported for OpenVINO) * fix: Change to self.assertEqual * fix(ops): Fix ops.argmin() handling of subnormal float values in Keras backends (#20812) - Update JAX and NumPy backends to handle subnormal float comparisons - Add test case to verify subnormal float value handling * Add random_gaussian_blur layer (#20817) * Add random_gaussian_blur * Update description and add test cases * Correct failed test case * fix(layers): Fix incorrect masked mean/variance in BatchNormalization layer (#20815) * fix(layers): Fix incorrect masked mean/variance in BatchNormalization layer Update masked moments calculation to properly account for broadcast dimensions when summing mask weights. Added test to verify broadcast mask handling produces zero-centered outputs. * change: skip test for OpenVINO * fix: Fix OpenVINO compatibility in BatchNormalization layer ops - Convert tuple reduction axes to list format for compatibility with OpenVINO's constant op - Remove OpenVINO skip decorator after fixing axis format * fix: Normalize reduction_axes to list during build Avoid repeated type checks and conversions during forward pass. * fix: Double type-casting * Update SECURITY.md * Fix for deserializing custom functions serialized with Keras <= 3.6. (#20824) Fixes https://github.com/keras-team/keras/issues/20806 This a workaround for an incompatibility between 3.6 and 3.7 introduced by serialization bug fix https://github.com/keras-team/keras/pull/20406 * Fix jax version (#20827) * Update requirements-jax-cuda.txt jax version * Update requirements-jax-cuda.txt * Fix CI breakage with torch-xla. (#20828) Error with torch 2.6: ``` ImportError: /opt/hostedtoolcache/Python/3.9.21/x64/lib/python3.9/site-packages/_XLAC.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN5torch8autograd12VariableInfoC1ERKN2at6TensorE /opt/hostedtoolcache/Python/3.9.21/x64/lib/python3.9/site-packages/torch_xla/__init__.py:20: Impo ``` * Add `signbit` and fix `argmin` and `argmax` (#20821) * Add `signbit` op and fix `argmin` and `argmax`. * Add APIs * Fix CI * Fix torch CI * Simplify the logic * Fix TF GPU CI * Pin version of torch-xla to 2.5.1. (#20834) This is needed to make it compatible with the pinned version of torch we're using. Note that torch-xla 2.6 doesn't support GPU https://pypi.org/project/torch-xla/2.6.0/ GPU support will be coming back with 2.7. * fix(trainers): Add support for DistributedDatasetsFromFunction in data adapters (#20829) The is_tf_dataset() function in data adapters now recognizes DistributedDatasetsFromFunction as a valid TensorFlow dataset type. This allows for properly handling distributed datasets created via strategy.distribute_datasets_from_function() - Added test case to verify distributed datasets from function support * Bump the github-actions group with 2 updates (#20840) Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action). Updates `actions/upload-artifact` from 4.5.0 to 4.6.0 - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/6f51ac03b9356f520e9adb1b1b7802705f340c2b...65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08) Updates `github/codeql-action` from 3.28.0 to 3.28.8 - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/48ab28a6f5dbc2a99bf1e0131198dd8f1df78169...dd746615b3b9d728a6a37ca2045b68ca76d4841a) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-minor dependency-group: github-actions - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * update random_erasing factor description. (#20837) * [OpenVINO backend] Provide more granular tests exclusion mechanism (#20845) * [OpenVINO backend] Provide more granular tests exclusion mechanism This mechanism is required for open-source community who will provide PRs for each operation. In order to validate PR with the concrete operation support, they should remove the corresponding line. Signed-off-by: Kazantsev, Roman <[email protected]> * Optimize code in conftest.py Signed-off-by: Kazantsev, Roman <[email protected]> * Format code file Signed-off-by: Kazantsev, Roman <[email protected]> * Update keras/src/backend/openvino/excluded_concrete_tests.txt --------- Signed-off-by: Kazantsev, Roman <[email protected]> * Use Python 3.10 for testing environment (#20846) * Use Python 3.10 for testing environment. * Fix TF GPU CI * Update requirements-jax-cuda.txt (#20852) * Don't duplicate frozen parameters during predict() (#20851) On the Jax backend we were not using donate_argnums during predict. This works when a model is mostly trainable, but when a model is mostly or all frozen, this will result in 2x the memory jump (which is why we use donate_argnums for fit and evaluate). This change adds donate_argnums to the predict function to avoid the memory spike. But because this means all incoming state (including the trainable variables) will be deleted by jax, this means we need to sync the trainable variables state much like in fit and evaluate. An alternative would be to change the predict_step signature (so we could only donate non-trainable variables), but this would be a breaking change and confusing. * provided y_true or y_pred add labels for plot image gallery method (#20853) * Fix convnext to work with any custom input tensors (#20854) * Add applications_test.py test for custom input tensors that currently breaks convnext networks * Fix convnext to work with any custom input tensors * Fix code formatting * Fix code formatting * Fix code formatting * Prevent information leakage and improve the ONNX export for the torch backend (#20859) * Use a better setting for `verbose` and improve the onnx export for the torch backend * Fix torch CI * Add Rematerialization to Keras (#20743) * add remat op * update test * remove print statements * remove memory testing * run api_gen.sh * update docstring * add remat scope * code reformat * update scope to return all configs * add remat wrapper to layer * add output size mode * add activation mode to remat * add warnings and ops to numpy and openvino backend * fix torch implementatiopn * update tests * fix tests * update numpy and openvino# * address review comments * fix indentation * skip tests in numpy and openvino * also wrap quantized call * fix jax test * fix test * update docstring and example and expose rematscope * run api_gen * address review comments * update core.py * fix tests * update get remat mode * update exposed apis * update docstring * run api_gen.sh * address review comments * add mock tests to verify remat being called * address review comments * update quantization test * add functional model test * skip tests for numpy and openvino * update remat docstring * fix torch test * rollback changes to test * fix torch test * fix format errors * move remat wrapping logic to operations.py * change jax cuda version to see if array deallocation gets resolved * disable jax gpu test * fix jax version * Add random_perspective layer (#20836) * Add random_perspective layer * Add range check for scale * Update quote for description string * Update transform_bounding_boxes method. * Clear JAX state sharding after `fit`, `evaluate` and `predict`. (#20865) The state sharding is leaked at the end of `fit`, `evaluate` and `predict`. The values are not reused if `fit`, `evaluate` and `predict` is called again. * add backticks to docstring string code keywords (#20863) * add backticks to docstring string code keywords * Update remat.py * fix(layers): Update Conv2D docstring to clarify numerical precision across backends (#20867) * fix(layers): Update Conv2D docstring to clarify numerical precision across backends Clarify that Conv2D operations may exceed the documented 1e-7 precision difference across backends Document that large convolutions can show notable variations due to accumulated floating-point operations * Update conv2d.py --------- Co-authored-by: François Chollet <[email protected]> * Remove `torchvision` dep and simplify `resize` and `rgb_to_grayscale` in torch backend (#20868) * Remove `torchvision` dependency and simplify `resize`. * Add pillow as the testing requirement * fix time_distributed layer with mask and partial_batch_size (#20765) * fix time_distributed layer with mask and partial_batch_size * Fix test fails for non TF backends * Fix formatting issue * test case and inline import of TF * Disable testcase for Numpy backend * Fix lint error * Fix Torch GPU CI (#20877) * fix solve method on linalg (#20879) * [OpenVINO backend] Support numpy.amax and numpy.amin (#20883) Signed-off-by: Kazantsev, Roman <[email protected]> * `HashedCrossing` layer preserves the static batch size when known. (#20889) Previously, the output of `HashedCrossing` would always have `None` batch size as a result of the underlying Tensorflow `tf.sparse.cross_hashed`. The previous reshaping logic in `HashedCrossing` would fix the last dimension (expected to be 1) but not the batch dimension. * `TextVectorization` with `output_sequence_length` returns outputs with a static last dimension of `output_sequence_length`. (#20892) When handling a ragged intermediate tensor, the padding code would still be executed even though `Ragged.to_tensor` already pads correctly. Changed control flow to skip padding. When handling a dense intermediate tensor, the padding is applied from the dynamic shape. Added `set_shape` to apply the static `output_sequence_length`. * fix(ops): Fix TensorFlow backend keras.ops.rot90 shape transformation and improve test coverage (#20882) * fix(ops): Correct TF rot90 shape transformation and improve test coverage Fix shape handling in TF rot90 to correctly swap height/width dimensions based on k rotations. Refactor test suite to use parameterized test cases and cover edge conditions more thoroughly. * refactor: Make linter happy :) * fix ifft2 op with TF backend (#20905) * docs: add params to Sequential.pop docstring (#20896) * docs: add params to Sequential.pop docstring in sequential.py * Remove trailing white space in Sequential.pop docstring in sequential.py * Remove trailing white space in sequential.py * docs: add the default argument value in Sequential.pop docstring in sequential.py * sytle: reformat sequential.py with black * docs: fix Sequential.pop docstring formatting * Always allow `ExportArchive.track` to track TensorFlow resources. (#20906) Previously, `track` would only work with `Layer`s or `Model`s unless the backend was TensorFlow. It would raise an error on JAX for instance. It is now possible to export saved models with a mix of Keras models and TensorFlow native preprocessing involving resources even with the JAX backend. - Added example on how to use `ExportArchive` to export a function combining a model with some TensorFlow native preprocessing with a resource. - Added unit test testing the combining of a model with some TensorFlow native preprocessing with a resource. - Renamed `track` to `_track_layer` in backend specific `ExportArchive` classes because that is the use case. - Use `super()` instead of `BackendExportArchive` for consistency. * Add iterations property to LossScaleOptimizer (#20901) Fixes #20878. TensorBoard isn't able to report the correct step because this optimizer doesn't forward the `iterations` property. * Fix cloning with compiled sequential model (#20888) * Fix cloning with compiled sequential model * Fix cloning with compiled functional model * remove redundant code * Remove redundant code * Add perspective_transform for ops (#20899) * Add perspective_transform for ops * Add perspective_transform for torch * Add perspective_transform for jax * Add perspective_transform for ops * Add perspective_transform test * Fix failed test cases * Fix failed test on torch ci * Update random_perspective to use ops.perspective_transform (#20915) * Update get_perspective_matrix method * Update bbox logic * refactoring random_perspective * apply tensor cast * add dtype conversion * Update base scale factor * correct failed test case * correct failed test case * correct failed test case * Remove scale zero test case * update the logic to use perspective_transform on image layer * Update test cases * Only load OpenVINO excludes file when backend is "openvino". (#20923) It is not necessary to decorate excluded openvino tests with other backends. * Fix `masking_test.py` saving a file in the current folder. (#20924) Tests should only write files in a temp folder. * Recognize placer as a remote location (#20926) * Recognize placer as a remote location Recognize `/placer` paths as remote locations, allowing users to save Keras models directly to Placer paths. * Running ./shell/format.sh * [OpenVINO backend] Support arctan2. (#29010) (#20921) * support arctan2 ov backend * fix format * fix corner case: both x1 and x2 equal zero * [OpenVINO Backend] Include NumpyDtype tests (#20929) Signed-off-by: Kazantsev, Roman <[email protected]> * Remove unused dependency (#20932) * Fix failing jax remat test (#20935) * add jit compile for jax training * change to dense * [Keras Ops] Add `keras.ops.polar` operation (#20930) * Add polar operation and tests * Fix values for corectness test * Specify dtype * merge conflicts (#20934) Co-authored-by: Mohamed I. Hammad <[email protected]> * [Openvino Backend] support arange, modify dtype check (#20941) * Fix mean metrics to allow non-tensor inputs (#20954) * Fix tril/triu ops (#20900) * Fix tril/triu ops * Small change * Facepalm * Handle tensors * Add comment * Address comments * Fix `BinaryAccuracy` to handle boolean inputs. (#20956) This is a follow up to https://github.com/keras-team/keras/pull/20782 and a replacement for https://github.com/keras-team/keras/pull/20782 We cannot cast `y_pred` and `y_true` to the expected output dtype in `MeanMetricWrapper`. Some metrics expect integers (indices or IDs for instance) and fail if `y_pred` and `y_true` are provided as floats. It is the responsibility of the metric function to cast as needed. In this case, the correct approach in `BinaryAccuracy` is to use the regular type promotion rules to ensure that the comparison between `y_pred` and `threshold` is done without losing precision. `ops.greater` already does the type promotion correctly. Previously, `threshold` was incorrectly cast to the `y_pred` dtype, which in this case would lower its precision. * Add gaussian_blur for image (#20943) * Add Gaussian Blur * Add Gaussian Blur for ops * Add gaussian_blur test * Update gaussian_blur args * Correct bug for numpy implementation * Update argument base value * [OpenVINO backend] Support arctan2, pass the NumpyDtypeTest::arctan2 test (#20928) * pass NumpyDtypeTest::arctan2 and add some test cases in NumpyTwoInputOpsCorrectnessTest::arctan2 * newly add NumpyDtypeTest::test_arctan2 * Fix JAX CPU tests - saved_model_export.py (#20962) With JAX 0.5.1, `jax2tf` exports XLA that is not compatible with TensorFlow 2.18, making the `saved_model_export.py` tests fail. Since Tensorflow 2.19 is not out yet, we pin JAX to 0.5.0 for now. * Update the RandomGaussianBlur layer to utilize the image layer method (#20958) * Update random_gaussian_blur layer to use image layer method * Combine two statements into one * Add `antialias` to `layers.Resizing` and add more tests. (#20972) * fix legacy model saving & reloading with axis argument in its layer (#20973) * fix legacy model saving & relaoding with axis arg in layer * fix formatting issue * add temp_file_path * Make gaussian_blur to use scipy convolve2d (#20974) * [OpenVino BackEnd]support np.count_nonzero for ov BackEnd (#20945) * suppoer np.count_nonzero for ov BackEnd * modifing function vars to lowercase * Bump the github-actions group with 3 updates (#20975) Bumps the github-actions group with 3 updates: [ossf/scorecard-action](https://github.com/ossf/scorecard-action), [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action). Updates `ossf/scorecard-action` from 2.4.0 to 2.4.1 - [Release notes](https://github.com/ossf/scorecard-action/releases) - [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md) - [Commits](https://github.com/ossf/scorecard-action/compare/62b2cac7ed8198b15735ed49ab1e5cf35480ba46...f49aabe0b5af0936a0987cfb85d86b75731b0186) Updates `actions/upload-artifact` from 4.6.0 to 4.6.1 - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08...4cec3d8aa04e39d1a68397de0c4cd6fb9dce8ec1) Updates `github/codeql-action` from 3.28.8 to 3.28.10 - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/dd746615b3b9d728a6a37ca2045b68ca76d4841a...b56ba49b26e50535fa1e7f7db0f4f7b4bf65d80d) --- updated-dependencies: - dependency-name: ossf/scorecard-action dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [OpenVINO backend] Support numpy.append (#20951) * [OpenVINO backend] Support numpy.append Signed-off-by: Lim, Kuan Xian <[email protected]> * Remove NumpyDtype test_append_ from exclude list Signed-off-by: Lim, Kuan Xian <[email protected]> * Fix attribute error Signed-off-by: Lim, Kuan Xian <[email protected]> * Fix NumpyDtypeTest error Signed-off-by: Lim, Kuan Xian <[email protected]> * Update concat to append Signed-off-by: Lim, Kuan Xian <[email protected]> --------- Signed-off-by: Lim, Kuan Xian <[email protected]> * Fix PyTorch stateful RNN/LSTM gradient computation error resolves #20875 (#20916) * Fix PyTorch stateful RNN gradient computation error * Updates post feedback * [Keras Ops and Layer] Add keras.ops.rms_norm() and keras.layers.RMSNormalization() (#20911) * Add RMSNorm and rms_norm * math.square -> numpy.square * Update docstrings * Add RMSNormalization Layer * Update docstrings * Lint with new ruff version * Add tests for layer * Address comments * Convert to tensor if not - avoid openvino and torch typing issues if scale is scalar * address comments * Fix tests * Add reference to paper * Fix docstring to remove input_dim argument * Update layer_normalization.py --------- Co-authored-by: François Chollet <[email protected]> * Fix docstring * Update version number * Enable cuDNN RNNs when dropout is set and `training=True` (#20983) * Fix `Discretization` serialization when `num_bins` is used. (#20971) Previously, serialization / deserialization would fail if: - the layer was saved / restored before `adapt` was called - the layer was saved / restored after `adapt` was called, but the dataset was such that the number of bins learned was fewer than `num_bins` The fix consists in adding a `from_config` to handle `bin_boundaries` separately. This is because at initial creation, `bin_boundaries` and `num_bins` cannot be both set, but when restoring the layer after `adapt`, they are both set. Tightened the error checking: - never allow `num_bins` and `bin_boundaries` to be specified at the same time, even if they match (same as `tf_keras`) - don't allow `num_bins` and `bin_boundaries` to be `None` at the same time - verify that `adapt` has been called in `call` Also removed `init_bin_boundaries` as the value was never used and its presence can be inferred. * Add access to native mesh and layout distribution objects. (#20897) - Added `backend_mesh` property to `keras.distribution.DeviceMesh` to access the native mesh object. - Added `backend_layout` property to `keras.distribution.TensorLayout` to access the native layout or sharding object. The values are cached. Changed the code to access these directly instead of calling the convertion functions every time. Made the following renames so that these functions can be used in backend agnostic code: - `_to_jax_device` to `_to_backend_device` - `_to_jax_mesh` and `_to_dtensor_mesh` to `_to_backend_mesh` - `_to_jax_layout` and `_to_dtensor_layout` to `_to_backend_layout` * Don't require jax on the numpy backend (#20989) We can still use it for the resize op, but we shouldn't fail to import without jax installed. * Fixes inconsistent serialization logic for inputs (#20993) * Removes unnesting logic for input tensors in functional model deserialization flow * Adds test case for verifying nested input restoration after deserialization removes unnecessary imports * fixes imports * Fix flash attention TPU error (#20994) * Fix flash attention TPU error * fix space * fix default mask * update default mask if none check in wrapping function instead * Add optional arg for attention logits soft cap for jax tpu backend (#20999) * Fix flash attention TPU error * fix space * fix default mask * update default mask if none check in wrapping function instead * allow dot_product attention to accept optional logits soft cap value * add optional attention soft cap arg * fix test and add error message * fix import error * code reformat * remove jax dependency from numpy image layer (#21000) * Wrap tf variables in keras variables for TFSMLayer (#20995) Fixes #20955 * [OpenVINO Backend] Support numpy exp and expand_dims (#21006) Signed-off-by: Kazantsev, Roman <[email protected]> * [Good First Issue][Keras 3 OpenVINO Backend]: Support numpy.dot operation #29119 (#20982) * Implement dot operation for openvino * Enable dot tests * Add pytest.ini in the root directory * Fix style issues * Handle scaler inputs and fix code formate * Handle scaler inputs and fix code formate * Delete pytest.ini * Remove scaler handling * Handle scaler inputs * Handle scalers and style format * update scaler handling * Fix the format of the numpy.py file * Fix sytling issues * Fix sytling issues --------- Co-authored-by: Saif Mohammed <[email protected]> * Add elastic_transform processing for image.py (#20977) * Add elastic_transform for numpy * Add elastic_transform for torch * Add elastic_transform for jax * Add elastic_transform for tensorflow * Add seed generator for elastic_transform * Add interpolation args * Add fill_model and fill_value for args * Add elastic_transform for ops layer * Add test cases * Ensures that the layer is marked as built when the `build` is not overriden (#20880) * Ensure that the layer is correctly marked as built. * Add `_build_at_init` in `Layer` and use it everywhere. * Fix typos and add a test case for elastic_transform (#21007) * fix mis typo * Add test case * Re-run test case CI * [OpenVINO backend]: Support numpy.bincount (#20940) * feat: implement numpy.bincount for openvino backend rebased fix: hardcode dtype int32 when weights=none Signed-off-by: 11happy <[email protected]> fix: use np.expand_dims Signed-off-by: 11happy <[email protected]> remove unecessary headers Signed-off-by: 11happy <[email protected]> style: reformat numpy_test.py Signed-off-by: 11happy <[email protected]> * fix: correct test files Signed-off-by: 11happy <[email protected]> * fix: reshape depth to scalar Signed-off-by: 11happy <[email protected]> * fix: use reshape correctly Signed-off-by: 11happy <[email protected]> * fix: take reference from transpose impl to use scalar shape Signed-off-by: 11happy <[email protected]> * fix use squeeze Signed-off-by: 11happy <[email protected]> * revert to prv impl Signed-off-by: 11happy <[email protected]> * fix: scalar type issue Signed-off-by: 11happy <[email protected]> * refactor: reduce on rank-1 to have correct results Signed-off-by: 11happy <[email protected]> --------- Signed-off-by: 11happy <[email protected]> * Fix torch CI * [OpenVINO backend] Support numpy.argsort (#20913) * [OpenVINO backend] Support numpy.argsort * [OpenVINO backend] explicitly specify bf16 in get_ov_output from bfloat16 numpy arrays * remove NumpyOneInputOpsCorrectnessTest::test_argsort * Fix argsort to handle dynamic shapes * Fix incorrect argument in JAX flash attention. (#21014) The mask is named `array` in `NumpyMask`. * Restore variables on `fit()` interrupt with Jax backend (#21019) * restore variables on `fit()` interrupt * fix test * linter fixes * [OpenVINO backend] Support numpy.full_like (#21008) * [OpenVino BackEnd] support np.diff for ov BackEnd (#20950) * [OpenVino BackEnd] support np.diff for ov BackEnd * [OpenVino BackEnd] support np.diff for ov BackEnd * [OpenVino BackEnd] support np.diff for ov BackEnd * [OpenVino BackEnd] support np.diff for ov BackEnd * [OpenVino BackEnd] support np.diff for ov BackEnd * [OpenVino BackEnd] support np.diff for ov BackEnd * [OpenVino BackEnd] support np.diff for ov BackEnd * [OpenVINO backend] Support numpy.empty (#21010) * [OpenVINO Backend] numpy.empty implementation * fix: reformatted * fix: fixed final lint issues * fix: updated empty logic * Add RandomElasticTransform layer (#21018) * Add random_elastic_transform * Add random_elastic_transform test case * Correct random_elastic_transform failed test case * Make `import_test.py` debuggable from console output. (#21033) Previously, if no wheel was found, the `[-1]` subscript would fail, preventing the `if not whl_path` clause from outputting the error message. * Make code compatible with Numpy >= 2.1. (#21032) Starting with 2.1, the first argument of `np.reshape` is positional only. Removed keyword `a` and for consistency did the same with other backends. * Fix bitwise `left_shift` and `right_shift` result dtype... (#21034) when second argument is a constant int. Previously, a `convert_to_tensor` was applied to the second argument, making it an `int32` or `int64`. The result dtype would take into account this dtype, which could upgrade the dtype of the result. The expectation is that if the second argument is a constant, the result dtype is the same as the first argument. This is already supported correctly by all underlying backend implementations. * [OpenVINO Backend] Get back tests for exp and expand_dims to precommit (#21038) Signed-off-by: Kazantsev, Roman <[email protected]> * [Documentation] Updated Binary Focal Crossentropy Loss Docstring (#21036) * updated binary focal loss docstring * update to docstring comment * fixed typo * Fix optree regsitration (#21049) The following will break as reimporting Keras will try to re-register the Tensorflow list/dict wrappers. Presumably anything that forced an actual reimport of `keras` would trigger the same crash. ```python import keras keras.config.set_backend("tensorflow") ``` * Lion typo fix (#21056) * Add support for torch tensors on meta device (#21053) * Add support for torch tensors on meta device * Add unitary test * Fix unitary test * feat: add Categorical Generalized Cross Entropy (GCE) loss (#21024) * feat: add Categorical Generalized Cross Entropy (GCE) loss * run api generation * docs: Align docstrings with Keras style guide * docs: more docstring changes * Fix torch gpu tests. (#21063) * Introduce weights sharding (#21022) * Introduce weights sharding * Address comments and update the format of the config file. * Update docstring * Resovle comments and add more basic tests for `H5IOStore` and `ShardedH5IOStore`. * Improve `H5IOStore`. (#21067) * [Documentation] Added Dice Loss Function Example to Docstring (#21064) * added example to dice loss function * linted with ruff * Allow synchronization value to be set on Variables (#21072) And use on_read synchronization for Metric variables. * implement of muon optimizer (#21037) * implement of muons * format * renew note * api_gen * api_gen * api_gen * fix argument and args * fix argument and args * Docstring fixes for Muon optimizer. * Add pre-commit hooks (#21074) * Add pre-commit hooks * Add instructions to run pre-commit manually * Use tf.int32.min rather than relying on integer overflow (#21077) * Fix warning for random_saturation (#21066) * Fix warning for random_saturation * Update random_saturation.py * Update random_saturation.py * Update 1e-6 to epsilon() * merge master --------- Co-authored-by: François Chollet <[email protected]> * Special handling of Torch DDP in callback (#21081) * Special handling of Torch DDP in callback * Use inheritance tree for DDP check Modified DDP check to use isinstance rather than type().__name__ for robustness. Fixed additional whitepace * Fixing comment. * inlining DDP import where its needed. * fix muon document (#21079) * fix muon argument * fix muon argument * change behavior * add some test * add some test * fix * fix * [OpenVINO backend] Support numpy.log10 (#21042) * [OpenVINO backend] Support numpy.log10 * Address review feedback on log10 implementation * Fix log function and update excluded_concrete_tests.txt * Raise error if inputs are not connected with output in functional model (#20705) * Raise error if inputs are not connected with output in functional model * Fix Failing test case for unconnected inputs/outputs * fix formatting issue * Fix functional dict inputs to support optional ones (#21030) * Fix functional dict inputs to support optional ones * Add unit test for optional dict inputs * Fix unit test formatting * [OpenVino BackEnd] support np.log2 for ov BackEnd (#21048) * [OpenVino BackEnd] support np.log2 for ov BackEnd * [OpenVino BackEnd] support np.log2 for ov BackEnd * [OpenVino BackEnd] support np.log2 for ov BackEnd * [OpenVino BackEnd] support np.log2 for ov BackEnd * Fix `Model.export` to Saved Model for models with dict inputs. (#21095) Fixes https://github.com/keras-team/keras/issues/20835 Also changed multi-input tests to exercise `model.export()` and its signature inference logic. * Fix scatter_update for torch (#21101) * Refactor ModelCheckpoint Save Logic (#21100) The _save_model method combined the logic to determine if the checkpoint should be saved, and the logic to create the paths and save the checkpoint. This commit separates the check the determine whether the checkpoint should be saved from the I/O logic, and in doing so resolves two bugs in the current implementation: 1) Host directory is created for every for save iteration, regardless of whether the model will be saved or not. For example, when `save_freq == 'epoch'` and `save_best_only == True`, a folder is created for every epoch, even though the model is only saved when the monitored condition is satisfied. This results in a large number of empty folders and makes it difficult to identify the most recently saved checkpoint. With this commit, the directory to save the model or model weights is only created when necessary. 2) If save_best_only=True, and the monitored value is an np.ndarray or backend tensor, then it falls back to `save_best_only=False` and saves the model. However, in this scenario, it save saves the whole model without regard to the value of `self.save_weights_only` This commit uses consistent save logic that always checks the value of `self.save_weights_only`. * Add verification to remat tests. (#21102) The functions that go through `remat()` should actually be called, if not, remat is not really applied. * Fix Remat error when called with a model (#21094) * add print * fix remat issue * simplify code * enable traceback filtering and update the function sig * add a wrapper for activations * change to except * add layer call decorator * fix remat call * `TrackingTest` no longer assigns None to variables (#21106) JAX will soon fail when `jnp.array` is called with None, so this test will be broken under newer JAX versions if kept as is. * #21088: fixes activation layer serialization/deserialization logic (#21117) * fixes activation layer serialization logic * adds additional test case for string identifiers * makes pre-commit happy * fixed torch version issue for macOS (#21136) * Add alpha argument description to elu docstring (#21142) * [OpenVINO backend] Support numpy.expm1 (#21141) * [OpenVINO backend] Support numpy.expm1 * remove a line with NumpyOneInputOpsCorrectnessTest::test_expm1 * does nothing * does nothing * Fix Functional model graph under global dtype policy. (#21134) When constructing a Functional model with a global dtype policy, a spurious `Cast` operation would appear in the graph before each layer. This cast is part of the layer `__call__` method and should not appear separately. * Bump the github-actions group with 2 updates (#21113) Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action). Updates `actions/upload-artifact` from 4.6.1 to 4.6.2 - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/4cec3d8aa04e39d1a68397de0c4cd6fb9dce8ec1...ea165f8d65b6e75b540449e92b4886f43607fa02) Updates `github/codeql-action` from 3.28.10 to 3.28.13 - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/b56ba49b26e50535fa1e7f7db0f4f7b4bf65d80d...1b549b9259bda1cb5ddde3b41741a82a2d15a841) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update training_with_built_in_methods.py (#21098) Clarify parameter name * [OpenVINO backend]: Implement numpy.identity (#21083) * openvino backend implement numpy.identity Signed-off-by: 11happy <[email protected]> * use openvino DTYPES and exclued test Signed-off-by: 11happy <[email protected]> --------- Signed-off-by: 11happy <[email protected]> * Enable SparseCategoricalCrossentropy to accept and propagate axis (#21104) * feat: Enable SparseCategoricalCrossentropy to accept and propagate axis; minor PyTorch implementation update to support channel-first layouts * formatting * Modified Example code in numerical_utils (#21125) * Add configurable lora_alpha parameter for LoRA in multiple Keras layers (#21139) * feat: Add alpha parameter to enable_lora Adds an alpha scaling parameter to LoRA layers, defaulting to rank for backward compatibility. * feat: Add lora_alpha tests to Dense, Embedding, and EinsumDense layers * fix: Fix LoRA test failures by using ops to do numpy conversion * fix: remove .numpy() in LoRA tests * docs: Apply backticks to keywords per review Updated docstrings to enclose parameters like 'alpha' and 'rank' in backticks as requested in PR review. * Add OpenVINO backend support for argmin and argmax (#21060) * Update numpy.py * Update excluded_concrete_tests.txt * all issues fixed * Update numpy.py * numpy.py reformatted * Update excluded_concrete_tests.txt * Update excluded_concrete_tests.txt * Update excluded_concrete_tests.txt * Update excluded_concrete_tests.txt * Update excluded_concrete_tests.txt * Update excluded_concrete_tests.txt * Add support for dynamic dimensions for ops handling `tf.IndexedSlices`. (#21148) Fixes https://github.com/keras-team/keras/issues/21069 * [OpenVINO backend] Added support for numpy.isclose operation (#21138) * Added decomposition for numpy.isclose * Removed test from excluded list * Fixed failed test cases * Fixed dtype error * Aligns Softmax masking behavior with JAX for fully masked axis (#21149) * Fixes softmax masking logic to match JAX behavior * fix comment * use backend.numpy.multipy for element-wise multiplication * Removing references to jax.config.spmd_mode('allow_all'). (#21164) This flag no longer does anything in jax. * allow TorchModuleWrapper compute output shape (#21160) * allow TorchModuleWrapper compute output shape * modify * Add details when `TestCase.run_layer_test` output verification fails. (#21165) Adds the expected/actual output shapes/dtypes in the failure message. Also greatly simplifies the code by using `keras.tree`. * Improve `tf.RaggedTensor` support in `DataAdapter`s. (#21170) Previously, only 2D Tensorflow ragged tensors were supported. This adds support for any rank. Also added tests for ragged tensors with `GeneratorDataAdapter`. * WIP: Add PyTorch backend support for LSTM with CuDNN optimization (#21135) * WIP: Add PyTorch backend support for LSTM with CuDNN optimization * WIP: Add PyTorch backend support for LSTM with CuDNN optimization * Add backward compatibility to PyTorch-backed LSTM implementation with cuDNN support * Updates to adress failed tests * Handling formatting errors * Add `tf.RaggedTensor` support to `Embedding` layer. (#21171) Adds support for indices indices in the form of a `tf.RaggedTensor` to the `Embedding` layer by adding support to `ops.take`. The output is also ragged. Also: - adds support for negative indices in the sparse tensor use case. - adds support for ragged tensors in `TestCase.run_layer_test`. * [OpenVINO Backend]: support numpy.ndim (#21176) * feat: support numpy.ndim Signed-off-by: 11happy <[email protected]> * use shapeof shapeof method Signed-off-by: 11happy <[email protected]> --------- Signed-off-by: 11happy <[email protected]> * Fix Embedding test with ragged tensors on GPU. (#21177) The loss needs to not have any non-compilable op. * Add sparse_sigmoid activation (#21175) * Add sparse_sigmoid activation layer * Correct typo * [OpenVINO BACKEND] - feat: implement numpy.nonzero for openvino backend (#21163) * feat: implement numpy.nonzero for openvino backend Signed-off-by: 11happy <[email protected]> * format code Signed-off-by: 11happy <[email protected]> --------- Signed-off-by: 11happy <[email protected]> * Add sparse support to `ops.ones_like` and `ops.zeros_like`. (#21181) `ops.zeros_like` is in particular useful for creating a mask of the populated values in the sparse tensor. * Fix dtype detection for JAX types. (#21184) The jax types like `jax.float32` have a string representation of ``` <class 'jax.numpy.float32'> ``` so with the previous code, would be "standardized" as `float32'>` (trailing quote and angle bracket), which is an invalid type. But, the JAX dtypes _do_ have a `__name__` property, so should be properly detected if we switch the order around. Kept the old `jax.numpy` string version in place in case that worked with older versions of JAX. * Bump the python group with 5 updates (#21114) Updates the requirements on [tensorflow-cpu](https://github.com/tensorflow/tensorflow), [tensorflow](https://github.com/tensorflow/tensorflow), [torch](https://github.com/pytorch/pytorch), [torch-xla](https://github.com/pytorch/xla) and [tensorflow[and-cuda]](https://github.com/tensorflow/tensorflow) to permit the latest version. Updates `tensorflow-cpu` to 2.18.1 - [Release notes](https://github.com/tensorflow/tensorflow/releases) - [Changelog](https://github.com/tensorflow/tensorflow/blob/v2.18.1/RELEASE.md) - [Commits](https://github.com/tensorflow/tensorflow/compare/v2.18.0...v2.18.1) Updates `tensorflow` to 2.18.1 - [Release notes](https://github.com/tensorflow/tensorflow/releases) - [Changelog](https://github.com/tensorflow/tensorflow/blob/v2.18.1/RELEASE.md) - [Commits](https://github.com/tensorflow/tensorflow/compare/v2.18.0...v2.18.1) Updates `torch` from 2.5.1+cu121 to 2.6.0 - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/commits/v2.6.0) Updates `torch-xla` from 2.5.1 to 2.6.0 - [Release notes](https://github.com/pytorch/xla/releases) - [Commits](https://github.com/pytorch/xla/compare/v2.5.1...v2.6.0) Updates `tensorflow[and-cuda]` to 2.18.1 - [Release notes](https://github.com/tensorflow/tensorflow/releases) - [Changelog](https://github.com/tensorflow/tensorflow/blob/v2.18.1/RELEASE.md) - [Commits](https://github.com/tensorflow/tensorflow/compare/v2.18.0...v2.18.1) --- updated-dependencies: - dependency-name: tensorflow-cpu dependency-type: direct:production dependency-group: python - dependency-name: tensorflow dependency-type: direct:production dependency-group: python - dependency-name: torch dependency-type: direct:production update-type: version-update:semver-minor dependency-group: python - dependency-name: torch-xla dependency-type: direct:production update-type: version-update:semver-minor dependency-group: python - dependency-name: tensorflow[and-cuda] dependency-type: direct:production dependency-group: python ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Fix `Embedding.compute_output_spec` with a non-`KerasTensor` input. (#21192) The `ragged` attribute exists only with `KerasTensor`s. Minor fix of a unit tests that was using the same local variable for two nested loops. * Allow `Embedding` subclasses to only override `compute_output_shape`. (#21195) Without the need to also override `compute_output_spec`. * Return explicitly layout if already set on variable. (#21194) If explicitly overwriting a variable._layout, we want to keep this layout in any future calls. This allows auxiliary variables (e.g. optimizer gradients, momentums) to use the same explicit layout. * Don't scale gradients if overwriting variable with gradient. (#21193) If overwriting, the gradient represents the desired final value of the variable, so if we did scale it, we're changing that value. * Redundant imports; no path hacking in package (#21187) * Add back shell/format.sh, but it just runs pre-commit (#21197) For folks who are used to the old format, this will print instructions. And for people like me, saves needing to remember `SKIP=api-gen pre-commit run --all-files` When I just want the formatter. api_gen.py is too slow to run every time. * Add openvino to the basic requirements file (#21198) Unlike jax/torch/tensorflow which all vie for a certain cuda, I don't think openvino has trouble co-installing. And without the basic requirements.txt will not give a working dev environment. You can't run pre-commit without openvino installed. * [Keras 3 OpenVINO Backend]: Support numpy.log1p operation #29487 (#21129) * Supports numpy.log1p operation * Applied api-gen hook modifications * Revert "Applied api-gen hook modifications" This reverts commit 2b880fa3a3c47650fdbd32ebc98005fa1949e887. * Excluded Concrete Tests * Put Blank Line * Add pre-commit to the common requirements file (#21199) We also want it for cuda installations. * Fix nightly releases (#21203) They have been broken for a month * Update version number * [OpenVINO Backend] Support numpy min operation (#21168) * Add numpy min for OV Backend * Add boolean case * Fix failing tests issue * Update implementation * Adds Support For Custom Call-Context Arguments (#21204) * Adds support for call context args * formatting fixes * passes kwargs to compute_output_spec of each layer for a sequential model * removes requirement for outer layers to declare context args in call signature * renames call_context_flags to call_context_args * Adds default return value for dictionary lookup * addresses comments * fixup comments * modifies test case to not handle context-arg in intermediate layer * fix comment * Recognize /tfhub as a remote location. (#21211) * Recognize /tfhub as a remote location. * Add test * Fix Trainer.get_compile_config base case (empty dict) (#21212) * Implement angle function in keras.ops (#21200) * Add first version of angle operation on numpy * Skip test with bfloat16 on numpy * Remove bfloat16 checking on Angle * Fix test case for float16 on torch cuda * exclude openvino test case * exclude openvino test case * exclude openvino test case * Update init files * Fix warnings * [OpenVINO Backend] : add support for numpy.nan_to_num (#21186) * feat: add support for numpy.nan_to_num Signed-off-by: 11happy <[email protected]> * use np.inf Signed-off-by: 11happy <[email protected]> * correct implementation based on new tests Signed-off-by: 11happy <[email protected]> * use np only torch having import errors Signed-off-by: 11happy <[email protected]> * use inf approach Signed-off-by: 11happy <[email protected]> * refactor code Signed-off-by: 11happy <[email protected]> --------- Signed-off-by: 11happy <[email protected]> * Clear static loss-scale for inner optimizer in LossScaleOptimizer. (#21233) The outer `LossScaleOptimizer` ignores the inner's loss-scale factor when scaling the loss. When computing unscaled gradients, we therefore need to eliminate the inner's loss scale factor, otherwise the gradients get incorrectly scaled. * Update conftest.py (#21220) * Update conftest.py updated the requires_trainable_backend decorator to use in operator for checking backend values. * Update conftest.py * Adds `_register_call_context_args` to declare and use call-context arguments. (#21222) * Adds register_call_context_args API to layer class for better UX * remove type hints * Fixes typo + adds tests * Fixes comment * Improves test coverage * Added tests * Makes methods underscore-private * Rename @property to _call_context_args * makes _register_call_context_args the canonical way to use call context args * minor test fix * Updated confusion_metrics.py (#21227) Modified compile() API Code. * Don't create unused optimizer variables. (#21232) If `variable.overwrite_with_gradient == True`, then the only optimizer variable ever used for that variable is `base_optimizer._accumulated_gradients`. All other optimizer variables are unused. This can be extremely wasteful if the training variables are large, for example in the case of large embedding tables that span multiple hosts/devices. Added a convenience function in the base optimizer `add_optimizer_variables(...)` that loops through the variable list and automatically adds a variable only if appropriate. If a variable would otherwise be unused, a `None` is inserted into the list. This is needed to keep `optimizer._get_variable_index()` consistent. Updated all built-in optimizers to use this. NOTE: if a custom optimizer that exists out in the wild still does create unused optimizer variables, the optimizer should still work - it will just be wasteful. IOW this should not be a breaking change. * Implement bartlett function in keras.ops (#21214) * Add bartlett for ops * Update excluded_concrete_tests.txt * Fix stacked RNN with mask in JAX & Numpy backends (#21224) * Fix stacked RNN with mask in JAX backend * Add unit test for stacked RNN mask * Fix stacked RNN with mask in Numpy backend * Move unit test to stacked_rnn_cells_test * Bump github/codeql-action in the github-actions group (#21237) Bumps the github-actions group with 1 update: [github/codeql-ac…

google-ml-butler bot added the size:L label Mar 13, 2025

google-ml-butler bot assigned gbaned Mar 13, 2025

Introduce weights sharding

e7aa098

james77777778 force-pushed the add-weights-sharding branch from 6315817 to e7aa098 Compare March 13, 2025 08:32

mattdangerw self-requested a review March 13, 2025 21:16

google-ml-butler bot added the awaiting review label Mar 13, 2025

mattdangerw reviewed Mar 13, 2025

View reviewed changes

james77777778 added 3 commits March 14, 2025 11:46

Merge branch 'master' of github.com:keras-team/keras into add-weights…

91c8a27

…-sharding

Address comments and update the format of the config file.

2c00862

Update docstring

4503ecd

james77777778 force-pushed the add-weights-sharding branch from b0dcd6d to 4503ecd Compare March 14, 2025 08:19

fchollet approved these changes Mar 18, 2025

View reviewed changes

google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Mar 18, 2025

kokoro-team removed the kokoro:force-run label Mar 18, 2025

mattdangerw approved these changes Mar 18, 2025

View reviewed changes

google-ml-butler bot added the kokoro:force-run label Mar 18, 2025

kokoro-team removed the kokoro:force-run label Mar 18, 2025

james77777778 added 2 commits March 19, 2025 09:42

Merge branch 'master' of github.com:keras-team/keras into add-weights…

e080f49

…-sharding

Resovle comments and add more basic tests for H5IOStore and `Sharde…

7464cdc

…dH5IOStore`.

google-ml-butler bot removed the ready to pull Ready to be merged into the codebase label Mar 19, 2025

fchollet merged commit d865f5f into keras-team:master Mar 19, 2025
8 checks passed

google-ml-butler bot removed the awaiting review label Mar 19, 2025

james77777778 deleted the add-weights-sharding branch March 19, 2025 05:56

Introduce weights sharding #21022

Introduce weights sharding #21022

Uh oh!

Conversation

james77777778 commented Mar 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented Mar 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

mattdangerw left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

james77777778 Mar 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

fchollet left a comment

Choose a reason for hiding this comment

Uh oh!

mattdangerw left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

fchollet commented Mar 18, 2025

Uh oh!

james77777778 commented Mar 19, 2025

Uh oh!

Uh oh!

Uh oh!

james77777778 commented Mar 13, 2025 •

edited

Loading

codecov-commenter commented Mar 13, 2025 •

edited

Loading

james77777778 Mar 14, 2025 •

edited

Loading