According to the documentation, ShallowRedisSaver and AsyncShallowRedisSaver are intended to retain only the latest complete checkpoint, including its associated blobs and write entries. In practice, older blobs and writes are left behind because of two related issues:
Key Formatting Inconsistency:
The shallow saver classes do not use the same key formatting utilities (to_storage_safe_id, to_storage_safe_str) as the base classes, which matters most when checkpoint_ns is empty. The cleanup key patterns therefore do not match the keys that were written (e.g., empty strings are not encoded to __empty__), so orphaned checkpoint_blob and checkpoint_write entries are never deleted.
Unsafe Channel Value Parsing:
The current logic does not encode channel values. If a channel value contains the delimiter character :, the split/parse logic (see shallow.py#L182-L186) misreads the key segments and misses the entries that should be cleaned up.
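To illustrate the first issue (key formatting), here is a minimal sketch of how a cleanup pattern can miss a key when only the write path encodes empty strings. The key layout, the helper names, and the use of fnmatch to approximate Redis glob matching are assumptions made for illustration; they are not the library's actual code.

```python
from fnmatch import fnmatchcase

EMPTY_SENTINEL = "__empty__"  # sentinel value named in this issue


def safe_str(value: str) -> str:
    """Hypothetical stand-in for to_storage_safe_str: encode "" as a sentinel."""
    return EMPTY_SENTINEL if value == "" else value


def written_blob_key(thread_id: str, checkpoint_ns: str, channel: str, version: str) -> str:
    """Assumed write path: encodes thread_id and checkpoint_ns before joining."""
    return ":".join(
        ["checkpoint_blob", safe_str(thread_id), safe_str(checkpoint_ns), channel, version]
    )


def unencoded_cleanup_pattern(thread_id: str, checkpoint_ns: str) -> str:
    """Assumed cleanup path with the bug: raw values are interpolated."""
    return ":".join(["checkpoint_blob", thread_id, checkpoint_ns, "*"])


def encoded_cleanup_pattern(thread_id: str, checkpoint_ns: str) -> str:
    """Cleanup path that applies the same encoding as the write path."""
    return ":".join(["checkpoint_blob", safe_str(thread_id), safe_str(checkpoint_ns), "*"])


key = written_blob_key("thread-1", "", "messages", "1")
buggy_match = fnmatchcase(key, unencoded_cleanup_pattern("thread-1", ""))
fixed_match = fnmatchcase(key, encoded_cleanup_pattern("thread-1", ""))
print(key)
print("unencoded pattern matches:", buggy_match)
print("encoded pattern matches:", fixed_match)
```

With an empty checkpoint_ns, the unencoded pattern contains an empty segment while the written key contains the sentinel, so the old blob is never selected for deletion.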
Current behavior:
The key pattern generated by _make_shallow_redis_checkpoint_blob_key_pattern (ref) does not encode empty strings the same way as _dump_blobs, so it misses blobs written with an empty value (such as an empty checkpoint_ns), and old blobs are not deleted.
Even with the key pattern issue fixed, channel values containing : cause parsing errors. E.g., the parser would read channel="branch" and version="to" instead of the intended values.
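The parsing failure can be reproduced with a few lines of Python. This is only a sketch: the five-segment key layout, the function name, and the sample channel value are assumptions chosen to match the symptom described above, not a copy of shallow.py.

```python
def naive_parse_blob_key(key: str) -> dict:
    """Split on ':' and read segments by position (assumed layout:
    prefix:thread_id:checkpoint_ns:channel:version)."""
    parts = key.split(":")
    return {
        "thread_id": parts[1],
        "checkpoint_ns": parts[2],
        "channel": parts[3],
        "version": parts[4],
    }


# Hypothetical channel value that itself contains the delimiter.
intended_channel = "branch:to:agent"
intended_version = "2"

key = ":".join(["checkpoint_blob", "thread-1", "__empty__", intended_channel, intended_version])
parsed = naive_parse_blob_key(key)

print(parsed["channel"], parsed["version"])  # prints: branch to
```

Because the channel contributes extra segments, positional indexing reads the first two pieces of the channel as channel and version, and the real version is never seen.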
Suggested Solutions
Consistently apply an encoding (e.g., base64) to all interpolated values (including thread_id, checkpoint_ns, and channel) to avoid delimiter collisions.
Unify key formatting, including the handling of empty strings, across the base and shallow classes to avoid orphaned entries. An initial draft PR, "fix: cleanup blobs and writes for shallow classes" (#37), was created for this; it does not address the delimiter issue.
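One way the two suggestions could fit together is sketched below: a single pair of encode/decode helpers used by both the write path and the cleanup path, with URL-safe base64 for non-empty values and the __empty__ sentinel for empty ones. All function names and the key layout are hypothetical, and this is not the approach taken in PR #37.

```python
import base64

EMPTY_SENTINEL = "__empty__"  # sentinel value named in this issue


def encode_part(value: str) -> str:
    """Encode one key segment so it can never contain the ':' delimiter."""
    if value == "":
        return EMPTY_SENTINEL
    return base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii")


def decode_part(token: str) -> str:
    """Inverse of encode_part. Padded base64 output always has a length that is
    a multiple of 4, so it cannot equal the 9-character sentinel."""
    if token == EMPTY_SENTINEL:
        return ""
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")


def make_blob_key(thread_id: str, checkpoint_ns: str, channel: str, version: str) -> str:
    """Build a key from encoded segments (assumed layout, for illustration)."""
    parts = (thread_id, checkpoint_ns, channel, version)
    return ":".join(["checkpoint_blob", *(encode_part(p) for p in parts)])


def parse_blob_key(key: str) -> tuple:
    """Split on ':' safely, since no encoded segment can contain it."""
    _prefix, *tokens = key.split(":")
    thread_id, checkpoint_ns, channel, version = (decode_part(t) for t in tokens)
    return thread_id, checkpoint_ns, channel, version


original = ("thread-1", "", "branch:to:agent", "2")
key = make_blob_key(*original)
print(key)
print(parse_blob_key(key) == original)
```

The trade-off is that encoded keys are no longer human-readable, and any cleanup pattern has to be built from the same encode_part helper so that it matches what was written.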