Bugfix: Pydantic deserialization for FlyteFile and FlyteDirectory #3339
Fixes #6669
Tracking issue
Closes flyteorg/flyte#6669
Why are the changes needed?
When deserializing Pydantic models containing FlyteFile or FlyteDirectory fields using model_validate(), the deserialized objects were missing private attributes (_remote_source, _downloader, etc.). This caused an AttributeError when attempting to re-serialize these objects with model_dump(), breaking the serialize → deserialize → serialize cycle.
This is a critical bug that prevents users from using FlyteFile/FlyteDirectory within Pydantic BaseModel classes in a normal way.
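The failure mode can be illustrated without flytekit or Pydantic. The sketch below is a stdlib-only analogy, not the actual flytekit code: FileRef, to_dict, and from_dict are hypothetical names, and it assumes the deserialization path builds the object without running __init__, so the private attributes set there are never created.

```python
class FileRef:
    """Hypothetical stand-in for a FlyteFile-like type (not the flytekit class)."""

    def __init__(self, path: str):
        self.path = path
        # Private state that only __init__ sets up.
        self._remote_source = None
        self._downloader = None

    def to_dict(self) -> dict:
        # Serialization reads a private attribute, as the re-serialization step does.
        return {"path": self._remote_source or self.path}

    @classmethod
    def from_dict(cls, data: dict) -> "FileRef":
        # Assumed deserialization path: fill public fields only, skipping __init__.
        obj = cls.__new__(cls)
        obj.path = data["path"]
        return obj


original = FileRef("s3://bucket/data.csv")
payload = original.to_dict()           # serialize
restored = FileRef.from_dict(payload)  # deserialize

try:
    restored.to_dict()                 # re-serialize fails: private attrs are missing
    second_dump_error = None
except AttributeError as exc:
    second_dump_error = exc
```

In this analogy the first serialization succeeds and the second raises AttributeError, which mirrors the broken serialize → deserialize → serialize cycle described above.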
What changes were proposed in this pull request?
Added Pydantic model validators to both FlyteFile and FlyteDirectory classes that ensure private attributes are properly initialized during deserialization:
FlyteFile (flytekit/types/file/file.py):
- deserialize_flyte_file validator to check if private attributes exist
- dict_to_flyte_file() transformer
FlyteDirectory (flytekit/types/directory/types.py):
- deserialize_flyte_dir validator
- dict_to_flyte_directory() to properly reconstruct the object
How was this patch tested?
Added two new unit tests in test_pydantic_basemodel_transformer.py:
- test_flytefile_pydantic_model_dump_validate_cycle - Verifies FlyteFile can be serialized, deserialized, and re-serialized without errors
- test_flytedirectory_pydantic_model_dump_validate_cycle - Same for FlyteDirectory
Check all the applicable boxes
Summary by Bito
This pull request fixes a critical bug in deserializing Pydantic models with FlyteFile and FlyteDirectory fields by adding model validators. These validators ensure proper initialization of private attributes, preventing AttributeErrors during re-serialization. New unit tests have also been included to confirm the functionality of these enhancements.
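The shape of the fix can be sketched in plain Python. This is a simplified analogy under stated assumptions, not the flytekit implementation: FileRef, _build_without_init, and _ensure_private_attrs are hypothetical names, and the post-deserialization check only stands in for the role the PR assigns to its model validators (detect missing private attributes, then rebuild the object through the normal constructor).

```python
class FileRef:
    """Hypothetical stand-in for a FlyteFile-like type (not the flytekit class)."""

    _PRIVATE_ATTRS = ("_remote_source", "_downloader")

    def __init__(self, path: str):
        self.path = path
        self._remote_source = None
        self._downloader = None

    def to_dict(self) -> dict:
        # Serialization depends on private state being present.
        return {"path": self._remote_source or self.path}

    @classmethod
    def _build_without_init(cls, data: dict) -> "FileRef":
        # Assumed raw deserialization step that skips __init__.
        obj = cls.__new__(cls)
        obj.path = data["path"]
        return obj

    def _ensure_private_attrs(self) -> "FileRef":
        # Validator-like hook: keep the object if it is fully initialized,
        # otherwise reconstruct it through the regular constructor.
        if all(hasattr(self, name) for name in self._PRIVATE_ATTRS):
            return self
        return type(self)(self.path)

    @classmethod
    def from_dict(cls, data: dict) -> "FileRef":
        return cls._build_without_init(data)._ensure_private_attrs()


# Round trip in the spirit of the new unit tests: dump, validate, dump again.
first_dump = FileRef("s3://bucket/data.csv").to_dict()
restored = FileRef.from_dict(first_dump)
second_dump = restored.to_dict()
```

Rebuilding through the constructor keeps all initialization logic in one place, so the deserialization hook does not need to know the default value of each private attribute.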