Description
Describe the bug
When protobuf enums are serialized to JSON (common in REST APIs), they are typically represented as strings (e.g., "BATCH_FILE") rather than integers. The generated Pydantic models don't include field validators to handle this standard protobuf JSON serialization format, causing deserialization failures.
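The mismatch can be illustrated with the standard library alone. The sketch below is not Pydantic's internal code path; it only shows that Python enums distinguish value-based lookup (by calling the class) from name-based lookup (by subscripting), and that a protobuf JSON string such as "BATCH_FILE" needs the latter. The helper names `lookup_by_value` and `lookup_by_name` are hypothetical and exist only for illustration.

```python
from enum import IntEnum


class SourceType(IntEnum):
    INVALID = 0
    BATCH_FILE = 1
    BATCH_BIGQUERY = 2
    STREAM_KAFKA = 3


def lookup_by_value(raw):
    """Value-based lookup: accepts 1, rejects the name "BATCH_FILE"."""
    return SourceType(raw)


def lookup_by_name(raw):
    """Name-based lookup: accepts the name "BATCH_FILE"."""
    return SourceType[raw]


if __name__ == "__main__":
    print(lookup_by_value(1))
    print(lookup_by_name("BATCH_FILE"))
    try:
        lookup_by_value("BATCH_FILE")
    except ValueError as exc:
        # The string name is not one of the enum's integer values.
        print(f"value lookup rejected the name: {exc}")
```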
Dependencies
python version: sys.version_info(major=3, minor=11, micro=8, releaselevel='final', serial=0)
############# dependencies ##############
grpc: 1.62.3
pydantic: 2.10.6
########## Expand dependencies ##########
mypy-protobuf: 3.3.0
toml: 0.10.2
########## Format dependencies ##########
autoflake: Not Install
black: Not Install
isort: Not Install
Protobuf File Content
Filename: feast/core/DataSource.proto
syntax = "proto3";

package feast.core;

message DataSource {
  enum SourceType {
    INVALID = 0;
    BATCH_FILE = 1;
    BATCH_BIGQUERY = 2;
    STREAM_KAFKA = 3;
  }
  SourceType type = 1;
  string name = 2;
}
CLI (if use plugin mode)
python -m grpc_tools.protoc \
  -I. \
  --protobuf-to-pydantic_out=. \
  feast/core/DataSource.proto
Output content
Filename: feast/core/DataSource_p2p.py
from pydantic import BaseModel, Field
from enum import IntEnum


class DataSource(BaseModel):
    class SourceType(IntEnum):
        INVALID = 0
        BATCH_FILE = 1
        BATCH_BIGQUERY = 2
        STREAM_KAFKA = 3

    type: "DataSource.SourceType" = Field(default=0)
    name: str = Field(default="")
Expected behavior
The generated model should include a field validator to handle both integer and string representations:
from pydantic import BaseModel, Field, field_validator
from enum import IntEnum


class DataSource(BaseModel):
    class SourceType(IntEnum):
        INVALID = 0
        BATCH_FILE = 1
        BATCH_BIGQUERY = 2
        STREAM_KAFKA = 3

    type: "DataSource.SourceType" = Field(default=0)
    name: str = Field(default="")

    @field_validator('type', mode='before')
    @classmethod
    def validate_type(cls, v):
        if isinstance(v, str):
            # Convert string enum names to values
            return cls.SourceType[v]
        return v
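One detail a generated validator may want to handle is an unknown name: subscripting an enum with a missing name raises `KeyError`, and as far as I understand Pydantic only converts `ValueError` and `AssertionError` raised inside validators into a `ValidationError`. A minimal sketch of the coercion logic, using only the standard library, could look like the following. `coerce_proto_enum` is a hypothetical helper name, not something protobuf-to-pydantic provides today.

```python
from enum import IntEnum
from typing import Type, TypeVar, Union

E = TypeVar("E", bound=IntEnum)


def coerce_proto_enum(enum_cls: Type[E], value: Union[str, int, E]) -> E:
    """Hypothetical helper: accept an enum member, its integer value, or its name."""
    if isinstance(value, enum_cls):
        return value
    if isinstance(value, str):
        try:
            # Protobuf JSON carries the enum value's name, so look up by name.
            return enum_cls[value]
        except KeyError:
            valid = ", ".join(member.name for member in enum_cls)
            # Re-raise as ValueError so callers see one error type for bad input.
            raise ValueError(
                f"{value!r} is not a valid {enum_cls.__name__} name; "
                f"expected one of: {valid}"
            ) from None
    # Integers fall through to the normal value-based lookup,
    # which raises ValueError for unknown numbers.
    return enum_cls(value)


class SourceType(IntEnum):
    INVALID = 0
    BATCH_FILE = 1
    BATCH_BIGQUERY = 2
    STREAM_KAFKA = 3


if __name__ == "__main__":
    print(coerce_proto_enum(SourceType, "BATCH_FILE"))
    print(coerce_proto_enum(SourceType, 3))
```

A generated `mode='before'` validator could then simply return `coerce_proto_enum(cls.SourceType, v)`, which keeps the per-field code small when a message has many enum fields.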
Reproduction:
# JSON from REST API (standard protobuf JSON format)
json_data = {"type": "BATCH_FILE", "name": "my_source"}
# Current behavior: this call fails with
# ValidationError: Input should be a valid integer
ds = DataSource(**json_data)

# Expected behavior: the same call succeeds
ds = DataSource(**json_data)
assert ds.type == DataSource.SourceType.BATCH_FILE
Additional context
According to the protobuf JSON mapping spec, an enum is serialized to JSON as a string holding the name of the enum value, and parsers accept both enum names and integer values. This is the standard behavior for protobuf-backed REST APIs, so generated Pydantic models should accept both forms out of the box.
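For completeness, here is a small stdlib-only sketch of the round trip described by the spec: the writer emits the enum name, and the reader maps the name back to the enum member. `to_proto_style_json` is a hypothetical function written for this issue, not part of any library. Note that `json.dumps` writes an `IntEnum` member as a bare integer because it is an `int` subclass, so the name has to be emitted explicitly.

```python
import json
from enum import IntEnum


class SourceType(IntEnum):
    INVALID = 0
    BATCH_FILE = 1
    BATCH_BIGQUERY = 2
    STREAM_KAFKA = 3


def to_proto_style_json(source_type: SourceType, name: str) -> str:
    """Hypothetical writer: emit the enum by name, as protobuf JSON does."""
    return json.dumps({"type": source_type.name, "name": name})


if __name__ == "__main__":
    payload = to_proto_style_json(SourceType.BATCH_FILE, "my_source")
    print(payload)

    # Reading side: map the name back to the enum member.
    decoded = json.loads(payload)
    assert SourceType[decoded["type"]] is SourceType.BATCH_FILE
```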