
Optional Message Fields Generated as Non-Optional with default_factory #115

@aniketpalu


Describe the bug
When a protobuf message has optional message-typed fields, protobuf_to_pydantic generates non-optional fields with default_factory=MessageClass instead of Optional[MessageClass] with default=None. This makes it impossible to distinguish "field not set" from "field set to an empty message", two states that protobuf treats as distinct because optional fields track presence.

Dependencies

############# dependencies ############## 
    grpc:            1.62.3
    pydantic:        2.10.6

########## Expand dependencies ########## 
    mypy-protobuf:   3.3.0
    toml:            0.10.2

########## Format dependencies ########## 
    autoflake:       Not Install
    black:           Not Install
    isort:           Not Install

Protobuf File Content

Filename: feast/core/DataSource.proto

syntax = "proto3";
package feast.core;

message DataSource {
  string name = 1;
  string description = 2;
}

Filename: feast/core/FeatureView.proto

syntax = "proto3";
package feast.core;

import "feast/core/DataSource.proto";

message FeatureView {
  string name = 1;
  optional DataSource batch_source = 2;
  optional DataSource stream_source = 3;
}

CLI (if using plugin mode)

python -m grpc_tools.protoc \
    -I. \
    --protobuf-to-pydantic_out=. \
    feast/core/DataSource.proto feast/core/FeatureView.proto

Output content

Filename: feast/core/FeatureView_p2p.py

from pydantic import BaseModel, Field
import typing
from .DataSource_p2p import DataSource

class FeatureView(BaseModel):
    name: str = Field(default="")
    batch_source: DataSource = Field(default_factory=DataSource)
    stream_source: DataSource = Field(default_factory=DataSource)

Expected behavior

from pydantic import BaseModel, Field
import typing
from .DataSource_p2p import DataSource

class FeatureView(BaseModel):
    name: str = Field(default="")
    batch_source: typing.Optional[DataSource] = Field(default=None)
    stream_source: typing.Optional[DataSource] = Field(default=None)
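
To make the difference concrete, here is a minimal hand-written sketch of the expected models (not actual plugin output; DataSource is re-declared inline so the snippet runs on its own, assuming pydantic v2). With Optional fields, an unset source and an empty source are different values:

```python
import typing

from pydantic import BaseModel, Field


class DataSource(BaseModel):
    # Hand-written stand-in for the generated DataSource model.
    name: str = Field(default="")
    description: str = Field(default="")


class FeatureView(BaseModel):
    # Shape requested in this issue: optional message fields default to None.
    name: str = Field(default="")
    batch_source: typing.Optional[DataSource] = Field(default=None)
    stream_source: typing.Optional[DataSource] = Field(default=None)


unset = FeatureView(name="test")
empty = FeatureView(name="test", batch_source=DataSource())

print(unset.batch_source is None)  # True: the field was never provided
print(empty.batch_source is None)  # False: an empty message was provided
print(unset.model_dump(exclude_none=True))  # {'name': 'test'}
```

Because None stands for "not set", the distinction also survives serialization with exclude_none=True, which the default_factory form cannot offer.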

Reproduction

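# Imports assume the generated modules are importable from the
# directory passed to --protobuf-to-pydantic_out
from feast.core.DataSource_p2p import DataSource
from feast.core.FeatureView_p2p import FeatureView

# Current behavior
fv = FeatureView(name="test")
print(fv.batch_source)  # DataSource(name='', description='')
# Problem: the value alone doesn't show whether batch_source was explicitly set or just defaulted

# Expected behavior
fv = FeatureView(name="test")
print(fv.batch_source)  # None
# Clear indication that batch_source was not provided

# When explicitly set (same in both cases)
fv = FeatureView(name="test", batch_source=DataSource(name="source1"))
print(fv.batch_source)  # DataSource(name='source1', description='')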
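
As a possible stopgap with the currently generated shape, pydantic v2 records which fields were passed to the constructor in model_fields_set. The sketch below re-declares the models by hand so it is self-contained; was_set is a hypothetical helper, not part of protobuf_to_pydantic. It only partially addresses the problem, since the information lives on the instance rather than in the field value:

```python
from pydantic import BaseModel, Field


class DataSource(BaseModel):
    # Hand-written stand-in for the generated DataSource model.
    name: str = Field(default="")
    description: str = Field(default="")


class FeatureView(BaseModel):
    # Mirrors the shape currently generated: non-optional, default_factory.
    name: str = Field(default="")
    batch_source: DataSource = Field(default_factory=DataSource)
    stream_source: DataSource = Field(default_factory=DataSource)


def was_set(model: BaseModel, field: str) -> bool:
    """Hypothetical helper: True if `field` was passed explicitly."""
    return field in model.model_fields_set


defaulted = FeatureView(name="test")
explicit = FeatureView(name="test", batch_source=DataSource())

# The field values are indistinguishable...
print(defaulted.batch_source == explicit.batch_source)  # True
# ...but the set of explicitly provided fields differs.
print(was_set(defaulted, "batch_source"))  # False
print(was_set(explicit, "batch_source"))   # True
```

This bookkeeping is easy to lose (for example when a model is rebuilt from a full model_dump()), which is why Optional[...] with default=None remains the cleaner representation.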

