Skip to content

ObjectType instantiation and pydantic datamodel #183

@CagtayFabry

Description

@CagtayFabry

I started playing around with this package and found it really great, especially using pydantic in the background makes some things very smooth and clear. 👍

However, I also stumbled upon a case that was quite confusing for me: What happens when I instantiate an ObjectType(or more specifically one of its derived classes like ExperimentalStep )

I will try to explain my understanding and source of confusion:

Looking at bam_masterdata.datamodel_object_types.ExperimentalStep:

class ExperimentalStep(ObjectType):
    defs = ObjectTypeDef(
        code="EXPERIMENTAL_STEP",
        description="""Experimental Step (generic)//Experimenteller Schritt (allgemein)""",
        generated_code_prefix="EXP",
    )

    name = PropertyTypeAssignment(
        code="$NAME",
        data_type="VARCHAR",
        property_label="Name",
        ...
    )

    show_in_project_overview = PropertyTypeAssignment(
        code="$SHOW_IN_PROJECT_OVERVIEW",
        data_type="BOOLEAN",
        property_label="Show in project overview",
        ...
    )

    finished_flag = PropertyTypeAssignment(
        code="FINISHED_FLAG",
        data_type="BOOLEAN",
        property_label="Experiment completed",
        ...
    )
...

we can see that the class attributes are assigned to describe the ExperimentalStep object type, which mostly concerns the PropertyTypeAssignments alongside some metadata.

My problem started when I wanted to use ExperimentalStep to instantiate an instance (duh..) that could be uploadedto openBIS later. Since I didn't remember all the possible attributes to define an experimental step, I looked up the init docstring/signature. However all I could find was

Init signature:
ExperimentalStep(
    *,
    properties: list[bam_masterdata.metadata.definitions.PropertyTypeAssignment] = [],
) -> None

Slightly confusing at first, but it seems to come from bam_masterdata.datamodel.metadata.ObjectType which lists propertiesas a field:

class ObjectType(BaseEntity):
    properties: list[PropertyTypeAssignment] = Field(
        default=[],
        description="""
        List of properties assigned to an object type. This is useful for internal representation of the model.
        """,
    )

I initally suspected some issues with docstring inheritance, but looking into it it became clear that these properties never get assigned as pydantic Fields. It took me a while to understand the parsing that ultimately assigns (and validates) the user supplied kwargs for class instantiation into class attributes in ObjectType.__setattr__

A few caveats I noticed with the current approach:

  • when supplied with a value, ObjectType.__setattr__ will basically override the previously defined class value:

    ExperimentalStep().name
    >>> PropertyTypeAssignment(code='$NAME', description='Name' ....
    print(ExperimentalStep(name="experiment name").name)
    >>> experiment name

    So the attributes are either the PropertyAssignments or the value. Normally I would expect something like None for optional values.

  • The type validation has to be done by the library in ObjectType.__setattr__, which I imagine could be quite difficult sometimes.

  • if the properties would be actual fields, we could also leverage pydantics conversion methods, e.g. creating datetime.datetimefrom string like this would work:

    class MyExperimentalStep(BaseModel):
        name: str 
        start_date: datetime
    MyExperimentalStep(name="d", start_date="2025-01-01")
    >>> MyExperimentalStep(name='d', start_date=datetime.datetime(2025, 1, 1, 0, 0))

    currently it fails because we have to supply a datetime directly

    ExperimentalStep(name="a new name", start_date="2025")
    >>> TypeError: Invalid type for 'start_date': Expected datetime, got str
  • understanding and inspecting the code (I guess also for IDEs) get's a bit more complicated due to the custom logic handling the properties and assignments. E.g. one cannot really use ExperimentalStep.__pydantic_fields__ to get information about the class.

So after all this, I am wondering would it be worth it to try and refactor the code to convert the proprties into actual pydantic Fields ?
(disclaimer: I haven't looked into the code that actually generates the python class definitions yet)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions