Callable context_getter prevents serialization of nlp pipeline #426

@theoimbert-aphp

When a function is passed as the context getter (observed with at least TrainableSpanClassifier and TrainableNerCrf), nlp.to_disk(save_path) raises the following error:

PydanticSerializationError: Unable to serialize unknown type: <function context_getter_function at 0x7f26ec697560>

How to reproduce the bug

import edsnlp, edsnlp.pipes as eds

# context_getter function
def context_getter_function(span):
    doc = span.doc
    return doc[max(0, span.start-15) : min(len(doc), span.end+15)]

# pipeline definition
nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(
    eds.span_classifier(
        embedding=eds.span_pooler(
            pooling_mode="mean",
            embedding=eds.transformer(
                model="prajjwal1/bert-tiny",
            ),
        ),
        span_getter=[
            "mobility",
        ],
        attributes=[
            "_.negation",
        ],
        context_getter=context_getter_function
    ),
    name="span_classifier",
)

# trying to save the model
nlp.to_disk("path/to/model")

Note that the model defined in this example works for both inference and training; only saving it fails.
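For illustration, the general failure mode can be sketched with the standard library alone. This is an analogy, not EDS-NLP's actual serialization code: a config that holds a function object cannot be dumped, while a declarative description of the same window can be dumped and rebuilt on load. The `before`/`after` keys and `build_context_getter` are hypothetical names made up for this sketch, not EDS-NLP parameters.

```python
import json


def context_getter_function(window):
    # Stand-in for the callable from the report; the body is irrelevant here.
    return window


# A config that holds a function object cannot be dumped to JSON.
config_with_callable = {"context_getter": context_getter_function}
try:
    json.dumps(config_with_callable)
    callable_serialized = True
except TypeError:
    callable_serialized = False

# A declarative description of the same window can be dumped.
# "before" and "after" are hypothetical keys, not EDS-NLP parameters.
config_with_spec = {"context_getter": {"before": 15, "after": 15}}
serialized_spec = json.dumps(config_with_spec)


def build_context_getter(spec):
    """Rebuild a window function from the hypothetical declarative spec."""

    def getter(tokens, start, end):
        lower = max(0, start - spec["before"])
        upper = min(len(tokens), end + spec["after"])
        return tokens[lower:upper]

    return getter


# Round trip: load the spec back and rebuild an equivalent getter.
restored = build_context_getter(json.loads(serialized_spec)["context_getter"])
tokens = list(range(100))
window = restored(tokens, 40, 42)
```

Whether EDS-NLP should accept such a declarative form, or instead skip or pickle callables when saving, is a design question for the maintainers; the sketch only shows why a bare function in the config blocks serialization.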

Your Environment

  • Python Version Used: 3.7.16
  • EDS-NLP Version Used: 0.17.2
  • spaCy: 3.7.5
