MlflowArtifactDatasets saved to multiple runs when async=true in runner #646

Open
@Calychas

Description

When running a pipeline with the async option in the SequentialRunner (e.g. kedro run --pipeline my_pipeline --async), the artifacts created by MlflowArtifactDataset entries are logged to separate MLflow runs instead of the main run. Removing the async option makes them all land in one run.

See two pipeline invocations, one with and one without async:
Image
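
One possible explanation, not confirmed in this report, is that the async option saves datasets from worker threads and that the tracking client looks up its "active run" per thread, so a save from a worker thread sees no active run and implicitly opens a new one. The sketch below illustrates only that failure mode using the standard library. It does not call Kedro or MLflow, and FakeTracker, start_run, log_artifact and save_all are hypothetical names introduced for illustration.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor


class FakeTracker:
    """Hypothetical stand-in for a tracking client; this is NOT the MLflow API."""

    def __init__(self):
        # Assumption: the "active run" is remembered per thread.
        self._local = threading.local()
        self._lock = threading.Lock()
        self.artifacts = {}  # run_id -> list of artifact names

    def start_run(self):
        run_id = uuid.uuid4().hex
        self._local.run_id = run_id
        with self._lock:
            self.artifacts[run_id] = []
        return run_id

    def log_artifact(self, name):
        # A thread without an active run implicitly starts a new one.
        run_id = getattr(self._local, "run_id", None)
        if run_id is None:
            run_id = self.start_run()
        with self._lock:
            self.artifacts[run_id].append(name)
        return run_id


def save_all(tracker, names, use_threads):
    """Log artifacts from the calling thread or from worker threads."""
    if not use_threads:
        return [tracker.log_artifact(n) for n in names]
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return list(pool.map(tracker.log_artifact, names))


tracker = FakeTracker()
main_run = tracker.start_run()  # run opened in the main thread
sync_ids = save_all(tracker, ["a.csv", "b.csv"], use_threads=False)
async_ids = save_all(tracker, ["c.csv", "d.csv"], use_threads=True)

print("sync saves in main run: ", all(r == main_run for r in sync_ids))
print("async saves in main run:", any(r == main_run for r in async_ids))
```

In this toy model the synchronous saves reuse the main run, while every threaded save ends up in a run other than the main one. Whether the real behaviour has the same cause would need to be checked against the kedro-mlflow hooks and the MLflow version in use.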

Steps to Reproduce

  1. Have at least one MlflowArtifactDataset in the catalog
  2. Run the pipeline with the --async option
  3. Observe artefacts landing in runs other than the main run

Expected Result

All artefacts should land in a single run, as they do without the async option.

Your Environment

  • kedro and kedro-mlflow version used (pip show kedro and pip show kedro-mlflow): 0.19.2 and 0.14.4
  • MLflow: 2.21.3
  • Python version used (python -V): 3.11.11
  • Operating system and version: macOS 15.3.2 (Apple M2 Pro)

Does the bug also happen with the last version on master?

Yes

Labels: bug
