[feature] Add download_to_workspace option to dsl.importer #12352

@VaniHaripriya

Description

Feature Area

/area backend
/area sdk

What feature would you like to see?

As a KFP user, I want dsl.importer to support a download_to_workspace=True option so that artifacts can be downloaded directly into the pipeline workspace. This requires adding a download_to_workspace field to the ImporterSpec protobuf message and updating the corresponding Python class and function. When an input artifact has the metadata field _kfp_workspace=True set, artifact.path should return the artifact's path within the workspace (e.g., prefixed with /kfp-workspace/.artifacts/).
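
A minimal sketch of the path resolution described above, written as a standalone class rather than the real kfp.dsl.Artifact. The class name SketchArtifact, the constant names, and the layout under /kfp-workspace/.artifacts/ (bucket and object key appended after the prefix) are assumptions for illustration; only the _kfp_workspace metadata key and the prefix itself come from this request. The /gcs/, /s3/, and /minio/ prefixes are modeled on how the SDK appears to map remote URIs to local paths today.

```python
import posixpath
from dataclasses import dataclass, field

# Prefix taken from the issue text; the layout beneath it is an assumption.
WORKSPACE_ARTIFACT_ROOT = "/kfp-workspace/.artifacts"

# Assumed scheme-to-mount mapping for artifacts that are NOT in the workspace.
LOCAL_PREFIXES = {"gs://": "/gcs/", "s3://": "/s3/", "minio://": "/minio/"}


@dataclass
class SketchArtifact:
    """Hypothetical stand-in for an input artifact (not the real KFP class)."""
    uri: str
    metadata: dict = field(default_factory=dict)

    @property
    def path(self) -> str:
        for scheme, prefix in LOCAL_PREFIXES.items():
            if self.uri.startswith(scheme):
                relative = self.uri[len(scheme):]
                # Workspace artifacts resolve under the workspace root instead
                # of the per-scheme local mount prefix.
                if self.metadata.get("_kfp_workspace") is True:
                    return posixpath.join(WORKSPACE_ARTIFACT_ROOT, relative)
                return prefix + relative
        # Unknown schemes are returned unchanged in this sketch.
        return self.uri
```

With this sketch, an artifact at gs://bucket/model.pkl carrying _kfp_workspace=True would resolve under the workspace root, while the same URI without the flag would keep its /gcs/ path.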

The KFP Importer needs to distinguish artifacts that were downloaded to the workspace from ordinary imports. To do so, it should use a new execution type in MLMD (e.g., system.ImporterWorkspaceExecution). Before artifacts are registered in MLMD, they should be downloaded to the workspace at a path like /kfp-workspace/.artifacts/. The KFP Importer should fail if the artifact comes from an OCI source (its URI starts with oci://).
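
The importer-side decisions above could be sketched roughly as follows. This is illustrative Python, not the actual backend code; the function name plan_import, the destination layout, and the name of the existing execution type (system.ImporterExecution) are assumptions, while system.ImporterWorkspaceExecution and the oci:// rejection come from this request.

```python
from typing import Optional, Tuple

WORKSPACE_ARTIFACT_ROOT = "/kfp-workspace/.artifacts"
# Assumed name of the existing importer execution type.
IMPORTER_EXECUTION = "system.ImporterExecution"
# Execution type proposed in this issue.
IMPORTER_WORKSPACE_EXECUTION = "system.ImporterWorkspaceExecution"


def plan_import(artifact_uri: str,
                download_to_workspace: bool) -> Tuple[str, Optional[str]]:
    """Return (MLMD execution type, workspace destination or None)."""
    if not download_to_workspace:
        return IMPORTER_EXECUTION, None
    if artifact_uri.startswith("oci://"):
        raise ValueError(
            f"download_to_workspace is not supported for OCI artifacts: {artifact_uri}")
    _scheme, separator, remainder = artifact_uri.partition("://")
    if not separator or not remainder:
        raise ValueError(f"unsupported artifact URI: {artifact_uri}")
    # Hypothetical layout: bucket and key appended to the workspace root.
    destination = f"{WORKSPACE_ARTIFACT_ROOT}/{remainder}"
    return IMPORTER_WORKSPACE_EXECUTION, destination
```

In this sketch the download itself would happen between computing the destination and registering the artifact in MLMD, so that a failed download never leaves a registered artifact behind.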

The KFP Launcher should skip downloading artifacts to local emptyDir volumes if the artifact has the _kfp_workspace metadata field set to True (indicating it's already in the workspace).
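
The launcher check described above amounts to a single predicate. A sketch in Python (the function name is hypothetical, and the real launcher is not written in Python):

```python
def should_download_to_local_volume(artifact_metadata: dict) -> bool:
    """Return False for artifacts that are already present in the workspace."""
    # Only an explicit boolean True counts as "in the workspace" in this sketch.
    return artifact_metadata.get("_kfp_workspace") is not True
```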

The Container Driver needs to:

  1. Set a _kfp_workspace=True metadata field (not persisted to MLMD) on artifacts originating from a system.ImporterWorkspaceExecution task.
  2. Mount the KFP workspace volume in the Pod spec patch if an input artifact is from the workspace.
  3. Disallow user-mounted volumes in or under /kfp-workspace.
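
The three driver responsibilities above could look roughly like this. It is a sketch under stated assumptions: every function name is hypothetical, artifacts are modeled as plain metadata dicts, and the real driver would operate on Pod specs and MLMD records instead.

```python
import posixpath
from typing import Iterable, List

WORKSPACE_MOUNT = "/kfp-workspace"
IMPORTER_WORKSPACE_EXECUTION = "system.ImporterWorkspaceExecution"


def tag_workspace_artifact(metadata: dict, producer_execution_type: str) -> dict:
    """Step 1: return a copy of the metadata, flagged if the producer was a workspace import."""
    tagged = dict(metadata)  # copy, so the flag is never written back to the stored record
    if producer_execution_type == IMPORTER_WORKSPACE_EXECUTION:
        tagged["_kfp_workspace"] = True
    return tagged


def needs_workspace_volume(input_metadata: Iterable[dict]) -> bool:
    """Step 2: the workspace volume is mounted if any input artifact is flagged."""
    return any(m.get("_kfp_workspace") is True for m in input_metadata)


def is_inside_workspace(mount_path: str) -> bool:
    """True if mount_path is the workspace mount itself or lies beneath it."""
    normalized = posixpath.normpath(mount_path)
    return normalized == WORKSPACE_MOUNT or normalized.startswith(WORKSPACE_MOUNT + "/")


def validate_user_mounts(mount_paths: Iterable[str]) -> List[str]:
    """Step 3: reject user-defined mounts in or under the workspace."""
    paths = list(mount_paths)
    for path in paths:
        if is_inside_workspace(path):
            raise ValueError(f"user volumes may not be mounted in or under {WORKSPACE_MOUNT}: {path}")
    return paths
```

Normalizing the path before comparing it means that inputs such as /data/../kfp-workspace/x are rejected, while a sibling directory like /kfp-workspace-other is still allowed.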

Love this idea? Give it a 👍.
