Environment
- How did you deploy Kubeflow Pipelines (KFP)? Using kubeflow/manifests
- KFP version: 1.9.1 (but I think it impacts 1.10 too)
- KFP SDK version: 1.8.22 and 2.8.0
Steps to reproduce
Setting `POD_NAMES=v1` in the workflow controller resolves the "ML Metadata not found" issue in kubeflow/pipelines#11457.
However, this change introduces a bug that prevents users from retrying failed pipelines. The sequence is:
- A user starts a run.
- The pipeline fails due to one or more failing components.
- The user clicks "Retry" in the web UI.
- The pipeline gets stuck in a pending state, and no new pods are scheduled.
This happens because the pods associated with the failed components are not deleted. When the retried workflow tries to recreate them, the workflow controller logs:
`level=info msg="Failed pod ... creation: already exists"`
As a result, the Kubeflow pipeline remains stuck in the pending state.
Expected result
The failed pipeline is retried.
Materials and Reference
The reason for this bug is that:
- The `GenerateRetryExecution` function uses `RetrievePodName` to collect the list of pods to delete.
- When `POD_NAMES=v1` is set, the actual pod names match the Argo Workflows node IDs, not the names computed by `RetrievePodName`.
- This causes the deletion to fail, leaving the old pods behind and blocking pipeline retries (see the sketch after this list).
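To make the mismatch concrete, here is a minimal Go sketch. The helper names and example values are hypothetical and the v2 formula is simplified (Argo's real v2 names also handle length limits); the point is only that a v1 pod name is the node ID, while a name computed under the v2 scheme looks entirely different:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fnvHash is an illustrative stand-in for the hash Argo applies to the
// node name when building v2 pod names.
func fnvHash(s string) uint32 {
	h := fnv.New32a()
	_, _ = h.Write([]byte(s))
	return h.Sum32()
}

// podNameV1 returns the v1-style pod name, which is simply the Argo node ID.
func podNameV1(nodeID string) string {
	return nodeID
}

// podNameV2 returns a simplified v2-style pod name built from the workflow
// name, the template name, and a hash of the node name.
func podNameV2(workflowName, templateName, nodeName string) string {
	return fmt.Sprintf("%s-%s-%d", workflowName, templateName, fnvHash(nodeName))
}

func main() {
	// Hypothetical values for a failed step in a run.
	workflowName := "my-pipeline-abc12"
	templateName := "train-model"
	nodeName := "my-pipeline-abc12.train-model"
	nodeID := "my-pipeline-abc12-1234567890"

	// With POD_NAMES=v1 the pod that actually exists is named after the node
	// ID, so a retry that computes the v2-style name never finds it to delete.
	fmt.Println("actual pod name (v1): ", podNameV1(nodeID))
	fmt.Println("computed pod name (v2):", podNameV2(workflowName, templateName, nodeName))
}
```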
Proposed Solution
To fix this, we should use the `workflows.argoproj.io/pod-name-format` annotation in the `RetrievePodName` function.
A similar solution is implemented in the frontend in this PR: #11682
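As a rough sketch of what an annotation-aware lookup could look like on the backend (the function signature is an assumption, `podNameV2` is the illustrative helper from the sketch above, and this is not the existing `RetrievePodName` code):

```go
package main

import (
	wfv1 "github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1"
)

// retrievePodName is a hypothetical, annotation-aware variant of the lookup
// used when collecting pods to delete for a retry.
func retrievePodName(wf *wfv1.Workflow, node wfv1.NodeStatus) string {
	// Argo records the naming scheme it actually used as an annotation on
	// the workflow object.
	if wf.Annotations["workflows.argoproj.io/pod-name-format"] == "v1" {
		// v1: the pod is named after the Argo node ID.
		return node.ID
	}
	// v2 (the default in recent Argo versions): workflow/template/node based.
	return podNameV2(wf.Name, node.TemplateName, node.Name)
}
```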
Impacted by this bug? Give it a 👍.