Make RuntimeInfo thread safe #2873

@mszadkow

Description

What happened?

Calling RuntimeInfo concurrently causes a data race, as observed in this Kueue issue.
As a workaround, we synchronise calls to RuntimeInfo.
The race seems to come from access to syncPodSets.

What did you expect to happen?

RuntimeInfo should be thread safe, so that concurrent calls do not race and callers do not need their own synchronisation.

Environment

Kubernetes version:

$ kubectl version

Kubeflow Trainer version:

$ kubectl get pods -n kubeflow-system -l app.kubernetes.io/name=kubeflow-trainer -o jsonpath="{.items[*].spec.containers[*].image}"

Kubeflow Python SDK version:

$ pip show kubeflow

Impacted by this bug?

Give it a 👍 We prioritize the issues with most 👍
