Investigation: check scheduler performance when there are many workloads

I'm not sure whether we have an issue here, but I would like to verify whether the scheduler sends multiple Workload update requests when the cluster is busy. Consider the following scenario:

1. The cluster has 100 GPUs, and all of them are busy.
2. We consider workload-X, but it is inadmissible, so we put it into inadmissibleWorkloads. We update the workload with the reason [here](https://github.com/kubernetes-sigs/kueue/blob/main/pkg/scheduler/scheduler.go#L798-L807).
3. Some workload finishes, so we requeue all workloads.
4. workload-X is reconsidered, but it still cannot fit.

Question: do we send another request to update workload-X, or do we skip the update? IIUC from the code, even if we skip the update, we still send the [event](https://github.com/kubernetes-sigs/kueue/blob/main/pkg/scheduler/scheduler.go#L808).

The goal is to better understand whether we could improve performance in large-scale deployments with 10k workloads and a constant inflow of new workloads, where "requeue" is called almost all the time and workloads are constantly re-evaluated.
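To make the question concrete, here is a minimal sketch of the "skip the update" behavior being asked about. It is not Kueue's code: `Condition`, `Workload`, `fakeClient`, and `markInadmissible` are hypothetical stand-ins, and the condition values are illustrative. It only shows the idea that a repeated evaluation with an unchanged reason would not need to produce another request. Whether the scheduler already behaves this way, and whether the event is still emitted in that case, is what this investigation should confirm.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a status condition.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// Workload is a hypothetical, stripped-down workload with a single condition.
type Workload struct {
	Name      string
	Condition Condition
}

// fakeClient counts the status updates that would reach the API server.
type fakeClient struct {
	updates int
}

// UpdateStatus simulates one write request to the API server.
func (c *fakeClient) UpdateStatus(w *Workload) { c.updates++ }

// markInadmissible stores the "still pending" condition on the workload.
// It sends a request only when the desired condition differs from the one
// already stored, and reports whether a request was sent.
func markInadmissible(c *fakeClient, w *Workload, msg string) bool {
	// Illustrative values; the real condition type and reason may differ.
	want := Condition{Type: "QuotaReserved", Status: "False", Reason: "Pending", Message: msg}
	if w.Condition == want {
		return false // nothing changed, so no request is needed
	}
	w.Condition = want
	c.UpdateStatus(w)
	return true
}

func main() {
	c := &fakeClient{}
	w := &Workload{Name: "workload-X"}

	// Three scheduling attempts that all fail for the same reason.
	for i := 0; i < 3; i++ {
		sent := markInadmissible(c, w, "insufficient unused quota for gpu")
		fmt.Printf("attempt %d: request sent = %v\n", i+1, sent)
	}
	fmt.Println("total update requests:", c.updates)
}
```

In this sketch only the first attempt sends a request. If the message embedded in the condition changes between attempts (for example because it includes current usage numbers), every attempt would still send a request, which is one thing worth checking in the real code.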


Investigation: check scheduler performance when there are many workloads #8081
