Recommended approach for dynamic quotas #2512
Unanswered
krestjaninoff asked this question in Q&A
Replies: 1 comment
-
As I understand it, when you use GKE with the cluster autoscaler, the ClusterQueue is adjusted automatically based on the currently available nodes via ProvisioningRequest. Where do you run Kueue: on GKE, on another cloud provider, or on-prem?
-
It is quite common for large clusters to have temporarily unavailable nodes due to hardware issues or other reasons. In this case, Kueue's ClusterQueues have to be updated to reflect the actual amount of available resources. Otherwise, a job may start only partially, with a subset of its pods left unschedulable because there are not enough available nodes.
What's the recommended way to address this issue? Should it be a custom solution that monitors node health events and updates the Kueue configuration?
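To make the "custom solution" idea concrete, here is a minimal sketch of what such a controller might compute. It is not a recommended or official Kueue mechanism. It assumes node objects shaped like the items returned by `kubectl get nodes -o json`, a ClusterQueue with a single resource group and a single flavor whose first resource is `cpu`, and the helper names (`parse_cpu`, `is_usable`, `quota_patch`) are hypothetical.

```python
from decimal import Decimal


def parse_cpu(quantity: str) -> Decimal:
    """Parse a Kubernetes CPU quantity such as "4" or "3920m" into cores."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def is_usable(node: dict) -> bool:
    """A node counts only if it is schedulable and reports Ready=True."""
    if node.get("spec", {}).get("unschedulable", False):
        return False
    conditions = node.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


def available_cpu_millicores(nodes: list) -> int:
    """Sum allocatable CPU over usable nodes, in millicores."""
    total = sum(
        (parse_cpu(n["status"]["allocatable"]["cpu"]) for n in nodes if is_usable(n)),
        Decimal(0),
    )
    return int(total * 1000)


def quota_patch(nodes: list) -> list:
    """Build a JSON Patch for the ClusterQueue's nominalQuota.

    Assumption: one resource group, one flavor, and "cpu" is the first
    resource, so the indices in the path are all 0.
    """
    return [
        {
            "op": "replace",
            "path": "/spec/resourceGroups/0/flavors/0/resources/0/nominalQuota",
            "value": f"{available_cpu_millicores(nodes)}m",
        }
    ]
```

The resulting patch could be applied with something like `kubectl patch clusterqueue <name> --type=json -p '<patch>'`, where `<name>` is whatever your ClusterQueue is called. A real controller would also need to watch node events instead of polling, and decide how to treat workloads that are already admitted when the quota shrinks.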