# Quota Maintenance

- Kubernetes built-in `ResourceQuotas` should not be combined with Kueue quotas.
+ A *team* in MLBatch is a group of users that share a resource quota.

- Kueue quotas can be adjusted post creation. Workloads already admitted are not
- impacted.
+ In Kueue, the `ClusterQueue` is the abstraction used to define a pool
+ of resources (`cpu`, `memory`, `nvidia.com/gpu`, etc.) that is
+ available to a team. A `LocalQueue` is the abstraction used by
+ members of the team to submit workloads to a `ClusterQueue` for
+ execution using those resources.
+
+ Kubernetes built-in `ResourceQuotas` should not be used for resources that
+ are being managed by `ClusterQueues`. The two quota systems are incompatible.
+
+ We strongly recommend maintaining a simple relationship
+ between teams, namespaces, `ClusterQueues` and `LocalQueues`. Each
+ team should be assigned to its own namespace that contains a single
+ `LocalQueue`, which is configured to be the only `LocalQueue` that
+ targets the team's `ClusterQueue`.
+
+ The quotas assigned to a `ClusterQueue` can be dynamically adjusted by
+ a cluster admin at any time. Adjustments to quotas only impact queued
+ workloads; workloads already admitted for execution are not impacted
+ by quota adjustments.
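As a rough illustration of the one-team, one-namespace, one-queue layout recommended above, the manifests below sketch a single team's `ClusterQueue` and `LocalQueue`. The names `team-a`, `team-a-cq`, `default-queue`, `default-cohort`, and `default-flavor`, and all quota values, are hypothetical placeholders rather than values taken from MLBatch. The sketch also assumes that the `team-a` namespace and a `ResourceFlavor` named `default-flavor` already exist.

```yaml
# Hypothetical ClusterQueue holding the quota for one team.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a-cq
spec:
  cohort: default-cohort            # assumed cohort name
  namespaceSelector:
    matchLabels:
      # Only accept workloads from the team's own namespace.
      kubernetes.io/metadata.name: team-a
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu", "pods"]
    flavors:
    - name: default-flavor          # assumed ResourceFlavor
      resources:
      - name: "cpu"
        nominalQuota: 64            # illustrative values only
      - name: "memory"
        nominalQuota: 512Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 8
      - name: "pods"
        nominalQuota: 100
---
# The single LocalQueue in the team's namespace that targets the ClusterQueue.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: default-queue
  namespace: team-a
spec:
  clusterQueue: team-a-cq
```

Under this layout, a cluster admin would change the team's quota by editing the `nominalQuota` values of `team-a-cq`.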

For Kueue quotas to be effective, the sum of all quotas for each managed
resource (`cpu`, `memory`, `nvidia.com/gpu`, `pods`) must be maintained to
@@ -14,15 +31,18 @@ less. Quotas should be reduced when the available capacity is reduced whether
because of failures or due to the allocation of resources to non-batch
workloads.

- To facilitate the necessary quota adjustments, one option is to setup a
- dedicated cluster queue for slack capacity that other cluster queues can borrow
- from. This queue should not be associated with any team, project, namespace, or
- local queue. Its quota should be adjusted dynamically to reflect changes in
- cluster capacity. If sized appropriately, this queue will make adjustments to
- other cluster queues unnecessary for small cluster capacity changes. Concretely,
- two teams could be granted 45% of the cluster capacity, with 10% capacity set
- aside for this extra cluster queue. Any changes to the cluster capacity below
- 10% can then be handled by adjusting the latter.
+ To facilitate the necessary quota adjustments, we recommend setting up
+ a dedicated cluster queue for slack capacity that other cluster queues
+ can borrow from. This queue should not be associated with any team,
+ project, namespace, or local queue. Its `lendingLimit` should be adjusted
+ dynamically to reflect changes in cluster capacity. If sized
+ appropriately, this queue will make adjustments to other cluster
+ queues unnecessary for small cluster capacity changes. The figure
+ below shows this recommended setup for an MLBatch cluster with three
+ teams. Beginning with RHOAI 2.12 (AppWrapper v0.23), the dynamic
+ adjustment of the Slack `ClusterQueue` `lendingLimit` can be
+ configured to be fully automated.
+ ![Figure with ClusterQueues for three teams and slack](./figures/CohortWithSlackCQ.png)
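A slack queue of this kind might look roughly like the following sketch. The names `slack-cq`, `default-cohort`, and `default-flavor`, and all numeric values, are hypothetical and chosen only for illustration. The sketch assumes the team queues belong to the same cohort and use the same `ResourceFlavor`.

```yaml
# Hypothetical slack ClusterQueue; no LocalQueue targets it.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: slack-cq
spec:
  cohort: default-cohort            # same (assumed) cohort as the team queues
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu", "pods"]
    flavors:
    - name: default-flavor          # assumed ResourceFlavor
      resources:
      - name: "cpu"
        nominalQuota: 16            # illustrative values only
        lendingLimit: 16
      - name: "memory"
        nominalQuota: 128Gi
        lendingLimit: 128Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 8
        lendingLimit: 8             # lower this when GPUs become unavailable
      - name: "pods"
        nominalQuota: 50
        lendingLimit: 50
```

In this sketch, if two GPUs became unavailable, an admin would lower the `nvidia.com/gpu` `lendingLimit` from 8 to 6 and leave the team queues unchanged.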

Every resource name occurring in the resource requests or limits of a workload
must be covered by a cluster queue intended to admit the workload, even if the