Commit 4c93f58

k8s appwrapper config

1 parent a45be3f commit 4c93f58

6 files changed: +82 -16 lines changed

setup.k8s-v1.25/CLUSTER-SETUP.md

+13 -7
@@ -70,13 +70,19 @@ Install the AppWrapper Operator
 ```sh
 kubectl apply --server-side -k setup.k8s-v1.25/appwrapper
 ```
-
-- the AppWrapper controller is enabled and configured as follows:
-  - `userRBACAdmissionCheck` is disabled,
-  - `schedulerName` is set to `scheduler-plugins-scheduler`,
-  - `queueName` is set to `default-queue`,
-
-TODO: *** UNDER CONSTRUCTION **
+The provided configuration differs from the default configuration of the
+operators as follows:
+- Kubeflow Training Operator:
+  - `gang-scheduler-name` is set to `scheduler-plugins-scheduler`,
+- Kueue:
+  - `manageJobsWithoutQueueName` is enabled,
+  - `batch/job` integration is disabled,
+  - `waitForPodsReady` is disabled,
+- AppWrapper operator:
+  - `userRBACAdmissionCheck` is disabled,
+  - `schedulerName` is set to `scheduler-plugins-scheduler`,
+  - `queueName` is set to `default-queue`,
+  - pod priorities, resource requests and limits have been adjusted.
 
 ## Kueue Configuration
 

@@ -0,0 +1,21 @@
+kind: ConfigMap
+apiVersion: v1
+metadata:
+  name: appwrapper-operator-config
+  namespace: appwrapper-system
+data:
+  config.yaml: |
+    appwrapper:
+      enableKueueIntegrations: true
+      kueueJobReconciller:
+        manageJobsWithoutQueueName: true
+        waitForPodsReady: false
+      queueName: default-queue
+      schedulerName: scheduler-plugins-scheduler
+      userRBACAdmissionCheck: false
+    controllerManager:
+      health:
+        bindAddress: ":8081"
+      metrics:
+        bindAddress: "127.0.0.1:8080"
+      leaderElection: true
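
Note: as a rough cross-check between this ConfigMap and the AppWrapper overrides listed in CLUSTER-SETUP.md, the sketch below mirrors the `appwrapper` section of the embedded `config.yaml` as a Python dict and compares it with the three documented values. The nesting of the keys is an assumption, because indentation is not reliably visible in this view, and `CONFIG`, `EXPECTED`, and `check_overrides` are hypothetical names that are not part of the commit.

```python
# Mirror of the "appwrapper" section of the embedded config.yaml.
# Assumption: the nesting is inferred; it is not confirmed by the diff view.
CONFIG = {
    "appwrapper": {
        "enableKueueIntegrations": True,
        "kueueJobReconciller": {  # spelled as in the committed config
            "manageJobsWithoutQueueName": True,
            "waitForPodsReady": False,
        },
        "queueName": "default-queue",
        "schedulerName": "scheduler-plugins-scheduler",
        "userRBACAdmissionCheck": False,
    },
}

# Overrides that CLUSTER-SETUP.md documents for the AppWrapper operator.
EXPECTED = {
    "userRBACAdmissionCheck": False,
    "schedulerName": "scheduler-plugins-scheduler",
    "queueName": "default-queue",
}


def check_overrides(config, expected):
    """Return {key: (expected, actual)} for every documented value that differs."""
    section = config.get("appwrapper", {})
    return {
        key: (want, section.get(key))
        for key, want in expected.items()
        if section.get(key) != want
    }


mismatches = check_overrides(CONFIG, EXPECTED)
print("mismatches:", mismatches)
```

A missing key is reported as a mismatch with an actual value of `None`, so a documented override that silently drops out of the config would also be flagged.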
@@ -0,0 +1,15 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
+namespace: mlbatch-system
+
+resources:
+- "https://github.com/project-codeflare/appwrapper/config/default?ref=v0.21.0"
+
+images:
+- name: quay.io/ibm/appwrapper
+  newTag: v0.21.0
+
+patches:
+- path: manager_resources_patch.yaml
+- path: config_patch.yaml
@@ -0,0 +1,18 @@
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: controller-manager
+  namespace: system
+spec:
+  template:
+    spec:
+      priorityClassName: system-node-critical
+      containers:
+      - name: manager
+        resources:
+          requests:
+            cpu: 250m
+            memory: 250Mi
+          limits:
+            cpu: 1000m
+            memory: 1000Mi
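
Note: the resource patch above sets requests below limits for both CPU and memory. A minimal sketch of how such values could be sanity-checked is shown here. It handles only a simplified subset of Kubernetes quantity suffixes, and `parse_quantity`, `requests_within_limits`, and `MANAGER_RESOURCES` are hypothetical helpers written for this illustration, not part of the commit or of any Kubernetes client library.

```python
from fractions import Fraction

# Simplified subset of Kubernetes quantity suffixes (assumption: enough for
# values such as "250m" and "250Mi"; exponent forms are not handled).
_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,
    "k": 1000, "M": 1000**2, "G": 1000**3,
    "m": Fraction(1, 1000),
}


def parse_quantity(text):
    """Convert a quantity such as '250m' or '250Mi' into an exact Fraction."""
    # Check two-character suffixes first so "Mi" is matched before "M" or "m".
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Fraction(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Fraction(text)


def requests_within_limits(resources):
    """True if every request that has a matching limit does not exceed it."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    return all(
        parse_quantity(requests[name]) <= parse_quantity(limits[name])
        for name in requests
        if name in limits
    )


# Values copied from the manager container in the patch above.
MANAGER_RESOURCES = {
    "requests": {"cpu": "250m", "memory": "250Mi"},
    "limits": {"cpu": "1000m", "memory": "1000Mi"},
}

print(requests_within_limits(MANAGER_RESOURCES))
```

`Fraction` keeps the comparison exact, which avoids floating-point rounding when milli-CPU values are compared with whole cores.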

setup.k8s-v1.25/kueue/controller_manager_config.yaml

+2 -2
@@ -22,8 +22,8 @@ clientConnection:
   qps: 50
   burst: 100
 #pprofBindAddress: :8083
-#waitForPodsReady:
-# enable: false
+waitForPodsReady:
+  enable: false
 # timeout: 5m
 # blockAdmission: false
 # requeuingStrategy:

setup.tmpl/CLUSTER-SETUP.md.tmpl

+13 -7
@@ -148,13 +148,19 @@ Install the AppWrapper Operator
 ```sh
 {{ .KUBECTL }} apply --server-side -k setup.{{ .VERSION }}/appwrapper
 ```
-
-- the AppWrapper controller is enabled and configured as follows:
-  - `userRBACAdmissionCheck` is disabled,
-  - `schedulerName` is set to `scheduler-plugins-scheduler`,
-  - `queueName` is set to `default-queue`,
-
-TODO: *** UNDER CONSTRUCTION **
+The provided configuration differs from the default configuration of the
+operators as follows:
+- Kubeflow Training Operator:
+  - `gang-scheduler-name` is set to `scheduler-plugins-scheduler`,
+- Kueue:
+  - `manageJobsWithoutQueueName` is enabled,
+  - `batch/job` integration is disabled,
+  - `waitForPodsReady` is disabled,
+- AppWrapper operator:
+  - `userRBACAdmissionCheck` is disabled,
+  - `schedulerName` is set to `scheduler-plugins-scheduler`,
+  - `queueName` is set to `default-queue`,
+  - pod priorities, resource requests and limits have been adjusted.
 
 {{- end }}
 
