Commit c43ce06

Merge pull request #344 from zsnmwy/feat/trans_crane_scheduler_docs_to_cn
feat: trans crane-scheduler docs to cn
2 parents 06cac8d + 66c8063 commit c43ce06

2 files changed: 266 additions and 5 deletions

docs/tutorials/scheduling-pods-based-on-actual-node-load.md

Lines changed: 17 additions & 5 deletions
@@ -3,7 +3,7 @@
33
## Overview
44
Crane-scheduler is a collection of scheduler plugins based on [scheduler framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/), including:
55

6-
- [Dynamic scheuler: a load-aware scheduler plugin](./dynamic-scheduler-plugin.md)
6+
- [Dynamic scheduler: a load-aware scheduler plugin](./dynamic-scheduler-plugin.md)
77

88
## Get Started
99

@@ -185,10 +185,22 @@ profiles:
185185
image: docker.io/gocrane/crane-scheduler:0.0.23
186186
...
187187
```
188-
5. Install [crane-scheduler-controller](deploy/controller/deployment.yaml):
189-
```bash
190-
kubectl apply ./deploy/controller/rbac.yaml && kubectl apply -f ./deploy/controller/deployment.yaml
191-
```
188+
5. Install [crane-scheduler-controller](https://github.com/gocrane/crane-scheduler/tree/main/deploy/controller):
189+
190+
=== "Main"
191+
192+
```bash
193+
kubectl apply -f https://raw.githubusercontent.com/gocrane/crane-scheduler/main/deploy/controller/rbac.yaml
194+
kubectl apply -f https://raw.githubusercontent.com/gocrane/crane-scheduler/main/deploy/controller/deployment.yaml
195+
```
196+
197+
=== "Mirror"
198+
199+
200+
```bash
201+
kubectl apply -f https://finops.coding.net/p/gocrane/d/crane-scheduler/git/raw/main/deploy/controller/rbac.yaml?download=false
202+
kubectl apply -f https://finops.coding.net/p/gocrane/d/crane-scheduler/git/raw/main/deploy/controller/deployment.yaml?download=false
203+
```
192204

193205
### Schedule Pods With Crane-scheduler
194206
Test Crane-scheduler with the following example:
Lines changed: 249 additions & 0 deletions
@@ -0,0 +1,249 @@
1+
# Crane-scheduler
2+
3+
## Overview
4+
Crane-scheduler is a collection of scheduler plugins based on the [scheduler framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/), including:
5+
6+
- [Dynamic scheduler: a load-aware scheduler plugin](./dynamic-scheduler-plugin.md)
7+
8+
## Get Started
9+
10+
### Install Prometheus
11+
Make sure Prometheus is installed in your Kubernetes cluster. If it is not, see [Install Prometheus](https://github.com/gocrane/fadvisor/blob/main/README.md#prerequests).
12+
13+
### Configure Prometheus Rules
14+
15+
1. Configure Prometheus rules to produce the expected aggregated data:
16+
17+
```yaml
18+
apiVersion: monitoring.coreos.com/v1
19+
kind: PrometheusRule
20+
metadata:
21+
name: example-record
22+
spec:
23+
groups:
24+
- name: cpu_mem_usage_active
25+
interval: 30s
26+
rules:
27+
- record: cpu_usage_active
28+
expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[30s])) * 100)
29+
- record: mem_usage_active
30+
expr: 100*(1-node_memory_MemAvailable_bytes/node_memory_MemTotal_bytes)
31+
- name: cpu-usage-5m
32+
interval: 5m
33+
rules:
34+
- record: cpu_usage_max_avg_1h
35+
expr: max_over_time(cpu_usage_avg_5m[1h])
36+
- record: cpu_usage_max_avg_1d
37+
expr: max_over_time(cpu_usage_avg_5m[1d])
38+
- name: cpu-usage-1m
39+
interval: 1m
40+
rules:
41+
- record: cpu_usage_avg_5m
42+
expr: avg_over_time(cpu_usage_active[5m])
43+
- name: mem-usage-5m
44+
interval: 5m
45+
rules:
46+
- record: mem_usage_max_avg_1h
47+
expr: max_over_time(mem_usage_avg_5m[1h])
48+
- record: mem_usage_max_avg_1d
49+
expr: max_over_time(mem_usage_avg_5m[1d])
50+
- name: mem-usage-1m
51+
interval: 1m
52+
rules:
53+
- record: mem_usage_avg_5m
54+
expr: avg_over_time(mem_usage_active[5m])
55+
```
56+
!!! warning "Troubleshooting"
57+
58+
The Prometheus scrape interval must be less than 30 seconds; otherwise rules such as `cpu_usage_active` may not work properly.
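
The recording rules above form a small pipeline: instantaneous usage is averaged over 5 minutes, and the maximum of those averages is then taken over 1 hour and 1 day. The following Python sketch mirrors that chain on in-memory samples. The function names and sample values are hypothetical and only illustrate the arithmetic; Prometheus itself evaluates the real rules.

```python
from statistics import mean


def cpu_usage_active(idle_fractions):
    """Mirror `100 - avg(irate(idle)) * 100` for one node.

    idle_fractions: per-CPU idle rates in the range 0..1 (hypothetical input).
    """
    return 100 - mean(idle_fractions) * 100


def mem_usage_active(mem_available_bytes, mem_total_bytes):
    """Mirror `100*(1-node_memory_MemAvailable_bytes/node_memory_MemTotal_bytes)`."""
    return 100 * (1 - mem_available_bytes / mem_total_bytes)


def avg_over_time(samples):
    """Average of the samples that fall inside the window."""
    return mean(samples)


def max_over_time(samples):
    """Largest sample inside the window."""
    return max(samples)


if __name__ == "__main__":
    # Hypothetical cpu_usage_active samples covering one 5-minute window.
    window = [cpu_usage_active([0.5, 0.75]), cpu_usage_active([0.25, 0.25])]
    cpu_usage_avg_5m = avg_over_time(window)
    # Hypothetical earlier 5-minute averages within the last hour.
    cpu_usage_max_avg_1h = max_over_time([cpu_usage_avg_5m, 40.0, 62.0])
    print(cpu_usage_avg_5m, cpu_usage_max_avg_1h)
```

Because `cpu_usage_max_avg_1h` and `cpu_usage_max_avg_1d` read from `cpu_usage_avg_5m`, they carry no data until the 1-minute rule group has produced samples.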
59+
60+
2\. Update the Prometheus service discovery configuration so that `node_exporters/telegraf` use the node name as the instance name:
61+
62+
```yaml hl_lines="9-11"
63+
- job_name: kubernetes-node-exporter
64+
tls_config:
65+
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
66+
insecure_skip_verify: true
67+
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
68+
scheme: https
69+
kubernetes_sd_configs:
70+
...
71+
# Host name
72+
- source_labels: [__meta_kubernetes_node_name]
73+
target_label: instance
74+
...
75+
```
76+
77+
!!! note "Note"
78+
79+
You can skip this step if the node name is the host IP.
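
The relabel rule in step 2 copies the discovered node name into the `instance` label. As a rough illustration, the Python sketch below applies the same substitution to a plain dictionary of labels. The function name and sample labels are hypothetical, and the sketch ignores edge cases such as empty values; it is not how Prometheus implements relabeling.

```python
def relabel_instance(discovered_labels):
    """Copy `__meta_kubernetes_node_name` into `instance`, as a `replace` rule would.

    Returns a new dict; the input is left untouched.
    """
    labels = dict(discovered_labels)
    node_name = labels.get("__meta_kubernetes_node_name")
    if node_name:
        labels["instance"] = node_name
    # Labels that start with "__" are dropped once relabeling has finished.
    return {key: value for key, value in labels.items() if not key.startswith("__")}


if __name__ == "__main__":
    # Hypothetical target as discovered through Kubernetes service discovery.
    target = {
        "__address__": "10.0.0.5:9100",
        "__meta_kubernetes_node_name": "node-1",
        "instance": "10.0.0.5:9100",
    }
    print(relabel_instance(target))
```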
80+
81+
### Install Crane-scheduler
82+
There are two options:
83+
84+
- Install Crane-scheduler as a second scheduler
85+
- Replace the native Kube-scheduler with Crane-scheduler
86+
87+
#### Install Crane-scheduler as a Second Scheduler
88+
=== "Main"
89+
90+
```bash
91+
helm repo add crane https://gocrane.github.io/helm-charts
92+
helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="REPLACE_ME_WITH_PROMETHEUS_ADDR" crane/scheduler
93+
```
94+
95+
=== "Mirror"
96+
97+
```bash
98+
helm repo add crane https://finops-helm.pkg.coding.net/gocrane/gocrane
99+
helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="REPLACE_ME_WITH_PROMETHEUS_ADDR" crane/scheduler
100+
```
101+
#### Replace the Native Kube-scheduler With Crane-scheduler
102+
103+
1. Back up `/etc/kubernetes/manifests/kube-scheduler.yaml`:
104+
```bash
105+
cp /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/
106+
```
107+
2. Edit the kube-scheduler configuration file (`scheduler-config.yaml`) to enable the Dynamic scheduler plugin and configure its arguments:
108+
```yaml title="scheduler-config.yaml"
109+
apiVersion: kubescheduler.config.k8s.io/v1beta2
110+
kind: KubeSchedulerConfiguration
111+
...
112+
profiles:
113+
- schedulerName: default-scheduler
114+
plugins:
115+
filter:
116+
enabled:
117+
- name: Dynamic
118+
score:
119+
enabled:
120+
- name: Dynamic
121+
weight: 3
122+
pluginConfig:
123+
- name: Dynamic
124+
args:
125+
policyConfigPath: /etc/kubernetes/policy.yaml
126+
...
127+
```
128+
3. Create `/etc/kubernetes/policy.yaml` to serve as the scheduling policy for the Dynamic plugin:
129+
```yaml title="/etc/kubernetes/policy.yaml"
130+
apiVersion: scheduler.policy.crane.io/v1alpha1
131+
kind: DynamicSchedulerPolicy
132+
spec:
133+
syncPolicy:
134+
##cpu usage
135+
- name: cpu_usage_avg_5m
136+
period: 3m
137+
- name: cpu_usage_max_avg_1h
138+
period: 15m
139+
- name: cpu_usage_max_avg_1d
140+
period: 3h
141+
##memory usage
142+
- name: mem_usage_avg_5m
143+
period: 3m
144+
- name: mem_usage_max_avg_1h
145+
period: 15m
146+
- name: mem_usage_max_avg_1d
147+
period: 3h
148+
149+
predicate:
150+
##cpu usage
151+
- name: cpu_usage_avg_5m
152+
maxLimitPecent: 0.65
153+
- name: cpu_usage_max_avg_1h
154+
maxLimitPecent: 0.75
155+
##memory usage
156+
- name: mem_usage_avg_5m
157+
maxLimitPecent: 0.65
158+
- name: mem_usage_max_avg_1h
159+
maxLimitPecent: 0.75
160+
161+
priority:
162+
##cpu usage
163+
- name: cpu_usage_avg_5m
164+
weight: 0.2
165+
- name: cpu_usage_max_avg_1h
166+
weight: 0.3
167+
- name: cpu_usage_max_avg_1d
168+
weight: 0.5
169+
##memory usage
170+
- name: mem_usage_avg_5m
171+
weight: 0.2
172+
- name: mem_usage_max_avg_1h
173+
weight: 0.3
174+
- name: mem_usage_max_avg_1d
175+
weight: 0.5
176+
177+
hotValue:
178+
- timeRange: 5m
179+
count: 5
180+
- timeRange: 1m
181+
count: 2
182+
```
183+
4. Edit `kube-scheduler.yaml` and replace the kube-scheduler image with the Crane-scheduler image:
184+
```yaml title="kube-scheduler.yaml"
185+
...
186+
image: docker.io/gocrane/crane-scheduler:0.0.23
187+
...
188+
```
189+
5. Install [crane-scheduler-controller](https://github.com/gocrane/crane-scheduler/tree/main/deploy/controller):
190+
=== "Main"
191+
192+
```bash
193+
kubectl apply -f https://raw.githubusercontent.com/gocrane/crane-scheduler/main/deploy/controller/rbac.yaml
194+
kubectl apply -f https://raw.githubusercontent.com/gocrane/crane-scheduler/main/deploy/controller/deployment.yaml
195+
```
196+
197+
=== "Mirror"
198+
199+
```bash
200+
kubectl apply -f https://finops.coding.net/p/gocrane/d/crane-scheduler/git/raw/main/deploy/controller/rbac.yaml?download=false
201+
kubectl apply -f https://finops.coding.net/p/gocrane/d/crane-scheduler/git/raw/main/deploy/controller/deployment.yaml?download=false
202+
```
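
To make the `predicate` and `priority` sections of `policy.yaml` more concrete, here is a minimal Python sketch of how such thresholds and weights could be applied to one node. It is an assumption-laden illustration, not the plugin's actual algorithm: it assumes usage values are fractions in the range 0..1, skips metrics that are missing, scores a node by the weighted average of its free capacity, and leaves out `hotValue` entirely. The function names are hypothetical.

```python
# Thresholds and weights copied from the policy.yaml example in step 3.
PREDICATE = {
    "cpu_usage_avg_5m": 0.65,
    "cpu_usage_max_avg_1h": 0.75,
    "mem_usage_avg_5m": 0.65,
    "mem_usage_max_avg_1h": 0.75,
}
PRIORITY = {
    "cpu_usage_avg_5m": 0.2,
    "cpu_usage_max_avg_1h": 0.3,
    "cpu_usage_max_avg_1d": 0.5,
    "mem_usage_avg_5m": 0.2,
    "mem_usage_max_avg_1h": 0.3,
    "mem_usage_max_avg_1d": 0.5,
}


def passes_predicate(usage, predicate=PREDICATE):
    """Return True when no reported metric exceeds its threshold.

    usage: metric name -> fraction in 0..1 (assumed unit); missing metrics are skipped.
    """
    return all(usage[name] <= limit for name, limit in predicate.items() if name in usage)


def priority_score(usage, priority=PRIORITY):
    """Weighted average of free capacity, scaled to 0..100 (assumed formula)."""
    total_weight = 0.0
    weighted_free = 0.0
    for name, weight in priority.items():
        if name in usage:
            weighted_free += weight * (1.0 - usage[name])
            total_weight += weight
    if total_weight == 0:
        return 0.0
    return 100.0 * weighted_free / total_weight


if __name__ == "__main__":
    # Hypothetical usage reported for one node.
    node_usage = {"cpu_usage_avg_5m": 0.40, "mem_usage_avg_5m": 0.55, "cpu_usage_max_avg_1d": 0.60}
    if passes_predicate(node_usage):
        print("schedulable, score:", priority_score(node_usage))
    else:
        print("filtered out")
```

Under these assumptions, a node is rejected as soon as one reported metric crosses its threshold, while the long-window metrics (weight 0.5) influence ranking more than the 5-minute averages (weight 0.2).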
203+
204+
### Schedule Pods With Crane-scheduler
205+
Test Crane-scheduler with the following example:
206+
207+
```yaml
208+
apiVersion: apps/v1
209+
kind: Deployment
210+
metadata:
211+
name: cpu-stress
212+
spec:
213+
selector:
214+
matchLabels:
215+
app: cpu-stress
216+
replicas: 1
217+
template:
218+
metadata:
219+
labels:
220+
app: cpu-stress
221+
spec:
222+
schedulerName: crane-scheduler
223+
hostNetwork: true
224+
tolerations:
225+
- key: node.kubernetes.io/network-unavailable
226+
operator: Exists
227+
effect: NoSchedule
228+
containers:
229+
- name: stress
230+
image: docker.io/gocrane/stress:latest
231+
command: ["stress", "-c", "1"]
232+
resources:
233+
requests:
234+
memory: "1Gi"
235+
cpu: "1"
236+
limits:
237+
memory: "1Gi"
238+
cpu: "1"
239+
```
240+
!!! Note
241+
242+
If Crane-scheduler is used as the default scheduler, change `crane-scheduler` to `default-scheduler`.
243+
244+
If the test pod is scheduled successfully, the following event appears:
245+
```bash
246+
Type Reason Age From Message
247+
---- ------ ---- ---- -------
248+
Normal Scheduled 28s crane-scheduler Successfully assigned default/cpu-stress-7669499b57-zmrgb to vm-162-247-ubuntu
249+
```
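
When checking many pods, it can help to read the scheduler name out of the event row automatically. The Python sketch below parses one row of the event table shown above and returns the `From` column for a `Scheduled` event. The function name is hypothetical, and the sketch assumes the `Age` column is a single token such as `28s`.

```python
def scheduled_by(event_line):
    """Return the scheduler named in a `Scheduled` event row, or None.

    Expected columns: Type, Reason, Age, From, Message.
    """
    fields = event_line.split()
    if len(fields) >= 4 and fields[1] == "Scheduled":
        return fields[3]
    return None


if __name__ == "__main__":
    row = (
        "Normal Scheduled 28s crane-scheduler "
        "Successfully assigned default/cpu-stress-7669499b57-zmrgb to vm-162-247-ubuntu"
    )
    print(scheduled_by(row) == "crane-scheduler")
```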
