Commit 90b39bb

docs: finished one slide
Signed-off-by: Neko Ayaka <[email protected]>
1 parent 38fabf4 commit 90b39bb

1 file changed: +51 -32 lines changed

packages/2024-08-21-kubecon-hk/slides.md

@@ -1256,11 +1256,30 @@ class: py-10
 
 ---
 class: py-10
+glow: right
+glowSeed: 230
 ---
 
 # Why can it?
 
-<span>How it was implemented</span>
+<span>Summing up the architecture</span>
+
+<v-clicks depth="2">
+
+- Once labelled, we will watch over the stopped <span text-sky-400><div inline-block i-carbon:cube translate-y-0.8 mr-1 />`Pod`</span> and analyze the:
+  - <span text-violet-300><div inline-block i-carbon:cloud-alerting translate-y-0.8 mr-2 />Node issues</span>
+  - <span text-purple-300><div inline-block i-carbon:ibm-open-enterprise-languages translate-y-0.8 mr-2 />Logs</span> (e.g. <span text="[#64b023]"><div inline-block translate-y-0.8 mr-1 i-bi:nvidia />CUDA</span>, <span text="[#64b023]"><div inline-block translate-y-0.8 mr-1 i-bi:nvidia />cuDNN</span>, <span text="[#64b023]"><div inline-block translate-y-0.8 mr-1 i-bi:nvidia />NCCL</span>, `OOM` errors)
+  - <span text-pink-300><div inline-block i-carbon:exit translate-y-0.8 mr-2 />Exit codes</span>
+- Once an issue is identified:
+  - <span text-purple-300><div inline-block i-carbon:flow-stream-reference translate-y-0.8 mr-2 />an event will be recorded</span> (e.g. container logs, syscalls)
+  - <span text-pink-300>a cascading shutdown will be triggered</span> (which results in the job being restarted by the <div i-devicon:kubernetes inline-block translate-y-0.5 mr-2 /><span text="[#5791f7]">Controller & Operator</span>)
+- For continuous diagnostics, <span text="[#64b023]"><div inline-block translate-y-0.8 mr-1 i-bi:nvidia />`dcgmi`</span>, <span text="[#64b023]"><div inline-block translate-y-0.8 mr-1 i-bi:nvidia />`nvidia-smi`</span>, <span text="[#64b023]"><div inline-block translate-y-0.8 mr-1 i-bi:nvidia />`nccl-test`</span> will be executed periodically to check:
+  - <span text-purple-300><div inline-block i-carbon:flow-stream-reference translate-y-0.8 mr-2 />Network & IO connectivity & throughput</span>
+  - <span text-indigo-300><div inline-block i-bi:gpu-card translate-y-0.8 mr-2 />GPU & VRAM health</span>
+  - <span text-blue-300><div inline-block i-carbon:fusion-blender translate-y-0.8 mr-2 />PCIe status</span>
+  - <span text-sky-300><div inline-block i-carbon:edge-node translate-y-0.8 mr-2 />Kernel modules status</span>
+
+</v-clicks>
 
 ---
 class: py-10
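
The triage step on the "Why can it?" slide (inspect node issues, logs, and exit codes of a stopped `Pod`) could be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not kcover's actual implementation: the `Finding` type, the `classify` function, the check order, and the log patterns are all hypothetical and chosen only for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical log signatures for illustration only; a real deployment
# would need patterns tuned to its own frameworks and drivers.
LOG_PATTERNS = [
    ("NCCL", re.compile(r"NCCL (error|WARN)", re.IGNORECASE)),
    ("cuDNN", re.compile(r"CUDNN_STATUS_\w+")),
    ("CUDA", re.compile(r"CUDA error", re.IGNORECASE)),
    ("OOM", re.compile(r"out of memory", re.IGNORECASE)),
]


@dataclass
class Finding:
    source: str  # "node", "logs", or "exit_code"
    reason: str


def classify(exit_code: int, log_lines: List[str], node_ready: bool) -> Optional[Finding]:
    """Return the first matching finding for a stopped Pod, or None if it looks healthy."""
    # 1. Node issues take priority (assumed ordering).
    if not node_ready:
        return Finding("node", "NodeNotReady")
    # 2. Scan container logs for known accelerator or memory errors.
    for line in log_lines:
        for name, pattern in LOG_PATTERNS:
            if pattern.search(line):
                return Finding("logs", name)
    # 3. Fall back to the exit code; codes above 128 mean "killed by signal".
    if exit_code > 128:
        return Finding("exit_code", f"killed by signal {exit_code - 128}")
    if exit_code != 0:
        return Finding("exit_code", f"exit {exit_code}")
    return None
```

By shell convention an exit code above 128 encodes the terminating signal, so 137 corresponds to `SIGKILL` (signal 9), which frequently accompanies an out-of-memory kill.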
@@ -1312,37 +1331,6 @@ metadata:
 class: py-10
 ---
 
-# Let's build it together
-
-<span>Open sourced, already</span>
-
-<div flex>
-<div
-v-click="1" flex flex-col items-start transition duration-500 ease-in-out
-:class="$clicks < 1 ? 'translate-x--20' : 'translate-x-0'"
->
-<div mt-10 flex gap-16>
-<img src="/kcover-repository-qr.png" w-60 />
-<div text-2xl flex items-center gap-2 mt-4>
-<div i-ri:github-fill /><span underline decoration-dashed font-mono decoration-zinc-300>BaizeAI/kcover</span>
-</div>
-</div>
-</div>
-</div>
-
-<div w-full absolute bottom-0 left-0 flex items-center transform="translate-x--10 translate-y--10">
-<div w-full flex items-center justify-end gap-4>
-<img src="/KubeCon.png" h-10>
-<img src="/CloudNativeCon.png" h="10.1">
-<img src="/OpenSourceSummit.png" h-9>
-<img src="/AI_dev.png" h-4>
-</div>
-</div>
-
----
-class: py-10
----
-
 # Futures
 
 <span>Foresight from our perspective</span>
@@ -1402,6 +1390,37 @@ class: py-10
 class: py-10
 ---
 
+# Let's build it together
+
+<span>Open sourced, already</span>
+
+<div flex>
+<div
+v-click="1" flex flex-col items-start transition duration-500 ease-in-out
+:class="$clicks < 1 ? 'translate-x--20' : 'translate-x-0'"
+>
+<div mt-10 flex gap-16>
+<img src="/kcover-repository-qr.png" w-60 />
+<div text-2xl flex items-center gap-2 mt-4>
+<div i-ri:github-fill /><span underline decoration-dashed font-mono decoration-zinc-300>BaizeAI/kcover</span>
+</div>
+</div>
+</div>
+</div>
+
+<div w-full absolute bottom-0 left-0 flex items-center transform="translate-x--10 translate-y--10">
+<div w-full flex items-center justify-end gap-4>
+<img src="/KubeCon.png" h-10>
+<img src="/CloudNativeCon.png" h="10.1">
+<img src="/OpenSourceSummit.png" h-9>
+<img src="/AI_dev.png" h-4>
+</div>
+</div>
+
+---
+class: py-10
+---
+
 # To community
 
 <span>Let's improve it together</span>
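
The periodic diagnostics mentioned on the "Why can it?" slide (`dcgmi`, `nvidia-smi`, `nccl-test`) might be driven by a loop along these lines. This is only a sketch: the slide names the tools but not their arguments, so the command lines in `PROBES` are assumptions (including the use of `all_reduce_perf` from the nccl-tests suite), and `run_probes`, `unhealthy`, and `watch` are hypothetical helper names rather than anything taken from kcover.

```python
import subprocess
import time
from typing import Callable, Dict, List

# Assumed probe commands; adjust to whatever is installed on the node.
PROBES: Dict[str, List[str]] = {
    "gpu": ["nvidia-smi"],
    "dcgm": ["dcgmi", "diag", "-r", "1"],
    "nccl": ["all_reduce_perf", "-g", "1"],
}


def run_command(argv: List[str]) -> int:
    """Run one probe and return its exit code (127 if missing, 124 on timeout)."""
    try:
        return subprocess.run(argv, capture_output=True, timeout=300).returncode
    except FileNotFoundError:
        return 127
    except subprocess.TimeoutExpired:
        return 124


def run_probes(runner: Callable[[List[str]], int] = run_command) -> Dict[str, bool]:
    """Run every probe once; a probe counts as healthy when it exits with 0."""
    return {name: runner(argv) == 0 for name, argv in PROBES.items()}


def unhealthy(results: Dict[str, bool]) -> List[str]:
    """Return the sorted names of the probes that failed."""
    return sorted(name for name, ok in results.items() if not ok)


def watch(
    interval_s: float,
    rounds: int,
    runner: Callable[[List[str]], int] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Run the probes `rounds` times, sleeping `interval_s` seconds between rounds."""
    history = []
    for i in range(rounds):
        history.append(unhealthy(run_probes(runner)))
        if i < rounds - 1:
            sleep(interval_s)
    return history
```

Passing the runner and the sleep function as parameters keeps the loop testable on machines without a GPU; in a real agent the failed probe names would feed the same event-recording path as the Pod triage.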
