Releases · Project-HAMi/HAMi

07 Nov 06:41

github-actions

v2.7.1

1adf148

v2.7.1 Latest

What's Changed

🐛 Bug Fixes

Major:

Update HAMi-core to fix vLLM-related issues: #1381, #1461 by @archlitchi in #1478
Fix: Calculation error for quotas by @luohua13 in #1400

Others

Fix release CI by @archlitchi in #1373
Fix: failed clusterrolebinding when changing the release name or chart name by @FouoF in #1380
fix: e2e ginkgo version mismatch by @FouoF in #1391
fix: check pod nil in ReleaseNodeLock by @DSFans2014 in #1372
fix: upgrade nvidia-mig-parted to v0.12.2 to solve security issues by @Shouren in #1388
fix: scheduler flaky test by @FouoF in #1402
Fix: After removing the device plugin from the gpu node, it can still… by @luohua13 in #1456
Fix concurrent map iteration and map write fatal error. by @litaixun in #1452
fix: fix typos by @DSFans2014 in #1434
Fix CI error of the PR #1470, #1326, #1033 by @archlitchi in #1473
Fix concurrent map read write fatal error. by @litaixun in #1476
add podInfos in DeviceUsage to enhance scheduling decision by @Kyrie336 in #1362
Update device-numa acquisition logic by @archlitchi in #1403
Improved support for iluvatar GPUs by @qiangwei1983 in #1399
Improve: Replace StrategicMergePatchType by MergePatchType by @luohua13 in #1431
optimize schedule failure event by @Kyrie336 in #1444
Release v2.7.1 by @archlitchi in #1480
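
One entry above replaces StrategicMergePatchType with MergePatchType. The release notes do not say why, so the following is only a minimal sketch of how a JSON merge patch (RFC 7386) behaves in general: objects merge recursively, a null value deletes a key, and lists are replaced wholesale rather than merged element by element. The `merge_patch` helper and the `example.io/lock` annotation key are hypothetical names used for illustration and are not taken from HAMi.

```python
def merge_patch(target, patch):
    """Apply an RFC 7386 JSON merge patch to target and return a new value."""
    # A non-object patch (scalar or list) replaces the target entirely.
    if not isinstance(patch, dict):
        return patch
    # Start from a shallow copy so the caller's object is left untouched.
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in the patch means "remove this key".
            result.pop(key, None)
        else:
            # Nested objects are merged recursively.
            result[key] = merge_patch(result.get(key), value)
    return result


# Hypothetical node-like object; the annotation key is made up for this sketch.
node = {
    "metadata": {
        "annotations": {"example.io/lock": "t1", "other": "keep"},
        "finalizers": ["a", "b"],
    }
}
patch = {
    "metadata": {
        "annotations": {"example.io/lock": None},  # delete one annotation
        "finalizers": ["c"],                       # list is replaced, not merged
    }
}
patched = merge_patch(node, patch)
```

In this sketch the patch removes a single annotation and leaves the other in place, while the `finalizers` list is overwritten as a whole. A strategic merge patch could instead merge some lists by key.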

New Contributors

@luohua13 made their first contribution in #1400
@qiangwei1983 made their first contribution in #1399
@eltociear made their first contribution in #1412
@daixiang0 made their first contribution in #1465
@zhegemingzimeibanquan made their first contribution in #1419

Full Changelog: v2.7.0...v2.7.1

Contributors

Shouren, eltociear, and 9 other contributors

26 Sep 13:41

github-actions

v2.7.0

38714d5

v2.7.0

What's Changed

✨ Key Features

Metax sGPU topology aware by @Kyrie336 in #1193
NVIDIA Resourcequota by @FouoF in #1359
Kunlunxin topology-aware scheduling by @FouoF in #1141
Kunlunxin vxpu support (#1016) by @ouyangluwei163 and @archlitchi in #1337
Enflame GCU topology-awareness (#1040) by @zhaikangqi331 in #1334
AWS-neuron device and device-core allocation by @archlitchi in #1238
Aggregated Scheduling Failure Events by @Wangmin362 in #1333

✨ Other Features

Optimize Fit-in-device logic to make it device-specific by @archlitchi in #1097
feat(scheduler): make node lock timeout configurable by @Kevinz857 in #1117
feature: mig mode-change #1116 by @ouyangluwei163 in #1124
feat: Add new labels in .github/release.yml by @Shouren in #1066
feat(scheduler-role): use a scoped-down role for scheduler by @Antvirf in #1152
feat(helm): optionally disable admission webhook by @Antvirf in #1145
remove redundant metrics for vgpu allocation by @FouoF in #1169
refactor: clean up code and improve maintainability by @Wangmin362 in #1195
refactor: Ranging over SplitSeq is more efficient by @Shouren in #1239
feat: NodeLockTimeout set from env by @miaobyte in #1244
refactor: move watchAndFeedback function to feedback.go by @miaobyte in #1248
feat: add informer-based pod cache to reduce API server load by @miaobyte in #1250
feat: Add option to disable device plugin at values.yaml. by @FouoF in #1274
refactor(util/nodelock): replace manual polling with k8s.io/client-go/util/retry by @mayooot in #1252
refactor: Remove annotation in Devices interfaces by @Shouren in #1343
feat: update the Ascend910 scheduling policy by @DSFans2014 in #1344
feat(nvidia): default gpucores=100 when memory is exclusive and cores… by @xrwang8 in #1354

🐛 Bug Fixes

fix: Before executing MIG partitioning, suppress NVML usage in o… by @Goend in #1095
Fix golint-CI by @archlitchi in #1127
fix: override node score failure for kunlun #1137 by @ouyangluwei163 in #1138
fix: Multi-node scoring nodes are inaccurate by @ouyangluwei163 in #1147
fix: An error occurred while creating an Iluvatar pod by @ouyangluwei163 in #1149
Fix e2e CI by @archlitchi in #1165
fix: Add option for overwrite schedulerName by @Shouren in #1163
fix: using go-safecast to fix incorrect conversion of numbers by @Shouren in #1183
fix: deal with security issues reported by Trivy in image by @Shouren in #1189
fix: wrong Pod's UID and empty Pod's name in log of webhook.go by @Shouren in #1092
fix: concurrent map writes error in scheduler.calcScore #1269 by @Shouren in #1270
fix: release dangling node lock by @peachest in #1271
fix: incorrect NUMA node information being retrieved (issue #1275) by @abstractmj in #1276
fix(security): resolve issues reported by Code scanning in Security by @Shouren in #1280
fix: fix golangci-lint error by @DSFans2014 in #1319
Fix: device allocation missing containers with no device request by @FouoF in #1299
fix: update int8Slice to uint8Slice for better type clarity and consistency by @yxxhero in #1357

📚 Documentation

documentation: add Known Issues for dynamic mig support by @Goend in #1122
docs: fix broken link by @lixd in #1125
clearly list supported devices doc references at README by @FouoF in #1155
docs: update ascend910b-support docs by @DSFans2014 in #1321

🔨 Other Changes

Prerelease-v2.6 by @archlitchi in #1108
add new reviewers Shouren and ouyangluwei163 by @wawa0210 in #1131
Support topology-awareness for Kunlunxin device by @archlitchi in #1121
Support Metax sGPU Qos Policy by @Kyrie336 in #1123
add global image for chart by @calvin0327 in #1133
fix: Skip admission webhook when Pod's scheduler is already assigned. by @ghostloda in #1041
Add node configs to docs by @wylswz in #1159
build(deps): upgrade golang to 1.24.4 by @Shouren in #1172
build(deps): Upgrade golang image in ci to 1.24.4 by @Shouren in #1176
build(deps): Upgrade controller-runtime to 0.21.0 by @Shouren in #1171
build(deps): Bump github.com/NVIDIA/nvidia-container-toolkit by @Shouren in #1170
Add unit tests for Fit Function for enflame, hygon, metax, mthreads, nvidia by @Wangmin362 in #1199
[Misc] update hami-core version by @chaunceyjiang in #1201
Improve the impl of DevicePluginConfigs.Nodeconfig overwriting NvidiaConfig by @FouoF in #1158
Add unit tests for cambricon's Fit Function by @Wangmin362 in #1198
Add unit tests for Ascend's Fit Function by @Wangmin362 in #1197
Fix unnecessary repeated calculation when generating pod resource requests by @litaixun in #1215
Fix the log message shown when updating node annotations by @litaixun in #1214
If the memory requested for the MIG device is the same as the template value, it will result in CardNotFoundCustom Filter Rule. by @zgqqiang in #1179
updated dri section to combine text for better readability by @mpetason in #1216
feat: Add nvidia gpu topology scheduler by @fyp711 in #1028
add issue translate robot by @wawa0210 in #1232
add issue translate robot by @wawa0210 in #1234
perf(util/nodelock): Use clientset Patch instead of Update. by @mayooot in #1192
Update hami-core and fix readme documents by @archlitchi in #1240
Update hami-core version to fix by @archlitchi in #1256
[Snyk] Security upgrade tensorflow/tensorflow from latest-gpu to 2.20.0rc0-gpu by @wawa0210 in #1243
feat: Add an action of 'Close stale issue and PRs' in github workflow by @Shouren in #1083
Welcome fyp711 to become a HAMi member by @wawa0210 in #1288
Add values readme by @clcc2019 in #1267
Support Metax sGPU device health check by @Kyrie336 in #1295
Optimize pkg/util.go and distribute logics to corresponding logics by @archlitchi in #1296
cleanup: Clear and correct ascend device name by @FouoF in #1315
bugfix: Nvidia card abnormal pod will still continue to schedule by @zgqqiang in #1336
Fix CI, add 910B4-1 template and fix vGPUmonitor metrics error by @archlitchi in #1345
add httpTargetPort to values.yaml by @flpanbin in #1356
Update kunlunxin documents by @archlitchi in #1366
update chart version and hami-core by @archlitchi in #1369

New Contributors

@Kevinz857 made their first contribution in #1117
@FouoF made their first contribution in #1141
@Antvirf made their first contribution in #1152
@wylswz made their first contribution i...

Contributors

Shouren, mpetason, and 27 other contributors

05 Aug 02:38

github-actions

v2.5.3

edb5ebd

v2.5.3

What's Changed

🔨 Other Changes

Release v2.5.1 - fix e2e workflow by @archlitchi in #1037
Release v2.5.2 by @archlitchi in #1080

Bug Fixes:

Full Changelog: v2.5.2...v2.5.3

Contributors

archlitchi

04 Aug 10:17

github-actions

v2.6.1

e482ca3

v2.6.1

BUG Fix:

Full Changelog: v2.6.0...v2.6.1

07 Jun 03:56

github-actions

v2.6.0

ce38bd4

v2.6.0

Key feature:

Optimize scheduler log
Support enflame gcu-share
Support metax GPU and metax sGPU
Helm chart adds a checksum annotation to restart HAMi components after ConfigMap modification
Support for using RuntimeClass with nvidia devices
Add support for profiling via net/http/pprof package
Add nvidia gpu topology score registry to node
Feat: vGPUmonitor support MigInfo metrics

Bug fix:

Fix stuck in driver 570+
Fix device memory not counted properly in comfyUI task
Fix cambricon devices not allocated properly
Fix wrong log and container request device count error
Fix vgpu-devices-allocated annotations are inconsistent
Fix removing node devices from node manager
Fix: Dynamic GPU partitioning lacks single-GPU-level granularity
Fix device memory count error on cuMallocAsync
Fix scheduler crash if a 'mig' task running accidentally on a 'hami-core' GPU
Fix multi-process device memory count

What's Changed

⬆️ Dependencies

Bump aquasecurity/trivy-action from 0.28.0 to 0.29.0 by @dependabot in #631
Bump nvidia/cuda from 12.4.1-base-ubuntu22.04 to 12.6.3-base-ubuntu22.04 in /docker by @dependabot in #676
Bump actions/upload-artifact from 4.4.3 to 4.5.0 by @dependabot in #717
Bump docker/build-push-action from 6.9.0 to 6.10.0 by @dependabot in #644
Bump docker/build-push-action from 6.10.0 to 6.11.0 by @dependabot in #792
Bump golang.org/x/net from 0.26.0 to 0.33.0 by @dependabot in #839
Bump docker/build-push-action from 6.11.0 to 6.13.0 by @dependabot in #837
Bump golang.org/x/net from 0.26.0 to 0.35.0 by @dependabot in #859
Bump aquasecurity/trivy-action from 0.29.0 to 0.30.0 by @dependabot in #941
Bump docker/login-action from 3.3.0 to 3.4.0 by @dependabot in #942
Bump docker/build-push-action from 6.13.0 to 6.15.0 by @dependabot in #899
build(deps): bump docker/build-push-action from 6.15.0 to 6.16.0 by @dependabot in #1024
build(deps): bump docker/build-push-action from 6.16.0 to 6.17.0 by @dependabot in #1052
build(deps): bump docker/build-push-action from 6.17.0 to 6.18.0 by @dependabot in #1091

🔨 Other Changes

Fix Kubernetes version string handling by stripping metadata by @Nimbus318 in #623
Update vGPUmonitor to add dynamic adjustment on core and memory limit by @archlitchi in #624
feat: support device plugin daemonset update strategy by @devenami in #628
add ut about schedule policy by @yt-huang in #638
Fix: Refactor the license based on the approaches used in OpenSearch and ElasticSearch. by @haitwang-cloud in #626
add ut for the scheduler by @shijinye in #645
docs(issue-tmpl): add FAQ link to issue templates by @Nimbus318 in #647
fix: filter device registry to node by @lengrongfu in #639
Add self-hosted runner by @archlitchi in #659
fix-example-yaml by @WQL782795 in #667
update docs by @yangshiqi in #668
add ut for ascend by @shijinye in #664
optimization map init in test by @lengrongfu in #678
Optimize monitor by @for800000 in #683
fix code lint failed by @lengrongfu in #685
fix(helm): Add NODE_NAME env var to the vgpu-monitor container from spec.nodeName by @Nimbus318 in #687
fix vGPUmonitor deviceidx is always 0 by @lengrongfu in #684
add ut for pkg/scheduler/event.go by @Penguin-zlh in #688
add ut for nodes by @shijinye in #695
add license for pkg/scheduler/event_test.go by @Penguin-zlh in #706
fix: exception happens when creating multiple ascend-gpu pods concurrently by @lijm87 in #575
add ut for device/nvidia by @shijinye in #657
add ut for pkg/monitor/nvidia/v0/spec.go by @yt-huang in #670
Enable Dynamic-mig feature for HAMi by @archlitchi in #708
Fix chart can not be deployed properly by @archlitchi in #711
Fix NodeLock issue by @archlitchi in #714
fix example yaml by @lixd in #709
add ut for device/cambricon by @shijinye in #712
Update dynamic mig documents and examples by @archlitchi in #718
random time may be zero by @shijinye in #697
fix grafana dashboard and clarify dashboard usage more clearly. by @jiangsanyin in #543
doc(README): add examples for GPU sharing and update-examples by @xiaoyao in #665
add ut for github.com/Project-HAMi/HAMi/pkg/scheduler/pod.go by @yt-huang in #673
Add design document to 'dynamic-mig' feature by @archlitchi in #725
fix(doc): fix a typo and resolve markdown warnings in the tasklist by @elrondwong in #724
add ut for pkg/util/nodelock/nodelock.go by @learner0810 in #719
test: add ut for pkg/version/version.go by @Penguin-zlh in #677
Update on mig mode by @archlitchi in #726
Update documents for config & config_cn by @archlitchi in #729
set PASS_DEVICE_SPECS ENV to device-plugin by @jingzhe6414 in #690
fix device-plugin-version by @learner0810 in #743
feat: Return the nodes that failed to be scheduled back to the scheduler by @chaunceyjiang in #746
fix(log): fix missing log output in nvidiadeviceplugin server by @elrondwong in #735
support configuration resources limits and requests by @flpanbin in #739
feat(test): add TestMarshalNodeDevices scenarios by @elrondwong in #747
print flags for device-plugin and scheduler by @flpanbin in #756
Fix typos, add more contributors and maintainers. by @yangshiqi in #765
Add a mind map(Chinese and English) to help understand this project by @oceanweave in #764
[Docs] update config pages by @windsonsea in #760
add ut for device-map by @KubeKyrie in #762
refactor(ci): use go.mod file for Go version in workflows by @yxxhero in #766
support set log level for device plugin by @flpanbin in #771
feat: Restart/Upgrade device-plugin will not affect services. by @chaunceyjiang in #767
add ut nvml devices by @KubeKyrie in #773
add ut for device-map by @KubeKyrie in #772
Optimize the time format layout by @learner0810 in #741
fix: nvidia-device-plugin no version info by @chaunceyjiang in #779
HAMi supports e2e by @Rei1010 in #775
Proposal: enable E2E test by @Rei1010 in #633
add ut for device/iluvatar by @shijinye in #795
add ut for device/hygon by @shijinye in #787
add ut for pkg/monitor/nvidia/v1 by @shijinye in #780
refactor(logging): enhance log messages for device resource counting by @haitwang-cloud in #778
Enrich pod health check by @Rei1010 in #801
docs: fix broken link by @lixd in #802
Optimize the E2E execution logic by @Rei1010 in #803
optimize MetricsBindAddress to MetricsBindPort by @phoenixwu0229 in #796
fix: handle the node nil issue & E2E test failure ...

Contributors

yangshiqi, joy717, and 45 other contributors

26 May 02:55

github-actions

v2.5.2

133ba17

v2.5.2

Full Changelog: v2.5.1...v2.5.2

Fix: device usage metrics (31992) can't be accessed

06 May 06:57

github-actions

v2.5.1

fcd3930

v2.5.1

What's Changed

🔨 Other Changes

Release v2.5 by @archlitchi in #1034
Update tag to v2.5.1 by @archlitchi in #1035
Fix: Update handling of version strings in Helm template and helpers.tpl by @HJJ256 in #845
Update libvgpu.so by @archlitchi in #876
fix: Set passDeviceSpecsEnabled to false by default in device plugin by @Nimbus318 in #872
fix: scheduler ignores KUBECONFIG env even if this environment variable is set by @Shouren in #681
fix: correct device filter initialization order by @Nimbus318 in #857
fix parseNvidiaNumaInfo index out of range by @flpanbin in #889
Fix cambricon pods not being recognized by HAMi scheduler by @archlitchi in #947
fix ubuntu base image in Dockerfile.withlib by @flpanbin in #944
fix: Add error handling for nvml.Init in NvidiaDevicePlugin by @yxxhero in #982
Fix device memory count error on cuMallocAsync by @archlitchi in #1029
Bump golang.org/x/net from 0.26.0 to 0.33.0 by @dependabot in #839

Full Changelog: v2.5.0...v2.5.1

Contributors

Shouren, yxxhero, and 5 other contributors

06 Feb 01:08

github-actions

v2.5.0

4c6059e

v2.5.0

Major features:

Support dynamic mig feature, please refer to this document
Reinstalling HAMi will NOT crash GPU tasks
Put all configurations into a ConfigMap; you can customize the HAMi installation by modifying its content: see details

Major bug fixes:

Fix an issue where hami-core gets stuck on tasks using 'cuMallocAsync'
Fix hami-core getting stuck on images with a high glibc version, like 'tf-serving:latest'

What's Changed

⬆️ Dependencies

Bump aquasecurity/trivy-action from 0.28.0 to 0.29.0 by @dependabot in #631
Bump nvidia/cuda from 12.4.1-base-ubuntu22.04 to 12.6.3-base-ubuntu22.04 in /docker by @dependabot in #676
Bump actions/upload-artifact from 4.4.3 to 4.5.0 by @dependabot in #717
Bump docker/build-push-action from 6.9.0 to 6.10.0 by @dependabot in #644
Bump docker/build-push-action from 6.10.0 to 6.11.0 by @dependabot in #792

🔨 Other Changes

Fix Kubernetes version string handling by stripping metadata by @Nimbus318 in #623
Update vGPUmonitor to add dynamic adjustment on core and memory limit by @archlitchi in #624
feat: support device plugin daemonset update strategy by @devenami in #628
add ut about schedule policy by @yt-huang in #638
Fix: Refactor the license based on the approaches used in OpenSearch and ElasticSearch. by @haitwang-cloud in #626
add ut for the scheduler by @shijinye in #645
docs(issue-tmpl): add FAQ link to issue templates by @Nimbus318 in #647
fix: filter device registry to node by @lengrongfu in #639
Add self-hosted runner by @archlitchi in #659
fix-example-yaml by @WQL782795 in #667
update docs by @yangshiqi in #668
add ut for ascend by @shijinye in #664
optimization map init in test by @lengrongfu in #678
Optimize monitor by @for800000 in #683
fix code lint failed by @lengrongfu in #685
fix(helm): Add NODE_NAME env var to the vgpu-monitor container from spec.nodeName by @Nimbus318 in #687
fix vGPUmonitor deviceidx is always 0 by @lengrongfu in #684
add ut for pkg/scheduler/event.go by @Penguin-zlh in #688
add ut for nodes by @shijinye in #695
add license for pkg/scheduler/event_test.go by @Penguin-zlh in #706
fix: exception happens when creating multiple ascend-gpu pods concurrently by @lijm87 in #575
add ut for device/nvidia by @shijinye in #657
add ut for pkg/monitor/nvidia/v0/spec.go by @yt-huang in #670
Enable Dynamic-mig feature for HAMi by @archlitchi in #708
Fix chart can not be deployed properly by @archlitchi in #711
Fix NodeLock issue by @archlitchi in #714
fix example yaml by @lixd in #709
add ut for device/cambricon by @shijinye in #712
Update dynamic mig documents and examples by @archlitchi in #718
random time may be zero by @shijinye in #697
fix grafana dashboard and clarify dashboard usage more clearly. by @jiangsanyin in #543
doc(README): add examples for GPU sharing and update-examples by @xiaoyao in #665
add ut for github.com/Project-HAMi/HAMi/pkg/scheduler/pod.go by @yt-huang in #673
Add design document to 'dynamic-mig' feature by @archlitchi in #725
fix(doc): fix a typo and resolve markdown warnings in the tasklist by @elrondwong in #724
add ut for pkg/util/nodelock/nodelock.go by @learner0810 in #719
test: add ut for pkg/version/version.go by @Penguin-zlh in #677
Update on mig mode by @archlitchi in #726
Update documents for config & config_cn by @archlitchi in #729
set PASS_DEVICE_SPECS ENV to device-plugin by @jingzhe6414 in #690
fix device-plugin-version by @learner0810 in #743
feat: Return the nodes that failed to be scheduled back to the scheduler by @chaunceyjiang in #746
fix(log): fix missing log output in nvidiadeviceplugin server by @elrondwong in #735
support configuration resources limits and requests by @flpanbin in #739
feat(test): add TestMarshalNodeDevices scenarios by @elrondwong in #747
print flags for device-plugin and scheduler by @flpanbin in #756
Fix typos, add more contributors and maintainers. by @yangshiqi in #765
Add a mind map(Chinese and English) to help understand this project by @oceanweave in #764
[Docs] update config pages by @windsonsea in #760
add ut for device-map by @KubeKyrie in #762
refactor(ci): use go.mod file for Go version in workflows by @yxxhero in #766
support set log level for device plugin by @flpanbin in #771
feat: Restart/Upgrade device-plugin will not affect services. by @chaunceyjiang in #767
add ut nvml devices by @KubeKyrie in #773
add ut for device-map by @KubeKyrie in #772
Optimize the time format layout by @learner0810 in #741
fix: nvidia-device-plugin no version info by @chaunceyjiang in #779
HAMi supports e2e by @Rei1010 in #775
Proposal: enable E2E test by @Rei1010 in #633
add ut for device/iluvatar by @shijinye in #795
add ut for device/hygon by @shijinye in #787
add ut for pkg/monitor/nvidia/v1 by @shijinye in #780
refactor(logging): enhance log messages for device resource counting by @haitwang-cloud in #778
Enrich pod health check by @Rei1010 in #801
docs: fix broken link by @lixd in #802
Optimize the E2E execution logic by @Rei1010 in #803
optimize MetricsBindAddress to MetricsBindPort by @phoenixwu0229 in #796
fix: handle the node nil issue & E2E test failure by @haitwang-cloud in #804
add ut for device/mthreads by @shijinye in #808
fix: Resolve formatting issue in ConfigMap causing display anomalies by @lixd in #814
[docs] Update ascend910b-support.md by @windsonsea in #816
Refine metrics logs by @haitwang-cloud in #817
Update mig-related logics and refine logs by @archlitchi in #833
Add 910B4 config to device-configmap for ascend by @lijm87 in #828
[docs] fix: glibc version requirement in README by @chinaran in #826
Update HAMi-core for v2.5.0 by @archlitchi in #834
Fix multi-process device memory count issue by @archlitchi in #835
bump version to v2.5.0 by @wawa0210 in #836
Fix CI by @archlitchi in #838
Fix CI release by @archlitchi in #840
Fix release ci by @archlitchi in #841
Fix Dockerfile to make CI pass by @archlitchi in #846
Fix E2E failure with pod status check by @Rei1010 in htt...

Contributors

yangshiqi, chinaran, and 27 other contributors

15 Nov 06:56

github-actions

v2.4.1

69370a7

v2.4.1

Major Features:

Support Metax scheduling optimization
Support Mthreads sGPU
Add a configMap hami-scheduler-device for all configurations of HAMi
Optimize installation process

Details

⬆️ Dependencies

Bump actions/download-artifact from 3 to 4 by @dependabot in #529
Bump docker/build-push-action from 6.8.0 to 6.9.0 by @dependabot in #528
Bump actions/upload-artifact from 3.1.3 to 4.4.0 by @dependabot in #530
Bump aquasecurity/trivy-action from 0.24.0 to 0.27.0 by @dependabot in #546
Bump actions/upload-artifact from 4.4.0 to 4.4.3 by @dependabot in #541
Bump ubuntu from 20.04 to 24.04 in /docker by @dependabot in #394
Bump aquasecurity/trivy-action from 0.27.0 to 0.28.0 by @dependabot in #559
Bump codecov/codecov-action from 4 to 5 by @dependabot in #613

🔨 Other Changes

fix build badge status by @wawa0210 in #526
update action-gh-release template file to more accurate matching by @wawa0210 in #527
Refactor helm "Admission Webhook" config. by @4gt-104 in #532
fix: error happens when allocating iluvatar device by @lijm87 in #522
Fix code scanning alert-Incorrect conversion between integer types by @ghostloda in #556
update hami-core version by @chaunceyjiang in #557
Mthreads support by @archlitchi in #560
Fix code scanning alert-Incorrect conversion between integer types by @ghostloda in #561
update docs by @ghostloda in #567
migrate hami slack to cncf hami group by @wawa0210 in #568
Fix pod assignment issue when pod already has a node assigned by @chaunceyjiang in #564
fix(scheduler): prevent array out-of-bounds when GPU containers are placed between non-GPU containers by @Nimbus318 in #572
improve pkg/k8sutil/pod.go ut coverage by @wawa0210 in #570
Metax GPU topo-awareness support by @archlitchi in #574
Add WebUI to readme and readme_cn.md by @archlitchi in #578
remove watermark of MetaX topo diagrams by @obnah in #581
update HAMi Talks and References by @wawa0210 in #582
fix: assign to wrong devices when 1 pod has 2+ containers requesting GPU by @joy717 in #593
docs: fix deployments path in README by @dublc in #608
Add unified configMap and update charts by @archlitchi in #614
Fix configMap device-config not properly installed by @archlitchi in #616
fix CI: race condition error by @archlitchi in #618
Pre release to v2.4.1 by @archlitchi in #619

New Contributors

@4gt-104 made their first contribution in #532
@lijm87 made their first contribution in #522
@ghostloda made their first contribution in #556
@Nimbus318 made their first contribution in #572
@obnah made their first contribution in #581
@dublc made their first contribution in #608

Full Changelog: v2.4.0...v2.4.1

Contributors

joy717, obnah, and 9 other contributors

29 Sep 06:39

github-actions

v2.4.0

b08f9c5

v2.4.0

What's Changed

✨ New Features

Support huawei ascend 910p for GA by @peizhaoyou in #389, https://github.com/Project-HAMi/ascend-device-plugin
Support for multiple versions of cudevshr for vGPUmonitor by @zoyopei in #458
Add filter device when register node by uuid or index by @lengrongfu in #495
Support Ascend custom configuration file settings for NPU virtualization by @wawa0210 in #510
Add event handlers registration by @wawa0210 in #417
Officially supports arm architecture

🐛 Bug Fixes

fixed go build image version by @chaunceyjiang in #405
fix: fix duplicate resource keys in configmap by @devenami in #422
fix data race when read pods info by @lengrongfu in #419
fix OpenSSF Best Practices by @wawa0210 in #478
fix CI and go-lint check error by @archlitchi in #486
fix trivy scan failed by @wawa0210 in #504 and #507
Fix HAMi image is too large and uses inappropriate base image by @wawa0210 in #508
fix device configmap by @zoyopei in #494
fix chart lint always running when charts has no change by @wawa0210 in #501

🔨 Other Changes

Proposal: support GPU Utilization Metrics by @chaunceyjiang in #258
disable PreferredAllocation by @lengrongfu in #415
optimization code by @lengrongfu in #401
add vgpu doc and update the readme. by @william-wang in #430
add hami vulnerability scan and report by @wawa0210 in #433
add CodeQL analysis by @wawa0210 in #432
update hami logo by @lengrongfu in #456
add node record pod info by @lengrongfu in #451
support code Coverage Analytics by @wawa0210 in #473
add dev branch ci & remove unused binary by @archlitchi in #487
refactoring ascend device code by @zoyopei in #492
Remake README.md by @archlitchi in #489
support pr commit can also build images by @wawa0210 in #499
Add HAMi release process by @wawa0210 in #520

New Contributors

@william-wang made their first contribution in #430
@devenami made their first contribution in #422

Full Changelog: v2.3.13...v2.4.0

Contributors

william-wang, wawa0210, and 6 other contributors
