Releases: Project-HAMi/HAMi

v2.7.1

07 Nov 06:41

What's Changed

🐛 Bug Fixes

Major:

Update HAMi-core to fix vLLM-related issues #1381 and #1461 by @archlitchi in #1478
Fix: Calculation error for quotas by @luohua13 in #1400

Others

New Contributors

Full Changelog: v2.7.0...v2.7.1

v2.7.0

26 Sep 13:41

What's Changed

✨ Key Features

✨ Other Features

  • Optimize Fit-in-device logic to make it device-specific by @archlitchi in #1097
  • feat(scheduler): make node lock timeout configurable by @Kevinz857 in #1117
  • feature: mig mode-change #1116 by @ouyangluwei163 in #1124
  • feat: Add new labels in .github/release.yml by @Shouren in #1066
  • feat(scheduler-role): use a scoped-down role for scheduler by @Antvirf in #1152
  • feat(helm): optionally disable admission webhook by @Antvirf in #1145
  • remove redundant metrics for vgpu allocation by @FouoF in #1169
  • refactor: clean up code and improve maintainability by @Wangmin362 in #1195
  • refactor: Ranging over SplitSeq is more efficient by @Shouren in #1239
  • feat: NodeLockTimeout set from env by @miaobyte in #1244
  • refactor: move watchAndFeedback function to feedback.go by @miaobyte in #1248
  • feat: add informer-based pod cache to reduce API server load by @miaobyte in #1250
  • feat: Add option to disable device plugin in values.yaml by @FouoF in #1274
  • refactor(util/nodelock): replace manual polling with k8s.io/client-go/util/retry by @mayooot in #1252
  • refactor: Remove annotation in Devices interfaces by @Shouren in #1343
  • feat: update the Ascend910 scheduling policy by @DSFans2014 in #1344
  • feat(nvidia): default gpucores=100 when memory is exclusive and cores… by @xrwang8 in #1354

🐛 Bug Fixes

📚 Documentation

  • documentation: add Known Issues for dynamic mig support by @Goend in #1122
  • docs: fix broken link by @lixd in #1125
  • clearly list supported devices doc references at README by @FouoF in #1155
  • docs: update ascend910b-support docs by @DSFans2014 in #1321

🔨 Other Changes

New Contributors

v2.5.3

05 Aug 02:38

What's Changed

🔨 Other Changes

Bug Fixes:

Full Changelog: v2.5.2...v2.5.3

v2.6.1

04 Aug 10:17

v2.6.0

07 Jun 03:56

Key features:

  • Optimize scheduler log
  • Support enflame gcu-share
  • Support metax GPU and metax sGPU
  • Helm chart adds a checksum annotation so that HAMi components restart after a ConfigMap modification
  • Support for using RuntimeClass with nvidia devices
  • Add support for profiling via net/http/pprof package
  • Add nvidia gpu topology score registry to node
  • vGPUmonitor supports MigInfo metrics

Bug fixes:

  • Fix tasks getting stuck with driver 570+
  • Fix device memory not counted properly in comfyUI task
  • Fix cambricon devices not allocated properly
  • Fix wrong log and container request device count error
  • Fix inconsistent vgpu-devices-allocated annotations
  • Fix removing node devices from node manager
  • Fix: Dynamic GPU partitioning lacks single-GPU-level granularity
  • Fix device memory count error on cuMallocAsync
  • Fix scheduler crash when a 'mig' task accidentally runs on a 'hami-core' GPU
  • Fix multi-process device memory count

What's Changed

⬆️ Dependencies

  • Bump aquasecurity/trivy-action from 0.28.0 to 0.29.0 by @dependabot in #631
  • Bump nvidia/cuda from 12.4.1-base-ubuntu22.04 to 12.6.3-base-ubuntu22.04 in /docker by @dependabot in #676
  • Bump actions/upload-artifact from 4.4.3 to 4.5.0 by @dependabot in #717
  • Bump docker/build-push-action from 6.9.0 to 6.10.0 by @dependabot in #644
  • Bump docker/build-push-action from 6.10.0 to 6.11.0 by @dependabot in #792
  • Bump golang.org/x/net from 0.26.0 to 0.33.0 by @dependabot in #839
  • Bump docker/build-push-action from 6.11.0 to 6.13.0 by @dependabot in #837
  • Bump golang.org/x/net from 0.26.0 to 0.35.0 by @dependabot in #859
  • Bump aquasecurity/trivy-action from 0.29.0 to 0.30.0 by @dependabot in #941
  • Bump docker/login-action from 3.3.0 to 3.4.0 by @dependabot in #942
  • Bump docker/build-push-action from 6.13.0 to 6.15.0 by @dependabot in #899
  • build(deps): bump docker/build-push-action from 6.15.0 to 6.16.0 by @dependabot in #1024
  • build(deps): bump docker/build-push-action from 6.16.0 to 6.17.0 by @dependabot in #1052
  • build(deps): bump docker/build-push-action from 6.17.0 to 6.18.0 by @dependabot in #1091

🔨 Other Changes

v2.5.2

26 May 02:55

Full Changelog: v2.5.1...v2.5.2

Fix device usage metrics (31992) not being accessible

v2.5.1

06 May 06:57

What's Changed

🔨 Other Changes

Full Changelog: v2.5.0...v2.5.1

v2.5.0

06 Feb 01:08
4c6059e

Major features:

  1. Support the dynamic MIG feature; please refer to this document
  2. Reinstalling HAMi will NOT crash GPU tasks
  3. Put all configurations into a configMap; you can customize the HAMi installation by modifying its content: see details

Major bug fixes:

  1. Fix an issue where hami-core gets stuck on tasks using 'cuMallocAsync'
  2. Fix hami-core getting stuck on high glib images, like 'tf-serving:latest'

What's Changed

⬆️ Dependencies

  • Bump aquasecurity/trivy-action from 0.28.0 to 0.29.0 by @dependabot in #631
  • Bump nvidia/cuda from 12.4.1-base-ubuntu22.04 to 12.6.3-base-ubuntu22.04 in /docker by @dependabot in #676
  • Bump actions/upload-artifact from 4.4.3 to 4.5.0 by @dependabot in #717
  • Bump docker/build-push-action from 6.9.0 to 6.10.0 by @dependabot in #644
  • Bump docker/build-push-action from 6.10.0 to 6.11.0 by @dependabot in #792

🔨 Other Changes

v2.4.1

15 Nov 06:56
69370a7

Major Features:

  1. Support Metax scheduling optimization
  2. Support Mthreads sGPU
  3. Add a configMap hami-scheduler-device for all configurations of HAMi
  4. Optimize installation process

Details

⬆️ Dependencies

🔨 Other Changes

New Contributors

Full Changelog: v2.4.0...v2.4.1

v2.4.0

29 Sep 06:39

What's Changed

✨ New Features

🐛 Bug Fixes

🔨 Other Changes

New Contributors

Full Changelog: v2.3.13...v2.4.0