
Releases: huggingface/optimum-habana

v1.19.0: SynapseAI v1.22, GRPO, Snowflake Arctic, Diffusers v0.34

10 Sep 14:17

SynapseAI v1.22

Diffusers v0.34

GRPO trainer

FP8 with FSDP

Deepspeed regional compilation

Stable Diffusion

Snowflake Arctic

Model optimizations

Safe softmax
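"Safe" softmax refers to the numerically stable formulation that subtracts the row maximum before exponentiating, so large logits cannot overflow. A minimal, library-agnostic sketch of the idea in plain Python (this illustrates the technique only; it is not the optimum-habana implementation):

```python
import math

def safe_softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating.

    exp(x - max(x)) is always <= 1, so it never overflows, and the result
    is mathematically identical to exp(x) / sum(exp(x)).
    """
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Works even for logits where a naive exp(x) would overflow:
probs = safe_softmax([1000.0, 1001.0, 1002.0])
```

A naive `math.exp(1000.0)` raises `OverflowError`; the shifted version computes `exp(-2)`, `exp(-1)`, `exp(0)` instead.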

Bitsandbytes

Other

v1.18.1: Transformers v4.51, Qwen3, dynamic quantization

28 Jul 14:21

Transformers v4.51

This release has been tested on and validated for Transformers v4.51.

Qwen3

This release adds optimized support for Qwen3 models on Gaudi.

Dynamic Quantization

This release adds support for dynamic quantization.
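In dynamic quantization, scales are computed from the actual tensor values at runtime rather than from a pre-computed calibration range. Below is a minimal, library-agnostic sketch of per-tensor dynamic int8 quantization in plain Python; the actual feature in optimum-habana is configured through the library rather than written by hand like this:

```python
def dynamic_quantize_int8(values):
    """Per-tensor dynamic quantization: the scale is derived from the
    data itself at runtime (no pre-computed calibration range)."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to floats using the stored scale."""
    return [v * scale for v in q]

q, scale = dynamic_quantize_int8([0.1, -0.5, 2.0, 1.25])
restored = dequantize(q, scale)
```

Because the scale adapts to each tensor, the rounding error of any element is bounded by half the scale step.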

v1.18.0: SynapseAI v1.21, Accelerate, CogVideoX, Llava-onevision

13 Jun 10:51

SynapseAI v1.21

This release has been tested on and validated for SynapseAI v1.21.

Accelerate

Gaudi is now natively supported in Accelerate; check out the documentation for more information.

Diffusers

CogVideoX

GLM4V

Siglip and Llava-onevision

Model optimizations

Other


v1.17.0: Transformers v4.49

14 Apr 16:34

Transformers v4.49

This release has been tested and validated for Transformers v4.49 and SynapseAI v1.20.

Model optimizations

Tests and CI

Other

  • Disable HPU migration (future add-on to HF diffusers) for OH diffusers #1866 @dsocek
  • Allow explicit control over flash_attention_fast_softmax setting #1851 @astachowiczhabana

v1.16.0: Deepseek V3, SynapseAI v1.20, Llama 405b, AWQ

12 Mar 09:57

SynapseAI v1.20

This release has been tested on and validated for SynapseAI v1.20.

New models

Llama 405b

AWQ

Various model optimizations

Sentence Transformers

CI

  • Implement baselines as a fixture and with simple rebase support #1732 @uartie

Other

v1.15.0: SynapseAI v1.19.0, FLUX, Mllama, DeepSeek, Falcon 3

02 Jan 11:36

SynapseAI v1.19

FLUX

New models

Various model optimizations

Sentence Transformers

Textual Inversion XL

TIMM

Context Parallelism

CI improvements

Documentation

Other

v1.14.1: Patch release

29 Oct 17:13

Full Changelog: v1.14.0...v1.14.1

v1.14.0: Transformers v4.45, SynapseAI v1.18, Qwen2-MoE, text-to-video generation

22 Oct 16:11

Transformers v4.45

SynapseAI v1.18

Qwen2-MoE

  • Added Qwen2-MoE model, optimizing its performance on Gaudi #1316 @gyou2021

Text-to-video generation

Depth-to-image generation

Model optimizations

Intel Neural Compressor

  • Enable INC for Llava models and change softmax to torch.nn.functional.softmax, which is a module supported by INC #1325 @tthakkal
  • Load INC GPTQ checkpoint & rename params #1364 @HolyFalafel
  • Fix INC weight-loading compile error caused by the Transformers 4.45 upgrade #1421 @jiminha

Vera/LN-tuning

Other

v1.13.2: Patch release

06 Sep 20:17

Llava(-next) improvements

This patch release adds multi-card support for Llava(-next) and enables users to turn on/off recomputing for flash attention.

  • Llava: Added flash_attention_recompute arg to provide an option to enable/disable recompute #1278 @tthakkal
  • Add the DeepSpeed injection_policy for Mistral #1309 @yuanwu2017

Full Changelog: v1.13.1...v1.13.2

v1.13.1: Patch release

25 Aug 13:34

Fixed memory regressions

  • Remove _expand_inputs_for_generation for greedy search (#1266) @libinta
  • Fix memory regression for modeling llama (#1271) @libinta

FSDP

FSDP checkpoint saving is fixed.

Known limitations

  • ESMFold does not work on Gaudi1; this will be fixed in a future version

Full Changelog: v1.13.0...v1.13.1