v1.17.0: Transformers v4.49
Transformers v4.49
This release has been tested and validated for Transformers v4.49 and SynapseAI v1.20.
Model optimizations
- Use token_idx_cpu int instead of token_idx tensor in slicing #1848 @jaygala223
- Keep logits in bf16 #1835 @jaygala223
- Optimize SD3 pipeline: pad prompt embeddings for softmax_hf8 compatibility and efficient utilization #1816 @deepak-gowda-narayana
- Add G3 performance workaround for Qwen2VL #1884 @nngokhale
- Fix MPT regression #1857 @atakaha
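The token_idx change above follows a general PyTorch pattern: indexing with a plain Python int instead of a 0-dim device tensor avoids carrying the index on the accelerator, which can otherwise add host/device traffic on every decoding step. A minimal illustrative sketch (this is not the actual optimum-habana code, just the indexing idiom):

```python
import torch

# (batch, seq_len, vocab) logits as produced during decoding
logits = torch.randn(2, 8, 32)

token_idx = torch.tensor(5)  # 0-dim tensor index (may live on the device)
token_idx_cpu = 5            # plain Python int kept on the host

# Both select the same step's logits; the int avoids tensor-based indexing.
out_tensor_idx = logits[:, token_idx, :]
out_int_idx = logits[:, token_idx_cpu, :]

assert torch.equal(out_tensor_idx, out_int_idx)
```

The results are identical; the benefit is purely in avoiding an extra device-resident scalar during generation.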
Tests and CI
- Slow test updates #1804 @ugolowic
- Fix race condition when downloading nltk tokenizer #1802 @ugolowic
- Skip the torch_fx tests #1797 @imangohari1
- Upstream tests #1834 @IlyasMoutawwakil
- test_examples: add missing clip-roberta baseline #1852 @uartie
- Separate slow tests by required number of cards #1803 @ugolowic
- Update PR doc build workflow #1904 @regisss
Other
- Disable HPU migration (future add-on to HF diffusers) for OH diffusers #1866 @dsocek
- Allow explicit control over flash_attention_fast_softmax setting #1851 @astachowiczhabana
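"Explicit control" in the last item means an explicitly passed value should override the environment default. A hypothetical sketch of that precedence rule (the function and environment-variable names here are illustrative assumptions, not the actual optimum-habana API):

```python
import os
from typing import Optional


def read_fast_softmax_setting(explicit: Optional[bool] = None) -> bool:
    """Resolve the fast-softmax flag: an explicit argument wins,
    otherwise fall back to an environment-variable default.

    Illustrative only; the real setting name and plumbing differ.
    """
    if explicit is not None:
        return explicit
    return os.environ.get("FLASH_ATTENTION_FAST_SOFTMAX", "1") == "1"
```

With this precedence, callers can force the flag on or off regardless of how the environment is configured.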