
Commit cf8460b

update sdpa and fa2 documentation

1 parent fb2ee61 commit cf8460b

File tree

1 file changed: +2 -0 lines changed

docs/source/en/perf_infer_gpu_one.md

@@ -109,6 +109,7 @@ FlashAttention-2 is currently supported for the following architectures:
 * [SigLIP](https://huggingface.co/docs/transformers/model_doc/siglip)
 * [UniSpeech](https://huggingface.co/docs/transformers/v4.39.3/en/model_doc/unispeech#transformers.UniSpeechModel)
 * [unispeech_sat](https://huggingface.co/docs/transformers/v4.39.3/en/model_doc/unispeech-sat#transformers.UniSpeechSatModel)
+* [helium](https://huggingface.co/docs/transformers/main/en/model_doc/helium#transformers.HeliumModel)
 
 You can request to add FlashAttention-2 support for another model by opening a GitHub Issue or Pull Request.

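For context on what the newly listed entry enables: FlashAttention-2 is opted into through the `attn_implementation` argument of `from_pretrained`. Below is a minimal sketch; the checkpoint id `kyutai/helium-1-preview-2b` is an assumption for illustration and is not taken from this commit, and the `flash-attn` package must be installed separately.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id for illustration; any Helium checkpoint works the same way.
model_id = "kyutai/helium-1-preview-2b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # FlashAttention-2 requires fp16 or bf16
    attn_implementation="flash_attention_2",  # opt in to the FA2 kernels
    device_map="auto",                        # FA2 only runs on CUDA devices
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```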
@@ -324,6 +325,7 @@ For now, Transformers supports SDPA inference and training for the following architectures:
 * [XLM-RoBERTa](https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.XLMRobertaModel)
 * [XLM-RoBERTa-XL](https://huggingface.co/docs/transformers/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLModel)
 * [YOLOS](https://huggingface.co/docs/transformers/model_doc/yolos#transformers.YolosModel)
+* [helium](https://huggingface.co/docs/transformers/main/en/model_doc/helium#transformers.HeliumModel)
 
 <Tip>

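Similarly, SDPA is selected with `attn_implementation="sdpa"` (it is also the default for supported models in recent Transformers versions). A minimal sketch follows, again assuming the hypothetical `kyutai/helium-1-preview-2b` checkpoint; the `sdpa_kernel` context manager requires PyTorch 2.3 or newer.

```python
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kyutai/helium-1-preview-2b"  # assumed checkpoint id, as above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="sdpa",  # use torch.nn.functional.scaled_dot_product_attention
).to("cuda")

inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda")

# Optionally pin SDPA to its flash backend; without the context manager,
# SDPA dispatches among flash, memory-efficient, and math backends on its own.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    outputs = model.generate(**inputs, max_new_tokens=20)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```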
0 commit comments