In vLLM's qwen2_5_vl.py I found the following:

```python
if attn_type != AttentionType.DECODER:
    raise NotImplementedError("Encoder self-attention and "
                              "encoder/decoder cross-attention "
                              "are not implemented for "
                              "FlashInferImpl")
```

so I thought FlashInfer could not support encoder-only attention. But in the FlashInfer docs I found:

> causal (bool) – Whether to apply causal mask to the attention matrix. This argument is ignored if mask is provided in [plan()](https://docs.flashinfer.ai/api/prefill.html#flashinfer.prefill.BatchPrefillWithRaggedKVCacheWrapper.plan).

In my view, if I set `causal` to `False`, the attention computation should be the same as FlashAttention's, which is the encoder-only attention used for the ViT, just like:
```python
flash_attn_varlen_func(q,
                       k,
                       v,
                       cu_seqlens_q=cu_seqlens,
                       cu_seqlens_k=cu_seqlens,
                       max_seqlen_q=max_seqlen,
                       max_seqlen_k=max_seqlen,
                       dropout_p=0,
                       causal=False)
```

Please, can anyone who knows answer this: can I use FlashInfer for the Qwen2.5-VL ViT attention?