SageAttention support for vLLM? #71

Zachary-ai-engineer · 2024-12-16T03:54:49Z

vLLM provides chunked prefill and paged attention. Do you plan for SageAttention2 to support these features, which rely on block-table memory management? Support for them would greatly speed up commercial adoption of LLMs.
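To make the request concrete, here is a minimal sketch of block-table addressing as paged attention uses it, written in plain Python. The names (`locate`, `gather_keys`, `BLOCK_SIZE`) and the block size are illustrative assumptions, not vLLM or SageAttention APIs.

```python
# Hypothetical illustration only; none of these names come from vLLM or SageAttention.
BLOCK_SIZE = 4  # tokens per physical KV block (assumed value)


def locate(block_table, token_pos, block_size=BLOCK_SIZE):
    """Map a logical token position to (physical_block, offset_in_block)."""
    logical_block, offset = divmod(token_pos, block_size)
    return block_table[logical_block], offset


def gather_keys(kv_cache, block_table, seq_len, block_size=BLOCK_SIZE):
    """Collect one sequence's keys in logical order from scattered physical blocks."""
    keys = []
    for token_pos in range(seq_len):
        block, offset = locate(block_table, token_pos, block_size)
        keys.append(kv_cache[block][offset])
    return keys


# A pool of 8 physical blocks; each slot holds a placeholder "key".
kv_cache = [[f"k{b}_{o}" for o in range(BLOCK_SIZE)] for b in range(8)]

# Logical blocks 0, 1, 2 of this sequence live in physical blocks 5, 2, 7.
block_table = [5, 2, 7]

keys = gather_keys(kv_cache, block_table, seq_len=10)
```

An attention kernel that supports a paged KV cache has to read K and V through this kind of indirection instead of from one contiguous tensor.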
