forked from vllm-project/vllm
Open
Labels
enhancement: New feature or request
Description
🚀 The feature, motivation and pitch
Motivation: To bridge the performance gap between the V1 Engine and the V0 Engine running with multi-step = 10.
Find out whether the V1 AITER MHA kernel invokes the same CK Flash Attention kernel as the V0 Engine's CK Flash Attention backend.
The goal is to determine whether a difference in the input shapes between the two invocations is what makes the V1 backend so slow.
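One way to make the shape comparison concrete is to record the arguments of each kernel call and diff them. The following is a minimal sketch, not part of vLLM or AITER: `describe_args`, `diff_calls`, and `FakeTensor` are hypothetical helpers, and the argument names and values in the usage lines are placeholders rather than measurements. It assumes only that tensor-like arguments expose `.shape` and `.dtype`, as `torch.Tensor` does.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor so the sketch runs without torch."""
    shape: Tuple[int, ...]
    dtype: str


def describe_args(**kwargs: Any) -> Dict[str, Any]:
    """Summarize a kernel call: (shape, dtype) for tensor-like args, raw value otherwise."""
    summary: Dict[str, Any] = {}
    for name, value in kwargs.items():
        if hasattr(value, "shape"):
            summary[name] = (tuple(value.shape), str(getattr(value, "dtype", None)))
        else:
            summary[name] = value
    return summary


def diff_calls(v0: Dict[str, Any], v1: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {arg_name: (v0_value, v1_value)} for every argument that differs."""
    missing = "<missing>"
    names = sorted(set(v0) | set(v1))
    return {
        n: (v0.get(n, missing), v1.get(n, missing))
        for n in names
        if v0.get(n, missing) != v1.get(n, missing)
    }


# Placeholder values for illustration only; real values would be captured
# at the V0 and V1 call sites of the attention kernel.
v0_call = describe_args(q=FakeTensor((10, 32, 128), "bfloat16"), max_seqlen_q=1, causal=True)
v1_call = describe_args(q=FakeTensor((256, 32, 128), "bfloat16"), max_seqlen_q=256, causal=True)

for name, (a, b) in diff_calls(v0_call, v1_call).items():
    print(f"{name}: V0={a} V1={b}")
```

In practice, `describe_args` would be called with the real tensors just before each backend invokes the attention kernel, so that any differing shape, dtype, or scalar parameter shows up in the diff.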