forked from vllm-project/vllm
Open
Description
🚀 The feature, motivation and pitch
- MLA_Decode: the API needs fixing (qo_ptr).
- a8w8_GEMM_ASM for bf16.
- ck_moe_2_stage: check whether we can use it to run the Qwen3 245 FP8 block-scaled fused MoE.
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you have already searched for relevant issues and asked the chatbot at the bottom right corner of the documentation page, which can answer many frequently asked questions.