feat(turbomind): integrate cublasGemmGroupedBatchedEx for Qwen3.5 MoE inference on Blackwell GPUs with memory copy optimizations #4490

Open
hd9568 wants to merge 1 commit into InternLM:main from hd9568:feature/blackwell-moe-opt

Commits