Pull requests: InternLM/lmdeploy
Pull requests list
feat(turbomind): integrate cublasGemmGroupedBatchedEx for Qwen3.5 MoE inference on Blackwell GPUs with memory copy optimizations
#4490
opened Apr 3, 2026 by
hd9568
Integrate deep-ep nccl backend
enhancement
New feature or request
#4477
opened Mar 27, 2026 by
irexyc
[refactor] [api_server] [1/N] Improve reasoning and tool-call parsers
improvement
#4468
opened Mar 26, 2026 by
lvhan028
feat: Turbomind linear gdn prefix caching
enhancement
New feature or request
#4465
opened Mar 25, 2026 by
lapy
feat: implement Turbomind vision encoder support for Qwen3VL/3.5 families
enhancement
New feature or request
#4460
opened Mar 24, 2026 by
lapy
Draft model update params
enhancement
New feature or request
#4452
opened Mar 24, 2026 by
CUHKSZzxy
[Feature] Support n parameter in /v1/chat/completions and /v1/completions
improvement
#4419
opened Mar 17, 2026 by
ziyangliu-666
Support MiniMax-M2 in TurboMind engine
enhancement
New feature or request
#4343
opened Feb 10, 2026 by
zh-nj