Pull requests: AMD-AGI/Primus-Turbo
Pull requests list
fix(typo): rename MXScalingRecipe, grouped gemm fp8 function name and remove scale args from quantize api
#298
opened Apr 17, 2026 by
RuibinCheung
Contributor
3 tasks done
refactor(moe): extensible EPBackend Protocol + EPBufferConfig & dispatcher test overhaul
#297
opened Apr 17, 2026 by
zhenhuang12
Contributor
9 of 12 tasks
Allow Triton GEMM selectors to run without origami.
#296
opened Apr 17, 2026 by
kyle-256
Contributor
feat: enable hybrid FP8 dtypes on Triton grouped GEMM backends
#288
opened Apr 15, 2026 by
sarthak-amd
Draft
perf: optimize hipBLASLt grouped GEMM with algo tuning, enable grouped_gemm autotune hipblaslt support
#284
opened Apr 14, 2026 by
kyle-256
Contributor
Add HYBRID FP8 format support for Triton backend in gemm and grouped_gemm
#278
opened Apr 10, 2026 by
kyle-256
Contributor
feat(mi300x:grouped_gemm): Triton dynamic tile downgrade and M-aware config
#272
opened Apr 6, 2026 by
ChengYao-amd
Contributor
2 tasks
feat(mi300x:ck-grouped-gemm): M-aware tile selection for BF16 and FP8
#271
opened Apr 6, 2026 by
ChengYao-amd
Contributor
5 tasks
feat(benchmark): add --backend flag to GEMM and grouped GEMM benchmarks
#270
opened Apr 6, 2026 by
ChengYao-amd
Contributor
4 tasks
feat(mi300x:gemm): shape-based backend dispatch + autotune persistent cache
#269
opened Apr 6, 2026 by
ChengYao-amd
Contributor
3 tasks
feat(mi300x:ck-gemm): FP8 GEMM M-aware tile selection
#268
opened Apr 6, 2026 by
ChengYao-amd
Contributor
3 tasks
feat(gemm): Blockwise FP8 GEMM Triton persistent kernel with hardware-aware dispatch + Triton 3.7.0
#267
opened Apr 6, 2026 by
ChengYao-amd
Contributor
3 of 4 tasks
feat(mi300x:attention): Triton FlashAttention optimization + FP8 decode fix
#266
opened Apr 6, 2026 by
ChengYao-amd
Contributor
4 tasks done
feat(benchmark): per-model/GPU batch sizes and vocab projection for GEMM bench
#265
opened Mar 31, 2026 by
Z-Y00
Improve MI355 tensorwise FP8 quant path for DeepSeek V2/V3 grouped GEMM
#260
opened Mar 27, 2026 by
kyle-256
Contributor
refactor: reorganize moe ops and kernels
#243
opened Mar 5, 2026 by
zhenhuang12
Contributor
[WIP] feat: enable pip install primus_turbo with wheel publishing
#239
opened Feb 25, 2026 by
xiaobochen-amd
Collaborator