[bugfix mxpf4] Infer mxfp4 quantmethod from layer #824

ZhiweiYan-96 · 2025-11-27T18:07:02Z

Purpose

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

zejunchen-zejun · 2025-11-28T04:03:48Z

vllm/model_executor/layers/quantization/quark/quark_moe.py

-            a2_scale=None,
-            block_shape=None,
-        )
+        return mxfp4_w4a16_moe_quant_config(layer.w13_weight_scale,


Please leave a comment here explaining why ocp_mx_moe_quant_config is not used.

Signed-off-by: ZhiweiYan-96 <[email protected]>

tjtanaa · 2025-11-28T04:47:25Z

@xuebwang-amd @haoyangli-amd @fxmarty-amd could you help take a look at this Quark bugfix?
@fxmarty-amd I saw that your PR vllm-project@41f1cf3#diff-c73528091f2176f6547e24074950962deec4311315fa5e99ef2ca7682680708c introduced ocp_mx_moe_quant_config. Do you intend ocp_mx_moe_quant_config to support only act-mxfp4 and weight-mxfp4, or should we extend it to also support act-bfloat16 and weight-mxfp4?
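
To make the question concrete, the choice under discussion can be sketched roughly as below. This is a minimal, self-contained illustration, not vLLM code: `QuantChoice` and `choose_moe_quant_builder` are hypothetical names, the dtype strings are assumptions, and the two builder names appear only as labels taken from this thread. Routing act-mxfp4 to ocp_mx_moe_quant_config reflects the reading proposed in the question above, which has not been confirmed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuantChoice:
    """Hypothetical record of which config builder would be selected."""
    weight_dtype: str
    activation: str
    builder_name: str  # label only; the real builder is never called here


def choose_moe_quant_builder(weight_dtype: str,
                             input_dtype: Optional[str]) -> QuantChoice:
    """Pick a builder label from the layer's weight and activation dtypes.

    Assumption: input_dtype is None when activations are left unquantized
    (the weight-only, w4a16 case this PR targets).
    """
    if weight_dtype != "mxfp4":
        raise ValueError(f"unsupported weight dtype: {weight_dtype!r}")
    if input_dtype is None:
        # Weight-only mxfp4: activations stay in 16-bit precision.
        return QuantChoice("mxfp4", "unquantized-16bit",
                           "mxfp4_w4a16_moe_quant_config")
    if input_dtype == "mxfp4":
        # Both weights and activations are mxfp4 (assumed mapping).
        return QuantChoice("mxfp4", "mxfp4", "ocp_mx_moe_quant_config")
    raise ValueError(f"unsupported activation dtype: {input_dtype!r}")


if __name__ == "__main__":
    print(choose_moe_quant_builder("mxfp4", None).builder_name)
    print(choose_moe_quant_builder("mxfp4", "mxfp4").builder_name)
```

If ocp_mx_moe_quant_config were extended to cover act-bfloat16, the first branch would collapse into the second and the separate w4a16 builder would no longer be needed for this path.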

[bugfx mxpf4] Infer mxfp4 quantmethod from layer

54cac38

ZhiweiYan-96 requested review from kliuae-amd, tjtanaavllm, wuhuikx and zejunchen-zejun as code owners November 27, 2025 18:07

ZhiweiYan-96 changed the title ~~[bugfx mxpf4] Infer mxfp4 quantmethod from layer~~ [bugfix mxpf4] Infer mxfp4 quantmethod from layer Nov 27, 2025

ZhiweiYan-96 added 2 commits November 28, 2025 03:56

Use aw4a16 config

dd265e8

lint

760ad2d

zejunchen-zejun reviewed Nov 28, 2025

Add comments

7da5e5f

Signed-off-by: ZhiweiYan-96 <[email protected]>

ZhiweiYan-96 requested a review from lihaoyang-amd November 28, 2025 04:41

zhuyuhua-v mentioned this pull request Nov 28, 2025

[Sync] dev/perf sync with upstream 20251124 #822

Merged

5 tasks
