ZhiweiYan-96 commented Nov 27, 2025

Purpose

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

ZhiweiYan-96 changed the title from "[bugfx mxpf4] Infer mxfp4 quantmethod from layer" to "[bugfix mxpf4] Infer mxfp4 quantmethod from layer" Nov 27, 2025
a2_scale=None,
block_shape=None,
)
return mxfp4_w4a16_moe_quant_config(layer.w13_weight_scale,

Please leave a comment here explaining why ocp_mx_moe_quant_config was not chosen.

Signed-off-by: ZhiweiYan-96 <[email protected]>
tjtanaa commented Nov 28, 2025

@xuebwang-amd @haoyangli-amd @fxmarty-amd could you help take a look at this Quark bugfix?
@fxmarty-amd I saw that your PR vllm-project@41f1cf3#diff-c73528091f2176f6547e24074950962deec4311315fa5e99ef2ca7682680708c introduced ocp_mx_moe_quant_config. Do you intend ocp_mx_moe_quant_config to support only act-mxfp4 with weight-mxfp4, or should we extend it to also support act-bfloat16 with weight-mxfp4?
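
For readers following the question above, here is a minimal, self-contained sketch of the kind of dispatch being discussed: choosing a MoE quant config from the activation dtype when the weights are mxfp4. Every name in it (`MoEQuantSketch`, `pick_moe_quant_config`, the dtype strings) is a hypothetical stand-in for illustration and is not vLLM's actual API or the code in this PR.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class MoEQuantSketch:
    """Hypothetical stand-in for a fused-MoE quant config (not a vLLM class)."""
    kind: str                      # "w4a4" or "w4a16"
    w13_scale: Optional[Any] = None
    w2_scale: Optional[Any] = None


def pick_moe_quant_config(
    weight_dtype: str,
    act_dtype: str,
    w13_scale: Optional[Any] = None,
    w2_scale: Optional[Any] = None,
) -> MoEQuantSketch:
    """Pick a config from the layer's weight and activation dtypes.

    Assumption: only mxfp4 weights are handled; activations are either
    mxfp4 (W4A4) or a 16-bit float type (W4A16).
    """
    if weight_dtype != "mxfp4":
        raise ValueError(f"unsupported weight dtype: {weight_dtype!r}")
    if act_dtype == "mxfp4":
        kind = "w4a4"    # act-mxfp4, weight-mxfp4
    elif act_dtype in ("bfloat16", "float16"):
        kind = "w4a16"   # 16-bit activations, weight-mxfp4
    else:
        raise ValueError(f"unsupported activation dtype: {act_dtype!r}")
    return MoEQuantSketch(kind=kind, w13_scale=w13_scale, w2_scale=w2_scale)


if __name__ == "__main__":
    print(pick_moe_quant_config("mxfp4", "bfloat16").kind)
    print(pick_moe_quant_config("mxfp4", "mxfp4").kind)
```

Whether the two cases live in one config helper or two separate ones is exactly the design question raised in this comment; the sketch only shows that the choice can be derived from the layer's dtypes rather than set by the caller.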
