
Conversation


@lhp-deep lhp-deep commented Nov 12, 2025

What this PR does / why we need it?

In reinforcement learning scenarios, inference currently applies a transpose operation to the weights. For a cleaner architecture, this PR moves the weight transpose logic into the wake_up function.
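
As an illustration only, a minimal sketch of what moving the transpose into wake_up could look like (the function shape, the weight-name suffixes, and the transpose dimensions are assumptions for this sketch, not the actual vllm-ascend implementation):

import torch

def wake_up(model: torch.nn.Module) -> None:
    """Transpose fused-MoE weights once at wake-up instead of on the inference path."""
    for name, param in model.named_parameters():
        # Only the fused-MoE weights are touched; the name suffixes are illustrative.
        if name.endswith(("w13_weight", "w2_weight")):
            parts = name.split(".")
            parent_module = model.get_submodule(".".join(parts[:-1]))
            transposed = torch.nn.Parameter(
                param.data.transpose(1, 2).contiguous(),
                requires_grad=False,
            )
            setattr(parent_module, parts[-1], transposed)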

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to move the weight transposition logic for MoE layers to the wake_up function, enhancing support for Reinforcement Learning (RL) scenarios. The changes span an example file, the MoE operator implementation, and the worker logic. While the overall direction seems correct, I've identified a critical bug in the new code added to vllm_ascend/worker/worker_v1.py. The logic for retrieving a parameter's parent module is flawed and will lead to a runtime error. This needs to be addressed.

Comment on lines 200 to 206
parent_module = model
parts = name.split('.')
param_name = parts[-1]
module_path = parts[-1]

for part in module_path:
    parent_module = getattr(parent_module, part)

critical

The logic to retrieve the parent module of the w2_weight parameter is incorrect. module_path is assigned the parameter name string (parts[-1]), and the subsequent loop iterates over the characters of this string. This will cause getattr to fail at runtime.

You should iterate over the module path components to correctly traverse the model hierarchy. A cleaner way to achieve this is by using model.get_submodule().
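
To make the failure concrete, here is a small, hypothetical repro (the toy model and parameter name below are made up for illustration):

import torch

model = torch.nn.Module()            # toy stand-in, not the vllm-ascend model
name = "experts.w2_weight"
parts = name.split('.')
module_path = parts[-1]              # "w2_weight" -- a string, not a list of path parts

for part in module_path:             # iterates characters: 'w', '2', '_', ...
    parent_module = getattr(model, part)   # raises AttributeError on the first character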

Suggested change
-parent_module = model
-parts = name.split('.')
-param_name = parts[-1]
-module_path = parts[-1]
-for part in module_path:
-    parent_module = getattr(parent_module, part)
+parts = name.split('.')
+param_name = parts[-1]
+parent_module = model.get_submodule(".".join(parts[:-1]))

Comment on lines 213 to 219
parent_module = model
parts = name.split('.')
param_name = parts[-1]
module_path = parts[-1]

for part in module_path:
    parent_module = getattr(parent_module, part)

critical

Similar to the w2_weight block, the logic to retrieve the parent module for w13_weight is incorrect. module_path is assigned the parameter name string, and the loop iterates over its characters, which will cause a runtime error. You should use model.get_submodule() to correctly get the parent module.

Suggested change
-parent_module = model
-parts = name.split('.')
-param_name = parts[-1]
-module_path = parts[-1]
-for part in module_path:
-    parent_module = getattr(parent_module, part)
+parts = name.split('.')
+param_name = parts[-1]
+parent_module = model.get_submodule(".".join(parts[:-1]))
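
Since the same parent-module lookup appears in both the w2_weight and w13_weight blocks, the two call sites could also share a small helper; a minimal sketch, with the helper name chosen here purely for illustration:

import torch

def _resolve_parent(model: torch.nn.Module, name: str) -> tuple[torch.nn.Module, str]:
    """Return (parent_module, attribute_name) for a dotted parameter name."""
    parts = name.split('.')
    return model.get_submodule(".".join(parts[:-1])), parts[-1]

# Illustrative usage at both call sites:
# parent_module, param_name = _resolve_parent(model, name)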

@lhp-deep lhp-deep changed the title move weight transpose to wakeup for RL scenarios [MOE]move weight transpose to wakeup for RL scenarios Nov 13, 2025
@wangxiyuan wangxiyuan added the ready (read for review) and ready-for-test (start test by label for PR) labels Nov 14, 2025

Labels

module:ops, module:tests, ready (read for review), ready-for-test (start test by label for PR)
