Qwen3-VL ViT module: enable SP and MRoPE fusion op #4165
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request adapts the Qwen3-VL large model for sequence parallelism (SP) on Ascend NPUs. It introduces new distributed utility functions for all-to-all communication and modifies the vision transformer components to incorporate SP logic, including tensor padding, sharding, and gathering. While the overall approach to implementing sequence parallelism is sound, I've identified critical bugs in the new all-to-all communication primitives. These bugs will cause incorrect tensor reshaping, leading to corrupted data and incorrect model outputs. These issues must be addressed for the SP implementation to function correctly.
    output = output.reshape(hc, shard_seqlen, bs,
                            hs).transpose(0, 2).contiguous()
    return output.reshape(bs, shard_seqlen, hc, hs)
The reshape operation on the output tensor is incorrect. The tensor has a shape of (seq_world_size, shard_hc, shard_seqlen, bs, hs), and the reshape attempts to merge the first two dimensions (seq_world_size and shard_hc). However, these dimensions are not contiguous in memory after the preceding transpose operations. A transpose(0, 1) is required to make them adjacent before reshaping. Failure to do so will result in a tensor with scrambled data.
Additionally, the reshape in the return statement is redundant as the tensor already has the correct shape after the preceding operations.
Suggested change:
Before:
    output = output.reshape(hc, shard_seqlen, bs,
                            hs).transpose(0, 2).contiguous()
    return output.reshape(bs, shard_seqlen, hc, hs)
After:
    output = output.transpose(0, 1).contiguous().reshape(
        hc, shard_seqlen, bs, hs).transpose(0, 2).contiguous()
    return output
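To make the intended layout concrete, here is a minimal single-process sketch (illustrative only, not the PR's code) of the post-attention regrouping that an Ulysses-style all_to_all_4d performs: each rank receives one sequence chunk per source rank, and the source-rank axis has to sit directly in front of the sharded-head axis before the two are merged back into the full head dimension. All names (P, bs, seqlen, hc, hs) are assumptions for the sketch.

```python
# Single-process emulation of the post-attention all-to-all reshaping
# (illustrative sketch; names P, bs, seqlen, hc, hs are assumptions).
import torch

P = 4                                   # sequence-parallel world size
bs, seqlen, hc, hs = 2, 16, 8, 64       # batch, sequence, heads, head size
shard_seqlen, shard_hc = seqlen // P, hc // P

# Reference activations with the full sequence and all heads.
full = torch.randn(bs, seqlen, hc, hs)

# Before the all-to-all, rank r holds the full sequence but only its head
# shard: (bs, seqlen, shard_hc, hs).
per_rank = [full[:, :, r * shard_hc:(r + 1) * shard_hc, :] for r in range(P)]

# Emulate what rank 0 receives: every source rank sends the sequence chunk
# destined for rank 0 (its first shard_seqlen tokens here).
recv = [x[:, :shard_seqlen, :, :] for x in per_rank]
stacked = torch.stack(recv, dim=0)      # (P, bs, shard_seqlen, shard_hc, hs)

# Global head index = source_rank * shard_hc + local_head, so the P axis must
# be placed directly in front of shard_hc before the two are merged.
out = stacked.permute(1, 2, 0, 3, 4).reshape(bs, shard_seqlen, hc, hs)

# Rank 0 now holds its sequence shard with all heads restored.
assert torch.equal(out, full[:, :shard_seqlen, :, :])
```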
    output = output.reshape(hc, shard_seqlen,
                            hs).transpose(0, 1).contiguous()
    return output
Similar to the issue in all_to_all_4d, the reshape operation on the output tensor here is incorrect. The output tensor has a shape of (seq_world_size, shard_hc, shard_seqlen, hs), and reshape(hc, ...) incorrectly attempts to merge the non-contiguous first two dimensions. This will lead to data corruption. You need to transpose the first two dimensions to make them contiguous before reshaping.
Suggested change:
Before:
    output = output.reshape(hc, shard_seqlen,
                            hs).transpose(0, 1).contiguous()
    return output
After:
    output = output.transpose(0, 1).contiguous().reshape(
        hc, shard_seqlen, hs).transpose(0, 1).contiguous()
    return output
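The same ordering argument applies to the batch-free 3D layout used here; a compact single-process sketch under the same assumptions as above:

```python
# 3D (seqlen, hc, hs) variant of the earlier sketch; still single-process and
# illustrative only, not the PR's code.
import torch

P, seqlen, hc, hs = 4, 16, 8, 64
shard_seqlen, shard_hc = seqlen // P, hc // P

full = torch.randn(seqlen, hc, hs)
per_rank = [full[:, r * shard_hc:(r + 1) * shard_hc, :] for r in range(P)]

# Chunks rank 0 would receive from each source rank after the all-to-all.
stacked = torch.stack([x[:shard_seqlen] for x in per_rank], dim=0)
# (P, shard_seqlen, shard_hc, hs): move P next to shard_hc, then merge them.
out = stacked.permute(1, 0, 2, 3).reshape(shard_seqlen, hc, hs)

assert torch.equal(out, full[:shard_seqlen])
```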
What this PR does / why we need it?
Enable sequence parallelism (SP) for the Qwen3-VL ViT module and the MRoPE NPU fusion op.
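For context, the SP path has to pad the ViT token sequence to a multiple of the SP world size, hand each rank its slice, and gather the slices back afterwards. The sketch below is a hypothetical, single-process illustration of that pad/shard/gather pattern; the helper names and shapes are assumptions, not the PR's actual utilities.

```python
# Hypothetical pad/shard/gather sketch for sequence parallelism over the ViT
# token dimension (single-process; names and shapes are illustrative).
import torch
import torch.nn.functional as F

def pad_to_sp_multiple(x: torch.Tensor, sp_size: int) -> tuple[torch.Tensor, int]:
    """Pad the sequence dim (dim 0) so it divides evenly across SP ranks."""
    pad = (-x.shape[0]) % sp_size
    if pad:
        x = F.pad(x, (0, 0, 0, pad))    # append `pad` zero rows along dim 0
    return x, pad

def shard_for_rank(x: torch.Tensor, sp_size: int, rank: int) -> torch.Tensor:
    """Take this rank's contiguous slice of the (padded) sequence."""
    shard = x.shape[0] // sp_size
    return x[rank * shard:(rank + 1) * shard]

# Single-process check: concatenating all shards (standing in for an
# all-gather) and stripping the padding recovers the original sequence.
seq, hidden, sp = 10, 32, 4
x = torch.randn(seq, hidden)
padded, pad = pad_to_sp_multiple(x, sp)
shards = [shard_for_rank(padded, sp, r) for r in range(sp)]
gathered = torch.cat(shards, dim=0)
assert torch.equal(gathered[:seq], x)
```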
Does this PR introduce any user-facing change?
No
How was this patch tested?
Tested Qwen3-VL 30B model accuracy on TextVQA with AISBench.