
Conversation

@22dimensions (Collaborator) commented Nov 12, 2025

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions bot commented

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a clear commit message and fill out the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

@github-actions github-actions bot added documentation Improvements or additions to documentation module:tests module:ops labels Nov 12, 2025
@gemini-code-assist (Contributor) bot left a comment


Code Review

This pull request updates the codebase to support a newer version of the vLLM library. The changes primarily involve adapting to moved utility functions and API modifications. While the changes appear functional, there's a significant amount of code duplication introduced for handling version-dependent imports (e.g., cdiv, init_cached_hf_modules) across numerous files. I strongly recommend refactoring this logic into a centralized compatibility utility module to enhance maintainability. Additionally, there's duplicated logic in vllm_ascend/models/qwen2_5_vl.py for processing visual inputs that could be extracted into a helper method.

Comment on lines 35 to 40
from vllm_ascend.utils import vllm_version_is

if vllm_version_is("0.11.0"):
from vllm.utils import cdiv
else:
from vllm.utils.math_utils import cdiv

high

This conditional import logic for cdiv is repeated in many other files (attention/mla_v1.py, core/scheduler.py, distributed/mooncake/config_data.py, patch/platform/patch_mamba_config.py, spec_decode/mtp_proposer.py, torchair/torchair_attention.py, torchair/torchair_mla.py, torchair/torchair_sfa.py, worker/block_table.py, worker/model_runner_v1.py). A similar pattern exists for init_cached_hf_modules in worker/worker_v1.py and its test file. This widespread duplication makes the code harder to maintain and prone to errors if future updates are needed.

To improve this, I suggest creating a central compatibility utility module (e.g., vllm_ascend/utils/compat.py) to house all such version-dependent imports. Then, other files can import cdiv, init_cached_hf_modules, etc., directly from this new module, centralizing the version-checking logic.
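To make the suggestion concrete, here is a minimal sketch of such a module. The module path (vllm_ascend/utils/compat.py) comes from the suggestion above; the sketch uses try/except ImportError rather than the vllm_version_is check, and the pure-Python fallback exists only to keep the example self-contained:

```python
# Sketch of a hypothetical vllm_ascend/utils/compat.py:
# one place for all version-dependent imports.
try:
    # newer vLLM layout (post-0.11.0): cdiv moved to a math_utils submodule
    from vllm.utils.math_utils import cdiv
except ImportError:
    try:
        # 0.11.0 layout
        from vllm.utils import cdiv
    except ImportError:
        def cdiv(a: int, b: int) -> int:
            # Ceiling division; same semantics callers expect from vllm's cdiv.
            return -(a // -b)
```

Call sites would then use `from vllm_ascend.utils.compat import cdiv` and never repeat the version check themselves.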

Comment on lines 540 to 550
if vllm_version_is("0.11.0"):
image_embeds = self.visual(pixel_values, grid_thw=grid_thw)
else:
with set_ascend_forward_context(None, self.vllm_config):
image_embeds = self.visual(pixel_values, grid_thw=grid_thw)

high

This if/else block to handle different vLLM versions is duplicated in _process_video_input (lines 561-567). Extracting it into a private helper method would improve code clarity and maintainability. For example, you could create a _run_visual method:

def _run_visual(self, pixel_values, grid_thw):
    if vllm_version_is("0.11.0"):
        return self.visual(pixel_values, grid_thw=grid_thw)
    else:
        with set_ascend_forward_context(None, self.vllm_config):
            return self.visual(pixel_values, grid_thw=grid_thw)

Then you can call this helper in both _process_image_input and _process_video_input.

            image_embeds = self._run_visual(pixel_values, grid_thw=grid_thw)


from vllm.v1.kv_cache_interface import FullAttentionSpec, MambaSpec

from vllm_ascend.utils import vllm_version_is

high

This import of vllm_version_is is a duplicate of the one on line 7 and should be removed.
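Duplicate imports like this are easy to catch mechanically; linters such as ruff and flake8 report re-imported names as F811. As an illustrative, stdlib-only sketch, a small AST walk performs the same check (the sample source below mirrors the snippet under review):

```python
import ast
from collections import Counter

def duplicate_imports(source: str) -> list[str]:
    """Return names that are imported more than once in a module's source."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            names.extend(a.asname or a.name for a in node.names)
        elif isinstance(node, ast.Import):
            # `import a.b` binds the top-level name `a`
            names.extend(a.asname or a.name.split(".")[0] for a in node.names)
    return [n for n, count in Counter(names).items() if count > 1]

src = """
from vllm_ascend.utils import vllm_version_is
from vllm.v1.kv_cache_interface import FullAttentionSpec, MambaSpec
from vllm_ascend.utils import vllm_version_is
"""
print(duplicate_imports(src))  # → ['vllm_version_is']
```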

@github-actions bot commented

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions github-actions bot added module:tests and removed documentation Improvements or additions to documentation module:ops labels Nov 13, 2025
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Nov 13, 2025
@22dimensions 22dimensions added ready-for-test start test by label for PR ready read for review labels Nov 14, 2025

leo-pony and others added 18 commits November 14, 2025 17:51
…tructured outputs compatibility#26866

Signed-off-by: leo-pony <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: 22dimensions <[email protected]>
Signed-off-by: 22dimensions <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: leo-pony <[email protected]>

Labels

  • documentation (Improvements or additions to documentation)
  • module:core
  • module:tests
  • ready (read for review)
  • ready-for-test (start test by label for PR)

2 participants