[Kernel] add custom op DispatchGmmCombineDecode #4139
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request introduces a new custom operator DispatchGmmCombineDecode for the Ascend platform. The changes include the operator definition, kernel implementation, build scripts, and PyTorch bindings. My review has identified a few critical issues. There is a significant issue in the shell script csrc/build_aclnn.sh regarding environment variable setup which could cause silent failures. Another critical bug is in csrc/pytorch_npu_helper.hpp where tensor strides are calculated incorrectly, which will fail for non-contiguous tensors. Additionally, there's a confusing duplicated field in csrc/custom_ops/kernels/dispatch_gmm_combine_decode/op_kernel/dispatch_gmm_combine_decode_tiling.h that should be corrected to improve maintainability.
In `csrc/build_aclnn.sh`:

```shell
# install custom ops
./build_out/custom_ops/run/CANN_ascend910_93_ubuntu_aarch64.run --install-path=/usr/local/Ascend/ascend-toolkit/latest/opp/
source /usr/local/Ascend/ascend-toolkit/latest/opp/vendors/customize/bin/set_env.bash
```
The source command on this line will only affect the environment of the script's execution shell. When this script is executed, it runs in a sub-shell, and any environment variables set within it are lost when the script finishes. If the intention is to modify the environment of the calling shell, this script should be sourced (e.g., source csrc/build_aclnn.sh) rather than executed. The #!/bin/bash shebang is misleading if the script is meant to be sourced. This can lead to silent failures in the environment setup.
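To illustrate the failure mode, here is a minimal, self-contained sketch (the file path and `DEMO_VAR` are hypothetical placeholders, not from the PR): an exported variable survives only when the script is sourced into the current shell, not when it is executed.

```shell
# Create a throwaway script that exports a variable, the way
# set_env.bash does inside build_aclnn.sh.
cat > /tmp/set_env_demo.sh <<'EOF'
export DEMO_VAR=from_script
EOF
chmod +x /tmp/set_env_demo.sh

# Executing runs in a sub-shell: the export is lost when it exits.
sh /tmp/set_env_demo.sh
echo "after execute: ${DEMO_VAR:-unset}"

# Sourcing runs in the current shell: the export persists.
. /tmp/set_env_demo.sh
echo "after source: ${DEMO_VAR:-unset}"
```

The same applies one level up: for the exported toolkit variables to reach the user's shell, `build_aclnn.sh` itself would have to be sourced, or the `source` line moved out of the script into the documented install steps.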
In `csrc/pytorch_npu_helper.hpp`:

```cpp
// Adapt the weight input of the dispatch_gmm_combine_decode operator
if (acl_data_type == ACL_INT8 && dimNum == 3) {
    format = ACL_FORMAT_FRACTAL_NZ;
}

auto acl_tensor =
    aclCreateTensor(at_tensor.sizes().data(), at_tensor.sizes().size(), acl_data_type, strides.data(),
                    0, format, at_tensor.sizes().data(), at_tensor.sizes().size(),
                    const_cast<void *>(at_tensor.storage().data()));

return acl_tensor;
}

inline aclScalar *ConvertType(const at::Scalar &at_scalar)
{
```
The calculation of tensor strides is incorrect as it assumes the tensor is contiguous. This will lead to incorrect memory access and data corruption for non-contiguous tensors. You should use the tensor's actual strides and storage offset provided by PyTorch via at_tensor.strides() and at_tensor.storage_offset().
Suggested fix:

```cpp
const auto dimNum = at_tensor.dim();
aclFormat format = ACL_FORMAT_ND;
// Adapt the weight input of the dispatch_gmm_combine_decode operator
if (acl_data_type == ACL_INT8 && dimNum == 3) {
    format = ACL_FORMAT_FRACTAL_NZ;
}
auto acl_tensor =
    aclCreateTensor(at_tensor.sizes().data(), dimNum, acl_data_type, at_tensor.strides().data(),
                    at_tensor.storage_offset(), format, at_tensor.sizes().data(), dimNum,
                    const_cast<void *>(at_tensor.storage().data()));
```

In `csrc/custom_ops/kernels/dispatch_gmm_combine_decode/op_kernel/dispatch_gmm_combine_decode_tiling.h`:

```cpp
uint32_t aicNum; // aivNum
uint32_t aivNum; // aivNum
```
The comment on aicNum appears to be a copy-paste error: in the struct DispatchGmmCombineDecodeInfo, both aicNum and aivNum carry the comment // aivNum. This can lead to confusion and bugs. Please correct the comments so each reflects its field's purpose; aicNum should presumably be the AI Core count and aivNum the AI Vector core count.
```diff
- uint32_t aicNum; // aivNum
- uint32_t aivNum; // aivNum
+ uint32_t aicNum; // aicNum
+ uint32_t aivNum; // aivNum
```
What this PR does / why we need it?
Does this PR introduce any user-facing change?
How was this patch tested?
vLLM version: v0.11.0
vLLM main: vllm-project/vllm@24d6314