Problem Description
This is a follow-up tracking issue for `add_rmsnorm_quant_kernel`. Two related PRs are listed below:
- This PR contains the latest changes to `add_rmsnorm_quant_kernel`, including updates related to kernel accuracy.
- SGLang: [AMD] Skip the flaky test for lora ci test. sgl-project/sglang#20175. SGLang CI started failing after `add_rmsnorm_quant_kernel` was merged into AITER. Please refer to this PR for detailed analysis and precision/accuracy tests at both the kernel level and the model level.

Our experiments show that, measured against the ground-truth `LlamaRMSNorm` from the transformers library, the AITER kernel `add_rmsnorm_quant_kernel` (`SGLANG_USE_AITER=1`) has a larger numerical difference than the vLLM kernels `fused_add_rms_norm` and `rms_norm` (`SGLANG_USE_AITER=0`).
Operating System
Ubuntu 22.04.5 LTS (Jammy Jellyfish)
CPU
AMD EPYC 9655 96-Core Processor
GPU
AMD Instinct MI325X
ROCm Version
ROCm 7.0.0
ROCm Component
No response
Steps to Reproduce
1. Docker: rocm/sgl-dev:v0.5.9-rocm700-mi30x-20260308
2. python test_rmsnorm_3way_consistency.py
3. Check rmsnorm_seed_sweep_4panel.png
test_rmsnorm_3way_consistency.py
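
The attached script is not reproduced in this issue text. As an illustration only, a minimal pure-Python sketch of a three-way comparison of this kind might look like the following: a float64 add + RMSNorm reference is compared against two bf16 variants over several seeds. The two variants are hypothetical stand-ins that differ only in where rounding happens; they are not the AITER or vLLM kernels, and the names `to_bf16`, `add_rmsnorm_bf16`, and `seed_sweep`, the hidden size, and `eps` are assumptions made for this example.

```python
import math
import random
import struct


def to_bf16(x: float) -> float:
    """Round a float to the nearest bfloat16 value (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add the rounding bias, then drop the low 16 bits of the float32 pattern.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def add_rmsnorm_reference(x, residual, weight, eps=1e-6):
    """Float64 reference: h = x + residual; y = h / sqrt(mean(h^2) + eps) * w."""
    h = [a + b for a, b in zip(x, residual)]
    inv_rms = 1.0 / math.sqrt(sum(v * v for v in h) / len(h) + eps)
    return [v * inv_rms * w for v, w in zip(h, weight)]


def add_rmsnorm_bf16(x, residual, weight, eps=1e-6, round_residual=True):
    """Hypothetical low-precision variant (NOT a real kernel).

    round_residual=True rounds the summed residual to bf16 before normalizing;
    round_residual=False keeps it in full precision and rounds only the output.
    """
    h = [a + b for a, b in zip(x, residual)]
    if round_residual:
        h = [to_bf16(v) for v in h]
    inv_rms = 1.0 / math.sqrt(sum(v * v for v in h) / len(h) + eps)
    return [to_bf16(v * inv_rms * w) for v, w in zip(h, weight)]


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two sequences."""
    return max(abs(p - q) for p, q in zip(a, b))


def seed_sweep(seeds, hidden=256):
    """Return (seed, diff_variant_a, diff_variant_b) against the reference."""
    rows = []
    for seed in seeds:
        rng = random.Random(seed)
        # Inputs are rounded to bf16 so all three paths see identical data.
        x = [to_bf16(rng.gauss(0.0, 1.0)) for _ in range(hidden)]
        res = [to_bf16(rng.gauss(0.0, 1.0)) for _ in range(hidden)]
        w = [to_bf16(1.0 + 0.1 * rng.gauss(0.0, 1.0)) for _ in range(hidden)]
        ref = add_rmsnorm_reference(x, res, w)
        out_a = add_rmsnorm_bf16(x, res, w, round_residual=True)
        out_b = add_rmsnorm_bf16(x, res, w, round_residual=False)
        rows.append((seed, max_abs_diff(out_a, ref), max_abs_diff(out_b, ref)))
    return rows


if __name__ == "__main__":
    for seed, diff_a, diff_b in seed_sweep(range(5)):
        print(f"seed={seed} variant_a={diff_a:.6f} variant_b={diff_b:.6f}")
```

Sweeping over seeds rather than checking a single input helps separate a systematic precision gap between two implementations from a difference that only shows up on one unlucky draw.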
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response