I'm seeing a flaky `test_qwen3_moe_layer_lora` (failing line here). Rerunning the CI with an empty commit made it pass.
The test compares two equivalent computations using `np.allclose(rtol=1e-3, atol=1e-3)`. The output array spans roughly -2.8 to 10^4 in the same tensor. `np.allclose` checks `|a-b| <= atol + rtol*|b|` per element, so the tolerance budget shrinks to roughly `atol` (1e-3) for near-zero entries while growing to about 10 for entries near 10^4. The assertion can therefore trip on the small entries even though the relative error is within expected f32 variability.
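To make the tolerance arithmetic concrete, here is a minimal sketch. The three values are made up to mimic the reported range (they are not taken from the failing test), and the same absolute error of 5e-3 is applied to each element:

```python
import numpy as np

rtol, atol = 1e-3, 1e-3

# Hypothetical values covering the reported range: near zero, -2.8, and 1e4.
expected = np.array([1e-4, -2.8, 1e4])

# Per-element budget that np.allclose / np.isclose applies to |actual - expected|.
tolerance = atol + rtol * np.abs(expected)  # roughly [0.001, 0.0038, 10.001]

# Apply the same absolute error to every element.
actual = expected + 5e-3

# np.isclose exposes the per-element result that np.allclose reduces with all().
per_element_ok = np.isclose(actual, expected, rtol=rtol, atol=atol)
overall_ok = np.allclose(actual, expected, rtol=rtol, atol=atol)

print(tolerance)
print(per_element_ok)
print(overall_ok)
```

The same 5e-3 error passes on the 1e4 entry and fails on the two small ones, so the check fails overall.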
Test and full CI logs here
This could be solved by scaling `atol` relative to the array's maximum magnitude rather than using a fixed constant, which keeps the check meaningful across the full value range.
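A minimal sketch of that idea follows. The helper name `allclose_scaled` and the `atol_scale` default of 1e-6 are hypothetical choices for illustration, not something taken from the test suite; the right scale factor would need to be chosen from the expected f32 error for this computation.

```python
import numpy as np

def allclose_scaled(actual, expected, rtol=1e-3, atol_scale=1e-6):
    """Like np.allclose, but atol is proportional to max(|expected|).

    atol_scale is a hypothetical knob: atol = atol_scale * max(|expected|).
    """
    actual = np.asarray(actual)
    expected = np.asarray(expected)
    # Guard against empty inputs, where np.max would raise.
    max_mag = float(np.max(np.abs(expected))) if expected.size else 0.0
    atol = atol_scale * max_mag
    return bool(np.allclose(actual, expected, rtol=rtol, atol=atol))

# Hypothetical values covering the reported range.
expected = np.array([1e-4, -2.8, 1e4])

# Small absolute error everywhere: rejected by the fixed atol, accepted when scaled.
small_error = expected + 5e-3
fixed_result = bool(np.allclose(small_error, expected, rtol=1e-3, atol=1e-3))
scaled_result = allclose_scaled(small_error, expected)

# A large absolute error on the small entries is still rejected.
large_error = expected + 1.0
large_error_result = allclose_scaled(large_error, expected)
```

With a maximum magnitude of 1e4, the scaled `atol` becomes about 1e-2, so an error of 5e-3 on a near-zero entry is tolerated while an error of 1.0 is not.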
Claude suggests that vLLM handles this by comparing the top_k rank values with the `check_logprobs_close` function (link) instead of the raw values.