Add vLLM Buildkite CI failure report tool #8014

Open
yangw-dev wants to merge 1 commit into main from elainewy/vllm-buildkite-fetch-failures
yangw-dev commented Apr 28, 2026

Script to fetch the latest vllm Buildkite CI build for a given branch, extract all failed steps with their failure reasons, and provide direct links to the relevant log lines.

You need to pass a Buildkite API token for this (see the README). Log in as a Buildkite user and generate an access token here:
https://buildkite.com/user/api-access-tokens
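
For reference, a minimal sketch of how a token like this is typically passed to the Buildkite REST API. This is not the script's actual code: the `latest_build_request` helper is hypothetical, the `vllm` organization and `ci` pipeline slugs are taken from the build URLs in the sample output below, and the token value is a placeholder. The request is only built, not sent.

```python
import argparse
import urllib.parse
import urllib.request

API_ROOT = "https://api.buildkite.com/v2"


def parse_args(argv=None):
    # Mirrors the --branch and --token flags used by the tool.
    parser = argparse.ArgumentParser(description="Fetch the latest build for a branch")
    parser.add_argument("--branch", required=True,
                        help="Branch name, e.g. atalman:release_212_tests")
    parser.add_argument("--token", required=True, help="Buildkite API token")
    return parser.parse_args(argv)


def latest_build_request(branch, token, org="vllm", pipeline="ci"):
    """Build (but do not send) a request for the newest build on `branch`.

    Hypothetical helper: org/pipeline defaults are assumptions based on
    the buildkite.com/vllm/ci URLs in the sample output.
    """
    query = urllib.parse.urlencode({"branch": branch, "per_page": 1})
    url = f"{API_ROOT}/organizations/{org}/pipelines/{pipeline}/builds?{query}"
    # Buildkite API tokens are sent as a bearer token.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


args = parse_args(["--branch", "atalman:release_212_tests",
                   "--token", "bk-example-token"])  # placeholder token
req = latest_build_request(args.branch, args.token)
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` should return a JSON list of builds; that step is left out here so the sketch runs without network access or a real token.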

This could potentially be used to auto-detect failures after a release run. The ideal flow:

  1. The release branch in vllm is updated (the release engineer pushes).
  2. The release engineer triggers the Buildkite build.
  3. A cron job periodically checks whether the Buildkite tests have completed and uses this tool to detect failures and collect logs.
  4. Generate issues, or call Claude to evaluate the failures.
  5. Ping the release engineer and the related PyTorch engineer.
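
Steps 3 to 5 could be sketched roughly as below. This is only an illustration of the flow, not part of this PR: `cron_tick`, `fetch_state`, `report_failures`, and `notify` are hypothetical names, and the set of terminal states is an assumption based on the build states exposed by the Buildkite API.

```python
# Build states after which the result will no longer change
# (assumption: based on the states documented for Buildkite builds).
TERMINAL_STATES = frozenset({"passed", "failed", "canceled", "skipped", "not_run"})


def cron_tick(fetch_state, report_failures, notify):
    """One cron invocation: check the build and act only if it has failed.

    fetch_state     -> returns the current build state as a string
    report_failures -> runs the failure report tool, returns its result
    notify          -> files issues / pings engineers with that result
    All three callables are hypothetical hooks supplied by the caller.
    """
    state = fetch_state()
    if state not in TERMINAL_STATES:
        return "pending"          # still running; try again on the next tick
    if state != "failed":
        return "done"             # finished without failures to report
    notify(report_failures())     # steps 3-5 of the flow above
    return "reported"


# Usage with stand-in callables instead of real API calls.
sent = []
outcome = cron_tick(
    fetch_state=lambda: "failed",
    report_failures=lambda: ["Quantization", "LoRA %N"],
    notify=sent.append,
)
print(outcome, sent)
```

A real job would also need to remember which builds it has already reported, so repeated ticks do not file duplicate issues.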

Example local result:

python3 tools/vllm/fetch_failures.py --branch "atalman:release_212_tests" --save-local-logs
Logs saved to ./build_63095/
======================================================================
Build #63095 | Branch: atalman:release_212_tests | State: failed
Message: [CI] Fix Dockerfile.cpu to resolve torch 2.12.0 from CPU test channel
Created: 2026-04-27T13:15:10.279Z
Failed steps: 13
======================================================================

  1. [Fusion E2E TP2 Quick (H100)]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-7f55-4614-89ff-7663a81087fe
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Fusion_E2E_TP2_Quick_H100.log
    - tests/compile/fusions_e2e/test_tp2_ar_rms.py::test_tp2_ar_rms_fp8_fusions[inductor_partition--quant_fp8,-rms_norm-4-TRITON_ATTN-nvidia/Llama-4-Scout-17B-16E-Instruct-FP8-<lambda>-model_kwargs1-<lambda>] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-7f55-4614-89ff-7663a81087fe/L1144

  7. [Entrypoints Integration (API Server openai - Part 3)]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-7f94-413f-8833-7f61c5b5b519
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Entrypoints_Integration_API_Server_openai_-_Part_3.log
    - entrypoints/openai/realtime/test_realtime_validation.py::test_multi_chunk_streaming[mistralai/Voxtral-Mini-4B-Realtime-2602] | AssertionError: assert ' First words...s sure to go.' == ' First words...s sure to go.'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-7f94-413f-8833-7f61c5b5b519/L25272

  8. [LoRA %N]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-801f-4374-90e2-c70b02636115
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/LoRA_N.log
    (no specific test failures extracted)

  9. [Batch Invariance (B200)]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-802a-456a-aae7-93c4931e983e
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Batch_Invariance_B200.log
    - v1/determinism/test_batch_invariance.py::test_v1_generation_is_deterministic_across_batch_sizes_with_needle[FLASH_ATTN] | Failed: Nondeterministic outputs detected: 3 failed out of 5 trials (max_batch_size=128).
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-802a-456a-aae7-93c4931e983e/L2867

  10. [Python-only Installation]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-802c-4d8e-9671-0af9c46d347b
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Python-only_Installation.log
    - ERROR: No matching distribution found for torch==2.12.0
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-802c-4d8e-9671-0af9c46d347b/L5696

  11. [V1 Core + KV + Metrics]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-802d-40fe-b406-c932ae335b0c
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/V1_Core__KV__Metrics.log
    - entrypoints/openai/correctness/test_lmeval.py::test_lm_eval_accuracy_v1_engine | AssertionError: Expected: 0.54 |  Measured: 0.4806671721000758
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-802d-40fe-b406-c932ae335b0c/L6168

  12. [Model Runner V2 Spec Decode]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-803b-4f80-9b8d-a71b7f81774f
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Model_Runner_V2_Spec_Decode.log
    - v1/spec_decode/test_max_len.py::test_eagle_max_len[FLASH_ATTN-10] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-803b-4f80-9b8d-a71b7f81774f/L2413
    - v1/spec_decode/test_max_len.py::test_eagle_max_len[TRITON_ATTN-10] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-803b-4f80-9b8d-a71b7f81774f/L2414
    - v1/spec_decode/test_max_len.py::test_eagle_max_len[TREE_ATTN-10] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-803b-4f80-9b8d-a71b7f81774f/L2415

  13. [Distributed Model Tests (2 GPUs)]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-805f-4ccb-a1d4-565db48920b8
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Distributed_Model_Tests_2_GPUs.log
    - models/multimodal/generation/test_common.py::test_single_image_models_heavy[chameleon-broadcast-test_case0] | ValueError: Pointer argument cannot be accessed from Triton (cpu tensor?)
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-805f-4ccb-a1d4-565db48920b8/L3309
    - models/multimodal/generation/test_common.py::test_single_image_models_heavy[chameleon-broadcast-test_case1] | ValueError: Pointer argument cannot be accessed from Triton (cpu tensor?)
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-805f-4ccb-a1d4-565db48920b8/L3310
    - models/multimodal/generation/test_common.py::test_single_image_models_heavy[llava-broadcast-test_case2] | ValueError: Pointer argument cannot be accessed from Triton (cpu tensor?)
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-805f-4ccb-a1d4-565db48920b8/L3311
    - models/multimodal/generation/test_common.py::test_single_image_models_heavy[llava-broadcast-test_case3] | ValueError: Pointer argument cannot be accessed from Triton (cpu tensor?)
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-805f-4ccb-a1d4-565db48920b8/L3312
    - models/multimodal/generation/test_common.py::test_single_image_models_heavy[llava_next-broadcast-test_case4] | ValueError: Pointer argument cannot be accessed from Triton (cpu tensor?)
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-805f-4ccb-a1d4-565db48920b8/L3313
    - models/multimodal/generation/test_common.py::test_single_image_models_heavy[llava_next-broadcast-test_case5] | ValueError: Pointer argument cannot be accessed from Triton (cpu tensor?)
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-805f-4ccb-a1d4-565db48920b8/L3314

  14. [Language Models Tests (Standard)]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-8073-40d5-81c1-dc5a93f94427
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Language_Models_Tests_Standard.log
    - models/language/pooling/test_reward.py::test_prm_models_with_golden_outputs[half-Qwen/Qwen2.5-Math-PRM-7B] | assert False
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-8073-40d5-81c1-dc5a93f94427/L743

  15. [Plugin Tests (2 GPUs)]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-8095-4c67-a489-0e84f2c0199b
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Plugin_Tests_2_GPUs.log
    - models/test_oot_registration.py::test_oot_registration_embedding | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-8095-4c67-a489-0e84f2c0199b/L1812

  16. [PyTorch Compilation Unit Tests]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/PyTorch_Compilation_Unit_Tests.log
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-0-backed-gpt2] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5443
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-0-backed-Qwen/Qwen2-7B-Instruct] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5447
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-0-backed_size_oblivious-gpt2] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5448
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-0-backed_size_oblivious-Qwen/Qwen2-7B-Instruct] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5452
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-1-backed-gpt2] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5453
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-1-backed-Qwen/Qwen2-7B-Instruct] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5457
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-1-backed-Qwen/Qwen3-4B-Instruct-2507] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5458
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-1-backed_size_oblivious-gpt2] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5462
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-True-1-backed_size_oblivious-Qwen/Qwen2-7B-Instruct] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5466
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-0-backed-gpt2] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5467
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-0-backed-Qwen/Qwen2-7B-Instruct] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5471
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-0-unbacked-gpt2] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5472
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-0-unbacked-Qwen/Qwen2-7B-Instruct] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5476
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-0-unbacked-Qwen/Qwen3-4B-Instruct-2507] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5477
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-0-backed_size_oblivious-gpt2] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5481
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-1-backed-gpt2] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5482
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-1-backed-Qwen/Qwen2-7B-Instruct] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5486
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-1-backed-Qwen/Qwen3-4B-Instruct-2507] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5487
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-1-unbacked-gpt2] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5491
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-1-backed_size_oblivious-gpt2] | AssertionError: assert 'no' == 'yes'
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5492
    - compile/test_dynamic_shapes_compilation.py::test_dynamic_shapes_compilation[False-False-1-backed_size_oblivious-Qwen/Qwen2-7B-Instruct] | RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a5-4d18-8569-dfa79929df32/L5496

  17. [PyTorch Compilation Unit Tests (H100)]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a6-4397-a7e0-84bad5534d30
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/PyTorch_Compilation_Unit_Tests_H100.log
    - compile/h100/test_startup.py::test_moe_startup[0] | KeyError: None
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80a6-4397-a7e0-84bad5534d30/L762

  18. [Quantization]
    Log: https://buildkite.com/vllm/ci/builds/63095#019dcf15-80ae-4e8a-a68a-b2f75c554b00
    Local: /Users/elainewy/Documents/vllm_buildkite/build_63095/Quantization.log
    - quantization/test_cpu_offload.py::test_cpu_offload_compressed_tensors | AssertionError: Results for model='nm-testing/Qwen1.5-MoE-A2.7B-Chat-quantized.w4a16' are not the same.
      https://buildkite.com/vllm/ci/builds/63095#019dcf15-80ae-4e8a-a68a-b2f75c554b00/L4310
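
The shape of this output suggests two simple conventions, sketched below under stated assumptions rather than taken from the script's source. First, the local log names look like the step name with punctuation dropped and each space turned into an underscore. Second, each per-test link looks like the job URL plus `/L<line number>`. The sketch also assumes the logs contain pytest's `FAILED <test id> - <reason>` short summary lines; `safe_log_name` and `extract_failures` are hypothetical names, and the job id in the usage example is a placeholder.

```python
import re

# pytest short test summary lines look like: "FAILED path::test - reason"
FAILED_LINE = re.compile(r"FAILED (\S+) - (.+)")


def safe_log_name(step_name):
    """Turn a step name into a file name like the ones in the sample output."""
    kept = re.sub(r"[^\w\s-]", "", step_name)  # drop punctuation such as ( ) + %
    return re.sub(r"\s", "_", kept) + ".log"   # each space becomes one underscore


def extract_failures(log_text, job_url):
    """Return (test_id, reason, link) for every FAILED summary line.

    Assumes the link's line number is the 1-based line index in the log.
    """
    results = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = FAILED_LINE.search(line)
        if match:
            test_id, reason = match.group(1), match.group(2).strip()
            results.append((test_id, reason, f"{job_url}/L{number}"))
    return results


# Usage on a tiny made-up log; the job id is a placeholder.
sample_log = (
    "collected 2 items\n"
    "FAILED compile/h100/test_startup.py::test_moe_startup[0] - KeyError: None\n"
    "1 failed in 3.2s\n"
)
job_url = "https://buildkite.com/vllm/ci/builds/63095#job-id-placeholder"
failures = extract_failures(sample_log, job_url)
print(safe_log_name("V1 Core + KV + Metrics"))
print(failures)
```

Steps such as `LoRA %N`, where no summary line is found, would simply yield an empty list, which matches the "(no specific test failures extracted)" case above.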

meta-cla Bot added the CLA Signed label Apr 28, 2026
yangw-dev requested a review from atalman April 28, 2026 01:54
@yangw-dev
Contributor Author

In the long term this might live better in the vllm repo. I will add it to test-infra for now; it can be moved elsewhere later.

yangw-dev force-pushed the elainewy/vllm-buildkite-fetch-failures branch from ffcf29d to 02173aa Compare April 28, 2026 02:04
yangw-dev requested a review from huydhn April 28, 2026 02:04
yangw-dev marked this pull request as ready for review April 28, 2026 02:06
required=True,
help="Branch name, e.g. atalman:release_212_tests",
)
parser.add_argument("--token", required=True, help="Buildkite API token")
huydhn commented Apr 28, 2026

It is worth double-checking ClickHouse for this; we have all the Buildkite CI signals there. Using a Buildkite API token is fine, but I guess not many of us have access to one, while ClickHouse is more widely available.

Contributor Author

This tool is mainly for PyTorch release investigation. It could potentially be used in https://github.com/vllm-project/vllm-dashboard to help us investigate release errors.

huydhn commented Apr 28, 2026

FYI, Kevin shared this WIP dashboard earlier: https://vllm-ci-dashboard.vercel.app. I can see the signals from @atalman's PR there. It doesn't have the logs, though.

yangw-dev commented Apr 28, 2026

> FYI, Kevin shared this WIP dashboard earlier: https://vllm-ci-dashboard.vercel.app. I can see the signals from @atalman's PR there. It doesn't have the logs, though.

Ah yes, I'm thinking of adding more features to the dashboard so it can serve as a release investigation tool.

Script to fetch the latest Buildkite CI build for a given branch,
extract all failed steps with their failure reasons, and provide
direct links to the relevant log lines.

Authored with Claude.
yangw-dev force-pushed the elainewy/vllm-buildkite-fetch-failures branch from 02173aa to 7f54bf6 Compare April 28, 2026 18:43
huydhn left a comment