Add benchmark automation tool #563

Open · liu-cong wants to merge 1 commit into main
Conversation

liu-cong (Contributor) commented Mar 24, 2025

This is a tool that automatically generates benchmark manifests for deploying vLLM model servers behind either an inference gateway or a Kubernetes Service, and runs benchmarks against them. It deploys the resources in a new namespace, collects the results, and tears the resources down.

This tool is experimental. I have tested it on GKE; in theory it should work on any cluster, though it may require minor modifications.
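A minimal sketch of the lifecycle the tool automates, assuming a cluster reachable with kubectl; the namespace, manifest paths, and job name below are hypothetical placeholders, not the actual files in this PR:

```bash
#!/bin/bash
# Sketch of the deploy -> benchmark -> collect -> tear down loop.
# NOTE: namespace, manifest paths, and job name are hypothetical examples.
set -euo pipefail

NAMESPACE="benchmark-$(date +%s)"

# 1. Deploy the vLLM model server and gateway/service into a fresh namespace.
kubectl create namespace "${NAMESPACE}"
kubectl apply -n "${NAMESPACE}" -f manifests/vllm-model-server.yaml
kubectl apply -n "${NAMESPACE}" -f manifests/inference-gateway.yaml
kubectl wait -n "${NAMESPACE}" --for=condition=Available deployment --all --timeout=15m

# 2. Run the benchmark job and wait for it to complete.
kubectl apply -n "${NAMESPACE}" -f manifests/benchmark-job.yaml
kubectl wait -n "${NAMESPACE}" --for=condition=Complete job/benchmark --timeout=60m

# 3. Collect results from the benchmark pod's logs.
mkdir -p results
kubectl logs -n "${NAMESPACE}" job/benchmark > "results/${NAMESPACE}.json"

# 4. Tear everything down by deleting the namespace.
kubectl delete namespace "${NAMESPACE}"
```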

EDIT:
This tool currently has many limitations and will require non-trivial improvements to become a stable tool that can be maintained long term. In the meantime, I have found it very useful and it has saved me a lot of time, so I'd like to share it in case others find it useful too. I am seeking feedback on whether we should merge it now, or keep it in a branch and start thinking about changes that benefit us in the long term.

Future improvements

  • This tool currently has its own Helm charts, and it takes effort to keep them up to date. It would be nice to have these charts standardized across this repo.
  • The tool makes heavy use of bash scripts, which are hard to test and maintain long term.
  • Improve code structure, readability, and testing.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 24, 2025
@k8s-ci-robot k8s-ci-robot requested review from ahg-g and kfswain March 24, 2025 17:01
k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: liu-cong
Once this PR has been reviewed and has the lgtm label, please assign arangogutierrez for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 24, 2025
netlify bot commented Mar 24, 2025

Deploy Preview for gateway-api-inference-extension ready!

🔨 Latest commit: 0d5e02c
🔍 Latest deploy log: https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67e19c0cc386b000085761ea
😎 Deploy Preview: https://deploy-preview-563--gateway-api-inference-extension.netlify.app

kfswain (Collaborator) commented Mar 24, 2025

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 24, 2025
liu-cong (Contributor, Author) commented:

cc @kaushikmitr @achandrasekar

@@ -0,0 +1,39 @@
#!/bin/bash
A reviewer (Contributor) commented on this diff:

This is a lot of bash. What's the principle underpinning how this will evolve? If you need to do analysis of the output, bash will quickly hit limits.

I kind of expected either a Go code / Go test-like structure, or a Python / Colab-style experimentation setup. I agree some of these things are good fits for bash, but not all of them are.

liu-cong (Contributor, Author) replied:

Bash is mostly kubectl commands to set up and tear down the benchmarks, output analysis is done with a Jupyter notebook.

But agree there is quite a lot of bash, and some of it can/should be done with python or go for better readability and testing.

I think a key principle to discuss is the benchmark config API, as well as the benchmark results API (currently implicitly defined by the LPG tool's JSON format). With those defined, people can come up with different implementations of automating the benchmark, and bash, Go, or a Jupyter notebook becomes more of an implementation detail.
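As an illustration of why the results API matters more than the implementation language, here is a hypothetical sketch of a results-summary step that depends only on the JSON output format; the field names are placeholders, since the actual LPG schema is not shown here:

```bash
#!/bin/bash
# Hypothetical sketch: summarize benchmark results regardless of how they were produced.
# The JSON field names below are placeholders, not the real LPG tool schema.
set -euo pipefail

for f in results/*.json; do
  # Pull a couple of headline metrics out of each run's result file.
  jq -r '[input_filename, .throughput_tokens_per_s, .p90_latency_ms] | @tsv' "$f"
done | column -t
```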

kfswain (Collaborator) commented Apr 9, 2025

What are we thinking with this one?

liu-cong (Contributor, Author) commented Apr 9, 2025

I think most folks are busy with other things and didn't have time to try this out. I know @kaushikmitr was planning to try this.
Some solid feedback will help us decide whether we want to maintain this.

I am happy to put this in a branch and defer the decision until we have more concrete feedback.
