[tx] Add separate JIT vs post-JIT timing measurement to memory benchmark#1021

Merged
pcmoritz merged 4 commits into NovaSky-AI:main from raulchen:benchmark-warm
Feb 4, 2026
Conversation

@raulchen
Contributor

@raulchen raulchen commented Feb 4, 2026

Summary

  • Separate JIT compilation time from post-JIT runtime measurement by sending multiple requests per test
  • First request triggers JIT compilation and is measured separately
  • Subsequent requests (default: 3) measure actual post-JIT performance and are averaged
  • Extract common timing logic into _run_timed_requests helper method
  • Use random tokens (with fixed seed) for reproducible benchmarks

Changes

  • Add jit_e2e_sec field to capture first request (JIT) time
  • Rename client_e2e_sec to post_jit_e2e_sec for clarity
  • Add --num-measurement-iters flag (default: 3) for post-JIT iterations
  • Update CSV output and summary tables to show both metrics
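The new flag and CSV fields could be wired up along these lines. This is a hedged sketch, not the PR's code: the `--num-measurement-iters` flag and the `jit_e2e_sec` / `post_jit_e2e_sec` field names are from the PR, while `test_name` and the sample values are illustrative assumptions.

```python
import argparse
import csv
import io

parser = argparse.ArgumentParser(description="Memory benchmark (sketch)")
parser.add_argument(
    "--num-measurement-iters",
    type=int,
    default=3,
    help="Number of post-JIT requests to average per test (default: 3)",
)
# Parse an explicit argv here so the sketch is self-contained.
args = parser.parse_args(["--num-measurement-iters", "5"])

# CSV output carrying both metrics; test_name is a hypothetical column.
buf = io.StringIO()
writer = csv.DictWriter(
    buf, fieldnames=["test_name", "jit_e2e_sec", "post_jit_e2e_sec"]
)
writer.writeheader()
writer.writerow(
    {"test_name": "demo", "jit_e2e_sec": 1.23, "post_jit_e2e_sec": 0.45}
)
```

Renaming `client_e2e_sec` to `post_jit_e2e_sec` makes the column self-describing: a reader of the CSV can tell at a glance that the value excludes compilation overhead.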

The benchmark now sends multiple requests per test: the first request
triggers JIT compilation and subsequent requests measure actual post-JIT
runtime. This provides clearer insight into compilation overhead vs
steady-state performance.

- Add jit_e2e_sec field to capture first request (JIT) time
- Rename client_e2e_sec to post_jit_e2e_sec for clarity
- Add --num-measurement-iters flag (default: 3) for post-JIT iterations
- Update CSV output and summary tables to show both metrics

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request successfully introduces separate timing measurements for JIT compilation and post-JIT runtime, providing a clearer understanding of performance characteristics. The changes involve updating the BenchmarkConfig and TestResult data structures, modifying the test execution logic to perform warmup and multiple measurement iterations, and updating output formats (CLI summary and CSV) to reflect the new metrics. The addition of the --num-measurement-iters CLI flag makes the benchmark more configurable.

@pcmoritz pcmoritz added the tx label Feb 4, 2026
Address PR feedback:
- Rename parameter to match CLI flag name for consistency
- Remove default value since it's always passed from caller
- Extract common timing logic into _run_timed_requests helper method
- Use random tokens for each request

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@raulchen
Contributor Author

raulchen commented Feb 4, 2026

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request successfully introduces separate timing measurements for JIT compilation and post-JIT runtime, significantly enhancing the memory benchmark's insights. The changes are well-integrated across the configuration, data structures, execution logic, and reporting. The addition of _run_timed_requests effectively encapsulates the new measurement methodology, and the updated output formats provide clear visibility into the new metrics.

raulchen and others added 2 commits February 4, 2026 13:26
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@pcmoritz pcmoritz merged commit ae4917d into NovaSky-AI:main Feb 4, 2026
4 of 5 checks passed
@raulchen raulchen deleted the benchmark-warm branch February 4, 2026 22:25
