Improve benchmark docs page coverage and formatting#15623
Improve benchmark docs page coverage and formatting#15623kuldeep27396 wants to merge 3 commits intoapache:mainfrom
Conversation
|
AI-generated summary of this PR:
I reviewed and posted this summary, but the text above was generated with AI. |
steveloughran
left a comment
There was a problem hiding this comment.
This is a really good rework.
Note that I do think the lists of benchmarks for each module would be best as tables, though that's very subjective.
|
|
||
| Below are the existing benchmarks shown with the actual commands on how to run them locally. | ||
| JMH writes human-readable output to `build/reports/jmh/human-readable-output.txt` and JSON output to `build/reports/jmh/results.json` by default. Override them with `-PjmhOutputPath=<path>` and `-PjmhJsonOutputPath=<path>` if needed. | ||
|
|
There was a problem hiding this comment.
probably worth mentioning https://jmh.morethan.io/ as a way to display json results: you can share the results.json with others for them to view.
| Below are the existing benchmarks shown with the actual commands on how to run them locally. | ||
| JMH writes human-readable output to `build/reports/jmh/human-readable-output.txt` and JSON output to `build/reports/jmh/results.json` by default. Override them with `-PjmhOutputPath=<path>` and `-PjmhJsonOutputPath=<path>` if needed. | ||
|
|
||
| The default versions in this repository are: |
There was a problem hiding this comment.
listing this creates one more maintenance point when versions are upgraded...or somewhere where they get out of date. I think it's best to not enumerate.
|
@kuldeep27396 Please explicitly mark the message generated by AI. |
|
Updated in efb5a2d: converted the benchmark inventories on the docs page to tables for readability, and I also edited the earlier summary comment to explicitly mark it as AI-generated. |
steveloughran
left a comment
There was a problem hiding this comment.
commented on (existing) text. TLDR; your macbook with an ide and claude using 3/4 RAM doesn't resemble a 32 core x86 server with 128 GB of RAM, so try to set up realistic test considitions, or at least close the IDE before running the tests overnight
| Benchmarks are located under `<project-name>/jmh`. It is generally favorable to only run the tests of interest rather than running all available benchmarks. | ||
| Also note that JMH benchmarks run within the same JVM as the system-under-test, so results might vary between runs. | ||
| Benchmarks are located under `<module>/src/jmh`. It is generally better to run only the benchmarks you are investigating instead of the full suite. | ||
| Also note that JMH benchmarks run in the same JVM as the system under test, so results may vary between runs. |
There was a problem hiding this comment.
If look for uses of @Fork(1) in the benchmarks you'll see this isn't true. The only @fork(0) right now is in the PR I'm working on, as I want fast iterations. (#15629).
But even with the forking, performance varies a lot on a local system because of what else you are doing on the same system, memory consumption etc. If you kick off a benchmark then have a conversation with an AI agent you will get bad numbers.
Better to say
Iceberg benchmarks execute in a new JVM for each test, for isolation and reproducibility.
Other work taking place on the same computer may create demand for CPU, memory or other system resource, and so produce inconsistent results.
Benchmarks results should be more reliable if executed on a system without other CPU, RAM or IO processes active at the same time. Even better: run them on a host dedicated exclusively to the benchmark.
Why this change
The benchmarks page had a few separate problems that were all tied together:
Because of that, a narrow formatting-only edit would still have left the page inaccurate and incomplete. The page needed to be cleaned up structurally and rewritten around the current benchmark layout so the commands and benchmark inventory stayed aligned with the repository.
What changed
How this was validated
core,data,spark/v4.1/spark,spark/v4.1/spark-extensions, andflink/v2.1/flinkThat command resolved successfully and confirmed the task paths referenced by the updated page, including:
:iceberg-core:jmh:iceberg-data:jmh:iceberg-flink:iceberg-flink-2.1:jmh:iceberg-spark:iceberg-spark-4.1_2.13:jmh:iceberg-spark:iceberg-spark-extensions-4.1_2.13:jmhCloses #15556