
Conversation

@spencer-tb spencer-tb commented Nov 14, 2025

🗒️ Description

Follow-up to #1747
Adds a Python config file that lets us use multiple differing opcode counts for each scenario on the fly.

This PR is a WIP.

🔗 Related Issues or PRs

N/A.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

Cute Animal Picture


@spencer-tb spencer-tb added C-enhance Category: an improvement or new feature A-test-benchmark Area: Tests Benchmarks—Performance measurement (eg. `tests/benchmark/*`, `p/t/s/e/benchmark/*`) P-high labels Nov 15, 2025

@LouisTsai-Csie LouisTsai-Csie left a comment


Thanks a lot for this! I left some suggestions, but I’m happy to discuss further. I’ll share this with Kamil to confirm it aligns with their needs.


# Scenario configurations using test_name.*OPCODE.* patterns
# Keys are regex patterns checked in order; first match wins
SCENARIO_CONFIGS = {
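For illustration, here is a hypothetical sketch of what such a mapping and its first-match-wins lookup could look like. The pattern keys and count values below are invented placeholders following the regex conventions discussed in this review, not the PR's actual entries:

```python
import re

# Hypothetical entries; keys are regex patterns checked in insertion
# order, first match wins. Counts are placeholder values.
SCENARIO_CONFIGS = {
    r"test_ext_account_query_warm.*BALANCE.*": [100, 1_000],
    r"test_ext_account_query_warm.*": [500],
    r"test_selfbalance.*": [250, 2_500],
}


def resolve_opcode_counts(test_id, default=None):
    """Return the counts for the first pattern matching test_id."""
    for pattern, counts in SCENARIO_CONFIGS.items():
        if re.match(pattern, test_id):
            return counts
    return default
```

Since dicts preserve insertion order in Python 3.7+, placing the more specific `.*BALANCE.*` pattern before the generic `test_ext_account_query_warm.*` one makes "first match wins" meaningful.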

I suggest moving this mapping into a separate file so that we can maintain it automatically with a script.

I have a prototype; here's a quick summary of the script, which helps analyze differences between the old and new mappings:

  1. It uses Python's AST to analyze benchmark tests.
  2. It filters only tests that use the benchmark_test wrapper with a code_generator attribute, since we currently only support the fixed-opcode-count feature in the code generator.
  3. Based on current regex conventions, it identifies the patterns:
  • If there is no opcode parametrization, the regex follows test_<opcode/precompile>_<modifier>.* (e.g., test_selfbalance, test_codesize).
  • If the test is parametrized by opcode, the regex includes the opcode:
    test_name.*<opcode>.*, e.g., test_ext_account_query_warm.*<opcode>.* where <opcode> is BALANCE, EXTCODESIZE, etc.
  • Note: some parametrization names should be refactored to opcode. For example, in test_mod_arithmetic, op should be renamed to opcode so the script can detect it correctly.
  4. The script supports several modes:
  • check: detect new/missing entries without modifying files
  • dry-run: show the generated configuration without applying changes
  • update: update the config file with new entries
  • no-filter: include all benchmark tests (not needed and can be removed)

However, I suggest the following structure, with the script updated accordingly:

  • The opcode-count mapping under tests/benchmark/configs/fixed_opcode_counts.py
  • The parser and mapping-update script under tests/benchmark/configs/parser.py
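A minimal sketch of that proposed layout; the module contents below are placeholders, not the actual files:

```python
# tests/benchmark/configs/fixed_opcode_counts.py
# Holds only the regex -> opcode-count mapping, so the parser script
# can regenerate it without touching any other code.
SCENARIO_CONFIGS = {
    # "<regex pattern>": [<opcode counts>],
}

# tests/benchmark/configs/parser.py (separate file)
# The AST-based script with check / dry-run / update modes that
# regenerates SCENARIO_CONFIGS from the benchmark tests, e.g.:
# from tests.benchmark.configs.fixed_opcode_counts import SCENARIO_CONFIGS
```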

Regarding --fixed-opcode-count, I agree with the behavior this PR already had:

  • If no values are passed, use the defaults from the mapping.
  • If values are passed, they override the configuration for flexibility.
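The agreed behavior could be sketched like this; the function name is assumed, and only the override semantics come from the discussion above:

```python
# Sketch of the agreed --fixed-opcode-count behavior (names assumed).
def resolve_counts(cli_value, config_default):
    """CLI values override the mapping; otherwise use configured defaults."""
    if cli_value:
        # e.g. "100,1000" -> [100, 1000]
        return [int(x.strip()) for x in cli_value.split(",")]
    return list(config_default)
```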

What do you think? Happy to discuss a different approach.


Based on our discussion yesterday, we should add this script to CI to make sure no cases are missing.

}


def get_opcode_counts_for_scenario(

I propose separating the SCENARIO_CONFIGS variable and the get_opcode_counts_for_scenario function into different files.

Comment on lines 165 to +166

    if has_repricing:
        if fixed_opcode_counts:
            opcode_counts = [
                int(x.strip()) for x in fixed_opcode_counts.split(",")
    opcode_counts_to_use = None

This contains a few logic issues, but they will be resolved in my upcoming PR.

@spencer-tb spencer-tb force-pushed the enhance/benchmarking/fixed-opcode-count-config branch from 7ca99dc to 7c68d19 Compare December 4, 2025 16:34

codecov bot commented Dec 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.31%. Comparing base (85c6fff) to head (18ceb8e).
⚠️ Report is 3 commits behind head on forks/osaka.

Additional details and impacted files
@@             Coverage Diff              @@
##           forks/osaka    #1790   +/-   ##
============================================
  Coverage        87.31%   87.31%           
============================================
  Files              541      541           
  Lines            32832    32832           
  Branches          3015     3015           
============================================
  Hits             28668    28668           
  Misses            3557     3557           
  Partials           607      607           
Flag        Coverage Δ
unittests   87.31% <ø> (ø)

