
Enhance fixed opcode count workflow for benchmark #1834

@LouisTsai-Csie

Fixed-Opcode-Count Feature: Current Status and Open Challenges

The --fixed-opcode-count feature allows users to specify a comma-separated list of opcode counts to run. Currently, this only works for tests that use the benchmark wrapper with a code_generator attribute.

Example:

uv run fill -v test_arithmetic.py::test_arithmetic --fixed-opcode-count 1,2,3 --clean

This produces three benchmark variants that run the code generated for the attack_block 1000, 2000, and 3000 times, respectively (each count value is in units of 1000 iterations).

This feature helps evaluate performance across different opcode iteration counts and can support building regression/performance models for the gas repricing effort.

However, there are several open challenges that need to be addressed before the feature becomes stable.

Configuration for Per-Operation Opcode Counts

We need a way to specify the opcode count per test or per opcode pattern. This likely requires a structured configuration (Python mapping, JSON, or YAML). Spencer has already started work on this in PR #1790.

Expected DevEx

  1. A central mapping containing all tests that support fixed-opcode-count mode, along with their desired iteration values.
    Example:
{
    "test_arithmetic.*MUL.*": [1, 2, 3],
    "test_arithmetic.*SUB.*": [1, 5, 10],
    "test_arithmetic.*DIV.*": [5, 10],
    "test_arithmetic.*SDIV.*": [10, 20]
}
  2. When a user runs a specific test with --fixed-opcode-count without specifying values, the values should automatically come from the mapping.

  3. When values are specified on the command line, they should override the defaults in the mapping.

  4. A script should automatically detect: (1) missing entries in the mapping (based on available tests), and (2) redundant entries that no longer correspond to a test.
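
The lookup, override, and audit behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the implementation in PR #1790: the names `OPCODE_COUNT_MAPPING`, `resolve_counts`, and `audit_mapping` are hypothetical, and it assumes patterns are regular expressions matched against test IDs with the first match winning.

```python
import re

# Hypothetical central mapping: regex over test IDs -> default opcode counts.
OPCODE_COUNT_MAPPING = {
    "test_arithmetic.*MUL.*": [1, 2, 3],
    "test_arithmetic.*SUB.*": [1, 5, 10],
    "test_arithmetic.*DIV.*": [5, 10],
    "test_arithmetic.*SDIV.*": [10, 20],
}


def resolve_counts(test_id, cli_counts=None, mapping=OPCODE_COUNT_MAPPING):
    """Pick the opcode counts for one test.

    Values given on the command line take precedence; otherwise the first
    mapping pattern that matches the test ID supplies the defaults.
    Returns None when the test has no entry in the mapping.
    """
    if cli_counts:
        return list(cli_counts)
    for pattern, counts in mapping.items():
        if re.search(pattern, test_id):
            return list(counts)
    return None


def audit_mapping(test_ids, mapping=OPCODE_COUNT_MAPPING):
    """Return (tests with no matching pattern, patterns matching no test)."""
    missing = [t for t in test_ids if not any(re.search(p, t) for p in mapping)]
    redundant = [p for p in mapping if not any(re.search(p, t) for t in test_ids)]
    return missing, redundant
```

Note that with plain regex matching the `DIV` pattern also matches test IDs containing `SDIV`, so a real implementation would need an explicit precedence rule (for example, most specific pattern first) or stricter patterns.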

Block Gas Limit Requirements

In fixed-opcode-count mode, we currently set the block gas limit to a very high number (e.g., 1000M gas).
However, this is still insufficient for certain scenarios (e.g., precompiles or storage-heavy operations).

Two possible solutions

  1. Raise the upper bound dramatically (e.g., 100G gas) so that exhaustion becomes practically impossible.
  2. Select the block gas limit dynamically based on the benchmark parameters.

Nethermind tooling prefers the dynamic approach, but this increases the framework complexity for us. We would need to:

  • estimate gas consumption before running the benchmark, or
  • implement a rough pre-execution estimator.

Additionally, for execute-remote mode, adjusting the block gas limit is impossible since we cannot modify genesis configurations.
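
As a rough illustration of the dynamic approach, a pre-execution estimator might look like the sketch below. The function name, its parameters, and the 1.5 safety factor are assumptions made for illustration; the per-iteration gas cost would have to come from the benchmark definition, and the model ignores calldata, memory expansion, and cold-access costs.

```python
import math

INTRINSIC_TX_GAS = 21_000  # base cost of a plain transaction


def estimate_block_gas_limit(opcode_count, gas_per_iteration,
                             loop_overhead_gas=0, safety_factor=1.5):
    """Estimate a block gas limit for one fixed-opcode-count variant.

    opcode_count is the CLI value (in units of 1000 iterations).
    gas_per_iteration is the assumed cost of one attack-block iteration.
    loop_overhead_gas is the assumed per-iteration cost of the loop logic.
    """
    iterations = opcode_count * 1000
    estimated = INTRINSIC_TX_GAS + iterations * (gas_per_iteration + loop_overhead_gas)
    # Round up so the limit is never below the padded estimate.
    return math.ceil(estimated * safety_factor)
```

For example, `estimate_block_gas_limit(3, gas_per_iteration=5)` pads an estimate of 21,000 + 3,000 × 5 gas by 50%. This does not help in execute-remote mode, where the limit is fixed by the chain's genesis configuration.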

Opcode Count Verification

After running the benchmark, we need a reliable method to verify that the executed opcode count is close to the intended value (e.g., exact or within a 5% deviation).

However, two issues arise:

Determining the Target Opcode
The attack block, which runs for the given number of cycles, often includes extra instructions. Take test_callvalue as an example: POP(CALLVALUE) may execute many more POPs than expected, because POP is also part of the generator logic.

We need a mechanism to identify which opcode is the target, and ignore supporting opcodes such as POP.

Acceptable Deviation Threshold
Some tests naturally execute more of the target opcode than requested (e.g., test_jumpdest may emit more than the requested 1000 JUMPDESTs).
We need a consistent and well-documented deviation policy.
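
A check along these lines could be sketched as follows, assuming the per-opcode execution counts are available as a dictionary (for example, aggregated from a client trace) and that the target opcode is declared explicitly per test. The function name and the input shape are hypothetical; the 5% default mirrors the example threshold mentioned above.

```python
def verify_opcode_count(observed_counts, target_opcode, expected, tolerance=0.05):
    """Check the target opcode's executed count against the intended value.

    observed_counts maps opcode name -> number of executions.
    Only target_opcode is checked; supporting opcodes such as POP are ignored.
    Returns (ok, observed, relative_deviation).
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    observed = observed_counts.get(target_opcode, 0)
    deviation = abs(observed - expected) / expected
    return deviation <= tolerance, observed, deviation
```

With this shape, `verify_opcode_count({"CALLVALUE": 1000, "POP": 1003}, "CALLVALUE", 1000)` passes regardless of the POP count, while a test that emits 1100 JUMPDESTs against a target of 1000 fails at the 5% threshold.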

Additional Notes

CI Missing Coverage

We currently lack CI workflows for:

  • fill with fixed-opcode-count
  • execute with fixed-opcode-count

These should be added to prevent regressions.

Repricing Logic Must Be Refactored

The repricing flag and related logic have become overly complex and need to be simplified to avoid confusion when combined with the fixed-opcode-count feature.
