-
Notifications
You must be signed in to change notification settings - Fork 5.9k
Update EIP-8037: Add access-list based deduplication and success/failure cases #10542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Update EIP-8037: Add access-list based deduplication and success/failure cases #10542
Conversation
✅ All reviewers have approved. |
…ure gas accounting This commit implements two critical fixes to EIP-8037: 1. Success/Failure Gas Accounting: Properly distinguishes between successful and failed contract deployments. Post-execution costs (ACCOUNT_CREATION_COST, HASH_GAS, CODE_DEPOSIT_GAS) are now only charged on success, aligning with Ethereum's principle that gas should reflect actual resource consumption. 2. Access-List CodeHash Deduplication: Replaces database lookup mechanism with an access-list based approach to solve consensus divergence between different sync modes (full-sync vs snap-sync). Approximately 27,869 orphaned bytecodes exist in full-sync nodes with no live account links, making DB lookups non-deterministic. Changes include: - Explicit success/failure gas formulas with detailed rationale - AccessTuple extension with checkCodeHash field - Comprehensive rationale explaining sync-mode divergence problem - Developer-facing specification with Go struct and JSON example - Updated gas cost examples for all scenarios - LaTeX formatting for mathematical notation
60386c3
to
453f2e5
Compare
EIPS/eip-8037.md
Outdated
- If duplicate: do NOT charge `CODE_DEPOSIT_GAS(L)`; link `codeHash` to `H` | ||
- If not duplicate: charge `CODE_DEPOSIT_GAS(L)` where `CODE_DEPOSIT_GAS(L) = 1,900 × L` and persist `R` under `H`, then link | ||
4. **Failure paths** (REVERT, OOG/invalid during initcode, OOG during code deposit, or `L > MAX_CODE_SIZE`): | ||
- Do NOT charge `ACCOUNT_CREATION_COST`, `HASH_GAS(L)`, or `CODE_DEPOSIT_GAS(L)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we missing the case where the initcode reverts but the account is still created? I remember this being an edge-case that Guillaume pointed out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gballet is this the case?
Aren't all clients creating the account after or before? They do as they want?
@misilva73 my idea here was that if we fail, we won't create the account. Note that without selfdestruct, it should not be possible to create the account if initcode reverts.
Nevertheless, happy to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it's a bit tricky, but yes there is some edge case: if there isn't enough gas to save the entire code, then its nonce will be set to 1
and the account will be saved to the state.
To be clear, it's only in case of an OOG during the storage part. If the init code reverts, or runs out of gas, no account will be created.
The account will be created before calling the init code (although it's persisted in the state afterwards) but there is a "checkpoint save" before that is done. If there is any other error than running out of gas while storing the code, then it will go back to that checkpoint.
- so
L > MAX_CODE_SIZE
andREVERT
/OOG during initcode should not chargeACCOUNT_CREATION_COST
but OOG during code deposit does. - note that if initcode OOGs, it doesn't matter if you charge the following gas costs: there's just not enough gas to pay for them anyway. So whether you are at -100 or -1000 makes no difference.
The question that remains to be answered is: how do we treat running out of gas for HASH_GAS(L)
? should the account be created? It depends when the hash happens. if it happens after the account creation gas is charged, then I'd say it's only fair that the contract gets created. If it happens before, then it should not. My preference goes to creating it because we paid for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I've made a mistake here: this is only true of pre-homestead: post-homestead, the thing is reverted. so we don't have this problem today. That makes sense, you don't want to prevent a further deployment of code at this address.
But the question still remains: if you oog during code hash computation, do you want to create the contract if things still happen? to be consistent with the current behavior, you should not create anything. It's unfair, because you paid for it, but it's simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So one thing to add, mostly as a reference for myself to remember: you don't need to charge the hashing costs, because this happens after the gas cost for contract storage have been paid. so the wording is correct.
EIPS/eip-8037.md
Outdated
- If `CodeExists(...) == true`, do not charge code-deposit storage gas (`code_deposit_gas=0`); simply link the new account's `codeHash` to the existing code object. | ||
|
||
In addition, contract creation is also charged a `storage_access_cost = GAS_WARM_ACCESS (if warm) | GAS_COLD_SLOAD (if cold)` and a `hash_cost = GAS_KECCAK256_WORD * code_data_words`, where `code_data_words = ceil(L / 32)`. This accounts for the added execution cost of accessing and verifying for duplicated code. | ||
This proposal fundamentally changes how contract deployment costs are computed to properly account for success and failure paths, ensuring that post-execution costs are only charged when deployment succeeds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be a bit more specific here. My suggestion:
This proposal fundamentally changes how contract deployment costs are computed to properly account for success and failure paths, ensuring that post-execution costs are only charged when deployment succeeds. | |
This proposal introduces a discount for contract deployment of duplicated bytecode. To account for the added work of checking bytecode duplication, it additionally charges a storage access cost and a hashing cost, while properly accounting for success and failure paths and ensuring that post-execution costs are only charged when deployment succeeds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that the discount should be the first thing to mention here, because the main point is that it is a more fine-grained accounting of the state gas costs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is only more fine-grained because of the discount. If it was not for the discount, we would not be charging the hashing cost nor the access cost
- No code is stored; no `codeHash` is linked to the account | ||
- The account remains unchanged or non-existent | ||
|
||
**Total gas formulas:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we account for the case where an account is created but code is not deployed, we would need another formula here. To keep things simple, I would suggest not having these formulas. Not a strong opinion, though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is suposed to be the spec, I think the more info we have, and the clearer we make it, the better. Core devs will appreciate nice and detailed specs IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The final spec won't look like this, actually. If we are doing the correct spec, we will need to change generic_create
, def create
and def create2
and the formula will be more complicated than what we currently have.
My understanding is that this is a simplified description of the changes and thus the shorter and simpler, the better. But if you feel strongly about keeping it, I will not be a blocker.
+ EXECUTION_GAS | ||
``` | ||
|
||
#### Access-List CodeHash Deduplication |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we should add 2930 to the required EIPs field 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. I thought that this is only done if the EIP required isn't merged.
Happy to add it.
Maybe @jochem-brouwer can bring some advice here.
|
||
Upon activation of this EIP, the following parameters of the gas model are updated: | ||
|
||
| **Parameter** | **Current value** | **New value** | **Increase** | **Operations affected** | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should keep the previous variable names. I took them from the execution specs. Is there a reason for using the new ones you propose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh I see. I thought they were better. But not strongly opinionated. So will revert.
EIPS/eip-8037.md
Outdated
4. Explicit opt-in: Transactions must explicitly indicate which addresses to check, preventing unexpected behavior and giving users control over gas optimization. | ||
5. Forward compatibility: Older clients that don't implement the feature simply never grant the discount, maintaining compatibility while newer clients can optimize. Same will happen with Wallet providers and TX builders. | ||
|
||
#### Access-List Extension Specification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section shouldn't be under "rationale". We should create a section named "Reference Implementation" . Check EIP-1
} | ||
``` | ||
|
||
**Consensus semantics:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's run over an example, as I don't think I am who W
is built.
Let's assume we have a block with two transactions:
-
$T_A$ deploys a contract with new bytecode -
$T_B$ follows$T_A$ and deploys the same bytecode as$T_A$
With our current design codeHash
of both contracts is in W
as they are "access-listed addresses where checkCodeHash == true
". So, before each transaction execution, the check would yield a duplicated bytecode for both transactions and no code deposit would be charged. Is this correct?
Also, if I understand this correctly:
- If
W
only contains thecodeHash
of access-listed addresses, are we saying that only contract deployments that use EIP-2930 get a discount? - Do clients need to keep an history of all code hashes previously included in an access-list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some clarifications. Hope is easier to follow now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks great now, much clearer 👏
When a contract creation transaction or opcode (`CREATE`/`CREATE2`) is executed, gas is charged differently based on whether the deployment succeeds or fails. Given runtime bytecode `R` (length `L`) returned by initcode and `H = keccak256(R)`: | ||
|
||
1. **On opcode entry:** Always charge `CREATE_FIXED_COST` (212,800 gas) | ||
2. **During initcode execution:** Charge `EXECUTION_GAS` equal to gas consumed by the initcode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EXECUTION_GAS
looks like a constant, I would rather have something like exec_costs(initcode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the full formula in the specs is:
# GAS
extend_memory = calculate_gas_extend_memory(
evm.memory, [(memory_start_position, memory_size)]
)
call_data_words = ceil32(Uint(memory_size)) // Uint(32)
init_code_gas = init_code_cost(Uint(memory_size))
charge_gas(
evm,
GAS_CREATE
+ GAS_KECCAK256_WORD * call_data_words
+ extend_memory.cost
+ init_code_gas,
)
So, it depends on the initcode in the call data and the memory expansion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied the change from @gballet
Also, agree the formula is more complex. But exec_initcode_costs
implicitly accounts for mem expansion and call data costs. As this isn't something we are changing.
EIPS/eip-8037.md
Outdated
- Charge `HASH_GAS(L)` where `HASH_GAS(L) = 6 × ceil(L / 32)` to compute `H` | ||
- Perform deduplication check (see Access-List CodeHash section below): | ||
- If duplicate: do NOT charge `CODE_DEPOSIT_GAS(L)`; link `codeHash` to `H` | ||
- If not duplicate: charge `CODE_DEPOSIT_GAS(L)` where `CODE_DEPOSIT_GAS(L) = 1,900 × L` and persist `R` under `H`, then link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use CODE_DEPOSIT_GAS
instead of 1900
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"then link" -> truncated sentence?
Changes made: 1. Updated intro to explicitly mention bytecode deduplication discount 2. Added EIP-2930 to requires field (extends AccessTuple structure) 3. Reverted parameter names to match execution specs (GAS_CREATE, etc.) 4. Clarified forward compatibility: post-fork all clients enforce identical behavior; pre-fork behavior only differs between pre/post-fork, not between different client implementations 5. Moved Access-List Extension Specification to Reference Implementation section (per EIP-1 structure) 6. CRITICAL: Fixed W set building semantics with comprehensive clarification: - W is built from EXISTING contracts at transaction START - Only addresses that exist in state with deployed code contribute - Added example showing how T_A deploys, then T_B references T_A - Clarified edge case: same-block duplicate deployments both pay full cost if neither references existing contract (extremely rare, not worth special handling) - Updated both Specification and Reference Implementation sections The most important fix addresses Maria's concern about W set confusion, making it crystal clear that newly deployed contracts within a transaction do NOT automatically add to W - only pre-existing state matters.
Optimisation on duplicated bytecode is something that we can do even now. That change should be contained in its own EIP and discussed separately from the gas cost increase. |
The commit 2d37dcc (as a parent of e5a48e5) contains errors. |
Asked the rest of EL clients. Once I get answers from the rest will act accordingly. I thought about including it because this is part of the "gas harmonization cost" idea. But, indeed, this is something we can already do now.. |
|
||
##### The Database Divergence Problem | ||
|
||
Empirical analysis reveals that approximately **27,869 bytecodes** exist in full-synced node databases with no live account pointing to them (as of the time of writing this EIP). These orphaned codes arise from: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's actually at the time of cancun, not at the time of this eip
|
||
The access-list approach solves the bytecode duplication check problem by: | ||
|
||
1. Deterministic behavior: Only code hashes explicitly included in the transaction's access list are considered for deduplication. This makes the result independent of local database state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the hashes included in the tx list? my understanding is that the addresses in the list trigger a loading, and therefore the code hashes associated to these addresses are available in the cache.
Unless you want to add a new field, which is fine I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but you would still need to load the account to make sure that the hash is the one you specify in the list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw one thing of note, is that people were talking about deprecating access lists in favor of BALs. Not sure what happened to that proposal, but that's something that is at odds with this approach if someone is still pushing it.
type AccessTuple struct { | ||
Address common.Address `json:"address"` | ||
StorageKeys []common.Hash `json:"storageKeys"` | ||
CheckCodeHash bool `json:"checkCodeHash,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does that really make sense? the addresses are going to be loaded anyway, so maybe it's enough to just build a hash map with every hash present?
Adding 2 important fixes/clarifications to EIP-8037:
Success/Failure Gas Accounting: Properly distinguishes between successful and failed contract deployments. Post-execution costs (ACCOUNT_CREATION_COST, HASH_GAS, CODE_DEPOSIT_GAS) are now only charged on success, aligning with Ethereum's principle that gas should reflect actual resource consumption.
Access-List CodeHash Deduplication: Replaces database lookup mechanism with an access-list based approach to solve consensus divergence between different sync modes (full-sync vs snap-sync). Approximately 27,869 orphaned bytecodes exist in full-sync nodes with no live account links, making DB lookups non-deterministic.
Changes include: