
[inscriptive] apply_changes is non-atomic and not rollback-safe across all managers #13

Description

@GideonBature

Summary

Every storage manager in src/inscriptive/ commits its ephemeral delta the same way: iterate the delta maps and, for each entry, write to disk and then to the in-memory map, returning early on the first error. Because each write is an independent sled operation (no transaction()), a failure partway through apply_changes leaves the manager in a half-applied state that rollback_last cannot recover: rollback only restores the in-memory delta from the backup; it does not undo the disk writes that already succeeded.

Affected file(s)

All managers with an apply_changes:

  • src/inscriptive/coin_manager/coin_manager.rs (~apply_changes, ~lines 1470-1985)
  • src/inscriptive/state_manager/state_manager.rs (~lines 247-346)
  • src/inscriptive/flame_manager/flame_manager.rs
  • src/inscriptive/registry/registry.rs (~lines 1093-1598)
  • src/inscriptive/privileges_manager/privileges_manager.rs (~lines 836-1054)
  • src/inscriptive/graveyard/graveyard.rs (~lines 236-272)
  • src/inscriptive/params_manager/params_manager.rs (~lines 937-2189)

Location / pattern

Representative example, coin_manager.rs (account-balance commit):

for (account_key, ephemeral_account_balance) in self.delta.updated_account_balances.iter() {
    // 3.1 On-disk insertion.
    {
        let tree = self.on_disk_accounts.open_tree(account_key).map_err(...)?;   // may fail
        tree.insert(ACCOUNT_BALANCE_SPECIAL_DB_KEY, ephemeral_account_balance.to_le_bytes().to_vec())
            .map_err(...)?;                                                       // may fail
    }
    // 3.2 In-memory insertion.
    {
        let mut_permanent_account_body = self.in_memory_accounts.get_mut(account_key)
            .ok_or(...)?;                                                          // may fail
        mut_permanent_account_body.update_balance(*ephemeral_account_balance);
    }
}

The same shape repeats across the other delta maps (contracts, shadow spaces, deallocations, …) and across every other manager.
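
A minimal, std-only sketch of how this loop shape ends up half-applied. It does not use sled: `FlakyDisk`, `writes_before_failure`, and `demo_partial_apply` are hypothetical names invented for illustration, and the "disk" is just a map that starts returning errors after a set number of writes.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an on-disk store whose writes can fail.
/// This is NOT sled; it only models "write k of N returns an error".
struct FlakyDisk {
    data: BTreeMap<String, u64>,
    writes_before_failure: usize,
}

impl FlakyDisk {
    fn insert(&mut self, key: &str, value: u64) -> Result<(), String> {
        if self.writes_before_failure == 0 {
            return Err(format!("simulated I/O error writing {key}"));
        }
        self.writes_before_failure -= 1;
        self.data.insert(key.to_string(), value);
        Ok(())
    }
}

/// Same shape as the loop in the issue: disk write, then in-memory write,
/// returning early on the first error.
fn apply_changes(
    delta: &BTreeMap<String, u64>,
    disk: &mut FlakyDisk,
    memory: &mut BTreeMap<String, u64>,
) -> Result<(), String> {
    for (key, balance) in delta {
        disk.insert(key, *balance)?; // may fail
        let slot = memory
            .get_mut(key)
            .ok_or_else(|| format!("missing account {key}"))?; // may fail
        *slot = *balance;
    }
    Ok(())
}

/// Commits three entries against a disk that fails on its third write.
/// Returns the commit result and whatever ended up on the model disk.
fn demo_partial_apply() -> (Result<(), String>, BTreeMap<String, u64>) {
    let delta: BTreeMap<String, u64> = vec![("a", 10u64), ("b", 20), ("c", 30)]
        .into_iter()
        .map(|(k, v)| (k.to_string(), v))
        .collect();
    let mut memory: BTreeMap<String, u64> =
        delta.keys().map(|k| (k.clone(), 0)).collect();
    let mut disk = FlakyDisk { data: BTreeMap::new(), writes_before_failure: 2 };
    let result = apply_changes(&delta, &mut disk, &mut memory);
    (result, disk.data)
}
```

In this model the caller receives an `Err`, yet two of the three entries are already persisted, which is the "partially applied" state described above.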

The engine lifecycle (driven from src/executive/exec_ctx/) is:

  1. pre_execution() clones delta into backup_of_delta
  2. mutations write to delta only
  3. commit → apply_changes(), or abort → rollback_last(), which sets delta = backup_of_delta.clone()
  4. flush_delta()
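
The lifecycle above can be sketched as a small std-only model. The method names follow the list; the `Manager` struct, the `set_balance` helper, the `permanent` map, and the choice that `flush_delta` clears only the delta are assumptions made for illustration, not the actual engine code.

```rust
use std::collections::BTreeMap;

type Balances = BTreeMap<String, u64>;

/// Hypothetical manager holding an ephemeral delta, its backup,
/// and the permanent (committed) state.
#[derive(Default)]
struct Manager {
    delta: Balances,
    backup_of_delta: Balances,
    permanent: Balances,
}

impl Manager {
    /// Step 1: snapshot the delta before an execution starts.
    fn pre_execution(&mut self) {
        self.backup_of_delta = self.delta.clone();
    }

    /// Step 2: mutations touch the delta only (hypothetical helper).
    fn set_balance(&mut self, key: &str, value: u64) {
        self.delta.insert(key.to_string(), value);
    }

    /// Step 3, abort path: restore the delta from the backup.
    fn rollback_last(&mut self) {
        self.delta = self.backup_of_delta.clone();
    }

    /// Step 3, commit path: copy the delta into permanent state.
    /// Infallible in this model; the real one can fail mid-loop.
    fn apply_changes(&mut self) {
        for (key, value) in &self.delta {
            self.permanent.insert(key.clone(), *value);
        }
    }

    /// Step 4: drop the ephemeral delta (assumed behaviour).
    fn flush_delta(&mut self) {
        self.delta.clear();
    }
}
```

A possible usage: call `pre_execution()`, make changes with `set_balance`, then either `rollback_last()` to discard them or `apply_changes()` followed by `flush_delta()` to keep them.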

Root cause / analysis

There are two separate gaps:

Gap A: no transactional batch. apply_changes performs N independent sled writes. sled supports batched writes via tree.transaction(...) / Batch, but none of the managers use it. If write k of N fails (disk full, I/O error), writes 1..k-1 are already on disk and in memory, and the function returns Err. The caller cannot distinguish "nothing applied" from "partially applied."
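
One possible direction for closing Gap A, sketched with std only: validate and stage everything first, then commit through a single fallible step. `StagedBatch`, `Store`, `fail_commit`, and `apply_changes_staged` are hypothetical names; `Store::apply_batch` merely stands in for the role the issue assigns to sled's transaction() / Batch, and is all-or-nothing here only because the model fails before mutating anything.

```rust
use std::collections::BTreeMap;

/// Hypothetical staged batch: collects writes without touching the store.
#[derive(Default)]
struct StagedBatch {
    writes: Vec<(String, u64)>,
}

/// Hypothetical store model; `fail_commit` simulates an I/O error.
struct Store {
    data: BTreeMap<String, u64>,
    fail_commit: bool,
}

impl Store {
    /// All-or-nothing in this model: the error is raised before any write.
    fn apply_batch(&mut self, batch: StagedBatch) -> Result<(), String> {
        if self.fail_commit {
            return Err("simulated I/O error".to_string());
        }
        for (key, value) in batch.writes {
            self.data.insert(key, value);
        }
        Ok(())
    }
}

fn apply_changes_staged(
    delta: &BTreeMap<String, u64>,
    disk: &mut Store,
    memory: &mut BTreeMap<String, u64>,
) -> Result<(), String> {
    // Phase 1: check every fallible precondition and stage the writes.
    // Nothing has been mutated yet, so an early return is safe.
    let mut batch = StagedBatch::default();
    for (key, balance) in delta {
        if !memory.contains_key(key) {
            return Err(format!("missing account {key}"));
        }
        batch.writes.push((key.clone(), *balance));
    }
    // Phase 2: the single fallible commit point.
    disk.apply_batch(batch)?;
    // Phase 3: in-memory updates cannot fail; phase 1 checked every key.
    for (key, balance) in delta {
        memory.insert(key.clone(), *balance);
    }
    Ok(())
}
```

With this shape, an `Err` from the function means "nothing applied" in the model. Whether the same guarantee holds in a real manager depends on how the per-account trees shown in the representative example can be grouped into one atomic sled operation, which this sketch does not address.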

Gap B: rollback only covers the delta. rollback_last (coin_manager.rs ~line 1464, and equivalents) calls self.restore_delta(), which clones backup_of_delta back over delta. This reverts uncommitted ephemeral changes but does not touch on-disk state or the in-memory permanent maps. Rollback is therefore a no-op against a partially applied apply_changes: the half-applied writes stay. (In normal use rollback_last is called instead of apply_changes, never after a failed apply_changes, so this is latent rather than active, but the contract is misleading.)
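
A small std-only model of Gap B, assuming a hypothetical `Manager` whose permanent map stands in for both on-disk and in-memory permanent state, and whose `writes_before_failure` counter simulates a mid-commit error. It calls rollback_last after a failed apply_changes (the latent case noted above) to show what rollback does and does not restore.

```rust
use std::collections::BTreeMap;

type Balances = BTreeMap<String, u64>;

/// Hypothetical manager; not the real coin_manager.
struct Manager {
    delta: Balances,
    backup_of_delta: Balances,
    permanent: Balances,
    writes_before_failure: usize,
}

impl Manager {
    /// Early-return commit loop, failing after a fixed number of writes.
    fn apply_changes(&mut self) -> Result<(), String> {
        for (key, value) in &self.delta {
            if self.writes_before_failure == 0 {
                return Err(format!("simulated I/O error writing {key}"));
            }
            self.writes_before_failure -= 1;
            self.permanent.insert(key.clone(), *value);
        }
        Ok(())
    }

    /// Restores the delta only; permanent state is left as-is.
    fn rollback_last(&mut self) {
        self.delta = self.backup_of_delta.clone();
    }
}

/// Returns (apply failed?, delta after rollback, permanent after rollback).
fn demo_rollback_after_failed_apply() -> (bool, Balances, Balances) {
    let mut m = Manager {
        delta: Balances::new(),
        backup_of_delta: Balances::new(), // snapshot taken while delta was empty
        permanent: Balances::new(),
        writes_before_failure: 1,
    };
    m.permanent.insert("a".to_string(), 1);
    m.permanent.insert("b".to_string(), 2);
    m.delta.insert("a".to_string(), 10);
    m.delta.insert("b".to_string(), 20);

    let failed = m.apply_changes().is_err(); // "a" lands, "b" does not
    m.rollback_last();
    (failed, m.delta, m.permanent)
}
```

After the rollback the delta is back to its snapshot, but the permanent map still holds the new value for `a` next to the old value for `b`.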

A related smell: the comment in state_manager.rs ~line 271 — "critical: apply on-disk before in-memory for atomicity" — implies atomicity that does not actually exist; the on-disk-first ordering just changes the failure mode, it doesn't provide a transaction.

Impact

  • Corruption on partial failure. If any sled write fails partway through a commit, the database is left with an inconsistent mix of old and new values (e.g. an account balance updated but its shadow allocations not), and there is no recovery path.
  • In-memory / on-disk divergence. When the on-disk write succeeds but the in-memory get_mut fails (or vice versa across iterations), the two views disagree until restart.
  • For a Bitcoin rollup, half-applied balance/allocation state is the worst-case failure mode.
