test: Adds tests showcasing existence of certain bugs on bp32 implementation - DO NOT MERGE #5442
Open
clockworkgr wants to merge 2 commits into `feat/jae/bp32tree`
notJoon added a commit to notJoon/gno-core that referenced this pull request on Apr 8, 2026:
Add bounds checks in deserialization to prevent OOM and OOB panics from corrupted node data. `readBytes` now rejects a length exceeding the reader's remaining bytes, and `readInnerNode`/`readLeafNode` reject `numKeys` beyond the fixed array capacity. Ref: gnolang#5442 (Bug no. 4: no bounds check on deserialized numKeys)
clockworkgr added a commit that referenced this pull request on Apr 8, 2026
3 tasks
B+32 Tree Code Review: Bug Summary

Review of the `tm2/pkg/bptree/` package, a versioned B+32 tree intended as a drop-in replacement for IAVL. Each bug is cross-referenced against standard tm2 blockchain usage patterns to assess real-world likelihood.
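Reproducer tests for all bugs: `bugs_test.go`

Bug Summary Table

| # | Bug | Location |
|---|-----|----------|
| 1 | Shallow clone shares mutable `[]byte` slices | `node.go:123` |
| 2 | `resolveValue` returns hash as value | `mutable_tree.go:112`, `immutable_tree.go:26` |
| 3 | `SaveValue` bypasses batch; Rollback leaks | `nodedb.go:151` |
| 4 | No bounds checking on deserialized `numKeys` | `node.go:209` |
| 5 | Exporter goroutine leak and permanent version reader lock | `export.go:26` |
| 6 | Pruning deletes nodes shared across versions | `prune.go:103-142` |

Bug Details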

Bug 1: Shallow Clone Shares Mutable `[]byte` Slices

Location: `node.go:123`, where `Clone()` does `c := *n` (struct copy)

Problem: The struct copy shares all `[]byte` slice pointers between the original and the clone. If any code path later mutates key bytes in place (e.g. via `append()` or direct byte writes), the mutation would silently corrupt the original node in a different version.

Current status: An exhaustive audit of `insert.go`, `remove.go`, and `split.go` confirms that all key mutations use slot reassignment or `copyKey()`. No in-place byte mutation exists today. The bug is latent.

Blockchain likelihood: None currently, but the code is one `append()` away from becoming a silent cross-version corruption bug.

Tests: `TestBug1_CloneSharesSlices`, `TestBug1_COWSafety`, `TestBug1_COWRegressionTest`

Bug 2: `resolveValue` Returns 32-Byte Hash as Value

Location: `mutable_tree.go:112`, `immutable_tree.go:26`

Problem: B+32 stores values out of line: leaf nodes contain SHA256 hashes, and the actual values live in separate DB entries. When `resolveValue()` is called without a `ValueResolver` set, it silently falls back to returning the 32-byte hash as if it were the value. No error is returned.

Blockchain likelihood: None on the normal path. The store layer (`store/bptree/store.go:70`) always sets `valueResolver` via `st.mtree.GetValueByHash()`. Three layers of setup ensure it is wired. The bug is only exposed if someone constructs a tree manually (tests, external tooling).
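
One possible way to make a missing resolver visible is to return an error instead of the hash. The sketch below assumes a resolver signature of `func([]byte) ([]byte, error)`; `resolveValueStrict`, `ErrNoResolver`, and `mapResolver` are hypothetical names, and the package's real types may differ.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// ValueResolver looks up a value by its hash. The signature is an assumption
// made for this sketch, not necessarily the package's actual type.
type ValueResolver func(hash []byte) ([]byte, error)

// ErrNoResolver is a hypothetical sentinel error.
var ErrNoResolver = errors.New("no ValueResolver set")

// resolveValueStrict returns an error instead of handing back the hash
// when no resolver has been wired.
func resolveValueStrict(r ValueResolver, hash []byte) ([]byte, error) {
	if r == nil {
		return nil, ErrNoResolver
	}
	return r(hash)
}

// mapResolver builds a resolver over an in-memory hash -> value map.
func mapResolver(store map[[32]byte][]byte) ValueResolver {
	return func(hash []byte) ([]byte, error) {
		if len(hash) != sha256.Size {
			return nil, fmt.Errorf("hash has %d bytes, want %d", len(hash), sha256.Size)
		}
		var k [32]byte
		copy(k[:], hash)
		v, ok := store[k]
		if !ok {
			return nil, fmt.Errorf("value not found for hash %x", hash)
		}
		return v, nil
	}
}

func main() {
	value := []byte("hello")
	h := sha256.Sum256(value)
	store := map[[32]byte][]byte{h: value}

	// Without a resolver the caller gets an error, not a 32-byte hash.
	_, err := resolveValueStrict(nil, h[:])
	fmt.Println(err) // no ValueResolver set

	got, err := resolveValueStrict(mapResolver(store), h[:])
	fmt.Println(string(got), err) // hello <nil>
}
```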

Tests: `TestBug2_ResolveValueReturnsHash`, `TestBug2_ImmutableResolveValue`, `TestBug2_GetReturnsHashNotValue`, `TestBug2_StoreLayerSetsResolver`

Bug 3: `SaveValue` Bypasses Batch; Rollback Leaks Values

Location: `nodedb.go:151`

Problem: `SaveValue()` writes directly to the DB via `ndb.db.Set()`, while all other writes (`SaveNode`, `SaveRoot`) use `ndb.batch.Set()`. Values are therefore persisted immediately and irrevocably, even if the block is later rolled back. `Rollback()` only restores the root pointer; it cannot undo the direct DB writes.

Blockchain likelihood: Certain. Every `Set()` call during block execution writes the value immediately. A `Rollback()` after a failed ABCI `DeliverTx` leaves orphaned values. The codebase acknowledges this: PLAN.md notes values are "never GC'd".

Impact: Slow DB bloat. This is not a correctness bug (values are content-addressed and deduplicated by hash), but disk usage grows monotonically with unreferenced values. On a high-throughput chain, this could add GBs per year of dead data.
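
The leak can be modeled with a toy DB and batch. Everything in this sketch (`memDB`, `batch`, `rollbackLeak`, the key names) is hypothetical and only mirrors the direct-write versus staged-write behavior described above.

```go
package main

import "fmt"

// memDB and batch are toy stand-ins for the real DB and batch types.
type memDB map[string][]byte

type batch struct {
	db      memDB
	pending map[string][]byte
}

func newBatch(db memDB) *batch {
	return &batch{db: db, pending: map[string][]byte{}}
}

// Set stages a write; nothing reaches the DB until Write is called.
func (b *batch) Set(k string, v []byte) { b.pending[k] = v }

// Write flushes staged writes to the DB (commit path).
func (b *batch) Write() {
	for k, v := range b.pending {
		b.db[k] = v
	}
	b.pending = map[string][]byte{}
}

// Discard drops staged writes (rollback path).
func (b *batch) Discard() { b.pending = map[string][]byte{} }

// rollbackLeak simulates one rolled-back block and reports which writes survive.
func rollbackLeak() (directSurvives, batchedSurvives bool) {
	db := memDB{}
	b := newBatch(db)

	db["value/direct"] = []byte("v1")    // direct write, as SaveValue is described
	b.Set("value/batched", []byte("v2")) // staged write, as SaveNode is described

	b.Discard() // the block is rolled back

	_, directSurvives = db["value/direct"]
	_, batchedSurvives = db["value/batched"]
	return directSurvives, batchedSurvives
}

func main() {
	direct, batched := rollbackLeak()
	fmt.Println(direct, batched) // true false
}
```

Routing values through the batch would remove the leak in this model, but staged values would then be invisible to DB reads until commit, so a real change would also need reads to consult pending writes.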

Tests: `TestBug3_SaveValueBypassesBatch`, `TestBug3_RollbackLeavesOrphanedValues`, `TestBug3_ValueVisibleBeforeCommit`, `TestBug3_OrphanAccumulation`

Bug 4: No Bounds Checking on Deserialized `numKeys`

Location: `node.go:209`, where `ReadNode()` casts a uvarint to `int16` without validation

Problem: When deserializing a node from the DB, `numKeys` is read as a uvarint and cast to `int16`. No check ensures it falls within `[0, B-1]` (i.e. `[0, 31]`). A value of 32 causes an array index panic. A value of 32768 overflows `int16` to -32768 and is silently accepted, leading to undefined iteration behavior.

Blockchain likelihood: Low but non-zero. It requires corrupted storage, a bit flip, or a malicious state sync snapshot. Standard operation always writes a valid `numKeys`. However, there is no defense against storage corruption: a single corrupted byte can brick a node.
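
A bounds check before the narrowing conversion could look like the sketch below. `readNumKeys`, `encodeUvarint`, `uncheckedCast`, and the `maxKeys` constant are hypothetical names; only `binary.Uvarint` and `binary.PutUvarint` are real standard-library calls, and the range `[0, 31]` is taken from the description above.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// maxKeys is B-1 for B = 32, following the range given in the review.
const maxKeys = 31

// readNumKeys decodes a uvarint key count and validates it before narrowing
// to int16. It returns the count and the number of bytes consumed.
func readNumKeys(buf []byte) (int16, int, error) {
	v, n := binary.Uvarint(buf)
	if n <= 0 {
		return 0, 0, errors.New("malformed uvarint for numKeys")
	}
	if v > maxKeys {
		return 0, 0, fmt.Errorf("numKeys %d out of range [0, %d]", v, maxKeys)
	}
	return int16(v), n, nil
}

// encodeUvarint is a small helper for building inputs.
func encodeUvarint(v uint64) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	return buf[:binary.PutUvarint(buf, v)]
}

// uncheckedCast shows what the unvalidated narrowing conversion does.
func uncheckedCast(v uint64) int16 { return int16(v) }

func main() {
	fmt.Println(uncheckedCast(32768)) // -32768

	for _, v := range []uint64{0, 31, 32, 32768} {
		n, _, err := readNumKeys(encodeUvarint(v))
		fmt.Println(v, n, err)
	}
}
```

Comparing the `uint64` against the limit before converting means the wrap to a negative `int16` can never occur.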

Tests: `TestBug4_NumKeysOverflow`, `TestBug4_NegativeNumKeysOverflow`, `TestBug4_ZeroNumKeys`, `TestBug4_MaxValidNumKeys`

Bug 5: Exporter Goroutine Leak and Permanent Version Reader Lock

Location: `export.go:26`, `go e.run()` with no context cancellation

Problem: `Export()` spawns a goroutine that traverses the tree and sends nodes on a channel. If the caller abandons the `Exporter` without calling `Close()`, the goroutine blocks forever on the channel send and the version reader count is never decremented. This permanently prevents that version from being pruned.

IAVL solves this with `context.WithCancel()`. B+32 has no cancellation mechanism.

Blockchain likelihood: Low currently. Export/Import is only used in tests, not in production state sync. But once state sync is implemented (the interface exists), this becomes a real concern: any timeout, error, or interruption during export leaks a goroutine and locks the version.
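
A cancellation-aware producer in the style of `context.WithCancel()` might be sketched as follows. The `exporter` type here streams integers instead of tree nodes, and `newExporter`, `abandonedExport`, and the reader counter are hypothetical; this is not the IAVL or B+32 implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

// exporter is a toy model: it streams ints instead of tree nodes.
type exporter struct {
	ch     chan int
	done   chan struct{}
	cancel context.CancelFunc
}

// newExporter registers one reader and starts the producer goroutine.
// The goroutine exits, and the reader is released, once ctx is cancelled.
func newExporter(parent context.Context, items []int, readers *atomic.Int32) *exporter {
	ctx, cancel := context.WithCancel(parent)
	e := &exporter{ch: make(chan int), done: make(chan struct{}), cancel: cancel}
	readers.Add(1)
	go func() {
		defer close(e.done)   // runs last: signals that cleanup finished
		defer readers.Add(-1) // release the version reader
		defer close(e.ch)
		for _, it := range items {
			select {
			case e.ch <- it:
			case <-ctx.Done():
				return
			}
		}
	}()
	return e
}

// Close cancels the producer and waits for it to exit.
func (e *exporter) Close() {
	e.cancel()
	<-e.done
}

// abandonedExport reads one item, gives up, and reports the remaining readers.
func abandonedExport() int32 {
	var readers atomic.Int32
	ctx, cancel := context.WithCancel(context.Background())
	e := newExporter(ctx, []int{1, 2, 3}, &readers)
	<-e.ch   // consume one item, then stop reading
	cancel() // e.g. a timeout or error in the caller
	<-e.done // the producer has exited
	return readers.Load()
}

func main() {
	fmt.Println(abandonedExport()) // 0
}
```

The `select` on `ctx.Done()` is what lets the blocked send give up; without it the goroutine would wait on `e.ch` indefinitely after the reader stops.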

Tests: `TestBug5_ExporterGoroutineLeak`, `TestBug5_VersionReaderLeak`

Bug 6: Pruning Deletes Nodes Shared Across Versions (CRITICAL)

Location: `prune.go:103-142`, `walkAndPrune()`

Problem: The pruning algorithm walks the old-version and new-version trees in parallel. For each old inner node, it finds the "corresponding" node in the new tree and builds a set of the new node's child hashes. Any old child whose hash is not in this set is deemed orphaned and deleted.

The flaw: after an inner node split, children from the old node may be distributed across two or more nodes in the new version. The algorithm only checks one corresponding node, so it treats children that moved to a sibling (due to the split) as orphaned and deletes them, even though they are actively referenced by the new version.
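
The flaw can be reproduced in a simplified model where child hashes are plain strings. `orphansSingleNode` and `orphansReachable` are hypothetical helpers written for this illustration; they are not the logic of `walkAndPrune()` itself, and comparing against every new node is only one conceivable remedy.

```go
package main

import "fmt"

// orphansSingleNode models the flawed check: old children are compared
// against the children of one corresponding new node only.
func orphansSingleNode(oldChildren, correspondingNew []string) []string {
	return orphansReachable(oldChildren, correspondingNew)
}

// orphansReachable compares old children against the children of every
// listed new node, so children that moved to a sibling are kept.
func orphansReachable(oldChildren []string, newNodes ...[]string) []string {
	live := map[string]bool{}
	for _, n := range newNodes {
		for _, h := range n {
			live[h] = true
		}
	}
	var orphans []string
	for _, h := range oldChildren {
		if !live[h] {
			orphans = append(orphans, h)
		}
	}
	return orphans
}

func main() {
	old := []string{"A", "B", "C", "D"}

	// After a split in the new version, the old node's children sit in two
	// nodes; "E" is a newly written child.
	left := []string{"A", "B"}
	right := []string{"C", "D", "E"}

	fmt.Println(orphansSingleNode(old, left))       // [C D]
	fmt.Println(orphansReachable(old, left, right)) // []
}
```

In the single-node check, `C` and `D` would be deleted even though the right sibling still references them.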

Blockchain likelihood: Certain. The default pruning strategy is `PruneSyncable` (KeepRecent=705,600, KeepEvery=10). Pruning runs synchronously in every `Commit()`. The first prune at block ~705,601 operates on a tree that has undergone thousands of inner node splits. The bug is guaranteed to trigger.

Impact: Chain halt with three escalating failures:
1. `findCorrespondingChild` traverses into a deleted subtree and panics on a missing node.
2. `Get()`/`Has()` on keys under deleted subtrees panic during ABCI `DeliverTx`.
3. `LoadVersion()` panics when encountering missing nodes; the node cannot start without state sync from peers.

Confirmed by tests:
- `TestBug6_SingleVersionPruneCorruptsTree`: Panic at block ~98 with per-block pruning, 302 cascading errors.
- `TestBug6_PruneCorruptsNewerVersions`: Latest version corrupted; only 4,686 of 18,000 keys readable after bulk prune.
- `TestBug6_PruneBricksNodeOnRestart`: `LoadVersion()` panics on cold restart after pruning.

Tests: `TestBug6_SingleVersionPruneCorruptsTree`, `TestBug6_PruneCorruptsNewerVersions`, `TestBug6_PruneBricksNodeOnRestart`

Usage Patterns Checked
- `Commit()`: `SaveVersion()` then `DeleteVersionsTo(latest - KeepRecent)`
- PruneSyncable: keep 705,600 versions, waypoint every 10th
- `Rollback()`
- `GetImmutable()`
- `SetValueResolver()`
- Export/Import
- `LoadVersion(latest)` on node startup

Bottom Line

Bug #6 is a ship-blocker. Any chain running B+32 with default pruning will eventually panic and brick. The 705,600-block delay before the first prune means it passes all testing but fails catastrophically in production (705,600 seconds, or about 8 days, at 1-second blocks).

Bug #3 is the only other bug guaranteed to manifest. It causes slow DB bloat but not correctness failures.

Bugs #2, #4, #5 are dormant under current usage but represent landmines for future features (state sync, external tooling, storage corruption recovery).

Bug #1 is not currently exploitable but is one `append()` away from becoming a silent corruption bug.