feat(ffi): `Database.Close()` guarantees proposals committed or freed #1349

ARR4N · 2025-10-10T11:56:50Z

Explicitly freeing proposals that are propagated through libevm (i.e. geth) plumbing has proven difficult when they are not committed, as they are simply dropped for the GC to collect. Furthermore, calls to Proposal.Drop() (or Commit()) must strictly precede Database.Close() to avoid segfaults. This PR implements a fix for both issues:

1. All new Proposals have a GC finalizer attached, which calls Drop(). This is safe because it is a no-op if called twice or after a call to Commit().
2. The Database has a sync.WaitGroup introduced, which tracks all outstanding proposals. Calls to Commit() / Drop() decrement the group counter (only once per Proposal).
3. Database.Close() waits on the WaitGroup before freeing its own handle, avoiding segfaults.

Assuming that all calls to Database.Propose() and Proposal.Propose() occur before the call to Database.Close(), this satisfies sync.WaitGroup's documented requirement for the ordering of calls to Add() and Wait().

An integration test demonstrates blocking and eventual return of Database.Close(), specifically due to the unreachability of un-dropped, un-committed Proposals, resulting in their finalizers decrementing the WaitGroup.
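As a rough illustration of the mechanism described above, here is a minimal sketch without any cgo. The `database` and `proposal` types, their lower-case methods and the boolean fields are hypothetical stand-ins for the real FFI types and C handles; only the finalizer, `sync.Once` and `sync.WaitGroup` interplay is the point.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// database and proposal are hypothetical stand-ins for the FFI types; the
// real ones wrap C handles, which are simulated here by plain booleans.
type database struct {
	proposals sync.WaitGroup // counts proposals neither committed nor dropped
	closed    bool
}

type proposal struct {
	db       *database
	freeOnce sync.Once
	freed    bool
}

// propose registers a new proposal with the WaitGroup and attaches a
// finalizer so that an unreachable proposal is eventually dropped by the GC.
func (db *database) propose() *proposal {
	db.proposals.Add(1)
	p := &proposal{db: db}
	runtime.SetFinalizer(p, func(p *proposal) { p.drop() })
	return p
}

// drop releases the simulated handle. The Once guarantees that the counter
// is decremented exactly once, however many times drop is called.
func (p *proposal) drop() {
	p.freeOnce.Do(func() {
		p.freed = true
		p.db.proposals.Done()
	})
}

// close blocks until every proposal has been released and only then
// releases the database's own simulated handle.
func (db *database) close() {
	db.proposals.Wait()
	db.closed = true
}

// demo drops one proposal twice and reports whether close then returned.
func demo() bool {
	db := &database{}
	p := db.propose()
	p.drop()
	p.drop() // no-op: must not drive the WaitGroup counter negative
	db.close()
	return db.closed && p.freed
}

func main() {
	fmt.Println("closed cleanly:", demo())
}
```

The sketch only exercises the explicit-drop path; whether and when a finalizer runs is up to the Go runtime, so it is deliberately not relied upon here.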

ffi/proposal.go

alarso16 · 2025-10-10T14:48:54Z

ffi/proposal.go

 }

+func (p *Proposal) afterDisowned() {
+	p.freeOnce.Do(func() {


It's good that we've prevented some racy behavior here anyway

What do you mean? Can you give examples? Technically this Once isn't necessary, but I put it in defensively in case the rest of the code is refactored and current invariants no longer hold.

We do rely on the consumer not trying to simultaneously Commit and Drop, which seems reasonable. However, if they did, the result would be UB. It's probably best to make UB completely impossible, even if the actions needed to trigger it are unreasonable

They can definitely still race and both call the Rust code, but only under invalid usage as you say. This just guarantees that the WaitGroup never panics by going negative.

This won't prevent racy behavior. If we wanted to do that, the freeOnce call also needs to wrap the C.fwd_free_proposal and C.fwd_commit_proposal calls as there's nothing preventing a concurrent call to Drop()/Commit() while this is running.

Ahh I see. Is that worth preventing (in a separate PR)?

Putting them in the Once would be problematic as, if either returned an error, then they couldn't be called again.

There's no need to place it in a separate PR IMO as it's simply a mutex. It's also worth doing to allay @demosdemon's concerns here:

> That could also potentially allow the finalizer to run on the proposal concurrently while commit is running; but, I believe that won't actually happen in practice because of the Once.

It's true that it won't actually happen because the Proposal remains alive long enough, but an explicit lock is much easier to reason about than GC lifetime.
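To make the mutex idea in this thread concrete, here is a simplified sketch. It is not the PR's `disownHandle`: the `proposal` type, the `disown` method, the `errDropped` sentinel and the `*int` standing in for the C handle are all assumptions made for illustration. It shows the two properties discussed: concurrent callers cannot both reach the handle, and, unlike with `sync.Once`, a failed call can be retried.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDropped is a hypothetical sentinel returned once the handle is gone.
var errDropped = errors.New("proposal already dropped")

// proposal is a hypothetical stand-in; handle simulates the C pointer.
type proposal struct {
	mu     sync.Mutex
	handle *int
}

// disown runs fn on the handle while holding the lock, so a concurrent
// commit and drop can never both reach the underlying call. A failed fn
// leaves the handle in place, which allows a later retry.
func (p *proposal) disown(fn func(*int) error) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.handle == nil {
		return errDropped
	}
	if err := fn(p.handle); err != nil {
		return err
	}
	p.handle = nil
	return nil
}

// raceOnce starts n goroutines that all try to disown the same proposal and
// returns how many of them actually reached the handle.
func raceOnce(n int) int {
	h := 0
	p := &proposal{handle: &h}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = p.disown(func(h *int) error { *h++; return nil })
		}()
	}
	wg.Wait()
	return h
}

// retryAfterError checks that a failure keeps the handle, a subsequent
// success releases it, and any later call reports errDropped.
func retryAfterError() bool {
	h := 0
	p := &proposal{handle: &h}
	fail := errors.New("transient failure")
	first := p.disown(func(*int) error { return fail })
	second := p.disown(func(*int) error { return nil })
	third := p.disown(func(*int) error { return nil })
	return first == fail && second == nil && third == errDropped
}

func main() {
	fmt.Println("calls that reached the handle:", raceOnce(8))
	fmt.Println("retry semantics hold:", retryAfterError())
}
```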

ffi/proposal.go

alarso16 · 2025-10-10T14:51:04Z

ffi/firewood_test.go

+	case <-done:
+		t.Errorf("%T.Close() returned with undropped %T", db, p0) //nolint:forbidigo // Use of require is impossible without a hack like require.False(true)
+	case <-time.After(300 * time.Millisecond):
+		// TODO(arr4n) use `synctest` package when at Go 1.25


This is neat, I've never heard of this. This does seem to solve a pretty common pattern in testing.

Yeah, I can't wait to start using it!

In theory we could use it now if we add GOEXPERIMENT=synctest

> In theory we could use it now if we add GOEXPERIMENT=synctest

Unfortunately it requires compiling Go itself with this, not just running the test.
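For readers unfamiliar with the pattern in the diff above, the test asserts that a call is still blocked after a grace period by racing a `done` channel against `time.After`. A stand-alone sketch follows, assuming a hypothetical helper named `blocksFor` and a 50 ms timeout chosen only for the demo.

```go
package main

import (
	"fmt"
	"time"
)

// blocksFor reports whether fn is still running after d has elapsed. fn runs
// in its own goroutine, which keeps running if the timeout fires first.
func blocksFor(fn func(), d time.Duration) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		fn()
	}()
	select {
	case <-done:
		return false
	case <-time.After(d):
		return true
	}
}

// demo checks one function that blocks until released and one that returns
// immediately.
func demo() (blocked, returned bool) {
	release := make(chan struct{})
	blocked = blocksFor(func() { <-release }, 50*time.Millisecond)
	close(release) // let the first goroutine finish
	returned = !blocksFor(func() {}, 50*time.Millisecond)
	return blocked, returned
}

func main() {
	blocked, returned := demo()
	fmt.Println("blocking call detected:", blocked)
	fmt.Println("non-blocking call detected:", returned)
}
```

The weakness of this approach is the wall-clock wait: a timeout can only show that the call has not returned yet, not that it never will, which is presumably why the TODO looks forward to `synctest`.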

Co-authored-by: Austin Larson <[email protected]> Signed-off-by: Arran Schlosberg <[email protected]>

ffi/proposal.go

alarso16 · 2025-10-10T15:10:45Z

ffi/proposal.go

 }

+func (p *Proposal) afterDisowned() {
+	p.freeOnce.Do(func() {


We do rely on the consumer not trying to simultaneously Commit and Drop, which seems reasonable. However, if they did, the result would be UB. It's probably best to make UB completely impossible, even if the actions needed to trigger it are unreasonable

ffi/proposal.go

demosdemon

Really like the use of WaitGroup. My main concern is the explicit GC call.

But, something else I noticed is that there's nothing clearing the finalizer that's been set. So, the finalizer will always call Drop even if Drop or Commit was called outside of the finalizer. Ideally we would clear the finalizer to prevent that from happening if there was an explicit call. That could also potentially allow the finalizer to run on the proposal concurrently while commit is running; but, I believe that won't actually happen in practice because of the Once.

demosdemon · 2025-10-10T16:04:44Z

ffi/firewood.go

 		return nil
 	}

+	runtime.GC()


Not a fan of the GC call. I assume it's to try to eagerly run any outstanding finalizers. But, GC will also include everything else and may penalize us more than necessary.

> I assume it's to try to eagerly run any outstanding finalizers

Yup

> GC will also include everything else and may penalize us more than necessary

Good point. I've put it in a separate goroutine to avoid this, but I think it's important to still include due to the above.

rkuris · 2025-10-10T15:59:56Z

ffi/firewood.go

 	}

+	runtime.GC()
+	db.proposals.Wait()


Does this mean that a leaked proposal now becomes a hang at exit time?

If so, we should consider returning an error instead of hanging. A hang would be much harder to debug if it happens on a system we have no control over, whereas a log somewhere that says "hey there was a leaked proposal" would make debugging a lot easier.

One way to do this is via a timeout, maybe with some large amount of time (60 seconds).

The idiomatic approach is to accept a Context and then add an extra timeout. @alarso16 do we absolutely have to conform to the kvBackend interface (which precludes adding the Context)?

No, this is consumed by triedb. This struct doesn't implement any particular interface required by libevm, but will be called on DBOverride.Close(). Accepting a context would be my first thought anywhere else, so maybe we just require the consumer to acknowledge that the operation may hang by passing a context. libevm can create an ephemeral context I guess.

Context added.
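One way the context-aware wait discussed in this thread could look is sketched below. This is an assumption-laden illustration rather than the merged code: the `database` type, the lower-case `close` method and the `errOutstanding` sentinel are hypothetical, and the real method goes on to free the C handle.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"runtime"
	"sync"
	"time"
)

// errOutstanding is a hypothetical error for a close that gave up waiting.
var errOutstanding = errors.New("at least one reachable proposal neither dropped nor committed")

// database is a hypothetical stand-in holding only the proposal counter.
type database struct {
	proposals sync.WaitGroup
}

// close waits for all proposals to be released or for ctx to end, whichever
// happens first.
func (db *database) close(ctx context.Context) error {
	// Nudge finalizers of unreachable proposals without blocking the caller.
	go runtime.GC()

	done := make(chan struct{})
	go func() {
		db.proposals.Wait()
		close(done)
	}()
	select {
	case <-done:
		return nil // the real code would free the database handle here
	case <-ctx.Done():
		return errOutstanding
	}
}

// demo closes once with a leaked proposal and once after it is released.
func demo() (leaked, clean error) {
	db := &database{}
	db.proposals.Add(1) // simulates one outstanding proposal

	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	leaked = db.close(ctx)

	db.proposals.Done() // the proposal is finally dropped
	clean = db.close(context.Background())
	return leaked, clean
}

func main() {
	leaked, clean := demo()
	fmt.Println("with a leaked proposal:", leaked)
	fmt.Println("after release:", clean)
}
```

Note that in this sketch the goroutine calling `Wait()` lingers after a timeout until the remaining proposals are released, so a caller that gives up still holds a small amount of state.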

rkuris · 2025-10-10T16:02:12Z

ffi/firewood_test.go

+	case <-done:
+		t.Errorf("%T.Close() returned with undropped %T", db, p0) //nolint:forbidigo // Use of require is impossible without a hack like require.False(true)
+	case <-time.After(300 * time.Millisecond):
+		// TODO(arr4n) use `synctest` package when at Go 1.25


In theory we could use it now if we add GOEXPERIMENT=synctest

ARR4N · 2025-10-13T11:24:33Z

> Really like the use of WaitGroup. My main concern is the explicit GC call.

Addressed in your code-specific comment

> But, something else I noticed is that there's nothing clearing the finalizer that's been set. So, the finalizer will always call Drop even if Drop or Commit was called outside of the finalizer. Ideally we would clear the finalizer to prevent that from happening if there was an explicit call. That could also potentially allow the finalizer to run on the proposal concurrently while commit is running; but, I believe that won't actually happen in practice because of the Once.

A call to Drop after a call to either of the others is a no-op, and the two are now thread-safe with respect to each other although your final point is correct about it not actually happening. Further details here.

alarso16

I think one question we may want to answer prior to merging this is how this should affect the proposal API. Should Drop even be exposed to the user if we guarantee to free memory on GC? I think yes, it should still be available, but wanted to check with others

alarso16 · 2025-10-13T13:49:00Z

ffi/proposal.go

+
+// disownHandle is the common path of [Proposal.Commit] and [Proposal.Drop], the
+// `fn` argument defining the method-specific behaviour.
+func (p *Proposal) disownHandle(fn func(*C.ProposalHandle) error, disownEvenOnErr bool) error {


I like that the behavior is clarified for what happens to the lifetime in the error case

Me too. It actually caught me off guard when I first did this refactoring.

alarso16 · 2025-10-13T13:49:57Z

ffi/proposal.go

 func (p *Proposal) Drop() error {
-	if p.handle == nil {
-		return nil
+	if err := p.disownHandle(dropProposal, false); err != nil && err != errDroppedProposal {


Is using errors.Is better practice? Or since we know that the err isn't wrapped, this check is easier?

> Is using errors.Is better practice?

Nope, it's redundant here.

> Or since we know that the err isn't wrapped, this check is easier?

Exactly. It's useful when there's a desire to wrap an error, but in this case there isn't any.
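The difference being discussed can be seen in a small stand-alone example. The sentinel name `errDroppedProposal` comes from the diff above, but its message text and the wrapping call are made up here for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errDroppedProposal mirrors the sentinel named in the diff; the message
// text is an assumption.
var errDroppedProposal = errors.New("proposal already dropped")

// demo compares direct equality with errors.Is for an unwrapped and a
// wrapped sentinel.
func demo() (direct, wrappedEq, wrappedIs bool) {
	var err error = errDroppedProposal
	wrapped := fmt.Errorf("drop: %w", errDroppedProposal)

	direct = err == errDroppedProposal                    // unwrapped: == suffices
	wrappedEq = wrapped == errDroppedProposal             // wrapped: == no longer matches
	wrappedIs = errors.Is(wrapped, errDroppedProposal)    // wrapped: errors.Is unwraps
	return direct, wrappedEq, wrappedIs
}

func main() {
	direct, wrappedEq, wrappedIs := demo()
	fmt.Println(direct, wrappedEq, wrappedIs)
}
```

So the `!=` check is sufficient only for as long as nothing on the path wraps the sentinel, which is the invariant the reply relies on.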

ARR4N · 2025-10-15T14:52:31Z

> Should Drop even be exposed to the user if we guarantee to free memory on GC? I think yes, it should still be available, but wanted to check with others

I think it absolutely should be, otherwise the only way to close the database is to guarantee that every proposal becomes unreachable.

alarso16 · 2025-10-30T13:25:26Z

ffi/firewood.go

+		return fmt.Errorf("at least one reachable %T neither dropped nor committed", &Proposal{})
+	}
+
 	if err := getErrorFromVoidResult(C.fwd_close_db(db.handle)); err != nil {


So if the user makes a mistake and forgets to drop a proposal, the database won't close. I think this is the behavior we should enforce, but it does seem weird

ARR4N added 2 commits October 10, 2025 12:56

feat(ffi): Database.Close() guarantees proposals committed or freed

3dc5df7

chore: placate the linter

d7e9c3c

ARR4N marked this pull request as ready for review October 10, 2025 12:24

ARR4N requested review from alarso16, demosdemon and rkuris as code owners October 10, 2025 12:24

ARR4N self-assigned this Oct 10, 2025

alarso16 reviewed Oct 10, 2025


ARR4N and others added 2 commits October 10, 2025 15:58

doc: fix Proposal.Propose() method comment

1920b74

Co-authored-by: Austin Larson <[email protected]> Signed-off-by: Arran Schlosberg <[email protected]>

doc: WaitGroup in Proposals

bc316c3

alarso16 reviewed Oct 10, 2025


demosdemon reviewed Oct 10, 2025


rkuris requested changes Oct 10, 2025


ARR4N added 2 commits October 13, 2025 11:52

feat: Database.Close() calls runtime.GC() in go routine

c89cc1d

feat: thread-safe Proposal.Commit() and Drop() w.r.t. each other

4337b06

Merge branch 'main' into arr4n/proposal-lifetime

b1d8308

ARR4N requested review from alarso16 and demosdemon October 13, 2025 11:38

alarso16 reviewed Oct 13, 2025


ARR4N and others added 4 commits October 15, 2025 15:56

Merge branch 'main' into arr4n/proposal-lifetime

b2cb25b

doc: refer to Proposal.Drop(), not Done()

f5ea3f8

Merge branch 'main' into arr4n/proposal-lifetime

b8c6231

refactor!: Database.Close() accepts a context

9c0765d

alarso16 reviewed Oct 30, 2025
