Allow constructing `Wollet` from materialized state, skipping `Update` replay

Hi everyone,

I run a production hot wallet on top of `lwk_wollet` that has accumulated **30k+ transactions** over its lifetime. As the update log and the in-memory caches grew, the process began to OOM-crash on boot: rehydration via `Wollet::new` loads the full historical state (transactions, derivation maps, heights, timestamps) into resident memory before the wallet is usable, and on our hardware that final state exceeds the available RAM budget.

We already maintain an **external indexer** that holds the canonical transaction history and can serve any subset of it on demand, so most of what `Wollet` rebuilds at boot is data we do not need to keep in process to operate. What we are missing from LWK is a way to construct a `Wollet` directly from a minimal, application-supplied state — bypassing the `Update`-replay path that forces full history into memory.

This issue describes that gap and proposes a narrow addition (`Wollet::from_state`) to close it, while being explicit about which existing LWK mechanisms are *not* sufficient and why.

## Summary

`lwk_wollet` already exposes a public `Persister` trait ([`persister.rs:35`](lwk_wollet/src/persister.rs#L35)) and a `Wollet::new(network, Arc<dyn DynStore>, descriptor)` constructor ([`wollet.rs:448`](lwk_wollet/src/wollet.rs#L448)) that accepts any custom backend, so applications can already plug in their own storage (database, key-value store, etc.). That part is solved.

The remaining gap is that rehydration always goes through **linear replay of every persisted `Update`** in `Wollet::restore_updates` ([`update.rs:378`](lwk_wollet/src/update.rs#L378)):

```rust
pub(crate) fn restore_updates(&mut self) -> Result<(), Error> {
    for i in 0.. {
        if let Some(update) = self.updates_persister.get_update(i)? {
            self.apply_update_inner(update)?;
        } else {
            let mut next_update_index = self.updates_persister.next_update_index()?;
            *next_update_index = self.updates_persister.merge_updates(i)?;
            break;
        }
    }
    Ok(())
}
```

I would like a constructor that takes already-materialized state and skips replay entirely, e.g. `Wollet::from_state(network, descriptor, WolletFullState)`.

## Motivation

In production we operate large, long-lived wallets whose persisted update log grows to hundreds of MB and whose fully-rehydrated `Wollet` reaches multi-GB resident memory. Boot time is dominated by `restore_updates`, and steady-state memory is dominated by the in-memory caches that replay populates. We have an external indexer that already holds the canonical transaction history and can produce a minimal materialized state on demand; we need a way to feed that state into `Wollet` without round-tripping through the `Update` log.

## Why existing mechanisms do not solve this

LWK already has compaction-flavored machinery, and it is worth being explicit about why none of it closes this gap:

1. **`only_tip` coalescing** ([`update.rs:145-159`](lwk_wollet/src/update.rs#L145-L159)). When a new `Update` only moves the tip and the previous one did too, the persister overwrites the previous entry instead of appending. The `Persister` trait doc explicitly asks implementors to behave this way.
2. **`Update::prune`** ([`update.rs:231`](lwk_wollet/src/update.rs#L231)) and the per-batch `prune` in [`update.rs:36`](lwk_wollet/src/update.rs#L36). Strips transactions/scripts not relevant to the wallet from a single `Update` before persisting.
3. **`merge_threshold`** ([`wollet.rs:113`](lwk_wollet/src/wollet.rs#L113), implemented in [`update.rs:91`](lwk_wollet/src/update.rs#L91)). When more than `threshold` updates are persisted, they are folded into a single consolidated `Update` at index `0` via `Update::merge` ([`update.rs:314`](lwk_wollet/src/update.rs#L314)). With `with_persisted_txs` ([`wollet.rs:103`](lwk_wollet/src/wollet.rs#L103)) this is forced to `Some(1)`, so at most one `Update` is ever on disk.

These three mechanisms address **disk footprint and boot time**. They do not address resident memory, which is the motivation here.
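To make the distinction concrete, here is a minimal toy model of the first and third mechanisms as described above. `LogEntry` and `ToyLog` are hypothetical names invented for this sketch, not LWK types, and the real persister and `Update::merge` logic will differ in detail. The point is only that both mechanisms shrink the *log*, not the data that replay ends up loading.

```rust
/// Hypothetical stand-in for a persisted update (not LWK's `Update`).
#[derive(Clone, Debug, PartialEq)]
pub struct LogEntry {
    pub tip: u32,
    pub txids: Vec<u32>,
}

impl LogEntry {
    /// True when the entry carries no transactions and only moves the tip.
    pub fn only_tip(&self) -> bool {
        self.txids.is_empty()
    }
}

/// Hypothetical append-only log with tip coalescing and a merge threshold.
#[derive(Debug, Default)]
pub struct ToyLog {
    pub entries: Vec<LogEntry>,
    pub merge_threshold: Option<usize>,
}

impl ToyLog {
    pub fn push(&mut self, entry: LogEntry) {
        // Coalescing: a tip-only entry overwrites a preceding tip-only entry.
        let coalesce =
            entry.only_tip() && self.entries.last().map_or(false, LogEntry::only_tip);
        if coalesce {
            *self.entries.last_mut().expect("checked non-empty above") = entry;
        } else {
            self.entries.push(entry);
        }
        // Merging: past the threshold, fold everything into one entry at index 0.
        if let Some(threshold) = self.merge_threshold {
            if self.entries.len() > threshold {
                self.merge_all();
            }
        }
    }

    fn merge_all(&mut self) {
        let tip = self.entries.last().map_or(0, |e| e.tip);
        let txids = self.entries.drain(..).flat_map(|e| e.txids).collect();
        self.entries.push(LogEntry { tip, txids });
    }
}
```

In this model the number of entries drops, but every transaction id ever pushed is still present in the merged entry, so a replay loads the same set.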

A consolidated snapshot-style update — even a hypothetical perfect one — produces the same in-memory `Wollet` state as replaying the original sequence. After construction, `cache.all_txs`, the script/derivation maps, the height map, and the unspent set hold exactly the same data regardless of whether they were built from 10,000 small `Update`s or one merged one. The cost is in the **final state**, not in the replay path. Replacing many small persisted entries with one large entry is a change of intermediate representation, not of steady-state memory.
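The argument in the previous paragraph can be checked on a toy model. The sketch below uses hypothetical types (`ToyUpdate`, `ToyCache`) that only mimic the shape of the problem, and it assumes that applying an update amounts to inserting transactions into a map. Under that assumption, replaying a long log and replaying its merged form yield identical caches with identical resident size.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a persisted update: (txid, raw bytes) pairs.
#[derive(Clone, Debug, Default)]
pub struct ToyUpdate {
    pub txs: Vec<(u32, Vec<u8>)>,
}

/// Hypothetical stand-in for the wallet's in-memory transaction cache.
#[derive(Debug, Default, PartialEq)]
pub struct ToyCache {
    pub all_txs: BTreeMap<u32, Vec<u8>>,
}

impl ToyCache {
    /// Apply one update; a later entry overwrites an earlier one for the same id.
    pub fn apply(&mut self, update: &ToyUpdate) {
        for (txid, raw) in &update.txs {
            self.all_txs.insert(*txid, raw.clone());
        }
    }

    /// Total bytes of transaction data held after replay.
    pub fn resident_bytes(&self) -> usize {
        self.all_txs.values().map(Vec::len).sum()
    }
}

/// Compaction: fold a whole log into a single update.
pub fn merge(log: &[ToyUpdate]) -> ToyUpdate {
    let mut merged = BTreeMap::new();
    for update in log {
        for (txid, raw) in &update.txs {
            merged.insert(*txid, raw.clone());
        }
    }
    ToyUpdate { txs: merged.into_iter().collect() }
}

/// Rehydration: rebuild a cache by replaying every update in order.
pub fn replay(log: &[ToyUpdate]) -> ToyCache {
    let mut cache = ToyCache::default();
    for update in log {
        cache.apply(update);
    }
    cache
}
```

Calling `replay(&log)` and `replay(&[merge(&log)])` on the same log produces equal `ToyCache` values, which is why compaction alone cannot lower steady-state memory in this model.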

Concretely, in our deployment:

| Mechanism                        | Disk                   | Boot time | Resident RAM           |
|----------------------------------|------------------------|-----------|------------------------|
| Status quo                       | ~800 MB                | ~30 s     | ~7 GB                  |
| Hypothetical aggressive merge    | ~50 MB                 | ~5 s      | ~7 GB                  |
| `from_state` with semantic prune | application-controlled | ~0 s      | application-controlled |

## What would actually reduce resident memory

Semantic pruning, not compaction. A `from_state` path lets the application supply only the data it needs to operate going forward — for example:

- Only transactions producing UTXOs that are still unspent.
- Only the derivation range from `last_unused` onward.
- Only the current tip, not historical tips/timestamps.

Trade-offs the application opts into when using this path:

- `wallet.transactions()` no longer returns full history.
- Reorgs deeper than the snapshot horizon require a rescan.
- Audit/tax features built on top of the wallet must source history elsewhere.

These are acceptable for our use case (transaction history lives in an external indexer); they would not be acceptable as defaults. Hence the request is an **additional constructor**, not a behavior change to existing ones.
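As an illustration of the first pruning rule in this section (keep only transactions that still fund an unspent output), here is a minimal sketch of what an application-side pruner could look like. All types are hypothetical simplifications: `ToyTx`, `PrunedState`, and `semantic_prune` do not exist in LWK, and txids are plain integers to keep the example standalone.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical simplified outpoint: (txid, output index).
pub type OutPoint = (u32, u32);

/// Hypothetical transaction view: what it spends and which outputs are ours.
#[derive(Clone, Debug)]
pub struct ToyTx {
    pub txid: u32,
    pub spends: Vec<OutPoint>,
    pub wallet_vouts: Vec<u32>,
}

/// Hypothetical pruned state an application might hand to `from_state`.
#[derive(Debug, Default)]
pub struct PrunedState {
    pub tip_height: u32,
    pub unspent: BTreeSet<OutPoint>,
    pub txs: BTreeMap<u32, ToyTx>,
}

/// Keep only the current tip and the transactions with a live wallet output.
pub fn semantic_prune(history: &[ToyTx], tip_height: u32) -> PrunedState {
    // Every outpoint consumed anywhere in the known history.
    let spent: BTreeSet<OutPoint> = history
        .iter()
        .flat_map(|tx| tx.spends.iter().copied())
        .collect();

    let mut state = PrunedState { tip_height, ..Default::default() };
    for tx in history {
        // Wallet-owned outputs of this transaction that nothing has spent yet.
        let live: Vec<OutPoint> = tx
            .wallet_vouts
            .iter()
            .map(|&vout| (tx.txid, vout))
            .filter(|outpoint| !spent.contains(outpoint))
            .collect();
        if !live.is_empty() {
            state.unspent.extend(live);
            state.txs.insert(tx.txid, tx.clone());
        }
    }
    state
}
```

For a wallet whose history is mostly spent outputs, the retained transaction set in this model scales with the UTXO count rather than with lifetime transaction count, which is the property we are after.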

## Proposed shape

```rust
impl Wollet {
    /// Construct a `Wollet` from already-materialized state, skipping the
    /// `Update`-replay path used by `Wollet::new`.
    ///
    /// The caller is responsible for the consistency of `state`. Whatever the
    /// caller omits is simply absent from the resulting wallet: transactions
    /// not included will not appear in `wallet.transactions()`, and reorgs
    /// deeper than the snapshot horizon will require a rescan.
    pub fn from_state(
        network: Network,
        descriptor: WolletDescriptor,
        state: WolletFullState,
    ) -> Result<Self, Error>;
}
```

Where `WolletFullState` is a public, serializable struct exposing the fields needed to populate `Cache` directly: tip, scripts/paths maps, heights, unspent set, `last_unused` counters, and whatever subset of transactions the caller chooses to include.

The existing private `WolletConciseState` ([`wollet.rs:261`](lwk_wollet/src/wollet.rs#L261)) is close to what is needed for the read/scan side, but a write-side equivalent would need the transaction set too. Happy to iterate on the exact shape.
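To anchor the discussion, here is one possible layout for such a struct, together with the kind of cheap consistency check `from_state` could run before accepting it. This is a sketch only: `FullStateSketch`, its fields, and `StateError` are all hypothetical, and plain integers and byte vectors stand in for LWK's real `Txid`, `Script`, and block hash types.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical field layout for a materialized wallet state.
#[derive(Debug, Default)]
pub struct FullStateSketch {
    /// (height, block hash) of the current tip.
    pub tip: (u32, [u8; 32]),
    /// script bytes -> (chain, derivation index).
    pub paths: BTreeMap<Vec<u8>, (u8, u32)>,
    /// txid -> confirmation height; `None` means unconfirmed.
    pub heights: BTreeMap<u64, Option<u32>>,
    /// Unspent outpoints as (txid, vout).
    pub unspent: BTreeSet<(u64, u32)>,
    /// (external, internal) last-unused counters.
    pub last_unused: (u32, u32),
    /// Whatever subset of raw transactions the caller chose to include.
    pub txs: BTreeMap<u64, Vec<u8>>,
}

#[derive(Debug, PartialEq)]
pub enum StateError {
    UnspentWithoutTx(u64),
    TxWithoutHeight(u64),
    HeightAboveTip(u64),
}

impl FullStateSketch {
    /// Reject states that are internally inconsistent.
    pub fn validate(&self) -> Result<(), StateError> {
        // Every unspent outpoint must reference an included transaction.
        for (txid, _vout) in &self.unspent {
            if !self.txs.contains_key(txid) {
                return Err(StateError::UnspentWithoutTx(*txid));
            }
        }
        // Every included transaction needs a height entry at or below the tip.
        for txid in self.txs.keys() {
            match self.heights.get(txid) {
                None => return Err(StateError::TxWithoutHeight(*txid)),
                Some(Some(h)) if *h > self.tip.0 => {
                    return Err(StateError::HeightAboveTip(*txid))
                }
                _ => {}
            }
        }
        Ok(())
    }
}
```

A check along these lines would keep the "caller is responsible for consistency" contract while still catching obviously malformed input early. Which invariants are worth enforcing is open for discussion.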

## Scope and non-goals

- Pure addition; no change to existing constructors, persister semantics, or default behavior.
- Does not require touching `merge_threshold`, `only_tip` coalescing, or `Update::prune`.
- Not proposing to remove or weaken history retention by default — applications that need full history continue to use `Wollet::new`.
- Migration path for existing deployments is the application's responsibility: build a `WolletFullState` once from a full-replayed `Wollet`, persist it externally, then use `from_state` thereafter.

## Alternatives considered

- **Tuning `merge_threshold`** — addressed above; reduces disk and boot, not RAM.
- **Custom `Persister` that fabricates a single synthetic `Update`** — possible today, but `Update` is not a stable serialization target for materialized state, and the replay still goes through `apply_update_inner`, paying the same allocation cost. A direct `Cache` injection is cleaner and explicit about the contract.
- **Lazy/streamed transaction loading inside `Cache`** — larger refactor, changes semantics of existing accessors, and not needed if the application can supply a pruned state up front.

Happy to send a PR if the direction is acceptable.

