Pre-Submission Checklist
LND Version
v0.19.3-beta (the bug is present unchanged on current master, afeb9e1)
LND Configuration
Relevant settings:
A gRPC client is permanently connected to routerrpc.Router/HtlcInterceptor and resolves intercepted HTLCs (settle with preimage / fail / resume).
Backend Version
Bitcoin Core (the bug is backend-independent and reproducible on regtest)
Backend Configuration
Not related to this bug.
OS/Distribution
Linux (Kubernetes)
Bug Details & Steps to Reproduce
Summary: HTLCs offered to the interceptor through the on-chain resolution flow (witness_beacon.go, added in #6219) are silently evicted from the interceptor held set on the first new block after being offered, because their AutoFailHeight is never set and the interceptor watchdog sweep (added in #6831) treats the zero value as "already expired". After the eviction, a Settle from the interceptor returns fwd not found (and tears down the interceptor stream), the preimage never reaches the witness beacon, and the HTLC is eventually claimed by the counterparty via the timeout path. This caused a direct loss of funds for us (incident details below).
Mechanism
When a channel force-closes with an unresolved intercepted HTLC, the incoming contest resolver offers it to the interceptor again so that it can still supply the preimage for an on-chain claim. The packet is built without AutoFailHeight (left at its zero value):
    // lnd/witness_beacon.go, lines 96 to 110 in afeb9e1
    packet := &htlcswitch.InterceptedPacket{
        Hash:           htlc.RHash,
        IncomingExpiry: htlc.RefundTimeout,
        IncomingAmount: htlc.Amt,
        IncomingCircuit: models.CircuitKey{
            ChanID: chanID,
            HtlcID: htlc.HtlcIndex,
        },
        OutgoingChanID:       payload.FwdInfo.NextHop,
        OutgoingExpiry:       payload.FwdInfo.OutgoingCTLV,
        OutgoingAmount:       payload.FwdInfo.AmountToForward,
        InOnionCustomRecords: payload.CustomRecords(),
        InWireCustomRecords:  htlc.CustomRecords,
    }
    copy(packet.OnionBlob[:], nextHopOnionBlob)
The forward enters the same heldHtlcSet as off-chain intercepts. On every new block, failExpiredHtlcs calls popAutoFails:
    // lnd/htlcswitch/held_htlc_set.go, lines 42 to 52 in afeb9e1
    func (h *heldHtlcSet) popAutoFails(height uint32, cb func(InterceptedForward)) {
        for key, fwd := range h.set {
            if uint32(fwd.Packet().AutoFailHeight) > height {
                continue
            }

            cb(fwd)

            delete(h.set, key)
        }
    }

Annotated for an on-chain forward:

    if uint32(fwd.Packet().AutoFailHeight) > height { // 0 > height: never true
        continue
    }
    cb(fwd)            // FailWithCode -> ErrCannotFail for on-chain forwards: logged and ignored
    delete(h.set, key) // entry removed unconditionally
For an on-chain forward, FailWithCode returns ErrCannotFail ("cannot fail in the on-chain flow") by design — but the entry is deleted from the set regardless. The net effect is that every on-chain intercepted HTLC survives in the held set for at most one block (~10 minutes). Any Settle arriving after that gets fwd not found, which additionally terminates the whole HtlcInterceptor stream for all other in-flight intercepts.
The off-chain interception path sets the field correctly:

    // lnd/htlcswitch/interceptable_switch.go, lines 516 to 522 in afeb9e1
    intercepted := &interceptedForward{
        htlc:       htlc,
        packet:     packet,
        htlcSwitch: s.htlcSwitch,
        autoFailHeight: int32(packet.incomingTimeout -
            s.cltvRejectDelta),
    }

So only the on-chain rescue path is affected. The two interacting changes were merged about six months apart (#6219 in Apr 2022, #6831 in Oct 2022), so every release since v0.16.0 is affected.
Steps to reproduce (regtest)
- Three nodes A -> B -> C. B runs with requireinterceptor=true and has an interceptor client connected.
- Pay an invoice from A to C. Have the interceptor hold the intercepted HTLC at B (no resolution yet).
- Force-close the A–B channel. Once the close confirms, the contest resolver offers the HTLC to the interceptor through the on-chain flow.
- Mine one block. B logs [ERR] HSWC: Cannot fail packet: cannot fail in the on-chain flow and the held entry is gone.
- Have the interceptor send Settle with the correct preimage: the resolution fails with fwd not found, the interceptor stream is terminated, and B never claims the HTLC on-chain even though it had the preimage well before the HTLC's expiry.
Production incident (all times UTC)
| When | What happened |
|---|---|
| Jun 6, 00:01 | Peer force-closed the incoming channel (block 952544) carrying an unresolved intercepted HTLC (~78k sats, expiry height 952959, about 415 blocks / ~3 days away). |
| Jun 6, 04:26 | lnd restarted (unrelated operational event). On startup the contest resolver offered the HTLC to the interceptor → held. |
| Jun 6, 04:31 | First new block after startup: two Cannot fail packet: cannot fail in the on-chain flow errors, as both on-chain intercepts from that close were evicted from the held set. |
| Jun 6, 07:29 | Interceptor sent Settle with the correct preimage → fwd (Chan ID=951172:715:0, HTLC ID=19881) not found, stream torn down. |
| Jun 6–9 | No further offers (no restart), so the preimage never reached the witness beacon. |
| Jun 9 | Expiry passed; the counterparty claimed the timeout path. ~78k sats lost despite the interceptor holding the preimage ~2.5 days before expiry. |
Severity
Silent loss of funds. The exposure is burst-shaped rather than a steady drip: any incident that produces force closes of channels carrying pending intercepted HTLCs disables the on-chain rescue path for all of them at once, with only an easily-missed ERR log line as a trace. Effectively, #6219's on-chain interception recovery has been broken since #6831 for any Settle that arrives more than one block after the offer.
Proposed fix
The current guidelines indicate:
If you spot a glaring issue, we may still merge the fix or take it over ourselves. And if you're a new developer who notices an issue with the code, consider opening a detailed issue instead of a PR.
However, the criticality of the bug and the narrow scope of the fix convinced us to submit a PR proposal in #10893.
Expected Behavior
An on-chain intercepted forward should remain available to the interceptor until the HTLC's actual on-chain expiry (RefundTimeout). A Settle arriving at any time before expiry should hand the preimage to the witness beacon so the resolver can claim the output. There is no reason to auto-fail these entries earlier: the channel is already closed (so there is no force close left to prevent), and failing them back is impossible by construction (FailWithCode returns ErrCannotFail).
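One way to picture the expected behavior is to tie the auto-fail height of an on-chain forward to its refund timeout. The sketch below is illustrative only and is not claimed to be what #10893 does: `packet`, `holdUntilExpiry` and `expired` are hypothetical names, and `expired` simply mirrors the comparison used in `popAutoFails`.

```go
package main

import "fmt"

// packet is a hypothetical, trimmed-down model of an intercepted packet.
type packet struct {
	IncomingExpiry uint32 // the HTLC's RefundTimeout
	AutoFailHeight int32  // zero when never set
}

// holdUntilExpiry is an assumed helper: it keeps an on-chain forward in the
// held set until the HTLC's on-chain expiry height.
func holdUntilExpiry(p *packet) {
	p.AutoFailHeight = int32(p.IncomingExpiry)
}

// expired mirrors the popAutoFails check: an entry is swept once its
// auto-fail height is no longer above the current height.
func expired(p packet, height uint32) bool {
	return uint32(p.AutoFailHeight) <= height
}

func main() {
	p := packet{IncomingExpiry: 952959}

	before := expired(p, 952545) // zero value: swept on the first block
	holdUntilExpiry(&p)
	after := expired(p, 952545)    // still held well before expiry
	atExpiry := expired(p, 952959) // released once expiry is reached

	// Prints "true false true".
	fmt.Println(before, after, atExpiry)
}
```

With the heights from the incident, the entry would have stayed in the held set for the roughly 415 blocks between the force close and the expiry instead of a single block.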
Debug Information
The two log lines an affected HTLC produces (default log levels):
2026-06-06 04:31:14.509 [ERR] HSWC: Cannot fail packet: cannot fail in the on-chain flow
2026-06-06 07:29:01.230 [ERR] RPCS: [/routerrpc.Router/HtlcInterceptor]: fwd (Chan ID=951172:715:0, HTLC ID=19881) not found
The first line fires exactly once per evicted on-chain intercept, so grepping for cannot fail in the on-chain flow over historical logs counts how many times the bug has fired on a given node.
Environment
lnd runs in Kubernetes; the interceptor client is a JVM service connected over the local network. The intercepted HTLCs are payments to our users, which is why the interceptor may legitimately hold a forward for hours before settling (waiting for the recipient to come online) — the window in which this bug destroys the rescue path.