LSPS2 service: Treat replayed `HTLCIntercepted` events idempotently by tnull · Pull Request #4656 · lightningdevkit/rust-lightning

tnull · 2026-06-02T12:46:14Z

Replayed intercepted HTLC events should not duplicate queued payments or
panic after restart. Ignore already-queued intercept IDs so persisted
queues remain stable across event replay.
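
For illustration, a minimal sketch of the dedup-by-intercept-ID idea described above. The `InterceptId`, `InterceptedHtlc`, and `PaymentQueue` types here are simplified, hypothetical stand-ins and not the actual `lightning-liquidity` definitions; only the `add_htlc` name is taken from the review discussion.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for an intercept identifier (32 opaque bytes).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InterceptId(pub [u8; 32]);

/// Hypothetical, simplified record of an intercepted HTLC.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct InterceptedHtlc {
	pub intercept_id: InterceptId,
	pub expected_outbound_amount_msat: u64,
}

/// Hypothetical queue of HTLCs waiting for a JIT channel to open.
#[derive(Default, Debug)]
pub struct PaymentQueue {
	htlcs: VecDeque<InterceptedHtlc>,
}

impl PaymentQueue {
	/// Queues `htlc` unless an entry with the same intercept ID already exists.
	/// Returns `true` if the HTLC was newly queued and `false` if it was a replay.
	pub fn add_htlc(&mut self, htlc: InterceptedHtlc) -> bool {
		if self.htlcs.iter().any(|queued| queued.intercept_id == htlc.intercept_id) {
			// Replayed event: leave the queue untouched.
			return false;
		}
		self.htlcs.push_back(htlc);
		true
	}

	/// Sum of all queued amounts; a duplicate entry would inflate this value.
	pub fn total_msat(&self) -> u64 {
		self.htlcs.iter().map(|h| h.expected_outbound_amount_msat).sum()
	}

	/// Number of queued HTLCs.
	pub fn len(&self) -> usize {
		self.htlcs.len()
	}
}
```

In this sketch, feeding the same HTLC to `add_htlc` twice leaves both `len()` and `total_msat()` unchanged, so a replay cannot make the queued total appear larger than it is.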

Also:

Terminal JIT channel state is only useful while the forwarded channel
still exists. Drop completed LSPS2 mappings once the channel is gone so
persisted service state does not retain stale entries indefinitely.
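
A rough sketch of the pruning behaviour described above, under the assumption that cleanup only applies once the JIT flow has reached its terminal state. The `ServiceState` struct, its map names, and the two-variant `JitChannelState` enum are hypothetical simplifications for illustration, not the real service internals.

```rust
use std::collections::HashMap;

// Hypothetical simplified key types.
pub type ChannelId = [u8; 32];
pub type InterceptScid = u64;
pub type NodeId = [u8; 33];

/// Hypothetical two-state model: only `PaymentForwarded` is terminal.
#[derive(Debug, PartialEq, Eq)]
pub enum JitChannelState {
	PendingChannelOpen,
	PaymentForwarded,
}

/// Hypothetical container for the per-channel lookup maps.
#[derive(Default)]
pub struct ServiceState {
	pub state_by_intercept_scid: HashMap<InterceptScid, JitChannelState>,
	pub intercept_scid_by_channel_id: HashMap<ChannelId, InterceptScid>,
	pub peer_by_channel_id: HashMap<ChannelId, NodeId>,
}

impl ServiceState {
	/// Drops every mapping for `channel_id` if its JIT flow is terminal.
	/// Returns `true` if state was removed (i.e. persistence is needed).
	pub fn channel_closed(&mut self, channel_id: &ChannelId) -> bool {
		let Some(scid) = self.intercept_scid_by_channel_id.get(channel_id).copied() else {
			// Unknown channel or a replayed close event: nothing to do.
			return false;
		};
		if self.state_by_intercept_scid.get(&scid) != Some(&JitChannelState::PaymentForwarded) {
			// Non-terminal state is kept in this sketch.
			return false;
		}
		// Remove the inner and outer mappings together so none is left dangling.
		self.state_by_intercept_scid.remove(&scid);
		self.intercept_scid_by_channel_id.remove(channel_id);
		self.peer_by_channel_id.remove(channel_id);
		true
	}
}
```

Calling `channel_closed` a second time for the same channel returns `false` without touching any map, which keeps the handler safe under event replay.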

ldk-reviews-bot · 2026-06-02T12:46:17Z

👋 Thanks for assigning @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

ldk-claude-review-bot · 2026-06-02T12:53:04Z

I've reviewed the full PR against my prior analysis and the current code. The add_htlc dedup-by-intercept_id, remove_terminal_channel_state, and the channel_closed async/sync wrappers are all internally consistent, and the inner/outer map cleanup is symmetric.

No issues found.

The only finding from my prior review pass (the Vec-vs-HashSet lookup nit on the old prune_terminal_state_without_channels) is now resolved — that code was removed entirely and replaced by the event-driven channel_closed approach, so there is nothing left to address there.

One pre-existing limitation (not introduced by this PR, so not flagged inline): channels that close in a non-terminal state (after channel_ready but before PaymentForwarded) still leave stale peer_by_channel_id / intercept_scid_by_channel_id entries, since remove_terminal_channel_state only matches the PaymentForwarded arm. This is no worse than before the PR (there was no ChannelClosed handler previously).

jkczyz · 2026-06-03T16:02:32Z

 /// - [`Event::ChannelReady`] to [`LSPS2ServiceHandler::channel_ready`]
 /// - [`Event::HTLCHandlingFailed`] to [`LSPS2ServiceHandler::htlc_handling_failed`]
 /// - [`Event::PaymentForwarded`] to [`LSPS2ServiceHandler::payment_forwarded`]
+/// - [`Event::ChannelClosed`] to [`LSPS2ServiceHandler::channel_closed`]


Could you add a pending changelog for this?

For what exactly? For the fact that we expect users to now forward channel closed events? Or something else?

Yeah, since the docs say "If the LSPS2 service is configured, users must forward the following parameters from LDK events:"

jkczyz · 2026-06-03T16:07:22Z

+		let Some(counterparty_node_id) = counterparty_node_id else {
+			return Ok(());
+		};


Elsewhere we return APIError::APIMisuseError. Should we do the same here?

Good question. We could for consistency, but honestly I'm regretting we moved away from having the event-handling-related API idempotent. It seems not having it idempotent might run into issues / unexpected errors in case events get replayed on restart? But maybe that's okay?

Hmmm... yeah I guess we return Ok(()) for htlc_intercepted, so might also depend on the event.

jkczyz · 2026-06-03T16:08:54Z

+			self.persist().await.map_err(|e| APIError::APIMisuseError {
+				err: format!(
+					"Failed to persist peer state after channel {} closed: {}",
+					channel_id, e
+				),
+			})?;


I see we use this error elsewhere when persistence fails. Do we expect the caller to retry?

Yeah, error types are unfortunately still a mess (see #4074). We hope to address that soon. We don't expect the caller to necessarily retry, the persistence will be automatically retried by the background processor eventually.

f3r10 · 2026-06-05T16:05:33Z

+		if let Some(intercept_scid) = removed_intercept_scid {
+			self.peer_by_intercept_scid.write().unwrap().remove(&intercept_scid);
+			self.peer_by_channel_id.write().unwrap().remove(&channel_id);
+			self.persist().await.map_err(|e| APIError::APIMisuseError {


I have a question here. As I understand it, self.persist() sweeps all peers, triggering a global persist. Would it not be better to use self.persist_peer_state(counterparty_node_id).await here?

Yeah, note that persist_peer_state will then leave pruning/cleanup to the background processor run, but that's probably fine. Now changing to the pattern we have elsewhere.
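
To make the trade-off in this thread concrete, here is a small synchronous sketch of a per-peer write versus a full sweep. Everything in it is an assumption for illustration: the `Service` and `PeerState` types, the in-memory `store` map standing in for a KV store, and the reading of "pruning" as dropping peers with an empty payload. The real methods are async and differ in detail.

```rust
use std::collections::HashMap;

/// Hypothetical short peer key.
pub type NodeId = u8;

/// Hypothetical per-peer state with a dirty flag.
#[derive(Default)]
pub struct PeerState {
	pub payload: Vec<u8>,
	pub needs_persist: bool,
}

#[derive(Default)]
pub struct Service {
	pub peers: HashMap<NodeId, PeerState>,
	/// Stand-in for a KV store: peer key to last written bytes.
	pub store: HashMap<NodeId, Vec<u8>>,
	/// Counts store writes so the cost of each path is visible.
	pub writes: usize,
}

impl Service {
	/// Writes a single peer if it is dirty; does no pruning.
	/// Returns whether a write happened.
	pub fn persist_peer_state(&mut self, peer: NodeId) -> bool {
		let Some(state) = self.peers.get_mut(&peer) else {
			return false;
		};
		if !state.needs_persist {
			return false;
		}
		self.store.insert(peer, state.payload.clone());
		state.needs_persist = false;
		self.writes += 1;
		true
	}

	/// Full sweep: prunes empty peers, then writes every dirty peer.
	/// Returns the number of peers written.
	pub fn persist(&mut self) -> usize {
		// Assumed meaning of "pruning": forget peers with nothing left to store.
		self.peers.retain(|_, state| !state.payload.is_empty());
		let peers = &self.peers;
		self.store.retain(|id, _| peers.contains_key(id));

		let dirty: Vec<NodeId> =
			self.peers.iter().filter(|(_, s)| s.needs_persist).map(|(id, _)| *id).collect();
		let mut written = 0;
		for id in dirty {
			if self.persist_peer_state(id) {
				written += 1;
			}
		}
		written
	}
}
```

In this model the per-peer path touches one entry and leaves empty peers in place until the next full sweep, which mirrors the point that cleanup is deferred to the background processor run.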

f3r10

Just one question about the difference between self.persist() and self.persist_peer_state(counterparty_node_id).await

TheBlueMatt

Please move the test commit so that it's after the fix, rather than having an intermediate state that fails tests. Also please word-wrap commit messages. Do you think it makes sense to backport the last commit to 0.2? I presume not.

tnull · 2026-06-17T13:07:22Z

Please move the test commit so that it's after the fix, rather than having an intermediate state that fails tests. Also please word-wrap commit messages. Do you think it makes sense to backport the last commit to 0.2? I presume not.

Done! Yeah, likely only the idempotency commit makes sense.

Also updated to use the persist_peer_state pattern rather than full persist:

diff --git a/lightning-liquidity/src/lsps2/service.rs b/lightning-liquidity/src/lsps2/service.rs
index e1373cfb1..4f338a5fd 100644
--- a/lightning-liquidity/src/lsps2/service.rs
+++ b/lightning-liquidity/src/lsps2/service.rs
@@ -1299,9 +1299,11 @@ where
 			self.peer_by_intercept_scid.write().unwrap().remove(&intercept_scid);
 			self.peer_by_channel_id.write().unwrap().remove(&channel_id);
-			self.persist().await.map_err(|e| APIError::APIMisuseError {
-				err: format!(
-					"Failed to persist peer state after channel {} closed: {}",
-					channel_id, e
-				),
+			self.persist_peer_state(counterparty_node_id).await.map_err(|e| {
+				APIError::APIMisuseError {
+					err: format!(
+						"Failed to persist peer state after channel {} closed: {}",
+						channel_id, e
+					),
+				}
 			})?;
 		}

joostjager · 2026-06-17T13:46:16Z

+		self.outbound_channels_by_intercept_scid.remove(&intercept_scid);
+		self.intercept_scid_by_channel_id.remove(&channel_id);
+		self.intercept_scid_by_user_channel_id.retain(|_, iscid| *iscid != intercept_scid);
+		self.needs_persist = true;


Here is some overlap with #4703 perhaps.

TheBlueMatt · 2026-06-17T20:00:52Z

Backported to 0.2 in #4706

v0.2.3 - Jun 18, 2026 - "Through the Loupe"

API Updates
===========

* `DefaultMessageRouter` will now always generate blinded message paths that provide no privacy (where our node is the introduction node) for nodes with public channels. This works around an issue which will appear for any nodes with LND peers that enable onion messaging - such peers will refuse to forward BOLT 12 messages from unknown third parties, which most BOLT 12 payers rely on today (lightningdevkit#4647).
* Explicit `amount_msats` of 0 is rejected in BOLT 12 `Offer`s; `OfferBuilder` now maps 0-amounts to an amount of `None` (lightningdevkit#4324).

Bug Fixes
=========

* `Features::supports_zero_conf` no longer clears the `ZeroConf` features and `Features::requires_zero_conf` now correctly reports required, rather than supported, status (lightningdevkit#4517).
* If an MPP payment is claimed but `ChannelMonitorUpdate`s for some parts are still being completed asynchronously, further channel updates (e.g. forwarding another payment) are pending and the node restarts, the channel could have become stuck (lightningdevkit#4520).
* The presence of unconfirmed transactions actually no longer causes `ElectrumSyncClient` to spuriously fail to sync (lightningdevkit#4590).
* LSPS1, LSPS2, and LSPS5 persistence will no longer get stuck and refuse to persist again after a single failure from the KVStore (lightningdevkit#4597, lightningdevkit#4282).
* Dropping the future returned by `OutputSweeper::regenerate_and_broadcast_spend_if_necessary` no longer results in future calls to the same method being spuriously ignored (lightningdevkit#4598).
* Used async-receive offers are no longer refreshed on every timer tick once their refresh time is reached (lightningdevkit#4672).
* `FilesystemStore::list_all_keys` will no longer fail if there are stale intermediate files lying around from a previous unclean shutdown (lightningdevkit#4618).
* When forwarding an HTLC while in a blinded path with proportional fees over 200%, LDK will no longer spuriously allow a forward that pays us 1 msat too little in fees (lightningdevkit#4697).
* Fixed a rare case where a channel could get stuck on reconnect when using both async `ChannelMonitorUpdate` persistence and async signing (lightningdevkit#4684).
* If we had exactly zero balance in a zero-fee-commitment channel, the counterparty was able to splice all of their balance out, violating the reserve requirements they'd otherwise be forced to keep (lightningdevkit#4580).
* Providing an `Event::HTLCIntercepted` to the `LSPS2ServiceHandler` twice no longer results in spuriously opening a channel early (lightningdevkit#4656).
* `Event::PaymentSent::fee_paid_msat` is no longer `None` in cases where `ChannelManager::abandon_payment` was called before the payment ultimately completes anyway (lightningdevkit#4651).
* `AnchorDescriptor::previous_utxo` now provides the correct `script_pubkey` for non-zero-commitment-fee anchor channels (lightningdevkit#4669).
* Syncing a `ChainMonitor` using the `Confirm` trait will no longer write some full `ChannelMonitor`s to disk several times per block (lightningdevkit#4544).
* `OMDomainResolver` now correctly accounts for failed queries when rate limiting, ensuring we continue to respond to queries after failures (lightningdevkit#4591).
* Calling `ChannelManager::send_payment_with_route` without a `route_params` and with an invalid `Route` will no longer panic (lightningdevkit#4707).
* `LSPS2ServiceHandler::channel_open_failed` now correctly fails intercepted HTLCs rather than allowing them to fail just before expiry (lightningdevkit#4677).
* `StaticInvoice::is_offer_expired` was corrected to check offer, rather than static invoice, expiry (lightningdevkit#4594).
* `lightning-custom-message`'s handling of `peer_connected` events now ensures that sub-handlers will see a `peer_disconnected` event if a different sub-handler refused the connection by `Err`ing `peer_connected` (lightningdevkit#4595).
* Replay protection for LSPS5 signatures now detects replays which are only different in the encoded signature's case (lightningdevkit#4701).
* When `lightning-liquidity` is configured in the background processor, there is no longer a stream of `Persisting LiquidityManager...` log spam (lightningdevkit#4246).
* Incomplete MPP keysend payments will no longer see their HTLCs held until expiry (lightningdevkit#4558).
* `InvoiceRequestBuilder` will no longer accept a `quantity` of `0` for a BOLT 12 `Offer`, allowing any quantity up to a bound (lightningdevkit#4667).
* `lightning-custom-message` handlers that return `Ok(None)` when asked to deserialize a message in their defined range no longer cause panics (lightningdevkit#4709).
* Several spurious debug assertions were fixed (lightningdevkit#4537, lightningdevkit#4618, lightningdevkit#4026)

Security
========

0.2.3 fixes several underestimates of the anchor reserves required to ensure we can reliably close channels, several denial-of-service vulnerabilities and a sanitization issue.

* `Bolt11Invoice::recover_payee_pub_key` no longer panics if called on an invoice which set an explicit public key, rather than relying on public key recovery. Note that this method is called from `PaymentParameters::from_bolt11_invoice` (lightningdevkit#4717).
* Maliciously-crafted unpayable invoices which have overflowing feerates will no longer cause an `unwrap` failure panic (lightningdevkit#4716).
* Parsing an `LSPSDateTime` which is before 1970 no longer panics. This is reachable when parsing messages from counterparties (lightningdevkit#4715).
* `possiblyrandom` did not properly generate random data except when it was explicitly configured to. By default this means LDK is vulnerable to various HashDoS attacks (lightningdevkit#4719).
* `OMNameResolver` will no longer panic when looking up payment instructions which include unicode characters at the start of a TXT record (lightningdevkit#4718).
* When using the `anchor_channel_reserves` module to calculate reserves required to pay for fees when closing anchor channels, zero-fee-commitment channels were not considered. This could allow a counterparty to open many channels, leaving us unable to properly force-close (lightningdevkit#4592).
* The `anchor_channel_reserves` module overestimated the value of `Utxo`s in the wallet by ignoring the `TxIn` cost to spend them (lightningdevkit#4670).
* `PrintableString` did not properly sanitize unicode format characters, allowing an attacker to corrupt the rendering of logs or UI (lightningdevkit#4593, lightningdevkit#4605).
* RGS data is now limited in how large of a graph it is able to cause a client to store in memory. Note that RGS data is still considered a DoS vector in general and you should only use semi-trusted RGS data (lightningdevkit#4713).
* Counterparty-provided strings in failure messages are no longer logged in full, reducing the ability of such a counterparty to spam our logs (lightningdevkit#4714).
* Reading a corrupted `ChannelManager` or `ProbabilisticScorer` can no longer cause us to allocate large amounts of memory (lightningdevkit#4712).

Thanks to Project Loupe for reporting most of the issues fixed in this release.

Conflicts resolved in:

* lightning/src/chain/channelmonitor.rs
* lightning/src/events/mod.rs
* lightning/src/ln/channelmanager.rs
* lightning/src/ln/mod.rs
* lightning/src/ln/offers_tests.rs
* lightning/src/ln/onion_utils.rs

tnull added backport 0.2 backport 0.3 labels Jun 2, 2026

tnull force-pushed the 2026-06-lsps2-replayed-htlc-queue branch from 686c27d to 190b6af Compare June 2, 2026 12:48

ldk-claude-review-bot reviewed Jun 2, 2026

ldk-reviews-bot requested a review from jkczyz June 2, 2026 12:57

tnull force-pushed the 2026-06-lsps2-replayed-htlc-queue branch from 190b6af to 444c009 Compare June 2, 2026 13:04

jkczyz reviewed Jun 3, 2026

f3r10 reviewed Jun 5, 2026

tnull self-assigned this Jun 11, 2026

tnull added this to Weekly Goals Jun 11, 2026

tnull moved this to Goal: Merge in Weekly Goals Jun 11, 2026

TheBlueMatt reviewed Jun 15, 2026

tnull added 3 commits June 17, 2026 14:23

Treat replayed LSPS2 HTLCs idempotently

6997c88

Replayed intercepted HTLC events should not duplicate queued payments or panic after restart. Ignore already-queued intercept IDs so persisted queues remain stable across event replay. Co-Authored-By: HAL 9000

Prune closed LSPS2 terminal channel state

bab66f6

Terminal JIT channel state is only useful while the forwarded channel still exists. Drop completed LSPS2 mappings once the channel is gone so persisted service state does not retain stale entries indefinitely. Co-Authored-By: HAL 9000

Add LSPS2 replay regression coverage

68e71c2

Persisting LSPS2 service state can race with replayed intercepted HTLC events after restart. Cover replaying the same intercepted HTLC after restoring peer state so duplicate queueing is caught. Co-Authored-By: HAL 9000

tnull force-pushed the 2026-06-lsps2-replayed-htlc-queue branch from 444c009 to 68e71c2 Compare June 17, 2026 13:06

tnull requested a review from jkczyz June 17, 2026 13:07

joostjager reviewed Jun 17, 2026

Add pending changelog entry for PR 4656

8270c7c

jkczyz approved these changes Jun 17, 2026

TheBlueMatt approved these changes Jun 17, 2026

TheBlueMatt merged commit 63fcc4f into lightningdevkit:main Jun 17, 2026
1 check passed

github-project-automation Bot moved this from Goal: Merge to Done in Weekly Goals Jun 17, 2026

TheBlueMatt mentioned this pull request Jun 17, 2026

[0.2] Another round of backports + initial release notes #4706

Merged

TheBlueMatt removed the backport 0.2 label Jun 17, 2026
