
Fail HTLC backwards before upstream claims on-chain #3556

Merged
merged 5 commits into lightningdevkit:main on Jan 27, 2025

Conversation

TheBlueMatt (Collaborator)

This rebases and replaces #2457. It removes the config option: once we've lost the HTLC in question, not failing back in order to get three additional blocks of room to claim the inbound edge (after at least 39 blocks have already passed) doesn't seem like a reasonable tradeoff in nearly any case (and I'd like to reduce that 3 to 2 anyway). It also cleans up `Channel` a bit more by removing `historical_inbound_htlc_fulfills` entirely, uses `SentHTLCId` (resolving an issue where two inbound channels relaying two different HTLCs with the same HTLC ID into a channel caused confusion), and includes a few other cleanups.

I'd like to backport this to 0.1 because it's important for future feerate spikes.
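The `SentHTLCId` point can be illustrated with a small sketch. The names below are illustrative, not LDK's actual types: the idea is that an `htlc_id` alone is only unique per channel, so keying relayed HTLCs by the inbound channel plus the HTLC ID disambiguates two inbound channels that each used the same `htlc_id`.

```rust
use std::collections::HashMap;

// Illustrative sketch only; LDK's real `SentHTLCId` differs in shape.
// Two inbound channels can both relay an HTLC with htlc_id 0 into the same
// outbound channel, so keying by (inbound channel, htlc_id) keeps them distinct.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct SentHtlcId {
    inbound_channel_scid: u64,
    htlc_id: u64,
}

fn track_relayed_htlcs() -> HashMap<SentHtlcId, u64 /* amount_msat */> {
    let mut relayed = HashMap::new();
    // Two different inbound channels, same per-channel htlc_id of 0:
    relayed.insert(SentHtlcId { inbound_channel_scid: 42, htlc_id: 0 }, 1_000);
    relayed.insert(SentHtlcId { inbound_channel_scid: 43, htlc_id: 0 }, 2_000);
    relayed
}
```

Keying on `htlc_id` alone would have made the second insert overwrite the first, which is exactly the confusion the PR description mentions.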

alecchendev and others added 5 commits January 21, 2025 22:06
In a coming commit we'll expire HTLCs backwards even if we haven't
yet claimed them on-chain based on their inbound edge being close
to causing a channel force-closure.

Here we track the incoming edge's CLTV expiry in the
pending-routing state so that we can include it in the `HTLCSource`
in the next commit.

Co-authored-by: Matt Corallo <[email protected]>
In a coming commit we'll expire HTLCs backwards even if we haven't
yet claimed them on-chain based on their inbound edge being close
to causing a channel force-closure.

Here we track and expose the incoming edge's CLTV expiry in the
`HTLCSource`, giving `ChannelMonitor` access to it.

Co-authored-by: Matt Corallo <[email protected]>
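As a sketch of what tracking and exposing the incoming edge's CLTV expiry means in practice (field and type names here are illustrative, not LDK's actual `HTLCSource` internals):

```rust
// Illustrative sketch: the forwarding source now carries the block height at
// which the *incoming* HTLC expires, so on-chain logic can reason about how
// close the inbound edge is to timing out.
struct PreviousHopData {
    short_channel_id: u64,
    htlc_id: u64,
    // Newly tracked; optional because HTLCs forwarded before an upgrade
    // won't have recorded it.
    incoming_cltv_expiry: Option<u32>,
}

fn blocks_until_inbound_expiry(hop: &PreviousHopData, current_height: u32) -> Option<u32> {
    hop.incoming_cltv_expiry
        .map(|expiry| expiry.saturating_sub(current_height))
}
```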
This field was used to test that any HTLC failures didn't come
in after an HTLC was fulfilled (indicating, somewhat dubiously,
that there may be a bug causing us to fail when we shouldn't have).

In the next commit, we'll be failing HTLCs based on on-chain HTLC
expiry, but may ultimately receive the preimage thereafter. This
would make the `historical_inbound_htlc_fulfills` checks
potentially-brittle, so we just remove them as they have dubious
value.
Fail inbound HTLCs if they expire within a certain number of blocks from
the current height. If we haven't seen the preimage for an HTLC by the
time the previous hop's timeout expires, we've lost that HTLC, so we
might as well fail it back instead of having our counterparty
force-close the channel.

Co-authored-by: Matt Corallo <[email protected]>
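The rule this commit describes can be sketched as follows. Constant and function names are illustrative; LDK's actual check lives elsewhere and differs in detail, and the 3-block buffer is the figure mentioned in the PR discussion:

```rust
// Blocks of headroom before the inbound edge's expiry at which we give up
// and fail the HTLC backwards.
const FAILURE_BUFFER_BLOCKS: u32 = 3;

// Illustrative only. If we haven't learned the preimage by the time the
// inbound edge is within the buffer of expiring, the HTLC is effectively
// lost, so failing it backwards beats letting the upstream peer
// force-close the channel to claim it on-chain.
fn should_fail_back(inbound_cltv_expiry: u32, current_height: u32, preimage_known: bool) -> bool {
    !preimage_known && current_height + FAILURE_BUFFER_BLOCKS >= inbound_cltv_expiry
}
```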
If we've signed the latest holder tx (i.e. we've force-closed and
broadcasted our state), there's not much reason to accept
counterparty-transaction-updating `ChannelMonitorUpdate`s, we
should make sure the `ChannelManager` fails the channel as soon as
possible.

This standardizes the failure cases to also match those added to
the previous commit, which makes things a bit more readable.
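A minimal sketch of the gating this commit describes (enum and function names are illustrative, not LDK's actual `ChannelMonitorUpdate` handling):

```rust
// Illustrative sketch: once we've signed and broadcast our own latest holder
// commitment, updates that would advance the *counterparty's* transaction
// state no longer make sense, so applying one returns an error and the
// channel is failed rather than limping along.
enum UpdateStep {
    LatestCounterpartyCommitment,
    LatestHolderCommitment,
}

fn apply_update(holder_tx_signed: bool, step: &UpdateStep) -> Result<(), &'static str> {
    match step {
        UpdateStep::LatestCounterpartyCommitment if holder_tx_signed => {
            Err("refusing counterparty tx update after broadcasting holder tx")
        },
        _ => Ok(()),
    }
}
```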
@TheBlueMatt TheBlueMatt added this to the 0.1.1 milestone Jan 22, 2025
@TheBlueMatt TheBlueMatt linked an issue Jan 22, 2025 that may be closed by this pull request
continue;
}
if !duplicate_event {
log_error!(logger, "Failing back HTLC {} upstream to preserve the \
Contributor

Nit: is this really considered an error? Info/warn seems better suited.

Collaborator Author

It's an error in that a violation of our assumptions around tx confirmation time happened, and it's probably an important enough case that users should see it and think hard about what is happening.

mine_transaction(&nodes[1], &node_1_txn[1]); // HTLC timeout
connect_blocks(&nodes[1], ANTI_REORG_DELAY);
// Expect handling another fail back event, but the HTLC is already gone
expect_pending_htlcs_forwardable_and_htlc_handling_failed!(nodes[1],
Contributor

We could avoid having this duplicate go out if we check the set on the confirmed timeout path, but I guess it's not worth doing since a restart in between would yield it anyway.

Collaborator Author

I also spent a while trying to move the event generation to where we do the actual failure so that we could detect the duplicate in ChannelManager, but the changeset kinda blew up on me :/. And, yea, this should be a really rare/never kinda case, so duplicate handling-failed events skewing users' forwarding statistics is probably not the end of the world.
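The duplicate-event tradeoff being discussed can be sketched with a simple dedup set (illustrative only; not LDK's actual event pipeline):

```rust
use std::collections::HashSet;

// Illustrative sketch: failing an HTLC backwards from the early-expiry path
// and again when the timeout confirms on-chain can surface two
// handling-failed events. A set of already-failed HTLC keys would suppress
// the second, at the cost of persisting that set across restarts — which is
// why the PR accepts the rare duplicate instead.
fn record_failure(failed: &mut HashSet<(u64, u64)>, channel_scid: u64, htlc_id: u64) -> bool {
    // Returns true only the first time we fail this HTLC.
    failed.insert((channel_scid, htlc_id))
}
```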

@TheBlueMatt TheBlueMatt added the weekly goal Someone wants to land this this week label Jan 23, 2025
@arik-so arik-so merged commit 8d8b4ea into lightningdevkit:main Jan 27, 2025
24 of 25 checks passed
@TheBlueMatt TheBlueMatt mentioned this pull request Jan 28, 2025
@TheBlueMatt
Collaborator Author

Backported in #3567

TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this pull request Jan 29, 2025
v0.1.1 - Jan 28, 2025 - "Onchain Matters"

API Updates
===========

 * A `ChannelManager::send_payment_with_route` was (re-)added, with semantics
   similar to `ChannelManager::send_payment` (rather than like the pre-0.1
   `send_payment_with_route`, lightningdevkit#3534).
 * `RawBolt11Invoice::{to,from}_raw` were added (lightningdevkit#3549).

Bug Fixes
=========

 * HTLCs which were forwarded where the inbound edge times out within the next
   three blocks will have the inbound HTLC failed backwards irrespective of the
   status of the outbound HTLC. This avoids the peer force-closing the channel
   (and claiming the inbound edge HTLC on-chain) even if we have not yet managed
   to claim the outbound edge on chain (lightningdevkit#3556).
 * On restart, replay of `Event::SpendableOutput`s could have caused
   `OutputSweeper` to generate double-spending transactions, making it unable to
   claim any delayed claims. This was resolved by retaining old claims for more
   than four weeks after they are claimed on-chain to detect replays (lightningdevkit#3559).
 * Fixed the additional feerate we will pay each time we RBF on-chain claims to
   match the Bitcoin Core policy (1 sat/vB) instead of 16 sats/vB (lightningdevkit#3457).
 * Fixed a case where a custom `Router` which returns an invalid `Route`,
   provided to `ChannelManager`, could result in an outbound payment remaining
   pending forever despite no HTLCs being pending (lightningdevkit#3531).

Security
========

0.1.1 fixes a denial-of-service vulnerability allowing channel counterparties to
cause force-closure of unrelated channels.
 * If a malicious channel counterparty force-closes a channel by broadcasting
   a revoked commitment transaction while the channel at closure time included
   multiple non-dust forwarded outbound HTLCs with identical payment hashes
   and amounts, our failure to fail the HTLCs backwards could cause the
   channels on which we received the corresponding inbound HTLCs to be
   force-closed. Note that we'll receive, at a minimum, the malicious
   counterparty's reserve value when they broadcast the stale commitment
   (lightningdevkit#3556). Thanks to Matt Morehouse for reporting this issue.
Labels
weekly goal Someone wants to land this this week

Successfully merging this pull request may close these issues.

Consider failing HTLC backwards before upstream claims on-chain
4 participants