Skip to content

(3/3) Add Failure Reason to HTLCHandlingFailed #3700

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

carlaKC
Copy link
Contributor

@carlaKC carlaKC commented Apr 2, 2025

This PR surfaces a failure reason in HTLCHandlingFailed events. Opening up early to add some context to the prefactor PR #3601.

The heart of the PR is in d35d35f, and it could probably be reduced to just this commit. I've made some quite opinionated renaming / deprecation decisions in the other commits which aren't required for this change, but I think make for a more readable API overall - happy to drop them if it ain't broke, don't fix it applies.

Fixes: #3561
Fixes: #3541

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Apr 2, 2025

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@carlaKC carlaKC changed the title Add Failure Reason to HTLCHandlingFailed (3/3) Add Failure Reason to HTLCHandlingFailed Apr 2, 2025
@carlaKC carlaKC force-pushed the 3541-api-changes branch 2 times, most recently from d049301 to 6138779 Compare April 4, 2025 20:36
Copy link

codecov bot commented Apr 11, 2025

Codecov Report

Attention: Patch coverage is 86.55462% with 32 lines in your changes missing coverage. Please review.

Project coverage is 89.60%. Comparing base (0ee1def) to head (5badb5e).
Report is 23 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/events/mod.rs 10.71% 25 Missing ⚠️
lightning/src/ln/channelmanager.rs 90.24% 4 Missing ⚠️
lightning-liquidity/src/lsps2/service.rs 0.00% 2 Missing ⚠️
lightning/src/ln/monitor_tests.rs 83.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3700      +/-   ##
==========================================
+ Coverage   89.13%   89.60%   +0.46%     
==========================================
  Files         156      158       +2     
  Lines      123600   127663    +4063     
  Branches   123600   127663    +4063     
==========================================
+ Hits       110176   114390    +4214     
+ Misses      10757    10640     -117     
+ Partials     2667     2633      -34     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@valentinewallace
Copy link
Contributor

Needs rebase 🥳

@carlaKC carlaKC marked this pull request as ready for review April 17, 2025 19:24
@joostjager joostjager requested review from joostjager and removed request for wpaulino April 17, 2025 19:53
@ldk-reviews-bot
Copy link

🔔 1st Reminder

Hey @valentinewallace! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@carlaKC
Copy link
Contributor Author

carlaKC commented Apr 21, 2025

Pushed suggested comment/style changes + rename of HTLCFailReasonRepr to go with outer struct rename.

Happy to drop the last two commits as discussed here if the renames aren't strong enough to justify the change!

Comment on lines +532 to +541
Downstream,
/// The HTLC was failed locally by our node.
Local {
/// The reason that our node chose to fail the HTLC.
reason: LocalHTLCFailureReason,
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it would be nicer to avoid this outer enum and just document on the field "Will be None if the HTLC failed downstream or this event was created prior to 0.2." Seems simpler, loses a bit of information for a period of time though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it would be nice to surface hold_time along with Downstream in future (nice thing to put on a dashboard IMO) so tend towards keeping the enum for the sake of being able to easily add that.

Opinion not very strongly held though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't realize that was a potential goal but that makes sense! We may want to document this on the attributable failures issue, if there is one.

Comment on lines 2288 to 2289
// [`LocalHTLCFailureReason::UnknownNextHop`]. This will cover both the case
// where we have a legacy event
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: incomplete sentence

@@ -480,12 +480,16 @@ pub enum HTLCHandlingType {
channel_id: ChannelId,
},
/// Scenario where we are unsure of the next node to forward the HTLC to.
///
/// Deprecated: will only be used in versions before LDK v0.2.0.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL this exists!

Struggling to #[allow(deprecated)] for the impl_writeable_tlv_based_enum_upgradable invocation to silence the cargo warning though..

@carlaKC
Copy link
Contributor Author

carlaKC commented Apr 23, 2025

Removed the last 2 rename commits (I don't think they're that much of an improvement) + fixed up comment.

Couldn't find a way to use deprecation attribute without making changes to the serialization macro so left it out.

@carlaKC carlaKC requested a review from joostjager April 24, 2025 12:35
Comment on lines 1492 to 1497
/// The type of HTLC that was handled.
handling_type: HTLCHandlingType,
/// The reason that the HTLC failed.
///
/// This field will be `None` only for objects serialized prior to LDK 0.2.0.
handling_failure: Option<HTLCHandlingFailureReason>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consulted @jkczyz our resident naming expert --

He suggested renaming HTLCHandlingType to HTLCHandlingFailureType, and renaming the fields within the event to failure_type and failure_reason. This would also imply possibly renaming the HTLCHandlingFailureType::ReceiveFailed to just ::Receive, similarly for forwards. Reasoning being that the type of HTLC handling is moreso the type of failure, conceptually (Jeff feel free to correct me here).

Thoughts on this naming scheme? cc @joostjager

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like an improvement to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🫡 gone ahead and also changed ForwardFailed -> Forward with the same reasoning as ReceiveFailed -> Receive

carlaKC added 4 commits April 25, 2025 16:14
HTLCDestination currently contains a combination of information about
the type of HTLC we handled to fail (a payment, a forward etc) and
error information for a select set of cases (unknown next peer, for
example).

In preparation for a refactor that will split the failure reason out
into its own enum, this commit renames HTLCDestination to
HTLCHandlingType.
Rename variant to be more specific to the current context - a
FailedPayment could be a payment that we failed to dispatch or one
that we rejected on receive.
Standardize naming within the HTLCHandlingType struct to present more
consistent API terminology.
This variant of HTLCHandlingType contains information about the
failure cause along with its type - as an UnknownNextPeer is just an
InvalidForward that has the failure type UnknownNextPeer.

This commit deprecates the variant's use, while still writing it to
disk to allow the option to downgrade.

The deprecated attribute is not used because it cannot be silenced in
impl_writeable_tlv_based_enum, which uses UnknownNextPeer to remain
downgradable.
Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing blocking if we want to Just Land and save the ser tweak for follow-up

Comment on lines 1809 to 1818
let downgradable_type = match (failure_type, failure_reason) {
(HTLCHandlingFailureType::InvalidForward { requested_forward_scid },
Some(HTLCHandlingFailureReason::Local {
reason: LocalHTLCFailureReason::UnknownNextPeer
}))
=> HTLCHandlingFailureType::UnknownNextHop {
requested_forward_scid: *requested_forward_scid,
},
_ => failure_type.clone()
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if I signed off on this previously, but I think it would make more sense to always write the non-deprecated type and then include a release note that downgrades will result in reading the deprecated type as ::InvalidForward, since it'll save some complexity here and keep the door open for us to delete the variant in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having done it this way, I think it looks much better 👍 doesn't seem like that much of a loss to have less information on downgrade in comparison

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there no functional consequences after the downgrade for this loss of information?

failure_reason: Some(LocalHTLCFailureReason::UnknownNextPeer.into()),
}
}
_ => panic!("HTLCHandlingFailed wrong type")
Copy link
Contributor

@valentinewallace valentinewallace Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, is this panic reachable? May want to just remove it or use debug_assert instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes yeah, reverted to just match on HTLCHandlingType (was like this before change suggestion) - think the panic should be ok if it's for a variant we just declared right above?

This allows us to more easily get rid of this variant (as we stop
writing it), and removes the need for custom serialization. The
tradeoff here is that downgrading nodes lose some information, as the
UnknownNextPeer variant will be represented as InvalidForward (because
they can't read the accompanying reason that specifies that it's a
UnknownNextPeer.
carlaKC added a commit to carlaKC/rust-lightning that referenced this pull request Apr 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Give HTLCHandlingFailed a reason of some kind Add a reason enum to HTLCDestination in some variants
4 participants