Skip to content

collectors: collect payment and attempt counts #115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 10, 2025

Conversation

calvinrzachman
Copy link
Contributor

@calvinrzachman calvinrzachman commented Feb 26, 2025

Add support for real time tracking of counters for payments and attempts. These can be used to compute and create visualizations for things like average payment throughput and average # of attempts per payment.

rate(lnd_total_payments[5m])
rate(lnd_total_htlc_attempts[30m]) / rate(lnd_total_payments[30m])
  • The information is obtained in real time from lnd using the TrackPayments RPC.

After making payments between two directly connected nodes:

alice-64db66cb64-g4sh2:/# curl -s http://localhost:8990/metrics | grep lnd_total
# HELP lnd_total_htlc_attempts Total number of HTLC attempts across all payments
# TYPE lnd_total_htlc_attempts counter
lnd_total_htlc_attempts 5
# HELP lnd_total_payments Total number of payments initiated
# TYPE lnd_total_payments counter
lnd_total_payments 5

@calvinrzachman
Copy link
Contributor Author

I may be able to consolidate the # attempts with the existing HTLC collector. Checking...


paymentLogger.Info("Starting payments monitor...")

stream, err := p.client.TrackPayments(ctx,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, it might not be comparable as TrackPaymentV2 is for a single payment whereas TrackPayments is for payment events in general

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're good to go. Check this out: https://lightning.engineering/api-docs/api/lnd/router/track-payments/. Originally I was looking at using the lndclient library like the rest of the code base, but it doesn't yet support TrackPayments so decided to use raw routerrpc client.

// complete list of all htlc attempts made for this payment.
func processPaymentUpdate(payment *lnrpc.Payment) {
totalPayments.Inc()
attemptCount := len(payment.Htlcs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think this can quite be used as a proxy for attempts. For that we'd need to watch a payment over time, and increment this counter with each attempt.

We may also want to introspect into the payment state itself: https://lightning.engineering/api-docs/api/lnd/router/track-payment-v2/#lnrpcpaymentpaymentstatus

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double checked, and I think I'm actually wrong about this. We get a new element here for each new attempt, as it isn't just the set of final HTLCs that were settled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior to learning about this TrackPayments RPC, I had created a draft PR to add these metrics as real time counters in lnd directly. Implemented there, the ChannelRouter can increment counters as each attempt is registered. My hope here was that supplying the NoInflightUpdates directive to the TrackPayments call will give us only the final payment update (settle or fail) from which we can make an accurate determination of how many total attempts were made.

// NOTE: It is expected that this receive the *final* payment update with the
// complete list of all htlc attempts made for this payment.
func processPaymentUpdate(payment *lnrpc.Payment) {
totalPayments.Inc()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should look at the payment state and only increment this if it's IN_FLIGHT. That way we won't count a new attempt as a new payment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might be safe here so long as my comment about NoInflightUpdates delivering only the terminal payment update (settle/fail) is correct. That way, we'll only process the payment a single time and capture all of the attempts during its life-cycle.

I could add an explicit check that the payment status is terminal to be extra sure though.

@calvinrzachman calvinrzachman force-pushed the payment-collector branch 2 times, most recently from 5a38420 to 0b15285 Compare March 6, 2025 01:10
Copy link
Contributor

@bitromortac bitromortac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I think this looks ok with TrackPayments. We won't get a resolution of how quickly the individual attempts are being made over time, as we will only receive the info once settled/failed (for that we'd need a combination of TrackPayments and TrackPaymentsV2?). Will also test once changes were addressed

type paymentsMonitor struct {
client routerrpc.RouterClient

ctx context.Context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of an anti-pattern to have it as a member, maybe the context could be created at start time with LndServices as a field. In general there are many occurrences of ctx.Background in this repo, which needs an overhaul as well (not this PR).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Let me know if it looks good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a nice pattern would be to use a fn.ContextGuard, but since that's not used in this repo for other structs, it's better to keep things consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I'll check the type out and keep it in mind going forward. Thanks for the tip.

@lightninglabs-deploy
Copy link

@calvinrzachman, remember to re-request review from reviewers when ready

@calvinrzachman
Copy link
Contributor Author

calvinrzachman commented Apr 7, 2025

After updating to CounterVec + status label the exported metrics look like:

# HELP lnd_total_htlc_attempts Total number of HTLC attempts across all payments
# TYPE lnd_total_htlc_attempts counter
lnd_total_htlc_attempts 5

# HELP lnd_total_payments Total number of payments initiated, labeled by final status
# TYPE lnd_total_payments counter
lnd_total_payments{status="failed"} 1
lnd_total_payments{status="succeeded"} 5

Copy link
Contributor

@bitromortac bitromortac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, tACK 📊

The github workflow actions need updates to v3 or v4 for CI to pass.

collect raw payment and attempt count information from
lnd using the TrackPayments RPC.
@Roasbeef Roasbeef merged commit fd56748 into lightninglabs:master Apr 10, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants