Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add waste metric for coin selection #558

Closed

Conversation

csralvall
Copy link
Contributor

@csralvall csralvall commented Mar 6, 2022

Description

Waste is a metric introduced by the BnB algorithm as part of its bounding
procedure. Later, it was included as a high level function to use in comparison
of different coin selection algorithms in bitcoin/bitcoin#22009.

This implementations considers waste as the sum of two values:

  • Timing cost
  • Creation cost

waste = timing_cost + creation_cost


Timing cost is the cost associated to the current fee rate and some long
term fee rate used as a treshold to consolidate UTXOs.

timing_cost = txin_size * current_fee_rate - txin_size * long_term_fee_rate

Timing cost can be negative if the current_fee_rate is cheaper than the
long_term_fee_rate, or zero if they are equal.

The name of this cost is derived from the action of making bets against some
kind of market (in this case the fee market) to gain profits using the
interpretation of the future behaviour of that market based on a partial
information game.


Creation cost is the cost associated to the surplus of coins besides the
transaction amount and transaction fees. It can happen in the form of a change
output or in the form of excess fees paid to the miner.

Change cost is derived from the cost of adding the extra output to the
transaction and spending that output in the future.

change_cost = current_fee_rate * change_output_size + long_term_feerate * change_spend_size

Excess happens when there is not change, and the surplus of coins is spend as
part of the fees to the miner:

excess = tx_total_value - tx_fees - target

Where target is the desired amount to send.

Creation cost can be zero if there is a perfect match as result of the coin
selection algorithm.


So, waste can be zero or negative if the creation cost is zero and the timing
cost is less than or equal to zero

Later this can be used in the wallet to select the most optimal coin selection
algorithm.

This PR continues the work initiated by @benthecarman in #435 and address some
of the features described in #483.

The current PR is based mainly in the following sources:

Notes to the reviewers

While running cargo clippy some warnings may arise due to the lack of
documentation of the new functions, those will be resolved once the code
changes are finished. The same applies for the CHANGELOG.md file.

Checklists

All Submissions:

  • I've signed all my commits
  • I followed the contribution guidelines
  • I ran cargo fmt and cargo clippy before committing

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature
  • I've updated CHANGELOG.md

Copy link
Contributor

@Eunoia1729 Eunoia1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@danielabrozzoni
Copy link
Member

Tests are failing, can you please update the PR? :)
#515 introduced the is_spent field in LocalUtxo, since you're testing coin selection (which uses only unspent utxos) you probably want to set it to false all the time :)

@csralvall
Copy link
Contributor Author

Yes, you are right! I will try to rebase my branch in top of master to get all last changes (including #515) and then apply the correction. Let me know if you have any doubt about this.

@csralvall csralvall force-pushed the coin-selection-waste-metric branch from 544fc21 to 082f1b8 Compare April 13, 2022 23:08
@csralvall
Copy link
Contributor Author

It seems that the only failing test remaining are related to missing documentation. I don't know if you want to discuss the implementation further, but I'm willing to add documentation to pass the test and then iterate again, in case something is missing.

@danielabrozzoni
Copy link
Member

I'll look into the code ASAP, sorry for the waiting :)

Copy link
Member

@danielabrozzoni danielabrozzoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, the approach could be improved:

  • get_waste should be dropped altogether, it's not really useful to cache the Waste (you just want to calculate it once at the end of the coin selection)
  • instead of having a pub fn calculate_waste, better to go with:
struct Waste {
    value: i64,
}

impl Waste {
    pub fn calculate (...) -> Result<Waste, Error> {...the whole calcualte_waste function is pasted here}
}

This is just a first look at it, I still haven't tested

@csralvall
Copy link
Contributor Author

csralvall commented Apr 15, 2022

Thanks for the review! Let me explain myself. I made a few assumptioms before
started working in this PR:

  1. Originally I followed this suggestion
    from Steve to drop the struct WasteMetric in favor of a helper function
    associated with the CoinSelectionAlgorithm. I tried this approach but the
    generic database bound imposed some constraints to the signature of the
    calculate_waste function that didn't make sense to me at that moment. So I
    decided to implement it as a general function related to the
    CoinSelectionResult through the get_waste method.

    I will try to use your proposed approach and create the struct Waste.

@csralvall
Copy link
Contributor Author

csralvall commented Apr 16, 2022

Some things that I've experienced while trying to make Waste a struct:

  1. If it will be exposed through CoinSelectionResult it should be a public struct.
  2. If we don't want pub struct Waste we can use struct Waste internally and pass the inner value to the CoinSelectionResult struct.
  3. If we proceed with the public struct, an extra de-reference from the waste field of CoinSelectionResult will be required (waste.value).

All options seem cumbersome IMHO.

I reviewed the coinselection module from bitcoin and they have structured the Waste calculus in three different functions:

  • GetSelectionWaste who does all the work and is not associated with any struct (for testing purposes).
  • ComputAndSetWaste a setter (something similar to get_waste).
  • GetWaste a getter.

Although all of that it's still a work in progress, because the values of the waste metric are hardcoded for SelectCoinsBnB.

@afilini
Copy link
Member

afilini commented Apr 19, 2022

If it will be exposed through CoinSelectionResult it should be a public struct.

Yes, because custom coin selection algorithms should also be able to calculate waste

If we proceed with the public struct, an extra de-reference from the waste field of CoinSelectionResult will be required

I don't really understand this, if you want you could also implement Deref so that Waste can behave like an i64. But I don't think it's that bad to just read the inner value. Also, if you only have one field you can also definite it as struct Waste(pub i64) and add a getter for the inner value so that you don't have to use waste.0.

I haven't reviewed the whole code yet, but I did spend some time trying to remove some useless cloning in the code: I pushed my code here, feel free to take the two top commits and apply them to your PR.

@csralvall csralvall force-pushed the coin-selection-waste-metric branch from b6cd728 to d2bf84a Compare April 23, 2022 15:44
@csralvall
Copy link
Contributor Author

csralvall commented Apr 23, 2022

I don't really understand this, if you want you could also implement Deref so that Waste can behave like an i64. But I don't think it's that bad to just read the inner value. Also, if you only have one field you can also definite it as struct Waste(pub i64) and add a getter for the inner value so that you don't have to use waste.0.

I didn't know about Deref, but now that I followed your suggestion I find struct Waste(pub i64) cleaner.
That's was all about, I prefer to have straight semantics in the code.
I don't know if there is a case of use for the getter if we can use destructuring assignment, but I can implement one if you think it's necessary.

I haven't reviewed the whole code yet, but I did spend some time trying to remove some useless cloning in the code: I pushed my code here, feel free to take the two top commits and apply them to your PR.

Thanks for this! There are things that I'm still learning about Rust and made some clunky decisions.

@csralvall csralvall force-pushed the coin-selection-waste-metric branch from 23f6e7c to 6564c96 Compare April 25, 2022 23:15
Copy link
Member

@danielabrozzoni danielabrozzoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good! 😄
A couple of comments, small stuff. I still need to review tests 😭 Sorry!

Also, can you please squash commits? I'd say you just squash in one commit, putting Alekos and Ben as co-authors.

I don't know if it's better to merge with all the TODOs around (for calculating the waste and the cost of change) or not. @afilini wdyt?

@csralvall csralvall force-pushed the coin-selection-waste-metric branch 3 times, most recently from dba785e to a6ad7dd Compare May 2, 2022 23:20
@csralvall
Copy link
Contributor Author

Do you think that this is ready to start documenting it?

@notmandatory
Copy link
Member

Hi, please rebase to pickup changes in #596. Thanks!

@danielabrozzoni
Copy link
Member

Once it compiles, I think it's ready 😄

@csralvall csralvall force-pushed the coin-selection-waste-metric branch 2 times, most recently from 5666434 to c5fa6a0 Compare May 5, 2022 17:59
@csralvall
Copy link
Contributor Author

csralvall commented May 5, 2022

I added some REVIEW comments in mod.rs and coin_selection.rs because I need some clarification of what to do with those sections of code.

Copy link
Contributor

@murchandamus murchandamus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool that you're working on this. I really like your description of the waste metric as the composite of a "timing cost" plus a "creation cost". I have a few questions and comments, I hope not too many are just misunderstandings owed to my low Rust reading proficiency.

@csralvall csralvall force-pushed the coin-selection-waste-metric branch 4 times, most recently from bceb683 to 89f30c5 Compare May 18, 2022 00:07
Copy link
Contributor

@murchandamus murchandamus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read it over quickly again. Good changes!
I have couple more optional nits which you can feel free to ignore, but I think there may have slipped a sign error into the tests.

@csralvall csralvall force-pushed the coin-selection-waste-metric branch from 89f30c5 to 69f7e5a Compare May 20, 2022 19:38
@murchandamus
Copy link
Contributor

I don't feel like I'm familiar enough to "ACK" here, but my prior comments appear to have been all addressed. So, I guess "Concept ACK"? :)

@@ -157,6 +260,11 @@ pub trait CoinSelectionAlgorithm<D: Database>: std::fmt::Debug {
/// - `amount_needed`: the amount in satoshi to select
/// - `fee_amount`: the amount of fees in satoshi already accumulated from adding outputs and
/// the transaction's header
/// - `_cost_of_change`: the cost of creating change and spending it in the future
/// if there is change, it must be a positive number
/// must be None if there is no change
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that you don't know if there's going to be a change or not before calling the coin_selection method...

In fact, at the moment we don't even calculate if we're going to have the change in coin_select! We do it later:

bdk/src/wallet/mod.rs

Lines 768 to 781 in fbd98b4

let mut drain_output = {
let script_pubkey = match params.drain_to {
Some(ref drain_recipient) => drain_recipient.clone(),
None => self
.get_internal_address(AddressIndex::New)?
.address
.script_pubkey(),
};
TxOut {
script_pubkey,
value: 0,
}
};

We obviously can't calculate the waste properly before knowing if there's a change or not, so we should move the change creation inside the coin_select method. See #147

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: here, we should just take a u64 (we can't know if we're going to have a change or not before the coin_select).
When we calculate the waste we should decide between the cost_of_change value or None based on whether we have the change or not

/// - `fee_rate`: fee rate to use
pub fn calculate(
selected: &[WeightedUtxo],
cost_of_change: Option<u64>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really picky nit: I'd stay consistent between change_cost and cost_of_change, as far as I understand they're the same thing, so we should either use one name or the other in the code (whichever you find more clear)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, probably the cost of change should be calculated inside the Waste::calculate and not outside.

change_cost = current_fee_rate * change_output_size + long_term_feerate * change_spend_size

This means we need the following:
current_fee_rate -> we have it already
change_output_size -> we need it to have it as a param
long_term_feerate -> we have it, it's LONG_TERM_FEE_RATE
change_spend_size -> we need it to have it as a param

Which means we should either add two new parameters to the function (maybe it's better to merge those in a struct)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of exactly how to calculate the change_output_size and the change_spend_size, I'm figuring it out, and I'll get back to you

Co-authored-by: csralvall <[email protected]>
Co-authored-by: afilini <[email protected]>
@csralvall csralvall force-pushed the coin-selection-waste-metric branch from 69f7e5a to e2fba7a Compare June 9, 2022 01:11
@@ -749,13 +749,22 @@ where
params.bumping_fee.is_some(), // we mandate confirmed transactions if we're bumping the fee
)?;

let weighted_drain_txout = WeightedTxOut {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be updated? Do you need help figuring out how to do it? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, or yes :) Changes with this may collide with the solution for #147 . I don't know if I should address that in this same PR or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should fork from bdk master and try to implement #147 first (you can of course go back and look at this code if you feel like there are problems that you solved here already). In this way you'll come up with a solution that doesn't depend on the waste metric.

@csralvall csralvall force-pushed the coin-selection-waste-metric branch from e2fba7a to 3fa8a47 Compare June 9, 2022 14:39
@csralvall
Copy link
Contributor Author

csralvall commented Jun 9, 2022

I realized that I should return the fee amount related to the change output
together with the change TxOut.
I could implement an enum like:

pub enum Excess {
    NoChange(Error::InsufficientFunds),
    // Change(change_output, change_output_fee)
    Change(TxOut, u64),
}

Then the signature of Waste::calculate would be:

    pub fn calculate(
        selected: &[WeightedUtxo],
        weighted_drain_output: WeightedTxOut,
        target: u64,
        fee_rate: FeeRate,
    ) -> Result<(Waste, Excess), Error>

And we can decide later if we join the change output with the other outputs
inside coin_select or create_tx. Either way, if we keep the change above,
CoinSelectionResult would look like this:

pub struct CoinSelectionResult {
    /// List of outputs selected for use as inputs
    pub selected: Vec<Utxo>,
    /// Total fee amount in satoshi
    pub fee_amount: u64,
    /// Waste value of current coin selection
    pub waste: Waste,
    /// Remaining amount after coin_select
    pub excess: Excess,
}

Through the use of NoChange(err) here
we can avoid passing params.drain_wallet as parameter to
coin_select.
And with Change(output, output_fee) we can keep the
same functionality of this line
(that wouldn't be removed with the fix for #147 )
through output.script_pubkey and output.value

@murchandamus
Copy link
Contributor

murchandamus commented Jun 10, 2022

Also, returning the total resulting fee with the coin selection result will make it easier to handle spending unconfirmed inputs when the parent transaction has a lower feerate than what you intend for the new transaction you're building to have. I mean, if you e.g. want to automatically bump the parent transaction to the same feerate so you achieve the intended feerate, and if you want to account for that additional in the waste metric later.

@csralvall
Copy link
Contributor Author

I'm not sure that I've understood what you said correctly. You were thinking in
something like the following situation?

  • Tx_A ----> UTXO_A (fee rate = 5, confirmed)
  • Tx_B ----> UTXO_B(fee rate = 3, unconfirmed)
  • (UTXO_A, UTXO_B) ----> Tx_C(fee rate = 5).

The original transaction that generated UTXO_B is Tx_B, and coin select
re-bumps the fee for it (not implemented currently), lets call it Tx_B_bis.
You said that the extra amount that comes from [Tx_B.fee_amount - Tx_B_bis.fee_amount]
can now be considered as part of the fee amount of Tx_C and waste can account it?

I'm unsure because fee_amount isn't new for CoinSelectionResult.

As a side note, I'm planning to propose a modification for the fee amount
returned by CoinSelectionResult. To avoid carriage of state through multiple
functions, I was considering to just return the fee amount for the selected
inputs and then add it to the fee amount of the rest of the transaction (inside
create_tx).

@danielabrozzoni
Copy link
Member

pub enum Excess {
    NoChange(Error::InsufficientFunds),
    // Change(change_output, change_output_fee)
    Change(TxOut, u64),
}

Written like this, Excess::NoChange always contains an Error::InsufficientFunds. Is this what you want to achieve? If so, why? Couldn't you have no change while having enough funds?

@murchandamus
Copy link
Contributor

murchandamus commented Jun 14, 2022

I'm not sure that I've understood what you said correctly. You were thinking in something like the following situation?

* `Tx_A` ----> UTXO_A (fee rate = 5, confirmed)

* `Tx_B` ----> UTXO_B(fee rate = 3, unconfirmed)

* (UTXO_A, UTXO_B) ----> `Tx_C`(fee rate = 5).

The original transaction that generated UTXO_B is Tx_B, and coin select re-bumps the fee for it (not implemented currently), lets call it Tx_B_bis. You said that the extra amount that comes from [Tx_B.fee_amount - Tx_B_bis.fee_amount] can now be considered as part of the fee amount of Tx_C and waste can account it?

Yeah, if you spend an unconfirmed input from a transaction that paid a lower feerate than the new transaction you're creating, you need to add more fees to the new transaction so that the transaction package in total will have the intended feerate. I.e. in your example:

Tx_A— vsize: 200 vB, fee: 1000 sats, feerate: 5 sat/vB ↦ UTXO_A
Tx_B— vsize:150 vB, fee: 450 sats, feerate: 3 sat/vB ↦ UTXO_B

If Tx_C is spending UTXO_A and UTXO_B and has a vsize of 200 vB, how much fees should it pay to reach an effective feerate of 5 sat/vB?
⇒ Tx_A has the same feerate already, so doesn't need to be bumped. Tx_B pays less than Tx_C, so Tx_C has to add (5 sat/vB - 3 sat/vB)×Tx_B.vsize = 300 sats to bump Tx_B to 5 sat/vB, and must pay 200 vB × 5 sat/vB for itself, i.e. if Tx_C pays a total of 1,300 sat/vB the transaction package {Tx_A, Tx_B, Tx_C} will have an effective feerate of 5 sat/vB.

@csralvall
Copy link
Contributor Author

pub enum Excess {
    NoChange(Error::InsufficientFunds),
    // Change(change_output, change_output_fee)
    Change(TxOut, u64),
}

Written like this, Excess::NoChange always contains an Error::InsufficientFunds. Is this what you want to achieve? If so, why? Couldn't you have no change while having enough funds?

I realized that I can't do that. The idea came up because I was to tied up to the code in create_tx.
I will probably turn to the structure used in #630.

@danielabrozzoni
Copy link
Member

Hey, we are in the process of releasing BDK 1.0, which will under the hood work quite differently from the current BDK. For this reason, I'm closing all the PRs that don't really apply anymore. If you think this is a mistake, feel free to rebase on master and re-open!

(Specifically, the coin selection is being rewritten from scratch)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants