[Taproot API Project] replace TaprootSpendInfo with new miniscript-specific structure #815

Open: wants to merge 7 commits into base: master

apoelstra
Member

In Miniscript, to compute control blocks, estimate satisfaction costs, or otherwise iterate through all the leaves of a Taptree, we use the bitcoin::TaprootSpendInfo structure to maintain a map of all leaves. This map is inappropriate for Miniscript (it may not be appropriate for anyone, actually) for a few reasons:

  • It is a map from Tapleaves' encoding as Script to data about the Tapleaves; but in Miniscript the Script encoding isn't primary and isn't even available for non-ToPublicKey keys
  • This map structure means that if duplicate leaves exist then only one of the dupes will be accessible.
  • The map structure is also really inefficient; it stores the entire Merkle path for each leaf even though these paths significantly overlap, leading to O(n log n) space instead of O(n).
  • Furthermore, we don't need any map because we only ever iterate through the entire tree.
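
To make the space claim concrete, here is a rough count for a perfectly balanced tree. This is only an illustration: the function names are hypothetical, and a real Taptree need not be balanced.

```rust
/// Hashes stored when every leaf keeps its own full Merkle path.
/// Assumes a perfectly balanced tree with `2^depth` leaves (and `depth < 63`).
fn per_leaf_path_hashes(depth: u32) -> u64 {
    let leaves = 1u64 << depth;
    // Each of the n leaves stores `depth` = log2(n) sibling hashes.
    leaves * u64::from(depth)
}

/// Hashes stored when each node of the same tree is stored exactly once.
fn whole_tree_hashes(depth: u32) -> u64 {
    let leaves = 1u64 << depth;
    // n leaves plus n - 1 internal nodes.
    2 * leaves - 1
}
```

For example, at depth 10 (1024 leaves) the per-leaf layout holds 10240 sibling hashes, while the whole tree has only 2047 nodes.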

We fix all these issues by introducing a new TrSpendInfo struct which stores the entire Merkle tree in a flat representation and can produce an iterator over all the leaves. The iterator item can be used to access the Script, the Miniscript, the leaf version, and the control block for each leaf, while the TrSpendInfo structure itself can be used to access the internal and external keys. In other words, this one structure efficiently implements APIs for everything that rust-miniscript needs.

This completes the Taproot API overhaul project. After this I'll go back to fixing error types, eliminating recursive structures, or overhauling the validation parameters, whichever one seems most tractable from the current state of master.

apoelstra added 7 commits May 2, 2025 17:03
We have the function `update_item_with_descriptor_helper` which does a
few things: it derives a descriptor (replacing all the xpubs with actual
public keys) and updates the appropriate input or output map to map the
derived keys to their keysources.

It treats Tr outputs differently from other kinds of outputs, because
the relevant maps are different. However, in doing so, it duplicates a
bunch of work in ways that are hard to follow.

Essentially, the algorithm does three things: (a) derives all the keys
(and the descriptor), (b) optionally checks that the resulting
scriptpubkey is what we expect, and (c) updates the maps. The existing
code handles (a) separately for Tr and non-Tr descriptors.

In the Tr case, we derive all the keys using
Descriptor::<DescriptorPublicKey>::derived_descriptor which derives all
the keys and throws away the conversion. Then separately it keeps around
the un-derived descriptor, iterates through the keys, and populates the
`tap_key_origins` map by re-computing the derivation.

In the non-Tr case, we derive all the keys using the `KeySourceLookUp`
object, which does exactly the same thing as `derived_descriptor` except
that it stores its work in a BTreeMap, which is directly added to the
PSBT's `item.bip32_derivation` field.

This commit pulls out (a) into common code; it then copies all the data
out of the key map into `item.tap_key_origins` along with an empty
vector of tapleaves. It then goes through all the leaves, and for each
key that appears in each leaf, appends that leaf's hash to the vector of
tapleaves. This is still a little inefficient but will be much cleaner
after a later commit when we improve the Taproot SpendInfo structure.
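
The two-pass population described above can be sketched as follows. This is not the code from this commit: `Key`, `KeySource`, `LeafHash`, `Leaf` and `populate_tap_key_origins` are simplified stand-ins chosen for illustration, with only the shape of the result (key mapped to a vector of leaf hashes plus a key source) modeled on the PSBT `tap_key_origins` field.

```rust
use std::collections::BTreeMap;

// Stand-in types, assumed for illustration only.
type Key = &'static str;
type KeySource = &'static str;
type LeafHash = u32;

/// A tapleaf reduced to its hash and the keys that appear in it.
struct Leaf {
    hash: LeafHash,
    keys: Vec<Key>,
}

fn populate_tap_key_origins(
    derived: &BTreeMap<Key, KeySource>,
    leaves: &[Leaf],
) -> BTreeMap<Key, (Vec<LeafHash>, KeySource)> {
    // Pass 1: copy every derived key, each with an empty vector of tapleaves.
    let mut origins: BTreeMap<Key, (Vec<LeafHash>, KeySource)> = derived
        .iter()
        .map(|(key, source)| (*key, (Vec::new(), *source)))
        .collect();

    // Pass 2: for each key that appears in each leaf, append that leaf's hash.
    for leaf in leaves {
        for key in &leaf.keys {
            if let Some((hashes, _)) = origins.get_mut(key) {
                // Skip the push if this leaf was already recorded for the key.
                if !hashes.contains(&leaf.hash) {
                    hashes.push(leaf.hash);
                }
            }
        }
    }
    origins
}
```

A key that appears in no leaf (for instance, one used only as the internal key) keeps its empty vector.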

The original code dates to Lloyd's 2022 PR rust-bitcoin#339 which introduces logic to
populate these maps. That algorithm underwent significant refactoring in
response to review comments and I suspect that the duplicated logic went
unnoticed after all the refactorings.

This commit introduces a new data structure but **does not** use it. The
next commit will do this. I have separated them so that this one, which
introduces a bunch of algorithmic code, can be reviewed separately from
the API-breaking one.

When computing a `Tr` output, we need to encode all its tapleaves into
Script, put these into a Merkle tree and tweak the internal key with the
root of this tree. When spending from one of the branches of this
output, we need the Merkle path to that output.

We currently do this by using the `TaprootSpendInfo` structure from
rust-bitcoin. This is not a very good fit for rust-miniscript, because
it constructs a map from Tapscripts to their control blocks. This is
slow and memory-wasteful to construct, and while it makes random access
fairly fast, it makes sequential access pretty slow. In Miniscript we
almost always want sequential access, because all of our algorithms are
some form of either "try every possibility and choose the optimum" or
"aggregate every possibility".

It also means that if there are multiple leaves with the same script,
only one copy will ever be accessible. (If they are at different depths,
the low-depth one will be yielded, but if they are at the same depth
it's effectively random which one will get priority.) Having multiple
copies of the same script is a pointless malleability vector, but this
behavior is still surprising and annoying to have to think about.
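
A toy illustration of why a script-keyed map hides duplicate leaves. The function below is hypothetical and uses plain strings as scripts; it does not reproduce rust-bitcoin's rule for which duplicate survives, only the fact that one of them is lost.

```rust
use std::collections::BTreeMap;

/// Builds a map keyed by script from `(script, depth)` pairs in tree order,
/// the way a lookup-by-script structure would.
fn script_keyed_map(leaves: &[(&'static str, u8)]) -> BTreeMap<&'static str, u8> {
    let mut map = BTreeMap::new();
    for &(script, depth) in leaves {
        // In this toy, a later leaf with the same script overwrites the earlier one.
        map.insert(script, depth);
    }
    map
}
```

With three leaves of which two share a script, the resulting map has only two entries, so one leaf can no longer be reached.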

To replace `bitcoin::TaprootSpendInfo` we create a new structure
`TrSpendInfo`. This structure doesn't maintain any maps: it stores a
full Merkleized taptree in such a way that it can efficiently yield all
of the leaves' control blocks in-order.
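
One possible shape for such a structure is sketched below. This is not the actual `TrSpendInfo` layout: `FlatTree`, `Node`, `branch_hash` and `verify` are hypothetical names, and the hash is a toy, non-cryptographic stand-in for the tagged branch hash. The sketch only shows that storing each node once is enough to yield every leaf, duplicates included, together with its Merkle path in order.

```rust
/// Toy stand-in for the Taproot branch hash (NOT cryptographic).
/// The two children are sorted before combining, as BIP 341 does.
fn branch_hash(a: u64, b: u64) -> u64 {
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    lo.wrapping_mul(0x9E37_79B9_7F4A_7C15).rotate_left(17) ^ hi.wrapping_add(0x1234_5678)
}

#[derive(Debug, Clone, Copy)]
struct Node {
    hash: u64,
    /// Indices of (parent, sibling); `None` only for the root.
    up: Option<(usize, usize)>,
    is_leaf: bool,
}

/// Every node is stored exactly once, in a single vector.
struct FlatTree {
    nodes: Vec<Node>,
}

impl FlatTree {
    /// Builds the tree from `(depth, leaf_hash)` pairs listed left to right.
    /// Returns `None` if the depths do not describe a complete binary tree.
    fn from_leaves(leaves: &[(u8, u64)]) -> Option<FlatTree> {
        let mut nodes: Vec<Node> = Vec::new();
        let mut stack: Vec<(u8, usize)> = Vec::new(); // (depth, node index)
        for &(depth, hash) in leaves {
            nodes.push(Node { hash, up: None, is_leaf: true });
            stack.push((depth, nodes.len() - 1));
            // Merge the two topmost subtrees while they sit at the same depth.
            while stack.len() >= 2 {
                let (d_right, right) = stack[stack.len() - 1];
                let (d_left, left) = stack[stack.len() - 2];
                if d_left != d_right || d_left == 0 {
                    break;
                }
                stack.truncate(stack.len() - 2);
                let parent = nodes.len();
                let hash = branch_hash(nodes[left].hash, nodes[right].hash);
                nodes.push(Node { hash, up: None, is_leaf: false });
                nodes[left].up = Some((parent, right));
                nodes[right].up = Some((parent, left));
                stack.push((d_left - 1, parent));
            }
        }
        match stack.as_slice() {
            [(0, _)] => Some(FlatTree { nodes }),
            _ => None,
        }
    }

    /// The root is always the last node created.
    fn root_hash(&self) -> u64 {
        self.nodes[self.nodes.len() - 1].hash
    }

    /// Yields `(leaf_hash, merkle_path)` for every leaf, left to right.
    /// Paths are computed on demand by walking up to the root.
    fn leaves(&self) -> impl Iterator<Item = (u64, Vec<u64>)> + '_ {
        self.nodes.iter().filter(|n| n.is_leaf).map(move |leaf| {
            let mut path = Vec::new();
            let mut cur = *leaf;
            while let Some((parent, sibling)) = cur.up {
                path.push(self.nodes[sibling].hash);
                cur = self.nodes[parent];
            }
            (leaf.hash, path)
        })
    }
}

/// Recomputes the root from a leaf hash and its path.
fn verify(leaf_hash: u64, path: &[u64], root: u64) -> bool {
    path.iter().fold(leaf_hash, |acc, &sib| branch_hash(acc, sib)) == root
}
```

In this sketch a tree with n leaves stores 2n - 1 nodes, and two leaves with identical hashes are both yielded by `leaves()`.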

It is likely that at some point we will want to upport this, or some
variant of it, into rust-bitcoin, since for typical use cases it's much
faster to use and construct.

See previous commit for details about this data structure. This commit
stops using the rust-bitcoin `TaprootSpendInfo` in favor of our new
`TrSpendInfo` structure. This one allows us to iterate over all the
leaves of the Taptree, easily accessing their leaf hashes and control
blocks in order, which simplifies satisfaction logic.

Moves a bit of ugly logic out of the PSBT module into the spendinfo
module so that it's available for other users. We can convert from a
TrSpendInfo to a bitcoin::TapTree but we can't do the opposite
conversion since TrSpendInfo expects to have a Miniscript for each leaf.

This is a bit of a pain because the old lookup-by-script behavior
doesn't map cleanly onto the new iterate-through-everything behavior.
But this test is definitely useful (the unit tests in the previous
commit came from it).
apoelstra force-pushed the 2025-03--taproot-api branch from 700feab to 3d12b42 on May 2, 2025 17:07
Member Author

apoelstra left a comment

On 3d12b42 successfully ran local tests
