Skip to content

splice: Implement start_batch #8335

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

ddustin
Copy link
Collaborator

@ddustin ddustin commented Jun 9, 2025

Implement the sending of start_batch as per t-bast’s spec. We need to receive start_batch as the new spec moves batch_size out of commitment_signed and instead relays that value in start_batch. For sending we implement an internal batching approach.

This approach bundles all the message bytes for start_batch and corresponding commitment_signeds into a single peer_write call. Because each message needs to be encrypted individually, we introduce an internal peer wire message type, protocol_batch_element. This type is detected by connectd and is not meant to be sent to the peer.

connectd reads the message to get information on the batch element (currently just element_size for the individual message size and channel_id for consistency with all other peer_wire message types) and does not forward it along to the peer.

Finally connectd encrypts each message element individually, batches all the encrypted messages into one large byte array, and sends the whole array in one go on the actual wire to the peer.

This rigmarole is to make sure no other messages are sent between start_batch and all of the batch messages -- without introducing any mutexes or locks to connectd.

Changelog-Added: support for start_batch

@t-bast
Copy link

t-bast commented Jun 12, 2025

Hey! I just added a small change to start_batch as requested by jkczyz to include a message_type TLV, see lightning/bolts@a821628 for more details! I think it makes it simpler for the initial implementation to only allow start_batch that contain this TLV set to 132 (commitment_signed) and leave more general cases to future implementation (if it ever becomes necessary).

@ddustin ddustin force-pushed the ddustin/start_batch branch 2 times, most recently from 35b5281 to 07fa3ad Compare June 13, 2025 17:58
@ddustin
Copy link
Collaborator Author

ddustin commented Jun 13, 2025

Hey! I just added a small change to start_batch as requested by jkczyz to include a message_type TLV, see lightning/bolts@a821628 for more details! I think it makes it simpler for the initial implementation to only allow start_batch that contain this TLV set to 132 (commitment_signed) and leave more general cases to future implementation (if it ever becomes necessary).

I will add the message_type TLV 👍

@ddustin ddustin force-pushed the ddustin/start_batch branch 8 times, most recently from cb70577 to 417a728 Compare June 16, 2025 18:18
ddustin added 6 commits June 16, 2025 14:18
We add `start_batch` to match t-bast’s splicing spec and we add a new internal wire type `WIRE_PROTOCOL_BATCH_ELEMENT` using the type number 32899 following the spec in Bolt #1 for application-specific messages:

“Custom (types 32768-65535): experimental and application-specific messages”

Changelog-Added: support for `start_batch`
The new spec sends `batch_size` in `start_batch` and removes it from `commitment_signed` so we need to stop processing it in `commitment_signed`.

Since the tlv is now reduced to one element and that automagically turns it into a direct use TLV so we have to update the code everywhere it is referenced.
Since `batch_size` has moved into this new message, we can’t ignore it anymore and have to process it
We need a way to indiciate failure of the batching system in connectd.
Since handling commit sig batches is coming for multiple locations now, add more explicity error handling so log messages are more useful.
Implement the sending of `start_batch` and `protocol_batch_element` from `channeld` to `connectd`.

Each real peer wire message is prefixed with `protocol_batch_element` so connectd can know the size of the message that were batched together.

`connectd` intercepts `protocol_batch_element` messages and eats them (doesn’t forward them to peer) to get individual messages out of the batch.

It needs this to be able to encrypt them individiaully. Afterwards it recombines the now encrypted messages into a single message to send over the wire to the peer.

`channeld` remains responsible for making `start_batch` the first message of the message bundle.
@ddustin ddustin force-pushed the ddustin/start_batch branch from 417a728 to f3acf0a Compare June 16, 2025 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants