Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iroh-gossip: connections and subscriptions are not cleaned up #2962

Open
dignifiedquire opened this issue Nov 25, 2024 · 1 comment
Open
Assignees
Labels
bug Something isn't working c-iroh-gossip
Milestone

Comments

@dignifiedquire
Copy link
Contributor

When dropping the subscription stream, neither connections nor subscriptions are fully cleaned up.

from @Frando

I found two bugs likely:

1 - there's no notification once a receiver stream on the user api side is dropped. we need to notify the actor about that. there is already logic in the actor to quit a topic once all event receivers and command senders are dropped; however it only triggers on sender drops and when failing to send an event, but not once the event receiver stream is dropped. likely we will have to add something like a oneshot sender plus atomic bool that notifies the actor, so that it can remove its event sender and check if the topic is ripe for quitting.

2 - more importantly, after a topic is quit the connection is not actually terminated. here we're back to "how to terminate a quic connection gracefully" territory, and I fear that this is not properly solvable without a wire breaking change, because as we know quic connections may only be closed by the receiver of the last message. however there is no clear last message; the protocol is message oriented and currently the logic would be "close the connection after the last message is sent" which is not possible in quic (hyparview/plumtree was designed with raw udp in mind). without a wire change we could resort to timeouts: clear connections 10s after the terminal message was sent. with wire breaking, we can add proper closing, e.g. with the function I wrote for iroh-willow (which uses uni streams to gracefully close).

@dignifiedquire dignifiedquire added bug Something isn't working c-iroh-gossip labels Nov 25, 2024
@dignifiedquire dignifiedquire added this to the v0.29.0 milestone Nov 25, 2024
@ramfox ramfox moved this to 📋 Backlog in iroh Nov 25, 2024
@Frando
Copy link
Member

Frando commented Nov 29, 2024

Some more details from when I had a quick look at this last week and wrote the above:

  • Regarding 1) above: What I thought of doing would be to wrap the impl Stream returned from Gossip::join_with_stream in a struct that a) forwards the stream impl to the inner type, and b) has a Drop impl which triggers a oneshot sender. The receiver would need to be listened for in the actor, to mark the subscription receiver as dropped and trigger disconnect if it the command sender is dropped as well and it is the last subscription for the topic. (The drop notification for the command sender already works because the command receiver in the gossip is triggered on drop of the sender handle).

  • Regarding 2) above, if we do not want to do a wire-breaking change, we have to mark connections as obsolete once we processed a DisconnectPeer event, and then clean it up some time later (to allow the terminal message to be sent out). We will not get a response for the terminal message, and quic does not give us any notification once the message is delivered, so a timer is all we can do I think. With a wire breaking change this would be solveable cleanly, e.g. with the approach for connection termination taken in iroh-willow.

@dignifiedquire dignifiedquire modified the milestones: v0.29.0, v0.30.0 Dec 4, 2024
@ramfox ramfox moved this to 👍 Ready in iroh Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working c-iroh-gossip
Projects
Status: 👍 Ready
Development

No branches or pull requests

3 participants