-
Notifications
You must be signed in to change notification settings - Fork 3
[on hold] RFC3: simultaneous tunnels #540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| AccountType account_type | ||
| Pubkey owner | ||
| Pubkey device_pk | ||
| string name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something worth considering would be to have in the Device class the device_type or an equivalent field that allows us validate that the interface name is sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, the device_type is there as a switch, but something more like device_model could capture this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a list of approved device_models that we support that we could validate against? or is this more for something that we could check in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have two models: 7130LBR and 7280CR3. Both have very different conventions around interface naming e.g. swA/B/C on the former and EthA/B on the latter.
Can be future, but we need to figure out how we want to fail if these sorts of fields prove to have garbage data.
| * can a user set their own termination "points" etc | ||
|
|
||
| #### Controller | ||
| * *optional*: configuration for tunnel termination loopbacks generated in device template |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference would be to do this in the controller templating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed; can we launch multicast without the controller templating? (my thought is yes but could be convinced otherwise of course). Feels like the templating could be a separate RFC to sunset ansible so that we don't block multicast too long but again open to suggestions otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can take parts of Ansible config generation process and move it to the controller over time. I think here we just need to add loopback interfaces, and I would hope it isn't too much work. We need a way to allocate these from the dz_prefix pool and pass them from the activator to the controller I guess.
Feel free to correct me if it is more work than I appreciate, we can continue to push it manually for now.
| 3. Initiate latency probes per tunnel termination | ||
| 4. Store results as <Device: Interface: LatencyResult> and serve via /latency endpoint for CLI | ||
|
|
||
| #### Activator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elitegreg are there any things that are missing from the activator / smart contract side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I can think of, but I'm not sure I'm grasping the Activator change listed.
rfcs/rfc3-simultaneous-tunnels.md
Outdated
|
|
||
| ## Motivation | ||
|
|
||
| Multiple tunnel support is required now that DoubleZero supports IBRL and multicast. In fact, multicast can not be publicly released wthout multiple tunnel support due to restrictions in the Linux kernel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a huge deal, but the limitations aren't from Linux but rather GRE.
| } | ||
|
|
||
| Interface --> Device : device_pk | ||
| ```` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@juan-malbeclabs you probably understand this better than me. Wouldn't this design require loading all interfaces in order to reduce to the set for device_pk? Would it be better to have a list of interface pubkeys in the device? Then they can be looked up by the account pubkey?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I conceptually agree with the diagram, I would store all interfaces in a second account linked to the Device. We would create a DeviceInterfaceList account. This way, we would need to read both the list of Devices and the list of Interfaces (one account per Device with multiple Interfaces).
The goal is to be able to modify this account whenever an interface is created or deleted, without having to load the Device information.
| ### Service Changes | ||
|
|
||
| #### CLI | ||
| 1. The CLI currently selects the tunnel termination endpoint for a user connection based on min(latency) across all DZDs. In the event there is an existing tunnel terminated on the DZD, we need to select the next best endpoint on the same DZD. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not fair to just assume that the latency of one interface is as good as another on the same DZD and just iterate the interfaces as more tunnels are added?
| 3. Initiate latency probes per tunnel termination | ||
| 4. Store results as <Device: Interface: LatencyResult> and serve via /latency endpoint for CLI | ||
|
|
||
| #### Activator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I can think of, but I'm not sure I'm grasping the Activator change listed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces RFC3, a design proposal for enabling simultaneous tunnel support in DoubleZero by defining a new Interface data structure, updating service components, and outlining related on-chain and network changes.
- Add initial RFC document outlining motivation, design, and components affected.
- Define new
Interfacestructure alongsideDevice, and detail CLI, daemon, activator, and controller updates. - Document security considerations, backwards compatibility impact, and open questions.
Comments suppressed due to low confidence (1)
rfcs/rfc3-simultaneous-tunnels.md:54
- The Mermaid code block is opened with three backticks but closed with four; use three backticks consistently.
</details>
|
|
||
| #### CLI | ||
| 1. The CLI currently selects the tunnel termination endpoint for a user connection based on min(latency) across all DZDs. In the event there is an existing tunnel terminated on the DZD, we need to select the next best endpoint on the same DZD. | ||
| 2. Users need to be able perform CRUD operations on the on-chain interfaces i.e. `doublezero interface create`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the interfaces associated with the devices created by the contributor? In this context, who is the User?
Who has the authority to execute the doublezero interface create command?
|
|
||
| #### CLI | ||
| 1. The CLI currently selects the tunnel termination endpoint for a user connection based on min(latency) across all DZDs. In the event there is an existing tunnel terminated on the DZD, we need to select the next best endpoint on the same DZD. | ||
| 2. Users need to be able perform CRUD operations on the on-chain interfaces i.e. `doublezero interface create`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think contributors need to do this, not users.
| 3. Users need to be able to display interfaces listed on-chain via `doublezero interfaces list` or some derivative command. | ||
|
|
||
| #### Daemon | ||
| Latency probing changes are needed for this as the current implementation looks at the public_ip field of device record to probe each DZD: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not understanding the benefit (or necessity) of testing latency to every tunnel termination interface on every DZD. Why would we expect different IPs on the same router to have different latency characteristics?
|
|
||
| 2. **Require users to obtain a second public address.** While this would satisfy the requirement of a unique (source, destination) tunnel IP endpoint per tunnel, it pushes this issue back on the users of DoubleZero and possibly prevents user uptake at the expense of more engineering work. | ||
|
|
||
| 3. **Adapt the devices table in the current smart contract to fit a second tunnel (i.e. multicast) endpoint.** While this seems like significantly less work on its face, we end up needing to touch the same portions of the stack as a more ideal solution as they all need to be taught to understand this field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also has the downside of limiting us to 2 tunnels, so it's not a general solution
|
|
||
| ```mermaid | ||
| classDiagram | ||
| class Interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today we can infer the interface that a Link is associated with by looking at the link's tunnel_net. But now that we have Interface, should we make this explicit by adding an optional relationship between Interface and Link?
Summary of Changes