[on hold] RFC3: simultaneous tunnels #540
base: main
Changes from all commits
4271a11
410a31b
a8740e2
56f3e86
87757e0
768b224
b714e73
0ad808f
01aa22f
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,101 @@ | ||
| # Supporting Multiple Tunnels | ||
|
|
||
| ## Summary | ||
|
|
||
| DoubleZero needs to support multiple tunnels of the same or different types. To terminate multiple tunnels on the same DZD from a single user machine, each tunnel needs a unique (source, destination) pair of tunnel IP endpoints so that the Linux kernel can correctly demultiplex traffic. This document defines the changes necessary to allow this and lays the groundwork for storing network interface metadata, which is necessary for upcoming work. The outcome of this work would be a new on-chain interfaces table, minimally populated with the interfaces used for tunnel termination, with all downstream systems deriving tunnel endpoints from this table rather than from the `public_ip` field of the devices table. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Multiple tunnel support is required now that DoubleZero supports IBRL and multicast. In fact, multicast cannot be publicly released without multiple tunnel support due to restrictions in the Linux kernel. | ||
|
|
||
| ## New Terminology | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| 1. **Only support one tunnel on a user's machine.** In the current DoubleZero architecture, we're unable to support both unicast and multicast forwarding on a single tunnel. This would force a user to choose between using DoubleZero for unicast traffic or for multicast traffic, which is not a user-friendly tradeoff. | ||
|
|
||
| 2. **Require users to obtain a second public address.** While this would satisfy the requirement of a unique (source, destination) tunnel IP endpoint pair per tunnel, it pushes the problem back onto the users of DoubleZero and risks hurting user uptake in exchange for saving engineering work. | ||
|
|
||
| 3. **Adapt the devices table in the current smart contract to fit a second tunnel (i.e. multicast) endpoint.** While this seems like significantly less work on its face, it ends up touching the same portions of the stack as a more general solution, since every component still needs to be taught to understand the new field. | ||
|
|
||
| 4. **Use GRE keys to identify tunnels.** GRE keys enable a router to de-encapsulate packets and identify the right tunnel to use. This would have been a good approach, except that packets were being dropped at rates of about 250 Mbps, which makes it an unviable option. | ||
|
|
||
| ## Detailed Design | ||
|
|
||
| TBX | ||
|
|
||
| ### Data Structure Changes | ||
|
|
||
| A new data structure, `Interface`, will be defined that is attached to a parent `Device`'s public key. The relationship between an `Interface` and `Device` is many-to-one. | ||
|
|
||
| ```mermaid | ||
| classDiagram | ||
| class Interface { | ||
|
Contributor
Today we can infer the interface that a Link is associated with by looking at the link's tunnel_net. But now that we have Interface, should we make this explicit by adding an optional relationship between Interface and Link? |
||
| AccountType account_type | ||
| Pubkey owner | ||
| Pubkey device_pk | ||
| string name | ||
|
Contributor
Something worth considering would be to have in the Device class the device_type or an equivalent field that allows us to validate that the interface name is sensible.
Contributor
Sorry, the device_type is there as a switch, but something more like device_model could capture this.
Contributor
Author
Is there a list of approved
Contributor
We have two models: 7130LBR and 7280CR3. Both have very different conventions around interface naming e.g. swA/B/C on the former and EthA/B on the latter. Can be future, but we need to figure out how we want to fail if these sorts of fields prove to have garbage data. |
||
| string device_type | ||
| IpV4Inet ip4_addr | ||
| bool tunnel_termination | ||
| } | ||
| class Device { | ||
| AccountType account_type | ||
| Pubkey owner | ||
| u128 index | ||
| u8 bump_seed | ||
| Pubkey location_pk | ||
| Pubkey exchange_pk | ||
| DeviceType device_type | ||
| IpV4 public_ip | ||
| DeviceStatus status | ||
| String code | ||
| NetworkV4List dz_prefixes | ||
| } | ||
|
|
||
| Interface --> Device : device_pk | ||
| ``` | ||
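As a rough illustration of the many-to-one relationship in the diagram, the sketch below models `Interface` records keyed by `device_pk` and derives a device's tunnel endpoints from them rather than from `public_ip`. It is not the on-chain layout: pubkeys are plain strings, several fields (`account_type`, `owner`, `device_type`) are omitted for brevity, and the helper names `index_by_device` and `tunnel_endpoints` as well as the interface names in the usage note are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from ipaddress import IPv4Interface


@dataclass(frozen=True)
class Interface:
    device_pk: str            # stand-in for the parent Device's Pubkey
    name: str
    ip4_addr: IPv4Interface
    tunnel_termination: bool


def index_by_device(interfaces):
    """Group interfaces by parent device key (many interfaces to one device)."""
    by_device = defaultdict(list)
    for iface in interfaces:
        by_device[iface.device_pk].append(iface)
    return dict(by_device)


def tunnel_endpoints(interfaces, device_pk):
    """Return the IPs of a device's tunnel termination interfaces."""
    candidates = index_by_device(interfaces).get(device_pk, [])
    return [i.ip4_addr.ip for i in candidates if i.tunnel_termination]
```

For example, given one `Interface("dev-a", "Loopback100", IPv4Interface("203.0.113.1/32"), True)` and a non-terminating `Ethernet1` on the same device, `tunnel_endpoints(..., "dev-a")` yields only the loopback address. Note that this scans every interface to build the index, which is the cost raised in the review discussion.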
|
Contributor
@juan-malbeclabs you probably understand this better than me. Wouldn't this design require loading all interfaces in order to reduce to the set for device_pk? Would it be better to have a list of interface pubkeys in the device? Then they can be looked up by the account pubkey?
Contributor
Although I conceptually agree with the diagram, I would store all interfaces in a second account linked to the Device. We would create a DeviceInterfaceList account. This way, we would need to read both the list of Devices and the list of Interfaces (one account per Device with multiple Interfaces). The goal is to be able to modify this account whenever an interface is created or deleted, without having to load the Device information. |
||
|
|
||
| ### Network Changes | ||
| Tunnel termination IPs will be assigned from a general pool of IP addresses, initially sourced from the minimum /29 that each contributor provides. These addresses are already used to assign source IPs for multicast tunnels, so the supply is limited and will be exhausted fairly quickly. To mitigate this, DoubleZero can either request more IPs from network contributors or, if necessary, pull IPs from the /21 that DoubleZero owns. That /21 is being set aside for edge filtration, so it should only be used if absolutely necessary. | ||
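The allocation order described above can be sketched as a small pool that exhausts contributor prefixes first and only then falls back to a reserve block. This is an illustration under assumptions, not the activator's logic: the class name `TunnelIpPool` and its methods are hypothetical, every address in a prefix (including the first and last) is treated as assignable, and the prefixes in the usage note are documentation ranges.

```python
from ipaddress import IPv4Address, IPv4Network


class TunnelIpPool:
    """Hands out tunnel IPs, preferring contributor prefixes over the reserve."""

    def __init__(self, contributor_prefixes, reserve_prefix=None):
        self._prefixes = [IPv4Network(p) for p in contributor_prefixes]
        # Assumption: the reserve models the DoubleZero-owned block, used last.
        self._reserve = IPv4Network(reserve_prefix) if reserve_prefix else None
        self._used = set()

    def _first_free(self, network):
        # Iterating a network yields every address in it, in order.
        for addr in network:
            if addr not in self._used:
                return addr
        return None

    def allocate(self):
        candidates = list(self._prefixes)
        if self._reserve is not None:
            candidates.append(self._reserve)
        for network in candidates:
            addr = self._first_free(network)
            if addr is not None:
                self._used.add(addr)
                return addr
        raise RuntimeError("tunnel IP pool exhausted")

    def release(self, addr):
        self._used.discard(IPv4Address(addr))
```

With `TunnelIpPool(["203.0.113.0/29"], reserve_prefix="198.51.100.0/30")`, the first eight allocations come from the /29 and the ninth comes from the reserve, which makes the exhaustion point of a single /29 concrete.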
|
|
||
| ### Service Changes | ||
|
|
||
| #### CLI | ||
| 1. The CLI currently selects the tunnel termination endpoint for a user connection based on min(latency) across all DZDs. If a tunnel is already terminated on the selected DZD, the CLI needs to select the next-best endpoint on that same DZD. | ||
|
Contributor
Is it not fair to just assume that the latency of one interface is as good as another on the same DZD and just iterate the interfaces as more tunnels are added? |
||
| 2. Users need to be able to perform CRUD operations on the on-chain interfaces, e.g. `doublezero interface create`. | ||
|
Contributor
I would think contributors need to do this, not users.
Contributor
Are the interfaces associated with the devices created by the contributor? In this context, who is the User? |
||
| 3. Users need to be able to display interfaces listed on-chain via `doublezero interfaces list` or some derivative command. | ||
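The selection rule in item 1 can be sketched as follows. This is only an illustration of the described behavior, not the CLI's code: the function name `select_endpoint`, the shape of the `latencies` mapping, and the `in_use` set of endpoints already terminating a tunnel from this host are all assumptions.

```python
def select_endpoint(latencies, in_use):
    """Pick the lowest-latency DZD, then its best endpoint not already in use.

    latencies: {device: {endpoint_ip: latency_ms}}
    in_use:    set of endpoint IPs that already terminate a tunnel from this host
    """
    probed = {dev: eps for dev, eps in latencies.items() if eps}
    if not probed:
        raise LookupError("no latency results available")

    # Step 1: choose the DZD by min(latency), as the CLI does today.
    best_device = min(probed, key=lambda dev: min(probed[dev].values()))

    # Step 2: skip endpoints that already terminate a tunnel on that DZD.
    free = {ip: ms for ip, ms in probed[best_device].items() if ip not in in_use}
    if not free:
        raise LookupError(f"no free tunnel endpoint left on {best_device}")
    return best_device, min(free, key=free.get)
```

Calling it once with an empty `in_use` set returns the best endpoint on the closest DZD. Calling it again with that endpoint in `in_use` returns the next-best endpoint on the same DZD, which gives each tunnel a distinct (source, destination) pair.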
|
|
||
| #### Daemon | ||
| Latency probing changes are needed because the current implementation uses the `public_ip` field of the device record to probe each DZD. The new flow would be: | ||
|
Contributor
I'm not understanding the benefit (or necessity) of testing latency to every tunnel termination interface on every DZD. Why would we expect different IPs on the same router to have different latency characteristics? |
||
| 1. Read the device table, then look up each device's interfaces in the interface table by device pubkey | ||
| 2. Filter on tunnel termination interfaces per device | ||
| 3. Initiate latency probes per tunnel termination interface | ||
| 4. Store results as <Device: Interface: LatencyResult> and serve them via the /latency endpoint for the CLI | ||
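The four steps above can be sketched as a single pass that filters and probes, producing the nested `<Device: Interface: LatencyResult>` mapping. This is a minimal sketch, not the daemon's implementation: `Iface`, `probe_all`, and the injected `probe` callable are hypothetical names, and the latency result is reduced to a single number.

```python
from typing import Callable, Dict, Iterable, NamedTuple


class Iface(NamedTuple):
    device_pk: str
    name: str
    ip: str
    tunnel_termination: bool


def probe_all(
    devices: Iterable[str],
    interfaces: Iterable[Iface],
    probe: Callable[[str], float],
) -> Dict[str, Dict[str, float]]:
    """Return {device_pk: {interface_name: latency}} for termination interfaces."""
    interfaces = list(interfaces)
    results: Dict[str, Dict[str, float]] = {}
    for device_pk in devices:                         # step 1: walk the device table
        for iface in interfaces:
            if iface.device_pk != device_pk:
                continue
            if not iface.tunnel_termination:          # step 2: filter
                continue
            latency = probe(iface.ip)                 # step 3: probe
            results.setdefault(device_pk, {})[iface.name] = latency  # step 4: store
    return results
```

Passing the probe in as a callable keeps the sketch independent of how latency is actually measured. The returned mapping is the shape a `/latency` handler could serialize for the CLI.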
|
|
||
| #### Activator | ||
|
Contributor
Author
@elitegreg are there any things that are missing from the activator / smart contract side?
Contributor
Not that I can think of, but I'm not sure I'm grasping the Activator change listed. |
||
| * Logic for assigning an IP will need to be modified to account for `n` > 1 IPs instead of just the first IP available | ||
| * Smart contract will need to be amended to associate `n` > 1 interfaces with a particular device | ||
| * Initial bootstrapping of a device may have to be revisited | ||
|
|
||
|
|
||
| #### Controller | ||
| * *optional*: configuration for tunnel termination loopbacks generated in device template | ||
|
Contributor
My preference would be to do this in the controller templating.
Contributor
Author
Agreed; can we launch multicast without the controller templating? (my thought is yes but could be convinced otherwise of course). Feels like the templating could be a separate RFC to sunset ansible so that we don't block multicast too long but again open to suggestions otherwise.
Contributor
I think we can take parts of Ansible config generation process and move it to the controller over time. I think here we just need to add loopback interfaces, and I would hope it isn't too much work. We need a way to allocate these from the dz_prefix pool and pass them from the activator to the controller I guess. Feel free to correct me if it is more work than I appreciate, we can continue to push it manually for now. |
||
| * *optional*: migrate logic from ansible into the controller to reduce the need for ansible | ||
|
|
||
| ## Security Considerations | ||
|
|
||
| While this RFC allows more tunnels per user, the same security mechanism remains in place: the allowlist generated by the smart contract guards against unauthorized actors. Any security vulnerabilities that exist therefore apply to all tunnels, not specifically to the additional ones. | ||
|
|
||
| More information is exposed on-chain, namely the `Interface` struct. Someone could use that information to put together a fuller picture of a contributor's topology, but network contributors are providing resources that will be used in an open and transparent way, so this is likely not an issue. | ||
|
|
||
|
|
||
| ## Backwards Compatibility | ||
|
|
||
| New logic will introduce a breaking change as this RFC covers the initial rollout of multicast. This release will be tagged with a minor version of 0.2.0 to signify the breaking change. | ||
|
|
||
| ## Open Questions | ||
| * While not necessary for this initial multiple tunnels RFC, should logic be added to the controller to start handling some of the ansible functionality? | ||
| * Updating the smart contract seems non-trivial; must it be this way or are there things that can reduce the friction to smart contract changes? | ||
| * What kind of data validation / sanitization is required to ensure that bad data isn't entered? In a SQL db, indexes (or an ORM) can be used to ensure data conforms, but it's not clear what kind of on-chain validation can or should be done. | ||
| * Should a user be able to provide their own "termination point" or should it be assigned by DoubleZero? To start, it makes sense to not allow this but is this functionality that a user would want? | ||
This also has the downside of limiting us to 2 tunnels, so it's not a general solution