RFC2 - DoubleZero Exchange - DZX #509

ben-malbeclabs · 2025-06-07T01:04:25Z

Details the use-case for DZX
Offers options for physical/logical DZX deployments
Considers use of community and vendor design, as well as hybrid

nikw9944 · 2025-06-17T15:14:24Z

Not sure if we have an approach for keeping rfc numbers unique, but looks like there's already an rfc1 ahead of this in the queue https://github.com/malbeclabs/doublezero/pull/439/files

nikw9944 · 2025-06-17T16:00:35Z

The doc outlines 3 options for physical topology, then it outlines 4 options for logical topology. The doc provides a recommendation for logical topology (hybrid model) but I don't see a recommendation for physical topology. Did I miss it?

nikw9944 · 2025-06-17T17:06:00Z

rfcs/rfc2-doublezero_exchange.md

+- Vendor solution abstracts away operational overhead of operating DZX for a cost
+- Lack of visibility into network fabric
+
+Cons


With DZX as a vendor fabric solution, is it a con that we won't be able to have a single DZX that spans multiple data center providers in any given metro?

Can you clarify what you mean by data center provider? Data center vendor like Equinix, Coresite etc.?

All of the vendors we have evaluated so far have POPs in multiple data centers.

Yes, I meant Equinix, Coresite, etc., and I was alluding to the fact that if we have a DZX that spans multiple data center vendors (like Equinix + Coresite) then their fabric won't be able to provide connectivity between all the DZDs in the DZX.

Potentially yes, hopefully they would cover the vast majority. If we ever choose to use one of them, then we should probably create a separate design doc to review that.

nikw9944 · 2025-06-17T17:06:27Z

rfcs/rfc2-doublezero_exchange.md

+- New automation and configuration model to support DZX switch use-case
+- Small switch buffers with Tomahawk may not be sufficient to absorb traffic bursts towards busy DZDs in the metro
+- Limitations of scaling a single broadcast domain, for example the DZX topology MUST remain loop-free and MUST NOT rely on spanning-tree protocols
+  - Limited to three DZX switches


Where does the three switch limit come from?

On reflection, this is an inaccurate statement. If we build physical topology option 3, we could have loop free and any number of DZX switches. If we build physical topology option 2, we are limited to 2 DZX switches to keep it loop free if the fabric is layer 2.

nikw9944 · 2025-06-17T17:07:00Z

rfcs/rfc2-doublezero_exchange.md

+  - Limited to three DZX switches
+    - Is this a real limitation?
+- Complex to integrate within a hybrid community/vendor model
+  - Layer 2 hand-off complexity


Can you be more specific about layer 2 hand-off complexity?

The complexity lies in creating a VLAN mapping database that represents virtual cross-connects between the different contributors. This will need to exist for the vendor fabric if we use that tech at all, but if we combine it into a hybrid model then the mapping needs to be extended through a DZX layer 2 fabric, potentially mapping to trunk ports and sub-interfaces on the DZDs.

How much complexity that adds is, I suppose, open to debate and will have to be factored into the cost of using layer 3 switches. Having a layer 3 hand-off feels much simpler to reason about.
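
To make the VLAN mapping database a little more concrete, here is a minimal sketch of what one record and its basic consistency check might look like. Everything in it is an assumption for illustration: the `Endpoint` and `VlanMap` names, the contributor and device names, and the `port.vlan` sub-interface naming are hypothetical and do not come from the RFC or any DoubleZero tooling. The only hard fact relied on is that usable 802.1Q VLAN IDs run from 1 to 4094.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """One side of a virtual cross-connect (all field names are hypothetical)."""
    contributor: str
    device: str
    port: str   # trunk port on the DZD
    vlan: int   # 802.1Q tag carried on that trunk

    @property
    def subinterface(self) -> str:
        # Illustrative naming only: "<port>.<vlan>"
        return f"{self.port}.{self.vlan}"

    @property
    def attachment(self) -> tuple:
        return (self.device, self.port, self.vlan)


class VlanMap:
    """Registry of cross-connects that refuses to reuse a VLAN on the same trunk port."""

    def __init__(self):
        self._by_attachment = {}  # (device, port, vlan) -> cross-connect name

    def add(self, name: str, a: Endpoint, z: Endpoint) -> None:
        if a.attachment == z.attachment:
            raise ValueError(f"{name}: both ends use {a.attachment}")
        for ep in (a, z):
            if not 1 <= ep.vlan <= 4094:
                raise ValueError(f"{name}: VLAN {ep.vlan} is outside 1-4094")
            if ep.attachment in self._by_attachment:
                owner = self._by_attachment[ep.attachment]
                raise ValueError(f"{name}: {ep.attachment} already used by {owner}")
        # Only record the cross-connect once both ends have passed validation.
        for ep in (a, z):
            self._by_attachment[ep.attachment] = name

    def lookup(self, device: str, port: str, vlan: int):
        return self._by_attachment.get((device, port, vlan))
```

Even this toy version has to track a tag per trunk port per contributor, and in a hybrid model the same record would need extra fields for the vendor fabric and the DZX layer 2 fabric in between, which is the bookkeeping a layer 3 hand-off avoids.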

nikw9944 · 2025-06-17T17:10:20Z

rfcs/rfc2-doublezero_exchange.md

+Cons
+- New automation and configuration model to support DZX switch use-case
+- Small switch buffers with Tomahawk may not be sufficient to absorb traffic bursts towards busy DZDs in the metro
+- Limitations of scaling a single broadcast domain, for example the DZX topology MUST remain loop-free and MUST NOT rely on spanning-tree protocols


Might be worth calling out that we can bundle multiple links between two switches using link aggregation without creating loops

"DZX topology MUST remain loop-free" - if a contributor accidentally cables up a loop, do we have any mitigations? If not, that seems like a deal breaker that takes the L2 DZX off the table

We should have a mechanism that enforces topology and refuses to enable any link that breaks these rules, checked as part of link RFS testing.

If we have a layer 2 network, we should run any protocols that are necessary to protect against layer 2 loops.
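
One way to picture that enforcement is an admission check run before a link is put into service. The sketch below models the layer 2 fabric as an undirected graph of switches, treats additional links between the same pair of switches as members of one link-aggregation bundle (so they do not count as a loop), and rejects any link that would close a cycle. It uses a plain union-find structure. The `Topology` class, its method names, and the switch names are hypothetical, and the check assumes parallel links really are configured as a LAG; it is not a description of existing DoubleZero tooling.

```python
class Topology:
    """Tracks enabled switch-to-switch links and rejects links that would form a loop."""

    def __init__(self):
        self._parent = {}      # union-find parent pointers, one entry per switch
        self._bundles = set()  # switch pairs that already have an enabled link

    def _find(self, node):
        # Return the representative of the connected component containing node.
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]  # path halving
            node = self._parent[node]
        return node

    def can_enable(self, a, b):
        if a == b:
            return False  # a switch cabled to itself is always a loop
        if frozenset((a, b)) in self._bundles:
            return True   # assumed to join the existing LAG between a and b
        # A new pair is safe only if the two switches are not already connected.
        return self._find(a) != self._find(b)

    def enable(self, a, b):
        if not self.can_enable(a, b):
            return False
        self._bundles.add(frozenset((a, b)))
        self._parent[self._find(a)] = self._find(b)
        return True


topo = Topology()
first = topo.enable("dzx1", "dzx2")    # new pair, accepted
second = topo.enable("dzx1", "dzx2")   # parallel link, accepted as a LAG member
third = topo.enable("dzx2", "dzx3")    # extends the tree, accepted
fourth = topo.enable("dzx3", "dzx1")   # would close a triangle, rejected
```

A check like this only guards against planned links. An accidental mis-cabling still needs the in-band protections mentioned above, since it never passes through the admission step.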

nikw9944 · 2025-06-17T17:12:59Z

rfcs/rfc2-doublezero_exchange.md

+- Vendor solution abstracts away operational overhead of operating DZX for a cost
+- Lack of visibility into network fabric
+
+Cons


Is cost a con for vendor fabrics?

We don't have an economic model for vendor fabrics yet. It is being worked on. List prices make it prohibitively expensive. It is one of the open questions.

ben-malbeclabs · 2025-06-17T23:57:46Z

> The doc outlines 3 options for physical topology, then it outlines 4 options for logical topology. The doc provides a recommendation for logical topology (hybrid model) but I don't see a recommendation for physical topology. Did I miss it?

Fair point, there is no explicit recommendation for the physical topology currently. It is probably option 2 if we can answer the question about ownership and cost for the interconnect. Option 1 would work in small deployments. Option 3 is probably the best technical solution but most likely the most expensive to implement.

nikw9944 · 2025-07-10T16:36:17Z

rfcs/rfc2-doublezero_exchange.md

+
+| Option                           | Description                                                                                                                                                                                                                                                                                                                                                                                                                        | Pros                                                                                                                                                                                                                       | Cons                                                                                                                                                                                                                                                                                                                        | Recommendation                                                                                                                                                                    |
+|----------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| **1. DZX as a Layer 2 Fabric**   | - DZX operated by DZ community members<br>- Each DZX switch operates at layer 2<br>- A single subnet (broadcast domain) is used to address all DZD NNIs facing the DZX<br>- The existing DZ network routing is extended across the DZX<br>&nbsp;&nbsp;&nbsp;- A full-mesh of IS-IS/PIM neighbors is formed across the DZX<br>&nbsp;&nbsp;&nbsp;- The DZX is transparent from a routing perspective<br>- All links are assumed to be of equal latency<br>&nbsp;&nbsp;&nbsp;- Helps incentivize diversification out of a single data center<br>&nbsp;&nbsp;&nbsp;- Prevents an arms race in the metro<br>- The DZX switch maintains a layer 3 connection for telemetry | - The DZX switch requires a simpler feature set (layer 2 only) than a DZD<br>- Allows the DZX operator to purchase less expensive switch platforms e.g. Arista 7060X6 Tomahawk<br>- All links are assumed to be of equal latency<br>&nbsp;&nbsp;&nbsp;- Helps incentivize diversification out of a single data center<br>&nbsp;&nbsp;&nbsp;- Prevents an arms race in the metro<br>- Full visibility for monitoring and troubleshooting | - New automation and configuration model to support DZX layer 2 switch use-case<br>- Small switch buffers with Tomahawk may not be sufficient to absorb traffic bursts towards busy DZDs in the metro<br>&nbsp;&nbsp;&nbsp;- Would likely require traffic shaping on DZDs<br>- Limitations of scaling a single broadcast domain, for example the DZX topology MUST remain loop-free and MUST NOT rely on spanning-tree protocols<br>&nbsp;&nbsp;&nbsp;- Limited to two DZX switches<br>&nbsp;&nbsp;&nbsp;- Requires uses of layer 2 protocols such as spanning-tree protocol, storm-control<br>- Complex to integrate within a hybrid community/vendor model<br>&nbsp;&nbsp;&nbsp;- Layer 2 hand-off complexity e.g. mapping between VLANs |  |


Definitely a nitpick but I can't help it: "MUST NOT rely on spanning-tree protocols".... "Requires uses of layer 2 protocols such as spanning-tree protocol" 😬

Not a nitpick. That definitely needs to be made clearer and will address.

dalalama · 2025-08-02T18:57:18Z

rfcs/rfc2-doublezero_exchange.md

+## New Terminology
+
+- DoubleZero Exchange (DZX): a network fabric creating a contiguous network between all network contributors with a metro
+- DoubleZero Device / DoubleZero eXchange (DZDx): a single DZ network device that acts as both a DZD and a DZX


I like this (DZDx), but we'll just use it internally, right? Otherwise it'll be confusing to folks while we only have one dedicated DZX (galaxy-ny) and mostly DZDxs.

dalalama

This is awesome. Ship it. Sorry for being the slow poke on this review.

ben-malbeclabs added 5 commits June 6, 2025 09:54

Initial commit for DZX RFC.

7d44271

Further content update

186c88a

Testing image embedding

74888d1

DZX RFC major content addition

4cc7b69

DZX RFC major content addition

a8674d3

ben-malbeclabs self-assigned this Jun 7, 2025

ben-malbeclabs requested review from a team and dalalama June 7, 2025 01:04

Rename from rfc1 to rfc2; minor fixes and formatting changes

647d5b0

nikw9944 requested changes Jun 17, 2025

ben-malbeclabs added 14 commits July 9, 2025 17:40

544 - DZX RFC updates

abb5633

544 - DZX RFC updates

3fafa03

544 - DZX RFC updates

04f6029

544 - DZX RFC updates, reorganizing into tables.

375a49e

544 - DZX RFC updates, minor formatting.

f32f3ea

544 - DZX RFC updates, minor formatting.

091810d

544 - DZX RFC updates, adjusting image height.

a85f736

544 - DZX RFC updates, minotr formatting for images

ec16d82

544 - DZX RFC updates, minotr formatting for images

31c6c60

544 - DZX RFC updates, minotr formatting for images

53a649a

544 - DZX RFC updates, minotr formatting for images

0c32c54

544 - DZX RFC updates, minotr formatting for images

c422c4d

544 - DZX RFC updates, minotr formatting for images

3002690

544 - DZX RFC updates, minotr formatting for images

6ea5aca

ben-malbeclabs changed the title ~~RFC1 - DoubleZero Exchange - DZX~~ RFC2 - DoubleZero Exchange - DZX Jul 10, 2025

ben-malbeclabs requested a review from a team July 10, 2025 01:26

nikw9944 approved these changes Jul 10, 2025

elitegreg self-requested a review July 22, 2025 20:26

elitegreg approved these changes Jul 22, 2025

dalalama reviewed Aug 2, 2025

dalalama approved these changes Aug 2, 2025
