@ben-malbeclabs
Contributor

  • Details the use-case for DZX
  • Offers options for physical/logical DZX deployments
  • Considers use of community and vendor design, as well as hybrid

@ben-malbeclabs self-assigned this Jun 7, 2025
@ben-malbeclabs requested review from a team and dalalama June 7, 2025 01:04
@nikw9944
Contributor

Not sure if we have an approach for keeping RFC numbers unique, but it looks like there's already an RFC1 ahead of this in the queue: https://github.com/malbeclabs/doublezero/pull/439/files
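
One possible way to catch this automatically is a small check over the RFC file names. The sketch below is only an illustration: the `rfcN-...` naming pattern and the function name `duplicate_rfc_numbers` are assumptions, not an existing convention in the repository.

```python
import re
from collections import defaultdict

# Assumed naming pattern: "rfc" followed by a number, e.g. "rfc1-dzx.md".
RFC_NAME = re.compile(r"^rfc[-_ ]?(\d+)", re.IGNORECASE)


def duplicate_rfc_numbers(filenames):
    """Return {rfc_number: [filenames]} for every number used more than once."""
    by_number = defaultdict(list)
    for name in filenames:
        match = RFC_NAME.match(name)
        if match:
            by_number[int(match.group(1))].append(name)
    return {n: names for n, names in by_number.items() if len(names) > 1}
```

Run against the file lists of the open PRs plus the main branch, a non-empty result would flag a collision before merge.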

@nikw9944
Contributor

The doc outlines 3 options for physical topology, then it outlines 4 options for logical topology. The doc provides a recommendation for logical topology (hybrid model) but I don't see a recommendation for physical topology. Did I miss it?

- Vendor solution abstracts away operational overhead of operating DZX for a cost
- Lack of visibility into network fabric

Cons

Contributor

With DZX as a vendor fabric solution, is it a con that we won't be able to have a single DZX that spans multiple data center providers in any given metro?

Contributor Author

Can you clarify what you mean by data center provider? Data center vendor like Equinix, Coresite etc.?

All of the vendors we have evaluated so far have POPs in multiple data centers.

Contributor

Yes, I meant Equinix, Coresite, etc., and I was alluding to the fact that if we have a DZX that spans multiple data center vendors (like Equinix + Coresite) then their fabric won't be able to provide connectivity between all the DZDs in the DZX.

Contributor Author

Potentially, yes, although hopefully they would cover the vast majority. If we ever choose to use one of them, we should probably create a separate design doc to review that.

- New automation and configuration model to support DZX switch use-case
- Small switch buffers with Tomahawk may not be sufficient to absorb traffic bursts towards busy DZDs in the metro
- Limitations of scaling a single broadcast domain, for example the DZX topology MUST remain loop-free and MUST NOT rely on spanning-tree protocols
- Limited to three DZX switches

Contributor

Where does the three switch limit come from?

Contributor Author

On reflection, this is an inaccurate statement. If we build physical topology option 3, we could have a loop-free topology with any number of DZX switches. If we build physical topology option 2 and the fabric is layer 2, we are limited to two DZX switches to keep it loop-free.

- Limited to three DZX switches
- Is this a real limitation?
- Complex to integrate within a hybrid community/vendor model
- Layer 2 hand-off complexity

Contributor

Can you be more specific about layer 2 hand-off complexity?

Contributor Author

The complexity lies in creating a VLAN mapping database that represents virtual cross-connects between the different contributors. This will need to exist for the vendor fabric if we use that tech at all, but if we combine it into a hybrid model then it needs to be extended through a DZX layer 2 fabric, potentially mapping to trunk ports and sub-interfaces on the DZDs.

How much complexity that adds is, I suppose, open to debate, and it will have to be weighed against the cost of using layer 3 switches. A layer 3 hand-off feels much simpler to reason about.
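
To make the idea of a VLAN mapping database a little more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the one-VLAN-per-contributor-pair model, the VLAN range, and the `VlanMap` class are hypothetical and are not defined by the RFC.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VlanMap:
    """Hypothetical registry: one VLAN ID per contributor pair (a virtual cross-connect)."""

    first_vlan: int = 100   # assumed start of the usable range
    last_vlan: int = 4094   # highest usable 802.1Q VLAN ID
    _by_pair: dict = field(default_factory=dict, init=False)

    def allocate(self, a: str, b: str) -> int:
        """Return the VLAN for the pair (a, b), allocating the next free ID if needed."""
        if a == b:
            raise ValueError("a cross-connect needs two distinct contributors")
        pair = frozenset((a, b))  # order-independent key
        if pair in self._by_pair:
            return self._by_pair[pair]
        vlan = self.first_vlan + len(self._by_pair)
        if vlan > self.last_vlan:
            raise RuntimeError("VLAN range exhausted")
        self._by_pair[pair] = vlan
        return vlan

    def lookup(self, a: str, b: str) -> Optional[int]:
        """Return the VLAN for the pair, or None if no cross-connect exists."""
        return self._by_pair.get(frozenset((a, b)))
```

In a hybrid model the same mapping would also have to record which trunk port or sub-interface each VLAN lands on at every DZD, which is where most of the extra bookkeeping would come from.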

Cons
- New automation and configuration model to support DZX switch use-case
- Small switch buffers with Tomahawk may not be sufficient to absorb traffic bursts towards busy DZDs in the metro
- Limitations of scaling a single broadcast domain, for example the DZX topology MUST remain loop-free and MUST NOT rely on spanning-tree protocols

Contributor

It might be worth calling out that we can bundle multiple links between two switches using link aggregation without creating loops.

Contributor

"DZX topology MUST remain loop-free": if a contributor accidentally cables up a loop, do we have any mitigations? If not, that seems like a deal breaker that takes the L2 DZX off the table.

Contributor Author

We should have a mechanism that enforces the topology and, as part of link RFS testing, does not enable a link that breaks these rules.

If we have a layer 2 network, we should run any protocols that are necessary to protect against layer 2 loops.
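
A minimal sketch of such a pre-enablement check, using union-find over the existing switch-to-switch links. The function names and switch names are hypothetical, and the sketch assumes that parallel links between the same two switches are bundled with link aggregation and therefore count as one logical link.

```python
def find(parent, node):
    """Return the representative of the component that contains node."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]  # path halving
        node = parent[node]
    return node


def would_create_loop(links, new_link):
    """Return True if enabling new_link would close a layer 2 loop.

    links: iterable of (switch_a, switch_b) pairs already in service.
    new_link: the (switch_a, switch_b) pair being tested.
    """
    existing = {frozenset(link) for link in links}
    a, b = new_link
    if a == b:
        return True  # a switch cabled to itself is always a loop
    if frozenset(new_link) in existing:
        return False  # assumed to join the existing LAG bundle
    parent = {}
    for link in existing:
        if len(link) == 2:
            x, y = tuple(link)
            parent[find(parent, x)] = find(parent, y)
    # A loop forms only if both ends are already connected by another path.
    return find(parent, a) == find(parent, b)
```

For example, with `dzx-sw1`/`dzx-sw2` and `dzx-sw2`/`dzx-sw3` in service, a new `dzx-sw3`/`dzx-sw1` link would be rejected, while a second `dzx-sw1`/`dzx-sw2` link would pass only on the assumption that it is actually configured as a LAG member.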

- Vendor solution abstracts away operational overhead of operating DZX for a cost
- Lack of visibility into network fabric

Cons

Contributor

Is cost a con for vendor fabrics?

Contributor Author

We don't have an economic model for vendor fabrics yet; it is being worked on. List prices make it prohibitively expensive. It is one of the open questions.

@ben-malbeclabs
Contributor Author

> The doc outlines 3 options for physical topology, then it outlines 4 options for logical topology. The doc provides a recommendation for logical topology (hybrid model) but I don't see a recommendation for physical topology. Did I miss it?

Fair point, there is no explicit recommendation for the physical topology currently. It is probably option 2 if we can answer the question about ownership and cost for the interconnect. Option 1 would work in small deployments. Option 3 is probably the best technical solution but most likely the most expensive to implement.

@ben-malbeclabs changed the title from "RFC1 - DoubleZero Exchange - DZX" to "RFC2 - DoubleZero Exchange - DZX" Jul 10, 2025
@ben-malbeclabs requested a review from a team July 10, 2025 01:26

| Option | Description | Pros | Cons | Recommendation |
|----------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **1. DZX as a Layer 2 Fabric** | - DZX operated by DZ community members<br>- Each DZX switch operates at layer 2<br>- A single subnet (broadcast domain) is used to address all DZD NNIs facing the DZX<br>- The existing DZ network routing is extended across the DZX<br>&nbsp;&nbsp;&nbsp;- A full-mesh of IS-IS/PIM neighbors is formed across the DZX<br>&nbsp;&nbsp;&nbsp;- The DZX is transparent from a routing perspective<br>- All links are assumed to be of equal latency<br>&nbsp;&nbsp;&nbsp;- Helps incentivize diversification out of a single data center<br>&nbsp;&nbsp;&nbsp;- Prevents an arms race in the metro<br>- The DZX switch maintains a layer 3 connection for telemetry | - The DZX switch requires a simpler feature set (layer 2 only) than a DZD<br>- Allows the DZX operator to purchase less expensive switch platforms e.g. Arista 7060X6 Tomahawk<br>- All links are assumed to be of equal latency<br>&nbsp;&nbsp;&nbsp;- Helps incentivize diversification out of a single data center<br>&nbsp;&nbsp;&nbsp;- Prevents an arms race in the metro<br>- Full visibility for monitoring and troubleshooting | - New automation and configuration model to support DZX layer 2 switch use-case<br>- Small switch buffers with Tomahawk may not be sufficient to absorb traffic bursts towards busy DZDs in the metro<br>&nbsp;&nbsp;&nbsp;- Would likely require traffic shaping on DZDs<br>- Limitations of scaling a single broadcast domain, for example the DZX topology MUST remain loop-free and MUST NOT rely on spanning-tree protocols<br>&nbsp;&nbsp;&nbsp;- Limited to two DZX switches<br>&nbsp;&nbsp;&nbsp;- Requires uses of layer 2 protocols such as spanning-tree protocol, storm-control<br>- Complex to integrate within a hybrid community/vendor model<br>&nbsp;&nbsp;&nbsp;- Layer 2 hand-off complexity e.g. mapping between VLANs | |

Contributor

Definitely a nitpick but I can't help it: "MUST NOT rely on spanning-tree protocols".... "Requires uses of layer 2 protocols such as spanning-tree protocol" 😬

Contributor Author

Not a nitpick. That definitely needs to be made clearer, and I will address it.

@elitegreg self-requested a review July 22, 2025 20:26

## New Terminology

- DoubleZero Exchange (DZX): a network fabric creating a contiguous network between all network contributors within a metro
- DoubleZero Device / DoubleZero eXchange (DZDx): a single DZ network device that acts as both a DZD and a DZX

Contributor

I like this (DZDx), but we'll just use it internally, right? Otherwise it'll be confusing to folks while we only have one dedicated DZX (galaxy-ny) and mostly DZDxs.

Contributor

@dalalama left a comment

This is awesome. Ship it. Sorry for being the slow poke on this review.
