[EVPN-MH] Add CLI commands for EVPN VXLAN Multihoming configuration by bdfriedman · Pull Request #4247 · sonic-net/sonic-utilities

bdfriedman · 2026-02-04T19:14:41Z

Why I did it

This PR adds CLI commands to configure and monitor the EVPN VXLAN Multihoming (EVPN-MH) feature in SONiC. These commands enable network operators to:

Configure global EVPN-MH parameters (startup delay, MAC/neighbor hold times)
Configure EVPN Ethernet Segments on LAG/PortChannel interfaces
Set Designated Forwarder (DF) preference for load distribution
Display EVPN ES, ES-EVI, and L2 nexthop information

This complements the EVPN-MH infrastructure added in sonic-swss and sonic-swss-common.

Work item tracking

Microsoft ADO (number only):

How I did it

Global EVPN-MH Configuration (evpn_mh.py):

config evpn-mh startup-delay <0-3600> - Configure EVPN-MH startup delay (default: 300s)
- Determines how long to wait before declaring local Ethernet Segments as up after system boot
- Prevents premature traffic forwarding during initialization
- Valid range: 0-3600 seconds
config evpn-mh mac-holdtime <0-86400> - Configure MAC hold time (default: 1080s)
- Duration to retain remote MAC entries after EVPN route withdrawal
- Allows for graceful handling of temporary connectivity issues
- Valid range: 0-86400 seconds
config evpn-mh neigh-holdtime <0-86400> - Configure neighbor hold time (default: 1080s)
- Duration to retain neighbor entries after EVPN route withdrawal
- Valid range: 0-86400 seconds

Interface-level EVPN-ES Configuration (main.py):
config interface evpn-esi add <interface> <esi> - Add EVPN Ethernet Segment Identifier

Supports two ESI types:
- Type 0 (Manual): 00:XX:XX:XX:XX:XX:XX:XX:XX:XX - Operator-configured 10-byte ESI
  - First byte must be 0x00 (Type 0)
  - Validates against reserved ESIs (all zeros, all FFs)
  - Prevents duplicate manual ESI across interfaces
- Type 3 (MAC-based): auto-system-mac - Auto-generated from system MAC
  - Uses port ID derived from interface name
  - Leverages system_mac from PORTCHANNEL table if configured
Integrates with FRR via vtysh commands
Stores configuration in CONFIG_DB EVPN_ETHERNET_SEGMENT table

config interface evpn-esi del <interface> - Remove EVPN Ethernet Segment configuration
- Removes from CONFIG_DB
- Cleans up FRR configuration (es-id, es-sys-mac, es-df-pref)
config interface evpn-df-pref <interface> <1-65535> - Set DF preference
- Controls Designated Forwarder election priority (higher value = higher preference)
- Default: 32767
- Valid range: 1-65535
- Updates both CONFIG_DB and FRR configuration

Show Commands (evpn.py):
show evpn - Display general EVPN information
show evpn es [<esi>] - Display Ethernet Segment information
- Optional ESI parameter to show specific ES details

show evpn es-evi [<vni>] - Display Ethernet Segment per EVI information
- Optional VNI parameter to filter by specific VLAN/VNI
show evpn es-evi detail - Show detailed ES-EVI information
show evpn l2-nh - Display L2 nexthop groups for EVPN all-active ES

All show commands leverage FRR's show evpn commands via bgp_util.run_bgp_show_command().

Testing (210+ test cases):

config_evpn_mh_test.py - Global EVPN-MH configuration tests
- Startup delay validation (boundary values, valid/invalid inputs)
- MAC holdtime validation
- Neighbor holdtime validation
config_int_evpn_test.py - Interface EVPN-ES configuration tests
- Manual ESI configuration (Type 0)
- MAC-based ESI configuration (Type 3)
- ESI validation (reserved ESI rejection, format validation, duplicate detection)
- DF preference configuration
- Configuration add/delete operations
- Error handling and edge cases

Helper Functions:

is_reserved_esi() - Validates against reserved ESI values
parse_esi_input() - Parses and validates ESI input (Type 0 vs Type 3)
port_id_from_if_name() - Extracts numeric port ID from interface name
check_if_same_manual_esi_exists() - Prevents duplicate manual ESI configuration
run_vtysh_command() - Wraps vtysh execution for FRR integration
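
The ESI rules listed earlier (10 colon-separated bytes, first byte 0x00 for Type 0, all-zeros and all-FFs reserved, `auto-system-mac` for Type 3) can be sketched roughly as follows. This is an illustrative approximation, not the PR's code: the helper names are borrowed from the list above, but the signatures, the `ESI_RE` and `RESERVED_ESIS` constants, and the error messages are assumptions.

```python
import re

# Hypothetical constants for this sketch; the type strings match the
# values listed for the EVPN_ETHERNET_SEGMENT table.
ESI_RE = re.compile(r"[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){9}")
RESERVED_ESIS = {":".join(["00"] * 10), ":".join(["ff"] * 10)}


def is_reserved_esi(esi):
    """Return True for the all-zeros and all-FFs ESI values."""
    return esi.lower() in RESERVED_ESIS


def parse_esi_input(esi_input):
    """Classify CLI input as Type 3 (auto) or Type 0 (manual).

    Raises ValueError when the input is malformed or reserved.
    """
    if esi_input == "auto-system-mac":
        return {"esi": "AUTO", "type": "TYPE_3_MAC_BASED"}
    if not ESI_RE.fullmatch(esi_input):
        raise ValueError("ESI must be 10 colon-separated hex bytes")
    esi = esi_input.lower()  # normalize so later comparisons ignore case
    if is_reserved_esi(esi):
        raise ValueError("ESI {} is reserved".format(esi))
    if not esi.startswith("00:"):
        raise ValueError("first byte of a manual ESI must be 00")
    return {"esi": esi, "type": "TYPE_0_OPERATOR_CONFIGURED"}
```

Raising `ValueError` keeps the sketch independent of Click; a CLI handler would presumably translate it into `ctx.fail()`.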

Files Changed:

config/evpn_mh.py - New file (109 lines)
config/main.py - Added 199 lines for interface EVPN commands
show/evpn.py - New file (67 lines)
show/main.py - Added 2 lines to register EVPN command group
tests/config_evpn_mh_test.py - New file (104 lines)
tests/config_int_evpn_test.py - New file (230 lines)

Total: +711 lines across 6 files

How to verify it

Global EVPN-MH Configuration:

# Configure global EVPN-MH parameters
config evpn-mh startup-delay 600
config evpn-mh mac-holdtime 2000
config evpn-mh neigh-holdtime 2000

# Verify in CONFIG_DB
redis-cli -n 4 HGETALL "EVPN_MH_GLOBAL|default"

Interface EVPN-ES Configuration:

# Configure Type 0 (manual) ESI on PortChannel
config interface evpn-esi add PortChannel1 00:01:02:03:04:05:06:07:08:09

# Configure Type 3 (MAC-based) ESI
config interface evpn-esi add PortChannel2 auto-system-mac

# Set DF preference
config interface evpn-df-pref PortChannel1 50000

# Verify in CONFIG_DB
redis-cli -n 4 HGETALL "EVPN_ETHERNET_SEGMENT|PortChannel1"

# Verify in FRR
vtysh -c "show running-config" | grep -A 5 "interface PortChannel1"

# Delete ESI
config interface evpn-esi del PortChannel1

Show Commands:

# Display EVPN information
show evpn
show evpn es
show evpn es 00:01:02:03:04:05:06:07:08:09
show evpn es-evi
show evpn es-evi 1000
show evpn es-evi detail
show evpn l2-nh

Run Unit Tests:

pytest tests/config_evpn_mh_test.py -v
pytest tests/config_int_evpn_test.py -v

Which release branch to backport (provide reason below if selected)

Tested branch (Please provide the tested image version)

Description for the changelog

Add CLI commands for EVPN VXLAN Multihoming: global EVPN-MH configuration, interface Ethernet Segment (ESI) configuration, DF preference settings, and EVPN show commands

Link to config_db schema for YANG model changes

Updates required for:

New table: EVPN_MH_GLOBAL - Global EVPN-MH configuration
- startup_delay: 0-3600 (default: 300)
- mac_holdtime: 0-86400 (default: 1080)
- neigh_holdtime: 0-86400 (default: 1080)
New table: EVPN_ETHERNET_SEGMENT - Per-interface ES configuration
- esi: 10-byte ESI string (e.g., "00:01:02:03:04:05:06:07:08:09") or "AUTO"
- type: "TYPE_0_OPERATOR_CONFIGURED" or "TYPE_3_MAC_BASED"
- df_pref: 1-65535 (default: 32767)

YANG model updates should be submitted to sonic-yang-models repository.

Depends on

sonic-swss-common PR: [Add L2 nexthop group table and raw netlink message handling support]
sonic-swss PR: [EVPN-MH infrastructure and orchestration agents]
FRR: EVPN MH support (available in FRR 8.x+)

Key Features:
✅ Global EVPN-MH parameter configuration
✅ Type 0 (manual) and Type 3 (MAC-based) ESI support
✅ DF preference configuration for load balancing
✅ FRR integration via vtysh
✅ CONFIG_DB persistence
✅ Comprehensive validation and error handling
✅ 210+ unit test cases
✅ Show commands for monitoring EVPN state

linux-foundation-easycla · 2026-02-04T19:14:48Z

The committers listed above are authorized under a signed CLA.

✅ login: bdfriedman / name: Barry Friedman (friedman) (24fb1c9, 5c4e3d1, 7e11e92, adde3a2, b575363, e89e072)
✅ login: joshle2 / name: Josh Lee (24fb1c9)
✅ login: pbrisset / name: Patrice Brissette (06f7b40, 4aa0242, 63e882d, 80a2a79, 82018c2, adda794)
✅ login: tahmed-dev / name: Tamer Ahmed (81ec7b6)
✅ login: yanjundeng / name: yanjun deng (1fb13af, 58f7397, 6edd918, d7055ed, dbebd85)

mssonicbld · 2026-02-04T19:14:49Z

/azp run

azure-pipelines · 2026-02-04T19:15:00Z

Azure Pipelines successfully started running 1 pipeline(s).

selva-nexthop · 2026-02-05T20:27:33Z

+        cmd += " {}".format(es_evi)
+
+    output = bgp_util.run_bgp_show_command(cmd)
+    print(output)


make it consistent to use click.echo instead of print.

Fixed: now uses click.echo instead of print throughout show/evpn.py.

Verified: all print() calls replaced with click.echo() in show/evpn.py. Fix looks good.

selva-nexthop · 2026-02-05T20:29:37Z

+    """Show evpn Ethernet Segment per EVI information"""
+    cmd = "show evpn es-evi"
+
+    if es_evi is not None:


it can be just

if es_evi:  # This checks if es_evi is not None AND not an empty string
    cmd += " {}".format(es_evi)

Fixed: now uses truthy check if vni: and if es: instead of is not None.

Verified: truthy checks (if es:, if vni:) used instead of is not None. Looks good.
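
For illustration, the difference between the two checks shows up with an empty string. The helper below is a hypothetical stand-in for the command-building logic, not the function in show/evpn.py.

```python
def build_es_evi_cmd(vni=None):
    """Build the FRR command, appending the VNI only when it is non-empty."""
    cmd = "show evpn es-evi"
    if vni:  # False for both None and "", unlike `vni is not None`
        cmd += " {}".format(vni)
    return cmd


print(build_es_evi_cmd())        # no VNI given
print(build_es_evi_cmd(""))      # empty string is skipped as well
print(build_es_evi_cmd("1000"))  # VNI appended
```

A truthy check would also skip an integer `0`, which is worth keeping in mind if the argument is ever converted to `int` before this point.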

mssonicbld · 2026-02-12T14:51:45Z

/azp run

azure-pipelines · 2026-02-12T14:51:57Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-02-12T16:16:15Z

/azp run

azure-pipelines · 2026-02-12T16:16:30Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-02-12T16:31:29Z

/azp run

azure-pipelines · 2026-02-12T16:31:41Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-02-12T17:05:26Z

/azp run

azure-pipelines · 2026-02-12T17:05:38Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-02-12T17:07:50Z

/azp run

azure-pipelines · 2026-02-12T17:08:02Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-02-18T20:02:49Z

/azp run

azure-pipelines · 2026-02-18T20:03:00Z

Azure Pipelines successfully started running 1 pipeline(s).

Copilot

Pull request overview

Adds SONiC CLI support for configuring and monitoring EVPN VXLAN Multihoming (EVPN-MH), including new config evpn-mh global settings, config interface EVPN-ES/DF preference commands, and a new show evpn command group that proxies FRR output.

Changes:

Add config evpn-mh commands to configure global EVPN-MH timers in CONFIG_DB.
Add config interface evpn-esi ... and config interface evpn-df-pref ... to configure Ethernet Segment parameters and DF preference, including FRR vtysh integration.
Add show evpn ... commands that run FRR show evpn variants and display the output.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 17 comments.

Show a summary per file

File	Description
`config/main.py`	Adds EVPN ES interface configuration commands, parsing/validation helpers, and FRR vtysh integration.
`config/evpn_mh.py`	Adds new `config evpn-mh` command group for EVPN-MH global timer configuration.
`show/main.py`	Registers the new `show evpn` command group.
`show/evpn.py`	Implements `show evpn` subcommands by proxying FRR `show evpn ...` output.
`tests/config_evpn_mh_test.py`	Adds unit tests for EVPN-MH global configuration commands.
`tests/config_int_evpn_test.py`	Adds unit tests for EVPN ES interface configuration and DF preference behavior.

Copilot · 2026-03-11T18:01:11Z

+def set_startup_delay(ctx, startup_delay=EVPN_MH_STARTUP_DELAY_DEFAULT):
+    """Add EVPN MH"""
+    config_db = ValidatedConfigDBConnector(ctx.obj['config_db'])
+    if not is_valid_startup_delay(startup_delay):


These handlers access ctx.obj['config_db'], but when invoked from the top-level config group ctx.obj is a utilities_common.db.Db instance (see config/main.py), not a dict. This will raise TypeError: 'Db' object is not subscriptable at runtime (even though unit tests pass a dict). Update the commands to use @clicommon.pass_db and db.cfgdb (or access ctx.obj.cfgdb) consistently across all evpn-mh subcommands.

Fixed: commands now use @clicommon.pass_db decorator and access db.cfgdb instead of ctx.obj['config_db'].

Verified: config/evpn_mh.py commands now use @clicommon.pass_db decorator and access db.cfgdb. Tests updated to pass Db() instance. Fix looks correct.

Copilot · 2026-03-11T18:01:12Z

+def run_vtysh_command(cmd):
+    return clicommon.run_command(cmd)
+


run_vtysh_command() wraps clicommon.run_command(), which returns None on success and calls sys.exit() on failures. As a result, the subsequent if op: checks will never catch errors, and failures can abruptly exit without a Click-friendly error message. Consider using return_cmd=True (and ignore_error=True) to capture output/rc and then ctx.fail() on non-zero rc.

Suggested change

-def run_vtysh_command(cmd):
-    return clicommon.run_command(cmd)
+def run_vtysh_command(cmd, ctx=None):
+    """
+    Run a vtysh command.
+    If a Click context is provided, capture the return code/output and
+    surface failures via ctx.fail() instead of exiting abruptly.
+    When no context is provided, preserve the original behavior of
+    clicommon.run_command(), which may call sys.exit() on error.
+    """
+    if ctx is None:
+        # Preserve existing behavior for callers that don't use Click ctx
+        return clicommon.run_command(cmd)
+    rc, output = clicommon.run_command(cmd, return_cmd=True, ignore_error=True)
+    if rc != 0:
+        message = output if output else f"Command '{cmd}' failed with return code {rc}"
+        ctx.fail(message)
+    return output

Fixed: run_vtysh_command() now accepts an optional ctx parameter, captures rc/output with return_cmd=True, and calls ctx.fail() on non-zero rc.

Verified: run_vtysh_command() now accepts optional ctx, uses return_cmd=True and ignore_error=True, and calls ctx.fail() on non-zero rc. All callers updated.

Copilot · 2026-03-11T18:01:12Z

+    """Show Ethernet Segment per EVI information"""
+    """"show evpn es-evi <vni>"""


There is a stray standalone string literal (""""show evpn es-evi <vni>""") under es_evi(). It isn’t a docstring/comment (because the function already has a docstring) and is easy to misread; remove it or merge it into the actual docstring.

Suggested change

-    """Show Ethernet Segment per EVI information"""
-    """"show evpn es-evi <vni>"""
+    """Show Ethernet Segment per EVI information (show evpn es-evi <vni>)"""

Fixed: stray standalone string literal removed and merged into the actual docstring.

Verified: stray string literal removed, merged into the docstring. Looks good.

Copilot · 2026-03-11T18:01:12Z

+def check_if_same_manual_esi_exists(ctx, esi_args, es_data):
+    if esi_args['type'] == 'TYPE_0_OPERATOR_CONFIGURED':
+        for es_intf_name, es_intf_data in es_data.items():
+            if esi_args['esi'] == es_intf_data['esi']:
+                ctx.fail(f"The ESI '{esi_args['esi']}' is already in use by '{es_intf_name}'")


check_if_same_manual_esi_exists() compares ESI strings with exact case. Since manual ESIs are hex, the same ESI can be configured twice using different casing (e.g., ...:ff vs ...:FF), bypassing the duplicate check. Normalize ESIs (e.g., lower-case) before storing and when comparing to enforce true uniqueness.

Fixed: ESI strings are now normalized to lowercase before storing (esi_args['esi'] = esi_input_strs[0].lower()) and compared case-insensitively.

Verified: ESI strings normalized to lowercase before storing and compared case-insensitively. Fix looks correct.
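
A case-insensitive duplicate check along the lines described in this thread might look like the sketch below. `find_duplicate_manual_esi` is a hypothetical helper that returns the conflicting interface name rather than calling `ctx.fail()`, so it is not the PR's `check_if_same_manual_esi_exists()`.

```python
def find_duplicate_manual_esi(esi, es_data):
    """Return the interface already using `esi` (ignoring case), else None.

    `es_data` maps interface names to EVPN_ETHERNET_SEGMENT entries.
    """
    wanted = esi.lower()
    for if_name, entry in es_data.items():
        if entry.get("esi", "").lower() == wanted:
            return if_name
    return None


existing = {
    "PortChannel1": {"esi": "00:01:02:03:04:05:06:07:08:FF",
                     "type": "TYPE_0_OPERATOR_CONFIGURED"},
    "PortChannel2": {"esi": "AUTO", "type": "TYPE_3_MAC_BASED"},
}
print(find_duplicate_manual_esi("00:01:02:03:04:05:06:07:08:ff", existing))
```

Lower-casing on both sides means entries written before normalization was introduced are still matched.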

Copilot · 2026-03-11T18:01:12Z

+    port_id_re = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)')
+    port_id = port_id_re.search(if_name)
+    port_id_group = port_id.group('port_id')
+
+    if port_id_group:
+        port_id = port_id_group.replace('_', '')
+
+    return port_id


port_id_from_if_name() assumes the regex always matches and immediately calls .group('port_id'). If a user provides an interface name without digits (or an unexpected format), this will raise an AttributeError and crash the CLI. Handle a non-match explicitly (e.g., return None/raise a ValueError) and have the caller ctx.fail() with a clear message.

Suggested change

-    port_id_re = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)')
-    port_id = port_id_re.search(if_name)
-    port_id_group = port_id.group('port_id')
-    if port_id_group:
-        port_id = port_id_group.replace('_', '')
-    return port_id
+    """Extract the numeric port identifier from an interface name.
+    Returns a string of digits (with any underscores removed) if the
+    interface name matches the expected pattern, otherwise returns None.
+    """
+    port_id_re = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)')
+    match = port_id_re.search(if_name)
+    if not match:
+        return None
+    port_id_group = match.group('port_id')
+    if port_id_group:
+        return port_id_group.replace('_', '')
+    return None

Fixed: port_id_from_if_name() now returns None on non-match, and callers check for None with ctx.fail().

Verified: port_id_from_if_name() now returns None on non-match, and callers check for None with ctx.fail(). Looks good.

Copilot · 2026-03-11T18:01:15Z

+        ctx.fail(f"EVPN Ethernet Segment {interface_name} - DF Preference {df_pref} is not valid. " \
+                 "Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}.")


The validation error message for an invalid DF preference is missing the f prefix on the second line, so {EVPN_ES_DF_PREF_MIN} / {EVPN_ES_DF_PREF_MAX} will be printed literally. Make the entire message an f-string (or build it in one f-string) so the numeric range is displayed correctly.

Suggested change

-        ctx.fail(f"EVPN Ethernet Segment {interface_name} - DF Preference {df_pref} is not valid. " \
-                 "Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}.")
+        ctx.fail(
+            f"EVPN Ethernet Segment {interface_name} - DF Preference {df_pref} is not valid. "
+            f"Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}."
+        )

Fixed: the second line now has the f prefix so {EVPN_ES_DF_PREF_MIN}/{EVPN_ES_DF_PREF_MAX} are interpolated correctly.

Verified: f-string prefix added to the second line so {EVPN_ES_DF_PREF_MIN}/{EVPN_ES_DF_PREF_MAX} are interpolated correctly. Fix looks good.
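
The underlying Python behavior is that the `f` prefix applies to each string literal separately, even when adjacent literals are concatenated implicitly. A small standalone demonstration, using the constant names from the diff with the 1-65535 range from the PR description:

```python
EVPN_ES_DF_PREF_MIN = 1
EVPN_ES_DF_PREF_MAX = 65535
df_pref = 70000

# Only the first literal is an f-string; the second keeps its braces verbatim.
broken = (f"DF Preference {df_pref} is not valid. "
          "Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}.")

# Both literals carry the prefix, so both are interpolated.
fixed = (f"DF Preference {df_pref} is not valid. "
         f"Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}.")

print(broken)
print(fixed)
```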

Copilot · 2026-03-11T18:01:15Z

+    try:
+        config_db.set_entry(EVPN_ES_TABLE, interface_name, None)
+
+        # Update frr
+        cmd = ['sudo', 'vtysh', '-c', 'configure terminal', '-c', 'interface {}'.format(interface_name)]
+        cmd.append('-c')
+        cmd.append('no evpn mh es-sys-mac')
+        cmd.append('-c')
+        cmd.append('no evpn mh es-df-pref')
+        cmd.append('-c')
+        cmd.append('no evpn mh es-id')
+        op = run_vtysh_command(cmd)
+        if op:
+            ctx.fail("VTYSh config failed. Error: {}".format(op))
+    except JsonPatchConflict as e:


The EVPN ES entry is deleted from CONFIG_DB before the vtysh cleanup is executed. If vtysh fails, FRR may keep stale configuration while CONFIG_DB indicates it was removed. Consider doing FRR cleanup first or re-adding the CONFIG_DB entry on failure so the system remains consistent.

Fixed: FRR cleanup via vtysh is now performed before deleting from CONFIG_DB.

Verified: FRR cleanup via vtysh now performed before CONFIG_DB delete in del_evpn_es. Fix looks correct.

Copilot · 2026-03-11T18:01:15Z

+    try:
+        config_db.mod_entry(EVPN_ES_TABLE, interface_name, {'df_pref': int(df_pref)})
+
+        # Update frr
+        cmd = ['sudo', 'vtysh', '-c', 'configure terminal', '-c', 'interface {}'.format(interface_name)]
+        cmd.append('-c')
+        cmd.append('evpn mh es-df-pref {}'.format(int(df_pref)))
+        op = run_vtysh_command(cmd)
+        if op:
+            ctx.fail("VTYSh config failed. Error: {}".format(op))


mod_entry() is applied before the vtysh update. If the FRR update fails, CONFIG_DB will reflect the new DF preference while FRR does not. Consider applying vtysh first or rolling back the CONFIG_DB update when the vtysh call fails.

Fixed: vtysh update is now applied before mod_entry, and df_pref is stored as str(df_pref) instead of int(df_pref).

Verified: vtysh update applied before mod_entry, and df_pref stored as str(). Fix looks correct.
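
The FRR-first ordering can be sketched with an in-memory stand-in for CONFIG_DB. `FakeConfigDb`, `set_df_pref`, and the injected `run_frr` callable are all hypothetical names for this example, and `run_frr` is assumed to raise on failure; the real handler goes through `run_vtysh_command(cmd, ctx)`.

```python
class FakeConfigDb:
    """In-memory stand-in for CONFIG_DB (illustration only)."""

    def __init__(self):
        self.tables = {}

    def mod_entry(self, table, key, data):
        # Merge the given fields into the existing entry.
        self.tables.setdefault(table, {}).setdefault(key, {}).update(data)


def set_df_pref(config_db, run_frr, interface_name, df_pref):
    """Push the change to FRR first; write CONFIG_DB only if that succeeds."""
    run_frr(['sudo', 'vtysh', '-c', 'configure terminal',
             '-c', 'interface {}'.format(interface_name),
             '-c', 'evpn mh es-df-pref {}'.format(int(df_pref))])
    # Reached only when run_frr did not raise.
    config_db.mod_entry('EVPN_ETHERNET_SEGMENT', interface_name,
                        {'df_pref': str(df_pref)})
```

With this ordering a rejected vtysh command leaves CONFIG_DB untouched, although it does not by itself cover a crash between the two steps.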

Copilot · 2026-03-11T18:01:15Z

+from natsort import natsorted
+from swsscommon.swsscommon import SonicV2Connector, ConfigDBConnector
+from tabulate import tabulate


show/evpn.py imports several modules that are never used in this file (natsorted, SonicV2Connector, ConfigDBConnector, tabulate). Remove unused imports to avoid confusion and keep dependencies minimal.

Suggested change

-from natsort import natsorted
-from swsscommon.swsscommon import SonicV2Connector, ConfigDBConnector
-from tabulate import tabulate

Fixed: unused imports (natsorted, SonicV2Connector, ConfigDBConnector, tabulate) removed. re import added for validation.

Verified: unused imports removed (natsorted, SonicV2Connector, ConfigDBConnector, tabulate), re import added. Looks good.

Copilot · 2026-03-11T18:01:16Z

+    evpn_es_table = db['config_db'].get_table(EVPN_ES_TABLE)
+    assert interface_name in evpn_es_table, f"'{interface_name}' not found in {EVPN_ES_TABLE}"
+    assert evpn_es_table[interface_name]['esi'] == 'AUTO', f"Got ESI {evpn_es_table[interface_name]['esi']}, expected 'AUTO'"
+    assert evpn_es_table[interface_name]['type'] == 'TYPE_3_MAC_BASED', f"Got ESI type {evpn_es_table[interface_name]['esi']}, expected 'TYPE_3_MAC_BASED'"


In configure_mac_esi(), the assertion message says “Got ESI type …” but formats evpn_es_table[interface_name]['esi'] instead of the 'type' field. This makes failures misleading; use the 'type' value in the error message.

Suggested change

-    assert evpn_es_table[interface_name]['type'] == 'TYPE_3_MAC_BASED', f"Got ESI type {evpn_es_table[interface_name]['esi']}, expected 'TYPE_3_MAC_BASED'"
+    assert evpn_es_table[interface_name]['type'] == 'TYPE_3_MAC_BASED', f"Got ESI type {evpn_es_table[interface_name]['type']}, expected 'TYPE_3_MAC_BASED'"

Fixed: assertion message now correctly references evpn_es_table[interface_name]['type'] instead of ['esi'].

Verified: assertion message now references ["type"] instead of ["esi"]. Fix looks correct.

Chloe-Huang-13 · 2026-03-12T03:26:09Z

+@click.argument('startup_delay', metavar='<startup_delay>', required=True)
+@click.pass_context
+def set_startup_delay(ctx, startup_delay=EVPN_MH_STARTUP_DELAY_DEFAULT):
+    """Add EVPN MH"""


Perhaps using "Set EVPN MH startup delay time in seconds" would be better.
set_startup_delay, set_mac_holdtime and set_neigh_holdtime all currently have the docstring "Add EVPN MH".

Fixed: docstrings updated to be more descriptive - 'Set EVPN MH startup delay time in seconds', 'Set EVPN MH MAC holdtime in seconds', 'Set EVPN MH neighbor holdtime in seconds'.

Verified: docstrings updated to descriptive text. Looks good.

Chloe-Huang-13 · 2026-03-12T03:41:33Z

+                 f"Valid values are {EVPN_MH_STARTUP_DELAY_MIN}-{EVPN_MH_STARTUP_DELAY_MAX}.")
+
+    try:
+        config_db.set_entry(EVPN_MH_TABLE, 'default', {'startup_delay': startup_delay})


The "set_entry" will overwrite all existing fields, so other fields under 'default' will be cleared. e.g. mac_holdtime
Recommended: get the old entry, update only the needed fields, and then set the entry to avoid losing existing data.

Fixed: now does get_entry first to preserve existing fields, then updates only the needed field, then set_entry.

Verified: set_startup_delay now does get_entry first to preserve existing fields, then updates only the needed field. Fix looks correct.

Chloe-Huang-13 · 2026-03-12T03:42:39Z

+                 f"Valid values are {EVPN_MH_MAC_HOLDTIME_MIN}-{EVPN_MH_MAC_HOLDTIME_MAX}.")
+
+    try:
+        config_db.set_entry(EVPN_MH_TABLE, 'default', {'mac_holdtime': mac_holdtime})


The "set_entry" will overwrite all existing fields, so other fields under 'default' will be cleared.
Recommended: get the old entry, update only the needed fields, and then set the entry to avoid losing existing data.

Fixed: same pattern applied - get existing entry, update field, set entry.

Verified: same get-then-update pattern applied to set_mac_holdtime. Fix looks correct.

Chloe-Huang-13 · 2026-03-12T03:43:00Z

+                 f"Valid values are {EVPN_MH_NEIGH_HOLDTIME_MIN}-{EVPN_MH_NEIGH_HOLDTIME_MAX}.")
+
+    try:
+        config_db.set_entry(EVPN_MH_TABLE, 'default', {'neigh_holdtime': neigh_holdtime})


The "set_entry" will overwrite all existing fields, so other fields under 'default' will be cleared. e.g. mac_holdtime
Recommended: get the old entry, update only the needed fields, and then set the entry to avoid losing existing data.

Fixed: same pattern applied - get existing entry, update field, set entry.

Verified: same get-then-update pattern applied to set_neigh_holdtime. Fix looks correct.

mssonicbld · 2026-03-13T00:06:26Z

/azp run

azure-pipelines · 2026-03-13T00:06:36Z

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

azure-pipelines · 2026-04-14T02:31:56Z

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

banidoru

All prior review concerns addressed in new commits — threads resolved
FRR-before-ConfigDB ordering, input validation, ESI case normalization, dead code removal all look good
es_evi detail subcommand remains unreachable due to positional vni arg conflicting with subcommand routing (existing thread left open)
New VNI/ESI input validation is a solid improvement over passing raw strings to vtysh
No new issues found in the delta diff

banidoru · 2026-04-14T03:07:38Z

The concern about the unreachable 'detail' subcommand is still valid. The 'es_evi' group's use of 'invoke_without_command=True' with a positional 'vni' argument prevents the 'detail' subcommand from being called.

banidoru

Most previous comments have been addressed.
The es-evi detail subcommand remains unreachable.
Unable to resolve threads due to permissions.

banidoru

Re-review of commit 68af379. All 30+ prior review threads addressed. Key improvements:

FRR-before-CONFIG_DB ordering consistently applied across all commands
run_vtysh_command properly refactored with ctx-based error handling
Input validation added (ESI format, VNI range, df_pref type safety)
Test fixtures updated for new @clicommon.pass_db pattern
Bug fixes: del_mac NameError, bare excepts, shadowed names, typos

One minor nit below. Overall looking solid.

banidoru · 2026-04-14T03:12:45Z

+            if evpn_es_tbl and 'type' in evpn_es_tbl and evpn_es_tbl['type'] == 'TYPE_3_MAC_BASED':
+                cmd.append('-c')
+                cmd.append('no evpn mh es-sys-mac')
+            run_vtysh_command(cmd, ctx)


Nit: run_vtysh_command(cmd, ctx) is now called unconditionally, but the only meaningful vtysh sub-command (no evpn mh es-sys-mac) is appended only when the ES entry is TYPE_3_MAC_BASED. When it is not, this makes a no-op sudo vtysh -c 'configure terminal' -c 'interface ...' call. In the previous code, vtysh was invoked only inside the if evpn_es_tbl... block for the delete path. Consider guarding the vtysh call similarly to avoid unnecessary sudo invocations:

if evpn_es_tbl and 'type' in evpn_es_tbl and evpn_es_tbl['type'] == 'TYPE_3_MAC_BASED':
    cmd.append('-c')
    cmd.append('no evpn mh es-sys-mac')
    run_vtysh_command(cmd, ctx)

Not a functional issue, just unnecessary overhead.

Still open — not addressed in the latest commit. Low priority nit, can be deferred.

mssonicbld · 2026-04-16T23:52:38Z

/azp run

azure-pipelines · 2026-04-16T23:52:45Z

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

banidoru

Good fix: corrected run_command() return value destructuring from rc, output to output, rc — this was a real bug that would have swapped error code and output string
All 30 prior review threads have been addressed with appropriate fixes
Still open (minor): no-op vtysh call in del_evpn_es for non-MAC-based ES types — unnecessary sudo invocation (nit, not a blocker)
Still open (design): show evpn es-evi detail subcommand remains unreachable due to positional vni arg consuming "detail" — tracked separately
No new issues found in this diff. LGTM.

banidoru

New commit fixes critical bug: run_vtysh_command had swapped return values (rc, output → output, rc), matching clicommon.run_command's actual signature. Good catch.
All 30 prior review threads addressed — fixes verified in previous iteration. Unable to resolve threads due to permissions.
One open nit remains: unnecessary vtysh invocation in del_evpn_es / del_pc_sys_id_mac when ES type is not TYPE_3_MAC_BASED (low priority).
No new issues found in this single-line diff.

banidoru · 2026-04-17T00:06:53Z

+    if ctx is None:
+        # Preserve existing behavior for callers that don't use Click ctx
+        return clicommon.run_command(cmd)
+    output, rc = clicommon.run_command(cmd, return_cmd=True, ignore_error=True)


Good fix — clicommon.run_command() returns (output, returncode), not (returncode, output). The previous code would have treated the output string as rc (always truthy → always failing) and the integer return code as the error message. This was a runtime-breaking bug for any vtysh failure path.
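
To make the unpacking order concrete, here is a dependency-free sketch. The `run_command` argument is an injected stand-in that returns `(output, returncode)` as described above, and `fail` is a hypothetical callback playing the role of `ctx.fail()`; neither is the real `clicommon` API.

```python
def run_vtysh_command(cmd, fail, run_command):
    """Run `cmd` and report failures through `fail` (stand-in for ctx.fail)."""
    output, rc = run_command(cmd, return_cmd=True, ignore_error=True)
    if rc != 0:
        fail(output if output else
             "Command '{}' failed with return code {}".format(cmd, rc))
    return output


def swapped_is_failure(result):
    """Reproduce the earlier bug: the tuple is unpacked in the wrong order."""
    rc, output = result  # rc now holds the output string
    return rc != 0       # a string never equals 0, so this is always True


# Even a successful ("ok", 0) result is flagged as a failure when swapped.
print(swapped_is_failure(("ok", 0)))
```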

banidoru

All reviewers approved. LGTM.

tahmed-dev · 2026-04-20T17:04:04Z

The FRR-first-then-ConfigDB pattern is good for atomicity, but if the process crashes between the vtysh call and config_db.mod_entry(), FRR has the config but ConfigDB doesn't. On next boot, ConfigDB is authoritative and FRR won't have the sys-mac. Is there a reconciliation mechanism, or should ConfigDB be the source of truth with teammgr/frrcfgd pushing to FRR?

tahmed-dev · 2026-04-20T17:04:05Z

port_id_from_if_name extracts digits from the interface name (e.g., PortChannel0 → "0"). This becomes the es-id in FRR. If two interfaces have overlapping numeric suffixes across different naming conventions (e.g., PortChannel10 and Ethernet10), could this collide? Is the intent that ES is only supported on PortChannels?
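
The collision in question is easy to reproduce with the regex from the earlier diff. The function below follows the suggested None-returning version, with one small assumption added (`or None` for an all-underscore suffix); whether the value really ends up as the FRR es-id is taken from the comment above, not verified here.

```python
import re

PORT_ID_RE = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)')


def port_id_from_if_name(if_name):
    """Return the digits following the alphabetic prefix, or None."""
    match = PORT_ID_RE.search(if_name)
    if not match:
        return None
    # Strip underscores; treat an empty remainder as "no port ID".
    return match.group('port_id').replace('_', '') or None


# Different interface types with the same numeric suffix map to the same ID.
print(port_id_from_if_name("PortChannel10"))
print(port_id_from_if_name("Ethernet10"))
```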

tahmed-dev · 2026-04-20T17:04:07Z

The L2 NHG resolution in fdbshow does db.keys() + loop over all NEXT_HOP_GROUP_MEMBER entries to find members of a specific group. On a switch with many NHGs (L3 ECMP + L2 MH), this could be slow. Would it help to filter by the group OID first, or is this acceptable for a show command?

tahmed-dev · 2026-04-20T17:04:08Z

int(startup_delay) in range(0, 3601) creates a range object on every call. Minor, but for consistency with other SONiC CLI validators, could use 0 <= int(startup_delay) <= 3600 which is more readable and doesn't allocate.
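
A chained-comparison validator along these lines could look as follows. The constant names appear in the PR's error messages and the bounds come from the PR description; the handling of non-numeric input is an assumption of this sketch.

```python
EVPN_MH_STARTUP_DELAY_MIN = 0
EVPN_MH_STARTUP_DELAY_MAX = 3600


def is_valid_startup_delay(startup_delay):
    """Return True when the value parses as an int within the allowed range."""
    try:
        value = int(startup_delay)
    except (TypeError, ValueError):
        return False  # non-numeric or missing input is simply invalid
    return EVPN_MH_STARTUP_DELAY_MIN <= value <= EVPN_MH_STARTUP_DELAY_MAX
```

Using the named constants in the comparison also keeps the check in step with the range printed in the error message.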

radha-danda · 2026-05-24T16:59:11Z

@bdfriedman, please address the review comments and CI checks.

mssonicbld · 2026-05-26T17:26:38Z

/azp run

azure-pipelines · 2026-05-26T17:26:46Z

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

mssonicbld · 2026-05-26T21:21:25Z

/azp run

azure-pipelines · 2026-05-26T21:21:32Z

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

mssonicbld · 2026-05-26T23:04:16Z

/azp run

Rebased on top of sonic-net/sonic-utilities@f66ad703 to resolve merge conflict in doc/Command-Reference.md (show vlan brief). Squashes the original PR sonic-net#4247 commits into a single linear commit. Preserves both the new Static Anycast Gateway field description from the PR and the multi-ASIC namespace note added on master. Also fixes PR-introduced trailing whitespace and the 'show static-anycast_gateway' typo in doc/Command-Reference.md. Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>

azure-pipelines · 2026-05-26T23:04:27Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-05-26T23:47:20Z

/azp run

azure-pipelines · 2026-05-26T23:47:30Z

Azure Pipelines successfully started running 1 pipeline(s).

Chloe-Huang-13 · 2026-05-28T09:05:35Z

+        entry = config_db.get_entry(EVPN_MH_TABLE, 'default') or {}
+        entry['startup_delay'] = startup_delay
+        config_db.set_entry(EVPN_MH_TABLE, 'default', entry)
+    except ValueError as e:


Exception handling is incomplete.
This handler catches only ValueError, but when ValidatedConfigDBConnector runs with YANG validation enabled, set_entry() can also raise patch-related exceptions such as JsonPatchConflict. In that case, the CLI may expose an unhandled internal error instead of returning a clean ctx.fail(...) message.
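A minimal sketch of the broader handler this comment asks for. Everything except the read-modify-write pattern from the quoted diff is an assumption: `JsonPatchConflict` is declared locally as a stand-in for the patch exception the comment names, `FakeConfigDb` is a hypothetical in-memory replacement for the validated connector, `fail()` stands in for `ctx.fail(...)`, and the value of `EVPN_MH_TABLE` is made up for illustration.

```python
class JsonPatchConflict(Exception):
    """Local stand-in for the patch conflict exception named in the review."""


class CliFailure(Exception):
    """Stand-in for the error that ctx.fail() raises in a Click command."""


def fail(message):
    # Hypothetical replacement for ctx.fail(message).
    raise CliFailure(message)


class FakeConfigDb:
    """Hypothetical in-memory connector; can be told to raise on write."""

    def __init__(self, error=None):
        self.error = error
        self.tables = {}

    def get_entry(self, table, key):
        return dict(self.tables.get(table, {}).get(key, {}))

    def set_entry(self, table, key, data):
        if self.error is not None:
            raise self.error
        self.tables.setdefault(table, {})[key] = dict(data)


EVPN_MH_TABLE = "EVPN_MH_GLOBAL"  # assumed name, not taken from the PR


def set_startup_delay(config_db, startup_delay):
    try:
        # Get existing entry to preserve other fields
        entry = config_db.get_entry(EVPN_MH_TABLE, "default") or {}
        entry["startup_delay"] = startup_delay
        config_db.set_entry(EVPN_MH_TABLE, "default", entry)
    except (ValueError, JsonPatchConflict) as e:
        # Both validation and patch failures become one clean CLI error.
        fail(f"Failed to set startup delay: {e}")
```

With this shape, a write that raises either exception type surfaces as a single CLI failure message rather than a traceback. The same tuple could be reused by the MAC and neighbor hold-time setters.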

Chloe-Huang-13 · 2026-05-28T09:06:27Z

+        entry = config_db.get_entry(EVPN_MH_TABLE, 'default') or {}
+        entry['mac_holdtime'] = mac_holdtime
+        config_db.set_entry(EVPN_MH_TABLE, 'default', entry)
+    except ValueError as e:


Same as set_startup_delay: exception handling is incomplete.

Chloe-Huang-13 · 2026-05-28T09:06:53Z

+        entry = config_db.get_entry(EVPN_MH_TABLE, 'default') or {}
+        entry['neigh_holdtime'] = neigh_holdtime
+        config_db.set_entry(EVPN_MH_TABLE, 'default', entry)
+    except ValueError as e:


Same as set_startup_delay: exception handling is incomplete.

Chloe-Huang-13 · 2026-05-28T09:08:58Z

+                 f"Valid values are {EVPN_MH_STARTUP_DELAY_MIN}-{EVPN_MH_STARTUP_DELAY_MAX}.")
+
+    try:
+        # Get existing entry to preserve other fields


A regression test is missing for field preservation.
The current tests cover only the empty-table case and do not verify that other existing fields are preserved after the update.
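One possible shape for such a regression test, as a sketch only. `FakeConfigDb` is a hypothetical in-memory stand-in whose `set_entry` is assumed to replace the whole row, `set_mac_holdtime` here is a simplified copy of the read-modify-write pattern in the quoted diff rather than the PR's actual function, and the `EVPN_MH_TABLE` value is an assumption.

```python
EVPN_MH_TABLE = "EVPN_MH_GLOBAL"  # assumed name, not taken from the PR


class FakeConfigDb:
    """Hypothetical in-memory connector for the test."""

    def __init__(self, initial=None):
        self.tables = initial if initial is not None else {}

    def get_entry(self, table, key):
        return dict(self.tables.get(table, {}).get(key, {}))

    def set_entry(self, table, key, data):
        # Assumed semantics: the row is replaced, not merged.
        self.tables.setdefault(table, {})[key] = dict(data)


def set_mac_holdtime(config_db, mac_holdtime):
    # Read the existing row first so unrelated fields survive the write.
    entry = config_db.get_entry(EVPN_MH_TABLE, "default") or {}
    entry["mac_holdtime"] = str(mac_holdtime)
    config_db.set_entry(EVPN_MH_TABLE, "default", entry)


def test_mac_holdtime_preserves_other_fields():
    db = FakeConfigDb({
        EVPN_MH_TABLE: {
            "default": {"startup_delay": "300", "neigh_holdtime": "1080"},
        }
    })
    set_mac_holdtime(db, 500)
    assert db.get_entry(EVPN_MH_TABLE, "default") == {
        "startup_delay": "300",
        "neigh_holdtime": "1080",
        "mac_holdtime": "500",
    }
```

Because the fake replaces the row on write, the test fails if the command ever stops reading the existing entry first.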

Chloe-Huang-13 · 2026-05-28T09:18:43Z

        assert "Error: Invalid interface name Ethernet3" in result.output

+    @pytest.mark.skip(reason="Test is not working, needs investigation")
+    def test_add_remove_sys_mac_portchannel(self):


Test coverage is incomplete. The key failure paths are not covered, especially FRR command failure, CONFIG_DB write failure, and the TYPE_3_MAC_BASED branch.

Chloe-Huang-13 · 2026-05-28T09:28:00Z

+@click.pass_context
+@click.argument('interface_name', metavar='<interface_name>', required=True)
+@click.argument('esi_type', metavar='<esi_type>', required=True,
+                type=str, nargs=-1)


This command accepts extra arguments because nargs=-1 is used, but only the first token is actually parsed.
By contrast, the del command has a test for an error on extra parameters: test_del_extra_arg(self, cli_db_connection).
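A sketch of the stricter check, written without Click so it stays self-contained. It assumes the variadic argument reaches the command as a tuple of tokens, which is how an argument declared with nargs=-1 is delivered. `parse_single_token` is a hypothetical helper, and the ValueError would be turned into `ctx.fail(...)` by the caller.

```python
def parse_single_token(tokens, name="esi_type"):
    """Return the only token of a variadic argument, rejecting extras."""
    if len(tokens) == 0:
        raise ValueError(f"Missing argument <{name}>")
    if len(tokens) > 1:
        extra = " ".join(tokens[1:])
        raise ValueError(f"Got unexpected extra argument(s): {extra}")
    return tokens[0]
```

This would make the command behave like the del command under test_del_extra_arg, instead of silently dropping everything after the first token.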

Chloe-Huang-13 · 2026-05-28T09:37:45Z

+    if sag_entry:
+        mac_address = sag_entry.get('gateway_mac', 'unknown')
+        log.log_info(f"'static-anycast-gateway mac_address del {mac_address}' executing...")
+        db.cfgdb.mod_entry('SAG', 'GLOBAL', None)


This removes the whole SAG|GLOBAL entry, not just the gateway_mac field, which is risky if more SAG global fields are added later.
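A sketch of a field-level delete under stated assumptions: `FakeConfigDb` is a hypothetical in-memory stand-in whose `set_entry` replaces the row and deletes it when passed None, `del_sag_gateway_mac` is an illustrative name, and `future_field` in the usage note is invented to represent a field added later.

```python
class FakeConfigDb:
    """Hypothetical in-memory connector."""

    def __init__(self, initial=None):
        self.tables = initial if initial is not None else {}

    def get_entry(self, table, key):
        return dict(self.tables.get(table, {}).get(key, {}))

    def set_entry(self, table, key, data):
        rows = self.tables.setdefault(table, {})
        if data is None:
            rows.pop(key, None)  # None deletes the whole row
        else:
            rows[key] = dict(data)  # otherwise the row is replaced


def del_sag_gateway_mac(config_db):
    """Remove only gateway_mac; return False if it was not configured."""
    entry = config_db.get_entry("SAG", "GLOBAL")
    if "gateway_mac" not in entry:
        return False
    del entry["gateway_mac"]
    # Drop the row only when no other field is left in it.
    config_db.set_entry("SAG", "GLOBAL", entry or None)
    return True
```

With a row holding both `gateway_mac` and a hypothetical `future_field`, only `gateway_mac` disappears. With `gateway_mac` alone, the row is removed as it is today.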

Chloe-Huang-13 · 2026-05-28T09:46:23Z

+    if vlan_entry and vlan_entry.get('static_anycast_gateway') != 'true':
+        ctx.fail("static-anycast-gateway is already disabled")
+
+    db.cfgdb.mod_entry('VLAN_INTERFACE', vlan, {"static_anycast_gateway": "false"})


The disable logic is not fully consistent: an unset SAG state is treated differently from an explicitly disabled state.
A better fix is to define a single clear rule: only 'true' means enabled, and everything else means disabled.

    vlan_entry = db.cfgdb.get_entry('VLAN_INTERFACE', vlan) or {}
    if vlan_entry.get('static_anycast_gateway') != 'true':
        ctx.fail("static-anycast-gateway is already disabled")
    db.cfgdb.mod_entry('VLAN_INTERFACE', vlan, {"static_anycast_gateway": "false"})

Chloe-Huang-13 · 2026-05-28T09:56:11Z

+        # Validate ESI format (XX:XX:XX:XX:XX:XX:XX:XX:XX:XX - 10 hex bytes separated by colons)
+        esi_pattern = r'^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){9}$'
+        if not re.match(esi_pattern, es):
+            click.echo(f"Error: Invalid ESI format '{es}'. Expected format: XX:XX:XX:XX:XX:XX:XX:XX:XX:XX")


Invalid ESI/VNI input currently only prints an error and returns, so the command still exits with code 0.
It would be better to use ctx.fail(...) to make "illegal input" truly result in failure.
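A sketch of the difference this makes to the exit status. The regular expression is the one from the quoted diff. `InvalidInput` stands in for the error raised by `ctx.fail(...)`, and `main` is a hypothetical wrapper that maps it to a non-zero return code.

```python
import re

# Pattern copied from the quoted diff: 10 hex bytes separated by colons.
ESI_PATTERN = re.compile(r'^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){9}$')


class InvalidInput(Exception):
    """Stand-in for the error that ctx.fail() raises in a Click command."""


def validate_esi(es):
    if not ESI_PATTERN.match(es):
        raise InvalidInput(
            f"Invalid ESI format '{es}'. "
            "Expected format: XX:XX:XX:XX:XX:XX:XX:XX:XX:XX"
        )
    return es


def main(es):
    """Hypothetical entry point: returns the process exit code."""
    try:
        validate_esi(es)
    except InvalidInput as e:
        print(f"Error: {e}")
        return 1  # illegal input now results in failure
    return 0
```

Scripts that call the command can then rely on the exit code rather than parsing the printed message.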

Chloe-Huang-13 · 2026-05-28T10:07:22Z

+    body = []
+
+    sag_entry = db.cfgdb.get_entry('SAG', 'GLOBAL')
+    if sag_entry:


Inconsistent configuration is hidden: if VLAN interfaces have static_anycast_gateway=true but SAG|GLOBAL is missing, the command just prints an empty table instead of reporting the problem.

The better fix is to make the command state-aware instead of silently printing an empty table.

    enabled_vlans = [...]
    sag_mac = sag_entry.get('gateway_mac') if sag_entry else None
    if not sag_mac and not enabled_vlans:
        # not configured
    elif not sag_mac and enabled_vlans:
        # inconsistent
    else:
        # normal display
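The three states above can be sketched as a small, runnable classifier. This is an illustration only: `classify_sag_state` and its return labels are hypothetical, and VLAN_INTERFACE is assumed to be passed in as a plain dict keyed by VLAN name.

```python
def classify_sag_state(sag_entry, vlan_interfaces):
    """Return (state, enabled_vlans) for the show command to act on."""
    # Only the literal string 'true' counts as enabled.
    enabled_vlans = sorted(
        name
        for name, attrs in vlan_interfaces.items()
        if attrs.get("static_anycast_gateway") == "true"
    )
    sag_mac = sag_entry.get("gateway_mac") if sag_entry else None

    if not sag_mac and not enabled_vlans:
        return "not-configured", enabled_vlans
    if not sag_mac and enabled_vlans:
        return "inconsistent", enabled_vlans
    return "ok", enabled_vlans
```

The show command could then print a hint for "not-configured", a warning listing the affected VLANs for "inconsistent", and the normal table otherwise.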

This was referenced Feb 4, 2026

Address PR review comments for EVPN VxLAN Multihoming HLD sonic-net/SONiC#2195

Merged

Merged EVPN VxLAN MH HLD from Cisco and BCM sonic-net/SONiC#1702

Merged

selva-nexthop reviewed Feb 5, 2026

bdfriedman force-pushed the evpn_mh branch from b0ff760 to 4aa0242 on February 18, 2026 20:02

bdfriedman changed the title ~~Add EVPN command support and enhance VXLAN remote VTEP handling~~ [EVPN-MH] Add CLI commands for EVPN VXLAN Multihoming configuration Feb 25, 2026

lguohan requested a review from Copilot March 11, 2026 17:55

Copilot started reviewing on behalf of lguohan March 11, 2026 17:55

Copilot AI reviewed Mar 11, 2026

Chloe-Huang-13 reviewed Mar 12, 2026

banidoru approved these changes Apr 14, 2026

banidoru reviewed Apr 14, 2026

banidoru approved these changes Apr 17, 2026

banidoru reviewed Apr 17, 2026

banidoru approved these changes Apr 17, 2026

tahmed-dev force-pushed the evpn_mh branch from 885c863 to 81ec7b6 on May 26, 2026 21:21

tahmed-dev force-pushed the evpn_mh branch from 81ec7b6 to e2b45fa on May 26, 2026 23:04

tahmed-dev force-pushed the evpn_mh branch from e2b45fa to 19932b8 on May 26, 2026 23:47

tahmed-dev mentioned this pull request May 27, 2026

[EVPN-MH] Add standalone EVPN-MH code and tests sonic-net/sonic-swss#4608

Merged

Chloe-Huang-13 reviewed May 28, 2026


-def run_vtysh_command(cmd):
-    return clicommon.run_command(cmd)
+def run_vtysh_command(cmd, ctx=None):
+    """
+    Run a vtysh command.
+    If a Click context is provided, capture the return code/output and
+    surface failures via ctx.fail() instead of exiting abruptly.
+    When no context is provided, preserve the original behavior of
+    clicommon.run_command(), which may call sys.exit() on error.
+    """
+    if ctx is None:
+        # Preserve existing behavior for callers that don't use Click ctx
+        return clicommon.run_command(cmd)
+    rc, output = clicommon.run_command(cmd, return_cmd=True, ignore_error=True)
+    if rc != 0:
+        message = output if output else f"Command '{cmd}' failed with return code {rc}"
+        ctx.fail(message)
+    return output

		"""Show Ethernet Segment per EVI information"""
		""""show evpn es-evi <vni>"""

-    port_id_re = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)')
-    port_id = port_id_re.search(if_name)
-    port_id_group = port_id.group('port_id')
-    if port_id_group:
-        port_id = port_id_group.replace('_', '')
-    return port_id
+    """Extract the numeric port identifier from an interface name.
+    Returns a string of digits (with any underscores removed) if the
+    interface name matches the expected pattern, otherwise returns None.
+    """
+    port_id_re = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)')
+    match = port_id_re.search(if_name)
+    if not match:
+        return None
+    port_id_group = match.group('port_id')
+    if port_id_group:
+        return port_id_group.replace('_', '')
+    return None
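A short, runnable illustration of how the revised extraction behaves. The function name `extract_port_id` is hypothetical, since the quoted diff does not show the def line. The regular expression is the one from the diff, and the interface names in the usage note are examples only.

```python
import re

# Same pattern as in the quoted diff: letters followed by digits/underscores.
PORT_ID_RE = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)')


def extract_port_id(if_name):
    """Return the digits of the interface name, or None if it does not match."""
    match = PORT_ID_RE.search(if_name)
    if not match:
        return None
    # The group is non-empty whenever the pattern matches.
    return match.group('port_id').replace('_', '')
```

For example, `extract_port_id("PortChannel10")` gives `"10"`, `extract_port_id("PortChannel1_2")` gives `"12"`, and a name without a trailing number such as `"Loopback"` gives None instead of raising AttributeError as the original code would.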

		ctx.fail(f"EVPN Ethernet Segment {interface_name} - DF Preference {df_pref} is not valid. "
		         f"Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}.")

	from natsort import natsorted
	from swsscommon.swsscommon import SonicV2Connector, ConfigDBConnector
	from tabulate import tabulate

-	assert evpn_es_table[interface_name]['type'] == 'TYPE_3_MAC_BASED', f"Got ESI type {evpn_es_table[interface_name]['esi']}, expected 'TYPE_3_MAC_BASED'"
+	assert evpn_es_table[interface_name]['type'] == 'TYPE_3_MAC_BASED', f"Got ESI type {evpn_es_table[interface_name]['type']}, expected 'TYPE_3_MAC_BASED'"
