[EVPN-MH] Add CLI commands for EVPN VXLAN Multihoming configuration#4247
[EVPN-MH] Add CLI commands for EVPN VXLAN Multihoming configuration#4247bdfriedman wants to merge 1 commit into
Conversation
|
|
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| cmd += " {}".format(es_evi) | ||
|
|
||
| output = bgp_util.run_bgp_show_command(cmd) | ||
| print(output) |
There was a problem hiding this comment.
make it consistent to use click.echo instead of print.
There was a problem hiding this comment.
Fixed: now uses click.echo instead of print throughout show/evpn.py.
There was a problem hiding this comment.
Verified: all print() calls replaced with click.echo() in show/evpn.py. Fix looks good.
| """Show evpn Ethernet Segment per EVI information""" | ||
| cmd = "show evpn es-evi" | ||
|
|
||
| if es_evi is not None: |
There was a problem hiding this comment.
it can be just
if es_evi: # This checks if es_evi is not None AND not an empty string
cmd += " {}".format(es_evi)
There was a problem hiding this comment.
Fixed: now uses truthy check if vni: and if es: instead of is not None.
There was a problem hiding this comment.
Verified: truthy checks (if es:, if vni:) used instead of is not None. Looks good.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Adds SONiC CLI support for configuring and monitoring EVPN VXLAN Multihoming (EVPN-MH), including new config evpn-mh global settings, config interface EVPN-ES/DF preference commands, and a new show evpn command group that proxies FRR output.
Changes:
- Add
config evpn-mhcommands to configure global EVPN-MH timers in CONFIG_DB. - Add
config interface evpn-esi ...andconfig interface evpn-df-pref ...to configure Ethernet Segment parameters and DF preference, including FRR vtysh integration. - Add
show evpn ...commands that run FRRshow evpnvariants and display the output.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 17 comments.
Show a summary per file
| File | Description |
|---|---|
config/main.py |
Adds EVPN ES interface configuration commands, parsing/validation helpers, and FRR vtysh integration. |
config/evpn_mh.py |
Adds new config evpn-mh command group for EVPN-MH global timer configuration. |
show/main.py |
Registers the new show evpn command group. |
show/evpn.py |
Implements show evpn subcommands by proxying FRR show evpn ... output. |
tests/config_evpn_mh_test.py |
Adds unit tests for EVPN-MH global configuration commands. |
tests/config_int_evpn_test.py |
Adds unit tests for EVPN ES interface configuration and DF preference behavior. |
| def set_startup_delay(ctx, startup_delay=EVPN_MH_STARTUP_DELAY_DEFAULT): | ||
| """Add EVPN MH""" | ||
| config_db = ValidatedConfigDBConnector(ctx.obj['config_db']) | ||
| if not is_valid_startup_delay(startup_delay): |
There was a problem hiding this comment.
These handlers access ctx.obj['config_db'], but when invoked from the top-level config group ctx.obj is a utilities_common.db.Db instance (see config/main.py), not a dict. This will raise TypeError: 'Db' object is not subscriptable at runtime (even though unit tests pass a dict). Update the commands to use @clicommon.pass_db and db.cfgdb (or access ctx.obj.cfgdb) consistently across all evpn-mh subcommands.
There was a problem hiding this comment.
Fixed: commands now use @clicommon.pass_db decorator and access db.cfgdb instead of ctx.obj['config_db'].
There was a problem hiding this comment.
Verified: config/evpn_mh.py commands now use @clicommon.pass_db decorator and access db.cfgdb. Tests updated to pass Db() instance. Fix looks correct.
| def run_vtysh_command(cmd): | ||
| return clicommon.run_command(cmd) | ||
|
|
There was a problem hiding this comment.
run_vtysh_command() wraps clicommon.run_command(), which returns None on success and calls sys.exit() on failures. As a result, the subsequent if op: checks will never catch errors, and failures can abruptly exit without a Click-friendly error message. Consider using return_cmd=True (and ignore_error=True) to capture output/rc and then ctx.fail() on non-zero rc.
| def run_vtysh_command(cmd): | |
| return clicommon.run_command(cmd) | |
| def run_vtysh_command(cmd, ctx=None): | |
| """ | |
| Run a vtysh command. | |
| If a Click context is provided, capture the return code/output and | |
| surface failures via ctx.fail() instead of exiting abruptly. | |
| When no context is provided, preserve the original behavior of | |
| clicommon.run_command(), which may call sys.exit() on error. | |
| """ | |
| if ctx is None: | |
| # Preserve existing behavior for callers that don't use Click ctx | |
| return clicommon.run_command(cmd) | |
| rc, output = clicommon.run_command(cmd, return_cmd=True, ignore_error=True) | |
| if rc != 0: | |
| message = output if output else f"Command '{cmd}' failed with return code {rc}" | |
| ctx.fail(message) | |
| return output |
There was a problem hiding this comment.
Fixed: run_vtysh_command() now accepts an optional ctx parameter, captures rc/output with return_cmd=True, and calls ctx.fail() on non-zero rc.
There was a problem hiding this comment.
Verified: run_vtysh_command() now accepts optional ctx, uses return_cmd=True and ignore_error=True, and calls ctx.fail() on non-zero rc. All callers updated.
| """Show Ethernet Segment per EVI information""" | ||
| """"show evpn es-evi <vni>""" |
There was a problem hiding this comment.
There is a stray standalone string literal (""""show evpn es-evi <vni>""") under es_evi(). It isn’t a docstring/comment (because the function already has a docstring) and is easy to misread; remove it or merge it into the actual docstring.
| """Show Ethernet Segment per EVI information""" | |
| """"show evpn es-evi <vni>""" | |
| """Show Ethernet Segment per EVI information (show evpn es-evi <vni>)""" |
There was a problem hiding this comment.
Fixed: stray standalone string literal removed and merged into the actual docstring.
There was a problem hiding this comment.
Verified: stray string literal removed, merged into the docstring. Looks good.
| def check_if_same_manual_esi_exists(ctx, esi_args, es_data): | ||
| if esi_args['type'] == 'TYPE_0_OPERATOR_CONFIGURED': | ||
| for es_intf_name, es_intf_data in es_data.items(): | ||
| if esi_args['esi'] == es_intf_data['esi']: | ||
| ctx.fail(f"The ESI '{esi_args['esi']}' is already in use by '{es_intf_name}'") |
There was a problem hiding this comment.
check_if_same_manual_esi_exists() compares ESI strings with exact case. Since manual ESIs are hex, the same ESI can be configured twice using different casing (e.g., ...:ff vs ...:FF), bypassing the duplicate check. Normalize ESIs (e.g., lower-case) before storing and when comparing to enforce true uniqueness.
There was a problem hiding this comment.
Fixed: ESI strings are now normalized to lowercase before storing (esi_args['esi'] = esi_input_strs[0].lower()) and compared case-insensitively.
There was a problem hiding this comment.
Verified: ESI strings normalized to lowercase before storing and compared case-insensitively. Fix looks correct.
| port_id_re = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)') | ||
| port_id = port_id_re.search(if_name) | ||
| port_id_group = port_id.group('port_id') | ||
|
|
||
| if port_id_group: | ||
| port_id = port_id_group.replace('_', '') | ||
|
|
||
| return port_id |
There was a problem hiding this comment.
port_id_from_if_name() assumes the regex always matches and immediately calls .group('port_id'). If a user provides an interface name without digits (or an unexpected format), this will raise an AttributeError and crash the CLI. Handle a non-match explicitly (e.g., return None/raise a ValueError) and have the caller ctx.fail() with a clear message.
| port_id_re = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)') | |
| port_id = port_id_re.search(if_name) | |
| port_id_group = port_id.group('port_id') | |
| if port_id_group: | |
| port_id = port_id_group.replace('_', '') | |
| return port_id | |
| """Extract the numeric port identifier from an interface name. | |
| Returns a string of digits (with any underscores removed) if the | |
| interface name matches the expected pattern, otherwise returns None. | |
| """ | |
| port_id_re = re.compile(r'[a-zA-Z]+(?P<port_id>[0-9_]+)') | |
| match = port_id_re.search(if_name) | |
| if not match: | |
| return None | |
| port_id_group = match.group('port_id') | |
| if port_id_group: | |
| return port_id_group.replace('_', '') | |
| return None |
There was a problem hiding this comment.
Fixed: port_id_from_if_name() now returns None on non-match, and callers check for None with ctx.fail().
There was a problem hiding this comment.
Verified: port_id_from_if_name() now returns None on non-match, and callers check for None with ctx.fail(). Looks good.
| ctx.fail(f"EVPN Ethernet Segment {interface_name} - DF Preference {df_pref} is not valid. " \ | ||
| "Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}.") |
There was a problem hiding this comment.
The validation error message for an invalid DF preference is missing the f prefix on the second line, so {EVPN_ES_DF_PREF_MIN} / {EVPN_ES_DF_PREF_MAX} will be printed literally. Make the entire message an f-string (or build it in one f-string) so the numeric range is displayed correctly.
| ctx.fail(f"EVPN Ethernet Segment {interface_name} - DF Preference {df_pref} is not valid. " \ | |
| "Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}.") | |
| ctx.fail( | |
| f"EVPN Ethernet Segment {interface_name} - DF Preference {df_pref} is not valid. " | |
| f"Valid values are {EVPN_ES_DF_PREF_MIN}-{EVPN_ES_DF_PREF_MAX}." | |
| ) |
There was a problem hiding this comment.
Fixed: the second line now has the f prefix so {EVPN_ES_DF_PREF_MIN}/{EVPN_ES_DF_PREF_MAX} are interpolated correctly.
There was a problem hiding this comment.
Verified: f-string prefix added to the second line so {EVPN_ES_DF_PREF_MIN}/{EVPN_ES_DF_PREF_MAX} are interpolated correctly. Fix looks good.
| try: | ||
| config_db.set_entry(EVPN_ES_TABLE, interface_name, None) | ||
|
|
||
| # Update frr | ||
| cmd = ['sudo', 'vtysh', '-c', 'configure terminal', '-c', 'interface {}'.format(interface_name)] | ||
| cmd.append('-c') | ||
| cmd.append('no evpn mh es-sys-mac') | ||
| cmd.append('-c') | ||
| cmd.append('no evpn mh es-df-pref') | ||
| cmd.append('-c') | ||
| cmd.append('no evpn mh es-id') | ||
| op = run_vtysh_command(cmd) | ||
| if op: | ||
| ctx.fail("VTYSh config failed. Error: {}".format(op)) | ||
| except JsonPatchConflict as e: |
There was a problem hiding this comment.
The EVPN ES entry is deleted from CONFIG_DB before the vtysh cleanup is executed. If vtysh fails, FRR may keep stale configuration while CONFIG_DB indicates it was removed. Consider doing FRR cleanup first or re-adding the CONFIG_DB entry on failure so the system remains consistent.
There was a problem hiding this comment.
Fixed: FRR cleanup via vtysh is now performed before deleting from CONFIG_DB.
There was a problem hiding this comment.
Verified: FRR cleanup via vtysh now performed before CONFIG_DB delete in del_evpn_es. Fix looks correct.
| try: | ||
| config_db.mod_entry(EVPN_ES_TABLE, interface_name, {'df_pref': int(df_pref)}) | ||
|
|
||
| # Update frr | ||
| cmd = ['sudo', 'vtysh', '-c', 'configure terminal', '-c', 'interface {}'.format(interface_name)] | ||
| cmd.append('-c') | ||
| cmd.append('evpn mh es-df-pref {}'.format(int(df_pref))) | ||
| op = run_vtysh_command(cmd) | ||
| if op: | ||
| ctx.fail("VTYSh config failed. Error: {}".format(op)) |
There was a problem hiding this comment.
mod_entry() is applied before the vtysh update. If the FRR update fails, CONFIG_DB will reflect the new DF preference while FRR does not. Consider applying vtysh first or rolling back the CONFIG_DB update when the vtysh call fails.
There was a problem hiding this comment.
Fixed: vtysh update is now applied before mod_entry, and df_pref is stored as str(df_pref) instead of int(df_pref).
There was a problem hiding this comment.
Verified: vtysh update applied before mod_entry, and df_pref stored as str(). Fix looks correct.
| from natsort import natsorted | ||
| from swsscommon.swsscommon import SonicV2Connector, ConfigDBConnector | ||
| from tabulate import tabulate |
There was a problem hiding this comment.
show/evpn.py imports several modules that are never used in this file (natsorted, SonicV2Connector, ConfigDBConnector, tabulate). Remove unused imports to avoid confusion and keep dependencies minimal.
| from natsort import natsorted | |
| from swsscommon.swsscommon import SonicV2Connector, ConfigDBConnector | |
| from tabulate import tabulate |
There was a problem hiding this comment.
Fixed: unused imports (natsorted, SonicV2Connector, ConfigDBConnector, tabulate) removed. re import added for validation.
There was a problem hiding this comment.
Verified: unused imports removed (natsorted, SonicV2Connector, ConfigDBConnector, tabulate), re import added. Looks good.
| evpn_es_table = db['config_db'].get_table(EVPN_ES_TABLE) | ||
| assert interface_name in evpn_es_table, f"'{interface_name}' not found in {EVPN_ES_TABLE}" | ||
| assert evpn_es_table[interface_name]['esi'] == 'AUTO', f"Got ESI {evpn_es_table[interface_name]['esi']}, expected 'AUTO'" | ||
| assert evpn_es_table[interface_name]['type'] == 'TYPE_3_MAC_BASED', f"Got ESI type {evpn_es_table[interface_name]['esi']}, expected 'TYPE_3_MAC_BASED'" |
There was a problem hiding this comment.
In configure_mac_esi(), the assertion message says “Got ESI type …” but formats evpn_es_table[interface_name]['esi'] instead of the 'type' field. This makes failures misleading; use the 'type' value in the error message.
| assert evpn_es_table[interface_name]['type'] == 'TYPE_3_MAC_BASED', f"Got ESI type {evpn_es_table[interface_name]['esi']}, expected 'TYPE_3_MAC_BASED'" | |
| assert evpn_es_table[interface_name]['type'] == 'TYPE_3_MAC_BASED', f"Got ESI type {evpn_es_table[interface_name]['type']}, expected 'TYPE_3_MAC_BASED'" |
There was a problem hiding this comment.
Fixed: assertion message now correctly references evpn_es_table[interface_name]['type'] instead of ['esi'].
There was a problem hiding this comment.
Verified: assertion message now references ["type"] instead of ["esi"]. Fix looks correct.
| @click.argument('startup_delay', metavar='<startup_delay>', required=True) | ||
| @click.pass_context | ||
| def set_startup_delay(ctx, startup_delay=EVPN_MH_STARTUP_DELAY_DEFAULT): | ||
| """Add EVPN MH""" |
There was a problem hiding this comment.
Perhaps using "Set EVPN MH startup delay time in seconds" would be better.
set_startup_delay、set_mac_holdtime and set_neigh_holdtime are all "Add EVPN MH".
There was a problem hiding this comment.
Fixed: docstrings updated to be more descriptive - 'Set EVPN MH startup delay time in seconds', 'Set EVPN MH MAC holdtime in seconds', 'Set EVPN MH neighbor holdtime in seconds'.
There was a problem hiding this comment.
Verified: docstrings updated to descriptive text. Looks good.
| f"Valid values are {EVPN_MH_STARTUP_DELAY_MIN}-{EVPN_MH_STARTUP_DELAY_MAX}.") | ||
|
|
||
| try: | ||
| config_db.set_entry(EVPN_MH_TABLE, 'default', {'startup_delay': startup_delay}) |
There was a problem hiding this comment.
The "set_entry" will overwrite all existing fields, so other fields under 'default' will be cleared. e.g. mac_holdtime
Recommanded: get old entry, update only the needed ones, and then set the entry to avoid lossing existing data.
There was a problem hiding this comment.
Fixed: now does get_entry first to preserve existing fields, then updates only the needed field, then set_entry.
There was a problem hiding this comment.
Verified: set_startup_delay now does get_entry first to preserve existing fields, then updates only the needed field. Fix looks correct.
| f"Valid values are {EVPN_MH_MAC_HOLDTIME_MIN}-{EVPN_MH_MAC_HOLDTIME_MAX}.") | ||
|
|
||
| try: | ||
| config_db.set_entry(EVPN_MH_TABLE, 'default', {'mac_holdtime': mac_holdtime}) |
There was a problem hiding this comment.
The "set_entry" will overwrite all existing fields, so other fields under 'default' will be cleared.
Recommanded: get old entry, update only the needed ones, and then set the entry to avoid lossing existing data.
There was a problem hiding this comment.
Fixed: same pattern applied - get existing entry, update field, set entry.
There was a problem hiding this comment.
Verified: same get-then-update pattern applied to set_mac_holdtime. Fix looks correct.
| f"Valid values are {EVPN_MH_NEIGH_HOLDTIME_MIN}-{EVPN_MH_NEIGH_HOLDTIME_MAX}.") | ||
|
|
||
| try: | ||
| config_db.set_entry(EVPN_MH_TABLE, 'default', {'neigh_holdtime': neigh_holdtime}) |
There was a problem hiding this comment.
The "set_entry" will overwrite all existing fields, so other fields under 'default' will be cleared. e.g. mac_holdtime
Recommanded: get old entry, update only the needed ones, and then set the entry to avoid lossing existing data.
There was a problem hiding this comment.
Fixed: same pattern applied - get existing entry, update field, set entry.
There was a problem hiding this comment.
Verified: same get-then-update pattern applied to set_neigh_holdtime. Fix looks correct.
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
banidoru
left a comment
There was a problem hiding this comment.
- All prior review concerns addressed in new commits — threads resolved
- FRR-before-ConfigDB ordering, input validation, ESI case normalization, dead code removal all look good
es_evidetail subcommand remains unreachable due to positionalvniarg conflicting with subcommand routing (existing thread left open)- New VNI/ESI input validation is a solid improvement over passing raw strings to vtysh
- No new issues found in the delta diff
|
The concern about the unreachable 'detail' subcommand is still valid. The 'es_evi' group's use of 'invoke_without_command=True' with a positional 'vni' argument prevents the 'detail' subcommand from being called. |
banidoru
left a comment
There was a problem hiding this comment.
- Most previous comments have been addressed.
- The subcommand remains unreachable.
- Unable to resolve threads due to permissions.
banidoru
left a comment
There was a problem hiding this comment.
Re-review of commit 68af379. All 30+ prior review threads addressed. Key improvements:
- FRR-before-CONFIG_DB ordering consistently applied across all commands
run_vtysh_commandproperly refactored with ctx-based error handling- Input validation added (ESI format, VNI range, df_pref type safety)
- Test fixtures updated for new
@clicommon.pass_dbpattern - Bug fixes:
del_macNameError, bare excepts, shadowed names, typos
One minor nit below. Overall looking solid.
| if evpn_es_tbl and 'type' in evpn_es_tbl and evpn_es_tbl['type'] == 'TYPE_3_MAC_BASED': | ||
| cmd.append('-c') | ||
| cmd.append('no evpn mh es-sys-mac') | ||
| run_vtysh_command(cmd, ctx) |
There was a problem hiding this comment.
Nit: run_vtysh_command(cmd, ctx) is now called unconditionally, but the only meaningful vtysh sub-commands (no evpn mh es-sys-mac) are only appended when the ES entry is TYPE_3_MAC_BASED. When it's not, this makes a no-op sudo vtysh -c 'configure terminal' -c 'interface ...' call. In the previous code, vtysh was only invoked inside the if evpn_es_tbl... block for the delete path. Consider guarding the vtysh call similarly to avoid unnecessary sudo invocations:
if evpn_es_tbl and 'type' in evpn_es_tbl and evpn_es_tbl['type'] == 'TYPE_3_MAC_BASED':
cmd.append('-c')
cmd.append('no evpn mh es-sys-mac')
run_vtysh_command(cmd, ctx)Not a functional issue, just unnecessary overhead.
There was a problem hiding this comment.
Still open — not addressed in the latest commit. Low priority nit, can be deferred.
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
banidoru
left a comment
There was a problem hiding this comment.
- Good fix: corrected
run_command()return value destructuring fromrc, outputtooutput, rc— this was a real bug that would have swapped error code and output string - All 30 prior review threads have been addressed with appropriate fixes
- Still open (minor): no-op vtysh call in
del_evpn_esfor non-MAC-based ES types — unnecessary sudo invocation (nit, not a blocker) - Still open (design):
show evpn es-evi detailsubcommand remains unreachable due to positionalvniarg consuming "detail" — tracked separately - No new issues found in this diff. LGTM.
banidoru
left a comment
There was a problem hiding this comment.
- New commit fixes critical bug:
run_vtysh_commandhad swapped return values (rc, output→output, rc), matchingclicommon.run_command's actual signature. Good catch. - All 30 prior review threads addressed — fixes verified in previous iteration. Unable to resolve threads due to permissions.
- One open nit remains: unnecessary vtysh invocation in
del_evpn_es/del_pc_sys_id_macwhen ES type is not TYPE_3_MAC_BASED (low priority). - No new issues found in this single-line diff.
| if ctx is None: | ||
| # Preserve existing behavior for callers that don't use Click ctx | ||
| return clicommon.run_command(cmd) | ||
| output, rc = clicommon.run_command(cmd, return_cmd=True, ignore_error=True) |
There was a problem hiding this comment.
Good fix — clicommon.run_command() returns (output, returncode), not (returncode, output). The previous code would have treated the output string as rc (always truthy → always failing) and the integer return code as the error message. This was a runtime-breaking bug for any vtysh failure path.
|
The FRR-first-then-ConfigDB pattern is good for atomicity, but if the process crashes between the |
|
|
|
The L2 NHG resolution in |
|
|
|
@bdfriedman, Please address comments, CI checks |
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
|
/azp run |
Rebased on top of sonic-net/sonic-utilities@f66ad703 to resolve merge conflict in doc/Command-Reference.md (show vlan brief). Squashes the original PR sonic-net#4247 commits into a single linear commit. Preserves both the new Static Anycast Gateway field description from the PR and the multi-ASIC namespace note added on master. Also fixes PR-introduced trailing whitespace and the 'show static-anycast_gateway' typo in doc/Command-Reference.md. Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>
|
Azure Pipelines successfully started running 1 pipeline(s). |
Rebased on top of sonic-net/sonic-utilities@f66ad703 to resolve merge conflict in doc/Command-Reference.md (show vlan brief). Squashes the original PR sonic-net#4247 commits into a single linear commit. Preserves both the new Static Anycast Gateway field description from the PR and the multi-ASIC namespace note added on master. Also fixes PR-introduced trailing whitespace and the 'show static-anycast_gateway' typo in doc/Command-Reference.md. Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| entry = config_db.get_entry(EVPN_MH_TABLE, 'default') or {} | ||
| entry['startup_delay'] = startup_delay | ||
| config_db.set_entry(EVPN_MH_TABLE, 'default', entry) | ||
| except ValueError as e: |
There was a problem hiding this comment.
Exception handling is incomplete.
This handler catches only ValueError, but when ValidatedConfigDBConnector runs with YANG validation enabled, set_entry() can also raise patch-related exceptions such as JsonPatchConflict. In that case, the CLI may expose an unhandled internal error instead of returning a clean ctx.fail(...) message.
| entry = config_db.get_entry(EVPN_MH_TABLE, 'default') or {} | ||
| entry['mac_holdtime'] = mac_holdtime | ||
| config_db.set_entry(EVPN_MH_TABLE, 'default', entry) | ||
| except ValueError as e: |
There was a problem hiding this comment.
same as set_startup_delay, exception handling is incomplete.
| entry = config_db.get_entry(EVPN_MH_TABLE, 'default') or {} | ||
| entry['neigh_holdtime'] = neigh_holdtime | ||
| config_db.set_entry(EVPN_MH_TABLE, 'default', entry) | ||
| except ValueError as e: |
There was a problem hiding this comment.
same as set_startup_delay, exception handling is incomplete.
| f"Valid values are {EVPN_MH_STARTUP_DELAY_MIN}-{EVPN_MH_STARTUP_DELAY_MAX}.") | ||
|
|
||
| try: | ||
| # Get existing entry to preserve other fields |
There was a problem hiding this comment.
A regression test is missing for field preservation.
the current tests only cover the empty-table case and do not verify that the other existing fields are preserved after the update.
| assert "Error: Invalid interface name Ethernet3" in result.output | ||
|
|
||
| @pytest.mark.skip(reason="Test is not working, needs investigation") | ||
| def test_add_remove_sys_mac_portchannel(self): |
There was a problem hiding this comment.
Test coverage is incomplete. The key failure paths are not covered, especially FRR command failure, CONFIG_DB write failure, and the TYPE_3_MAC_BASED branch.
| @click.pass_context | ||
| @click.argument('interface_name', metavar='<interface_name>', required=True) | ||
| @click.argument('esi_type', metavar='<esi_type>', required=True, | ||
| type=str, nargs=-1) |
There was a problem hiding this comment.
accepts extra arguments because nargs=-1 is used, but only the first token is actually parsed.
However, the del command instead has a test for "error due to extra parameters". --- test_del_extra_arg(self, cli_db_connection):
| if sag_entry: | ||
| mac_address = sag_entry.get('gateway_mac', 'unknown') | ||
| log.log_info(f"'static-anycast-gateway mac_address del {mac_address}' executing...") | ||
| db.cfgdb.mod_entry('SAG', 'GLOBAL', None) |
There was a problem hiding this comment.
removes the whole SAG|GLOBAL entry, not just the gateway_mac field, which is risky if more SAG global fields are added later.
| if vlan_entry and vlan_entry.get('static_anycast_gateway') != 'true': | ||
| ctx.fail("static-anycast-gateway is already disabled") | ||
|
|
||
| db.cfgdb.mod_entry('VLAN_INTERFACE', vlan, {"static_anycast_gateway": "false"}) |
There was a problem hiding this comment.
The disable logic is not fully consistent: an unset SAG state is treated differently from an explicitly disabled state.
A better fix is to define a single clear rule: only 'true' means enabled, and everything else means disabled.
vlan_entry = db.cfgdb.get_entry('VLAN_INTERFACE', vlan) or {}
if vlan_entry.get('static_anycast_gateway') != 'true':
ctx.fail("static-anycast-gateway is already disabled")
db.cfgdb.mod_entry('VLAN_INTERFACE', vlan, {"static_anycast_gateway": "false"})
| # Validate ESI format (XX:XX:XX:XX:XX:XX:XX:XX:XX:XX - 10 hex bytes separated by colons) | ||
| esi_pattern = r'^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){9}$' | ||
| if not re.match(esi_pattern, es): | ||
| click.echo(f"Error: Invalid ESI format '{es}'. Expected format: XX:XX:XX:XX:XX:XX:XX:XX:XX:XX") |
There was a problem hiding this comment.
Invalid ESI/VNI input currently only prints an error and returns, so the command still exits with code 0.
It would be better to use ctx.fail(...) to make "illegal input" truly result in failure.
| body = [] | ||
|
|
||
| sag_entry = db.cfgdb.get_entry('SAG', 'GLOBAL') | ||
| if sag_entry: |
There was a problem hiding this comment.
inconsistent configuration is hidden: if VLAN interfaces have static_anycast_gateway=true but SAG|GLOBAL is missing, the command just prints an empty table instead of showing a clear problem.
The better fix is to make the command state-aware instead of silently printing an empty table.
enabled_vlans = [...]
sag_mac = sag_entry.get('gateway_mac') if sag_entry else None
if not sag_mac and not enabled_vlans:
# not configured
elif not sag_mac and enabled_vlans:
# inconsistent
else:
# normal display
Why I did it
This PR adds CLI commands to configure and monitor EVPN VXLAN Multihoming (EVPN-MH) feature in SONiC. These commands enable network operators to:
This complements the EVPN-MH infrastructure added in sonic-swss and sonic-swss-common.
Work item tracking
How I did it
Global EVPN-MH Configuration (evpn_mh.py):
config evpn-mh startup-delay <0-3600>- Configure EVPN-MH startup delay (default: 300s)config evpn-mh mac-holdtime <0-86400>- Configure MAC hold time (default: 1080s)config evpn-mh neigh-holdtime <0-86400>- Configure neighbor hold time (default: 1080s)Interface-level EVPN-ES Configuration (main.py):
4.
config interface evpn-esi add <interface> <esi>- Add EVPN Ethernet Segment Identifier00:XX:XX:XX:XX:XX:XX:XX:XX:XX- Operator-configured 10-byte ESIauto-system-mac- Auto-generated from system MACsystem_macfrom PORTCHANNEL table if configuredEVPN_ETHERNET_SEGMENTtableconfig interface evpn-esi del <interface>- Remove EVPN Ethernet Segment configurationconfig interface evpn-df-pref <interface> <1-65535>- Set DF preferenceShow Commands (evpn.py):
7.
show evpn- Display general EVPN information8.
show evpn es [<esi>]- Display Ethernet Segment informationshow evpn es-evi [<vni>]- Display Ethernet Segment per EVI informationshow evpn es-evi detail- Show detailed ES-EVI informationshow evpn l2-nh- Display L2 nexthop groups for EVPN all-active ESAll show commands leverage FRR's
show evpncommands viabgp_util.run_bgp_show_command().Testing (210+ test cases):
config_evpn_mh_test.py - Global EVPN-MH configuration tests
config_int_evpn_test.py - Interface EVPN-ES configuration tests
Helper Functions:
is_reserved_esi()- Validates against reserved ESI valuesparse_esi_input()- Parses and validates ESI input (Type 0 vs Type 3)port_id_from_if_name()- Extracts numeric port ID from interface namecheck_if_same_manual_esi_exists()- Prevents duplicate manual ESI configurationrun_vtysh_command()- Wraps vtysh execution for FRR integrationFiles Changed:
Total: +711 lines across 6 files
How to verify it
Global EVPN-MH Configuration:
Interface EVPN-ES Configuration:
Show Commands:
# Display EVPN information show evpn show evpn es show evpn es 00:01:02:03:04:05:06:07:08:09 show evpn es-evi show evpn es-evi 1000 show evpn es-evi detail show evpn l2-nhRun Unit Tests:
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Add CLI commands for EVPN VXLAN Multihoming: global EVPN-MH configuration, interface Ethernet Segment (ESI) configuration, DF preference settings, and EVPN show commands
Link to config_db schema for YANG model changes
Updates required for:
New table:
EVPN_MH_GLOBAL- Global EVPN-MH configurationstartup_delay: 0-3600 (default: 300)mac_holdtime: 0-86400 (default: 1080)neigh_holdtime: 0-86400 (default: 1080)New table:
EVPN_ETHERNET_SEGMENT- Per-interface ES configurationesi: 10-byte ESI string (e.g., "00:01:02:03:04:05:06:07:08:09") or "AUTO"type: "TYPE_0_OPERATOR_CONFIGURED" or "TYPE_3_MAC_BASED"df_pref: 1-65535 (default: 32767)YANG model updates should be submitted to sonic-yang-models repository.
Depends on
Key Features:
✅ Global EVPN-MH parameter configuration
✅ Type 0 (manual) and Type 3 (MAC-based) ESI support
✅ DF preference configuration for load balancing
✅ FRR integration via vtysh
✅ CONFIG_DB persistence
✅ Comprehensive validation and error handling
✅ 210+ unit test cases
✅ Show commands for monitoring EVPN state