Adjust select timeouts during port update handling to allow for faster transceiver DOM polling #758
Open
aditya-nexthop wants to merge 10 commits into sonic-net:master
Handle port updates before starting the DOM polling loop, using a 1 sec timeout. When checking for port updates during DOM polling, reduce the timeout to 100 msec. This allows DOM polling to complete in a reasonable amount of time. Signed-off-by: aditya-nexthop <aditya@nexthop.ai>
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Signed-off-by: aditya-nexthop <aditya@nexthop.ai>
Contributor
Pull request overview
This PR improves xcvrd DOM polling responsiveness by reducing time spent blocking on port-update events, so full-port DOM polling can complete closer to the intended periodic interval.
Changes:
- Add a configurable select timeout to `PortChangeObserver.handle_port_update_event()` so callers can use shorter waits during DOM polling.
- Refactor DOM loop port-update handling into a reusable `check_port_update()` helper and introduce a two-tier timeout strategy (slow/fast).
- Add unit tests for the new `check_port_update()` behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| sonic-xcvrd/xcvrd/xcvrd_utilities/port_event_helper.py | Makes port-update event selection timeout configurable (enables fast/slow modes). |
| sonic-xcvrd/xcvrd/dom/dom_mgr.py | Restructures the DOM update loop and introduces two timeout constants plus check_port_update(). |
| sonic-xcvrd/tests/test_xcvrd.py | Adds unit coverage for DomInfoUpdateTask.check_port_update(). |
prgeor reviewed Mar 10, 2026
Signed-off-by: aditya-nexthop <aditya@nexthop.ai> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: aditya-nexthop <aditya@nexthop.ai> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
5358dbb to e5c5a2f
Contributor
Signed-off-by: aditya-nexthop <aditya@nexthop.ai>
Signed-off-by: aditya-nexthop <aditya@nexthop.ai>
Signed-off-by: aditya-nexthop <aditya@nexthop.ai>
mihirpat1 reviewed Mar 12, 2026
Signed-off-by: aditya-nexthop <aditya@nexthop.ai>
ba1e526 to 673d7c7
…op start time. Signed-off-by: aditya-nexthop <aditya@nexthop.ai>
mihirpat1 approved these changes Mar 13, 2026
Description
This PR optimizes the DOM (Digital Optical Monitoring) polling loop in xcvrd by improving port update event handling and reducing unnecessary wait times. The changes include:
- `check_port_update()` method for better code organization and reusability
Fixes #759 and builds on #757.
Motivation and Context
Before the change, the DOM monitoring loop would wait up to 1 second (`SELECT_TIMEOUT_MSECS`) for port update events during each iteration of the physical port loop. This caused significant delays in DOM data collection, especially on systems with many ports.
Problem: Because the 1-second wait could be incurred once for every physical port, a full DOM polling pass could take an excessive amount of time to complete, delaying DOM updates.
Solution: By separating port update handling from DOM polling and using a shorter 100ms timeout during the polling phase, the loop can complete much faster while still being responsive to port change events. The 1-second timeout is only used when explicitly waiting for port updates before starting the next DOM polling cycle.
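As a rough illustration of the two-tier strategy, the sketch below compares the worst-case time spent blocking on port-update checks during one polling pass. The two constant names come from this PR's description; `FakeObserver`, `poll_dom_once`, `worst_case_wait_ms`, the `timeout_ms` parameter name, and the 32-port count are hypothetical and are not taken from the xcvrd code. The fake observer records the requested timeout instead of actually blocking.

```python
# Constant names as listed in the PR description.
PORT_UPDATE_EVENT_SELECT_TIMEOUT_MSECS = 1000      # slow wait, between polling cycles
PORT_UPDATE_EVENT_SELECT_TIMEOUT_FAST_MSECS = 100  # fast wait, during DOM polling


class FakeObserver:
    """Hypothetical stand-in for PortChangeObserver; records requested timeouts."""

    def __init__(self):
        self.requested_timeouts_ms = []

    def handle_port_update_event(self, timeout_ms):
        # A real observer would block in select() for up to timeout_ms.
        # Here we only record the worst-case wait and report "no event".
        self.requested_timeouts_ms.append(timeout_ms)
        return False


def poll_dom_once(observer, physical_ports, per_port_timeout_ms):
    """Visit every port once, checking for port updates before each one."""
    polled = []
    for port in physical_ports:
        observer.handle_port_update_event(per_port_timeout_ms)
        polled.append(port)  # placeholder for reading DOM data from the port
    return polled


def worst_case_wait_ms(num_ports, per_port_timeout_ms):
    """Sum of the select timeouts requested during one full polling pass."""
    observer = FakeObserver()
    poll_dom_once(observer, range(num_ports), per_port_timeout_ms)
    return sum(observer.requested_timeouts_ms)


# Assumed port count, chosen only for illustration.
old_wait = worst_case_wait_ms(32, PORT_UPDATE_EVENT_SELECT_TIMEOUT_MSECS)
new_wait = worst_case_wait_ms(32, PORT_UPDATE_EVENT_SELECT_TIMEOUT_FAST_MSECS)
print(old_wait, new_wait)
```

Under these assumptions, the per-pass blocking budget drops by a factor of ten; the slow timeout is then reserved for the idle period between passes.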
How Has This Been Tested?
Unit Tests Added - Comprehensive test coverage for the new `check_port_update()` method.
CPU usage profiling: Measured for 10 minutes after restarting `xcvrd` on a switch fully populated with optical transceivers.
The CPU usage is slightly higher during active polling because the loop spends less time waiting (100ms) between interfaces; it then waits for port change updates in 1s chunks.
Additional Information (Optional)
Key Technical Changes:
- Two timeout constants: `PORT_UPDATE_EVENT_SELECT_TIMEOUT_MSECS` (1000ms) and `PORT_UPDATE_EVENT_SELECT_TIMEOUT_FAST_MSECS` (100ms)
- `PortChangeObserver.handle_port_update_event()` now accepts a configurable timeout parameter
- `dom_loop_start_time` is used to maintain consistent intervals regardless of loop execution time
Backward Compatibility: This change is fully backward compatible and does not affect the external API or configuration.
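One way to picture the `dom_loop_start_time` idea is to compute the idle wait from the moment the polling pass started rather than from when it finished, and then spend that wait in chunks no longer than the slow timeout. The helpers below are a minimal sketch under that assumption; `remaining_wait_secs`, `wait_chunks_ms`, and the 60-second interval are hypothetical and do not come from the PR.

```python
def remaining_wait_secs(dom_loop_start_time, now, interval_secs):
    """Time left until the next pass is due, measured from the loop start."""
    elapsed = now - dom_loop_start_time
    # If the pass overran the interval, the next pass is due immediately.
    return max(0.0, interval_secs - elapsed)


def wait_chunks_ms(remaining_secs, chunk_ms):
    """Split the remaining wait into select timeouts of at most chunk_ms each."""
    remaining_ms = int(round(remaining_secs * 1000))
    chunks = []
    while remaining_ms > 0:
        step = min(chunk_ms, remaining_ms)
        chunks.append(step)
        remaining_ms -= step
    return chunks


# A pass that started at t=100.0s and finished at t=112.5s, with an assumed
# 60-second interval, leaves 47.5 seconds to wait for port updates.
print(remaining_wait_secs(100.0, 112.5, 60))
print(wait_chunks_ms(2.5, 1000))
```

In a real loop the caller would likely pass a monotonic clock reading such as `time.monotonic()` for both timestamps, so that the interval is not affected by wall-clock adjustments.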