[thermalctld] Use platform API for PSU fan name#767
[thermalctld] Use platform API for PSU fan name#767marvellparthiv wants to merge 1 commit intosonic-net:masterfrom
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Fix PSU fan parent name mismatch between psud and thermalctld.
psud was updated to use get_name() API for PSU keys in STATE_DB,
but thermalctld still used index-based 'PSU {index}' format. This
inconsistency caused the SNMP subagent to fail when resolving the
parent of PSU fans (e.g., PSU2_FAN1), as the parent name in
FAN_INFO did not match any PSU key in PSU_INFO.
Add get_psu_key() method to FanUpdater that retrieves the PSU name
via chassis.get_psu().get_name(), aligning with psud's behavior.
Falls back to 'PSU {index+1}' on NotImplementedError or
AttributeError (e.g., when get_psu() returns None for an
out-of-range index).
Add unit tests for get_psu_key() covering:
- Successful PSU name retrieval from platform API
- AttributeError fallback when get_psu() returns None
- NotImplementedError fallback when get_name() is not implemented
- Fallback when chassis is None
Signed-off-by: Parthiv Shah <parthiv@marvell.com>
7ff8128 to
64993ae
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR aligns thermalctld’s PSU fan parent naming with psud by using the platform PSU get_name() API when generating the PSU “parent_name” key, preventing FAN_INFO → PSU_INFO parent resolution mismatches (notably impacting SNMP entity relationships).
Changes:
- Add
FanUpdater.get_psu_key()to derive PSU keys viachassis.get_psu(...).get_name()with fallback to legacyPSU {index+1}. - Update PSU fan parent naming in
_refresh_fan_status()to useget_psu_key(). - Add unit tests covering several
get_psu_key()fallback paths.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sonic-thermalctld/scripts/thermalctld | Adds get_psu_key() and switches PSU fan parent_name to platform-derived PSU naming. |
| sonic-thermalctld/tests/test_thermalctld.py | Adds unit tests for get_psu_key() behavior and fallbacks. |
Comments suppressed due to low confidence (1)
sonic-thermalctld/scripts/thermalctld:325
- _refresh_fan_status() now uses get_psu_key() for PSU parent_name. If get_psu_key() returns None or raises, this can produce invalid fan_name defaults (e.g., 'None fan 1') and/or prevent publishing PSU fan entries. Ensuring get_psu_key() always returns a valid string fallback will avoid corrupt/unstable keys in FAN_INFO/PHYSICAL_ENTITY_INFO.
drawer_name = NOT_AVAILABLE if fan_type != FanType.DRAWER else str(try_get(parent.get_name))
if fan_type == FanType.PSU:
parent_name = self.get_psu_key(parent_index)
elif fan_type == FanType.MODULE:
parent_name = try_get(parent.get_name, default='Module {}'.format(parent_index + 1))
else:
parent_name = drawer_name if drawer_name != NOT_AVAILABLE else CHASSIS_INFO_KEY
fan_name = try_get(fan.get_name, '{} fan {}'.format(parent_name, fan_index + 1))
update_entity_info(self.phy_entity_table, parent_name, fan_name, fan, fan_index + 1)
| try: | ||
| return self.chassis.get_psu(psu_index).get_name() | ||
| except (NotImplementedError, AttributeError): | ||
| pass | ||
| return 'PSU {}'.format(psu_index + 1) |
There was a problem hiding this comment.
get_psu_key() may raise IndexError/other exceptions from chassis.get_psu() or psu.get_name(), which will cause PSU fan updates to be skipped (caught by update() and logged) instead of cleanly falling back to the legacy key. Consider catching broader exceptions (e.g., IndexError/Exception) and also falling back when get_name() returns None/empty, so this helper always returns a usable string key.
| def get_psu_key(self, psu_index): | ||
| if self.chassis is not None: |
There was a problem hiding this comment.
FanUpdater.get_psu_key() is a new public helper but it currently has no docstring, while the surrounding public methods in this file are documented. Please add a short docstring describing expected indexing (0-based vs 1-based) and fallback behavior.
| def test_get_psu_key_returns_psu_name(self): | ||
| """Test get_psu_key returns PSU name from platform API""" | ||
| chassis = MockChassis() | ||
| psu = MockPsu() | ||
| psu._name = 'PSU0' | ||
| chassis._psu_list.append(psu) | ||
| fan_updater = thermalctld.FanUpdater(chassis, threading.Event()) | ||
| assert fan_updater.get_psu_key(0) == 'PSU0' | ||
|
|
||
| def test_get_psu_key_attribute_error(self): | ||
| """Test get_psu_key falls back when get_psu returns None (AttributeError on .get_name())""" | ||
| chassis = MockChassis() | ||
| # No PSUs added, so get_psu(index) returns None and .get_name() raises AttributeError | ||
| fan_updater = thermalctld.FanUpdater(chassis, threading.Event()) | ||
| assert fan_updater.get_psu_key(0) == 'PSU 1' | ||
| assert fan_updater.get_psu_key(1) == 'PSU 2' | ||
|
|
||
| def test_get_psu_key_not_implemented_error(self): | ||
| """Test get_psu_key falls back to 1-based 'PSU <index+1>' when get_name raises NotImplementedError""" | ||
| chassis = MockChassis() | ||
| psu = MockPsu() | ||
| psu.get_name = mock.MagicMock(side_effect=NotImplementedError) | ||
| chassis._psu_list.append(psu) | ||
| fan_updater = thermalctld.FanUpdater(chassis, threading.Event()) | ||
| assert fan_updater.get_psu_key(0) == 'PSU 1' | ||
|
|
||
| def test_get_psu_key_chassis_none(self): | ||
| """Test get_psu_key falls back to 1-based 'PSU <index+1>' when chassis is None""" | ||
| fan_updater = thermalctld.FanUpdater(None, threading.Event()) | ||
| assert fan_updater.get_psu_key(0) == 'PSU 1' | ||
|
|
There was a problem hiding this comment.
The new get_psu_key() tests cover AttributeError/NotImplementedError and chassis=None, but they don't cover cases where chassis.get_psu() (or get_name()) raises IndexError/Exception or where get_name() returns None. Adding tests for these paths will help ensure thermalctld remains resilient across platform implementations (similar to psud's get_psu_key coverage).
Fix PSU fan parent name mismatch between psud and thermalctld.
Description
psud was updated to use get_name() API for PSU keys in STATE_DB (#102), but thermalctld still used index-based 'PSU {index}' format. This inconsistency caused the SNMP subagent to fail when resolving the parent of PSU fans (e.g., PSU2_FAN1), as the parent name in FAN_INFO did not match any PSU key in PSU_INFO.
Add get_psu_key() method to FanUpdater that retrieves the PSU name via chassis.get_psu().get_name(), aligning with psud's behavior. Falls back to 'PSU {index+1}' on NotImplementedError or AttributeError (e.g., when get_psu() returns None for an out-of-range index).
Add unit tests for get_psu_key() covering:
Successful PSU name retrieval from platform API
AttributeError fallback when get_psu() returns None
NotImplementedError fallback when get_name() is not implemented
Fallback when chassis is None
Motivation and Context
The SNMP subagent resolves fan-to-parent relationships by looking up the parent name from FAN_INFO in PSU_INFO. After psud switched to using the platform API's get_name() for PSU keys, thermalctld's index-based naming ('PSU 1', 'PSU 2') no longer matched, breaking SNMP fan parent resolution for PSU fans. This change aligns thermalctld with psud so both daemons use consistent PSU naming in STATE_DB.
How Has This Been Tested?
Confirmed SNMP subagent can correctly resolve the parent of PSU fans (e.g., PSU2_FAN1) after the fix.
Added unit tests validating all get_psu_key() code paths including AttributeError, NotImplementedError, and None chassis fallback.
Additional Information (Optional)