Pawel plesniak/static port fix #741

Open
PawelPlesniak wants to merge 9 commits into develop from PawelPlesniak/StaticPortFix

Collaborator

@PawelPlesniak PawelPlesniak commented Dec 10, 2025

Description

Fixes #702

Includes basic checks that the root controller and local connectivity service (LCS) addresses are free, and points the user to the commands that should be run if they are not.

Changelog

Prior to instantiating the stateful node, the unified shell parses out the root controller address and LCS address to validate that they are not yet in use.

NOTE - this will not work unless the configuration file has been cloned locally, i.e. if the configuration file lives on CVMFS, the port cannot be updated because CVMFS is a read-only path. You would need to clone the configuration to see the difference.
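
The check described in this changelog (extract an address, then test whether its port is already taken) can be sketched as follows. This is only an illustration, not the code in this PR: `split_host_port` and `port_in_use` are hypothetical names, and probing with a TCP connection attempt is an assumption about how occupancy could be detected.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def split_host_port(address: str) -> Tuple[str, int]:
    """Split 'host:port' or 'scheme://host:port' into (host, port). Hypothetical helper."""
    # urlsplit only fills in the network location when '//' precedes it.
    parts = urlsplit(address if "://" in address else f"//{address}")
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"cannot extract host and port from {address!r}")
    return parts.hostname, parts.port


def port_in_use(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port. Hypothetical helper."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

A caller would run `port_in_use(*split_host_port(address))` for the root controller and LCS addresses before starting anything that binds to them.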

Suggested testing methods

On the same physical host run

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config FirstInstanceOfSession
boot

then in a separate tty run

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config SecondInstanceOfSession

This will give you logs such as

Found free Kubernetes NodePort: 30510
Updated RC Controller Service 'root-rccontroller_control' to use port 30510
Successfully configured RC controller port for session 'local-1x1-config'.
[2026/02/12 14:32:49 UTC] INFO       shell.py:195                             drunc.unified_shell                                The root controller port at 30006 is occupied, updating it to 30510
Found free Kubernetes NodePort: 32030
Updated Connectivity Service 'local-connectivity-service' to use port 32030
Updated runtime environment variable 'local-env-connectivity-port' to '32030'
Successfully configured connectivity service port for session 'local-1x1-config'.
[2026/02/12 14:32:49 UTC] INFO       shell.py:208                             drunc.unified_shell                                The local connectivity service port at 30980 is occupied, updating it to 32030
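
The logs above report a "free Kubernetes NodePort" being found, but not how it is chosen. A minimal sketch of one possible selection, assuming the set of occupied ports has already been collected elsewhere and that the cluster uses the default NodePort range of 30000-32767; `pick_free_node_port` is a hypothetical name, not a function from this PR.

```python
import random
from typing import Optional, Set

# Default Kubernetes NodePort range; a cluster may be configured differently.
NODE_PORT_MIN = 30000
NODE_PORT_MAX = 32767


def pick_free_node_port(used_ports: Set[int], rng: Optional[random.Random] = None) -> int:
    """Return a port inside the NodePort range that is not in used_ports."""
    rng = rng or random.Random()
    candidates = [
        port
        for port in range(NODE_PORT_MIN, NODE_PORT_MAX + 1)
        if port not in used_ports
    ]
    if not candidates:
        raise RuntimeError("no free NodePort left in the configured range")
    # A random choice makes it less likely that two shells started at the
    # same time pick the same port (an assumption about the desired behaviour).
    return rng.choice(candidates)
```

Picking a port this way is not atomic: another process can still claim the port between the selection and the service update, so the caller should be prepared to retry.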

Note - if you attempt to use two identical session IDs with the EHN1 connectivity service, this will fail: the root controller port gets remapped, but the "new" session is pointed to the "old" session in the connectivity service storage. An endpoint to check against existing session IDs has been proposed here.

All the integration tests have passed.

Type of change

  • Documentation (non-breaking change that adds or improves the documentation)
  • New feature (non-breaking change which adds functionality)
  • Optimization (non-breaking, back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (whatever its nature)

Key checklist

  • All tests pass (eg. python -m pytest)
  • Pre-commit hooks run successfully (eg. pre-commit run --all-files)

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added or an issue has been opened to tackle that in the future.
    (Indicate issue here: # (issue))

@PawelPlesniak
Collaborator Author

Note - this has not been marked as "Ready for review" because the architecture behind the integration tests that start the LCS separately is not yet understood, so all integration tests that start their own LCS fail. This will be corrected, but until then this PR is on hold.

Base automatically changed from prep-release/fddaq-v5.5.0 to develop December 15, 2025 17:19

@PawelPlesniak
Collaborator Author

In one of the Run Control technical meetings, it was decided that the run control will be able to change the port number provided in the configuration. This was confirmed with @mroda88 and is a high-priority item.

@PawelPlesniak
Collaborator Author

Note - this PR is currently in progress as testing with EHN1 configurations fails the root controller address checks

@PawelPlesniak
Collaborator Author

This is now addressed and only requires testing.

@PawelPlesniak
Collaborator Author

Integration tests pass

@PawelPlesniak
Collaborator Author

Note - in commit 190fec5, I have removed all log handlers from the root logger, as daqconf scripts still configure logging with the old-style logging.basicConfig, as opposed to the more modern daqpytools implementation.
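
For reference, detaching every handler from the root logger can be done with the standard library alone. The sketch below is an illustration of the idea, not the contents of commit 190fec5; `clear_root_handlers` is a hypothetical name.

```python
import logging


def clear_root_handlers() -> int:
    """Detach every handler from the root logger and return how many were removed."""
    root = logging.getLogger()
    removed = 0
    # Iterate over a copy because removeHandler mutates root.handlers.
    for handler in list(root.handlers):
        root.removeHandler(handler)
        removed += 1
    return removed
```

Once the root logger has no handlers, a later `logging.basicConfig(...)` call installs its handler again, so the order in which the cleanup and the legacy scripts run matters.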

@PawelPlesniak
Collaborator Author

When testing, it was found that without KUBE_CONFIG set, one gets the following error

[2026/02/17 14:20:23 UTC] ERROR      unified_shell.py:15                      drunc.unified_shell         
🔥🔥 Exception thrown 🔥🔥
[2026/02/17 14:20:23 UTC] ERROR      unified_shell.py:16                      drunc.unified_shell         
HTTPSConnectionPool(host='10.73.136.40', port=6443): Max retries exceeded with url: /api/v1/services (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1010)')))

Development

Successfully merging this pull request may close these issues.

[Bug]: root-controller always points to port 30006
