cmd: add flux admin system-scripts to list status and details of prolog, epilog, and housekeeping configuration #7640

Open: grondo wants to merge 6 commits into flux-framework:master from grondo:system-scripts-introspection


@grondo grondo commented May 22, 2026

It was brought up in this week's meeting that a command to print what is going to run for prolog, epilog, and housekeeping would be very useful. While working on new prolog/housekeeping scripts for use with flux-pam, I came to heartily agree.

This PR introduces flux admin system-scripts, which prints the configuration status of prolog, epilog, and housekeeping along with what scripts will actually be run. For example on corona:

▸ prolog: enabled (per-rank=true)
  system: /usr/libexec/flux/prolog.d
    ✓ slingshot-cxi-alloc.sh
    ✓ zstop_nnf_clientmount.sh
  site:
    ✓ /etc/flux/system/prolog (legacy, skips site scripts)

▸ epilog: not configured

▸ housekeeping: enabled (release after 30s)
  system: /usr/libexec/flux/housekeeping.d
    ✓ slingshot-cxi-cleanup.sh
  site:
    ✓ /etc/flux/system/housekeeping (legacy, skips site scripts)

The script handles the unfortunately diverse array of possible conditions for configuration of these scripts, including but not limited to:

  • configuration status (enabled/not configured/inactive)
  • per-rank or rank 0 only for prolog/epilog
  • scripts from both system and site directories with executable status
  • custom commands when configured (vs the default of flux-imp run {prolog,epilog,housekeeping})
  • legacy single-file scripts (shows they skip site directory)
  • support for guest users (non-instance owners), with appropriate warnings
  • a verbose mode that explores unconfigured systems to show available scripts
  • warnings about common issues: perilog plugin not loaded (if running as instance owner), unexpected flux-imp path (e.g. if running from a builddir against a system instance with differing builtin conf)
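
The discovery behavior in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in src/cmd/flux-admin.py: the names `is_executable`, `list_scripts`, `mark`, and `describe` are hypothetical, and the rule that a legacy single file takes the place of the site directory is inferred only from the "(legacy, skips site scripts)" annotation in the example output.

```python
import os
import stat

# Any of the user/group/other execute permission bits.
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def is_executable(path):
    """Return True if any execute permission bit is set on path."""
    return bool(os.stat(path).st_mode & EXEC_BITS)


def list_scripts(directory):
    """Return sorted (name, executable) pairs for regular files in directory."""
    try:
        names = sorted(os.listdir(directory))
    except FileNotFoundError:
        return []  # a missing directory simply contributes no scripts
    return [
        (name, is_executable(os.path.join(directory, name)))
        for name in names
        if os.path.isfile(os.path.join(directory, name))
    ]


def mark(ok):
    """Checkmark for executable scripts, cross for non-executable ones."""
    return "✓" if ok else "✗"


def describe(system_dir, site_dir, legacy_file):
    """Build output lines for one script type (hypothetical layout)."""
    lines = ["  system: " + system_dir]
    for name, ok in list_scripts(system_dir):
        lines.append("    {} {}".format(mark(ok), name))
    if os.path.isfile(legacy_file):
        # Assumption: a legacy single-file script replaces the site directory.
        lines.append("  site:")
        lines.append(
            "    {} {} (legacy, skips site scripts)".format(
                mark(is_executable(legacy_file)), legacy_file
            )
        )
    else:
        lines.append("  site: " + site_dir)
        for name, ok in list_scripts(site_dir):
            lines.append("    {} {}".format(mark(ok), name))
    return lines
```

Calling `describe()` with a system directory, a site directory, and a legacy file path returns lines shaped like the corona example; the site directory path passed in is the caller's assumption, since the PR description only shows the legacy case.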

@grondo grondo requested a review from wihobbs May 22, 2026 23:50
Comment thread src/cmd/flux-admin.py Fixed
grondo and others added 5 commits May 22, 2026 17:03
Problem: Flux's system script configuration (prolog, epilog, and
housekeeping) is complex and difficult for admins to understand.
Scripts can be in multiple directories, may use legacy single-file
format, and configuration spans multiple areas. Admins need a way
to see what scripts will actually be executed.

Add `flux admin system-scripts` command that shows:
- Configuration status (enabled/disabled, execution mode, timeouts)
- Script discovery from system and site directories
- Validation of script executability
- Support for legacy single-file format

The command is concise by default, showing only configured systems.
Use `-v/--verbose` to see all discovered scripts even when not
configured. Color output (via `--color`) helps quickly identify
issues like non-executable scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Problem: There is no test coverage for the new flux admin
system-scripts command. Without tests, regressions could be
introduced and the command's behavior is not validated.

Add comprehensive test coverage for flux admin system-scripts,
including:

- Basic command functionality and help output
- Default compact output vs verbose mode
- Color option handling (--color=always/never/auto, NO_COLOR env var)
- Script discovery and display from filesystem
- Executable vs non-executable script detection
- Legacy single-file format support
- Live configuration testing with flux config load
- Prolog/epilog/housekeeping enabled states
- Configuration parameters (per-rank, timeout, release-after)
- Handling of missing directories and edge cases
- Output format and width validation

The test uses the sharness framework and runs under test_under_flux
to verify behavior with a running instance. The test is added to
t/Makefile.am so it runs as part of `make check`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
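
The color handling exercised by these tests (`--color=always/never/auto` plus the `NO_COLOR` environment variable) could follow logic like the sketch below. The function name `use_color` is hypothetical, and the precedence shown, where an explicit `always` or `never` wins and `NO_COLOR` only affects `auto`, is an assumption rather than something the commit message states.

```python
import os
import sys


def use_color(when, stream=None, environ=None):
    """Decide whether to emit ANSI color for a --color=WHEN style option."""
    stream = sys.stdout if stream is None else stream
    environ = os.environ if environ is None else environ
    if when == "always":
        return True
    if when == "never":
        return False
    if when != "auto":
        raise ValueError("invalid color mode: {!r}".format(when))
    # Assumption: in auto mode a non-empty NO_COLOR disables color.
    if environ.get("NO_COLOR"):
        return False
    # Otherwise color only when writing to a terminal.
    return stream.isatty()
```

Passing the stream and environment as parameters keeps the decision easy to test without a real terminal.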

Problem: There's no manpage for the flux-admin(1) command, but the
recently added system-scripts subcommand needs documentation.

Add a flux-admin(1) manpage documenting both `cleanup-push` and
`system-scripts`.

Problem: The admin guide does not mention the new `flux admin
system-scripts` command for verifying system script configuration.

Add a "Verifying Script Configuration" section to the admin guide
that explains how to use `flux admin system-scripts` to view and
verify prolog/epilog/housekeeping configuration. Include example
output and explain what the command displays, particularly clarifying
that the per-rank setting controls where scripts execute (all nodes
vs rank 0), not which scripts execute.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

Problem: The new `flux admin system-scripts` subcommand has no
bash completions.

Add completions for the system-scripts subcommand and its options
(-v, --verbose, --color=).

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
@grondo grondo force-pushed the system-scripts-introspection branch from b1c0fd3 to 03896bf Compare May 23, 2026 00:03
Problem: The flux admin system-scripts command fails on el8
builders with ASCII encoding errors when trying to print Unicode
checkmarks (✓ and ✗).

Reopen stdout with UTF-8 encoding at the start of main(), following
the pattern used in other Flux commands. This allows Unicode
characters to be printed correctly even when the default system
encoding is ASCII.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
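
The encoding fix described in this commit can be illustrated with a small sketch. It is not the code from the PR: `utf8_writer` and `main` are hypothetical names, and wrapping `sys.stdout.buffer` in an `io.TextIOWrapper` is just one way to get a UTF-8 stdout when the default encoding is ASCII.

```python
import io
import sys


def utf8_writer(binary_stream):
    """Wrap a binary stream in a text layer that always encodes as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


def main():
    # Replace stdout first so "✓" and "✗" can be printed even when the
    # locale's default encoding is ASCII.
    sys.stdout = utf8_writer(sys.stdout.buffer)
    print("✓ slingshot-cxi-alloc.sh")
```

With an ASCII-encoded text stream the same `print()` would raise `UnicodeEncodeError`, which matches the failure reported on the el8 builders.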
@grondo grondo force-pushed the system-scripts-introspection branch from 03896bf to 67c02ec Compare May 23, 2026 00:12
@codecov

codecov Bot commented May 23, 2026

Codecov Report

❌ Patch coverage is 89.50276% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.00%. Comparing base (9855252) to head (67c02ec).
⚠️ Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/cmd/flux-admin.py | 89.50% | 19 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7640      +/-   ##
==========================================
+ Coverage   83.96%   84.00%   +0.03%     
==========================================
  Files         576      576              
  Lines       96985    97166     +181     
==========================================
+ Hits        81435    81624     +189     
+ Misses      15550    15542       -8     

| Files with missing lines | Coverage Δ |
|---|---|
| src/cmd/flux-admin.py | 89.70% <89.50%> (-1.60%) ⬇️ |

... and 17 files with indirect coverage changes
