sprint/WINTEST: functional Pester test suite for Windows engine by muunkky · Pull Request #369 · PeonPing/peon-ping

muunkky · 2026-03-16T04:51:26Z

Stacks on #365 (SMARTPACK) and #366 (SMARTPACKDEBT). Review scope is the test suite only — 8 files, ~2,800 lines.

Motivation

The Windows hook engine (peon.ps1 embedded in install.ps1) had zero functional tests. Structural syntax checks existed in adapters-windows.Tests.ps1, but nothing verified that events actually route to the correct CESP categories, that config toggles suppress sounds, that state management (debounce, no-repeat, spam detection, TTL expiry) works, or that security boundaries hold. The SMARTPACKDEBT fixes were written and reviewed without any way to run them on Windows. This sprint builds the test infrastructure that proves the engine works.

What changed

Shared test harness (`tests/windows-setup.ps1`)

New-PeonTestEnvironment extracts the embedded peon.ps1 from install.ps1 using AST parsing, then creates an isolated temp directory containing a mock win-play.ps1 (which logs calls instead of playing audio), mock pack manifests with real CESP category structure, a default config, and empty state. It supports ConfigOverrides and StateOverrides for per-test customization. Invoke-PeonHook pipes CESP JSON to the extracted script and captures the exit code and audio log. New-CespJson generates well-formed event payloads.

Engine tests (`tests/peon-engine.Tests.ps1` — 17 scenarios)

Event routing (SessionStart, Stop, PermissionRequest, PostToolUseFailure, SubagentStart), notification suppression (permission_prompt, idle_prompt), Cursor camelCase remapping, config behavior (enabled toggle, category toggles, volume passthrough, missing config resilience), and state management (Stop debounce, no-repeat logic, spam detection threshold, session TTL expiry, corrupted state recovery, empty stdin).
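To make the state-management scenarios concrete, here is a minimal Python sketch of Stop debounce and prompt spam detection. It is not the peon.ps1 implementation: the `State` class, the function names, and the window and debounce values are hypothetical, and only the threshold of 3 rapid prompts and the `prompt_timestamps` name come from this PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical values for illustration; the real ones live in peon.ps1 / config.
STOP_DEBOUNCE_SECONDS = 5.0
SPAM_WINDOW_SECONDS = 10.0
SPAM_THRESHOLD = 3  # "3 rapid UserPromptSubmit events" per the PR text


@dataclass
class State:
    last_stop_time: Optional[float] = None
    prompt_timestamps: List[float] = field(default_factory=list)


def should_play_stop(state: State, now: float) -> bool:
    """Suppress a Stop sound that arrives too soon after the last played one."""
    last = state.last_stop_time
    if last is not None and now - last < STOP_DEBOUNCE_SECONDS:
        return False
    state.last_stop_time = now
    return True


def is_prompt_spam(state: State, now: float) -> bool:
    """Record a prompt and report whether the rapid-prompt threshold is reached."""
    recent = [t for t in state.prompt_timestamps if now - t < SPAM_WINDOW_SECONDS]
    recent.append(now)
    # The list must survive a round trip through the state file for this to work,
    # which is what the skipped Scenario 14 depends on.
    state.prompt_timestamps = recent
    return len(recent) >= SPAM_THRESHOLD
```

Passing `now` in explicitly keeps the logic deterministic under test, in the same spirit as the harness's StateOverrides.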

Adapter tests (`tests/peon-adapters.Tests.ps1` — 12 scenarios)

Functional tests for all 12 Windows PowerShell adapters: each adapter is invoked with a mock event and verified to produce correct CESP JSON output with proper hook_event_name mapping. Also validates daemon mode flags, FileSystemWatcher usage patterns, and absence of ExecutionPolicy Bypass.
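The shape check applied to adapter output can be sketched roughly as follows. This is an illustrative Python version, not the Pester assertions themselves: `cesp_shape_problems` is a hypothetical helper, and the event list is only the set of names mentioned in this PR, not a complete CESP vocabulary.

```python
import json

# Event names mentioned in this PR; assumed here to be the accepted set.
KNOWN_EVENTS = {
    "SessionStart",
    "Stop",
    "PermissionRequest",
    "PostToolUseFailure",
    "SubagentStart",
    "Notification",
    "UserPromptSubmit",
}


def cesp_shape_problems(raw: str) -> list:
    """Return a list of human-readable problems; empty means the shape is OK."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value is not an object"]
    event = payload.get("hook_event_name")
    if not isinstance(event, str) or not event:
        return ["hook_event_name missing or not a string"]
    if event not in KNOWN_EVENTS:
        return [f"unknown hook_event_name: {event}"]
    return []
```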

Security tests (`tests/peon-security.Tests.ps1` — 15 scenarios)

hook-handle-use.ps1: pack name injection, path traversal, nonexistent pack handling, CLI vs hook mode behavior. win-play.ps1: path traversal in file argument, volume clamping (negative, >1.0, non-numeric), missing file handling, backend-specific argument validation (ffplay, mpv, vlc, pwsh MediaPlayer).
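The two boundary checks exercised here, volume clamping and path containment, can be sketched in Python as below. This is an assumption-laden illustration rather than the win-play.ps1 logic: the function names and the 0.5 fallback for non-numeric input are hypothetical.

```python
import math
from pathlib import Path


def clamp_volume(raw, default=0.5):
    """Coerce raw input to a float in [0.0, 1.0]; fall back on junk input.

    The 0.5 default is a placeholder, not the value used by win-play.ps1.
    """
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return default
    if math.isnan(value):
        return default
    return max(0.0, min(1.0, value))


def is_inside(base_dir, candidate):
    """True only if candidate resolves to a path strictly below base_dir."""
    base = Path(base_dir).resolve()
    target = (base / candidate).resolve()  # an absolute candidate replaces base
    return base in target.parents
```

Resolving before comparing is what defeats `..` segments; comparing raw strings would not.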

Pack selection tests (`tests/peon-packs.Tests.ps1` — 7 scenarios)

Pack selection hierarchy: default_pack config, default_pack with active_pack legacy fallback, session_override via state, round-robin rotation, random rotation, session_override priority over rotation. Scenarios 4-7 (path_rules) deferred pending engine port.

CI integration

.github/workflows/test.yml: switched from explicit file list to Get-ChildItem -Filter *.Tests.ps1 auto-discovery so new test files are picked up automatically.

Fixes found during testing

- install.ps1: removed stray backticks from merge artifacts causing parse errors
- install.ps1: hardened config serialization and the here-string extraction regex in the test harness
- install.ps1: fixed stale active_pack test assertions from SMARTPACK renames
- install.ps1: fixed session_override rotation mode handling

Verification

Real Windows 10 run (PowerShell 5.1):

Tests Passed: 46, Failed: 0, Skipped: 1 (74.14s)

The 1 skipped test is Scenario 14 (spam detection after 3 rapid UserPromptSubmit events) — blocked by a known ConvertTo-Hashtable bug where PS 5.1 pipeline semantics unwrap single-element arrays. Tracked as card 8ny6qr.

Risks and limitations

- The test harness extracts peon.ps1 by parsing the install.ps1 here-string. If the embedding format changes, Extract-PeonHookScript will need updating.
- Scenario 14 (spam detection) is skipped, not passing. The underlying engine bug exists in production.
- path_rules test scenarios 4-7 are deferred because the matching engine hasn't been ported to peon.ps1 yet (card rd6fu4).

Deferred work

| Card | Description |
| --- | --- |
| 8ny6qr | Fix ConvertTo-Hashtable array corruption (unblocks Scenario 14) |
| rd6fu4 | Port path_rules to peon.ps1 + test scenarios 4-7 |
| d3c6b0 | Remove duplicate deepagents structural tests from peon-adapters |
| n5uqeo | Tighten security test assertion precision (VLC gain regex, exit code check) |

…erride

Completes the config key migration across all components:
- Adapters (kilo, opencode): config templates use default_pack
- TypeScript plugins: PeonConfig interface uses default_pack with active_pack as optional legacy fallback
- install.ps1: all CLI commands and hook runtime read default_pack first with active_pack fallback, regex replacements write default_pack
- install.sh: test sound lookup uses default_pack with fallback
- hook-handle-use scripts: write session_override instead of agentskill
- Skills docs: updated terminology throughout
- Tests: updated assertions to match new key names, added legacy fallback test for TypeScript resolveActivePack

All acceptance criteria verified as pre-existing: fnmatch-based path_rules matching in peon.sh, config.json template, override hierarchy, CLI commands (bind/unbind/bindings), and 9 BATS tests. peon.ps1 marked N/A (file does not exist in repo).

Remove synchronous MediaPlayer/PresentationCore from peon.ps1 hook and win-play.ps1 to eliminate P0 deadlock caused by WPF dispatcher in headless PowerShell processes.
- peon.ps1: replace inline audio block with Start-Process delegation to win-play.ps1 in a detached hidden window
- peon.ps1: add 8-second System.Timers.Timer self-timeout before any I/O as safety net against unforeseen blocking
- win-play.ps1: keep SoundPlayer for WAV, replace MediaPlayer with CLI player priority chain (ffplay -> mpv -> vlc) for non-WAV
- install.ps1: print ffmpeg recommendation post-install if ffplay not found on PATH
- Update Pester tests: assert zero MediaPlayer/PresentationCore refs, verify Start-Process delegation, Timer, and CLI player chain
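The backend choice described in this commit (SoundPlayer for WAV, otherwise the first available CLI player) can be sketched in Python as follows. The function name and the injectable `which` parameter are hypothetical, and the sketch only picks a backend; it does not build any player command line.

```python
import shutil
from pathlib import Path

# Priority order taken from the commit message: ffplay -> mpv -> vlc.
CLI_PLAYERS = ("ffplay", "mpv", "vlc")


def choose_backend(sound_path, which=shutil.which):
    """Return the backend name for a sound file, or None for a silent exit."""
    if Path(sound_path).suffix.lower() == ".wav":
        return "SoundPlayer"
    for player in CLI_PLAYERS:
        if which(player):  # first player found on PATH wins
            return player
    return None
```

Injecting `which` lets a test simulate any combination of installed players without touching PATH.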

Add write_state()/read_state() Python helpers (tempfile + os.replace) and Write-StateAtomic/Read-StateWithRetry PowerShell functions (PID-based temp + [System.IO.File]::Move). Replace all raw json.dump/Set-Content state writes and json.load/Get-Content state reads across peon.sh (main block + trainer blocks) and install.ps1 (embedded peon.ps1). Retry-on-read uses 50/100/200ms backoff with graceful fallback to empty defaults on corruption. BATS tests added for corrupted state recovery and concurrent Stop event safety.
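A minimal sketch of the Python side of this pattern is shown below. It reuses the `write_state`/`read_state` names from the commit message, but the bodies are an illustration written from that description (tempfile + os.replace, 50/100/200 ms backoff, empty defaults on corruption), not the code in peon.sh. The immediate first attempt before the backoff delays and the injectable `sleep` are assumptions.

```python
import json
import os
import tempfile
import time

RETRY_DELAYS = (0.05, 0.1, 0.2)  # 50/100/200 ms backoff from the commit message


def write_state(path, state):
    """Write JSON to a temp file in the same directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_state(path, sleep=time.sleep):
    """Read state, retrying with backoff; return {} if it never parses."""
    for delay in (0.0,) + RETRY_DELAYS:
        if delay:
            sleep(delay)
        try:
            with open(path, encoding="utf-8") as handle:
                data = json.load(handle)
            if isinstance(data, dict):
                return data
        except (OSError, ValueError):  # missing, locked, or corrupted file
            pass
    return {}
```

Creating the temp file in the target directory matters: a rename across filesystems is not atomic.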

Add Pack Selection Hierarchy table documenting the 5-layer override system (session_override > path_rules > pack_rotation > default_pack > hardcoded). Add Per-Project Pack Assignment section with bind/unbind CLI examples and manual config. Add bind/unbind/bindings CLI commands to Chinese README. Update llms.txt with hierarchy and bind/unbind context.
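The five layers can be read as a simple first-match-wins chain. The Python sketch below illustrates that ordering only. The config schema (a `path_rules` list of `pattern`/`pack` objects, a `pack_rotation` list), the function signature, and the `rotation_pick` hook are assumptions for illustration; fnmatch is used because the path_rules matching in peon.sh is described as fnmatch-based.

```python
import random
from fnmatch import fnmatchcase


def resolve_pack(config, session_override=None, cwd="", rotation_pick=random.choice):
    """Pick a pack: session_override > path_rules > pack_rotation > default_pack > hardcoded."""
    # 1. An explicit per-session override always wins.
    if session_override:
        return session_override
    # 2. The first path rule whose glob matches the working directory.
    for rule in config.get("path_rules", []):
        if fnmatchcase(cwd, rule["pattern"]):
            return rule["pack"]
    # 3. Rotation across a configured list of packs.
    rotation = config.get("pack_rotation") or []
    if rotation:
        return rotation_pick(rotation)
    # 4. default_pack, with active_pack as the legacy fallback, then
    # 5. the hardcoded "peon".
    return config.get("default_pack") or config.get("active_pack") or "peon"
```

For example, with a rule `*/work/*` mapped to `glados`, a working directory of `/home/me/work/app` resolves to `glados` unless a session override is set.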

…tall.ps1

Consolidate the repeated default_pack -> active_pack -> "peon" fallback chain into a single Get-ActivePack helper function. Replaces ~10 inline expressions across both the installer script and the embedded peon.ps1 hook with calls to the helper. Also migrates the installer's initial config creation from active_pack to default_pack, aligning with the rename completed in peon.sh.

…ack refactor

Worktree merge for z0c9fd used --theirs which reverted HOOKBUG sprint changes (atomic state, audio delegation, safety timer). Restored install.ps1 from pre-merge state and manually applied Get-ActivePack helper extraction. 204/204 Pester tests pass.

Add tests/windows-setup.ps1 with reusable helper functions:
- Extract-PeonHookScript: extracts peon.ps1 from install.ps1 here-string
- New-PeonTestEnvironment: creates isolated temp dir with config, state, mock packs, and mock win-play.ps1 audio logger
- Invoke-PeonHook: pipes CESP JSON to peon.ps1 via Process API
- New-CespJson, Get-PeonState, Get-PeonConfig, Get-AudioLog helpers

Add tests/peon-engine.Tests.ps1 with 25 smoke tests validating the harness infrastructure and core peon.ps1 functional behavior:
- Extraction produces valid PS syntax
- Test env creates all required files and accepts overrides
- SessionStart/Stop events play correct sounds
- Disabled config skips audio
- Mock win-play.ps1 logs calls without playing real audio

Update CI workflow to run both Pester test files.

16 integration tests covering:
- Pack name input validation (path traversal, shell injection, charset)
- Session ID sanitization (malicious IDs fallback to "default")
- Config/state mutation correctness (agentskill mode, pack_rotation)
- Hook mode vs CLI mode behavior (stdin JSON vs arg)
- win-play.ps1 WAV/MP3 branching (SoundPlayer vs CLI players)
- Volume clamping at boundaries (0.0, 1.0)
- Player priority chain (ffplay -> mpv -> vlc -> silent exit)
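The first two items follow the same allow-list idea: accept only names that match a strict pattern, and reject or replace everything else. A Python sketch under stated assumptions is below. The exact charset and length limit used by hook-handle-use.ps1 are not given in this PR, so `SAFE_NAME` is hypothetical; only the fallback to "default" for bad session IDs comes from the text above.

```python
import re

# Hypothetical allow-list: letters, digits, underscore, hyphen, 1-64 chars.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_pack_name(name):
    """Reject anything that could escape the packs directory or reach a shell."""
    return isinstance(name, str) and SAFE_NAME.fullmatch(name) is not None


def sanitize_session_id(raw):
    """Return the ID unchanged if it is safe, otherwise fall back to "default"."""
    if isinstance(raw, str) and SAFE_NAME.fullmatch(raw):
        return raw
    return "default"
```

Using `fullmatch` rather than `search` is the important detail: a partial match would let `glados/../..` through.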

15 Pester tests covering the full pack selection override hierarchy:
- Default pack fallback (active_pack, empty fallback to "peon")
- Session override mode (per-session pack from state, agentskill alias)
- Session override fallback (unmatched session, missing pack cleanup)
- Default key for Cursor users without conversation_id
- Pack rotation (random selection from array, single-pack array)
- Edge cases (empty rotation, missing mode key, legacy string format)

path_rules tests are deferred as the feature is not yet implemented in peon.ps1 (Windows) -- only exists in peon.sh (Unix).

New test file tests/peon-adapters.Tests.ps1 with 48 tests that actually execute adapter scripts with controlled input and verify JSON output shape.

- Category A (simple translators): codex, gemini, copilot, windsurf, kiro, openclaw, deepagents -- event mapping verified via mock peon.ps1 stdin capture.
- Category B (filesystem watchers): amp, antigravity, kimi -- pure functions (Emit-Event, Process-WireLine) extracted and tested in isolation.
- Category C (structural): deepagents.ps1 added to syntax validation and ExecutionPolicy Bypass checks in adapters-windows.Tests.ps1.

Also adds edge case tests (missing peon.ps1, unknown events, no stdin) and CESP JSON shape validation across all Category A adapters.

Implements all test scenarios from card 1dnbzv covering:
- Event routing: SessionStart, Stop, PermissionRequest, PostToolUseFailure, SubagentStart, Notification suppression, Cursor camelCase remap
- Config behavior: enabled:false, category toggles, volume passthrough, missing config resilience
- State management: Stop debounce, no-repeat sound selection, session TTL expiry, corrupted state recovery, empty stdin handling

Scenario 14 (spam detection) is skipped due to a production bug in ConvertTo-Hashtable that corrupts prompt_timestamps arrays when reading state back from JSON -- single-element arrays become hashtables, preventing accumulation across invocations.

Item A: Replace brittle regex `(?<=\d),(?=\d)` with InvariantCulture enforcement before ConvertTo-Json. The regex corrupted integer arrays like `[1,2,3]` -> `[1.2.3]` on non-English locales. Now we save/restore CurrentCulture around the serialization call.

Item B: Anchor here-string extraction on the unique marker comment "# peon-ping hook for Claude Code" inside install.ps1. Previously the regex `hookScript = @'(.+?)'@` assumed exactly one here-string, which would silently misextract if a second were added.

Card: WINTEST-xk4ymm
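Both items can be reproduced in a few lines of Python, whose `re` module supports the same lookaround syntax. The sketch below is illustrative only. `old_locale_fix` mimics the removed regex, and `extract_hook` shows one way to anchor on the marker. It assumes the marker comment is the first line of the here-string, which this PR does not state.

```python
import re


def old_locale_fix(json_text):
    """The removed approach: rewrite digit,digit as digit.digit."""
    return re.sub(r"(?<=\d),(?=\d)", ".", json_text)


MARKER = "# peon-ping hook for Claude Code"


def extract_hook(installer_text):
    """Return the body of the here-string that starts with the marker comment."""
    pattern = re.compile(
        r"@'\r?\n(" + re.escape(MARKER) + r".*?)\r?\n'@",
        re.DOTALL,
    )
    match = pattern.search(installer_text)
    if match is None:
        raise ValueError("hook marker not found in installer text")
    return match.group(1)


# Item A: the regex repairs a locale-formatted decimal but also breaks arrays.
print(old_locale_fix('{"volume":0,5}'))  # {"volume":0.5}
print(old_locale_fix("[1,2,3]"))         # [1.2.3]
```

Because the anchored pattern requires the marker immediately after the opening `@'`, an unrelated here-string earlier in the file is skipped instead of being extracted by mistake.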

Change $config.Run.Path from an explicit array of 5 test files to "tests/" so Pester auto-discovers all *.Tests.ps1 files. This ensures new test files are picked up without CI workflow edits. windows-setup.ps1 and hookbug-integration.ps1 are not *.Tests.ps1 files, so Pester correctly ignores them.

- Step 2A-2D: 4 parallel test cards (event routing, adapters, security, packs)
- Step 2.5: harness hardening (locale serialization, extraction regex)
- Step 3: CI auto-discovery for all Pester test files
- Archived superseded card gtb6dm
- Umbrella card j30alo checkboxes updated

vercel · 2026-03-16T04:51:32Z

@muunkky is attempting to deploy a commit to the Gary Sheng's projects Team on Vercel.

A member of the Team first needs to authorize it.

muunkky · 2026-03-16T15:45:47Z

Superseded by a clean PR that excludes .gitban/ project management content.

muunkky added 30 commits March 13, 2026 18:45

chore(gitban): route review-1 for aodz7v — APPROVAL with 1 FASTFOLLOW

f180244

Reviewer approved commit 3f5a1f0. Routed executor close-out instructions and 1 planner card (DRY install.ps1 pack resolution) for sprint SMARTPACK.

Merge branch 'worktree-agent-a725b4cd' into sprint/SMARTPACK

b818463

review: approve path_rules matching engine (card 0vvvnb, review 1)

bb3e6fd

Verification-only card. All acceptance criteria confirmed as pre-existing in peon.sh, config.json, and BATS tests. No blockers found.

chore: route approval for path-rules matching engine (card 0vvvnb)

19b3ac9

feat: show active path rule in peon status output

3bd6d47

When a path_rule matches the current working directory, `peon status` now displays the matching rule (e.g., `path rule: */work/* -> glados`) in addition to the total count of configured rules.

Merge branch 'worktree-agent-a2f10788' into sprint/SMARTPACK

624e7e1

Merge branch 'worktree-agent-ab27e9f7' into sprint/SMARTPACK

57964e9

review: approve async audio delegation and MediaPlayer removal (card d5wz2f, review 1)

f9237e7

chore: route approval for async audio delegation (card d5wz2f)

4a9ca91

Review 1 approved at commit 57964e9. Routed executor close-out instructions and 1 backlog card (audio diagnostic logging) to planner.

fix: remove no-op Out-Null pipe from Start-Process in install.ps1

e647131

chore: add executor profiling log for kydihy

3ab8f6f

Merge branch 'worktree-agent-a375dfb4' into sprint/SMARTPACK

b002a52

fix: update Pester assertion for atomic state write function

5ed2ca7

review: approve atomic state writes for both platforms (card kydihy, review 1)

13d4ece

chore: route approval for atomic state writes (card kydihy)

36d4068

chore: HOOKBUG sprint close-out — archive cards and generate summary

f986e20

test: add HOOKBUG integration tests for async audio and atomic state

81aa265

Merge remote-tracking branch 'origin/main' into sprint/SMARTPACK

553469c

chore: add executor profiling log for janrlf

30f1e35

Merge branch 'worktree-agent-a3e380e3' into sprint/SMARTPACK

99898e7

Merge branch 'worktree-agent-a5dc74c5' into sprint/SMARTPACK

91746e3

# Conflicts:
#	install.ps1
#	tests/adapters-windows.Tests.ps1

chore: route janrlf review-1 approval to executor and planner

5780170

APPROVAL routed to executor for close-out. Two BACKLOG items (Write-StateAtomic atomicity, ffplay install guidance) routed to planner for card creation.

chore: route z0c9fd review-1 APPROVAL to executor and planner

ac17457

Approved at commit 0a67a57. Executor gets close-out instructions. Planner gets 1 BACKLOG card (2 items: atomic state I/O hardening).

muunkky added 26 commits March 15, 2026 13:49

fix: update stale test assertions for SMARTPACK renames

e4d0989

- opencode/kilo adapter tests: active_pack -> default_pack
- hook-handle-use test: agentskill -> session_override

Merge branch 'worktree-agent-a6708a11' into sprint/WINTEST

0d965db

chore: route WINTEST q52ygy review-1 APPROVAL to executor and planner

4a9bc5a

chore: close out WINTEST q52ygy — add harness parity comment, mark done

549c6a9

Add clarifying comment in tests/windows-setup.ps1 explaining that CLAUDE_PEON_DIR and PEON_TEST env vars exist for structural parity with the BATS harness and are not consumed by peon.ps1.

chore: finalize executor log for frjune pack selection tests

30661cd

Merge branch 'worktree-agent-a3e3d654' into sprint/WINTEST

fa135b7

Merge branch 'worktree-agent-a6034502' into sprint/WINTEST

3f622ac

Merge branch 'worktree-agent-aaa084ba' into sprint/WINTEST

1126ba9

# Conflicts:
#	tests/peon-engine.Tests.ps1
#	tests/windows-setup.ps1

chore: route WINTEST jwh5zl review-1 REJECTION to executor and planner

4d2405b

ci: add peon-adapters.Tests.ps1 to Windows CI workflow

7d5bc38

Adds the new functional adapter test file to the Pester Run.Path array in test.yml so all 48 tests execute in CI on windows-latest.

fix(test): correct pack_rotation_mode assertions in security tests

0ca4021

Scenarios 1 and 7 asserted "agentskill" but hook-handle-use.ps1 sets "session_override". Updated assertions and scenario 7 description to match the actual source behavior. All 16 tests pass.

Merge branch 'worktree-agent-a43ca62c' into sprint/WINTEST

4642099

# Conflicts:
#	.github/workflows/test.yml

Merge branch 'worktree-agent-a5f7db9c' into sprint/WINTEST

8ac2c7b

chore: approve WINTEST jwh5zl review-2 -- session_override fix verified

accef1f

Merge branch 'worktree-agent-a49a6253' into sprint/WINTEST

a629878

# Conflicts:
#	tests/windows-setup.ps1

chore: route WINTEST xk4ymm review-1 APPROVAL to executor and planner

17cca31

chore: close WINTEST sprint — archive 10 cards, finalize dispatch log

fe0024c

muunkky marked this pull request as draft March 16, 2026 05:01

muunkky closed this Mar 16, 2026

muunkky wants to merge 95 commits into PeonPing:main from muunkky:sprint/WINTEST
