feat: auto-abort zombie runs idle > 60 min via gateway chat.abort#11

Merged
rjcloudsigma merged 3 commits into main from feat/auto-abort-zombie-runs on May 25, 2026

@rjcloudsigma

Collaborator

Summary

Extends the stuck-run detector (PR #10) with a zombie auto-abort background task.

What it does

  • Every 60s (configurable), scans agent sessions for zombie runs (idle ≥ 60 min)
  • When TAAS_AFFINITY_AUTO_ABORT_ZOMBIES=true (off by default), aborts via injected abortRun(sessionKey) callback
  • Tracks aborted sessionKeys in an in-memory LRU-bounded array (cap 1000) for idempotency
  • Dry-run mode via TAAS_AFFINITY_AUTO_ABORT_DRY_RUN=true — logs what it would abort without calling abort
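
A minimal sketch of one scan tick, following the four bullets above. The `AgentSession` shape, the `AutoAbortOptions` fields, and the `autoAbortTick` name are assumptions made for illustration; the PR does not show the plugin's real types, and the real `abortRun` may well be asynchronous.

```typescript
// Hypothetical shapes: the PR does not show the real session or option types.
interface AgentSession {
  sessionKey: string;
  lastActivityMs: number; // epoch ms of the last observed activity
}

interface AutoAbortOptions {
  enabled: boolean; // TAAS_AFFINITY_AUTO_ABORT_ZOMBIES
  dryRun: boolean; // TAAS_AFFINITY_AUTO_ABORT_DRY_RUN
  thresholdMs: number; // TAAS_AFFINITY_AUTO_ABORT_THRESHOLD_MS
  maxTracked: number; // cap on remembered sessionKeys (1000 in the PR)
}

// Assumed synchronous here to keep the sketch short.
type AbortRunFn = (sessionKey: string) => void;

// Runs one scan and returns the sessionKeys passed to abortRun on this tick.
function autoAbortTick(
  sessions: AgentSession[],
  nowMs: number,
  opts: AutoAbortOptions,
  aborted: string[], // previously aborted keys, oldest first; mutated in place
  abortRun: AbortRunFn,
  log: (msg: string) => void = console.log,
): string[] {
  const abortedNow: string[] = [];
  for (const s of sessions) {
    const idleMs = nowMs - s.lastActivityMs;
    if (idleMs < opts.thresholdMs) continue; // not a zombie yet
    if (aborted.includes(s.sessionKey)) continue; // idempotency: already aborted
    if (!opts.enabled || opts.dryRun) {
      // Disabled or dry-run: report the candidate, never call abortRun.
      log(`zombie candidate (not aborted): ${s.sessionKey} idle ${idleMs}ms`);
      continue;
    }
    abortRun(s.sessionKey);
    aborted.push(s.sessionKey);
    if (aborted.length > opts.maxTracked) aborted.shift(); // evict the oldest key
    abortedNow.push(s.sessionKey);
  }
  return abortedNow;
}
```

A background task would call this on a timer, for example `setInterval(() => autoAbortTick(listSessions(), Date.now(), opts, aborted, abortRun), checkIntervalMs)`, where `listSessions` is likewise a hypothetical helper.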

Env vars

| Variable | Default | Description |
| --- | --- | --- |
| TAAS_AFFINITY_AUTO_ABORT_ZOMBIES | false | Enable auto-abort (must be set to true) |
| TAAS_AFFINITY_AUTO_ABORT_THRESHOLD_MS | 3600000 (60 min) | Idle threshold for zombie classification |
| TAAS_AFFINITY_AUTO_ABORT_CHECK_INTERVAL_MS | 60000 (1 min) | Background check interval |
| TAAS_AFFINITY_AUTO_ABORT_DRY_RUN | false | Log candidates without aborting |
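
One way these variables could be read into a typed config is sketched below. The `AutoAbortConfig` interface and the `loadAutoAbortConfig` and `parsePositiveInt` names are hypothetical, and falling back to the default on an invalid number is an assumption; the PR does not say how the plugin handles malformed values.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface AutoAbortConfig {
  enabled: boolean;
  thresholdMs: number;
  checkIntervalMs: number;
  dryRun: boolean;
}

// Returns the parsed value only if it is a positive integer, else the fallback.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const n = Number(raw);
  return raw !== undefined && Number.isInteger(n) && n > 0 ? n : fallback;
}

// Takes the environment as a plain object so it can be exercised without process.env.
function loadAutoAbortConfig(env: Record<string, string | undefined>): AutoAbortConfig {
  return {
    // Only the literal string "true" enables the feature (off by default).
    enabled: env.TAAS_AFFINITY_AUTO_ABORT_ZOMBIES === "true",
    thresholdMs: parsePositiveInt(env.TAAS_AFFINITY_AUTO_ABORT_THRESHOLD_MS, 3_600_000),
    checkIntervalMs: parsePositiveInt(env.TAAS_AFFINITY_AUTO_ABORT_CHECK_INTERVAL_MS, 60_000),
    dryRun: env.TAAS_AFFINITY_AUTO_ABORT_DRY_RUN === "true",
  };
}
```

In a Node.js process this would typically be called as `loadAutoAbortConfig(process.env)`.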

⚠️ BLOCKED

The OpenClaw plugin SDK does not expose dispatchGatewayMethod, api.runtime.chat.abort(), or any equivalent for calling gateway RPC methods from inside a plugin. The default abortRun function is a no-op that logs a warning. The detection and logging infrastructure is complete and tested — when the SDK adds a gateway dispatch capability, wire it via setAbortRunFn() in the register() method.
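
The injection point described above could look roughly like the sketch below. Only the names `setAbortRunFn` and `abortRun` come from this PR; their signatures, the `createAbortHook` factory, and the warning text are assumptions, and nothing here calls a real OpenClaw SDK API.

```typescript
// Assumed callback signature; the PR only names abortRun(sessionKey).
type AbortRunFn = (sessionKey: string) => void;

// Hypothetical factory holding the currently wired abort function.
function createAbortHook(warn: (msg: string) => void = console.warn) {
  // Default: a no-op that only logs a warning, mirroring the BLOCKED state.
  const noopAbort: AbortRunFn = (sessionKey) => {
    warn(`abortRun is not wired to the gateway; skipping abort for ${sessionKey}`);
  };
  let current: AbortRunFn = noopAbort;
  return {
    // Call from register() once the SDK exposes a gateway dispatch capability.
    setAbortRunFn(fn: AbortRunFn): void {
      current = fn;
    },
    abortRun(sessionKey: string): void {
      current(sessionKey);
    },
  };
}
```

Keeping the default as a warning-only no-op means detection and logging can ship now, and the real abort becomes a one-line change in `register()` later.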

Tests

  • AC-AUTO-ABORT.1: Auto-abort enabled → calls abortRun for each zombie
  • AC-AUTO-ABORT.2: Auto-abort disabled (default) → logs candidates without aborting
  • AC-AUTO-ABORT.3: Idempotent — same zombie only aborted once across ticks
  • AC-AUTO-ABORT.4: LRU cap — 1100 zombies processed, set stays ≤ 1000
  • AC-AUTO-ABORT.5: Dry-run mode — logs but never calls abortRun
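
The bound checked by AC-AUTO-ABORT.4 can be illustrated with a small tracker that evicts its oldest key once the cap is exceeded. The `BoundedKeyTracker` class is hypothetical and uses a `Map` for insertion order; the PR describes an LRU-bounded array, so the actual data structure may differ.

```typescript
// Hypothetical bounded tracker; not the PR's actual implementation.
class BoundedKeyTracker {
  private readonly keys = new Map<string, true>(); // Map preserves insertion order

  constructor(private readonly cap: number) {}

  has(key: string): boolean {
    return this.keys.has(key);
  }

  add(key: string): void {
    // Re-adding moves the key to the newest position.
    this.keys.delete(key);
    this.keys.set(key, true);
    if (this.keys.size > this.cap) {
      // The first key in iteration order is the oldest one.
      const oldest = this.keys.keys().next().value;
      if (oldest !== undefined) this.keys.delete(oldest);
    }
  }

  get size(): number {
    return this.keys.size;
  }
}

// Mirror of the AC-AUTO-ABORT.4 scenario: 1100 keys against a cap of 1000.
const tracker = new BoundedKeyTracker(1000);
for (let i = 0; i < 1100; i++) tracker.add(`session-${i}`);
console.log(tracker.size <= 1000);
```

One consequence of any such cap is that an evicted key is no longer remembered, so a session that is still listed as a zombie after eviction could be aborted again.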

Commits

  1. feat: detect zombies and emit candidates list (no-op default)
  2. feat: add BLOCKED note — no plugin SDK dispatch for chat.abort yet
  3. test: zombie auto-abort acceptance coverage

cloudsigma added 3 commits May 25, 2026 18:41
The OpenClaw plugin SDK does not expose dispatchGatewayMethod,
api.runtime.chat.abort(), or any equivalent. The auto-abort feature
detects zombies correctly but the default abortRun is a no-op that
logs a warning. When the SDK adds the capability, wire it in the
register() method.
@rjcloudsigma rjcloudsigma merged commit 7399c05 into main May 25, 2026
1 check failed
