Follow-up on #1457: v0.2.4 works in WSL2 after clearing orphaned processes — findings + available actor actions #1567

@Apollyon81

Summary

Issue #1457 (actors crash on action calls in Engine mode) was closed as completed, and the fix (commit 4c2b9bb) is included in v0.2.4. However, when running @rivet-dev/agentos v0.2.4 on WSL2 Ubuntu (not Docker), the sidecar would die after 3.5–6.5 seconds with "Connection reset without closing handshake" (close code 1011), and actions would either hang indefinitely or return internal_error.

After extensive debugging, we found that v0.2.4 does work correctly — the issue was environmental. Sharing our findings in case others hit the same.

Environment

  • Host: Windows 11 + WSL2 Ubuntu 24.04
  • Node: v22.23.1
  • Packages: @rivet-dev/agentos v0.2.4, @rivet-dev/agentos-core v0.2.4, @secure-exec/sidecar v0.3.3
  • Engine: rivet-engine v2.3.2 (from @rivetkit/engine-cli-linux-x64-musl 0.0.0-feat-dylib-actor-plugin.c44621f)
  • Platform: @rivet-dev/agentos-plugin-linux-x64-gnu v0.2.4 (libagentos_actor_plugin.so)

What was happening

Sidecar crash in WSL2

The sidecar (secure-exec-sidecar) would connect via websocket, then die after 3.5–6.5 seconds:

envoy websocket closed ... lifetime_seconds=5.873 ... incoming_close_code=Some(1011) ... err=Some("Connection reset without closing handshake")

This caused actor_ready_timeout or indefinite hangs on any action call.
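To compare sidecar lifetimes across runs, the close line above can be reduced to the two fields that matter. This is a minimal sketch: `parseCloseLine` and `CloseInfo` are hypothetical names, and it assumes a clean close is logged in the same `Some(1000)` form as the failing one.

```typescript
// Fields of interest from an "envoy websocket closed" log line.
interface CloseInfo {
  lifetimeSeconds: number | null;
  closeCode: number | null;
  abnormal: boolean; // true unless the close code is 1000 (normal closure)
}

// Hypothetical helper: pulls key=value fields out by regex, so elided parts
// of the line ("...") do not matter.
function parseCloseLine(line: string): CloseInfo {
  const lifetime = /lifetime_seconds=([0-9]+(?:\.[0-9]+)?)/.exec(line);
  const code = /incoming_close_code=Some\((\d+)\)/.exec(line);
  const closeCode = code ? Number(code[1]) : null;
  return {
    lifetimeSeconds: lifetime ? Number(lifetime[1]) : null,
    closeCode,
    abnormal: closeCode !== 1000,
  };
}

const sample =
  'envoy websocket closed ... lifetime_seconds=5.873 ... incoming_close_code=Some(1011) ... err=Some("Connection reset without closing handshake")';
console.log(parseCloseLine(sample));
```

WebSocket close code 1011 signals an internal error on the sending side, while 1000 is a normal closure, which is why the sketch treats anything other than 1000 as abnormal.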

Attempted fix: building .so from main

We cloned the repo, ran scripts/secure-exec-dep.mjs prepare-build (clone secure-exec @ 0bf7dcb + local path deps), and built libagentos_actor_plugin.so + agentos-sidecar from main (HEAD 95aa124, Jun 30).

Result: The .so from main loaded without crashing, and the sidecar stayed alive (15.8s, clean close with code 1000). However:

  • createSession timed out with actor_ready_timeout (actor started but client timed out waiting for readiness)
  • Replacing the TS forwarder (@rivet-dev/agentos dist) from main caused parse config_json: agent-os config JSON parse error — schema mismatch between main's agentos-core and the npm v0.2.4 packages
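The config parse error came from mixing a build from main with published v0.2.4 packages, so a quick check for version skew within the @rivet-dev scope may catch this earlier. The sketch below is an illustration only: `findVersionSkew` is a hypothetical helper, and the `0.0.0-local-main` version string is made up to stand in for a local build.

```typescript
// Returns the packages in `scope` whose installed version differs from `expected`.
function findVersionSkew(
  installed: Record<string, string>,
  scope: string,
  expected: string,
): string[] {
  return Object.entries(installed)
    .filter(([name, version]) => name.startsWith(scope + "/") && version !== expected)
    .map(([name, version]) => `${name}@${version}`)
    .sort();
}

// Versions from the Environment section, with agentos swapped for a local build.
const installed: Record<string, string> = {
  "@rivet-dev/agentos": "0.0.0-local-main", // hypothetical version for a build from main
  "@rivet-dev/agentos-core": "0.2.4",
  "@rivet-dev/agentos-plugin-linux-x64-gnu": "0.2.4",
  "@secure-exec/sidecar": "0.3.3", // different scope, ignored by the check
};
console.log(findVersionSkew(installed, "@rivet-dev", "0.2.4"));
```

An empty result means every package in the scope is on the expected version. The version map could be filled from each package's own package.json under node_modules.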

What actually worked: clean v0.2.4

After restoring all packages to npm v0.2.4, everything works:

✅ createSession (pi)      → { sessionId: "4b497ab8-..." }
✅ createSession (opencode) → { sessionId: "ses_0e45e7320f..." }
✅ listPersistedSessions   → [{ sessionId, agentType, createdAt }]
✅ getSessionEvents        → []
✅ sendPrompt              → works
✅ closeSession            → works

The original sidecar crash was likely caused by orphaned rivet-engine processes left over from previous runs. With pkill -9 -f rivet-engine run before each start, v0.2.4 works reliably.
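The cleanup step can be folded into a start script. Below is a minimal sketch in which `clearOrphans`, `Runner` and `CleanupResult` are hypothetical names. The runner is injected so the example executes without touching real processes; the exit-status handling follows the procps pkill convention (0 = matched, 1 = no match).

```typescript
// Result shape compatible with Node's child_process.spawnSync (only `status` is used).
type Runner = (cmd: string, args: string[]) => { status: number | null };

type CleanupResult = "killed" | "none-found" | "error";

// Runs `pkill -9 -f <pattern>` through the injected runner and interprets the
// exit status: 0 = at least one process matched, 1 = nothing matched.
function clearOrphans(run: Runner, pattern: string): CleanupResult {
  const { status } = run("pkill", ["-9", "-f", pattern]);
  if (status === 0) return "killed";
  if (status === 1) return "none-found";
  return "error"; // 2 = syntax error, 3 = fatal error, null = pkill did not exit normally
}

// Stand-in runner that pretends no orphaned process exists.
const fakeRunner: Runner = () => ({ status: 1 });
console.log(clearOrphans(fakeRunner, "rivet-engine"));
```

In a real script the runner would be `(cmd, args) => spawnSync(cmd, args)` with `spawnSync` imported from `node:child_process`. Passing the arguments as an array rather than a shell string matters here: with `-f`, a wrapping `sh -c "pkill ... rivet-engine"` would match its own command line.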

Available actor actions in v0.2.4

For reference, the actor plugin exposes these actions (found in crates/agentos-actor-plugin/src/actions/mod.rs):

| Action | Args | Returns |
| --- | --- | --- |
| createSession | (agentType: string, options?: CreateSessionOptions) | { sessionId: string } |
| sendPrompt | (sessionId: string, text: string) | result |
| closeSession | (sessionId: string) | () |
| listPersistedSessions | () | Session[] |
| getSessionEvents | (sessionId: string) | Event[] |
| respondPermission | (sessionId, permissionId, reply) | () |
| createSignedPreviewUrl | (port: u16, ttlSeconds: u64) | dto |
| expireSignedPreviewUrl | (token: string) | () |
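
As an illustration of how the session actions in the table fit together, here is a sketch of a create / prompt / close cycle. It is not the package's client API: the `SessionActions` interface, the Promise return types, the option and result shapes, and the in-memory fake are all assumptions made so the example can run on its own.

```typescript
// Subset of the actions table, typed for a caller. Promise return types and the
// option/result shapes are assumptions; the table only gives names and arguments.
interface SessionActions {
  createSession(agentType: string, options?: Record<string, unknown>): Promise<{ sessionId: string }>;
  sendPrompt(sessionId: string, text: string): Promise<unknown>;
  closeSession(sessionId: string): Promise<void>;
}

// Creates a session, sends one prompt, and always closes the session afterwards,
// even if sendPrompt rejects.
async function promptOnce(actions: SessionActions, agentType: string, text: string): Promise<unknown> {
  const { sessionId } = await actions.createSession(agentType);
  try {
    return await actions.sendPrompt(sessionId, text);
  } finally {
    await actions.closeSession(sessionId);
  }
}

// In-memory stand-in that records calls; a real actor handle would replace it.
function makeFakeActions(): { actions: SessionActions; calls: string[] } {
  const calls: string[] = [];
  const actions: SessionActions = {
    async createSession(agentType) { calls.push(`createSession:${agentType}`); return { sessionId: "s1" }; },
    async sendPrompt(sessionId, text) { calls.push(`sendPrompt:${sessionId}`); return `echo:${text}`; },
    async closeSession(sessionId) { calls.push(`closeSession:${sessionId}`); },
  };
  return { actions, calls };
}

const demo = makeFakeActions();
promptOnce(demo.actions, "pi", "hello").then((reply) => console.log(reply, demo.calls));
```

Closing in a `finally` block keeps a failed prompt from leaving a session open.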

Note: listAgents, listSessions, getSession are not actor actions — they're methods on the AgentOs class (@rivet-dev/agentos-core), used inside the VM. listAgents info is available client-side via AGENT_CONFIGS export from agentos-core.
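
Since calling a non-action such as listAgents on the actor is an easy mistake, a small guard over the action names may help. A sketch, with `ACTOR_ACTIONS` and `isActorAction` as hypothetical names and the list copied from the actions listed above:

```typescript
// Action names from the v0.2.4 table above.
const ACTOR_ACTIONS = [
  "createSession",
  "sendPrompt",
  "closeSession",
  "listPersistedSessions",
  "getSessionEvents",
  "respondPermission",
  "createSignedPreviewUrl",
  "expireSignedPreviewUrl",
] as const;

type ActorAction = (typeof ACTOR_ACTIONS)[number];

// Type guard: narrows an arbitrary string to a known actor action name.
function isActorAction(name: string): name is ActorAction {
  return (ACTOR_ACTIONS as readonly string[]).indexOf(name) !== -1;
}

console.log(isActorAction("listPersistedSessions")); // true
console.log(isActorAction("listAgents")); // false: an AgentOs class method, not an actor action
```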

PR #1557

We also investigated PR #1557 ("re-pin @secure-exec/* to a durable main release"). This is CI hygiene only (re-pinning from branch preview to main for cargo prepare-build durability). It does not touch the actor plugin and is not related to the #1457 bug. Our installed @secure-exec/* is v0.3.3 (stable), not the preview versions the PR re-pins.

Conclusion

  • v0.2.4 works correctly in WSL2: the #1457 fix (4c2b9bb) is included and functional
  • The sidecar crash was environmental (orphaned processes), not a code bug
  • Building from main is not viable for end users due to schema mismatches between main and published v0.2.4 (requires full workspace build + @agentos-software/manifest which isn't on npm)
  • A v0.2.5+ npm release with the latest main commits would be welcome for non-WSL environments (Docker) where the actor_ready_timeout race still occasionally occurs
