Durable Agentic Harness — crash-safe autonomous AI agents with human-in-the-loop approval #118

@darshitvvora

Project link

https://github.com/darshitvvora/durable-agentic-harness

Language

Python

Short description (max 256 chars)

An autonomous stock-trading agent (OpenAI Agents SDK) that reframes Temporal as the Durable OS for agentic AI: workers = scheduling, event history = autosave, signals = human-in-the-loop. Kill the worker mid-trade and the agent resumes from the exact line.

Long Description

Temporal: The Durable OS for Agentic AI

Agents are easy to demo, hard to operate. Every production agent eventually hits the
same wall — LLM calls flake, workers crash mid-tool-call, human approvals stall for
hours, parallel work loses children on restart. The usual answer ("just add retries +
Redis + a state machine") is the long road to badly reinventing Temporal.

This demo reframes Temporal not as a workflow engine but as OS primitives for agent
loops, underneath an autonomous OpenAI-Agents-SDK stock-trading agent:

| OS primitive | Temporal equivalent | What it gives agents |
| --- | --- | --- |
| Process scheduling | Workers + task queues | LLM/tool work dispatched durably |
| Autosave / journaling | Event history | Replay from the exact event after a crash |
| IPC / interrupts | Signals, Updates & Queries | Human-in-the-loop, mid-flight steering |
| Memory / state | Workflow state | Survives restarts; no Redis, no S3 checkpoints |
| Drivers | Activities | Side effects, retried by default (idempotency is still the activity's job) |
| Long-lived sleep | workflow.sleep() | Pause days/weeks at zero CPU cost |
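To make the "autosave / journaling" row concrete, here is a toy, standard-library-only model of replay. It is not Temporal's implementation and not code from the repo: `Journal`, `run_step`, `trade_loop`, and `Crash` are hypothetical names invented for this sketch. The idea it illustrates is that completed steps are recorded in an append-only history, and after a crash the loop re-runs from the top but reads recorded results back instead of re-executing their side effects.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Journal:
    """Append-only record of completed steps (a stand-in for event history)."""
    events: list[tuple[str, Any]] = field(default_factory=list)


class Crash(Exception):
    """Simulates the worker process dying mid-run."""


def run_step(journal: Journal, cursor: int, name: str, fn: Callable[[], Any]) -> Any:
    # A step that already completed in an earlier run is served from the
    # journal, so its side effect is not executed a second time.
    if cursor < len(journal.events):
        recorded_name, result = journal.events[cursor]
        if recorded_name != name:
            raise RuntimeError("step order changed between runs (non-deterministic)")
        return result
    result = fn()
    journal.events.append((name, result))
    return result


def trade_loop(journal: Journal, calls: list[str], crash_after: int | None = None) -> str:
    # Hypothetical three-step tick: quote, decision, order.
    steps: list[tuple[str, Callable[[], Any]]] = [
        ("fetch_quote", lambda: 101.5),
        ("decide", lambda: "BUY"),
        ("place_order", lambda: "order-1"),
    ]
    results: list[Any] = []
    for cursor, (name, fn) in enumerate(steps):
        def tracked(name: str = name, fn: Callable[[], Any] = fn) -> Any:
            calls.append(name)  # only real executions are counted here
            return fn()

        results.append(run_step(journal, cursor, name, tracked))
        if crash_after is not None and cursor + 1 == crash_after:
            raise Crash(name)
    return f"{results[1]} @ {results[0]} -> {results[2]}"
```

Running `trade_loop(journal, calls, crash_after=2)`, catching `Crash`, and then calling `trade_loop(journal, calls)` again with the same journal executes `place_order` once and never repeats `fetch_quote` or `decide`. The name check in `run_step` hints at why replay-based systems require the loop itself to be deterministic.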

Who this is for / use cases:

The trading agent is the vehicle — the real subject is the pattern for any
long-running, autonomous, or human-supervised agent. Reach for this when:

  • Crash-safe agent loops — agents that run for minutes to weeks and must survive
    worker restarts, deploys, and infra failures without losing in-flight state or
    re-doing side-effects (orders placed, emails sent, payments made).
  • Human-in-the-loop approvals — workflows that pause indefinitely at zero CPU cost
    waiting on a human to approve/reject a high-stakes action (a large trade, a refund,
    a production change), then resume exactly where they left off.
  • Parallel fan-out / fan-in — exploring N strategies, prompts, or candidates
    concurrently in isolated sandboxes and selecting a winner, with automatic cleanup of
    children if the parent restarts.
  • Auditable AI decisions — every LLM call, tool call, and signal is a queryable
    event in history, giving you a replayable audit trail for compliance and debugging
    ("why did the agent do X at tick 14?").
  • LLM/tool reliability — wrapping flaky model and API calls as activities so they
    are retried automatically under Temporal's default retry policy (with idempotent
    activity code making those retries safe), instead of hand-rolling retry + backoff
    logic.
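The crash-safety and reliability bullets above both rest on one detail: a retried side effect is only safe if the receiving system can recognise the repeat. The sketch below shows that pairing with the standard library only. `FlakyBroker`, `with_retries`, and the key format are hypothetical and not taken from the repo; in a Temporal activity the retry loop would be supplied by the platform's retry policy rather than written by hand, and the key might plausibly be derived from the workflow and activity identifiers.

```python
import time
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class FlakyBroker:
    """Hypothetical broker: the first `failures` calls accept the order,
    then lose the response (the worst case for naive retries)."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.orders: Dict[str, str] = {}  # idempotency key -> order id

    def place_order(self, key: str, symbol: str, qty: int) -> str:
        # Dedupe on the idempotency key so a repeated request is a no-op.
        if key not in self.orders:
            self.orders[key] = f"{symbol}-{qty}-{len(self.orders) + 1}"
        if self.failures > 0:
            self.failures -= 1
            raise TimeoutError("response lost after the order was accepted")
        return self.orders[key]


def with_retries(
    fn: Callable[[], T],
    max_attempts: int = 5,
    initial_interval: float = 0.01,
    backoff: float = 2.0,
) -> T:
    """Retry `fn` on TimeoutError with exponential backoff."""
    delay = initial_interval
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TimeoutError:
            if attempt == max_attempts:
                raise
            time.sleep(delay)
            delay *= backoff
    raise RuntimeError("unreachable: max_attempts must be at least 1")
```

With `FlakyBroker(failures=2)` and a fixed key such as `"wf-1/tick-14"`, three attempts are made but the broker ends up holding a single order. Without the key check, each retry would have placed another one.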

If you're building agentic systems and finding yourself bolting on Redis, Celery, a
retry library, and a state machine for approvals, this demo shows what those concerns
look like when Temporal owns them instead.

What the agent does:

  1. Discovers a strategy by fanning out N parallel sandboxed backtests in airgapped
    Docker containers (child workflows).
  2. Lives through a tick loop: market + news context, LLM trade-intent via the OpenAI
    Agents SDK (activity_as_tool), a deterministic risk guardrail, and human-in-the-loop
    approval for large trades.
  3. Survives chaos — kill the worker mid-trade and Temporal replays the decision
    history to resume from the exact line.
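Step 2 above can be sketched as a deterministic guardrail followed by an approval gate. This is an in-memory illustration under stated assumptions, not the repo's code: the thresholds, `TradeIntent`, `risk_check`, `ApprovalGate`, and `handle_tick` are all hypothetical. In the actual harness the wait would presumably be made durable by a Temporal signal or update rather than an `asyncio.Event`, so that it survives worker restarts.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional

MAX_NOTIONAL = 50_000.0       # hypothetical hard limit: always reject above this
APPROVAL_NOTIONAL = 10_000.0  # hypothetical: a human must approve above this


@dataclass(frozen=True)
class TradeIntent:
    symbol: str
    side: str  # "BUY" or "SELL"
    quantity: int
    price: float


def risk_check(intent: TradeIntent) -> str:
    """Pure function of the intent: same input, same verdict, no I/O."""
    if intent.side not in ("BUY", "SELL") or intent.quantity <= 0:
        return "reject"
    notional = intent.quantity * intent.price
    if notional > MAX_NOTIONAL:
        return "reject"
    if notional > APPROVAL_NOTIONAL:
        return "needs_approval"
    return "auto_approve"


class ApprovalGate:
    """In-memory stand-in for a human decision delivered from outside the loop."""

    def __init__(self) -> None:
        self._decided = asyncio.Event()
        self._approved: Optional[bool] = None

    def decide(self, approved: bool) -> None:
        self._approved = approved
        self._decided.set()

    async def wait(self) -> bool:
        await self._decided.wait()
        return bool(self._approved)


async def handle_tick(intent: TradeIntent, gate: ApprovalGate) -> str:
    verdict = risk_check(intent)
    if verdict == "reject":
        return "rejected"
    if verdict == "needs_approval" and not await gate.wait():
        return "rejected_by_human"
    return "executed"
```

Keeping `risk_check` free of clocks, randomness, and network calls is what lets it sit between the LLM's proposal and the order placement as a guardrail whose verdict is reproducible on replay. Only intents in the middle band ever block on `gate.wait()`.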

Stack: FastAPI (sole Temporal client) + SSE, React 18 / Vite / Tailwind UI,
temporalio[openai-agents] workflows wired via OpenAIAgentsPlugin, Docker sandboxes
for backtests, and Mockoon for offline/deterministic market·news·broker data.

What it strips out — that you'd otherwise write yourself: no Celery, no Redis-backed
queue, no hand-rolled retry policy, no "save progress to S3" code, no state machine for
approvals, no orphan-child cleanup. Temporal owns all of it; what's left on top is just
the agent logic.

What Linux did for processes, Temporal does for agent loops.

Screenshots and architecture/sequence diagrams are in the repo README.

Author(s)

Darshit Vora - Staff Solutions Architect @ Temporal
LinkedIn - linkedin.com/in/darshitvvora

Labels

  • code exchange submission: Code and/or content about Temporal!
  • triage: Issues that Temporal folk need to look at
  • ziggy reviewed: Pre-screened by ZiggyBot
