Durable Agentic Harness — crash-safe autonomous AI agents with human-in-the-loop approval

### Project link

https://github.com/darshitvvora/durable-agentic-harness

### Language

Python

### Short description (max 256 chars)

An autonomous stock-trading agent (OpenAI Agents SDK) that reframes Temporal as the Durable OS for agentic AI: workers = scheduling, event history = autosave, signals = human-in-the-loop. Kill the worker mid-trade and the agent replays from the exact line.


### Long Description

**Temporal: The Durable OS for Agentic AI**

Agents are easy to demo, hard to operate. Every production agent eventually hits the
same wall — LLM calls flake, workers crash mid-tool-call, human approvals stall for
hours, parallel work loses children on restart. The usual answer ("just add retries +
Redis + a state machine") is the long road to badly reinventing Temporal.

This demo reframes Temporal not as a workflow engine but as **OS primitives for agent
loops**, underneath an autonomous OpenAI-Agents-SDK stock-trading agent:

| OS primitive | Temporal equivalent | What it gives agents |
|---|---|---|
| Process scheduling | Workers + task queues | LLM/tool work dispatched durably |
| Autosave / journaling | Event history | Replay from the exact event after a crash |
| IPC / interrupts | Signals, Updates & Queries | Human-in-the-loop, mid-flight steering |
| Memory / state | Workflow state | Survives restarts — no Redis, no S3 checkpoints |
| Drivers | Activities | Side effects, retried by default; idempotency is still up to the activity |
| Long-lived sleep | `workflow.sleep()` | Pause days/weeks at zero CPU cost |
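
To make the "event history = autosave" row concrete, here is a minimal toy model of journal-based replay in plain Python. It is not the Temporal API and not code from the repo: `ReplayContext`, `WorkerCrash`, `fetch_price`, and `place_order` are hypothetical names invented for illustration. The idea it sketches is that each completed step is recorded, and a restarted run reads recorded results back instead of re-executing the side effect.

```python
from typing import Any, Callable


class WorkerCrash(Exception):
    """Hypothetical stand-in for a worker being killed mid-run."""


class ReplayContext:
    """Toy journal: `history` survives crashes, `_cursor` is per run."""

    def __init__(self, history: list[tuple[str, Any]]) -> None:
        self.history = history
        self._cursor = 0

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        if self._cursor < len(self.history):
            # Replaying: return the recorded result, do not call fn again.
            recorded_name, result = self.history[self._cursor]
            if recorded_name != name:
                raise RuntimeError(
                    f"non-deterministic replay: expected {recorded_name!r}, got {name!r}"
                )
        else:
            # New work: run the side effect once and journal its result.
            result = fn()
            self.history.append((name, result))
        self._cursor += 1
        return result


calls: list[str] = []  # tracks which side effects really executed


def fetch_price() -> float:
    calls.append("fetch_price")
    return 101.5


def place_order() -> str:
    calls.append("place_order")
    return "order-1"


def agent_tick(ctx: ReplayContext, crash_before_order: bool = False) -> str:
    price = ctx.step("fetch_price", fetch_price)
    if crash_before_order:
        raise WorkerCrash("worker killed mid-trade")
    order_id = ctx.step("place_order", place_order)
    return f"{order_id}@{price}"


history: list[tuple[str, Any]] = []
try:
    agent_tick(ReplayContext(history), crash_before_order=True)
except WorkerCrash:
    pass  # a new "worker" picks the run up below

result = agent_tick(ReplayContext(history))
```

On the second run `fetch_price` is served from the journal, so each side effect executes once across the crash. The name check in `step` hints at why replayed workflow code has to be deterministic: the journal only makes sense if the code asks for the same steps in the same order.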

**Who this is for / use cases:**

The trading agent is the vehicle — the real subject is the *pattern* for any
long-running, autonomous, or human-supervised agent. Reach for this when:

- **Crash-safe agent loops** — agents that run for minutes to weeks and must survive
  worker restarts, deploys, and infra failures without losing in-flight state or
  re-doing side-effects (orders placed, emails sent, payments made).
- **Human-in-the-loop approvals** — workflows that pause indefinitely at zero CPU cost
  waiting on a human to approve/reject a high-stakes action (a large trade, a refund,
  a production change), then resume exactly where they left off.
- **Parallel fan-out / fan-in** — exploring N strategies, prompts, or candidates
  concurrently in isolated sandboxes and selecting a winner, with automatic cleanup of
  children if the parent restarts.
- **Auditable AI decisions** — every LLM call, tool call, and signal is a queryable
  event in history, giving you a replayable audit trail for compliance and debugging
  ("why did the agent do X at tick 14?").
- **LLM/tool reliability** — wrapping flaky model and API calls as activities so they
  are retried automatically under a configurable policy (idempotency remains the
  activity's responsibility), instead of hand-rolling retry + backoff logic.
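
Retries protect against lost work, but a retried side effect is only safe if repeating it is harmless. The sketch below is a plain-Python illustration of that point, assuming a broker that accepts an idempotency key. `FlakyBroker`, `with_retries`, and the key format are hypothetical and are not taken from the repo or from any real broker API.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class FlakyBroker:
    """Hypothetical broker: accepts the first order, then loses the response."""

    def __init__(self) -> None:
        self.orders: dict[str, dict] = {}
        self._fail_next = True

    def place_order(self, idempotency_key: str, symbol: str, qty: int) -> dict:
        # Deduplicate on the key so a retry cannot create a second order.
        if idempotency_key not in self.orders:
            self.orders[idempotency_key] = {"symbol": symbol, "qty": qty}
        if self._fail_next:
            self._fail_next = False
            raise TimeoutError("response lost after the order was accepted")
        return self.orders[idempotency_key]


def with_retries(fn: Callable[[], T], max_attempts: int = 3) -> T:
    """Retry on TimeoutError; backoff is omitted to keep the sketch short."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TimeoutError:
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")


broker = FlakyBroker()
order = with_retries(lambda: broker.place_order("tick-14-AAPL", "AAPL", 10))
```

The first attempt commits the order and then times out; the retry reuses the same key and gets the existing order back, so `broker.orders` ends up with a single entry.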

If you're building agentic systems and finding yourself bolting on Redis, Celery, a
retry library, and a state machine for approvals, this demo shows what those concerns
look like when Temporal owns them instead.


**What the agent does:**
1. **Discovers** a strategy by fanning out N parallel sandboxed backtests in airgapped
   Docker containers (child workflows).
2. **Lives** through a tick loop: market + news context, LLM trade-intent via the OpenAI
   Agents SDK (`activity_as_tool`), a deterministic risk guardrail, and human-in-the-loop
   approval for large trades.
3. **Survives** chaos: kill the worker mid-trade and Temporal replays the recorded
   event history, so the workflow resumes at the exact point where it left off.
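
The guardrail and approval steps in the tick loop can be sketched with plain `asyncio`. This is an illustrative model only, not the repo's workflow code: the thresholds, `TradeIntent`, `risk_check`, and `ApprovalGate` are assumptions, and an `asyncio.Event` stands in for a Temporal signal.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional

# Hypothetical limits, chosen only for this example.
APPROVAL_THRESHOLD = 10_000.0  # notional above this needs a human
MAX_NOTIONAL = 50_000.0        # notional above this is always rejected


@dataclass
class TradeIntent:
    symbol: str
    qty: int
    price: float


def risk_check(intent: TradeIntent) -> str:
    """Deterministic guardrail: same intent in, same verdict out."""
    notional = intent.qty * intent.price
    if notional > MAX_NOTIONAL:
        return "reject"
    if notional > APPROVAL_THRESHOLD:
        return "needs_approval"
    return "auto_approve"


class ApprovalGate:
    """Stand-in for a signal handler: a human calls signal(), the loop awaits."""

    def __init__(self) -> None:
        self._decided = asyncio.Event()
        self.approved: Optional[bool] = None

    def signal(self, approved: bool) -> None:
        self.approved = approved
        self._decided.set()

    async def wait(self) -> bool:
        await self._decided.wait()  # suspends without busy-waiting
        return bool(self.approved)


async def handle_intent(intent: TradeIntent, gate: ApprovalGate) -> str:
    verdict = risk_check(intent)
    if verdict == "reject":
        return "rejected_by_guardrail"
    if verdict == "needs_approval" and not await gate.wait():
        return "rejected_by_human"
    return "executed"


async def demo() -> tuple[bool, str]:
    gate = ApprovalGate()
    big_trade = TradeIntent("AAPL", 100, 200.0)  # 20,000 notional
    task = asyncio.create_task(handle_intent(big_trade, gate))
    await asyncio.sleep(0)            # let the task reach the gate
    was_waiting = not task.done()     # it is parked until a human decides
    gate.signal(True)
    return was_waiting, await task


was_waiting, outcome = asyncio.run(demo())
```

The difference in the real system is durability: an in-memory `asyncio.Event` is lost if the process dies, whereas a workflow waiting on a signal keeps its place in the event history across worker restarts.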

**Stack:** FastAPI (sole Temporal client) + SSE, React 18 / Vite / Tailwind UI,
`temporalio[openai-agents]` workflows wired via `OpenAIAgentsPlugin`, Docker sandboxes
for backtests, and Mockoon for offline/deterministic market, news, and broker data.

What it strips out (code you'd otherwise write yourself): no Celery, no Redis-backed
queue, no hand-rolled retry policy, no "save progress to S3" code, no state machine for
approvals, no orphan-child cleanup. Temporal owns all of it; what's left on top is just
the agent logic.

*What Linux did for processes, Temporal does for agent loops.*

Screenshots and architecture/sequence diagrams are in the repo README.


### Author(s)

Darshit Vora - Staff Solutions Architect @ Temporal
LinkedIn - [linkedin.com/in/darshitvvora](https://www.linkedin.com/in/darshitvvora/)

