DurableAgent known limitations: sandbox CPU billing, abort signal, double billing, verify loop context

## Context

We are building a production multi-agent AI orchestration platform using DurableAgent from @workflow/ai. The migration from a monolithic "use step" approach to DurableAgent (per-LLM-call + per-tool step isolation) is working but has several architectural limitations that affect correctness and billing accuracy.

Related: #1737 (development mode performance), #1315 (linear step overhead growth), #1160 (queue delay)

## Environment

- workflow: 4.2.1
- @workflow/ai: 4.1.1
- @workflow/core: 4.2.1
- Next.js: 16.1.6
- AI SDK: 6.x

## Limitation 1: No AbortSignal in V8 Sandbox (Credit Circuit Breaker)

The "use workflow" function runs in a V8 sandbox where AbortController does not exist. When a user's credit balance crosses the overdraft limit during onStepFinish, we cannot abort the DurableAgent mid-execution. We use a flag checked in prepareStep instead, which means the agent completes its current LLM call before stopping, potentially overspending by one response.
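For reference, the flag-based workaround can be sketched roughly as below. This is a minimal illustration, not the actual implementation: `createCreditBreaker`, `StepFinishEvent`, and `PrepareStepResult` are hypothetical names, and the real callback arguments in @workflow/ai may carry different fields. Credits are kept as integers to avoid floating-point drift.

```typescript
// Hypothetical shapes: only the fields this sketch reads. The real
// onStepFinish / prepareStep arguments may differ.
interface StepFinishEvent {
  usage: { totalTokens: number };
}
interface PrepareStepResult {
  toolChoice?: "none";
}
interface CreditBreaker {
  onStepFinish(event: StepFinishEvent): void;
  prepareStep(): PrepareStepResult;
  readonly tripped: boolean;
}

// Plain closure state instead of AbortController, which is unavailable
// in the workflow sandbox.
function createCreditBreaker(
  balance: number,
  overdraftLimit: number,
  creditsPerToken: number,
): CreditBreaker {
  let remaining = balance;
  let tripped = false;
  return {
    onStepFinish(event) {
      remaining -= event.usage.totalTokens * creditsPerToken;
      // Trip once the balance falls below the allowed overdraft.
      if (remaining < -overdraftLimit) tripped = true;
    },
    prepareStep() {
      // Once tripped, forbid further tool calls so the loop winds down.
      return tripped ? { toolChoice: "none" } : {};
    },
    get tripped() {
      return tripped;
    },
  };
}
```

The flag is only read at the start of the next step, which is why the LLM call already in flight when the limit is crossed still runs to completion.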

The original non-durable orchestrator calls AbortController.abort() to stop streamText immediately. DurableAgent accepts abortSignal in its stream() options, but constructing an AbortController at workflow level throws a ReferenceError.

Question: Is there a way to trigger DurableAgent's abort from within onStepFinish or prepareStep? Or could the V8 sandbox expose AbortController?

## Limitation 2: Double Billing Risk on Lambda Crash + Retry

onStepFinish is a side-effect callback, not a workflow step. If a Lambda crashes after onStepFinish fires (credits deducted) but before the step result is persisted, the SDK retries the step. The retried step's onStepFinish fires again, deducting credits a second time.

The creditsAlreadyDeducted counter resets for each agent (necessary for chained agents), so it cannot detect the duplicate deduction.

Question: Does DurableAgent suppress onStepFinish for cached/replayed steps, or does it fire every time, even on replay? If the SDK could pass an "isReplay" flag to onStepFinish, we could skip billing on replays.
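Independent of SDK support, one possible mitigation is to make the deduction itself idempotent by keying it on a deterministic step identity. The sketch below uses an in-memory set purely for illustration; `Ledger`, `createInMemoryLedger`, and `billingKey` are hypothetical names, and it assumes a run ID, agent ID, and step index that stay the same when a step is retried, which would need to be confirmed against the SDK.

```typescript
// Hypothetical ledger interface. In production the `seen` set would be a
// unique constraint or conditional write in the billing store.
interface Ledger {
  // Returns true if credits were deducted, false if the key was already billed.
  deduct(key: string, credits: number): boolean;
  balance(): number;
}

function createInMemoryLedger(initial: number): Ledger {
  const seen = new Set<string>();
  let balance = initial;
  return {
    deduct(key, credits) {
      if (seen.has(key)) return false; // duplicate delivery: no-op
      seen.add(key);
      balance -= credits;
      return true;
    },
    balance: () => balance,
  };
}

// Assumption: these three values are stable across a crash and retry.
function billingKey(runId: string, agentId: string, stepIndex: number): string {
  return `${runId}:${agentId}:${stepIndex}`;
}
```

Because the key includes the agent ID, chained agents can still restart their step index at 0 without colliding with each other.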

## Limitation 3: Verify Loop Context (Coding Agent Fix Cycles)

After the coding agent completes, a verify loop checks the sandbox for errors and runs fix cycles using streamText. The fix cycle needs tool access (sandboxBash, sandboxWriteFile, etc.) to fix errors.

The problem: runVerifyLoopStep is a "use step" function that calls streamText with tools from buildDurableTools(config.toolSchemas). Those tools are themselves "use step" functions. This creates nested steps (a step calling sub-steps), which is not officially supported.

Additionally, the fix cycle uses config.messages (pre-agent messages) instead of result.messages (post-agent with all tool call results), so the fix agent cannot see what was already built.

Question: Is there a supported pattern for running a secondary agent loop after the primary DurableAgent completes? Could the verify loop be restructured as a separate workflow-level agent call instead of a nested step?
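To make the question concrete, the restructuring could look roughly like the following. It is a sketch under assumptions, not a confirmed pattern: `RunAgent`, `CheckSandbox`, and `runWithVerify` are hypothetical, the agent calls are injected as plain async functions rather than real DurableAgent invocations, and each agent is assumed to return the full transcript including its input messages.

```typescript
type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
};
interface AgentResult {
  messages: Message[];
}
// Hypothetical: stands in for a workflow-level agent call.
type RunAgent = (messages: Message[]) => Promise<AgentResult>;
// Hypothetical: stands in for a step that returns sandbox errors.
type CheckSandbox = () => Promise<string[]>;

async function runWithVerify(
  initial: Message[],
  primary: RunAgent,
  fixer: RunAgent,
  check: CheckSandbox,
  maxCycles = 3,
): Promise<{ messages: Message[]; errors: string[]; cycles: number }> {
  // Keep the post-agent transcript, not the pre-agent input.
  let { messages } = await primary(initial);
  let errors = await check();
  let cycles = 0;
  while (errors.length > 0 && cycles < maxCycles) {
    const fixRequest: Message = {
      role: "user",
      content: `Fix these errors:\n${errors.join("\n")}`,
    };
    // The fixer sees everything the primary agent already did.
    ({ messages } = await fixer([...messages, fixRequest]));
    errors = await check();
    cycles++;
  }
  return { messages, errors, cycles };
}
```

Here the fix agent and the sandbox check are siblings called from the workflow function, so no step invokes another step, and the fixer starts from result.messages rather than config.messages.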

## Limitation 4: Sandbox CPU Billing (Upstash Box)

Each sandbox tool wrapper reconnects to the Upstash Box via reconnectUpstashBox(boxId). The reconnected adapter creates a fresh activeCpuMs counter starting at 0. The original orchestrator maintains a single adapter across the entire session, accumulating CPU time accurately.

The Upstash Box client SDK does not expose server-side CPU metrics. The only way to track CPU is the adapter's local counter, which resets on each reconnection.

Because every tool call starts from a fresh counter, no session-wide CPU total is ever accumulated, so sandbox compute goes unbilled in durable mode. In production, this is a revenue leak.

Question: This is primarily an Upstash Box SDK limitation, but is there a way to maintain persistent state (like a CPU counter) across workflow steps without serializing the adapter itself?
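One direction that avoids serializing the adapter is to have each tool step report the CPU time it consumed as part of its return value, and let the workflow sum those deltas. The sketch below is illustrative only: `BoxAdapter` models just the activeCpuMs counter described above, `meterStep` and `Metered` are hypothetical, and `reconnect` stands in for reconnectUpstashBox(boxId). How the delta would travel from a tool result back to workflow state depends on how DurableAgent surfaces tool outputs.

```typescript
// Only the counter field from the issue; the real adapter has more.
interface BoxAdapter {
  activeCpuMs: number;
}
interface Metered<T> {
  output: T;
  cpuMs: number; // CPU time consumed by this step only
}

// Reconnects, runs the work, and returns the counter delta with the
// result so it is captured in the step's return value.
async function meterStep<T>(
  reconnect: () => Promise<BoxAdapter>,
  run: (box: BoxAdapter) => Promise<T>,
): Promise<Metered<T>> {
  const box = await reconnect();
  const before = box.activeCpuMs; // 0 after a fresh reconnect
  const output = await run(box);
  return { output, cpuMs: box.activeCpuMs - before };
}
```

If step return values are persisted along with the step, a replayed step would yield the same cpuMs without re-running, so the workflow-level total would stay consistent across retries.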

## Summary

| Limitation | Impact | Workaround Available? |
|---|---|---|
| No AbortSignal in V8 sandbox | One extra LLM call before credit circuit breaker stops agent | Flag + prepareStep toolChoice:"none" |
| Double billing on retry | Potential 2x charge for crashed steps | None currently |
| Nested steps in verify loop | Not officially supported; fix agent lacks post-agent context | Works in practice but fragile |
| Sandbox CPU billing | Compute goes unbilled | None without server-side metrics |

We are happy to contribute fixes or test proposed solutions. These are the last blockers before production deployment.

DurableAgent known limitations: sandbox CPU billing, abort signal, double billing, verify loop context #1762
