seratch commented Dec 24, 2025

This pull request resolves #636 by adding a Human-in-the-Loop (HITL) feature to the Python SDK, following a design similar to the TS SDK: https://openai.github.io/openai-agents-js/guides/human-in-the-loop/

Huge thanks to #2021, which served as the foundation for this PR.

Key changes include:

  • Introduces a RunState serialization and resume pipeline (RunResult.to_state / RunState.from_json), along with approval and rejection helpers, enabling Runner.run to pause for HITL decisions and resume interrupted runs.
  • Refactors the agent runner to surface tool-approval interruptions across function tools, apply_patch and shell execution, handoffs, and hosted MCP or local MCP tools. Per-tool decisions are tracked in RunContext while preserving usage and turn metadata.
  • Extends realtime agents' tool flows to emit approval-required events and honor approval policies.
  • Keeps previous_response_id and conversation_id tracking in sync for server-backed sessions while a run is paused and resumed for HITL.
  • Adds HITL-focused examples (agent patterns, memory sessions, hosted MCP with on-approval, realtime UI), along with new MCP tool filter, remote, and SSE samples and sample files.
  • Expands test coverage and helper utilities for HITL flows, run-state serialization, shell and apply_patch handling, run-step processing, MCP approvals, and related error scenarios.
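
The pause-and-resume pipeline from the first bullet can be pictured with a toy model. The sketch below is not the SDK's RunState: `ToyRunState`, its fields, and its JSON layout are assumptions made up for illustration. It only shows the idea that a paused run records pending tool calls and the decisions made so far, survives a JSON round trip, and knows when it is ready to resume.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToyRunState:
    """Hypothetical stand-in for a paused run; not the SDK's RunState."""

    pending_call_ids: list[str] = field(default_factory=list)
    decisions: dict[str, bool] = field(default_factory=dict)  # call_id -> approved?

    def approve(self, call_id: str) -> None:
        self._decide(call_id, True)

    def reject(self, call_id: str) -> None:
        self._decide(call_id, False)

    def _decide(self, call_id: str, approved: bool) -> None:
        # A decision is only valid for a call that is still pending.
        if call_id not in self.pending_call_ids:
            raise KeyError(f"unknown tool call: {call_id}")
        self.pending_call_ids.remove(call_id)
        self.decisions[call_id] = approved

    @property
    def ready_to_resume(self) -> bool:
        # The run can continue once every pending call has a decision.
        return not self.pending_call_ids

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload: str) -> "ToyRunState":
        return cls(**json.loads(payload))
```

Storing the serialized state is what would let a decision arrive later, possibly in a different process, before the run is resumed.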

Examples

Handling interruptions

result = await Runner.run(agent, "What is the weather and temperature in Oakland?")
while result.interruptions:
    # Collect a decision for each pending tool call; confirm() is an app-defined async prompt helper.
    state = result.to_state()
    for interruption in result.interruptions:
        confirmed = await confirm("\nDo you approve this tool call?")
        if confirmed:
            print(f"✓ Approved: {interruption.name}")
            state.approve(interruption)
        else:
            print(f"✗ Rejected: {interruption.name}")
            state.reject(interruption)

    print("\nResuming agent execution...")
    result = await Runner.run(agent, state)
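
The loop above calls a `confirm` helper that the description does not define. A minimal sketch of one possible implementation follows; the `reader` parameter is an assumption added here so the prompt can be replaced in tests, and the blocking read is moved to a worker thread so the event loop stays responsive.

```python
import asyncio
from typing import Callable


async def confirm(question: str, reader: Callable[[str], str] = input) -> bool:
    """Ask a yes/no question and return True only for an explicit yes."""
    # input() blocks, so run it in a worker thread instead of the event loop.
    answer = await asyncio.to_thread(reader, f"{question} (y/n): ")
    return answer.strip().lower() in {"y", "yes"}
```

Anything other than "y" or "yes" counts as a rejection, which is the safer default for an approval prompt.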

Streaming mode:

# Stream the run and drain events before checking interruptions.
result = Runner.run_streamed(agent, "What is the weather and temperature in Oakland?")
async for _ in result.stream_events():
    pass

while result.interruptions:
    state = result.to_state()
    for interruption in result.interruptions:
        confirmed = await confirm("\nDo you approve this tool call?")
        if confirmed:
            print(f"✓ Approved: {interruption.name}")
            state.approve(interruption)
        else:
            print(f"✗ Rejected: {interruption.name}")
            state.reject(interruption)

    print("\nResuming agent execution (streamed)...")
    result = Runner.run_streamed(agent, state)
    async for _ in result.stream_events():
        pass
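
The streamed and non-streamed loops share the same approve/reject step, so it can be factored out. The helper below is a sketch, not part of the SDK: `apply_decisions` and `decide` are hypothetical names, and the code relies only on the attributes used in the examples above (`interruptions`, `to_state()`, `approve()`, `reject()`).

```python
from typing import Any, Awaitable, Callable


async def apply_decisions(
    result: Any,
    decide: Callable[[Any], Awaitable[bool]],
) -> Any:
    """Approve or reject every pending interruption and return the updated state."""
    state = result.to_state()
    for interruption in result.interruptions:
        # decide() is supplied by the app, e.g. a prompt or a policy lookup.
        if await decide(interruption):
            state.approve(interruption)
        else:
            state.reject(interruption)
    return state
```

With this helper, each loop body reduces to `state = await apply_decisions(result, decide)` followed by the matching `Runner.run` or `Runner.run_streamed` call.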

Enable HITL for function tools

Simplest example:

@function_tool(needs_approval=True)
async def update_seat(confirmation_number: str, new_seat: str) -> str:
    ...  # tool body omitted

Passing a function that decides per call:

async def _needs_temperature_approval(_ctx, params, _call_id) -> bool:
    return "Oakland" in params.get("city", "")

@function_tool(needs_approval=_needs_temperature_approval)
async def get_temperature(city: str) -> str:
    ...  # tool body omitted
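
Because the approval callable is an ordinary async function, it can be exercised on its own before it is passed to `needs_approval`. The sketch below assumes, as in the snippet above, that the second argument is a dict of parsed tool arguments. `SENSITIVE_CITIES` and `needs_city_approval` are hypothetical names, and unlike the substring check above this version matches the whole city name, ignoring case.

```python
from typing import Any

SENSITIVE_CITIES = {"oakland", "berkeley"}  # hypothetical policy data


async def needs_city_approval(_ctx: Any, params: dict[str, Any], _call_id: str) -> bool:
    """Require approval only when the requested city is on the sensitive list."""
    # Normalize the argument so " Oakland " and "oakland" are treated alike.
    city = str(params.get("city", "")).strip().lower()
    return city in SENSITIVE_CITIES
```

Keeping the policy in a plain function makes it easy to unit test without running an agent.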

Enable HITL for hosted/local MCP server tools

agent = Agent(
    name="MCP Assistant",
    instructions="....",
    tools=[
        HostedMCPTool(
            tool_config={
                "type": "mcp",
                "server_label": "deepwiki",
                "server_url": "https://mcp.deepwiki.com/sse",
                # Add this
                "require_approval": "always",  # or "never"
                # more granular control
                # "require_approval": {"always": {"tool_names": ["do_something", "send_something"]}},
            }
        )
    ],
)
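
To make the granular form of `require_approval` concrete, here is a sketch of how such a value could be interpreted for a single tool name. This is illustrative only and is not how the SDK or the hosted service evaluates the setting. The `"never"` key inside the dict form and the `default` used for unlisted tools are assumptions, and `mcp_tool_needs_approval` is a hypothetical name.

```python
from typing import Any, Union

ApprovalPolicy = Union[str, dict[str, Any]]


def mcp_tool_needs_approval(
    policy: ApprovalPolicy,
    tool_name: str,
    default: bool = True,
) -> bool:
    """Interpret a require_approval value for one tool name (illustrative only)."""
    if policy == "always":
        return True
    if policy == "never":
        return False
    if isinstance(policy, dict):
        if tool_name in policy.get("always", {}).get("tool_names", []):
            return True
        # Assumption: a "never" filter mirrors the "always" filter shown above.
        if tool_name in policy.get("never", {}).get("tool_names", []):
            return False
        # Assumption: unlisted tools fall back to a caller-chosen default.
        return default
    raise ValueError(f"unsupported require_approval value: {policy!r}")
```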

Agents as tools

When you expose an agent as a tool for another agent, you can require approval for the sub-agent run itself. Approval requests raised by the sub-agent's own tools are also merged into result.interruptions.

class UserContext(BaseModel):
    user_id: str

# get approval during the sub agent execution
@function_tool(needs_approval=True)
async def get_user_name(user_id: str) -> str:
    return lookup_user_name(user_id)

contract_expert = Agent[UserContext](
    name="contract expert",
    instructions="You are a contract expert agent. You are responsible for handling the contract requests.",
    model_settings=ModelSettings(tool_choice="required"),
    tools=[get_user_name],
)
main_agent = Agent(
    name="customer service agent",
    instructions="You are a customer service agent. You are responsible for handling the customer's requests.",
    tools=[
        contract_expert.as_tool(
            tool_name="help_with_contract",
            tool_description="Help the customer with their contract questions",
            # get approval for this agent execution
            needs_approval=True,
        ),
    ],
)

Shell tools

agent = Agent(
    name="Shell HITL Assistant",
    model="gpt-5.2",
    instructions="You can run shell commands using the shell tool.",
    tools=[
        # ShellExecutor runs local shell commands when the call is approved.
        ShellTool(executor=ShellExecutor(), needs_approval=True)
    ],
)
result = await Runner.run(agent, prompt)
while result.interruptions:
    state = result.to_state()
    for interruption in result.interruptions:
        # _extract_commands and prompt_shell_approval are app-defined helpers.
        commands = _extract_commands(interruption)
        approved, always = await prompt_shell_approval(commands)
        if approved:
            state.approve(interruption, always_approve=always)
        else:
            state.reject(interruption, always_reject=always)

    result = await Runner.run(agent, state)

print(f"\nFinal response:\n{result.final_output}")
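
The shell example relies on a `prompt_shell_approval` helper that returns an `(approved, always)` pair but is not shown. One way it might look is sketched below. The prompt wording, the trailing `!` convention for "remember this decision", and the `reader` parameter are all assumptions for illustration. `_extract_commands` is left out because it depends on the shape of the interruption object, which the description does not spell out.

```python
import asyncio
from typing import Callable, Sequence


async def prompt_shell_approval(
    commands: Sequence[str],
    reader: Callable[[str], str] = input,
) -> tuple[bool, bool]:
    """Ask about a batch of shell commands; returns (approved, always)."""
    listing = "\n".join(f"  $ {command}" for command in commands)
    prompt = (
        f"The agent wants to run:\n{listing}\n"
        "Approve? [y/n], append '!' to remember the decision: "
    )
    # Read in a worker thread so the event loop is not blocked by input().
    answer = (await asyncio.to_thread(reader, prompt)).strip().lower()
    always = answer.endswith("!")
    approved = answer.rstrip("!").strip() in {"y", "yes"}
    return approved, always
```

Under this convention, "y!" maps to `always_approve=True` and "n!" maps to `always_reject=True` in the loop above.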

Realtime agents

When a tool call requires approval, your app receives a "tool_approval_required" event. It can then, for example, show the user a confirmation dialog and forward the decision through handlers like these:

async def approve_tool_call(self, session_id: str, call_id: str, *, always: bool = False):
    """Approve a pending tool call for a session."""
    session = self.active_sessions.get(session_id)
    if not session:
        return
    await session.approve_tool_call(call_id, always=always)

async def reject_tool_call(self, session_id: str, call_id: str, *, always: bool = False):
    """Reject a pending tool call for a session."""
    session = self.active_sessions.get(session_id)
    if not session:
        return
    await session.reject_tool_call(call_id, always=always)
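
To connect a UI to the two handlers above, the app needs to turn the user's answer into one of those calls. The sketch below assumes a hypothetical JSON message from the front end with `type`, `call_id`, `approve`, and `always` fields. That message format and the `handle_approval_message` name are made up for illustration. Only `approve_tool_call` and `reject_tool_call` come from the snippet above.

```python
from typing import Any, Mapping


async def handle_approval_message(
    manager: Any,
    session_id: str,
    message: Mapping[str, Any],
) -> bool:
    """Route a UI decision to approve_tool_call / reject_tool_call.

    Returns False when the message is not a usable approval decision.
    """
    if message.get("type") != "tool_approval_decision":  # hypothetical message type
        return False
    call_id = message.get("call_id")
    if not isinstance(call_id, str) or not call_id:
        return False
    always = bool(message.get("always", False))
    if message.get("approve"):
        await manager.approve_tool_call(session_id, call_id, always=always)
    else:
        # A missing or falsy "approve" field is treated as a rejection.
        await manager.reject_tool_call(session_id, call_id, always=always)
    return True
```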

seratch marked this pull request as ready for review December 24, 2025 04:37

seratch commented Dec 24, 2025

I've finished the basic pattern testing, but there may still be some uncovered cases. If anyone is interested in trying this feature early using this git branch, your feedback would be greatly appreciated.

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

seratch commented Dec 24, 2025

@codex review the whole changes

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Development

Successfully merging this pull request may close these issues.

Human-In-The-Loop Architecture should be implemented on top priority!