Summary
Propose a GuardrailProvider protocol that intercepts tool calls before execution, enabling policy-based approval, audit logging, and argument sanitization. This plugs into the existing BaseTool.run_json() and Workbench.call_tool() paths without breaking backward compatibility.
Motivation
AutoGen currently has no standardized hook point between an agent deciding to call a tool and the tool executing. The community has raised this gap from multiple angles:
#6017 -- Guardrails and Safety epic: comments call for "a scanning layer that inspects messages between agents" and for auditing tool calls at agent boundaries.
#5891 -- Support Approval Func in BaseTool: proposes an approval_func parameter on BaseTool, with open design questions about whether approval belongs at the tool or agent level.
#5921 -- Agentic Identity and Access Management (AIAM): identifies ten enterprise gaps including excessive permissions, missing audit trails, and inconsistent policy enforcement.
Issue #5891 tackles the approval surface specifically but scopes it to a boolean gate. A GuardrailProvider generalizes this to support argument rewriting, structured denial reasons, audit metadata, and composable policy chains -- all concerns raised across the issues above.
Proposed Interface
from __future__ import annotations

from abc import abstractmethod
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Mapping, Protocol, Sequence, runtime_checkable

from autogen_core import CancellationToken


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    MODIFY = "modify"


@dataclass
class GuardrailResult:
    """Outcome of a guardrail evaluation."""

    decision: Decision
    reason: str | None = None
    modified_args: Mapping[str, Any] | None = None  # only when Decision.MODIFY
    metadata: dict[str, Any] = field(default_factory=dict)  # audit trail data


@runtime_checkable
class GuardrailProvider(Protocol):
    """Intercepts tool calls before execution for policy enforcement."""

    @abstractmethod
    async def evaluate(
        self,
        *,
        tool_name: str,
        args: Mapping[str, Any],
        agent_name: str | None = None,
        call_id: str | None = None,
        cancellation_token: CancellationToken | None = None,
    ) -> GuardrailResult:
        """Evaluate whether a tool call should proceed.

        Args:
            tool_name: Name of the tool being invoked.
            args: Arguments the agent wants to pass.
            agent_name: Identity of the calling agent, if known.
            call_id: Correlation ID for the tool call.
            cancellation_token: For cooperative cancellation.

        Returns:
            GuardrailResult indicating allow, deny, or modify.
        """
        ...
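To make the three decisions concrete, here is a minimal sketch of a provider that satisfies the interface above structurally. It is illustrative only: LimitClampingProvider, blocked_tools, and max_limit are hypothetical names, and Decision and GuardrailResult are re-declared locally (without the autogen_core import) so the snippet runs on its own.

```python
import asyncio
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Iterable, Mapping, Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    MODIFY = "modify"


@dataclass
class GuardrailResult:
    # Local mirror of the proposed result type, kept dependency-free.
    decision: Decision
    reason: Optional[str] = None
    modified_args: Optional[Mapping[str, Any]] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


class LimitClampingProvider:
    """Hypothetical provider: denies blocked tools and clamps a 'limit' argument."""

    def __init__(self, blocked_tools: Iterable[str], max_limit: int) -> None:
        self._blocked = frozenset(blocked_tools)
        self._max_limit = max_limit

    async def evaluate(
        self,
        *,
        tool_name: str,
        args: Mapping[str, Any],
        agent_name: Optional[str] = None,
        call_id: Optional[str] = None,
        cancellation_token: Any = None,  # stands in for CancellationToken
    ) -> GuardrailResult:
        audit = {"tool_name": tool_name, "agent_name": agent_name, "call_id": call_id}
        if tool_name in self._blocked:
            return GuardrailResult(
                Decision.DENY, reason=f"tool '{tool_name}' is blocked", metadata=audit
            )
        limit = args.get("limit")
        if isinstance(limit, int) and limit > self._max_limit:
            # Rewrite the argument instead of rejecting the whole call.
            clamped = {**args, "limit": self._max_limit}
            return GuardrailResult(
                Decision.MODIFY,
                reason=f"limit clamped to {self._max_limit}",
                modified_args=clamped,
                metadata=audit,
            )
        return GuardrailResult(Decision.ALLOW, metadata=audit)
```

Because the protocol is structural, the class does not need to inherit from GuardrailProvider; it only has to expose a matching evaluate() coroutine.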
Integration Points
1. BaseTool.run_json() -- tool-level guard
This requires only a minimal change to run_json() in BaseTool.
2. Workbench.call_tool() -- workbench-level guard
For MCP and dynamic tool sources, guardrails can wrap call_tool() at the workbench layer, covering tools that do not subclass BaseTool.
3. AssistantAgent -- agent-level guard
Pass providers to AssistantAgent, which forwards them to its tools, consistent with the pattern proposed in #5891 for approval_func.
Constructor Addition to BaseTool
Fully backward compatible -- existing tools and subclasses are unaffected.
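The control flow shared by these integration points can be sketched as follows. This is not the actual BaseTool.run_json() or Workbench.call_tool() code: run_with_guardrails, GuardrailDenied, BlockTool, ClampLimit, and fake_tool are hypothetical stand-ins, and the result types are re-declared locally so the snippet is self-contained. It assumes providers are evaluated in order, that a DENY short-circuits the call, and that a MODIFY replaces the arguments seen by later providers and by the tool.

```python
import asyncio
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Awaitable, Callable, Dict, Mapping, Optional, Sequence


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    MODIFY = "modify"


@dataclass
class GuardrailResult:
    decision: Decision
    reason: Optional[str] = None
    modified_args: Optional[Mapping[str, Any]] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


class GuardrailDenied(Exception):
    """Hypothetical error raised when any provider denies the call."""


async def run_with_guardrails(
    tool_name: str,
    args: Mapping[str, Any],
    run: Callable[[Mapping[str, Any]], Awaitable[Any]],
    providers: Sequence[Any] = (),
) -> Any:
    """Evaluate each provider in order, then execute the tool body."""
    current = dict(args)
    for provider in providers:
        result = await provider.evaluate(tool_name=tool_name, args=current)
        if result.decision is Decision.DENY:
            raise GuardrailDenied(result.reason or "denied by guardrail")
        if result.decision is Decision.MODIFY and result.modified_args is not None:
            current = dict(result.modified_args)
    # With no providers configured, this is just the original call.
    return await run(current)


class BlockTool:
    """Hypothetical provider that denies one tool by name."""

    def __init__(self, name: str) -> None:
        self._name = name

    async def evaluate(self, *, tool_name: str, args: Mapping[str, Any], **_: Any) -> GuardrailResult:
        if tool_name == self._name:
            return GuardrailResult(Decision.DENY, reason=f"'{tool_name}' is blocked")
        return GuardrailResult(Decision.ALLOW)


class ClampLimit:
    """Hypothetical provider that caps a 'limit' argument at 100."""

    async def evaluate(self, *, tool_name: str, args: Mapping[str, Any], **_: Any) -> GuardrailResult:
        if args.get("limit", 0) > 100:
            return GuardrailResult(Decision.MODIFY, modified_args={**args, "limit": 100})
        return GuardrailResult(Decision.ALLOW)


async def fake_tool(args: Mapping[str, Any]) -> Dict[str, Any]:
    # Stand-in for the real tool body.
    return {"ran_with": dict(args)}
```

An empty provider sequence leaves behavior unchanged, which is the property the backward-compatibility claim relies on.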
Design Rationale
runtime_checkable protocols; avoids forcing inheritance
Decision enum with MODIFY
metadata on result
evaluate()
args
Example: Rate-Limiting Provider
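The example code for this section did not survive on the page. As a rough stand-in, the sketch below shows one way a rate-limiting provider could look, assuming a sliding window keyed by agent and tool. RateLimitProvider, max_calls, window_seconds, and the injectable clock are hypothetical names, and the result types are re-declared locally so the snippet runs without autogen_core.

```python
import asyncio
import time
from collections import defaultdict, deque
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Dict, Mapping, Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    MODIFY = "modify"


@dataclass
class GuardrailResult:
    decision: Decision
    reason: Optional[str] = None
    modified_args: Optional[Mapping[str, Any]] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


class RateLimitProvider:
    """Hypothetical sliding-window limiter, tracked per (agent, tool) pair."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested deterministically
        self._calls = defaultdict(deque)  # (agent, tool) -> timestamps of allowed calls

    async def evaluate(
        self,
        *,
        tool_name: str,
        args: Mapping[str, Any],
        agent_name: Optional[str] = None,
        call_id: Optional[str] = None,
        cancellation_token: Any = None,  # stands in for CancellationToken
    ) -> GuardrailResult:
        now = self._clock()
        window = self._calls[(agent_name or "<unknown>", tool_name)]
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= self._window:
            window.popleft()
        if len(window) >= self._max_calls:
            retry_after = window[0] + self._window - now
            return GuardrailResult(
                Decision.DENY,
                reason=f"rate limit of {self._max_calls} calls per {self._window}s exceeded",
                metadata={"retry_after": retry_after, "call_id": call_id},
            )
        window.append(now)
        return GuardrailResult(
            Decision.ALLOW,
            metadata={"calls_in_window": len(window), "call_id": call_id},
        )
```

Only allowed calls are recorded, so a denied call does not extend the period during which further calls are rejected.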
Relationship to Existing Work
approval_func can be trivially wrapped as a GuardrailProvider. If maintainers prefer to land #5891 (Support Approval Func in BaseTool in AgentChat) first, GuardrailProvider can layer on top.
call_tool() is a natural second integration point.
A reference implementation of policy-based tool guardrails using this interface pattern is available in the APort Agent Guardrails project.
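The claim that approval_func can be wrapped as a GuardrailProvider can be illustrated with a small adapter. The signature of approval_func in #5891 is not settled, so this sketch assumes a callable taking (tool_name, args) and returning a bool, either synchronously or as an awaitable. ApprovalFuncGuardrail is a hypothetical name, and the result types are re-declared locally.

```python
import asyncio
import inspect
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Dict, Mapping, Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    MODIFY = "modify"


@dataclass
class GuardrailResult:
    decision: Decision
    reason: Optional[str] = None
    modified_args: Optional[Mapping[str, Any]] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


class ApprovalFuncGuardrail:
    """Hypothetical adapter: maps a boolean approval gate onto ALLOW / DENY."""

    def __init__(self, approval_func: Callable[[str, Mapping[str, Any]], Any]) -> None:
        # Assumed shape: approval_func(tool_name, args) -> bool or awaitable of bool.
        self._approval_func = approval_func

    async def evaluate(
        self,
        *,
        tool_name: str,
        args: Mapping[str, Any],
        agent_name: Optional[str] = None,
        call_id: Optional[str] = None,
        cancellation_token: Any = None,  # stands in for CancellationToken
    ) -> GuardrailResult:
        outcome = self._approval_func(tool_name, args)
        if inspect.isawaitable(outcome):
            outcome = await outcome
        if outcome:
            return GuardrailResult(Decision.ALLOW, metadata={"source": "approval_func"})
        return GuardrailResult(
            Decision.DENY,
            reason="rejected by approval_func",
            metadata={"source": "approval_func"},
        )
```

A boolean gate never yields MODIFY, which is the main capability the richer result type adds on top of it.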
Scope and Non-Goals
This proposal covers tool call interception only. It does not cover:
Next Steps
approval_func or supersede it.
autogen-core with tests against FunctionTool and McpWorkbench.