Commit 50489c5

sarth6 and DouweM authored
Add Anthropic built-in WebFetchTool support (#3427)
Co-authored-by: Douwe Maan <[email protected]>
1 parent 085e21f commit 50489c5

16 files changed, +2574 -41 lines changed

docs/builtin-tools.md

Lines changed: 51 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -9,7 +9,7 @@ Pydantic AI supports the following built-in tools:
99
- **[`WebSearchTool`][pydantic_ai.builtin_tools.WebSearchTool]**: Allows agents to search the web
1010
- **[`CodeExecutionTool`][pydantic_ai.builtin_tools.CodeExecutionTool]**: Enables agents to execute code in a secure environment
1111
- **[`ImageGenerationTool`][pydantic_ai.builtin_tools.ImageGenerationTool]**: Enables agents to generate images
12-
- **[`UrlContextTool`][pydantic_ai.builtin_tools.UrlContextTool]**: Enables agents to pull URL contents into their context
12+
- **[`WebFetchTool`][pydantic_ai.builtin_tools.WebFetchTool]**: Enables agents to fetch web pages
1313
- **[`MemoryTool`][pydantic_ai.builtin_tools.MemoryTool]**: Enables agents to use memory
1414
- **[`MCPServerTool`][pydantic_ai.builtin_tools.MCPServerTool]**: Enables agents to use remote MCP servers with communication handled by the model provider
1515

@@ -306,18 +306,18 @@ For more details, check the [API documentation][pydantic_ai.builtin_tools.ImageG
306306
| `quality` |||
307307
| `size` |||
308308

309-
## URL Context Tool
309+
## Web Fetch Tool
310310

311-
The [`UrlContextTool`][pydantic_ai.builtin_tools.UrlContextTool] enables your agent to pull URL contents into its context,
311+
The [`WebFetchTool`][pydantic_ai.builtin_tools.WebFetchTool] enables your agent to pull URL contents into its context,
312312
allowing it to pull up-to-date information from the web.
313313

314314
### Provider Support
315315

316316
| Provider | Supported | Notes |
317317
|----------|-----------|-------|
318-
| Google | ✅ | No [`BuiltinToolCallPart`][pydantic_ai.messages.BuiltinToolCallPart] or [`BuiltinToolReturnPart`][pydantic_ai.messages.BuiltinToolReturnPart] is currently generated; please submit an issue if you need this. Using built-in tools and function tools (including [output tools](output.md#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](output.md#prompted-output) instead. |
318+
| Anthropic | ✅ | Full feature support. Uses Anthropic's [Web Fetch Tool](https://docs.claude.com/en/docs/agents-and-tools/tool-use/web-fetch-tool) internally to retrieve URL contents. |
319+
| Google | ✅ | No parameter support. The limits are fixed at 20 URLs per request with a maximum of 34MB per URL. Using built-in tools and function tools (including [output tools](output.md#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](output.md#prompted-output) instead. |
319320
| OpenAI | ❌ | |
320-
| Anthropic | ❌ | |
321321
| Groq | ❌ | |
322322
| Bedrock | ❌ | |
323323
| Mistral | ❌ | |
@@ -327,10 +327,10 @@ allowing it to pull up-to-date information from the web.
327327

328328
### Usage
329329

330-
```py {title="url_context_basic.py"}
331-
from pydantic_ai import Agent, UrlContextTool
330+
```py {title="web_fetch_basic.py"}
331+
from pydantic_ai import Agent, WebFetchTool
332332

333-
agent = Agent('google-gla:gemini-2.5-flash', builtin_tools=[UrlContextTool()])
333+
agent = Agent('google-gla:gemini-2.5-flash', builtin_tools=[WebFetchTool()])
334334

335335
result = agent.run_sync('What is this? https://ai.pydantic.dev')
336336
print(result.output)
@@ -339,6 +339,49 @@ print(result.output)
339339

340340
_(This example is complete, it can be run "as is")_
341341

342+
### Configuration Options
343+
344+
The `WebFetchTool` supports several configuration parameters:
345+
346+
```py {title="web_fetch_configured.py"}
347+
from pydantic_ai import Agent, WebFetchTool
348+
349+
agent = Agent(
350+
'anthropic:claude-sonnet-4-0',
351+
builtin_tools=[
352+
WebFetchTool(
353+
allowed_domains=['ai.pydantic.dev', 'docs.pydantic.dev'],
354+
max_uses=10,
355+
enable_citations=True,
356+
max_content_tokens=50000,
357+
)
358+
],
359+
)
360+
361+
result = agent.run_sync(
362+
'Compare the documentation at https://ai.pydantic.dev and https://docs.pydantic.dev'
363+
)
364+
print(result.output)
365+
"""
366+
Both sites provide comprehensive documentation for Pydantic projects. ai.pydantic.dev focuses on PydanticAI, a framework for building AI agents, while docs.pydantic.dev covers Pydantic, the data validation library. They share similar documentation styles and both emphasize type safety and developer experience.
367+
"""
368+
```
369+
370+
_(This example is complete, it can be run "as is")_
371+
372+
#### Provider Support
373+
374+
| Parameter | Anthropic | Google |
375+
|-----------|-----------|--------|
376+
| `max_uses` | ✅ | ❌ |
377+
| `allowed_domains` | ✅ | ❌ |
378+
| `blocked_domains` | ✅ | ❌ |
379+
| `enable_citations` | ✅ | ❌ |
380+
| `max_content_tokens` | ✅ | ❌ |
381+
382+
!!! note "Anthropic Domain Filtering"
383+
With Anthropic, you can only use either `blocked_domains` or `allowed_domains`, not both.
384+
342385
## Memory Tool
343386

344387
The [`MemoryTool`][pydantic_ai.builtin_tools.MemoryTool] enables your agent to use memory.
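
Note on the "Anthropic Domain Filtering" admonition added in this file: the docs say `allowed_domains` and `blocked_domains` cannot be combined with Anthropic. A minimal sketch of a pre-flight check follows. The helper `check_domain_filters` is hypothetical; it is not part of this commit or of Pydantic AI, and it only mirrors the documented restriction without calling any API.

```python
from __future__ import annotations


def check_domain_filters(
    allowed_domains: list[str] | None,
    blocked_domains: list[str] | None,
) -> None:
    """Hypothetical helper: reject configurations that set both filters."""
    if allowed_domains is not None and blocked_domains is not None:
        raise ValueError('Use either allowed_domains or blocked_domains, not both.')


# Only one filter is set: the check passes silently.
check_domain_filters(['ai.pydantic.dev'], None)

# Both filters are set: rejected before any request would be made.
try:
    check_domain_filters(['ai.pydantic.dev'], ['example.com'])
except ValueError as exc:
    print(exc)
```

A check like this could run before constructing `WebFetchTool(...)` for an Anthropic model, so the conflict surfaces locally rather than as a provider error.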

pydantic_ai_slim/pydantic_ai/__init__.py

Lines changed: 3 additions & 1 deletion
@@ -14,7 +14,8 @@
1414
ImageGenerationTool,
1515
MCPServerTool,
1616
MemoryTool,
17-
UrlContextTool,
17+
UrlContextTool, # pyright: ignore[reportDeprecated]
18+
WebFetchTool,
1819
WebSearchTool,
1920
WebSearchUserLocation,
2021
)
@@ -216,6 +217,7 @@
216217
# builtin_tools
217218
'WebSearchTool',
218219
'WebSearchUserLocation',
220+
'WebFetchTool',
219221
'UrlContextTool',
220222
'CodeExecutionTool',
221223
'ImageGenerationTool',

pydantic_ai_slim/pydantic_ai/builtin_tools.py

Lines changed: 64 additions & 3 deletions
@@ -6,13 +6,14 @@
66

77
import pydantic
88
from pydantic_core import core_schema
9-
from typing_extensions import TypedDict
9+
from typing_extensions import TypedDict, deprecated
1010

1111
__all__ = (
1212
'AbstractBuiltinTool',
1313
'WebSearchTool',
1414
'WebSearchUserLocation',
1515
'CodeExecutionTool',
16+
'WebFetchTool',
1617
'UrlContextTool',
1718
'ImageGenerationTool',
1819
'MemoryTool',
@@ -166,18 +167,78 @@ class CodeExecutionTool(AbstractBuiltinTool):
166167

167168

168169
@dataclass(kw_only=True)
169-
class UrlContextTool(AbstractBuiltinTool):
170+
class WebFetchTool(AbstractBuiltinTool):
170171
"""Allows your agent to access contents from URLs.
171172
173+
The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models.
174+
172175
Supported by:
173176
177+
* Anthropic
174178
* Google
175179
"""
176180

177-
kind: str = 'url_context'
181+
max_uses: int | None = None
182+
"""If provided, the tool will stop fetching URLs after the given number of uses.
183+
184+
Supported by:
185+
186+
* Anthropic
187+
"""
188+
189+
allowed_domains: list[str] | None = None
190+
"""If provided, only these domains will be fetched.
191+
192+
With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both.
193+
194+
Supported by:
195+
196+
* Anthropic, see <https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering>
197+
"""
198+
199+
blocked_domains: list[str] | None = None
200+
"""If provided, these domains will never be fetched.
201+
202+
With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both.
203+
204+
Supported by:
205+
206+
* Anthropic, see <https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering>
207+
"""
208+
209+
enable_citations: bool = False
210+
"""If True, enables citations for fetched content.
211+
212+
Supported by:
213+
214+
* Anthropic
215+
"""
216+
217+
max_content_tokens: int | None = None
218+
"""Maximum content length in tokens for fetched content.
219+
220+
Supported by:
221+
222+
* Anthropic
223+
"""
224+
225+
kind: str = 'web_fetch'
178226
"""The kind of tool."""
179227

180228

229+
@deprecated('Use `WebFetchTool` instead.')
230+
@dataclass(kw_only=True)
231+
class UrlContextTool(WebFetchTool):
232+
"""Deprecated alias for WebFetchTool. Use WebFetchTool instead.
233+
234+
Overrides kind to 'url_context' so old serialized payloads with {"kind": "url_context", ...}
235+
can be deserialized to UrlContextTool for backward compatibility.
236+
"""
237+
238+
kind: str = 'url_context'
239+
"""The kind of tool (deprecated value for backward compatibility)."""
240+
241+
181242
@dataclass(kw_only=True)
182243
class ImageGenerationTool(AbstractBuiltinTool):
183244
"""A builtin tool that allows your agent to generate images.

pydantic_ai_slim/pydantic_ai/models/anthropic.py

Lines changed: 69 additions & 2 deletions
@@ -13,7 +13,7 @@
1313
from .. import ModelHTTPError, UnexpectedModelBehavior, _utils, usage
1414
from .._run_context import RunContext
1515
from .._utils import guard_tool_call_id as _guard_tool_call_id
16-
from ..builtin_tools import CodeExecutionTool, MCPServerTool, MemoryTool, WebSearchTool
16+
from ..builtin_tools import CodeExecutionTool, MCPServerTool, MemoryTool, WebFetchTool, WebSearchTool
1717
from ..exceptions import ModelAPIError, UserError
1818
from ..messages import (
1919
BinaryContent,
@@ -67,6 +67,7 @@
6767
BetaBase64PDFBlockParam,
6868
BetaBase64PDFSourceParam,
6969
BetaCacheControlEphemeralParam,
70+
BetaCitationsConfigParam,
7071
BetaCitationsDelta,
7172
BetaCodeExecutionTool20250522Param,
7273
BetaCodeExecutionToolResultBlock,
@@ -114,12 +115,18 @@
114115
BetaToolUnionParam,
115116
BetaToolUseBlock,
116117
BetaToolUseBlockParam,
118+
BetaWebFetchTool20250910Param,
119+
BetaWebFetchToolResultBlock,
120+
BetaWebFetchToolResultBlockParam,
117121
BetaWebSearchTool20250305Param,
118122
BetaWebSearchToolResultBlock,
119123
BetaWebSearchToolResultBlockContent,
120124
BetaWebSearchToolResultBlockParam,
121125
BetaWebSearchToolResultBlockParamContentParam,
122126
)
127+
from anthropic.types.beta.beta_web_fetch_tool_result_block_param import (
128+
Content as WebFetchToolResultBlockParamContent,
129+
)
123130
from anthropic.types.beta.beta_web_search_tool_20250305_param import UserLocation
124131
from anthropic.types.model_param import ModelParam
125132

@@ -423,6 +430,8 @@ def _process_response(self, response: BetaMessage) -> ModelResponse:
423430
items.append(_map_web_search_tool_result_block(item, self.system))
424431
elif isinstance(item, BetaCodeExecutionToolResultBlock):
425432
items.append(_map_code_execution_tool_result_block(item, self.system))
433+
elif isinstance(item, BetaWebFetchToolResultBlock):
434+
items.append(_map_web_fetch_tool_result_block(item, self.system))
426435
elif isinstance(item, BetaRedactedThinkingBlock):
427436
items.append(
428437
ThinkingPart(id='redacted_thinking', content='', signature=item.data, provider_name=self.system)
@@ -518,6 +527,20 @@ def _add_builtin_tools(
518527
elif isinstance(tool, CodeExecutionTool): # pragma: no branch
519528
tools.append(BetaCodeExecutionTool20250522Param(name='code_execution', type='code_execution_20250522'))
520529
beta_features.append('code-execution-2025-05-22')
530+
elif isinstance(tool, WebFetchTool): # pragma: no branch
531+
citations = BetaCitationsConfigParam(enabled=tool.enable_citations) if tool.enable_citations else None
532+
tools.append(
533+
BetaWebFetchTool20250910Param(
534+
name='web_fetch',
535+
type='web_fetch_20250910',
536+
max_uses=tool.max_uses,
537+
allowed_domains=tool.allowed_domains,
538+
blocked_domains=tool.blocked_domains,
539+
citations=citations,
540+
max_content_tokens=tool.max_content_tokens,
541+
)
542+
)
543+
beta_features.append('web-fetch-2025-09-10')
521544
elif isinstance(tool, MemoryTool): # pragma: no branch
522545
if 'memory' not in model_request_parameters.tool_defs:
523546
raise UserError("Built-in `MemoryTool` requires a 'memory' tool to be defined.")
@@ -627,6 +650,7 @@ async def _map_message( # noqa: C901
627650
| BetaServerToolUseBlockParam
628651
| BetaWebSearchToolResultBlockParam
629652
| BetaCodeExecutionToolResultBlockParam
653+
| BetaWebFetchToolResultBlockParam
630654
| BetaThinkingBlockParam
631655
| BetaRedactedThinkingBlockParam
632656
| BetaMCPToolUseBlockParam
@@ -689,6 +713,14 @@ async def _map_message( # noqa: C901
689713
input=response_part.args_as_dict(),
690714
)
691715
assistant_content_params.append(server_tool_use_block_param)
716+
elif response_part.tool_name == WebFetchTool.kind:
717+
server_tool_use_block_param = BetaServerToolUseBlockParam(
718+
id=tool_use_id,
719+
type='server_tool_use',
720+
name='web_fetch',
721+
input=response_part.args_as_dict(),
722+
)
723+
assistant_content_params.append(server_tool_use_block_param)
692724
elif (
693725
response_part.tool_name.startswith(MCPServerTool.kind)
694726
and (server_id := response_part.tool_name.split(':', 1)[1])
@@ -735,6 +767,19 @@ async def _map_message( # noqa: C901
735767
),
736768
)
737769
)
770+
elif response_part.tool_name == WebFetchTool.kind and isinstance(
771+
response_part.content, dict
772+
):
773+
assistant_content_params.append(
774+
BetaWebFetchToolResultBlockParam(
775+
tool_use_id=tool_use_id,
776+
type='web_fetch_tool_result',
777+
content=cast(
778+
WebFetchToolResultBlockParamContent,
779+
response_part.content, # pyright: ignore[reportUnknownMemberType]
780+
),
781+
)
782+
)
738783
elif response_part.tool_name.startswith(MCPServerTool.kind) and isinstance(
739784
response_part.content, dict
740785
): # pragma: no branch
@@ -955,6 +1000,11 @@ async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]:
9551000
vendor_part_id=event.index,
9561001
part=_map_code_execution_tool_result_block(current_block, self.provider_name),
9571002
)
1003+
elif isinstance(current_block, BetaWebFetchToolResultBlock): # pragma: lax no cover
1004+
yield self._parts_manager.handle_part(
1005+
vendor_part_id=event.index,
1006+
part=_map_web_fetch_tool_result_block(current_block, self.provider_name),
1007+
)
9581008
elif isinstance(current_block, BetaMCPToolUseBlock):
9591009
call_part = _map_mcp_server_use_block(current_block, self.provider_name)
9601010
builtin_tool_calls[call_part.tool_call_id] = call_part
@@ -1061,7 +1111,14 @@ def _map_server_tool_use_block(item: BetaServerToolUseBlock, provider_name: str)
10611111
args=cast(dict[str, Any], item.input) or None,
10621112
tool_call_id=item.id,
10631113
)
1064-
elif item.name in ('web_fetch', 'bash_code_execution', 'text_editor_code_execution'): # pragma: no cover
1114+
elif item.name == 'web_fetch':
1115+
return BuiltinToolCallPart(
1116+
provider_name=provider_name,
1117+
tool_name=WebFetchTool.kind,
1118+
args=cast(dict[str, Any], item.input) or None,
1119+
tool_call_id=item.id,
1120+
)
1121+
elif item.name in ('bash_code_execution', 'text_editor_code_execution'): # pragma: no cover
10651122
raise NotImplementedError(f'Anthropic built-in tool {item.name!r} is not currently supported.')
10661123
else:
10671124
assert_never(item.name)
@@ -1097,6 +1154,16 @@ def _map_code_execution_tool_result_block(
10971154
)
10981155

10991156

1157+
def _map_web_fetch_tool_result_block(item: BetaWebFetchToolResultBlock, provider_name: str) -> BuiltinToolReturnPart:
1158+
return BuiltinToolReturnPart(
1159+
provider_name=provider_name,
1160+
tool_name=WebFetchTool.kind,
1161+
# Store just the content field (BetaWebFetchBlock) which has {content, type, url, retrieved_at}
1162+
content=item.content.model_dump(mode='json'),
1163+
tool_call_id=item.tool_use_id,
1164+
)
1165+
1166+
11001167
def _map_mcp_server_use_block(item: BetaMCPToolUseBlock, provider_name: str) -> BuiltinToolCallPart:
11011168
return BuiltinToolCallPart(
11021169
provider_name=provider_name,

 (0)