Merged

Commits (25)
- 473623b support raw cot 1 (dsfaccini, Nov 25, 2025)
- ff4d45b include live gpt-oss streaming test and remove computer-use model (dsfaccini, Nov 25, 2025)
- 9c8007f simplify filter (dsfaccini, Nov 25, 2025)
- 88de1ed note about test flakiness (dsfaccini, Nov 25, 2025)
- 364b711 re-add computer use names (dsfaccini, Nov 25, 2025)
- 4afd4b7 handle raw cot in parts manager (dsfaccini, Nov 27, 2025)
- 50bc7fa Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini, Nov 27, 2025)
- ddd7df4 Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini, Nov 27, 2025)
- 65ae9a5 refactor parts manager (dsfaccini, Nov 27, 2025)
- d0c7d77 add defensive handling of potential summary after raw CoT (dsfaccini, Nov 28, 2025)
- 8d52d65 Clarify usage of agent factories (dsfaccini, Nov 28, 2025)
- 99812f8 migrate to callback (dsfaccini, Nov 29, 2025)
- 0a245b6 don't emit empty events (dsfaccini, Nov 30, 2025)
- 3128b4a Merge branch 'pydantic:main' into main (dsfaccini, Nov 30, 2025)
- dc7aa6a Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini, Dec 1, 2025)
- 3b6013f complex testcase (dsfaccini, Dec 1, 2025)
- f87896a improve docstring (dsfaccini, Dec 1, 2025)
- 2c3d767 narrow docstring (dsfaccini, Dec 1, 2025)
- e69a7c2 Clarify agent instantiation options in documentation (dsfaccini, Dec 2, 2025)
- bc2e31e address review points (dsfaccini, Dec 4, 2025)
- d4a6c8b Merge remote-tracking branch 'origin/main' into lm-studio-openai-resp… (dsfaccini, Dec 4, 2025)
- d5f6503 Merge upstream/main into lm-studio-openai-responses-with-gpt-oss (dsfaccini, Dec 4, 2025)
- c7d43bd Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini, Dec 4, 2025)
- 85636a8 chain callables or dict mergings (dsfaccini, Dec 4, 2025)
- 53579d0 Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini, Dec 4, 2025)
3 changes: 1 addition & 2 deletions docs/models/openai.md
@@ -132,11 +132,10 @@ The Responses API has built-in tools that you can use instead of building your o
- [Code interpreter](https://platform.openai.com/docs/guides/tools-code-interpreter): allow models to write and run Python code in a sandboxed environment before generating a response.
- [Image generation](https://platform.openai.com/docs/guides/tools-image-generation): allow models to generate images based on a text prompt.
- [File search](https://platform.openai.com/docs/guides/tools-file-search): allow models to search your files for relevant information before generating a response.
- [Computer use](https://platform.openai.com/docs/guides/tools-computer-use): allow models to use a computer to perform tasks on your behalf.

Web search, Code interpreter, and Image generation are natively supported through the [Built-in tools](../builtin-tools.md) feature.

File search and Computer use can be enabled by passing an [`openai.types.responses.FileSearchToolParam`](https://github.com/openai/openai-python/blob/main/src/openai/types/responses/file_search_tool_param.py) or [`openai.types.responses.ComputerToolParam`](https://github.com/openai/openai-python/blob/main/src/openai/types/responses/computer_tool_param.py) in the `openai_builtin_tools` setting on [`OpenAIResponsesModelSettings`][pydantic_ai.models.openai.OpenAIResponsesModelSettings]. They don't currently generate [`BuiltinToolCallPart`][pydantic_ai.messages.BuiltinToolCallPart] or [`BuiltinToolReturnPart`][pydantic_ai.messages.BuiltinToolReturnPart] parts in the message history, or streamed events; please submit an issue if you need native support for these built-in tools.
File search can be enabled by passing an [`openai.types.responses.FileSearchToolParam`](https://github.com/openai/openai-python/blob/main/src/openai/types/responses/file_search_tool_param.py) in the `openai_builtin_tools` setting on [`OpenAIResponsesModelSettings`][pydantic_ai.models.openai.OpenAIResponsesModelSettings]. It doesn't currently generate [`BuiltinToolCallPart`][pydantic_ai.messages.BuiltinToolCallPart] or [`BuiltinToolReturnPart`][pydantic_ai.messages.BuiltinToolReturnPart] parts in the message history, or streamed events; please submit an issue if you need native support for this built-in tool.
Collaborator:
Since https://platform.openai.com/docs/models/computer-use-preview and https://platform.openai.com/docs/guides/tools-computer-use don't say anything about the model being removed/deprecated, let's revert these changes.

The error you shared stated 'The model computer-use-preview does not exist or you do not have access to it.', so maybe the reason was lack of access? In any case, let's remove this from this PR.


```python {title="file_search_tool.py"}
from openai.types.responses import FileSearchToolParam
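# The diff view truncates the example at this point. A minimal sketch of how the
# remaining setup might look; the model name and vector store ID below are
# placeholders, not taken from the original docs:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model_settings = OpenAIResponsesModelSettings(
    openai_builtin_tools=[
        FileSearchToolParam(type='file_search', vector_store_ids=['vs_placeholder_id'])
    ],
)
agent = Agent(OpenAIResponsesModel('gpt-4o'), model_settings=model_settings)
result = agent.run_sync('What does the uploaded handbook say about onboarding?')
print(result.output)
```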
35 changes: 32 additions & 3 deletions pydantic_ai_slim/pydantic_ai/models/openai.py
@@ -137,6 +137,12 @@
by using that prefix like `x-openai-connector:<connector-id>` in a URL, you can pass a connector ID to a model.
"""

_RAW_COT_ID_SUFFIX = '-content-'
"""
Suffix used in ThinkingPart IDs to identify raw Chain of Thought from gpt-oss models.
Raw CoT IDs follow the pattern 'rs_123-content-0', 'rs_123-content-1', etc.
"""

_CHAT_FINISH_REASON_MAP: dict[
Literal['stop', 'length', 'tool_calls', 'content_filter', 'function_call'], FinishReason
] = {
@@ -1152,8 +1158,15 @@ def _process_response( # noqa: C901
provider_name=self.system,
)
)
# NOTE: We don't currently handle the raw CoT from gpt-oss `reasoning_text`: https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot
# If you need this, please file an issue.
# Handle raw CoT content from gpt-oss models (via LM Studio, vLLM, etc.)
if item.content:
Collaborator:
If there are both summaries and content (like in the third code snippet at https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot#responses-api), we currently create separate ThinkingParts for all of the items.

But that doc says:

Introducing a new content property on reasoning. This allows a reasoning summary that could be displayed to the end user to be returned at the same time as the raw CoT (which should not be shown to the end user, but which might be helpful for interpretability research).

Which means we should not show the raw values to the user, if there is also a summary.

So in that case, I think we should not create new ThinkingParts for the raw CoT, but store it on the first ThinkingPart (like we do with the signature) under provider_details['raw_content'] = [...] or something like that. Otherwise we'll require people building frontends to explicitly check whether a ThinkingPart is raw or not, which we can't expect them to do.

Theoretically, there can also be both content and encrypted_content (ThinkingPart.signature in our case), but I don't think we need to handle that.

Contributor Author:
provider_details['raw_content'] sounds great actually 👍
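
A rough sketch of the shape being agreed on here for the non-streaming path. This is not the merged code; it assumes `ThinkingPart` exposes a `provider_details` mapping and that the key is named `raw_content`, as proposed in this thread:

```python
# Sketch of the reviewer's suggestion for _process_response (not the merged code).
summary_parts = [ThinkingPart(content=s.text, id=item.id) for s in item.summary]
items.extend(summary_parts)

if item.content and summary_parts:
    # A summary exists, so the raw CoT should not be surfaced to end users:
    # stash it on the first summary part instead of emitting extra ThinkingParts.
    # provider_details is the field proposed in this thread, not guaranteed to exist yet.
    summary_parts[0].provider_details = {'raw_content': [c.text for c in item.content]}
```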

for idx, content_item in enumerate(item.content):
items.append(
ThinkingPart(
content=content_item.text,
id=f'{item.id}{_RAW_COT_ID_SUFFIX}{idx}',
Collaborator:
Instead of a magic ID, can we add a key to ThinkingPart.provider_details to indicate whether this was a raw CoT, so that we can check that to know if we should send it back as summary or not?

Contributor Author:
yeahp, that's a good idea

)
)
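
For the raw-only case, the flag-based alternative to the `-content-` ID suffix suggested above might look roughly like this. Again a sketch: the `provider_details` field on `ThinkingPart` and the `raw_cot` key are assumptions taken from this thread, not the merged implementation:

```python
# No summary: surface the raw CoT, but mark it explicitly so later code can
# decide whether to send it back, instead of parsing a magic ID suffix.
for content_item in item.content:
    items.append(
        ThinkingPart(
            content=content_item.text,
            id=item.id,
            provider_details={'raw_cot': True},  # assumed key name
        )
    )
```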
elif isinstance(item, responses.ResponseOutputMessage):
for content in item.content:
if isinstance(content, responses.ResponseOutputText): # pragma: no branch
@@ -1184,7 +1197,7 @@ def _process_response( # noqa: C901
items.append(file_part)
items.append(return_part)
elif isinstance(item, responses.ResponseComputerToolCall): # pragma: no cover
# Pydantic AI doesn't yet support the ComputerUse built-in tool
# OpenAI's `computer-use` model is no longer available
pass
elif isinstance(item, responses.ResponseCustomToolCall): # pragma: no cover
# Support is being implemented in https://github.com/pydantic/pydantic-ai/pull/2572
@@ -1668,6 +1681,11 @@ async def _map_messages( # noqa: C901
# If `send_item_ids` is false, we won't send the `BuiltinToolReturnPart`, but OpenAI does not have a type for files from the assistant.
pass
elif isinstance(item, ThinkingPart):
# we don't send back raw CoT
# https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot
if _RAW_COT_ID_SUFFIX in (item.id or ''):
Collaborator:
Hmm, to get the best performance from the model, I think we should try to send back the reasoning text, otherwise the API won't have the full context on why it did the things it did, to use it in next steps.

So I think we should update the code below to work for all combinations of signature/summary/content, and what I wrote above about storing the raw CoT on the initial ThinkingPart in case there's also a summary.

I think we can change `if item.id and send_item_ids:` to `if item.id and (send_item_ids or raw_content):` (once we have the `raw_content` variable), because the `if item.id and send_item_ids` check exists only for OpenAI Responses and we know they will never send raw content. (I can explain exactly why that check is there in a call if you like.)

continue

if item.id and send_item_ids:
signature: str | None = None
if (
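
A sketch of the condition change proposed in the comment above for `_map_messages`, using the wire shape visible in the recorded cassette below. The `raw_content` key, the `provider_details` field, and the `openai_messages` list name are assumptions, not the merged code:

```python
# Sketch only: send raw CoT back so the model keeps its own reasoning context.
raw_content = (item.provider_details or {}).get('raw_content')  # assumed key
if item.id and (send_item_ids or raw_content):
    reasoning_item = {'type': 'reasoning', 'id': item.id, 'summary': []}
    if item.content:
        reasoning_item['summary'] = [{'type': 'summary_text', 'text': item.content}]
    if raw_content:
        # Mirrors the `content` field the Responses API uses for reasoning_text items.
        reasoning_item['content'] = [{'type': 'reasoning_text', 'text': t} for t in raw_content]
    openai_messages.append(reasoning_item)  # list built by _map_messages (name assumed)
```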
@@ -2138,6 +2156,17 @@ async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]:
id=chunk.item_id,
)

elif isinstance(chunk, responses.ResponseReasoningTextDeltaEvent):
raw_cot_id = f'{chunk.item_id}{_RAW_COT_ID_SUFFIX}{chunk.content_index}'
yield self._parts_manager.handle_thinking_delta(
Collaborator:
See above, we should store the raw-ness on the thinking part.

The result of streaming should match non-streaming, so we should make sure we test both.

What I don't yet know how to handle correctly is the case where there are both raw content and summaries, and the raw content is streamed first. In the non-streaming code above, I suggested creating parts for the summaries, and then storing the raw stuff on the first part's details, but that won't work if the summaries come after the raw content because we'll already have created the parts with the raw text...

Maybe it'll be easier if instead of creating multiple ThinkingParts, we create just one with all the summaries/content joined by \n\n (prepended to the delta if chunk.{content,summary}_index is higher than the one we saw most recently?). Then if we've created one already with the raw text, and we receive a summary, we can move the raw text to the field on provider_details, and store the summary as the content instead. That would require looking up the part for chunk.item_id in _parts_manager to see what it already has.

Happy to discuss more on Slack/call

vendor_part_id=raw_cot_id,
content=chunk.delta,
id=raw_cot_id,
)

elif isinstance(chunk, responses.ResponseReasoningTextDoneEvent):
pass # content already accumulated via delta events

elif isinstance(chunk, responses.ResponseOutputTextAnnotationAddedEvent):
# TODO(Marcelo): We should support annotations in the future.
pass # there's nothing we need to do here
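
The streaming side is the least settled part of the review thread above. One rough way the merge-on-summary idea could look inside `_get_event_iterator`, where the `get_part` lookup, the `provider_details` field, and its keys are all hypothetical (the real parts manager may not expose them):

```python
elif isinstance(chunk, responses.ResponseReasoningSummaryTextDeltaEvent):
    # Hypothetical lookup: see what we have already streamed for this reasoning item.
    part = self._parts_manager.get_part(vendor_part_id=chunk.item_id)  # assumed helper
    if isinstance(part, ThinkingPart) and part.provider_details is None:
        # Raw CoT arrived first: demote it to provider_details and let the
        # summary become the user-visible content, matching the non-streaming path.
        part.provider_details = {'raw_content': [part.content]}
        part.content = ''
    yield self._parts_manager.handle_thinking_delta(
        vendor_part_id=chunk.item_id,
        content=chunk.delta,
        id=chunk.item_id,
    )
```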
@@ -0,0 +1,153 @@
interactions:
- request:
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '95'
content-type:
- application/json
host:
- openrouter.ai
method: POST
parsed_body:
input:
- content: What is 2+2?
role: user
model: openai/gpt-oss-20b
stream: true
uri: https://openrouter.ai/api/v1/responses
response:
body:
string: |+
: OPENROUTER PROCESSING

data: {"type":"response.created","response":{"object":"response","id":"gen-1764104064-iUq1yl7qGCPjLRwOM13E","created_at":1764104064,"status":"in_progress","error":null,"output_text":"","output":[],"model":"openai/gpt-oss-20b","incomplete_details":null,"max_tool_calls":null,"tools":[],"tool_choice":"auto","parallel_tool_calls":true,"max_output_tokens":null,"temperature":null,"top_p":null,"metadata":{},"background":false,"previous_response_id":null,"service_tier":"auto","truncation":null,"store":false,"instructions":null,"reasoning":null,"safety_identifier":null,"prompt_cache_key":null,"user":null},"sequence_number":0}

data: {"type":"response.in_progress","response":{"object":"response","id":"gen-1764104064-iUq1yl7qGCPjLRwOM13E","created_at":1764104064,"status":"in_progress","error":null,"output_text":"","output":[],"model":"openai/gpt-oss-20b","incomplete_details":null,"max_tool_calls":null,"tools":[],"tool_choice":"auto","parallel_tool_calls":true,"max_output_tokens":null,"temperature":null,"top_p":null,"metadata":{},"background":false,"previous_response_id":null,"service_tier":"auto","truncation":null,"store":false,"instructions":null,"reasoning":null,"safety_identifier":null,"prompt_cache_key":null,"user":null},"sequence_number":1}

data: {"type":"response.output_item.added","output_index":0,"item":{"type":"reasoning","id":"rs_tmp_v8dbq4u1wu","summary":[]},"sequence_number":2}

data: {"type":"response.content_part.added","item_id":"rs_tmp_v8dbq4u1wu","output_index":0,"content_index":0,"part":{"type":"reasoning_text","text":""},"sequence_number":3}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":"The","sequence_number":4}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" user","sequence_number":5}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" asks","sequence_number":6}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" a","sequence_number":7}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" simple","sequence_number":8}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" question","sequence_number":9}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":":","sequence_number":10}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" What","sequence_number":11}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" is","sequence_number":12}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" ","sequence_number":13}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":"2","sequence_number":14}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":"+","sequence_number":15}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":"2","sequence_number":16}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":"?","sequence_number":17}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" The","sequence_number":18}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" answer","sequence_number":19}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" is","sequence_number":20}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" ","sequence_number":21}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":"4","sequence_number":22}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":".","sequence_number":23}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" Should","sequence_number":24}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" respond","sequence_number":25}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" straightforward","sequence_number":26}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":"ly","sequence_number":27}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":".","sequence_number":28}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" Maybe","sequence_number":29}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":" friendly","sequence_number":30}

data: {"type":"response.reasoning_text.delta","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"delta":".","sequence_number":31}

data: {"type":"response.output_item.added","output_index":1,"item":{"type":"message","status":"in_progress","content":[],"id":"msg_tmp_500nhmzytzo","role":"assistant"},"sequence_number":32}

data: {"type":"response.content_part.added","item_id":"msg_tmp_500nhmzytzo","output_index":1,"content_index":0,"part":{"type":"output_text","annotations":[],"text":""},"sequence_number":33}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":"2","sequence_number":34}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":" ","sequence_number":35}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":"+","sequence_number":36}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":" ","sequence_number":37}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":"2","sequence_number":38}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":" =","sequence_number":39}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":" ","sequence_number":40}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":"4","sequence_number":41}

data: {"type":"response.output_text.delta","logprobs":[],"output_index":1,"item_id":"msg_tmp_500nhmzytzo","content_index":0,"delta":".","sequence_number":42}

data: {"type":"response.output_text.done","item_id":"msg_tmp_500nhmzytzo","output_index":1,"content_index":0,"text":"2 + 2 = 4.","logprobs":[],"sequence_number":43}

data: {"type":"response.content_part.done","item_id":"msg_tmp_500nhmzytzo","output_index":1,"content_index":0,"part":{"type":"output_text","annotations":[],"text":"2 + 2 = 4."},"sequence_number":44}

data: {"type":"response.output_item.done","output_index":1,"item":{"type":"message","status":"completed","content":[{"type":"output_text","text":"2 + 2 = 4.","annotations":[]}],"id":"msg_tmp_500nhmzytzo","role":"assistant"},"sequence_number":45}

data: {"type":"response.reasoning_text.done","output_index":0,"item_id":"rs_tmp_v8dbq4u1wu","content_index":0,"text":"The user asks a simple question: What is 2+2? The answer is 4. Should respond straightforwardly. Maybe friendly.","sequence_number":46}

data: {"type":"response.content_part.done","item_id":"rs_tmp_v8dbq4u1wu","output_index":0,"content_index":0,"part":{"type":"reasoning_text","text":"The user asks a simple question: What is 2+2? The answer is 4. Should respond straightforwardly. Maybe friendly."},"sequence_number":47}

data: {"type":"response.output_item.done","output_index":0,"item":{"type":"reasoning","id":"rs_tmp_v8dbq4u1wu","summary":[],"content":[{"type":"reasoning_text","text":"The user asks a simple question: What is 2+2? The answer is 4. Should respond straightforwardly. Maybe friendly."}]},"sequence_number":48}

data: {"type":"response.completed","response":{"object":"response","id":"gen-1764104064-iUq1yl7qGCPjLRwOM13E","created_at":1764104064,"model":"openai/gpt-oss-20b","status":"completed","output":[{"type":"reasoning","id":"rs_tmp_v9kn4wuqksb","summary":[],"content":[{"type":"reasoning_text","text":"The user asks a simple question: What is 2+2? The answer is 4. Should respond straightforwardly. Maybe friendly."}]},{"type":"message","status":"completed","content":[{"type":"output_text","text":"2 + 2 = 4.","annotations":[]}],"id":"msg_tmp_500nhmzytzo","role":"assistant"}],"output_text":"","error":null,"incomplete_details":null,"usage":{"input_tokens":76,"input_tokens_details":{"cached_tokens":64},"output_tokens":47,"output_tokens_details":{"reasoning_tokens":28},"total_tokens":123,"cost":0.0000132,"is_byok":false,"cost_details":{"upstream_inference_cost":null,"upstream_inference_input_cost":0.0000038,"upstream_inference_output_cost":0.0000094}},"max_tool_calls":null,"tools":[],"tool_choice":"auto","parallel_tool_calls":true,"max_output_tokens":null,"temperature":null,"top_p":null,"metadata":{},"background":false,"previous_response_id":null,"service_tier":"auto","truncation":null,"store":false,"instructions":null,"reasoning":null,"safety_identifier":null,"prompt_cache_key":null,"user":null},"sequence_number":49}

data: [DONE]

headers:
access-control-allow-origin:
- '*'
cache-control:
- no-cache
connection:
- keep-alive
content-type:
- text/event-stream
permissions-policy:
- payment=(self "https://checkout.stripe.com" "https://connect-js.stripe.com" "https://js.stripe.com" "https://*.js.stripe.com"
"https://hooks.stripe.com")
referrer-policy:
- no-referrer, strict-origin-when-cross-origin
transfer-encoding:
- chunked
vary:
- Accept-Encoding
status:
code: 200
message: OK
version: 1
...