Support raw CoT reasoning from LM Studio and other OpenAI Responses-compatible APIs #3559
Merged: DouweM merged 25 commits into pydantic:main from dsfaccini:lm-studio-openai-responses-with-gpt-oss on Dec 5, 2025
Changes from 8 commits
Commits
473623b support rae cot 1 (dsfaccini)
ff4d45b include live gpt-oss streaming test and remove computer-use model (dsfaccini)
9c8007f simplify filter (dsfaccini)
88de1ed note about test flakiness (dsfaccini)
364b711 re-add computer use names (dsfaccini)
4afd4b7 handle raw cot in parts manager (dsfaccini)
50bc7fa Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini)
ddd7df4 Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini)
65ae9a5 refactor parts manager (dsfaccini)
d0c7d77 add defensive handling of potential summary after rawCoT (dsfaccini)
8d52d65 Clarify usage of agent factories (dsfaccini)
99812f8 migrate to callback (dsfaccini)
0a245b6 dont emit empty events (dsfaccini)
3128b4a Merge branch 'pydantic:main' into main (dsfaccini)
dc7aa6a Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini)
3b6013f complex testcase (dsfaccini)
f87896a improvde dostring (dsfaccini)
2c3d767 narrow docstring (dsfaccini)
e69a7c2 Clarify agent instantiation options in documentation (dsfaccini)
bc2e31e address review points (dsfaccini)
d4a6c8b Merge remote-tracking branch 'origin/main' into lm-studio-openai-resp… (dsfaccini)
d5f6503 Merge upstream/main into lm-studio-openai-responses-with-gpt-oss (dsfaccini)
c7d43bd Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini)
85636a8 chain callables or dict mergings (dsfaccini)
53579d0 Merge branch 'main' into lm-studio-openai-responses-with-gpt-oss (dsfaccini)
@@ -13,7 +13,7 @@

 from __future__ import annotations as _annotations

-from collections.abc import Hashable
+from collections.abc import Hashable, Iterator
 from dataclasses import dataclass, field, replace
 from typing import Any
@@ -76,7 +76,7 @@ def handle_text_delta(
         provider_details: dict[str, Any] | None = None,
         thinking_tags: tuple[str, str] | None = None,
         ignore_leading_whitespace: bool = False,
-    ) -> ModelResponseStreamEvent | None:
+    ) -> Iterator[ModelResponseStreamEvent]:
         """Handle incoming text content, creating or updating a TextPart in the manager as appropriate.

         When `vendor_part_id` is None, the latest part is updated if it exists and is a TextPart;
@@ -93,10 +93,9 @@ def handle_text_delta(
             thinking_tags: If provided, will handle content between the thinking tags as thinking parts.
             ignore_leading_whitespace: If True, will ignore leading whitespace in the content.

-        Returns:
-            - A `PartStartEvent` if a new part was created.
-            - A `PartDeltaEvent` if an existing part was updated.
-            - `None` if no new event is emitted (e.g., the first text part was all whitespace).
+        Yields:
+            A `PartStartEvent` if a new part was created, or a `PartDeltaEvent` if an existing part was updated.
+            Yields nothing if no event should be emitted (e.g., the first text part was all whitespace).

         Raises:
             UnexpectedModelBehavior: If attempting to apply text content to a part that is not a TextPart.
@@ -121,11 +120,12 @@ def handle_text_delta(
                     if content == thinking_tags[1]:
                         # When we see the thinking end tag, we're done with the thinking part and the next text delta will need a new part
                         self._vendor_id_to_part_index.pop(vendor_part_id)
-                        return None
+                        return
                     else:
-                        return self.handle_thinking_delta(
+                        yield from self.handle_thinking_delta(
                             vendor_part_id=vendor_part_id, content=content, provider_details=provider_details
                         )
+                        return
                 elif isinstance(existing_part, TextPart):
                     existing_text_part_and_index = existing_part, part_index
                 else:
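
The hunks above change `handle_text_delta` from returning `ModelResponseStreamEvent | None` to being a generator, so callers iterate over the result instead of checking for `None`. A minimal sketch of that pattern, assuming hypothetical stand-in names (`Event`, `handle_delta_old`, `handle_delta_new`, `delegate`) that are not part of the library:

```python
from __future__ import annotations

from collections.abc import Iterator
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in for a stream event; not the library's class."""

    text: str


def handle_delta_old(content: str) -> Event | None:
    # Old style: the caller has to check the result for None.
    if not content or content.isspace():
        return None
    return Event(content)


def handle_delta_new(content: str) -> Iterator[Event]:
    # New style: a generator that yields zero or one event.
    if not content or content.isspace():
        return  # a bare return ends the generator without yielding
    yield Event(content)


def delegate(content: str) -> Iterator[Event]:
    # `yield from` forwards everything the inner generator yields; the
    # explicit `return` keeps any later code in this function from running.
    yield from handle_delta_new(content)
    return


chunks = ['hi', ' ', 'there']
old_events = [e for c in chunks if (e := handle_delta_old(c)) is not None]
new_events = [e for c in chunks for e in handle_delta_new(c)]
```

One consequence of the generator form worth keeping in mind: the function body runs only when the result is iterated, so a caller that drops the returned iterator also skips any state updates inside it.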
@@ -134,29 +134,54 @@ def handle_text_delta(
         if thinking_tags and content == thinking_tags[0]:
             # When we see a thinking start tag (which is a single token), we'll build a new thinking part instead
             self._vendor_id_to_part_index.pop(vendor_part_id, None)
-            return self.handle_thinking_delta(
+            yield from self.handle_thinking_delta(
                 vendor_part_id=vendor_part_id, content='', provider_details=provider_details
             )
+            return

         if existing_text_part_and_index is None:
             # This is a workaround for models that emit `<think>\n</think>\n\n` or an empty text part ahead of tool calls (e.g. Ollama + Qwen3),
             # which we don't want to end up treating as a final result when using `run_stream` with `str` a valid `output_type`.
             if ignore_leading_whitespace and (len(content) == 0 or content.isspace()):
-                return None
+                return

             # There is no existing text part that should be updated, so create a new one
             new_part_index = len(self._parts)
             part = TextPart(content=content, id=id, provider_details=provider_details)
             if vendor_part_id is not None:
                 self._vendor_id_to_part_index[vendor_part_id] = new_part_index
             self._parts.append(part)
-            return PartStartEvent(index=new_part_index, part=part)
+            yield PartStartEvent(index=new_part_index, part=part)
         else:
             # Update the existing TextPart with the new content delta
             existing_text_part, part_index = existing_text_part_and_index
             part_delta = TextPartDelta(content_delta=content, provider_details=provider_details)
             self._parts[part_index] = part_delta.apply(existing_text_part)
-            return PartDeltaEvent(index=part_index, delta=part_delta)
+            yield PartDeltaEvent(index=part_index, delta=part_delta)

+    def _update_raw_content(
+        self,
+        part: ThinkingPart,
+        part_index: int,
+        raw_content_delta: str,
+        raw_content_index: int,
+    ) -> ThinkingPart:
+        """Update raw_content in provider_details and return updated part."""
+        existing_details = dict(part.provider_details or {})
+        raw_content_list = list(existing_details.get('raw_content', []))
+        while len(raw_content_list) <= raw_content_index:
+            raw_content_list.append('')
+        raw_content_list[raw_content_index] += raw_content_delta
+        existing_details['raw_content'] = raw_content_list
+        updated = ThinkingPart(
+            content=part.content,
+            id=part.id,
+            signature=part.signature,
+            provider_name=part.provider_name,
+            provider_details=existing_details,
+        )
+        self._parts[part_index] = updated
+        return updated
+
     def handle_thinking_delta(
         self,
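
The `_update_raw_content` helper added in this hunk pads the `raw_content` list with empty strings until `raw_content_index` is a valid position, then appends the delta at that index. A minimal standalone sketch of the same bookkeeping on a plain dict, assuming a hypothetical function name (`append_raw_content`) and leaving out the `ThinkingPart` rebuild:

```python
from __future__ import annotations

from typing import Any


def append_raw_content(
    provider_details: dict[str, Any] | None, delta: str, index: int
) -> dict[str, Any]:
    """Return a copy of provider_details with delta appended at raw_content[index]."""
    details = dict(provider_details or {})
    raw = list(details.get('raw_content', []))
    # Pad with empty strings so that `index` is a valid position.
    while len(raw) <= index:
        raw.append('')
    raw[index] += delta
    details['raw_content'] = raw
    return details


d = append_raw_content(None, 'Let me ', 0)
d = append_raw_content(d, 'think.', 0)
d = append_raw_content(d, 'Second block.', 2)
print(d['raw_content'])  # ['Let me think.', '', 'Second block.']
```

Copying the dict and the list before modifying them means the details object passed in is left untouched, which matches the way the helper builds a new `ThinkingPart` instead of mutating the existing one.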
@@ -167,7 +192,9 @@ def handle_thinking_delta(
         signature: str | None = None,
         provider_name: str | None = None,
         provider_details: dict[str, Any] | None = None,
-    ) -> ModelResponseStreamEvent:
+        raw_content_delta: str | None = None,
+        raw_content_index: int = 0,
+    ) -> Iterator[ModelResponseStreamEvent]:
         """Handle incoming thinking content, creating or updating a ThinkingPart in the manager as appropriate.

         When `vendor_part_id` is None, the latest part is updated if it exists and is a ThinkingPart;
@@ -183,9 +210,13 @@ def handle_thinking_delta(
             signature: An optional signature for the thinking content.
             provider_name: An optional provider name for the thinking part.
             provider_details: An optional dictionary of provider-specific details for the thinking part.
+            raw_content_delta: Raw chain-of-thought content delta (stored in provider_details['raw_content'],
+                not shown to users).
+            raw_content_index: Index into the raw_content list for the current delta (default 0).

-        Returns:
+        Yields:
             A `PartStartEvent` if a new part was created, or a `PartDeltaEvent` if an existing part was updated.
+            Yields nothing if only raw_content was updated (raw content updates don't emit visible events).

         Raises:
             UnexpectedModelBehavior: If attempting to apply a thinking delta to a part that is not a ThinkingPart.
@@ -209,35 +240,47 @@ def handle_thinking_delta(
                 existing_thinking_part_and_index = existing_part, part_index

         if existing_thinking_part_and_index is None:
-            if content is not None or signature is not None:
+            if content is not None or signature is not None or raw_content_delta is not None:
                 # There is no existing thinking part that should be updated, so create a new one
                 new_part_index = len(self._parts)
+                new_provider_details = dict(provider_details) if provider_details else {}
+                if raw_content_delta is not None:
+                    raw_content_list: list[str] = [''] * (raw_content_index + 1)
+                    raw_content_list[raw_content_index] = raw_content_delta
+                    new_provider_details['raw_content'] = raw_content_list
                 part = ThinkingPart(
                     content=content or '',
                     id=id,
                     signature=signature,
                     provider_name=provider_name,
-                    provider_details=provider_details,
+                    provider_details=new_provider_details or None,
                 )
                 if vendor_part_id is not None:  # pragma: no branch
                     self._vendor_id_to_part_index[vendor_part_id] = new_part_index
                 self._parts.append(part)
-                return PartStartEvent(index=new_part_index, part=part)
+                yield PartStartEvent(index=new_part_index, part=part)
             else:
-                raise UnexpectedModelBehavior('Cannot create a ThinkingPart with no content or signature')
+                raise UnexpectedModelBehavior('Cannot create a ThinkingPart with no content, signature, or raw_content')
         else:
+            existing_thinking_part, part_index = existing_thinking_part_and_index
+
+            if raw_content_delta is not None:
+                existing_thinking_part = self._update_raw_content(
+                    existing_thinking_part, part_index, raw_content_delta, raw_content_index
+                )
+                if content is None and signature is None:
+                    return
+
             if content is not None or signature is not None:
                 # Update the existing ThinkingPart with the new content and/or signature delta
-                existing_thinking_part, part_index = existing_thinking_part_and_index
                 part_delta = ThinkingPartDelta(
                     content_delta=content,
                     signature_delta=signature,
                     provider_name=provider_name,
                     provider_details=provider_details,
                 )
                 self._parts[part_index] = part_delta.apply(existing_thinking_part)
-                return PartDeltaEvent(index=part_index, delta=part_delta)
-            else:
+                yield PartDeltaEvent(index=part_index, delta=part_delta)
+            elif raw_content_delta is None:  # pragma: no branch
                 raise UnexpectedModelBehavior('Cannot update a ThinkingPart with no content or signature')

     def handle_tool_call_delta(
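
The control flow in the hunk above can be summarized as: a raw chain-of-thought delta may create a thinking part (which emits a start event), but a raw-only update to an existing part changes state without emitting anything. A simplified sketch of that behavior, assuming hypothetical toy classes (`ToyThinkingPart`, `ToyManager`), string events in place of the real event types, a single raw-content slot, and no signature handling:

```python
from __future__ import annotations

from collections.abc import Iterator
from dataclasses import dataclass, field


@dataclass
class ToyThinkingPart:
    """Hypothetical stand-in for a thinking part; not the library's class."""

    content: str = ''
    raw_content: list[str] = field(default_factory=list)


@dataclass
class ToyManager:
    parts: list[ToyThinkingPart] = field(default_factory=list)

    def handle_thinking_delta(
        self, content: str | None = None, raw_content_delta: str | None = None
    ) -> Iterator[str]:
        if not self.parts:
            if content is None and raw_content_delta is None:
                raise ValueError('Cannot create a part with no content or raw content')
            part = ToyThinkingPart(content=content or '')
            if raw_content_delta is not None:
                part.raw_content.append(raw_content_delta)
            self.parts.append(part)
            yield 'start'
            return

        part = self.parts[-1]
        if raw_content_delta is not None:
            if not part.raw_content:
                part.raw_content.append('')
            part.raw_content[0] += raw_content_delta
            if content is None:
                return  # raw-only update: state changes, no event is emitted

        if content is not None:
            part.content += content
            yield 'delta'
        elif raw_content_delta is None:
            raise ValueError('Cannot update a part with no content or raw content')


manager = ToyManager()
events: list[str] = []
events += manager.handle_thinking_delta(raw_content_delta='Let me ')  # creates the part
events += manager.handle_thinking_delta(raw_content_delta='think.')  # raw-only update
events += manager.handle_thinking_delta(content='Summary')  # visible update
print(events)  # ['start', 'delta']
print(manager.parts[0].raw_content)  # ['Let me think.']
```

The second call yields nothing even though it changed the stored raw content, which is the behavior the updated docstring describes as "raw content updates don't emit visible events".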