Reasoning models and structured output (e.g. QwQ-32B and JSON) #15644
Replies: 2 comments 2 replies
What I would have expected with reasoning parser enabled, with structured output
For the first example, I was hoping I'd see output that was something like this:
Reasoning parser enabled, without structured output
If I run with reasoning enabled using:
And I comment out the
The problem with this result is the lack of structure in the response.
Reasoning parser disabled, with structured output
If I run without reasoning enabled using
Then I get the result in the content section, as I'd expect:
But in this scenario, I imagine all the reasoning of the model has been effectively turned off.
Reasoning parser disabled, without structured output
If I run without reasoning enabled using
And I comment out the
This is great, but clearly fails to output only either "London" or "Paris".
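When the free-form answer is good but not constrained, one client-side workaround is to pull the allowed value out of the text after the fact. The following is only a heuristic sketch, not a substitute for guided decoding: `extract_choice` is a hypothetical helper, the sample sentence is made up for illustration, and the rule "take the choice mentioned last" is an assumption about where the final answer tends to appear.

```python
import re
from typing import List, Optional


def extract_choice(text: str, choices: List[str]) -> Optional[str]:
    """Return the allowed choice that is mentioned last in `text`.

    Returns None when none of the choices appears as a whole word.
    """
    best, best_pos = None, -1
    for choice in choices:
        # Whole-word match so "Paris" does not match inside "Parisian".
        for match in re.finditer(rf"\b{re.escape(choice)}\b", text):
            if match.start() > best_pos:
                best, best_pos = choice, match.start()
    return best


# Illustrative free-form answer (not real model output).
answer = "London is the capital of England, so the answer is Paris."
print(extract_choice(answer, ["London", "Paris"]))
```

If the function returns None, the caller knows the response did not contain any allowed value and can retry or fail loudly instead of passing unstructured text downstream.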
Oh, it's a V0-only feature! See this discussion. Running
Now gives the output
I'm trying to understand how to run reasoning models (e.g. QwQ-32B) and also use structured output (specifically, I'm interested in JSON).
I have tried the examples in the documentation, and I'm very surprised by the output. For the first three examples, I see output via reasoning_content and not on content. The fourth example I cannot get to run.
Unfortunately there is nothing on the website to say what the expected result is. Is this expected? I would have thought that you'd want the reasoning content (between the <think> and </think> tags) to be completely free-form and to enforce the structure only on the content. But that's not at all what I'm seeing. Do I misunderstand something?
Thanks in advance.
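To make the expectation above concrete, here is a minimal sketch of the split I had in mind: everything up to the closing </think> tag is treated as free-form reasoning, and only the remainder is required to be valid JSON. This is plain Python operating on a raw string, not vLLM's reasoning parser; `split_reasoning`, `parse_structured`, and the sample text are hypothetical names and data used purely for illustration.

```python
import json
from typing import Tuple

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def split_reasoning(raw: str) -> Tuple[str, str]:
    """Split raw model text into (reasoning, content) at the closing think tag."""
    head, sep, tail = raw.partition(THINK_CLOSE)
    if not sep:
        # No closing tag found: assume the whole text is content.
        return "", raw.strip()
    # The opening tag may be absent (some chat templates emit it in the prompt).
    reasoning = head.replace(THINK_OPEN, "", 1).strip()
    return reasoning, tail.strip()


def parse_structured(raw: str) -> dict:
    """Keep reasoning free-form; require only the content to be valid JSON."""
    reasoning, content = split_reasoning(raw)
    return {"reasoning_content": reasoning, "content": json.loads(content)}


# Illustrative raw output (not real model output).
raw = '<think>\nParis is in France.\n</think>\n{"city": "Paris"}'
result = parse_structured(raw)
print(result["reasoning_content"])
print(result["content"])
```

Here `json.loads` raises `json.JSONDecodeError` if the part after the closing tag is not valid JSON, which is the behaviour I would want from a structure constraint that applies only to the content.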
My output for the first example was:
My output for the second example was:
For the first two examples, as per the documentation, I ran
To get the third example to work, I had to run
The code for the first example is:
EDIT:
I'm using vllm 0.8.2