InferenceClient: allow passing a pydantic model as response_format
#2646
Labels: enhancement, response_format
(issue opened after discussion with @lhoestq)
In `InferenceClient.chat_completion`, one can pass a `response_format` which constrains the output format. It must be either a regex or a JSON schema. A usual use case is to have a dataclass or a Pydantic model and you want the LLM to generate an instance of that class. This can currently be done like this:
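A rough sketch of that manual workflow (`Recipe` and the model id are made-up examples; it assumes `response_format` accepts the TGI-style grammar dict `{"type": "json", "value": <schema>}`):

```python
from pydantic import BaseModel

from huggingface_hub import InferenceClient


class Recipe(BaseModel):
    # Hypothetical example schema for the structured output.
    title: str
    ingredients: list[str]


client = InferenceClient("meta-llama/Meta-Llama-3-8B-Instruct")
response = client.chat_completion(
    messages=[{"role": "user", "content": "Give me a cookie recipe as JSON."}],
    # The user must serialize the Pydantic model to a JSON schema by hand.
    response_format={"type": "json", "value": Recipe.model_json_schema()},
)
print(response.choices[0].message.content)
```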
It would be good to support passing the class directly:

```python
client.chat_completion(..., response_format=MyCustomModel)
```
and handle the serialization automatically before making the HTTP call. If we do so, `pydantic` shouldn't become a dependency; a duck-typed approach is sketched below.

Note: the same should be done for `client.text_generation(..., grammar=...)`.
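A possible shape for that duck-typed serialization (just a sketch; `_normalize_response_format` is a hypothetical helper, not an existing `huggingface_hub` function):

```python
def _normalize_response_format(response_format):
    """Convert a Pydantic model class into a grammar dict without importing pydantic."""
    # Duck-type on `model_json_schema`, the classmethod every pydantic v2
    # `BaseModel` subclass exposes. Avoids any `import pydantic`.
    if isinstance(response_format, type) and hasattr(response_format, "model_json_schema"):
        return {"type": "json", "value": response_format.model_json_schema()}
    # Regex / json-schema grammar dicts (or None) pass through unchanged.
    return response_format
```

The client would run this on the user-supplied value before building the HTTP payload; pydantic v1 classes (which expose `schema()` instead) would need an extra branch.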
Note: it seems that it's also possible to handle simple dataclasses with something like this (rough idea sketched below). Unsure if it's worth the hassle though. If we add that, we should not add a dependency, but simply copy the code + license into a submodule, given how tiny and unmaintained the code is.
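Very rough idea of what a vendored dataclass converter could look like (it only handles flat dataclasses with primitive fields; nesting, `Optional`, defaults, and string annotations are left out):

```python
import dataclasses

_PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def dataclass_json_schema(cls):
    """Build a minimal JSON schema for a flat dataclass with primitive fields."""
    if not dataclasses.is_dataclass(cls):
        raise TypeError(f"{cls!r} is not a dataclass")
    fields = dataclasses.fields(cls)
    return {
        "type": "object",
        # `field.type` is the annotation object itself as long as the module
        # doesn't use `from __future__ import annotations` (strings otherwise).
        "properties": {f.name: {"type": _PRIMITIVES[f.type]} for f in fields},
        "required": [f.name for f in fields],
    }


@dataclasses.dataclass
class Point:
    x: int
    y: int


# {'type': 'object', 'properties': {'x': {'type': 'integer'}, 'y': {'type': 'integer'}}, 'required': ['x', 'y']}
print(dataclass_json_schema(Point))
```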