InferenceClient: allow passing a pydantic model as response_format #2646

Open
Wauplin opened this issue Oct 31, 2024 · 2 comments · May be fixed by #2647
Labels
enhancement New feature or request

Comments

@Wauplin
Contributor

Wauplin commented Oct 31, 2024

(issue opened after discussion with @lhoestq)

In InferenceClient.chat_completion, one can pass a response_format that constrains the output format. It must be either a regex or a JSON schema. A common use case is having a dataclass or a Pydantic model and wanting the LLM to generate an instance of that class. This can currently be done like this:

client.chat_completion(..., response_format={"type": "json", "value": MyCustomModel.schema()})

It would be good to either:

  1. document this particular use case for convenience
  2. or even allow passing client.chat_completion(..., response_format=MyCustomModel) and handle the serialization automatically before making the HTTP call. If we do so, pydantic shouldn't become a required dependency.

Note: the same should be done for client.text_generation(..., grammar=...).


Note: it seems it's also possible to handle simple dataclasses with something like this. Unsure if it's worth the hassle though. If we add that, we should not add a dependency, but simply copy the code + license into a submodule, given how tiny and unmaintained the code is.
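For illustration only (not the linked code): a flat dataclass can be mapped to a JSON schema by hand with just the stdlib. `dataclass_to_json_schema` and its type table are hypothetical and only cover scalar fields and lists of scalars:

```python
import typing
from dataclasses import dataclass, fields

# Minimal mapping from Python scalar types to JSON-schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def dataclass_to_json_schema(cls):
    """Illustrative sketch: build a JSON schema from a flat dataclass."""
    properties = {}
    for f in fields(cls):
        if typing.get_origin(f.type) is list:
            # list[X] -> JSON array with item type X
            (item_type,) = typing.get_args(f.type)
            properties[f.name] = {
                "type": "array",
                "items": {"type": _JSON_TYPES[item_type]},
            }
        else:
            properties[f.name] = {"type": _JSON_TYPES[f.type]}
    return {
        "type": "object",
        "properties": properties,
        "required": [f.name for f in fields(cls)],
    }

@dataclass
class CalendarEvent:
    name: str
    participants: list[str]
```

Nested dataclasses, Optional fields, and defaults would all need extra handling, which is presumably what the linked code covers.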

@Wauplin Wauplin added the enhancement New feature or request label Oct 31, 2024
@hanouticelina
Contributor

Adding to that, I think it's common to pass Pydantic objects to structure outputs; for example, the OpenAI client supports this too (passing a Pydantic model to constrain the output format). Example from their documentation:

from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed

@Wauplin
Contributor Author

Wauplin commented Oct 31, 2024

Cool to know about the .parsed field!
