@rahulmansharamani14 rahulmansharamani14 commented Sep 28, 2025

This draft PR begins addressing #32197 (Support structured output with Pydantic models in langchain-huggingface).

Currently, ChatHuggingFace.with_structured_output converts Pydantic schemas to JSON Schema and parses the model output into dicts. Because PydanticOutputParser is never used, callers receive dicts rather than Pydantic instances, and retries consume extra tokens.

What’s included

  • Detects Pydantic model classes (is_basemodel_subclass).
  • Switches to PydanticOutputParser when schema is Pydantic and method="json_schema".
  • Preserves existing behavior for JSON Schema dicts, TypedDicts, and json_mode.
  • function_calling with Pydantic continues to raise NotImplementedError.

Next steps (planned)

  • Add unit tests under libs/partners/huggingface/tests/unit_tests/ for:
    • Round-trip returning Pydantic instances.
    • include_raw=True path.
    • No extra model calls on parse retry.
    • Non-Pydantic schemas unchanged.
  • Un-xfail Pydantic-related integration tests once validated.

Notes

Opening early for review to validate direction before completing test coverage and unskipping integration tests.

@github-actions github-actions bot added the "integration" label (Related to a provider partner package integration) on Sep 28, 2025
@rahulmansharamani14 rahulmansharamani14 changed the title from "WIP: fix(huggingface): add Pydantic structured output support" to "WIP: fix(huggingface): add Pydantic structured output support (Issue #32197)" on Sep 28, 2025
@github-actions github-actions bot added and then removed the "integration" label (Related to a provider partner package integration) on Sep 28, 2025
codspeed-hq bot commented Sep 28, 2025

CodSpeed Instrumentation Performance Report

Merging #33141 will create unknown performance changes

Comparing rahulmansharamani14:fix/hf-pydantic-structured-output (585b25d) with master (9863023) [1]

Summary

⚠️ No benchmarks were detected in both the base of the PR and the PR.
Please ensure that your benchmarks are correctly instrumented with CodSpeed.

Check out the benchmarks creation guide
⏩ 21 skipped [2]

Footnotes

  1. No successful run was found on master (54ea620) during the generation of this report, so 9863023 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@rahulmansharamani14 (Author) commented

@mdrxy @ccurme, please advise!

@ccurme ccurme (Collaborator) left a comment

Hi @rahulmansharamani14, the plan sounds good to me although it doesn't look implemented here (I see no logical change as we continue to instantiate JsonOutputParser in each case). Let me know if you have any specific questions.
