@conradlee conradlee commented Nov 6, 2025

Summary

This PR updates GoogleJsonSchemaTransformer to support the enhanced JSON Schema features announced by Google in November 2025 (announcement) for Gemini 2.5+ models.

Changes

Core Transformer Updates (pydantic_ai/profiles/google.py)

  • Changed prefer_inlined_defs=True to False to use native $ref/$defs instead of inlining
  • Removed stripping of now-supported fields: title, additionalProperties, prefixItems
  • Removed error raising for $ref schemas (now natively supported)
  • Removed oneOf→anyOf conversion (both now work natively)
  • Kept stripping exclusiveMinimum/exclusiveMaximum (empirically confirmed not yet supported by Google SDK)
  • Kept stripping discriminator field (causes validation errors with nested oneOf)
  • Updated docstring with supported features list and reference to Google's announcement
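
The stripping that is kept can be pictured with a minimal sketch on plain dictionaries. This is not the code in `pydantic_ai/profiles/google.py`: `strip_unsupported`, `UNSUPPORTED_KEYWORDS`, and `NAME_MAPS` are hypothetical names, and the keyword set simply mirrors the bullets above.

```python
from typing import Any

# Assumption: mirrors the keywords this PR reports as still rejected.
UNSUPPORTED_KEYWORDS = {"exclusiveMinimum", "exclusiveMaximum", "discriminator"}
# Keys whose values map user-chosen names to sub-schemas; the names
# themselves are not keywords and must never be filtered.
NAME_MAPS = {"properties", "$defs"}


def strip_unsupported(schema: Any) -> Any:
    """Return a copy of `schema` without the unsupported keywords, at any depth."""
    if isinstance(schema, list):
        return [strip_unsupported(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    cleaned = {}
    for key, value in schema.items():
        if key in UNSUPPORTED_KEYWORDS:
            continue  # drop the keyword together with its value
        if key in NAME_MAPS and isinstance(value, dict):
            # Keep every property/definition name, clean only the sub-schemas.
            cleaned[key] = {name: strip_unsupported(sub) for name, sub in value.items()}
        else:
            cleaned[key] = strip_unsupported(value)
    return cleaned
```

Treating `properties` and `$defs` separately keeps a field that happens to be called `discriminator` from being dropped along with the keyword of the same name.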

New Features Now Supported

Native support added for:

  • title fields for short property descriptions
  • anyOf and oneOf for conditional structures (unions)
  • $ref and $defs for recursive schemas and reusable definitions
  • minimum and maximum numeric constraints
  • additionalProperties for dictionaries
  • type: 'null' for optional fields
  • prefixItems for tuple-like arrays

Still not supported (empirically tested):

  • exclusiveMinimum and exclusiveMaximum (Google SDK validation error)
  • discriminator field (causes validation errors with nested oneOf)
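
Native `$ref`/`$defs` support matters most for schemas that refer back to themselves, since those cannot be flattened. The sketch below shows what such a schema looks like as a plain dictionary and one way to detect the cycle. `TREE_SCHEMA`, `ref_targets`, and `is_recursive` are hypothetical names for illustration only, and only references of the form `#/$defs/<name>` are considered (a bare `#` root reference is ignored).

```python
from typing import Any, Iterator

PREFIX = "#/$defs/"

# A tree node whose children are again tree nodes.
TREE_SCHEMA = {
    "$defs": {
        "Node": {
            "type": "object",
            "properties": {
                "value": {"type": "integer"},
                "children": {"type": "array", "items": {"$ref": "#/$defs/Node"}},
            },
            "required": ["value"],
        }
    },
    "$ref": "#/$defs/Node",
}


def ref_targets(node: Any) -> Iterator[str]:
    """Yield the definition name of every '#/$defs/<name>' reference under `node`."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(PREFIX):
            yield ref[len(PREFIX):]
        for value in node.values():
            yield from ref_targets(value)
    elif isinstance(node, list):
        for item in node:
            yield from ref_targets(item)


def is_recursive(schema: dict) -> bool:
    """Return True if any definition can reach itself by following references."""
    defs = schema.get("$defs", {})

    def reaches(name: str, target: str, seen: set) -> bool:
        for nxt in ref_targets(defs.get(name, {})):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reaches(nxt, target, seen):
                    return True
        return False

    return any(reaches(name, name, set()) for name in defs)
```

For `TREE_SCHEMA` the check returns `True`, because `Node` references itself through `children`.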

Documentation (docs/models/google.md)

Added new "Enhanced JSON Schema Support" section with:

  • List of newly supported features
  • Example of recursive schemas (tree structures)
  • Example of union types with discriminated unions

Tests

  • Updated test_gemini.py::test_json_def_recursive to verify recursive schemas work instead of raising errors
  • Added test_google_json_schema_features.py with comprehensive tests for all new features
  • Added test_google_discriminator.py documenting that discriminator field must be stripped

Testing Methodology

All changes were empirically validated using Google Vertex AI with Application Default Credentials (project: ck-nest-dev, location: europe-west1, model: gemini-2.5-flash). Each feature was tested individually to confirm:

  1. Schema generation with the new features
  2. Successful API validation
  3. Correct model responses

Features confirmed NOT working (exclusiveMinimum/exclusiveMaximum, discriminator) were tested and shown to cause Google SDK validation errors.

Migration Impact

This is a backwards-compatible enhancement:

  • Existing code will continue to work
  • Schemas will now be more expressive with native $ref support
  • Models can handle more complex structures (recursive schemas, union types, etc.)

Notes

  • VCR cassettes for new tests will need to be recorded with a real API key during review
  • The PR was largely written by Claude Code under the direction of @conradlee and is meant to be a helpful draft for the maintainers to review and refine

Related: Google Announcement - Gemini API Structured Outputs

🤖 Generated with Claude Code

Co-Authored-By: Claude [email protected]

conradlee and others added 5 commits November 6, 2025 16:50

Google announced in November 2025 that Gemini 2.5+ models now support
enhanced JSON Schema features including title, $ref/$defs, anyOf/oneOf,
minimum/maximum, additionalProperties, prefixItems, and property ordering.
This removes workarounds in GoogleJsonSchemaTransformer and allows native
$ref and oneOf support instead of forced inlining and conversion.

Key findings from empirical testing:
- Native $ref/$defs support confirmed (no inlining needed)
- Both anyOf and oneOf work natively (no conversion needed)
- exclusiveMinimum/exclusiveMaximum NOT yet supported by Google SDK

Changes:
- Set prefer_inlined_defs=False to use native $ref/$defs instead of inlining
- Remove oneOf→anyOf conversion (both work natively now)
- Remove adapter code that stripped title, additionalProperties, and prefixItems
- Keep stripping exclusiveMinimum/exclusiveMaximum (not yet supported)
- Remove code that raised errors for $ref schemas
- Update GoogleJsonSchemaTransformer docstring to document all supported features
- Update test_json_def_recursive to verify recursive schemas work with $ref
- Add comprehensive test suite for new JSON Schema capabilities
- Add documentation section highlighting enhanced JSON Schema support with examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Updated GoogleJsonSchemaTransformer docstring to note that discriminator
  is not supported (causes validation errors with nested oneOf)
- Added reference to Google's announcement blog post
- Added test_google_discriminator.py to document the limitation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Changed test to verify discriminator stripping without API calls
- Added proper type hints for pyright compliance
- Test now validates transformation behavior directly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Critical fixes:
- Rewrote test_google_json_schema_features.py to test schema transformation only
  (not API calls) since enhanced features require Vertex AI which CI doesn't have
- Added prominent warning in docs that enhanced features are Vertex AI only
- Updated doc examples to use google-vertex: prefix
- Fixed test_google_discriminator.py schema path issue
- All tests now pass locally

Key discovery: additionalProperties, $ref, and other enhanced features
are NOT supported in the Generative Language API (google-gla:), only
in Vertex AI (google-vertex:). This is validated by the Google SDK.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

CRITICAL FIX: The same GoogleJsonSchemaTransformer was being used for both
Vertex AI and GLA, but they have different JSON Schema support levels.

Changes:
- Created GoogleVertexJsonSchemaTransformer (enhanced features supported)
  * Supports: $ref, $defs, additionalProperties, title, prefixItems, etc.
  * Uses prefer_inlined_defs=False for native $ref support

- Created GoogleGLAJsonSchemaTransformer (limited features)
  * Strips: additionalProperties, title, prefixItems
  * Uses prefer_inlined_defs=True to inline all $refs
  * More conservative transformations for GLA compatibility

- Updated GoogleGLAProvider to use google_gla_model_profile
- Updated GoogleVertexProvider to use google_vertex_model_profile
- GoogleJsonSchemaTransformer now aliases to Vertex version (backward compat)
- Updated all tests to use GoogleVertexJsonSchemaTransformer

This ensures GLA won't receive unsupported schema features that cause
validation errors like "additionalProperties is not supported in the Gemini API"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
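
The difference between the two settings of `prefer_inlined_defs` described in the commit above can be sketched on plain dictionaries. This is an illustration only, assuming that inlining means replacing each `#/$defs/<name>` reference with a copy of its definition; `inline_defs` is a hypothetical helper, not the transformer's actual code.

```python
from typing import Any

PREFIX = "#/$defs/"


def inline_defs(schema: dict) -> dict:
    """Return a copy of `schema` with every '#/$defs/<name>' reference expanded.

    Raises ValueError for recursive references, which cannot be expanded
    into a finite schema.
    """
    defs = schema.get("$defs", {})

    def resolve(node: Any, stack: tuple) -> Any:
        if isinstance(node, list):
            return [resolve(item, stack) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(PREFIX):
            name = ref[len(PREFIX):]
            if name in stack:
                raise ValueError(f"recursive $ref to {name!r} cannot be inlined")
            target = resolve(defs[name], stack + (name,))
            # Keywords written next to the $ref (e.g. a description) win.
            siblings = {
                key: resolve(value, stack)
                for key, value in node.items()
                if key not in ("$ref", "$defs")
            }
            return {**target, **siblings}
        # "$defs" is dropped because nothing refers to it after inlining.
        return {key: resolve(value, stack) for key, value in node.items() if key != "$defs"}

    return resolve(schema, ())
```

A non-recursive schema comes out flat, which is the conservative shape described for GLA; a self-referencing definition raises, which is the case that needs native `$ref` support.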
## Enhanced JSON Schema Support

!!! note "Vertex AI Only"
    The enhanced JSON Schema features listed below are **only available when using Vertex AI** (`google-vertex:` prefix or `GoogleProvider(vertexai=True)`). They are **not supported** in the Generative Language API (`google-gla:` prefix).
Collaborator

Note that https://ai.google.dev/gemini-api/docs/structured-output?example=feedback#model_support says we have to use response_json_schema instead of the response_schema key we currently set:

response_schema=response_schema,

response_schema=generation_config.get('response_schema'),

When we do that, maybe it will work for GLA and Vertex?


`GoogleModel` supports multi-modal input, including documents, images, audio, and video. See the [input documentation](../input.md) for details and examples.

## Enhanced JSON Schema Support
Collaborator

Let's remove this entire docs section; we transform the schemas exactly so the user doesn't have to know about these details.

Note: This is a generic profile. For Google-specific providers, use:
- google_vertex_model_profile() for Vertex AI (supports enhanced JSON Schema)
- google_gla_model_profile() for Generative Language API (limited JSON Schema)
Collaborator

As mentioned above, I don't think this should be necessary


     def __init__(self, schema: JsonSchema, *, strict: bool | None = None):
-        super().__init__(schema, strict=strict, prefer_inlined_defs=True, simplify_nullable_unions=True)
+        super().__init__(schema, strict=strict, prefer_inlined_defs=False, simplify_nullable_unions=True)
Collaborator

  • We can drop prefer_inlined_defs as it defaults to False
  • Do we still need simplify_nullable_unions? type: 'null' is now supported natively
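
For context on the second question, here is a rough sketch of what simplifying a nullable union is understood to mean: collapsing `anyOf: [X, {type: 'null'}]` into `X` with an OpenAPI-style `nullable: true`. This is an assumption about the flag's effect, not a copy of the library code, and `simplify_nullable_union` is a hypothetical name.

```python
def simplify_nullable_union(schema: dict) -> dict:
    """Collapse a two-member anyOf of the form [X, {'type': 'null'}] into X plus nullable."""
    cases = schema.get("anyOf")
    if not isinstance(cases, list) or len(cases) != 2:
        return schema  # not a two-member union, leave untouched
    non_null = [case for case in cases if case != {"type": "null"}]
    if len(non_null) != 1:
        return schema  # neither or both members are null, leave untouched
    rest = {key: value for key, value in schema.items() if key != "anyOf"}
    return {**rest, **non_null[0], "nullable": True}
```

If `type: 'null'` is accepted natively, the `anyOf` form on the input side could be sent unchanged, which is what the question above is probing.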

# - additionalProperties (for dict types)
# - $ref (for recursive schemas)
# - prefixItems (for tuple-like arrays)
# These are no longer stripped from the schema.
Collaborator

We don't need this comment



async def test_json_def_recursive(allow_model_requests: None):
    """Test that recursive schemas with $ref are now supported (as of November 2025)."""
Collaborator

We can drop this test as we're not really testing anything useful here.

Collaborator

Instead of testing the schema transformer itself, we should add a test to test_google.py that uses a BaseModel like this as NativeOutput and then verifies that the request succeeds.

Collaborator

Same as above:

  • We don't need to test that things are preserved
  • We should test that BaseModels that use certain JSON schema features work as output type, verifying that the schema is transformed in a way that actually works with the API, rather than superficially testing the specific schema transformation we implemented.

Collaborator

DouweM commented Nov 7, 2025

@conradlee Thanks for working on this Conrad!

Development

Successfully merging this pull request may close these issues.

Support Gemini API's response_json_schema for structured output