feat(ai-gemini): add Gemini 3 text and image models #190
base: main
Conversation
📝 Walkthrough
Introduces two new Gemini 3 models (Flash and Pro Image) to the Gemini provider's model metadata.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.changeset/gemini-3-models.md
packages/typescript/ai-gemini/src/model-meta.ts
packages/typescript/ai-gemini/tests/model-meta.test.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Use tree-shakeable adapter architecture for provider implementations - export specialized adapters (text, embedding, summarize, image) as separate imports from the /adapters subpath rather than monolithic adapters
Use Zod for runtime schema validation and type inference, particularly for tool input/output definitions with toolDefinition() and Zod schema inference
Implement isomorphic tool system using toolDefinition() with .server() and .client() implementations for dual-environment execution
Use type-safe per-model configuration with provider options typed based on selected model to ensure compile-time safety
Implement stream processing with StreamProcessor for handling chunked responses and support partial JSON parsing for streaming AI responses
Files:
packages/typescript/ai-gemini/src/model-meta.ts
packages/typescript/ai-gemini/tests/model-meta.test.ts
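The per-model type-safety guideline above can be sketched roughly as follows. This is a hypothetical illustration, not the actual TanStack AI types: the model names are real, but `ProviderOptionsByModel`, `generate`, and the option fields are invented for the sketch.

```typescript
// Hypothetical sketch of per-model typed provider options.
type ModelName = 'gemini-3-flash-preview' | 'gemini-3-pro-image-preview'

// Each model maps to the option shape it accepts (fields are illustrative).
interface ProviderOptionsByModel {
  'gemini-3-flash-preview': { thinkingBudget?: number }
  'gemini-3-pro-image-preview': { imageResolution?: '1k' | '2k' | '4k' }
}

// The options parameter is typed by the chosen model, so passing image
// options to the text model fails at compile time.
function generate<M extends ModelName>(
  model: M,
  options: ProviderOptionsByModel[M],
): string {
  return `${model}:${JSON.stringify(options)}`
}

console.log(generate('gemini-3-flash-preview', { thinkingBudget: 1024 }))
```

With this shape, `generate('gemini-3-flash-preview', { imageResolution: '4k' })` is a compile-time error, which is the kind of safety the guideline asks model metadata to enable.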
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use camelCase for function and variable names throughout the codebase
Files:
packages/typescript/ai-gemini/src/model-meta.ts
packages/typescript/ai-gemini/tests/model-meta.test.ts
packages/typescript/*/src/model-meta.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain model metadata files that define provider options and capabilities per model for per-model type safety
Files:
packages/typescript/ai-gemini/src/model-meta.ts
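The model-metadata guideline above relies on the `as const satisfies` pattern visible in the diff below. A minimal sketch of that pattern, using a stand-in `ModelMeta` interface rather than the real one from the package:

```typescript
// Stand-in for the package's ModelMeta interface (simplified for the sketch).
interface ModelMeta {
  name: string
  max_input_tokens: number
  max_output_tokens: number
}

// `as const satisfies ModelMeta` checks the object against the interface
// without widening the literal types.
const GEMINI_3_FLASH_SKETCH = {
  name: 'gemini-3-flash-preview',
  max_input_tokens: 1_048_576,
  max_output_tokens: 65_536,
} as const satisfies ModelMeta

// Because the literals are preserved, the name keeps its literal type,
// which is what makes per-model option lookups possible downstream.
const modelName: 'gemini-3-flash-preview' = GEMINI_3_FLASH_SKETCH.name
console.log(modelName)
```

A plain `const x: ModelMeta = …` annotation would widen `name` to `string` and lose the per-model information; `satisfies` keeps both the shape check and the literals.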
**/*.test.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Write unit tests using Vitest alongside source files with the .test.ts naming convention
Files:
packages/typescript/ai-gemini/tests/model-meta.test.ts
🧠 Learnings (4)
📚 Learning: 2025-12-13T17:09:09.794Z
Learnt from: CR
Repo: TanStack/ai PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T17:09:09.794Z
Learning: Applies to packages/typescript/*/src/model-meta.ts : Maintain model metadata files that define provider options and capabilities per model for per-model type safety
Applied to files:
packages/typescript/ai-gemini/src/model-meta.ts
packages/typescript/ai-gemini/tests/model-meta.test.ts
📚 Learning: 2025-12-13T17:09:09.794Z
Learnt from: CR
Repo: TanStack/ai PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T17:09:09.794Z
Learning: Applies to **/*.{ts,tsx} : Use type-safe per-model configuration with provider options typed based on selected model to ensure compile-time safety
Applied to files:
packages/typescript/ai-gemini/tests/model-meta.test.ts
📚 Learning: 2025-12-13T17:09:09.794Z
Learnt from: CR
Repo: TanStack/ai PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T17:09:09.794Z
Learning: Applies to packages/typescript/*/src/adapters/*.ts : Create individual adapter implementations for each provider capability (text, embed, summarize, image) with separate exports to enable tree-shaking
Applied to files:
packages/typescript/ai-gemini/tests/model-meta.test.ts
📚 Learning: 2025-12-13T17:09:09.794Z
Learnt from: CR
Repo: TanStack/ai PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T17:09:09.794Z
Learning: Maintain type safety through multimodal content support (image, audio, video, document) with model capability awareness
Applied to files:
packages/typescript/ai-gemini/tests/model-meta.test.ts
🧬 Code graph analysis (1)
packages/typescript/ai-gemini/src/model-meta.ts (1)
packages/typescript/ai-gemini/src/text/text-provider-options.ts (6)
GeminiToolConfigOptions (9-14), GeminiSafetyOptions (16-23), GeminiGenerationConfigOptions (25-168), GeminiCachedContentOptions (170-175), GeminiStructuredOutputOptions (177-220), GeminiThinkingOptions (222-241)
🔇 Additional comments (8)
packages/typescript/ai-gemini/src/model-meta.ts (4)
820-820: LGTM! Correct addition of GEMINI_3_FLASH to the GEMINI_MODELS array for text generation models.
835-835: LGTM! Correct addition of GEMINI_3_PRO_IMAGE to the GEMINI_IMAGE_MODELS array for image generation models.
916-921: LGTM! The type maps are correctly updated for GEMINI_3_FLASH:
- Provider options map includes all appropriate option types (thinking, structured output, base options)
- Input modalities map references the model's supported inputs
The pattern of excluding image generation models from these maps is consistent with existing models.
Also applies to: 981-981
86-121: All model metadata for GEMINI_3_FLASH is accurate and verified against official Google documentation:
- Token limits (1M input / 65k output), pricing ($0.50/$3.00 per 1M tokens), capabilities (function calling, structured output, code execution, thinking, URL context, file search, search grounding, batch API, caching), and knowledge cutoff (2025-01-01) are all correct.
- Multi-modal input support and text-only output align with official specifications.
- Type constraints match the existing model pattern.
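The per-1M-token prices quoted above ($0.50 input / $3.00 output for gemini-3-flash-preview) translate into request costs as in this rough sketch; the function name and the example token counts are illustrative, not part of the package:

```typescript
// Rough cost estimate for gemini-3-flash-preview using the quoted
// per-1M-token prices ($0.50 input / $3.00 output). Illustrative only.
const INPUT_PER_MILLION = 0.5
const OUTPUT_PER_MILLION = 3.0

function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_PER_MILLION
  )
}

// Example: a request with 100k input tokens and 10k output tokens.
console.log(estimateCostUsd(100_000, 10_000))
```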
packages/typescript/ai-gemini/tests/model-meta.test.ts (3)
63-83: LGTM! Comprehensive test coverage for gemini-3-flash-preview provider options, correctly mirroring the gemini-3-pro-preview tests since both models share the same capabilities.
197-197: LGTM! The test assertions correctly include gemini-3-flash-preview across all provider option checks (type completeness, safety settings, tool config, cached content, thinking options, structured output).
Also applies to: 229-231, 259-261, 289-291, 321-323, 354-356
416-427: LGTM! The multimodal input type assertions correctly verify that gemini-3-flash-preview supports all content part types (text, image, audio, video, document), consistent with the model's metadata.
.changeset/gemini-3-models.md (1)
1-5: LGTM! The changeset correctly specifies a minor version bump for adding new Gemini 3 models, with an accurate description of the changes.
```ts
const GEMINI_3_PRO_IMAGE = {
  name: 'gemini-3-pro-image-preview',
  max_input_tokens: 65_536,
  max_output_tokens: 32_768,
  knowledge_cutoff: '2025-01-01',
  supports: {
    input: ['text', 'image'],
    output: ['text', 'image'],
    capabilities: [
      'batch_api',
      'image_generation',
      'search_grounding',
      'structured_output',
      'thinking',
    ],
  },
  pricing: {
    input: {
      normal: 2,
    },
    output: {
      normal: 0.134,
    },
  },
} as const satisfies ModelMeta<
  GeminiToolConfigOptions &
    GeminiSafetyOptions &
    GeminiGenerationConfigOptions &
    GeminiCachedContentOptions &
    GeminiStructuredOutputOptions &
    GeminiThinkingOptions
>
```
🧩 Analysis chain
🌐 Web query:
Gemini 3 Pro Image Preview API official documentation token limits pricing capabilities
💡 Result:
Here’s a concise summary of the official Gemini 3 Pro Image Preview (gemini-3-pro-image-preview) API details from Google’s docs:
- Model name / ID: Gemini 3 Pro Image Preview — gemini-3-pro-image-preview. [1][2]
- Context / token limits: Input token limit 65,536; output token limit 32,768. (Gemini 3 Pro text model uses larger 1,048,576 / 65,536 limits.) [2]
- Capabilities: multimodal (text + image inputs, text + image outputs), image generation supported; many text-only Pro capabilities (structured outputs, search grounding, batch API) — some capabilities (e.g., code execution, function calling, caching) differ from text-only Pro. See model capability table. [2]
- Rate / batch limits: Batch-enqueued token limits shown per tier (examples: Tier 1 = 2,000,000; Tier 2 = 270,000,000; Tier 3 = 1,000,000,000). Also separate API rate-limit rules shown in rate limits docs. [3]
- Pricing (official preview pricing summary): standard API pricing shows text/input priced similar to Gemini 3 Pro; image output pricing varies by resolution. Example listed rates (per 1M tokens or equivalent): Input ≈ $2 / 1M tokens (text/image); Output (text/thinking) ≈ $12 / 1M tokens; image output per-image pricing shown as per-resolution (e.g., ~$0.134 per 1K/2K image unit; higher for 4K). Batch pricing is lower; preview models and image pricing details linked on pricing page. (See official pricing table for exact resolution-tier numbers and whether billed per-image or per-token.) [1][4]
- Docs / reference pages (official): Gemini 3 series model page, Gemini models list (model properties), rate-limits page, and pricing page. Use those to see the latest updates and resolution-specific image pricing. [1][2][3][4]
Sources: Google AI / Gemini docs — model page, models list, rate-limits, pricing. [1][2][3][4]
Would you like me to fetch the exact per-resolution image prices and show a copied small table (with direct citations)?
🌐 Web query:
Gemini 3 Pro Image Preview model thinking capability support structured outputs
💡 Result:
Yes — the Gemini 3 Pro Image Preview (model id gemini-3-pro-image-preview) supports both "Thinking" (multi-step/internal reasoning) and "Structured outputs" (e.g., JSON-style structured responses). It accepts image + text inputs and returns text and image outputs. [1][2]
Sources:
[1] Google AI models (Gemini 3 Pro Image model details).
[2] Vertex AI / Generative AI docs (Gemini 3 Pro Image capabilities).
🌐 Web query:
Gemini 3 Pro Image gemini-3-pro-image-preview output pricing per token knowledge cutoff
💡 Result:
Summary — gemini-3-pro-image-preview
- Knowledge cutoff: January 2025. [1][2]
- Text (token) pricing (preview): $2.00 per 1M input tokens; $12.00 per 1M output tokens. [1][3]
- Image pricing (preview): image input ≈ $0.0011 per image; image output ≈ $0.134 per 1K/2K image and $0.24 per 4K image (pricing varies by resolution). [1][3]
Sources: Google Gemini model & pricing docs. [1] Gemini model page; [2] Vertex AI / Gemini 3 FAQ; [3] Gemini pricing page.
Fix incomplete output pricing metadata.
The token limits, 'thinking' capability, and knowledge cutoff (January 2025) are correct. However, the output pricing is incomplete: the value 0.134 represents image output pricing per 1K/2K image unit only. For a model that supports both text and image outputs, the pricing must account for text output at $12.00 per 1M tokens. Update the pricing structure to distinguish between text output ($12 per 1M tokens) and image output ($0.134 per 1K/2K image, $0.24 per 4K image).
Not really sure what convention I should follow here, it doesn't seem like we do this for other models.
@vmg-dev You're absolutely right! Looking at the existing models in the codebase, all image generation models (GEMINI_2_5_FLASH_IMAGE, GEMINI_2_FLASH_IMAGE, IMAGEN_4_GENERATE, etc.) use a single output.normal value, and the current ModelMeta pricing structure only supports one output price.
The 0.134 value you're using aligns with the image output pricing from the documentation, which is appropriate for an image generation model. Separating text vs. image output pricing would require restructuring the pricing interface itself, which is beyond the scope of this PR.
Your implementation follows the established convention correctly. Sorry for the confusion!
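For reference, the restructuring the bot describes as out of scope could hypothetically look like the sketch below. This is purely illustrative: `SplitOutputPricing` and its field names are invented here, and the current ModelMeta interface supports only a single `output.normal` value.

```typescript
// Hypothetical sketch of a split text/image output pricing shape.
// NOT the actual ModelMeta pricing interface, which has one output price.
interface SplitOutputPricing {
  input: { normal: number }
  output: {
    text?: { perMillionTokens: number }
    image?: { per1k2kImage: number; per4kImage: number }
  }
}

// Values taken from the pricing figures quoted in the review above.
const geminiProImagePricing: SplitOutputPricing = {
  input: { normal: 2 },
  output: {
    text: { perMillionTokens: 12 },
    image: { per1k2kImage: 0.134, per4kImage: 0.24 },
  },
}

console.log(geminiProImagePricing.output.image?.per1k2kImage)
```

Adopting such a shape would touch every existing model definition and the pricing consumers, which is why keeping the single-value convention in this PR is the reasonable call.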
🧠 Learnings used
Learnt from: CR
Repo: TanStack/ai PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T17:09:09.794Z
Learning: Applies to packages/typescript/*/src/model-meta.ts : Maintain model metadata files that define provider options and capabilities per model for per-model type safety
🎯 Changes
Add support for the new Gemini 3 model family:
- gemini-3-flash-preview - Text generation model with 1M input / 64k output tokens; supports thinking, structured output, function calling, code execution, search grounding, and URL context
- gemini-3-pro-image-preview - Image generation model with 65k input / 32k output tokens; supports thinking, structured output, search grounding, and native image generation

Both models are added to model-meta.ts.
✅ Checklist
- Tests pass via pnpm run test:pr.
🚀 Release Impact