
Add Gemini 3.1 Pro, Flash, and Flash Lite benchmark results #4922

Open
DUCKJAIII wants to merge 1 commit into Aider-AI:main from DUCKJAIII:add-gemini-3-benchmarks


@DUCKJAIII

Description

This PR adds the Polyglot benchmark results for the latest Google Vertex AI Gemini models, including the new 3.1 preview models.

Models Benchmarked:

  • vertex_ai/gemini-3.1-pro-preview
  • vertex_ai/gemini-3.1-flash-lite-preview
  • vertex_ai/gemini-3-flash

Run Details & Edit Formats

  • Pro & Flash models: Run with the diff-fenced edit format set explicitly. Both models handled it well.
  • Flash Lite: Left on its default whole edit format for stability and to avoid getting stuck in loops of edit-syntax errors.
  • Environment: Ran in Aider's official Docker container. The runs hit some expected litellm timeouts and API connection drops caused by strict Vertex AI quota limits on the preview endpoints, but all of them were resumed and completed successfully.
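
For readers who hit the same quota-related timeouts, here is a minimal sketch of retrying a flaky call with exponential backoff. It is a generic illustration, not how Aider or litellm handle retries internally. The names `call_with_retries`, `request`, `max_attempts`, and `base_delay` are hypothetical.

```python
import time


def call_with_retries(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request() and retry on timeouts or dropped connections.

    Hypothetical helper: waits base_delay, then 2x, 4x, ... between attempts
    and re-raises the last error once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.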

Benchmark Results (Pass Rate)

  • Gemini 3.1 Pro: 94.2%
  • Gemini 3 Flash: 82.7%
  • Gemini 3.1 Flash Lite: 68.4%
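
As a sanity check on the percentages above, the sketch below converts each pass rate into an implied number of solved exercises. It assumes the Polyglot benchmark's 225 exercises as the denominator. The counts are inferred from the reported rates, not read from the run directories, and `implied_passes` is a hypothetical helper.

```python
# Assumption: 225 exercises in the Polyglot benchmark. The solved counts are
# inferred from the reported percentages, not taken from the raw run logs.
TOTAL_EXERCISES = 225

reported = {
    "Gemini 3.1 Pro": 94.2,
    "Gemini 3 Flash": 82.7,
    "Gemini 3.1 Flash Lite": 68.4,
}


def implied_passes(rate_percent, total=TOTAL_EXERCISES):
    """Return the whole number of solved exercises closest to the given rate."""
    return round(rate_percent / 100 * total)


for model, rate in reported.items():
    passed = implied_passes(rate)
    recomputed = 100 * passed / TOTAL_EXERCISES
    print(f"{model}: ~{passed}/{TOTAL_EXERCISES} = {recomputed:.1f}%")
```

Under that assumption, each reported rate matches a whole number of solved exercises when rounded to one decimal place.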

Let me know if you need me to upload any of the raw run directories for verification!

@CLAassistant

CLAassistant commented Mar 15, 2026

CLA assistant check
All committers have signed the CLA.

@DUCKJAIII
Author

@CLAassistant recheck

@DUCKJAIII DUCKJAIII force-pushed the add-gemini-3-benchmarks branch from 49f30d9 to d7afbc9 Compare March 15, 2026 13:57
