@Chesars Chesars commented Nov 12, 2025

Title

feat: Add support for reasoning_effort="none" for Gemini models

Relevant issues

Fixes #16420

Pre-Submission checklist

  • I have added testing in the tests/llm_translation/ directory
  • I have verified my test passes locally
  • My PR's scope is isolated - it only adds support for reasoning_effort="none"

Type

🆕 New Feature

Changes

Summary

Adds support for reasoning_effort="none", the OpenAI-standard value for disabling thinking, to Gemini models.

This is the official OpenAI-compatible way, as documented in Google's Gemini OpenAI guide:

"If you want to disable thinking, you can set reasoning_effort to 'none'"

Cuts token usage by up to 96% (~313 tokens down to ~12 in the measurement below) when reasoning is not required.

Implementation Details

1. Model Configuration (model_prices_and_context_window.json)

  • Added "supports_reasoning": true to gemini-2.0-flash-thinking-exp-01-21
  • Enables proper parameter validation for Gemini thinking models

2. Core Logic (vertex_and_google_ai_studio_gemini.py)

  • Implemented the mapping reasoning_effort="none" → {thinkingBudget: 0, includeThoughts: false}
  • Follows OpenAI standard specification
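
As a rough illustration of the mapping described above, here is a minimal sketch. The function name `sketch_thinking_config` is hypothetical and is not the code added in vertex_and_google_ai_studio_gemini.py; only the "none" case and the resulting dictionary come from this PR.

```python
from typing import Any, Dict, Optional


def sketch_thinking_config(reasoning_effort: Optional[str]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: build a Gemini-style thinkingConfig for "none"."""
    if reasoning_effort == "none":
        # Zero thinking budget and no returned thoughts, as described in this PR.
        return {"thinkingBudget": 0, "includeThoughts": False}
    # Other values are out of scope for this sketch.
    return None


print(sketch_thinking_config("none"))
```

Returning None for every other value keeps the sketch limited to the single case this PR documents.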

3. Testing (tests/llm_translation/test_gemini.py)

  • Added test_reasoning_effort_none_mapping() unit test
  • Verified correct Gemini API transformation

4. Documentation (docs/providers/gemini.md)

  • Updated with OpenAI standard reference and cost optimization guidance
  • Added note about Gemini 2.5 Pro limitations

Performance Impact

| Configuration | Tokens | Savings |
|---|---|---|
| Default (no parameter) | ~313 | baseline |
| reasoning_effort="none" | ~12 | 96% cheaper |
| reasoning_effort="disable" | ~12 | Same result |
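
The 96% figure follows directly from the two approximate token counts above. A small sketch of the arithmetic, where the helper name `token_savings_pct` is made up for illustration:

```python
def token_savings_pct(baseline_tokens: int, reduced_tokens: int) -> float:
    """Percentage of tokens saved relative to the baseline."""
    return 100.0 * (baseline_tokens - reduced_tokens) / baseline_tokens


# Approximate counts reported in this PR: ~313 by default, ~12 with "none".
savings = token_savings_pct(313, 12)
print(f"{savings:.1f}%")  # prints 96.2%
```

This measures the reduction in tokens for one short prompt. The actual cost difference depends on per-token pricing and on the prompt itself.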

Usage

import litellm

response = litellm.completion(
    model="gemini/gemini-2.0-flash-thinking-exp-01-21",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    reasoning_effort="none"  # OpenAI standard, 96% cheaper!
)

Why "none" or "disable"?

  • "none" = OpenAI standard ✅ (recommended)
  • "disable" = LiteLLM legacy (same result, still works)

Both produce identical results, but "none" is the official OpenAI API specification.
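
One way to picture the equivalence of the two values is a single check that treats them as aliases. This is a sketch only: `DISABLE_THINKING_VALUES` and `disables_thinking` are hypothetical names, not LiteLLM identifiers.

```python
from typing import Optional

# Hypothetical alias set: the OpenAI-standard value plus the LiteLLM legacy value.
DISABLE_THINKING_VALUES = frozenset({"none", "disable"})


def disables_thinking(reasoning_effort: Optional[str]) -> bool:
    """Return True when the given value asks for thinking to be turned off."""
    return reasoning_effort in DISABLE_THINKING_VALUES


print(disables_thinking("none"), disables_thinking("disable"))
```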

Backward Compatibility

✅ Fully backward compatible - existing code continues to work unchanged.


Note

Adds support for reasoning_effort="none" for Gemini, mapping to thinkingBudget=0/includeThoughts=false, with model metadata, docs, and tests updated.

  • Core (Gemini)
    • Map reasoning_effort="none" to thinkingConfig {"thinkingBudget": 0, "includeThoughts": false} in vertex_and_google_ai_studio_gemini.py.
  • Model Metadata
    • Mark reasoning capability with "supports_reasoning": true for relevant Gemini models in model_prices_and_context_window.json.
  • Tests
    • Add test_reasoning_effort_none_mapping to verify correct mapping.
  • Docs
    • Update docs/providers/gemini.md with cost-optimization guidance, OpenAI compatibility, mapping table including "none", and note on Gemini 2.5 Pro limitations; add usage examples.

Written by Cursor Bugbot for commit c977ca1.

Implements support for reasoning_effort="none" parameter for Gemini models,
providing significant cost savings (up to 96% cheaper) by disabling thinking
budget while maintaining response quality.

Changes:
- Added "supports_reasoning": true to gemini-2.0-flash-thinking-exp-01-21 in model config
- Implemented mapping for reasoning_effort="none" to thinkingConfig {thinkingBudget: 0, includeThoughts: false}
- Added unit test to verify the mapping works correctly

Performance impact:
- Without reasoning_effort: ~313 tokens
- With reasoning_effort="none": ~12 tokens (96% cheaper)

Closes BerriAI#16420

@krrishdholakia krrishdholakia merged commit 491f57a into BerriAI:main Nov 13, 2025
4 of 6 checks passed
@Chesars Chesars deleted the feat/gemini-reasoning-effort-none branch November 13, 2025 10:45
LingXuanYin pushed a commit to talesofai/litellm that referenced this pull request Nov 14, 2025
…iAI#16548)

Co-authored-by: Krish Dholakia <[email protected]>