Add automatic retry logic for transient Gemini API errors (503, 429)#385

Open
YuqiGuo105 wants to merge 2 commits into google:main from YuqiGuo105:fix/chunk-retry-503
Conversation


YuqiGuo105 commented Feb 20, 2026

Description

Fixes #240

This change implements exponential backoff retry for transient errors in the Gemini provider, preventing entire document processing failures when a single chunk encounters temporary service overload (503) or rate limiting (429) errors.
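The backoff behavior described above can be sketched as follows. This is an illustrative sketch, not the actual patch: the parameter names (`max_retries`, `retry_delay`, `max_retry_delay`) mirror the PR's configuration, but `TransientError` and `call_with_backoff` are hypothetical stand-ins.

```python
import time

class TransientError(Exception):
    """Hypothetical stand-in for a 503/429 failure from the Gemini API."""

def call_with_backoff(fn, max_retries=3, retry_delay=1.0, max_retry_delay=30.0):
    """Retry fn() on transient errors, doubling the delay up to a cap."""
    delay = retry_delay
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            time.sleep(min(delay, max_retry_delay))
            delay *= 2  # exponential backoff
```

With `max_retries=3`, a chunk gets up to four attempts before its error propagates; permanent errors (anything other than the transient type) are raised immediately.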

Changes:

  • Add retry configuration parameters (max_retries, retry_delay, max_retry_delay)
  • Implement _is_retryable_error() to distinguish temporary vs permanent errors
  • Add exponential backoff retry logic in _process_single_prompt()
  • Each chunk retries independently without affecting other chunks
  • Add comprehensive test coverage (30 test cases)
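The error-classification step in the list above might look roughly like the sketch below. The PR's actual `_is_retryable_error()` may inspect google-genai exception types rather than message text; the marker-string approach here is an assumption for illustration only.

```python
def is_retryable_error(exc: Exception) -> bool:
    """Illustrative classifier: treat 503/429-style failures as transient.

    Hypothetical logic; the real implementation may match on exception
    classes from the SDK instead of substrings of the error message.
    """
    message = str(exc).lower()
    transient_markers = (
        "503", "overloaded", "unavailable",      # service overload
        "429", "rate limit", "resource_exhausted",  # rate limiting
    )
    return any(marker in message for marker in transient_markers)
```

A classifier like this lets the retry loop give up immediately on permanent failures (e.g. an invalid API key) instead of wasting quota on doomed retries.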

How Has This Been Tested?

Test suite in tests/test_gemini_retry.py with 30 test cases covering error classification, retry logic, parallel processing, and configuration.
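A test of this kind typically uses a fake model whose first calls fail transiently. The shape below is illustrative only and does not reproduce the actual `tests/test_gemini_retry.py`; `FakeModel` and the minimal `retry` helper are hypothetical.

```python
class FakeModel:
    """Fake client that raises a transient error for the first N calls."""
    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def generate(self, prompt):
        self.calls += 1
        if self.calls <= self.failures:
            raise RuntimeError("503 The model is overloaded")
        return f"ok:{prompt}"

def retry(fn, attempts):
    # Minimal retry helper used only by this test sketch.
    for i in range(attempts):
        try:
            return fn()
        except RuntimeError:
            if i == attempts - 1:
                raise

def test_retries_until_success():
    model = FakeModel(failures=2)
    result = retry(lambda: model.generate("chunk-1"), attempts=4)
    assert result == "ok:chunk-1"
    assert model.calls == 3
```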

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have added tests that prove my fix is effective
  • New and existing unit tests pass locally with my changes


google-cla bot commented Feb 20, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

github-actions bot added the size/L label (pull request with 600-1000 lines changed) on Feb 20, 2026
Fixes google#240

Benefits:
- Prevents API quota waste from re-processing entire documents
- Reduces 429 errors from excessive retries
- Improves reliability for large batch processing
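The per-chunk independence noted in the description (one chunk's 503 does not abort its siblings) can be sketched as below. This is a hypothetical illustration: `process_chunks` and `call_model` are not names from the PR, and the real provider's parallelism may differ.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def process_chunks(chunks, call_model, max_workers=4, attempts=3, delay=0.0):
    """Process chunks in parallel; each worker retries only its own chunk."""
    def process_one(chunk):
        last = None
        for _ in range(attempts):
            try:
                return call_model(chunk)
            except RuntimeError as exc:  # stand-in for a transient 503/429
                last = exc
                time.sleep(delay)
        raise last  # this chunk failed; siblings are unaffected
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_one, chunks))
```

Because retries happen inside each worker, a transient failure on one chunk only delays that chunk rather than forcing the whole document to be reprocessed.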
@github-actions

⚠️ Branch Update Required

Your branch is 1 commit behind main. Please update your branch to ensure CI checks run with the latest code:

git fetch origin main
git merge origin/main
git push

Note: Enable "Allow edits by maintainers" to allow automatic updates.

Development

Successfully merging this pull request may close these issues.

langextract chunking does not recover from 503 error "The model is overloaded"

1 participant