
Conversation

@ishaan-jaff (Contributor) commented Nov 12, 2025

[Feat] Dynamic Rate Limiter - Allow defining a rate limit policy by model + error

Fixes LIT-1389

This PR implements a dynamic rate limit policy that gives fine-grained control over when rate limits are enforced, based on provider-specific error thresholds. When dynamic rate limiting is enabled (rpm_limit_type: "dynamic" or tpm_limit_type: "dynamic"), rate limits are enforced only after a provider's configured error-type thresholds are exceeded.

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
  - model_name: claude-3-sonnet
    litellm_params:
      model: bedrock/anthropic.claude-3-sonnet-20240229-v1:0

litellm_settings:
  # Define provider-specific error thresholds
  dynamic_rate_limit_policy:
    openai:
      BadRequestErrorThreshold: 3
      RateLimitErrorThreshold: 5
      TimeoutErrorThreshold: 2
    bedrock:
      ContentPolicyViolationErrorThreshold: 4
      RateLimitErrorThreshold: 10
    azure:
      BadRequestErrorThreshold: 5
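
The enforcement behavior described above can be sketched roughly as follows. This is a minimal illustration of the threshold idea, not the PR's actual implementation; the class and method names are hypothetical:

```python
# Hypothetical sketch: enforce limits only once a provider's error
# counts cross the configured thresholds. Names are illustrative.
from collections import defaultdict


class DynamicRateLimitPolicy:
    """Tracks per-provider error counts and decides when to enforce limits."""

    def __init__(self, policy: dict):
        # policy shape mirrors the config above:
        # {"openai": {"RateLimitErrorThreshold": 5, ...}, ...}
        self.policy = policy
        self.error_counts = defaultdict(lambda: defaultdict(int))

    def track_failure(self, provider: str, error_type: str) -> None:
        self.error_counts[provider][error_type] += 1

    def should_enforce(self, provider: str) -> bool:
        """True once any configured threshold for this provider is reached."""
        thresholds = self.policy.get(provider, {})
        for name, threshold in thresholds.items():
            # "RateLimitErrorThreshold" -> "RateLimitError"
            error_type = name.removesuffix("Threshold")
            if self.error_counts[provider][error_type] >= threshold:
                return True
        return False


policy = DynamicRateLimitPolicy({"openai": {"RateLimitErrorThreshold": 5}})
for _ in range(5):
    policy.track_failure("openai", "RateLimitError")
print(policy.should_enforce("openai"))   # True: threshold of 5 reached
print(policy.should_enforce("bedrock"))  # False: no thresholds configured
```

A provider with no thresholds configured never triggers enforcement under this sketch, which matches the opt-in, per-provider shape of the config.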

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests via make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem

Type

🆕 New Feature
✅ Test

Changes

@vercel vercel bot commented Nov 12, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm · Deployment: Error · Preview: Error · Updated (UTC): Nov 12, 2025 2:36am

Comment on lines +1553 to +1554
f"[Dynamic Rate Limit] Tracked failure for deployment {deployment_id}, "
f"provider {custom_llm_provider}, error type {error_type}"

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This expression logs sensitive data (password) as clear text.

Copilot Autofix (AI, 1 day ago)

To fix this problem, avoid logging sensitive identifiers such as API keys, their hashes, deployment IDs (if derived from credentials), or any values that originate from authorization secrets. In this context, the deployment_id logged on line 1553 potentially exposes sensitive data, so we should redact it or replace it with a non-sensitive stand-in (e.g., "[REDACTED]"). Changing only the log output while keeping the rest of the logic intact preserves functionality.

How to fix:

  • On line 1553, when logging, mask or redact the deployment_id to prevent leakage of sensitive information.
  • You may use a general placeholder like "[REDACTED]" or something more descriptive, depending on real use-case knowledge.
  • No new imports are needed, and the edit only affects the single logging line.

Files/regions to change:

  • Edit the logging statement in litellm/proxy/hooks/parallel_request_limiter_v3.py, specifically lines 1553-1555.
Suggested changeset 1
litellm/proxy/hooks/parallel_request_limiter_v3.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/hooks/parallel_request_limiter_v3.py b/litellm/proxy/hooks/parallel_request_limiter_v3.py
--- a/litellm/proxy/hooks/parallel_request_limiter_v3.py
+++ b/litellm/proxy/hooks/parallel_request_limiter_v3.py
@@ -1550,7 +1550,7 @@
                                 error_type=error_type,
                             )
                             verbose_proxy_logger.debug(
-                                f"[Dynamic Rate Limit] Tracked failure for deployment {deployment_id}, "
+                                f"[Dynamic Rate Limit] Tracked failure for deployment [REDACTED], "
                                 f"provider {custom_llm_provider}, error type {error_type}"
                             )
                         
EOF
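
A more reusable alternative to a bare "[REDACTED]" placeholder is to log a short, non-reversible fingerprint of the identifier, so log lines stay correlatable without exposing the raw value. A minimal sketch (the helper name is illustrative, not part of the patch):

```python
# Hypothetical helper for the CodeQL finding above: log a stable
# fingerprint of an identifier instead of the raw value.
import hashlib


def redact_id(value: str, keep: int = 6) -> str:
    """Return a short SHA-256 prefix so repeated log lines for the same
    identifier can still be correlated without leaking the identifier."""
    digest = hashlib.sha256(value.encode()).hexdigest()
    return f"sha256:{digest[:keep]}"


print(redact_id("deployment-abc123"))  # same input always yields same tag
```

The log statement would then emit f"... deployment {redact_id(deployment_id)}, ..." instead of either the raw ID or a fixed placeholder.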

rpm_limit_type = metadata.get("rpm_limit_type")
tpm_limit_type = metadata.get("tpm_limit_type")

# Get dynamic rate limit policy from general_settings
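
The lookup in the snippet above might compose with the policy roughly like this. The settings and metadata shapes here are assumptions for illustration, not the PR's actual code:

```python
# Hedged sketch: resolve the per-provider policy only when a dynamic
# limit type is set. Shapes of metadata/general_settings are assumed.
def resolve_dynamic_policy(metadata: dict, general_settings: dict):
    """Return the dynamic rate limit policy, or None for static limits."""
    rpm_limit_type = metadata.get("rpm_limit_type")
    tpm_limit_type = metadata.get("tpm_limit_type")
    if "dynamic" not in (rpm_limit_type, tpm_limit_type):
        return None  # static limits: skip the policy lookup entirely
    return general_settings.get("dynamic_rate_limit_policy", {})


policy = resolve_dynamic_policy(
    {"rpm_limit_type": "dynamic"},
    {"dynamic_rate_limit_policy": {"openai": {"RateLimitErrorThreshold": 5}}},
)
print(policy)  # {'openai': {'RateLimitErrorThreshold': 5}}
```

Note the PR description places the policy under litellm_settings while this comment mentions general_settings; which settings object is authoritative is left to the PR itself.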
Is this only going to work with rpm_limit_type/tpm_limit_type at the key level? We also have these at the team level.


3 participants