openai-compatibility upstream requests hit a hard 60s timeout and return 500, truncating long responses #2157

adminwdsj · 2026-03-14T19:36:30Z


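Description

For an upstream configured through openai-compatibility (such as the NVIDIA API), when the model takes longer than 60s to respond,
CLIProxyAPI returns HTTP 500 directly and the request is cut off.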

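Reproduction scenario

NVIDIA moonshotai/kimi-k2-instruct-0905 is configured as an openai-compatibility upstream
The client (Chrome extension / cursor2api) sends a non-streaming request
When upstream model inference takes > 60s, CLIProxyAPI actively drops the connection and returns 500

Some requests that take longer than 1 minute are forcibly interrupted.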

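Expected behavior

openai-compatibility upstreams should support a configurable timeout (similar to the response-header-timeout that PR #2060, "feat: add per-credential response-header-timeout and treat 524 as transient error", provides for claude-api-key)
Or at least the default timeout should be raised to 300s to accommodate reasoning models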

Suggestion

Following the approach of #2060, also support a response-header-timeout field in the openai-compatibility configuration:

openai-compatibility:
- name: NVIDIA
  base-url: https://integrate.api.nvidia.com/v1
  response-header-timeout: 300  # timeout while waiting for the first byte from the upstream
  api-key-entries:
    - api-key: nvapi-xxx

The `response-header-timeout` from #2060 should also cover the `openai-compatibility` provider, not just `claude-api-key`.

luispater · 2026-03-15T23:16:26Z

Maintainer

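In this project, the code that handles upstream and downstream requests is not allowed to contain any request timeout.

From connecting through the proxy, to requesting the upstream, to forwarding upstream packets to the client, you can see that not a single line of code implements a timeout.

For an LLM proxy, having a timeout setting is irresponsible.

I suggest you check your own CDN and Nginx settings.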
