Added timeout and cancellation support to ModelClient polling loop… #559

Open
Timandes wants to merge 3 commits into alibaba:master from Timandes:fix/gh-issue-550

Timandes (Contributor) commented Mar 3, 2026

Summary

Fixes #550 - Add timeout and cancellation support to ModelClient polling loops.

Problem

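The polling loops (while True) in ModelClient have no timeout mechanism and no cancellation support, so they can block the program forever:

  • pop_request waits indefinitely for a request
  • wait_for_first_request waits indefinitely for the log file
  • Neither method can respond to an asyncio.CancelledError cancellation signal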

Solution

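Combine Option 1 and Option 2 to implement complete timeout and cancellation support:

  1. Timeout mechanism

    • Add a timeout parameter with a default value of 60 seconds
    • Use time.monotonic() to compute the deadline
    • Raise TimeoutError once the timeout expires
  2. Cancellation support

    • Explicitly catch asyncio.CancelledError
    • Log it and re-raise, following project conventions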
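The two points above can be sketched as a single polling helper. This is a minimal illustration, not the code in rock/sdk/model/client.py: the function name poll_until, the check callable, and the interval parameter are all hypothetical. It computes a deadline with time.monotonic(), raises TimeoutError once the deadline passes, and logs and re-raises asyncio.CancelledError.

```python
import asyncio
import logging
import time

logger = logging.getLogger(__name__)


async def poll_until(check, timeout=60.0, interval=0.05):
    """Poll check() until it returns a non-None value or the deadline passes.

    Hypothetical helper for illustration only. timeout=None waits forever.
    """
    # A monotonic clock is unaffected by wall-clock adjustments.
    deadline = None if timeout is None else time.monotonic() + timeout
    try:
        while True:
            result = check()
            if result is not None:
                return result
            if deadline is not None and time.monotonic() >= deadline:
                raise TimeoutError(f"no result within {timeout} seconds")
            # Yield to the event loop; cancellation is delivered here.
            await asyncio.sleep(interval)
    except asyncio.CancelledError:
        # Log, then re-raise so the caller sees the cancellation.
        logger.info("polling cancelled")
        raise
```

Re-raising matters here: swallowing asyncio.CancelledError would leave the task running after the caller asked it to stop.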

Changes

Core Implementation

  • rock/sdk/model/client.py
    • Add DEFAULT_POLL_TIMEOUT = 60.0 constant
    • pop_request(index, timeout=DEFAULT_POLL_TIMEOUT) - add a timeout parameter
    • wait_for_first_request(timeout=DEFAULT_POLL_TIMEOUT) - add a timeout parameter
    • Both methods now properly handle asyncio.CancelledError

Tests

  • tests/unit/sdk/model/test_model_client.py
    • test_pop_request_raises_timeout_error_when_timeout_expires
    • test_wait_for_first_request_raises_timeout_error_when_timeout_expires
    • test_pop_request_propagates_cancelled_error
    • test_wait_for_first_request_propagates_cancelled_error
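As a rough idea of what the timeout and cancellation tests listed above check, here is a self-contained sketch. It does not reproduce tests/unit/sdk/model/test_model_client.py: StubClient is a hypothetical stand-in for ModelClient that never receives a request, and the two check functions are plain coroutines rather than the project's actual test cases.

```python
import asyncio
import time


class StubClient:
    """Hypothetical stand-in for ModelClient; no request ever arrives."""

    async def pop_request(self, index, timeout=60.0):
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            if deadline is not None and time.monotonic() >= deadline:
                raise TimeoutError(
                    f"request {index} not available within {timeout} seconds"
                )
            await asyncio.sleep(0.01)


async def check_timeout_expires():
    """Return True if pop_request raises TimeoutError after a short timeout."""
    client = StubClient()
    try:
        await client.pop_request(index=1, timeout=0.05)
    except TimeoutError:
        return True
    return False


async def check_cancellation_propagates():
    """Return True if cancelling the task surfaces asyncio.CancelledError."""
    client = StubClient()
    task = asyncio.create_task(client.pop_request(index=1, timeout=None))
    await asyncio.sleep(0.02)  # let the task enter its polling loop
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return True
    return False
```

Each check can be run with asyncio.run(...); in a real suite they would more likely be written as async test functions that use the project's own fixtures.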

Documentation

  • Updated SDK documentation (English & Chinese)
  • Added new Model Service SDK section

Examples

  • examples/model_client_demo.py - Comprehensive demo showcasing:
    • Timeout configuration
    • Cancellation handling
    • Normal request retrieval

Usage Example

  import asyncio

  from rock.sdk.model.client import ModelClient

  async def main():
      client = ModelClient()

      try:
          # Wait with 30-second timeout
          await client.wait_for_first_request(timeout=30.0)

          # Pop request with default 60-second timeout
          request = await client.pop_request(index=1)
      except TimeoutError as e:
          print(f"Operation timed out: {e}")

  asyncio.run(main())

Breaking Changes

None. The timeout parameter has a sensible default value, making the change backward compatible.

Related

Timandes added a commit to Timandes/ROCK that referenced this pull request Mar 4, 2026
…ing loops

- Add DEFAULT_POLL_TIMEOUT constant (60 seconds)
- Add PollOptions interface for timeout and AbortSignal support
- Update popRequest() to accept timeout and cancellation via AbortSignal
- Update waitForFirstRequest() to accept timeout and cancellation
- Add comprehensive unit tests for timeout and cancellation scenarios

Fixes issue where while(true) loops could block forever without
timeout or cancellation mechanism. Aligns with Python SDK PR alibaba#559.
…aba#550)

Replace hardcoded DEFAULT_POLL_TIMEOUT constant with ROCK_MODEL_CLIENT_POLL_TIMEOUT
environment variable. Default is now None (infinite wait) which is appropriate for
waiting on Agent Actions.

Changes:
- Add ROCK_MODEL_CLIENT_POLL_TIMEOUT env var in rock/env_vars.py
- Remove DEFAULT_POLL_TIMEOUT = 60.0 constant from client.py
- pop_request/wait_for_first_request now read timeout from env var
- Add tests for environment variable functionality
…#550)

Per PR review feedback, use env_vars.ROCK_MODEL_CLIENT_POLL_TIMEOUT
directly as the default value for timeout parameter instead of
checking it inside the function body.
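
The last two commits describe reading the timeout from ROCK_MODEL_CLIENT_POLL_TIMEOUT and using it directly as the parameter default. The sketch below shows one way that pattern could look. It assumes the variable holds a number of seconds and that an unset or empty value means None (infinite wait); the helper read_poll_timeout and the function describe_wait are hypothetical, and the actual mechanism in rock/env_vars.py is not shown in this PR page.

```python
import os


def read_poll_timeout(name="ROCK_MODEL_CLIENT_POLL_TIMEOUT"):
    """Return the timeout in seconds, or None if the variable is unset or empty."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return None  # None means wait indefinitely
    return float(raw)


# Evaluated once, when this module is first imported.
POLL_TIMEOUT = read_poll_timeout()


def describe_wait(timeout=POLL_TIMEOUT):
    """Hypothetical function whose default comes from the environment."""
    if timeout is None:
        return "wait indefinitely"
    return f"wait up to {timeout} seconds"
```

One consequence of this design: a default argument is bound when the function is defined, so changing the environment variable afterwards does not affect the default. Callers who need a different value at runtime would pass timeout explicitly.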

Development

Successfully merging this pull request may close these issues.

[Bug] Python SDK: ModelClient polling loops lack timeout and cancellation support

2 participants