Member

@hhoikoo hhoikoo commented Nov 11, 2025

resolves #6720 (BA-3023)

This change introduces EtcdClientRegistry, a class that holds etcd clients for multiple agents. Each client carries its own key-prefix information at the scaling-group level and the individual-node level.
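
To make the per-agent prefix information concrete, here is a minimal sketch of how a scope-to-prefix mapping could be built for each agent. The `ConfigScope` enum, the `build_scope_prefixes` helper, and the prefix layout (`sgroup/<name>`, `nodes/agents/<id>`) are assumptions made for illustration only; they are not taken from this PR's diff.

```python
from enum import Enum


class ConfigScope(Enum):
    # Hypothetical scope names used only in this sketch.
    GLOBAL = "global"
    SGROUP = "sgroup"
    NODE = "node"


def build_scope_prefixes(scaling_group: str, agent_id: str) -> dict:
    """Return the key prefix for each scope (assumed layout)."""
    return {
        ConfigScope.GLOBAL: "",
        ConfigScope.SGROUP: f"sgroup/{scaling_group}",
        ConfigScope.NODE: f"nodes/agents/{agent_id}",
    }


# Two agents in one process get different scaling-group and node prefixes.
prefixes_a = build_scope_prefixes("default", "agent-a")
prefixes_b = build_scope_prefixes("gpu", "agent-b")
```

Both agents share the same (empty) global prefix, so shared configuration resolves to the same keys while node-level keys stay separate.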

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation
  • Documentation
    • Contents in the docs directory
    • docstrings in public interfaces and type annotations

@hhoikoo hhoikoo requested review from HyeockJinKim, achimnol and Copilot and removed request for achimnol November 11, 2025 02:34
@github-actions github-actions bot added the size:L (100~500 LoC) and comp:agent (Related to Agent component) labels Nov 11, 2025
Copilot finished reviewing on behalf of hhoikoo November 11, 2025 02:37
@hhoikoo hhoikoo force-pushed the feat/BA-3023/multi-agent-etcd branch from 376a46d to 242db6e Compare November 11, 2025 02:38
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces EtcdClientRegistry, a new class designed to manage etcd clients for multiple agents within a single process. This enables clean separation of etcd client instances, each with appropriate scope prefixes for accessing agent-specific, scaling-group, and global configurations.

Key changes:

  • Created EtcdClientRegistry class to centralize etcd client management with support for multiple agents
  • Refactored AgentRPCServer to use the registry instead of a single etcd client
  • Updated all etcd access patterns to use either global_etcd for shared configuration or agent-specific clients
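
The split between a shared global client and agent-specific clients described in the key changes could look roughly like the sketch below. `FakeEtcdClient`, `ClientRegistrySketch`, and the `register`/`get` method names are hypothetical stand-ins, not the interface added in src/ai/backend/agent/etcd.py; only the `global_etcd` attribute name comes from the review summary.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEtcdClient:
    """Stand-in for a real etcd client; it only records which scope it serves."""
    scope: str


@dataclass
class ClientRegistrySketch:
    """Hypothetical registry: one global client plus one client per agent."""
    global_etcd: FakeEtcdClient = field(
        default_factory=lambda: FakeEtcdClient("global")
    )
    _agent_clients: dict = field(default_factory=dict, init=False, repr=False)

    def register(self, agent_id: str) -> FakeEtcdClient:
        # Create the agent's client once and reuse it on later calls.
        if agent_id not in self._agent_clients:
            self._agent_clients[agent_id] = FakeEtcdClient(f"agent:{agent_id}")
        return self._agent_clients[agent_id]

    def get(self, agent_id: str) -> FakeEtcdClient:
        # Raises KeyError for an agent that was never registered.
        return self._agent_clients[agent_id]


registry = ClientRegistrySketch()
client_a = registry.register("agent-a")
client_b = registry.register("agent-b")
```

Callers that need shared configuration would read through `registry.global_etcd`, while per-agent code paths would look up their own client by agent ID.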

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/ai/backend/agent/etcd.py | New file introducing the EtcdClientRegistry class that manages etcd clients for multiple agents with proper scoping |
| src/ai/backend/agent/server.py | Refactored to use EtcdClientRegistry instead of a single AsyncEtcd instance; updated all etcd access to use appropriate client (global vs agent-specific) |
| tests/agent/test_agent.py | Updated test fixtures and mocks to use the new etcd_client_registry structure with global_etcd access pattern |
| changes/.feature.md | Added changelog entry documenting the new feature |


@hhoikoo hhoikoo force-pushed the feat/BA-2752/multiple-agents-config branch from 3318324 to f0e60cc Compare November 11, 2025 04:39
@hhoikoo hhoikoo force-pushed the feat/BA-3023/multi-agent-etcd branch from 242db6e to 272a4b4 Compare November 11, 2025 04:40
@hhoikoo hhoikoo force-pushed the feat/BA-2752/multiple-agents-config branch from f0e60cc to ab573ad Compare November 11, 2025 06:03
@hhoikoo hhoikoo force-pushed the feat/BA-3023/multi-agent-etcd branch 5 times, most recently from 1e4b560 to 5b9fb27 Compare November 11, 2025 08:04
@github-actions github-actions bot added size:XL (500~ LoC) and removed size:L (100~500 LoC) labels Nov 11, 2025
@hhoikoo hhoikoo force-pushed the feat/BA-2752/multiple-agents-config branch 2 times, most recently from 0c85e11 to 6ebc74f Compare November 11, 2025 09:13
@hhoikoo hhoikoo changed the base branch from feat/BA-2752/multiple-agents-config to feat/BA-3026/agent-runtime November 11, 2025 13:32
@hhoikoo hhoikoo force-pushed the feat/BA-3026/agent-runtime branch from 7b409f1 to daee31e Compare November 11, 2025 13:33
@hhoikoo hhoikoo force-pushed the feat/BA-3023/multi-agent-etcd branch from 5b9fb27 to fbe9d54 Compare November 11, 2025 13:34
@hhoikoo hhoikoo force-pushed the feat/BA-3026/agent-runtime branch from daee31e to b96cb0b Compare November 11, 2025 13:43
@hhoikoo hhoikoo force-pushed the feat/BA-3023/multi-agent-etcd branch from fbe9d54 to a7e29d5 Compare November 11, 2025 13:45
@hhoikoo hhoikoo force-pushed the feat/BA-3026/agent-runtime branch from b96cb0b to 14675fa Compare November 11, 2025 16:29
@hhoikoo hhoikoo force-pushed the feat/BA-3023/multi-agent-etcd branch from a7e29d5 to 0249c25 Compare November 11, 2025 16:32
@hhoikoo hhoikoo force-pushed the feat/BA-3026/agent-runtime branch from 14675fa to 184737e Compare November 12, 2025 04:18
@hhoikoo hhoikoo force-pushed the feat/BA-3023/multi-agent-etcd branch from 0249c25 to 9f6248c Compare November 12, 2025 04:21
@hhoikoo hhoikoo force-pushed the feat/BA-3026/agent-runtime branch from 184737e to f3e6537 Compare November 12, 2025 04:42
This change introduces AgentEtcdClientView, an adaptor class that keeps
the etcd config scope in sync with each agent's scaling group and
agent ID.
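
A rough sketch of the adaptor idea in this commit message: a per-agent view wraps one shared client and rewrites every key with that agent's scaling-group or node prefix. `InMemoryStore`, `AgentViewSketch`, the string scope names, and the prefix layout are all assumptions for illustration, not the actual AgentEtcdClientView code.

```python
from typing import Optional


class InMemoryStore:
    """Stand-in for a shared etcd client, backed by a plain dict."""

    def __init__(self) -> None:
        self.data: dict = {}

    def put(self, key: str, value: str) -> None:
        self.data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)


class AgentViewSketch:
    """Hypothetical adaptor that pins every call to one agent's prefixes."""

    def __init__(self, store: InMemoryStore, scaling_group: str, agent_id: str) -> None:
        self._store = store
        # Assumed prefix layout; the real key scheme may differ.
        self._prefixes = {
            "sgroup": f"sgroup/{scaling_group}",
            "node": f"nodes/agents/{agent_id}",
        }

    def put(self, scope: str, key: str, value: str) -> None:
        self._store.put(f"{self._prefixes[scope]}/{key}", value)

    def get(self, scope: str, key: str) -> Optional[str]:
        return self._store.get(f"{self._prefixes[scope]}/{key}")


shared = InMemoryStore()
view_a = AgentViewSketch(shared, "default", "agent-a")
view_b = AgentViewSketch(shared, "default", "agent-b")
view_a.put("node", "status", "running")
view_a.put("sgroup", "policy", "strict")
```

Because both views share a scaling group in this example, the `sgroup` write from `view_a` is visible through `view_b`, while the `node` write is not.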
@hhoikoo hhoikoo force-pushed the feat/BA-3023/multi-agent-etcd branch from 9f6248c to 33ecbc4 Compare November 12, 2025 04:44

Labels

comp:agent (Related to Agent component), size:XL (500~ LoC)

3 participants