get_architecture_overview_tool returns huge payload and can consume hundreds of thousands of tokens #476

@zer09

Description

Problem

get_architecture_overview_tool can return an extremely large payload for bigger repositories, causing very high context usage. In one observed call, the tool result appears to have loaded hundreds of thousands of tokens into the agent context.

Evidence

Local session evidence:

  • Session file: 2026-05-13T18-24-49-204Z_019e2295-9633-773c-bc26-ad0e642d3c1c.jsonl
  • Line: 37
  • JSONL line size: 4,598,692 bytes
  • Tool: code-review-graph.get_architecture_overview_tool
  • Result text size: 1,516,253 chars
  • Same large result appears duplicated in:
    • message.content[0].text
    • message.details.mcpResult.content[0].text
  • Structured result also includes the same large data in:
    • message.details.mcpResult.structuredContent
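
For anyone who wants to check their own session files, here is a minimal sketch of how the sizes above could be measured. It assumes each JSONL line is a single JSON object with the field paths listed above; the helper names `dig` and `measure_line` are made up for illustration and are not part of any tool.

```python
import json


def dig(obj, path):
    """Follow a list of keys/indexes; return None if any step is missing."""
    for key in path:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return None
    return obj


def measure_line(raw_line: str) -> dict:
    """Report the size of one session JSONL line and of its large fields."""
    record = json.loads(raw_line)
    content_text = dig(record, ["message", "content", 0, "text"])
    mcp_text = dig(
        record, ["message", "details", "mcpResult", "content", 0, "text"]
    )
    structured = dig(
        record, ["message", "details", "mcpResult", "structuredContent"]
    )
    return {
        "line_bytes": len(raw_line.encode("utf-8")),
        "content_text_chars": len(content_text)
        if isinstance(content_text, str) else 0,
        "mcp_text_chars": len(mcp_text) if isinstance(mcp_text, str) else 0,
        # Re-serialized size, so only an approximation of the on-disk size.
        "structured_chars": len(json.dumps(structured))
        if structured is not None else 0,
        # True when both text fields carry the identical payload.
        "text_duplicated": isinstance(content_text, str)
        and content_text == mcp_text,
    }
```

Calling `measure_line` on the 37th line of the session file should give numbers comparable to the ones reported here, provided the record layout matches the assumption.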

The largest contributor appears to be full members arrays for each community:

  • 21 communities
  • 84 cross-community edges
  • 3 warnings
  • 11,425 total community members
  • 1,452,537 chars just from member paths

Actual behavior

get_architecture_overview_tool returns full member lists for every community. On large repos, or repos where worktrees are included in the graph, this creates a massive response.

Example summary from the result:

{
  "summary": "Architecture: 21 communities, 84 cross-community edges, 3 warning(s)",
  "communityCount": 21,
  "edgeCount": 84,
  "warningCount": 3,
  "totalMembers": 11425,
  "totalMemberChars": 1452537
}
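
Counts like these could be derived from the structured result with something like the sketch below. The input key names (`communities`, `members`, `cross_community_edges`, `warnings`) are assumptions about the payload shape, not confirmed field names, and `summarize_overview` is a hypothetical helper.

```python
def summarize_overview(overview: dict) -> dict:
    """Reduce a full overview to counts only (input keys are assumed)."""
    communities = overview.get("communities", [])
    edges = overview.get("cross_community_edges", [])
    warnings = overview.get("warnings", [])
    # Flatten every community's member list to measure its total size.
    members = [m for c in communities for m in c.get("members", [])]
    return {
        "summary": (
            f"Architecture: {len(communities)} communities, "
            f"{len(edges)} cross-community edges, "
            f"{len(warnings)} warning(s)"
        ),
        "communityCount": len(communities),
        "edgeCount": len(edges),
        "warningCount": len(warnings),
        "totalMembers": len(members),
        "totalMemberChars": sum(len(str(m)) for m in members),
    }
```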

Expected behavior

The architecture overview should be compact by default and safe to call from an LLM agent.
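
One possible shape for a compact default, as a rough sketch only: keep each community's metadata, but replace the full member list with a count and a small sample. The key names (`communities`, `members`, `member_count`, `members_sample`, `members_truncated`) and the `sample_size` parameter are hypothetical and not taken from the tool's actual schema.

```python
def compact_overview(overview: dict, sample_size: int = 5) -> dict:
    """Return a copy of the overview with member lists cut down to a sample."""
    # Copy every top-level field except the communities, which are rebuilt.
    compact = {k: v for k, v in overview.items() if k != "communities"}
    compact["communities"] = []
    for community in overview.get("communities", []):
        members = community.get("members", [])
        # Keep all community metadata but drop the full member list.
        slim = {k: v for k, v in community.items() if k != "members"}
        slim["member_count"] = len(members)
        slim["members_sample"] = members[:sample_size]
        slim["members_truncated"] = len(members) > sample_size
        compact["communities"].append(slim)
    return compact
```

With a shape like this, the response size depends on the number of communities rather than the number of members, and callers that really need the full lists could request them explicitly per community.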
