SequentialAgent Architecture: Why Send Chat History When Using output_key + Template Variables? #3457
Replies: 5 comments 2 replies
-
Response from ADK Answering Agent (experimental, answer may be inaccurate)

TLDR: This behavior is intentional and reflects two core ADK concepts: shared state for direct data passing and event history for complete observability. For large data payloads you can, and should, optimize by configuring the agent not to include the full conversation history in the LLM prompt, which will reduce token usage.

Hi @prabha-git, that's an excellent observation and a great question about the design of ADK's SequentialAgent.

1. Is this duplication intentional?

Yes, the behavior you're seeing is intentional. It arises from two distinct and fundamental mechanisms in ADK that work together:

- Shared state: `output_key` writes an agent's result into session state, and template variables like `{summary}` inject it directly into the next agent's instruction.
- Event history: every agent's output is also appended to the session's event history, which surfaces in the next request as a `[agent_name] said: ...` context message.

2. What value does chat history provide?

The event history serves a different purpose than the state dictionary: it is the complete, auditable record of the conversation, giving the LLM (and you) observability into what each agent actually said, independent of whatever slices of state a given instruction templates in.

3. Are there scenarios where I should optimize this?

Absolutely, your instinct is correct. For large data payloads, sending the full history in addition to the templated data can be inefficient and lead to unnecessary token costs. ADK provides ways to control this: while the specifics depend on the agent implementation, you can generally control what is included in the prompt sent to the LLM, for example by configuring your agent to exclude the conversation history (see the sketch below).

By default, ADK leans towards providing full context (history) for robustness, but it gives you the option to optimize for token efficiency by customizing what the LLM sees.

I hope this clarifies the design rationale! It's a great question that gets to the heart of how ADK manages data and context.

[1] https://google.github.io/adk-docs/agents/workflow-agents/sequential-agents/
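For illustration, a minimal sketch assuming ADK's `LlmAgent` and its `include_contents` option (the model string is an illustrative choice):

```python
from google.adk.agents import LlmAgent

translator_agent = LlmAgent(
    name="translator_agent",
    model="gemini-2.0-flash",  # illustrative model choice
    instruction="Translate this summary: {summary}",  # data arrives via state
    include_contents="none",  # omit prior conversation history from the prompt
)
```

With `include_contents="none"`, the agent sees only its instruction (with `{summary}` already substituted) rather than the accumulated event history.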
-
I think you can modify the chat history by replacing the entire message with something like `message compacted`.
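A minimal sketch of that idea, assuming ADK's `before_model_callback` hook, which can rewrite `llm_request.contents` before the request is sent; the `For context:` prefix match mirrors the trace shown later in this thread:

```python
import re
from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmRequest, LlmResponse


def compact_agent_messages(
    callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
    """Shrink injected '[agent] said: ...' messages to a short placeholder."""
    for content in llm_request.contents:
        for part in content.parts or []:
            if part.text and part.text.startswith("For context:"):
                # Keep the speaker attribution, drop the bulky payload.
                part.text = re.sub(
                    r"(\[\w+\] said:).*",
                    r"\1 message compacted",
                    part.text,
                    flags=re.DOTALL,
                )
    return None  # None => send the (modified) request to the model
```

The callback would be attached via `before_model_callback=compact_agent_messages` on the downstream agent.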
-
Thanks for the suggestion, @zmy1116! Compressing agent messages is definitely an improvement over the current behavior. However, in sequential pipelines where template variables pass all necessary data, message compression doesn't fully eliminate the duplication concern:

```
# What the second agent receives:
System Instruction: "Translate this summary: {full_json_summary}"
Chat History: "[summarizer_agent] said: message compacted"
```

The full data is already in the instruction via the template variable. For maximum efficiency in sequential workflows, an option to skip agent messages entirely from chat history would be ideal:

```python
translator_agent = Agent(
    instruction="Translate {summary}",
    include_contents='none',
    skip_agent_messages=True,  # ← Feature request
)
```
-
Update: Root Cause Identified & Workaround Implemented

I've found the root cause and successfully implemented a workaround for this issue.

Root Cause

This behavior is tracked in Issue #2207, which was closed as "NOT_PLANNED": the ADK team considers the automatic injection of agent transition messages (`For context: [agent_name] said: ...`) intended behavior. This means there's no built-in configuration to disable it, but we can work around it using callbacks.

Solution
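A minimal sketch of the callback approach, assuming ADK's `before_model_callback` hook and matching on the `For context:` prefix visible in the trace below; agent names and the model string are illustrative:

```python
from typing import Optional

from google.adk.agents import LlmAgent
from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmRequest, LlmResponse


def drop_agent_transition_messages(
    callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
    """Remove the auto-injected 'For context: [agent] said: ...' messages."""
    llm_request.contents = [
        content
        for content in llm_request.contents
        if not any(
            part.text and part.text.startswith("For context:")
            for part in (content.parts or [])
        )
    ]
    return None  # proceed with the filtered request


translator_agent = LlmAgent(
    name="translator_agent",
    model="gemini-2.0-flash",  # illustrative model choice
    instruction="Translate this summary: {summary}",
    before_model_callback=drop_agent_transition_messages,
)
```

Because `{summary}` already delivers the data through session state, dropping these messages loses nothing the translator needs.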
-
I'm analyzing AgentOps traces from my ADK SequentialAgent pipeline and observed an interesting pattern in how data flows between agents. I'd like to understand the design rationale behind it.
Observed Behavior
When Agent 2 receives data from Agent 1 in a SequentialAgent pipeline, the data appears twice in the LLM request:
1. Template Substitution in System Instruction

The `{summary}` template variable is replaced with the actual data from `context.state['summary']`.

2. Chat History Context

The same data also appears in the conversation history as a user message:

```json
{
  "content": "For context:\n[summarizer_agent] said: {\n  \"title\": \"Phoenix Suns Win Historic...\",\n  ...full summary JSON...\n}",
  "role": "user"
}
```

Code Pattern
Questions

1. Is this duplication intentional? Does ADK deliberately include both mechanisms for a specific architectural reason?

2. What value does chat history provide beyond template-injected data? Since the data is already in the system instruction via `{summary}`, what additional benefit does seeing `[summarizer_agent] said: {...}` in chat history provide to the LLM?

3. Are there scenarios where I should optimize this? For large data payloads, this results in significant token duplication (~150-200 extra tokens per agent call). Should I disable chat history, or is keeping both mechanisms recommended?
Environment

- `output_key` pattern

Trace Evidence
Full trace available showing both mechanisms: [would attach traces.json excerpt]
Any insights into the design rationale would be greatly appreciated! I want to ensure I'm following best practices for multi-agent data flow.