Feat/tau2 adk #398

heiko-hotz · 2025-10-01T13:12:41Z

Description

This pull request introduces an evaluation harness that lets agents built with Google's Agent Development Kit (ADK) be evaluated with the Tau2 Bench framework.

This initial version includes the core components needed for end-to-end evaluation:

Main Evaluation Runner (run_evaluation.py):

Orchestrates the conversational flow between the Tau2 User Simulator and the ADK Agent.
Dynamically loads ADK agents from a specified file path.
Injects the task-specific Tau2 domain policy into the ADK agent's instructions at runtime.
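
As a rough illustration of the last two points, the sketch below loads an agent object from a file path and appends a policy to its instructions. It is not the code in run_evaluation.py: the helper names `load_agent` and `inject_policy`, the module-level `root_agent` attribute, the `instruction` field, and the `<policy>` wrapper are all assumptions made for this example.

```python
import importlib.util


def load_agent(path, attr="root_agent"):
    """Import the Python file at `path` and return its `attr` object.

    `root_agent` is an assumed attribute name for this sketch.
    """
    spec = importlib.util.spec_from_file_location("loaded_adk_agent", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, attr)


def inject_policy(agent, policy):
    """Append the domain policy to the agent's instruction text in place.

    Assumes the agent exposes a writable string attribute `instruction`.
    """
    agent.instruction = (
        f"{agent.instruction}\n\n<policy>\n{policy}\n</policy>"
    )
    return agent
```

Appending the policy at runtime, rather than hard-coding it in the agent file, keeps one agent definition reusable across tasks whose policies differ.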

Tool Mapping & Translation Layer (harness/tool_mapper.py):

Provides a simple, extensible system for mapping tool names and arguments from the ADK agent's perspective to the Tau2 environment's implementation.
It intercepts FunctionCall events from the ADK agent, translates them, and executes the real tool within the Tau2 environment.
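
A minimal sketch of how such a mapping layer could look is shown below. It is an assumption-laden illustration, not the contents of harness/tool_mapper.py: `ToolMapping`, `TOOL_MAP`, `translate_call`, `execute`, and the tool and argument names are all hypothetical, and the Tau2 environment is stood in for by a plain dict of callables.

```python
from dataclasses import dataclass, field


@dataclass
class ToolMapping:
    """How one agent-side tool maps onto an environment-side tool."""
    tau2_name: str
    # agent-side argument name -> environment-side argument name
    arg_map: dict = field(default_factory=dict)


# Hypothetical registry; supporting a new domain would mean adding entries.
TOOL_MAP = {
    "adk_find_flights": ToolMapping(
        tau2_name="search_direct_flight",
        arg_map={"from_airport": "origin", "to_airport": "destination"},
    ),
}


def translate_call(name, args, tool_map=TOOL_MAP):
    """Return the environment-side (name, args) for an agent-side call.

    Unmapped tools and unmapped argument names pass through unchanged.
    """
    mapping = tool_map.get(name)
    if mapping is None:
        return name, dict(args)
    translated = {mapping.arg_map.get(k, k): v for k, v in args.items()}
    return mapping.tau2_name, translated


def execute(name, args, env_tools, tool_map=TOOL_MAP):
    """Translate an intercepted call and run it against `env_tools`.

    `env_tools` is a dict of callables standing in for the environment.
    """
    tau2_name, tau2_args = translate_call(name, args, tool_map)
    return env_tools[tau2_name](**tau2_args)
```

Keeping the mapping as data rather than per-tool wrapper functions means a renamed tool or argument only needs a registry entry.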

Sample ADK Agent (sample_adk_agent/):

A fully functional example agent for the airline domain is included.
This serves as a clear template for how to structure an ADK agent to be compatible with this harness.

Comprehensive Documentation (README.md):

A detailed README.md explains the project's purpose and architecture and provides instructions for setup, usage, and extension to new domains.

mstyer-google

Generally looks great. Check your line lengths across the board. Most of my other comments are suggestions, not requirements; if you decide not to implement them, just ack the comment.

python/evaluation/tau2-adk-harness/harness/tool_mapper.py

python/evaluation/tau2-adk-harness/run_evaluation.py

heiko-hotz added 4 commits October 1, 2025 11:37

feat: Add tau2-bench as a submodule

14c7581

initial commit

fcc870d

updated README

cc48b94

updated requirements

d26ef0a

mstyer-google requested changes Oct 3, 2025

heiko-hotz added 2 commits October 6, 2025 11:33

width correction

1b89d7a

feedback incorporated

764f7b2
