Fix/aws realtime tooluse barge in #3704
Merged
theomonnom
merged 6 commits into
livekit:main
from
kachenjr:fix/aws-realtime-tooluse-barge-in
Oct 27, 2025
Conversation
This commit fixes three critical bugs in the Nova Sonic realtime plugin that prevent it from working correctly in production scenarios.

## Issues Fixed

### 1. Audio Routing After Tool Calls

**Problem**: Audio frames not playing after tool execution
**Root Cause**: Audio routed to wrong generation after tool calls complete
**Solution**: Create new generation per ASSISTANT SPECULATIVE text event
**Impact**: Audio now plays correctly for each assistant response

Nova Sonic sends ASSISTANT SPECULATIVE text events to signal new assistant turns, including after tool calls. Each turn needs its own generation to ensure audio frames route to the correct audio channel.

### 2. Tool Use Across Multiple Turns

**Problem**: Tool calls fail or behave incorrectly across multiple turns
**Root Cause**: Generation not closed after tool call, preventing framework from delivering tool results via update_chat_ctx()
**Solution**: Close generation immediately after emitting tool call
**Impact**: Tool use now works reliably across multiple conversation turns

The LiveKit framework expects the generation to close so it can call update_chat_ctx() with tool results. A new generation is created when Nova Sonic sends the next ASSISTANT SPECULATIVE event with the response.

### 3. Crashes on User Interruption (Barge-In)

**Problem**: Session crashes when user interrupts assistant mid-response
**Root Cause**: Race conditions with future initialization and None pointer access after barge-in sets _current_generation to None
**Solution**:
- Initialize futures as None, create lazily in initialize_streams()
- Add defensive None checks throughout event handlers

**Impact**: Interruptions handled gracefully without crashes

Creating futures in __init__ causes race conditions during session restart. Lazy initialization ensures the event loop exists before future creation.
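The barge-in fix above can be pictured with a minimal sketch. It is not the plugin's code: `SessionSketch`, `_is_ready_fut`, `on_barge_in`, and `handle_audio` are hypothetical names chosen for illustration, and only `initialize_streams()` and `_current_generation` come from the description. It assumes the pattern is roughly "no futures in `__init__`, create them once a loop is running, and guard every handler against a cleared generation".

```python
import asyncio
from typing import List, Optional


class SessionSketch:
    """Hypothetical stand-in for the realtime session (illustration only)."""

    def __init__(self) -> None:
        # No event loop is assumed to be running here, so no Future is created.
        self._is_ready_fut: Optional[asyncio.Future] = None
        # Barge-in resets this to None; handlers must tolerate that.
        self._current_generation: Optional[List[bytes]] = None

    async def initialize_streams(self) -> None:
        # Called from inside the event loop, so the future is bound to the
        # loop that is actually running (also true after a session restart).
        if self._is_ready_fut is None or self._is_ready_fut.done():
            self._is_ready_fut = asyncio.get_running_loop().create_future()

    def on_barge_in(self) -> None:
        # The user interrupted: drop the current generation.
        self._current_generation = None

    def handle_audio(self, chunk: bytes) -> bool:
        # Defensive None check: take a local reference, then test it.
        gen = self._current_generation
        if gen is None:
            return False  # late audio after an interruption is dropped
        gen.append(chunk)
        return True
```

In this sketch a late audio event after an interruption is ignored instead of raising `AttributeError` or `TypeError`, which is the behaviour the fix describes.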
## Additional Improvements

### Simplified Architecture

- **Message tracking**: Single content_id_map dict instead of 4 separate dicts (messages, user_messages, speculative_messages, tool_messages)
- **Restart tracking**: Per-turn _restart_attempts instead of session-level tracking for better barge-in metrics
- **Timestamps**: Float (time.time()) instead of ISO-8601 strings for easier duration calculations

The single dict approach is simpler, easier to debug, and sufficient for tracking content IDs and their types. Both approaches have identical memory characteristics (no leaks) since dicts live inside _ResponseGeneration instances that are created and destroyed per turn.

### Adopted from Origin

- Added ModelStreamErrorException to recoverable errors (from origin/main commit c674705, Oct 16, 2025)
- Removed child-safety line from DEFAULT_SYSTEM_PROMPT to match origin

## Testing

All features verified working:

- Audio playback ✓
- Tool use (multiple turns) ✓
- Barge-in/interruptions ✓
- Multi-turn conversations ✓

Tested against origin/main and confirmed tool use does not work without these fixes.

## Breaking Changes

None. Public API unchanged. Metrics format unchanged. Only internal implementation differs.

## AI Assistance

Portions of this code were developed with assistance from AI tools for debugging, testing, and implementation of the fixes described above.

---

Co-authored-by: Amazon Q Developer
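As a rough illustration of the single-dict tracking and float timestamps described under Simplified Architecture, the per-turn state could be sketched as follows. `ResponseGenerationSketch`, its methods, and the content-type strings are hypothetical; only the `content_id_map` idea, the per-turn restart counter, and the use of `time.time()` come from the description.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ResponseGenerationSketch:
    """Hypothetical per-turn state; created and discarded with each turn."""

    # One dict maps content ID -> content type, replacing four separate dicts.
    content_id_map: Dict[str, str] = field(default_factory=dict)
    # Restart attempts are counted per turn rather than per session.
    restart_attempts: int = 0
    # Float epoch seconds make durations a plain subtraction.
    created_at: float = field(default_factory=time.time)

    def track(self, content_id: str, content_type: str) -> None:
        self.content_id_map[content_id] = content_type

    def type_of(self, content_id: str) -> Optional[str]:
        # Returns None for unknown IDs instead of raising KeyError.
        return self.content_id_map.get(content_id)

    def age_seconds(self, now: Optional[float] = None) -> float:
        current = time.time() if now is None else now
        return current - self.created_at
```

Because the map lives on the per-turn object, it is released together with the generation when the turn ends, which matches the memory argument made above.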
Add safe_mode=True to jokeapi call to filter out inappropriate content.
- Add None check before accessing content_id_map in tool handler
- Add type cast for audio_bytes to satisfy mypy Buffer type requirement
Add type cast for tool task result to fix indexing error on line 1261
BumaldaOverTheWater94
approved these changes
Oct 24, 2025
theomonnom
approved these changes
Oct 27, 2025
Member
Thanks!
akshaym1shra
pushed a commit
to akshaym1shra/agents
that referenced
this pull request
Nov 3, 2025
Co-authored-by: Jarrett Kachenmeister <[email protected]>
akshaym1shra
pushed a commit
to akshaym1shra/agents
that referenced
this pull request
Nov 10, 2025
Co-authored-by: Jarrett Kachenmeister <[email protected]>
Fix Nova Sonic audio routing, tool use, and barge-in handling