What happened?
Summary
When streaming responses from AWS Bedrock models, LiteLLM generates a unique ID for each chunk instead of maintaining the same ID across all chunks in a stream. This violates the OpenAI streaming specification, which requires all chunks in a single streaming response to share the same `id` field.
Affected Components
- API: `/v1/chat/completions` with `stream=true`
- Provider: AWS Bedrock
Expected Behavior (OpenAI Specification)
All chunks in a streaming response must share the same `id` value:

```
{"id": "chatcmpl-123", "choices": [...], "delta": {"content": "Hello"}}
{"id": "chatcmpl-123", "choices": [...], "delta": {"content": " world"}}
{"id": "chatcmpl-123", "choices": [...], "delta": {"content": "!"}}
```

Actual Behavior
Each chunk receives a different `id` value:

```
{"id": "chatcmpl-abc", "choices": [...], "delta": {"content": "Hello"}}
{"id": "chatcmpl-def", "choices": [...], "delta": {"content": " world"}}
{"id": "chatcmpl-ghi", "choices": [...], "delta": {"content": "!"}}
```
Likely Root Cause
File: `llms/bedrock/chat/invoke_handler.py`
`AWSEventStreamDecoder.converse_chunk_parser()` creates a new `ModelResponseStream` without passing an `id` parameter, so `ModelResponseStream.__init__()` generates a fresh ID for every chunk.
Proposed Fix
- Add `self.response_id = str(uuid.uuid4())` to `AWSEventStreamDecoder.__init__()`
- Pass `id=self.response_id` when creating `ModelResponseStream` in `converse_chunk_parser()` (a sketch of this change follows below)
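A minimal, self-contained sketch of the proposed pattern. Only `AWSEventStreamDecoder`, `__init__()`, and `converse_chunk_parser()` are names from the actual code; the class body here is simplified for illustration and is not LiteLLM's implementation:

```python
import uuid


class AWSEventStreamDecoder:
    """Simplified sketch, not LiteLLM's actual class."""

    def __init__(self, model: str) -> None:
        self.model = model
        # Generate the ID once per stream so every chunk can reuse it,
        # as the OpenAI streaming spec requires.
        self.response_id = str(uuid.uuid4())

    def converse_chunk_parser(self, chunk_data: dict) -> dict:
        # Reuse the stream-level ID instead of letting each chunk
        # mint its own (the current behavior via ModelResponseStream.__init__()).
        return {
            "id": self.response_id,
            "object": "chat.completion.chunk",
            "choices": [{"delta": {"content": chunk_data.get("text", "")}}],
        }


decoder = AWSEventStreamDecoder(model="bedrock-model")
chunks = [decoder.converse_chunk_parser({"text": t}) for t in ["Hello", " world", "!"]]
assert len({c["id"] for c in chunks}) == 1  # all chunks now share one ID
```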
Relevant log output
Are you a ML Ops Team?
No
What LiteLLM version are you on ?
v1.79.1
Twitter / LinkedIn details
No response