Conversation Role Alternation Error with Gemma-3-12b-it Model: litellm.BadRequestError #3455
Replies: 5 comments
-
Got the same problem with vLLM + Gemma 3 12B/27B models.
-
As a temporary workaround, I used a Jinja chat template with the role-alternation check removed, and the error went away.
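For reference, this error string comes from a check in Gemma's chat template roughly like the sketch below (paraphrased, not the exact template). The workaround deletes the raise_exception branch and points vLLM at the edited template, e.g. via its --chat-template option:

{# Paraphrased sketch of the alternation check inside Gemma's chat template. #}
{# The temporary workaround removes this branch so mixed histories render. #}
{%- for message in messages -%}
    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
        {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
    {%- endif -%}
    {# ... normal turn rendering continues here ... #}
{%- endfor -%}

Note that removing the check only silences the error; fixing the history itself (as discussed below) is the cleaner solution.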
-
The newly released medgemma-27b-it and medgemma-27b-text-it have the same issue.
-
Response from ADK Answering Agent (experimental, answer may be inaccurate)

TLDR: This error occurs because the conversation history sent to the Gemma model has consecutive messages with the same role, which violates the required user/assistant alternation.

Hello, thank you for reporting this issue. This is a known challenge when integrating certain models that have strict requirements on conversation history structure, especially in a multi-agent setup.

The Root Cause

The Gemma chat template requires that conversation roles strictly alternate between user and assistant. In a multi-agent system, ADK is designed to handle this: when one agent (e.g., your Coordinator) transfers control to another, the intermediate events should be rewritten so the history still alternates. However, it seems that with your specific stack (ADK, vLLM, Gemma), this transformation is not happening correctly, leading to two consecutive messages with the same role being sent to the model, which then rejects the request.

Solution: Implement a Callback to Sanitize History

A robust way to fix this is to intercept the conversation history right before it's sent to the model and clean it up. You can do this by using a before_model_callback. Here is a Python function that merges consecutive messages and an example of how to apply it to your agent:

from google.adk.agents import LlmAgent, LlmRequest
from google.genai.types import Content, Part
def merge_consecutive_roles(request: LlmRequest) -> LlmRequest:
"""Merges consecutive messages with the same role in the conversation history."""
if not request.contents:
return request
merged_contents = [request.contents[0]]
for i in range(1, len(request.contents)):
current_content = request.contents[i]
last_merged_content = merged_contents[-1]
# If the role is the same as the previous one, merge them
if current_content.role == last_merged_content.role:
# Combine the parts of the two messages
new_parts = list(last_merged_content.parts) + list(current_content.parts)
# Create a new Content object with the combined parts
merged_contents[-1] = Content(role=current_content.role, parts=new_parts)
else:
merged_contents.append(current_content)
request.contents = merged_contents
return request
# Now, apply this callback to your root agent
root_agent = LlmAgent(
name="Coordinator",
model=model,
instruction="You are an assistant. Delegate booking tasks to Booker and info requests to Info.",
description="Main coordinator.",
sub_agents=[booking_agent, info_agent],
before_model_callback=merge_consecutive_roles
)

By adding the before_model_callback, the conversation history is sanitized on every request before it reaches the model. Please try this solution and let us know if it resolves your issue.
-
Thanks @adk-bot, this is exactly what I needed. A few mistakes in the types etc., but you got me there much quicker for sure. Here's the snippet that worked great for me in a multi-agent ADK example with Gemma 3 4B it via LiteLLM:

from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmRequest, LlmResponse
from google.genai.types import Content

def merge_consecutive_roles(
callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
"""Merges consecutive messages with the same role in the conversation history."""
if not llm_request.contents:
return None
merged_contents = [llm_request.contents[0]]
for i in range(1, len(llm_request.contents)):
current_content = llm_request.contents[i]
last_merged_content = merged_contents[-1]
if current_content.role == last_merged_content.role:
new_parts = list(last_merged_content.parts) + list(current_content.parts)
merged_contents[-1] = Content(role=current_content.role, parts=new_parts)
else:
merged_contents.append(current_content)
llm_request.contents = merged_contents
    return None
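For completeness, here is a sketch of wiring the callback into an agent served through LiteLLM + vLLM. The LiteLlm wrapper usage, model id, and api_base are assumptions for a typical setup; substitute your own deployment details:

# Sketch only: attach the callback to an agent behind LiteLLM + vLLM.
# The model id and api_base below are placeholders, not tested values.
from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

root_agent = LlmAgent(
    name="Coordinator",
    model=LiteLlm(
        model="hosted_vllm/google/gemma-3-4b-it",  # placeholder model id
        api_base="http://localhost:8000/v1",       # placeholder vLLM endpoint
    ),
    instruction="You are an assistant. Delegate booking tasks to Booker and info requests to Info.",
    sub_agents=[booking_agent, info_agent],  # defined elsewhere
    before_model_callback=merge_consecutive_roles,
)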
-
Issue: Conversation Role Alternation Error with Gemma-3-12b-it Model
Description
When using the Gemma-3-12b-it model through hosted vLLM, the system encounters a conversation role alternation error. The error occurs during agent interactions, specifically when the conversation flow attempts to transfer between agents.
Error Message
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.
2025-06-13 17:06:35,578 - ERROR - fast_api.py:817 - Error in event_generator: litellm.BadRequestError: Hosted_vllmException - Conversation roles must alternate user/assistant/user/assistant/...
Traceback (most recent call last):
File "/root/anaconda3/lib/python3.12/site-packages/litellm/llms/openai/openai.py", line 801, in acompletion
headers, response = await self.make_openai_chat_completion_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/lib/python3.12/site-packages/litellm/litellm_core_utils/logging_utils.py", line 135, in async_wrapper
result = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/lib/python3.12/site-packages/litellm/llms/openai/openai.py", line 436, in make_openai_chat_completion_request
raise e
File "/root/anaconda3/lib/python3.12/site-packages/litellm/llms/openai/openai.py", line 418, in make_openai_chat_completion_request
await openai_aclient.chat.completions.with_raw_response.create(
File "/root/anaconda3/lib/python3.12/site-packages/openai/_legacy_response.py", line 381, in wrapped
return cast(LegacyAPIResponse[R], await func(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 2028, in create
return await self._post(
^^^^^^^^^^^^^^^^^
File "/root/anaconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1742, in post
return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1549, in request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': ' Conversation roles must alternate user/assistant/user/assistant/...', 'type': 'BadRequestError', 'param': None, 'code': 400}
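As the banner at the top of the log suggests, turning on LiteLLM's debug mode before reproducing the request shows the exact message list sent to vLLM, which makes the duplicated role visible:

import litellm

# Enable LiteLLM's verbose debug logging (named in the log output above)
# before re-running the failing request to inspect the outgoing messages.
litellm._turn_on_debug()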
Environment
Steps to Reproduce
Code Context
The error occurs in a system with the following agent structure:
Expected Behavior
The conversation should flow smoothly between agents, maintaining proper role alternation in the conversation history.
Actual Behavior
The system fails with a role alternation error, suggesting that the conversation history is not properly maintaining the user/assistant alternation pattern required by the model.
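For illustration, a hypothetical OpenAI-style history after a failed hand-off looks like the following; the two consecutive assistant entries are what Gemma's template rejects with the 400 above:

# Hypothetical history after an agent-to-agent transfer (contents invented).
# Two consecutive "assistant" messages violate the alternation requirement.
messages = [
    {"role": "user", "content": "Book a flight to Tokyo."},
    {"role": "assistant", "content": "Transferring you to the booking agent."},
    {"role": "assistant", "content": "Happy to help with that booking."},  # rejected
]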
Impact
This issue prevents proper agent-to-agent communication and breaks the conversation flow in multi-agent systems using the Gemma-3-12b-it model.