
Changing response model results in Anthropic cache miss #1349

Closed
@ameade

Description

  - [x] This is actually a bug report.
  - [ ] I am not getting good LLM results.
  - [ ] I have tried asking for help in the community on Discord or in discussions and have not received a response.
  - [ ] I have tried searching the documentation and have not found an answer.

What Model are you using?

  - [ ] gpt-3.5-turbo
  - [ ] gpt-4-turbo
  - [ ] gpt-4
  - [x] Other (please specify): Sonnet 3.5

Describe the bug
It looks like Anthropic prompt caching always results in a cache miss when changing between response models.

Examples from my logs:

Different response models:

```
cache_write  cache_read  input_tokens  output_tokens
9246         0           2229          961
0            9246        2281          763
0            9246        2248          851
0            9246        1414          772
-- change response model, different prompt --
9046         0           2482          1235
-- change response model, different prompt --
8274         0           1087          477
```

Same response model:

```
cache_write  cache_read  input_tokens  output_tokens
9295         0           2233          1152
0            9295        2285          1027
0            9295        2252          935
0            9295        1418          1008
-- same response model, different prompt --
0            9083        2642          1652
-- same response model, different prompt --
0            9083        1482          1131
```
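The pattern in the logs above can be checked mechanically. A small sketch (the pairs are copied from the logs; the helper name is mine): after a response-model change, `cache_read` drops to 0 and the prefix is re-written, while with the same response model even a changed prompt still reads ~9k tokens from cache.

```python
# (cache_write, cache_read) pairs copied from the logs above.
different_model = [(9246, 0), (0, 9246), (0, 9246), (0, 9246),
                   (9046, 0),   # response model changed -> cache re-written
                   (8274, 0)]   # response model changed again -> re-written again
same_model = [(9295, 0), (0, 9295), (0, 9295), (0, 9295),
              (0, 9083),   # prompt changed, same response model -> still a hit
              (0, 9083)]

def cache_hits(rows):
    """Count requests that read a previously cached prefix."""
    return sum(1 for _, read in rows if read > 0)

print(cache_hits(different_model))  # 3 -- hits stop at each model change
print(cache_hits(same_model))       # 5 -- hits continue across prompt changes
```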

To Reproduce
Steps to reproduce the behavior, including code snippets of the model, the input data, and the API response.

  1. Make two requests with different response models that share the same first part of their prompts:

     ```python
     messages = [
         {
             "role": "system",
             "content": self.base_system_prompt.format(language=transcript_language),
         },
         {
             "role": "user",
             "content": [
                 {
                     "type": "text",
                     "text": f"""<transcript language="{transcript_language}"> {transcript_text} </transcript>""",
                     "cache_control": {"type": "ephemeral"},
                 },
     # ...
     ```

  2. Make the requests and notice that the cache always misses.
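A plausible mechanism (my assumption, not confirmed in this report): the library serializes the response model's JSON schema into the request, e.g. as a tool definition, and since Anthropic's cache prefix covers tools before system and messages, two different response models yield two different prefixes. A minimal sketch with two hypothetical pydantic models:

```python
from pydantic import BaseModel

# Hypothetical response models -- any two models with different fields will do.
class Summary(BaseModel):
    title: str
    body: str

class ActionItems(BaseModel):
    items: list[str]

# The JSON schemas differ, so any request payload that embeds the schema
# (tool definition, system text, etc.) differs too -- defeating a cache
# keyed on an exact prefix match.
schema_a = Summary.model_json_schema()
schema_b = ActionItems.model_json_schema()
assert schema_a != schema_b
```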

Expected behavior
I expect the first part of the messages sent to Anthropic not to change based on the response model, so the cached prefix should still be hit.
