ValueError: Invalid `cache_implementation` (offloaded). #34718

leigao97 · 2024-11-13T17:47:39Z

System Info

transformers version: 4.46.2
Platform: Linux-5.15.0-1049-oracle-x86_64-with-glibc2.35
Python version: 3.10.14
Huggingface_hub version: 0.25.1
Safetensors version: 0.4.5
Accelerate version: 1.0.1
Accelerate config: not found
PyTorch version (GPU?): 2.4.1+cu121 (True)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using distributed or parallel set-up in script?:
Using GPU in script?:
GPU type: NVIDIA A100-SXM4-40GB

Who can help?

No response

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

I am following the official example for enabling KV cache offloading: https://huggingface.co/docs/transformers/en/kv_cache#offloaded-cache

Calling model.generate() with cache_implementation="offloaded" raises the following error:

  File "/transformers/generation/configuration_utils.py", line 726, in validate
    raise ValueError(
ValueError: Invalid `cache_implementation` (offloaded). Choose one of: ['static', 'offloaded_static', 'sliding_window', 'hybrid', 'mamba', 'quantized', 'static']
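
To illustrate why the call is rejected, here is a minimal sketch of a membership check that produces the same message. It is only a stand-in written for this report, not the transformers source: the function name `check_cache_implementation` and the constant `ALLOWED_CACHE_IMPLEMENTATIONS` are hypothetical, and the allowed names are copied from the error message above (including the repeated 'static').

```python
# Hypothetical stand-in for the validation that raises the error above.
# This is NOT the transformers implementation; the allowed names are copied
# verbatim from the error message, duplicate 'static' included.
ALLOWED_CACHE_IMPLEMENTATIONS = [
    "static",
    "offloaded_static",
    "sliding_window",
    "hybrid",
    "mamba",
    "quantized",
    "static",
]


def check_cache_implementation(name):
    """Return `name` unchanged, or raise ValueError if it is not an allowed cache name."""
    if name is not None and name not in ALLOWED_CACHE_IMPLEMENTATIONS:
        raise ValueError(
            f"Invalid `cache_implementation` ({name}). "
            f"Choose one of: {ALLOWED_CACHE_IMPLEMENTATIONS}"
        )
    return name


# "offloaded" is missing from the allowed list, so the check fails,
# while "offloaded_static" passes.
try:
    check_cache_implementation("offloaded")
except ValueError as err:
    message = str(err)
    print(message)
```

Under this reading, the documented name "offloaded" is simply absent from the list of accepted values in 4.46.2, even though "offloaded_static" is present.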

Expected behavior

I expected cache_implementation="offloaded" to be a valid option for model.generate(), as the documentation describes. With KV cache offloading enabled, peak GPU memory usage should go down and inference time should go up.

leigao97 added the bug label Nov 13, 2024
