KV cache disk offload #13346
Unanswered
ha-seungwon asked this question in Q&A
Replies: 0 comments
I'm trying to run llama.cpp on a small machine, but the KV cache is too large. Instead of pre-allocating the KV cache in memory as a buffer, is there a way to offload it to disk?