Reusing the KV cache across multiple users #13453
Unanswered
DOGEUNNKIM
asked this question in
Q&A
Replies: 0 comments
If the KV cache cannot be shared between slots, does that mean it also cannot be shared across multiple users? In other words, if several users submit the same long document as their input prompt, will the KV prefill be run separately for each user? Does ollama have any suggestions for how to solve this?