Increase prefix cache memory space #15410
M0rpheus-0 asked this question in Q&A
-
I have the same use case. @M0rpheus-0 They recently moved discussions to a forum; I don't know why. So you won't get an answer here.
-
Hello!
Is there a way to increase the available memory vLLM uses to store prefixes of past prompts, so that it can cache more prompts?
Moreover, could we allocate some of the CPU RAM for this?
I am working with 2×H100 and have ample RAM to spare.
If not, can I at least force a prompt to stay cached?
Thank you!