Hi guys, are there other mechanisms implemented in vLLM to save GPU memory besides PagedAttention? Thank you.

Replies: 1 comment

- Another technique is continuous batching, which reduces padding memory and computation. You can read this blog to learn more; the sketch below illustrates the core idea.
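To make the padding point concrete, here is a minimal toy sketch (not vLLM's actual scheduler; all names and numbers are illustrative) comparing static batching, which pads every batch to its longest sequence, with continuous batching, which refills a freed slot as soon as a sequence finishes:

```python
# Toy comparison of static vs. continuous batching (illustrative only;
# this is not vLLM's scheduler, just the core idea).
from collections import deque

def static_batch_steps(lengths, batch_size):
    """Static batching: each batch runs until its longest sequence
    finishes, so shorter sequences sit padded and waste compute."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])  # padded to the longest
    return steps

def continuous_batch_steps(lengths, batch_size):
    """Continuous batching: when a sequence finishes, its slot is
    immediately refilled from the waiting queue, so no decode step
    is spent on padding."""
    queue = deque(lengths)
    running = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
    steps = 0
    while running:
        steps += 1
        # One decode step: decrement remaining tokens, evict finished sequences.
        running = [r - 1 for r in running if r > 1]
        # Refill freed slots from the waiting queue.
        while queue and len(running) < batch_size:
            running.append(queue.popleft())
    return steps

lengths = [3, 50, 4, 50, 5, 50, 6, 50]  # mixed short and long requests
print("static:    ", static_batch_steps(lengths, batch_size=4))
print("continuous:", continuous_batch_steps(lengths, batch_size=4))
```

With the mixed request lengths above, static batching spends 100 steps (every batch runs as long as its longest member), while continuous batching finishes in roughly half that because short requests free their slots early. In vLLM itself you don't enable this manually; the engine's scheduler applies continuous batching to incoming requests automatically.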