Hacker News

Does the KV cache really grow to use more memory than the model weights? The reduction in overall RAM relies on the KV cache being a substantial proportion of memory usage, but with very large models I can't see how that holds true.


For long context, yes, this is at least plausible. And the latest models are reaching context lengths of 1M tokens or perhaps more.
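A back-of-envelope calculation supports this. The KV cache stores one key and one value vector per layer, per KV head, per token, so its size grows linearly with context length while the weights stay fixed. The sketch below assumes Llama-2-70B-like hyperparameters (80 layers, 8 KV heads via grouped-query attention, head dimension 128, fp16); these numbers are illustrative assumptions, not a specific deployment:

```python
def kv_cache_bytes(n_tokens, n_layers=80, n_kv_heads=8,
                   head_dim=128, bytes_per_elem=2):
    """KV cache size in bytes: 2x for keys and values,
    one vector per layer per KV head per token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens

per_token = kv_cache_bytes(1)            # 327,680 bytes, i.e. ~320 KiB/token
at_1m_ctx = kv_cache_bytes(1_000_000)    # ~328 GB at a 1M-token context

weights_bytes = 70e9 * 2                 # ~140 GB of fp16 weights
crossover = weights_bytes / per_token    # ~427k tokens: cache exceeds weights

print(per_token, at_1m_ctx / 1e9, int(crossover))
```

Under these assumptions the cache overtakes the 140 GB of weights at roughly 430K tokens, so at a 1M-token context it would be more than twice the size of the model itself. Grouped-query attention already shrinks the cache 8x here versus full multi-head attention, which is why KV-cache compression remains an active target.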



