KV Cache Calculator
Interactive calculator that estimates KV cache size and total VRAM for serving an LLM. Pick a model preset (or enter a custom config), batch size, sequence length, KV cache precision, and GPU class, then see whether the workload fits before you hit an out-of-memory (OOM) error.
Open tool →
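For a rough idea of the arithmetic involved, here is a minimal sketch of the standard KV cache estimate: two tensors (K and V) per layer, each of size KV heads × head dimension × sequence length × batch size × bytes per element. This is not the tool's actual code. The names `ModelConfig`, `kv_cache_bytes`, `total_vram_bytes`, and `fits` are hypothetical, the 10% overhead factor is an assumed placeholder for activations and runtime buffers, and the example config is illustrative rather than a real preset.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Hypothetical model description; field names are illustrative."""
    num_layers: int
    num_kv_heads: int   # fewer than attention heads when the model uses GQA/MQA
    head_dim: int
    num_params: int     # total parameter count


def kv_cache_bytes(cfg: ModelConfig, batch_size: int, seq_len: int,
                   kv_bytes_per_elem: int) -> int:
    """KV cache size in bytes: K and V (factor 2) for every layer and token."""
    return (2 * cfg.num_layers * cfg.num_kv_heads * cfg.head_dim
            * seq_len * batch_size * kv_bytes_per_elem)


def total_vram_bytes(cfg: ModelConfig, batch_size: int, seq_len: int,
                     kv_bytes_per_elem: int, weight_bytes_per_param: int = 2,
                     overhead_fraction: float = 0.10) -> float:
    """Weights plus KV cache, padded by an assumed overhead fraction."""
    weights = cfg.num_params * weight_bytes_per_param
    kv = kv_cache_bytes(cfg, batch_size, seq_len, kv_bytes_per_elem)
    return (weights + kv) * (1 + overhead_fraction)


def fits(total_bytes: float, gpu_gib: int) -> bool:
    """True if the estimate is within the GPU's memory capacity in GiB."""
    return total_bytes <= gpu_gib * 1024**3


# Illustrative 8B-parameter config with grouped-query attention (assumed values).
cfg = ModelConfig(num_layers=32, num_kv_heads=8, head_dim=128,
                  num_params=8_000_000_000)

kv = kv_cache_bytes(cfg, batch_size=8, seq_len=4096, kv_bytes_per_elem=2)
total = total_vram_bytes(cfg, batch_size=8, seq_len=4096, kv_bytes_per_elem=2)

print(f"KV cache: {kv / 1024**3:.2f} GiB")
print(f"Total estimate: {total / 1024**3:.2f} GiB")
print("Fits on 24 GiB GPU:", fits(total, 24))
print("Fits on 16 GiB GPU:", fits(total, 16))
```

With these assumed numbers, the 16-bit KV cache for batch size 8 at 4096 tokens comes to exactly 4 GiB, and the padded total lands between 16 GiB and 24 GiB. Halving `kv_bytes_per_elem` to 1 (an 8-bit cache) halves the KV cache term, which is the kind of trade-off the calculator lets you explore.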