TurboQuant: Google's 6x KV Cache Compression Hits 3-Bit With Zero Loss
Google's ICLR 2026 algorithm compresses the KV cache 6x at just 3 bits per element — no training, no calibration, near-zero accuracy loss, and an 8x attention speedup on H100s.
By Aisha Patel · 5 min · May 11, 2026