turboquant-wasm

Shrink embedding vectors 6x. Search them without decompressing.

Float32 embeddings don't compress well: gzip shrinks them by only 7% (7.3 MB → 6.8 MB). TurboQuant (Google Research, ICLR 2026) reduces them 6x (7.3 MB → 1.2 MB) and lets you search the compressed data directly. No training step and no codebook, just encode(), decode(), and dot(). This npm package runs it in the browser via WASM with relaxed SIMD.
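To make the encode() / decode() / dot() shape concrete, here is a minimal TypeScript sketch. It uses a plain int8 scalar quantizer as a stand-in, which is an assumption for illustration only: it is not the TurboQuant algorithm, it reaches roughly 4x rather than 6x, and the `Encoded` type and the function signatures are hypothetical, not this package's API. What it does show is the key property: the dot product reads the compressed codes in place and never rebuilds a float32 vector.

```typescript
// Illustrative stand-in quantizer. NOT the TurboQuant algorithm and NOT the package's API.
// Each vector is stored as int8 codes plus one scale factor (about 4x smaller than float32).

interface Encoded {
  codes: Int8Array; // one signed byte per dimension
  scale: number;    // code * scale approximates the original value
}

function encode(v: Float32Array): Encoded {
  // Find the largest magnitude so every value maps into [-127, 127].
  let maxAbs = 0;
  for (let i = 0; i < v.length; i++) maxAbs = Math.max(maxAbs, Math.abs(v[i]));
  const scale = maxAbs > 0 ? maxAbs / 127 : 1;
  const codes = new Int8Array(v.length);
  for (let i = 0; i < v.length; i++) codes[i] = Math.round(v[i] / scale);
  return { codes, scale };
}

function decode(e: Encoded): Float32Array {
  const out = new Float32Array(e.codes.length);
  for (let i = 0; i < out.length; i++) out[i] = e.codes[i] * e.scale;
  return out;
}

// Dot product of a float32 query against an encoded vector.
// The int8 codes are read directly; no decoded Float32Array is allocated.
function dot(query: Float32Array, e: Encoded): number {
  let acc = 0;
  for (let i = 0; i < query.length; i++) acc += query[i] * e.codes[i];
  return acc * e.scale;
}

// Usage: score a query against a stored vector without decoding it.
const stored = encode(new Float32Array([0.5, -1, 0.25, 0]));
const queryVec = new Float32Array([1, 2, 3, 4]);
console.log(dot(queryVec, stored));
```

The scale is applied once after the loop rather than per element, so the inner loop touches only the compact codes.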

Type a query to search 5K Wikipedia passages instantly, entirely in your browser. The left column uses compressed vectors (1.2 MB); the right uses uncompressed vectors (7.7 MB). Same results, 6x less data to download.

Primary use case
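The passage search above amounts to scoring every stored vector against the query and keeping the best few. A minimal sketch of that ranking step follows, assuming a brute-force scan (reasonable at 5K items). The `topK` helper and its signature are hypothetical and not part of this package; the `score` callback is where a compressed-domain dot product would plug in. Plain number arrays are used here only to keep the example self-contained.

```typescript
// Hypothetical helper: rank items by a caller-supplied score and keep the best k.
// Nothing here is part of the package's API.
interface Hit {
  index: number; // position of the item in the input array
  score: number; // similarity score, higher is better
}

function topK<T>(items: readonly T[], score: (item: T) => number, k: number): Hit[] {
  // Score every item once, remembering where it came from.
  const hits: Hit[] = items.map((item, index) => ({ index, score: score(item) }));
  // Sort descending by score, then keep the first k.
  hits.sort((a, b) => b.score - a.score);
  return hits.slice(0, k);
}

// Usage with uncompressed toy vectors; a compressed index would pass its own scorer.
const passages: number[][] = [
  [1, 0],
  [0, 1],
  [0.7, 0.7],
];
const queryEmbedding = [1, 0];
const plainDot = (a: number[], b: number[]): number =>
  a.reduce((sum, x, i) => sum + x * b[i], 0);

const results = topK(passages, (p) => plainDot(queryEmbedding, p), 2);
console.log(results.map((h) => h.index));
```

A full sort is the simplest choice at this scale; a bounded heap would avoid sorting everything if the collection grew much larger.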

Image Similarity

Click any photo to find visually similar ones. 1K Unsplash images are indexed with compressed embeddings (0.2 MB instead of 1.5 MB), and similarity is computed directly on the compressed data.

Primary use case

3D Gaussian Splatting

Side-by-side 3D LEGO scene: original (57 MB) vs. compressed (24 MB). The spherical harmonic coefficients are compressed; the geometry stays intact.

Proof of concept
