Mobrief

Research · Reddit MachineLearning

What if your HNSW index stored 3-bit embeddings instead of float32? [R]

The author has been experimenting with an approach to vector indexing where the HNSW graph nodes store quantized embeddings (~388 bytes each at dim=1024) instead of float32 vectors (~4,096 bytes).

Apr 11, 2026 00:35 UTC · ~4 min read · Research


Context

The key insight: if you quantize embeddings using Lloyd-Max scalar quantization after a random orthogonal rotation (the PolarQuant approach from Zandieh et al., ICLR 2026), you can precompute a centroid-centroid inner product table (8×8 = 64 floats for 3-bit codes). During graph traversal, distance computations then become table lookups accumulated over this tiny table, rather than float32 dot products against full-precision vectors.
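A minimal sketch of the idea in NumPy. All names here (`lloyd_max_1d`, `encode`, `approx_ip`) are hypothetical, not from the post or the PolarQuant paper; this illustrates the table-lookup mechanism, not the author's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 1024
K = 8  # 3-bit codes -> 8 scalar centroids per coordinate

# Random orthogonal rotation (QR of a Gaussian matrix). After rotating,
# coordinates are roughly identically distributed, so one shared set of
# 8 centroids can serve every dimension.
Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))

def lloyd_max_1d(samples, k=K, iters=30):
    """1-D k-means, i.e. a Lloyd-Max scalar quantizer for the empirical
    distribution of rotated coordinates."""
    centroids = np.quantile(samples, (np.arange(k) + 0.5) / k)  # quantile init
    for _ in range(iters):
        codes = np.abs(samples[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            sel = samples[codes == j]
            if sel.size:
                centroids[j] = sel.mean()
    return np.sort(centroids)

# Fit centroids on rotated coordinates of some training vectors.
train = rng.standard_normal((64, dim)) @ Q
centroids = lloyd_max_1d(train.ravel())

# The precomputed centroid-centroid inner product table: 8x8 = 64 floats.
table = np.outer(centroids, centroids)

def encode(v):
    """Rotate, then map each coordinate to its nearest centroid index (0..7)."""
    r = v @ Q
    return np.abs(r[:, None] - centroids[None, :]).argmin(axis=1).astype(np.uint8)

def approx_ip(code_a, code_b):
    """Approximate inner product: one gather into the 8x8 table per dim,
    then a sum. No float32 vectors are touched at query time."""
    return table[code_a, code_b].sum()

a, b = rng.standard_normal(dim), rng.standard_normal(dim)
print(approx_ip(encode(a), encode(b)), float(a @ b))  # approximately equal
```

Since `Q` is orthogonal, the rotation preserves inner products exactly; the only error comes from the 3-bit scalar quantization, which is why the approximate and true inner products land close together.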

