Back

DiskANN Explorer

DiskANN keeps a tiny PQ-compressed copy of every vector in RAM and the full vectors + graph on SSD. Search walks the graph with cheap RAM compares, then re-ranks the top candidates with a handful of expensive disk reads.

Why DiskANN exists: the memory wall

RAMPQ codes (32 B/vec)3.2 GB
DISKFull vectors + graph (3072 B/vec)307.2 GB
100M vectors × 768 dimDISK / RAM ≈ 96×

A single commodity SSD holds the disk side easily. The RAM side fits in any laptop. Without the PQ trick, you'd need a server with hundreds of GB of RAM to even load the dataset.

R5
Graph degree
L8
Candidate pool
re-rank5
Disk reads
Speed420ms
Per step
PQ compare (RAM) in candidate pool disk re-rank top-K result
Current Step
30 / 30
Search complete. Top-3: [1, 28, 27].

Cost so far

RAM PQ compares2020 µs
SSD reads5500 µs
Total latency0.52 ms
Modeled with PQ ≈ 1 µs, SSD ≈ 100 µs. Real numbers depend on hardware.

Search Stats

Visited
20 / 36
Re-ranked
5
Top-3
[1, 28, 27]

How DiskANN works

  1. Build the Vamana graph: a single-layer kNN graph (degree R) optimized for sequential disk layout. No hierarchy — just one big graph.
  2. Quantize and split storage: every vector gets a tiny PQ code (e.g. 32 bytes). PQ codes live in RAM. Full vectors + graph edges live on SSD.
  3. Beam-walk with PQ distances: from a fixed entry point (the medoid), keep a candidate pool of size L. Pop the best, expand its neighbors, estimate distances using only the in-RAM PQ codes. No disk I/O during the walk.
  4. Disk re-rank the finalists: the candidate pool's top entries had their distances approximated by PQ. Read their full vectors from disk and recompute exact distances to fix the ranking.
  5. Return top-K. Total disk I/O = a small number of random reads, usually fewer than 20.

The trade

  • Why it scales: RAM only holds compressed codes. A 1B-vector dataset with 32-byte PQ needs ~32 GB RAM — fits on a single workstation.
  • Why it's slower than HNSW: SSD random reads are ~100× slower than RAM. Even with only ~10 reads, query latency is dominated by disk.
  • Why it still beats IVF_PQ for recall: the re-rank step recovers quantization errors that pure PQ-based methods can't fix.

The mental model: PQ is the cheap "is this neighborhood promising?" test. Disk reads are the expensive "let me actually verify" check. DiskANN gets most of its work done with the cheap test.