Back

HNSW Explorer: Hierarchical Navigable Small World, Visualized

Watch a search travel through the layered graph — from the sparse top layer (highways) down to the dense bottom layer (local roads). Drag the query point on Layer 0 and tune M / ef to see how the graph changes.

M4
Max neighbors per node
ef8
Layer-0 candidate pool
Speed450ms
Per step
Current Step
14 / 14
Search complete. Top-3: [3, 10, 4].

Search Stats

Visited
17 / 30
Layers
4
Entry
node 0
Top-3
[3, 10, 4]

Layer Distribution

L3
2
L2
2
L1
7
L0
30

How HNSW works

  1. Layered graph: each node is randomly assigned a max layer using an exponential distribution. Most nodes only live at Layer 0; very few reach the top.
  2. Within each layer: every node connects to its M nearest neighbors. Higher layers are sparse (few nodes, long edges = "highways"). Layer 0 is dense (all nodes, short edges = "local roads").
  3. Search starts at the top: from a fixed entry point, greedily walk to the neighbor closest to the query, repeat until no improvement.
  4. Drop down a layer and repeat: the position from the previous layer becomes the start of the next layer's walk. Each drop refines the result with finer-grained edges.
  5. Layer 0 with ef: at the bottom layer, instead of stopping at the first local minimum, HNSW maintains a pool of ef best candidates and expands their neighborhoods. Larger ef → more accurate but slower.

Why it's fast: the top layers act like an index over the index — you cover huge distances in just a few hops, then zoom in. For N points, search visits roughly O(log N) nodes instead of N.

What is HNSW?

HNSW (Hierarchical Navigable Small World) is a graph-based approximate nearest neighbor algorithm. It organizes vectors into a stack of navigable small-world graphs: sparse upper layers with long-range edges for coarse routing, and a dense bottom layer with short edges for the final fine-grained walk. A search greedily hops toward the query on each layer, then drops down — visiting roughly O(log N) nodes instead of scanning all N.

HNSW is the most widely used in-memory vector index, and the default graph index in Milvus. Milvus also offers HNSW_SQ and HNSW_PQ variants that quantize the stored vectors to cut memory at a small recall cost.

HNSW parameters in Milvus

  • M (build time): the maximum number of edges per node, the same knob as the slider above. Typical values are 8–64. Higher M improves recall and graph connectivity but grows memory linearly and slows construction.
  • efConstruction (build time): the candidate pool size used while inserting nodes. Higher values build a better-quality graph (higher achievable recall) at the cost of slower index builds. Typical values are 100–500.
  • ef / ef_search (query time): the candidate pool kept during the Layer-0 walk — the main recall/latency knob at search time. It must be at least your top-K. Increase it until recall plateaus; latency grows roughly linearly with it.

HNSW vs IVF vs DiskANN

HNSW typically delivers the best query latency at high recall for in-memory search — that's why it's the default choice when the dataset (plus graph overhead) fits in RAM. Its costs are memory and build time: the graph adds noticeable overhead per vector, and inserts are slower than IVF's k-means assignment. Pick IVF when you need fast index builds or aggressive quantization, and DiskANN when the dataset is too large for RAM but you still need high recall.

Frequently asked questions

What does the M parameter in HNSW control?

M caps how many edges each node keeps per layer. Larger M makes the graph better connected — higher recall and more robust routing — but memory grows linearly with M and index builds slow down. Values between 8 and 64 cover most workloads; 16–32 is a common sweet spot.

What is the difference between ef and efConstruction?

Both are candidate-pool sizes, applied at different times. efConstruction is used while building the index: it determines how thoroughly each inserted node searches for neighbors, so it controls graph quality. ef (ef_search) is used at query time: it controls how wide the Layer-0 search is, trading latency for recall on every query.

How much memory does an HNSW index need?

Roughly the raw vectors plus the graph: about d × 4 bytes per vector for float32 data, plus on the order of M × 2 × 8 bytes of edges per node. For memory-constrained deployments, Milvus offers HNSW_SQ and HNSW_PQ, which quantize stored vectors, or DiskANN, which moves full vectors to SSD.

When should I use HNSW instead of IVF?

Use HNSW when query latency at high recall is the priority and the index fits in memory — it usually wins that race. Prefer IVF when you rebuild indexes frequently (much faster builds), or when combined with SQ8/PQ quantization to fit a tight memory budget.

Go deeper: read the HNSW documentation for exact index params, or see Index Explained for the full decision guide.

Keep exploring