The key configuration parameters for an HNSW (Hierarchical Navigable Small World) index are M, efConstruction, and efSearch. These parameters directly influence the balance between index size, build time, query speed, and recall. Here’s how each works and their trade-offs:
M (Maximum Connections per Node): M sets the maximum number of bidirectional links each node keeps in each layer of the HNSW graph (many implementations allow a larger limit, commonly 2·M, on the bottom layer). A higher M (e.g., 24 vs. 12) increases the graph’s connectivity, improving recall by reducing the chance of search getting trapped in local minima. However, more connections also expand the index size, since link memory grows roughly linearly with M, and slow down build time, as each insertion requires more comparisons to establish and prune links. For example, doubling M from 12 to 24 can more than double build time in some cases, though the exact ratio depends on the dataset and implementation. During queries, a higher M can reduce the number of hops needed to reach the right region of the graph, but each hop then evaluates more neighbors, so the net effect on latency depends on the data and on how well the graph was structured during construction. Developers often tune M based on dataset size and memory constraints: larger datasets may require a higher M for acceptable recall but will incur higher memory costs.
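To make the memory effect of M concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from any particular library. It assumes a layout in which the bottom layer reserves 2·M link slots per node, each upper layer reserves M slots, neighbor ids are 4 bytes, vectors are float32, and a node reaches layer l with probability M^-l. The function names are hypothetical, and real indexes add per-node overhead that is ignored here.

```python
def estimate_link_bytes_per_node(m: int, id_bytes: int = 4) -> float:
    """Rough expected bytes of neighbor-link storage per vector.

    Assumptions (illustrative, not from a specific library):
    - layer 0 reserves 2*M link slots, each upper layer reserves M slots;
    - a node appears in upper layer l with probability M**-l, so it
      occupies 1/(M-1) upper layers on average;
    - every reserved slot costs `id_bytes`, whether or not it is filled.
    """
    if m < 2:
        raise ValueError("M must be at least 2")
    layer0_slots = 2 * m
    expected_upper_slots = m / (m - 1)  # M slots * 1/(M-1) expected layers
    return (layer0_slots + expected_upper_slots) * id_bytes


def estimate_index_bytes(num_vectors: int, dim: int, m: int,
                         float_bytes: int = 4) -> float:
    """Raw vector storage plus estimated link storage, ignoring other overhead."""
    vector_bytes = dim * float_bytes
    return num_vectors * (vector_bytes + estimate_link_bytes_per_node(m))


if __name__ == "__main__":
    for m in (12, 16, 24):
        gib = estimate_index_bytes(10_000_000, 128, m) / 2**30
        print(f"M={m}: roughly {gib:.2f} GiB for 10M 128-d float32 vectors")
```

Under these assumptions, link storage is close to 8·M bytes per vector, so doubling M roughly doubles the graph's share of memory while the vector data itself stays the same size.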
efConstruction (Construction-Time Search Depth): efConstruction controls the size of the candidate list explored when inserting a node into the graph. A higher efConstruction (e.g., 400 vs. 200) gives the algorithm a larger pool of candidates from which to choose each node’s links, leading to a higher-quality graph and better recall. However, this significantly increases build time, as each insertion requires more distance computations. For instance, setting efConstruction=400 might roughly double build time compared to efConstruction=200. The parameter has little to no effect on index size, because the number of links per node is capped by M and efConstruction only influences which links are selected. Recall gains also tend to flatten out beyond a certain value, so very large settings mostly cost build time. Developers often prioritize higher efConstruction for critical applications like recommendation systems where recall is paramount, even if it means waiting longer for the index to build.
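The link-selection step can be sketched as follows. This is a simplified Python version of the diversity heuristic described in the HNSW paper, under the assumption that the candidate list has already been gathered by an efConstruction-bounded search. The function name and data layout are hypothetical, and real implementations differ in details such as pruning existing neighbors.

```python
import math


def select_neighbors(new_point, candidates, m):
    """Pick up to `m` links for `new_point` from (id, vector) candidates.

    A candidate is kept only if it is closer to the new point than to every
    neighbor already selected, which favors links in different directions
    over several links into the same tight cluster.
    """
    ordered = sorted(candidates, key=lambda c: math.dist(c[1], new_point))
    selected = []
    for cand_id, cand_vec in ordered:
        if len(selected) >= m:
            break
        d_to_new = math.dist(cand_vec, new_point)
        if all(d_to_new < math.dist(cand_vec, sel_vec) for _, sel_vec in selected):
            selected.append((cand_id, cand_vec))
    return [cand_id for cand_id, _ in selected]


if __name__ == "__main__":
    pool = [(1, (1.0, 0.0)), (2, (1.1, 0.1)), (3, (0.0, 2.0)), (4, (-3.0, 0.0))]
    # Candidate 2 sits right next to candidate 1, so it is skipped in favor of 3.
    print(select_neighbors((0.0, 0.0), pool, m=2))
```

A larger efConstruction simply hands this step a bigger `candidates` pool, which is why it improves link quality without changing how many links are stored.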
efSearch (Query-Time Search Depth): efSearch determines the size of the dynamic candidate list during querying. A higher efSearch (e.g., 500 vs. 100) increases recall by exploring more neighbors, but it slows down queries due to additional distance calculations. As an illustrative example, in a 10-million-vector dataset, efSearch=500 might achieve 98% recall but take 5 ms per query, while efSearch=100 might drop to 85% recall with 1 ms latency; actual numbers depend heavily on the data, dimensionality, and hardware. Because efSearch is a query-time setting, it can be changed without rebuilding the index and is often adjusted dynamically: a large efSearch is used for accuracy-critical tasks (e.g., medical image retrieval), while smaller values suit real-time applications (e.g., autocomplete suggestions). Importantly, efSearch should be at least the desired number of nearest neighbors (k), since the candidate list must be able to hold k results; some libraries enforce this.
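The role of the dynamic candidate list can be illustrated with a minimal pure-Python sketch of a best-first search over a single graph layer. This is a simplification for illustration, not the code of any specific library: the function names, the dictionary-based adjacency list, and the toy graph are all assumptions, and a full HNSW search would first descend through the upper layers to pick the entry point.

```python
import heapq
import math


def search_layer(vectors, neighbors, query, entry_point, ef):
    """Best-first search on one layer, keeping at most `ef` results."""
    def dist(i):
        return math.dist(vectors[i], query)

    visited = {entry_point}
    d0 = dist(entry_point)
    candidates = [(d0, entry_point)]  # min-heap: closest unexpanded node first
    results = [(-d0, entry_point)]    # max-heap (negated): worst kept result on top

    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the best remaining candidate is worse than a full result list.
        if d > -results[0][0] and len(results) >= ef:
            break
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            if len(results) < ef or d_nb < -results[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(results, (-d_nb, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the current worst result
    return sorted((-neg_d, i) for neg_d, i in results)


def knn(vectors, neighbors, query, entry_point, k, ef_search):
    """Return ids of the k closest nodes found; ef is clamped to at least k."""
    ef = max(ef_search, k)
    found = search_layer(vectors, neighbors, query, entry_point, ef)
    return [i for _, i in found[:k]]


# Toy graph: node 1 looks promising from the entry point but is a dead end;
# the true nearest neighbor (node 3) is only reachable through node 2.
toy_vectors = {0: (0.0, 0.0), 1: (6.0, 0.0), 2: (5.0, 3.0), 3: (10.0, 1.0)}
toy_neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}

if __name__ == "__main__":
    q = (10.0, 0.0)
    print(knn(toy_vectors, toy_neighbors, q, entry_point=0, k=1, ef_search=1))
    print(knn(toy_vectors, toy_neighbors, q, entry_point=0, k=1, ef_search=2))
```

In this toy graph, `ef_search=1` stops at the dead-end node 1, while `ef_search=2` keeps node 2 in the list long enough to reach node 3. That is the recall-versus-work trade-off in miniature: a wider list evaluates more nodes but is less likely to settle on a local minimum.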
Practical Trade-offs and Use Cases: Tuning these parameters requires balancing priorities. For example, a high-recall setup (M=24, efConstruction=400, efSearch=500) suits offline batch processing but demands significant memory and build time. In contrast, a real-time system might use M=12, efConstruction=200, and efSearch=100 to prioritize speed and resource efficiency. Experimentation is key: start with default values (e.g., M=16, efConstruction=200) and adjust incrementally while monitoring recall, latency, and resource usage. Since M and efConstruction are fixed at build time while efSearch can be changed per query, it is usually cheapest to sweep efSearch first and rebuild with a different M or efConstruction only if the recall target cannot be met. Tools like ANN-Benchmarks can help quantify trade-offs for specific datasets and hardware.
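Monitoring recall while tuning needs a ground truth to compare against, typically exact nearest neighbors computed by brute force on a sample of queries. A minimal Python sketch of the recall@k calculation is shown here; the function name and input layout (one list of ids per query) are assumptions for illustration.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Mean fraction of the true top-k neighbors that the index returned.

    approx_ids[i] and exact_ids[i] are the id lists for query i, from the
    HNSW index and from an exact (brute-force) search respectively.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(approx_ids) != len(exact_ids) or not exact_ids:
        raise ValueError("need the same non-zero number of queries in both lists")
    total = 0.0
    for approx, exact in zip(approx_ids, exact_ids):
        truth = set(exact[:k])
        if not truth:
            raise ValueError("each query needs at least one ground-truth id")
        total += len(truth.intersection(approx[:k])) / len(truth)
    return total / len(exact_ids)


if __name__ == "__main__":
    approx = [[1, 2, 3], [4, 5, 6]]
    exact = [[1, 2, 9], [4, 5, 6]]
    print(f"recall@3 = {recall_at_k(approx, exact, 3):.3f}")
```

Running this for each efSearch value in a sweep, alongside measured latency, gives the recall/latency curve from which an operating point can be chosen.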