IVF_RABITQ

Compatible with Milvus 2.6.x

The IVF_RABITQ index is a binary quantization-based indexing algorithm that quantizes FP32 vectors into binary representations. This index offers exceptional storage efficiency with a 1-to-32 compression ratio while maintaining relatively good recall rates. It supports optional refinement to achieve higher recall at the cost of additional storage, making it a versatile replacement for IVF_SQ8 and IVF_FLAT in memory-constrained scenarios.

Overview

IVF_RABITQ stands for Inverted File with RaBitQ quantization. It combines two techniques for efficient vector search and storage.

IVF

Inverted File (IVF) organizes the vector space into manageable regions using k-means clustering. Each cluster is represented by a centroid, serving as a reference point for the vectors within that cluster. This clustering approach reduces the search space by allowing the algorithm to focus only on the most relevant clusters during query processing.

To learn more about IVF technical details, refer to IVF_FLAT.
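The clustering and cluster-probing idea described above can be sketched in pure Python. This is an illustrative toy, not the Milvus or FAISS implementation: the function names (train_centroids, build_inverted_lists, ivf_search) are hypothetical, the k-means is a plain Lloyd's loop, and distances are computed on the raw vectors rather than on quantized codes.

```python
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_centroids(vectors, nlist, iterations=10, seed=0):
    """Plain k-means (Lloyd's algorithm) that returns nlist centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, nlist)]
    for _ in range(iterations):
        buckets = [[] for _ in range(nlist)]
        for v in vectors:
            nearest = min(range(nlist), key=lambda i: squared_l2(v, centroids[i]))
            buckets[nearest].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def build_inverted_lists(vectors, centroids):
    """Map each centroid index to the ids of the vectors assigned to it."""
    lists = {i: [] for i in range(len(centroids))}
    for vec_id, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda i: squared_l2(v, centroids[i]))
        lists[nearest].append(vec_id)
    return lists


def ivf_search(query, vectors, centroids, lists, nprobe, limit):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: squared_l2(query, centroids[i]))
    candidates = [vec_id for i in order[:nprobe] for vec_id in lists[i]]
    candidates.sort(key=lambda vec_id: squared_l2(query, vectors[vec_id]))
    return candidates[:limit]


rng = random.Random(42)
data = [[rng.random() for _ in range(4)] for _ in range(200)]
centroids = train_centroids(data, nlist=8)
lists = build_inverted_lists(data, centroids)
top = ivf_search(data[0], data, centroids, lists, nprobe=2, limit=3)
```

Probing 2 of the 8 clusters scans only a fraction of the 200 vectors. Probing all 8 reproduces a brute-force scan, which is the recall and latency trade-off that nprobe controls.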

RaBitQ

RaBitQ is a state-of-the-art binary quantization method with theoretical guarantees, introduced in the research paper “RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search” by Jianyang Gao and Cheng Long.

RaBitQ introduces several innovative concepts:

Angular Information Encoding: Unlike traditional spatial encoding, RaBitQ encodes angular information through vector normalization. In IVF_RABITQ, data vectors are normalized against their nearest IVF centroid, enhancing the precision of the quantization process.

Theoretical Foundation: The core distance approximation formula is:

$$
\lVert \mathbf{o_r} - \mathbf{q_r} \rVert^2 \approx \lVert \mathbf{o_r} - \mathbf{c_o} \rVert^2 + \lVert \mathbf{q_r} - \mathbf{c_o} \rVert^2 - 2 \cdot C(\mathbf{o_r}, \mathbf{c_o}) \cdot \langle \tilde{\mathbf{o}}, \mathbf{q_r} - \mathbf{c_o} \rangle + C_1(\mathbf{o_r}, \mathbf{c_o})
$$

Where:

  • $\mathbf{o_r}$ is a data vector from the dataset
  • $\mathbf{q_r}$ is a query vector
  • $\mathbf{c_o}$ is the nearest IVF centroid vector for $\mathbf{o_r}$
  • $C(\mathbf{o_r}, \mathbf{c_o})$ and $C_1(\mathbf{o_r}, \mathbf{c_o})$ are precomputed constants
  • $\tilde{\mathbf{o}}$ is the quantized binary vector stored in the index
  • $\langle \tilde{\mathbf{o}}, \mathbf{q_r} - \mathbf{c_o} \rangle$ represents the dot-product operation

Computational Efficiency: The binary nature of $\tilde{\mathbf{o}}$ makes distance calculations extremely fast, especially on CPUs with dedicated AVX-512 VPOPCNTDQ instructions (Intel Ice Lake and later, or AMD Zen 4 and later).
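To illustrate why a population-count instruction helps with binary codes, the sketch below computes the dot product of two sign vectors from their packed bits. It assumes each bit stands for a +1 or -1 component, which is a simplification for illustration. It is not RaBitQ's full estimator, and the helper names (pack_signs, popcount, sign_dot) are hypothetical.

```python
def pack_signs(vector):
    """Pack component signs into one integer: bit i is 1 when vector[i] >= 0."""
    bits = 0
    for i, x in enumerate(vector):
        if x >= 0:
            bits |= 1 << i
    return bits


def popcount(x):
    """Number of set bits; hardware popcount does this per machine word."""
    return bin(x).count("1")


def sign_dot(bits_a, bits_b, dim):
    """Dot product of two +/-1 vectors from their packed sign bits.

    Positions where the signs agree contribute +1 and the others -1,
    so the result is dim - 2 * (number of disagreeing positions).
    """
    return dim - 2 * popcount(bits_a ^ bits_b)


a = [0.3, -1.2, 0.7, -0.1, 2.0, -0.4, 0.9, 0.5]
b = [0.8, 0.6, -0.2, -0.9, 1.1, -0.3, -0.7, 0.4]

# Reference result computed component by component.
signs_a = [1 if x >= 0 else -1 for x in a]
signs_b = [1 if x >= 0 else -1 for x in b]
reference = sum(x * y for x, y in zip(signs_a, signs_b))

# Same result from one XOR and one popcount over the packed bits.
fast = sign_dot(pack_signs(a), pack_signs(b), len(a))
```

Here the two vectors disagree in sign at three of eight positions, so both computations give 8 - 2 * 3 = 2. The packed version replaces a per-component loop with one XOR and one bit count.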

Algorithmic Enhancements: RaBitQ integrates effectively with established techniques like the FastScan approach and random rotations for improved performance.

IVF + RaBitQ

The IVF_RABITQ index combines IVF’s efficient clustering with RaBitQ’s advanced binary quantization:

  1. Coarse Filtering: IVF partitions the vector space into clusters, significantly reducing the search scope by focusing on the most relevant cluster regions.

  2. Binary Quantization: Within each cluster, RaBitQ compresses vectors into binary representations while preserving essential distance relationships through theoretical guarantees.

  3. Optional Refinement: When enabled, the index stores additional refined data using higher precision formats (SQ6, SQ8, FP16, BF16, or FP32) to improve recall rates at the cost of increased storage.
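The storage trade-off between the binary codes and the optional refinement data can be estimated with some quick arithmetic. The sketch below is a rough model under stated assumptions: it counts 1 bit per dimension for the RaBitQ code, assumes the refine formats use the bit widths their names suggest (6 and 8 bits for SQ6 and SQ8, 16 for FP16 and BF16, 32 for FP32), and ignores centroids, per-vector constants, and other index overhead. The function name bytes_per_vector is hypothetical.

```python
# Assumed bits per dimension for each refine_type (inferred from the names).
REFINE_BITS = {"SQ6": 6, "SQ8": 8, "FP16": 16, "BF16": 16, "FP32": 32}


def bytes_per_vector(dim, refine_type=None):
    """Rough per-vector payload: 1 bit per dimension plus optional refine data."""
    bits = dim  # binary code, one bit per dimension
    if refine_type is not None:
        bits += dim * REFINE_BITS[refine_type]
    return bits / 8


dim = 768
raw = dim * 4  # FP32 baseline: 4 bytes per dimension
for refine_type in (None, "SQ8", "FP32"):
    size = bytes_per_vector(dim, refine_type)
    print(refine_type, size, round(raw / size, 2))
```

For 768-dimensional vectors, the binary code alone takes 96 bytes against 3072 bytes of FP32 data, which is the 1-to-32 ratio. Under these assumptions, adding SQ8 refinement brings the payload to 864 bytes per vector.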

Milvus implements IVF_RABITQ using the following FAISS factory strings:

  • With refinement: "RR({dim}),IVF{nlist},RaBitQ,Refine({refine_index})"
  • Without refinement: "RR({dim}),IVF{nlist},RaBitQ"
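As a small illustration of how the two templates above differ, the following helper fills them in. The function name rabitq_factory_string is hypothetical, and the value substituted for {refine_index} is passed through unchanged because the text above does not say how Milvus derives it from refine_type.

```python
def rabitq_factory_string(dim, nlist, refine_index=None):
    """Fill in the factory-string templates shown above."""
    base = f"RR({dim}),IVF{nlist},RaBitQ"
    if refine_index is None:
        return base  # no refinement stage
    # refine_index is inserted verbatim; its real value is not specified here.
    return f"{base},Refine({refine_index})"


without_refine = rabitq_factory_string(dim=128, nlist=1024)
with_refine = rabitq_factory_string(dim=128, nlist=1024, refine_index="<refine_index>")
```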

Build index

To build an IVF_RABITQ index on a vector field in Milvus, use the add_index() method, specifying the index_type, metric_type, and additional parameters for the index.

from pymilvus import MilvusClient

# Prepare index building params
index_params = MilvusClient.prepare_index_params()

index_params.add_index(
    field_name="your_vector_field_name", # Name of the vector field to be indexed
    index_type="IVF_RABITQ", # Type of the index to create
    index_name="vector_index", # Name of the index to create
    metric_type="L2", # Metric type used to measure similarity
    params={
        "nlist": 1024, # Number of clusters for the index
        "refine": True, # Enable refinement for higher recall
        "refine_type": "SQ8" # Refinement data format
    } # Index building params
)

In this configuration:

  • index_type: The type of index to be built. In this example, set the value to IVF_RABITQ.

  • metric_type: The method used to calculate the distance between vectors. Supported values include COSINE, L2, and IP. For details, refer to Metric Types.

  • params: Additional configuration options for building the index. For details, refer to Index building params.

Once the index parameters are configured, you can create the index by calling the create_index() method directly or by passing the index params to the create_collection() method. For details, refer to Create Collection.

Search on index

Once the index is built and entities are inserted, you can perform similarity searches on the index.

search_params = {
    "params": {
        "nprobe": 128, # Number of clusters to search
        "rbq_query_bits": 0, # Query vector quantization bits
        "refine_k": 1 # Refinement magnification factor
    }
}

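# search() is an instance method, so connect to Milvus first.
# The URI is a placeholder; replace it with your deployment's address.
client = MilvusClient(uri="http://localhost:19530")

res = client.search(
    collection_name="your_collection_name", # Collection name
    anns_field="your_vector_field_name", # Name of the indexed vector field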
    data=[[0.1, 0.2, 0.3, 0.4, 0.5]], # Query vector
    limit=3, # TopK results to return
    search_params=search_params
)

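In this configuration, params holds the index-specific search options. For details, refer to Index-specific search params.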

The IVF_RABITQ index relies heavily on the popcount hardware instruction for optimal performance. CPUs with the AVX-512 VPOPCNTDQ instruction set, such as Intel Ice Lake and later or AMD Zen 4 and later, significantly speed up RaBitQ operations.

Index params

This section provides an overview of the parameters used for building an index and performing searches on the index.

Index building params

The following table lists the parameters that can be configured in params when building an index.

| Component | Parameter | Description | Value Range | Tuning Suggestion |
|---|---|---|---|---|
| IVF | nlist | The number of clusters to create using the k-means algorithm during index building. Each cluster, represented by a centroid, stores a list of vectors. Increasing this parameter reduces the number of vectors in each cluster, creating smaller, more focused partitions. | Type: Integer. Range: [1, 65536]. Default value: 128. | Larger nlist values improve recall by creating more refined clusters but increase index building time. Optimize based on dataset size and available resources. In most cases, we recommend you set a value within this range: [32, 4096]. |
| RaBitQ | refine | Enables the refine process and stores the refined data. | Type: Boolean. Range: [true, false]. Default value: false. | Set to true if a 0.9+ recall rate is needed. Enabling refinement improves accuracy but increases storage requirements and index building time. |
| RaBitQ | refine_type | Defines the data representation used for refining when refine is enabled. | Type: String. Range: [SQ6, SQ8, FP16, BF16, FP32]. Default value: None. | The listed values are presented in order of increasing recall rate, decreasing QPS, and increasing storage size. SQ8 is recommended as a starting point, offering a good balance between accuracy and resource usage. |
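The ranges listed above can be checked on the client side before the params are sent to Milvus. The sketch below is a hypothetical helper, not part of pymilvus. It also assumes that refine_type must be set whenever refine is true, which is one reading of the None default rather than documented behavior.

```python
REFINE_TYPES = ("SQ6", "SQ8", "FP16", "BF16", "FP32")


def check_build_params(params):
    """Return a list of problems found in an IVF_RABITQ build params dict."""
    problems = []

    nlist = params.get("nlist", 128)  # documented default
    if isinstance(nlist, bool) or not isinstance(nlist, int) or not 1 <= nlist <= 65536:
        problems.append("nlist must be an integer in [1, 65536]")

    refine = params.get("refine", False)  # documented default
    if not isinstance(refine, bool):
        problems.append("refine must be a boolean")

    # Assumption: a refine_type is required once refinement is enabled.
    refine_type = params.get("refine_type")
    if refine is True and refine_type not in REFINE_TYPES:
        problems.append("refine_type must be one of " + ", ".join(REFINE_TYPES))

    return problems


ok = check_build_params({"nlist": 1024, "refine": True, "refine_type": "SQ8"})
bad = check_build_params({"nlist": 0, "refine": True})
```

The first call returns an empty list. The second reports both the out-of-range nlist and the missing refine_type.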

Index-specific search params

The following table lists the parameters that can be configured in search_params.params when searching on the index.

| Component | Parameter | Description | Value Range | Tuning Suggestion |
|---|---|---|---|---|
| IVF | nprobe | The number of clusters to search for candidates. Higher values allow more clusters to be searched, improving recall by expanding the search scope but at the cost of increased query latency. | Type: Integer. Range: [1, nlist]. Default value: 8. | Increasing this value improves recall but may slow down the search. Set nprobe proportionally to nlist to balance speed and accuracy. In most cases, we recommend you set a value within this range: [1, nlist]. |
| RaBitQ | rbq_query_bits | Sets whether additional scalar quantization of a query vector is applied. If set to 0, the query is used without quantization. If set to a value within [1, 8], the query is preprocessed using n-bit scalar quantization. | Type: Integer. Range: [0, 8]. Default value: 0. | The default 0 value provides maximum recall rate but slowest performance. We recommend testing values 0, 8, and 6, as they provide similar recall rates with 6 being the fastest. Use smaller values for higher recall requirements. |
| RaBitQ | refine_k | The refining process uses higher quality quantization to pick the needed number of nearest neighbors from a refine_k times larger pool of candidates chosen using IVF_RABITQ. | Type: Float. Range: [1, float_max). Default value: 1. | Higher refine_k values decrease QPS but increase recall rate. Start with 1 and test values 2, 3, 4, and 5 to find the optimal trade-off between QPS and recall for your dataset. |
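The effect of refine_k can be sketched as a two-stage search: pick a candidate pool using coarse (lossy) representations, then rerank that pool with a more precise representation. The example below is a toy under stated assumptions. The function name search_with_refine is hypothetical, the coarse vectors are hand-made stand-ins for quantized codes, the precise vectors stand in for the refine_type data, and rounding the pool size up with ceil is an assumption about how a fractional refine_k would be handled.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_with_refine(query, coarse_vectors, refine_vectors, limit, refine_k=1.0):
    """Take ceil(limit * refine_k) candidates by coarse distance, then rerank."""
    pool_size = min(len(coarse_vectors), math.ceil(limit * refine_k))
    by_coarse = sorted(
        range(len(coarse_vectors)),
        key=lambda i: squared_l2(query, coarse_vectors[i]),
    )
    pool = by_coarse[:pool_size]
    # Rerank the pool using the higher-precision representation.
    pool.sort(key=lambda i: squared_l2(query, refine_vectors[i]))
    return pool[:limit]


precise = [[0.6, 0.6], [1.6, 1.6], [3.0, 3.0]]
coarse = [[1, 1], [2, 2], [3, 3]]  # lossy stand-ins for the vectors above
query = [1.2, 1.2]

top_without_pool = search_with_refine(query, coarse, precise, limit=1, refine_k=1)
top_with_pool = search_with_refine(query, coarse, precise, limit=1, refine_k=2)
```

With refine_k=1 the pool holds only the coarse winner (id 0), so reranking cannot correct it. With refine_k=2 the true nearest neighbor (id 1) enters the pool and wins the rerank, at the cost of one more precise distance computation.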
