BIN_IVF_FLAT
The BIN_IVF_FLAT index is a variant of the IVF_FLAT index built exclusively for binary embeddings. It improves query efficiency by first partitioning the vector data into multiple clusters (nlist units) and then comparing the query vector to the center of each cluster, so that only the vectors in the closest clusters need to be scanned. Compared with scanning every vector, BIN_IVF_FLAT can significantly reduce query time while letting users tune the balance between accuracy and speed. For more information, refer to IVF_FLAT.
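To make the partition-then-probe idea concrete, here is a minimal pure-Python sketch. It is not how Milvus implements the index: it assumes binary vectors are held as Python integers, it uses randomly sampled vectors as stand-in centroids instead of running k-means, and the helper names (hamming, build_toy_ivf, toy_ivf_search) are hypothetical.

```python
import random


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two bit vectors stored as ints."""
    return bin(a ^ b).count("1")


def build_toy_ivf(vectors, nlist, seed=0):
    """Sample nlist vectors as stand-in centroids (assumption: no k-means here)
    and assign every vector to the bucket of its nearest centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, nlist)
    buckets = [[] for _ in range(nlist)]
    for idx, vec in enumerate(vectors):
        nearest = min(range(nlist), key=lambda c: hamming(vec, centroids[c]))
        buckets[nearest].append(idx)
    return centroids, buckets


def toy_ivf_search(query, vectors, centroids, buckets, nprobe, limit):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: hamming(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in buckets[c]]
    # Rank candidates by Hamming distance, breaking ties by index.
    candidates.sort(key=lambda i: (hamming(query, vectors[i]), i))
    return candidates[:limit]


# Toy data: 1000 random 64-bit binary vectors.
DIM = 64
data_rng = random.Random(42)
data = [data_rng.getrandbits(DIM) for _ in range(1000)]

centroids, buckets = build_toy_ivf(data, nlist=16)
query = data[7]
top3 = toy_ivf_search(query, data, centroids, buckets, nprobe=4, limit=3)
print(top3)
```

With nprobe equal to nlist, every bucket is scanned and the sketch degenerates to an exhaustive search. Smaller nprobe values scan fewer candidates and may miss neighbors that fell into unprobed buckets.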
Build index
To build a BIN_IVF_FLAT index on a vector field in Milvus, use the add_index() method, specifying the index_type, metric_type, and additional parameters for the index.
from pymilvus import MilvusClient

# Prepare index building params
index_params = MilvusClient.prepare_index_params()

index_params.add_index(
    field_name="your_binary_vector_field_name",  # Name of the vector field to be indexed
    index_type="BIN_IVF_FLAT",  # Type of the index to create
    index_name="vector_index",  # Name of the index to create
    metric_type="HAMMING",  # Metric type used to measure similarity
    params={
        "nlist": 64,  # Number of clusters for the index
    },  # Index building params
)
In this configuration:
index_type: The type of index to be built. In this example, set the value to BIN_IVF_FLAT.
metric_type: The method used to calculate the distance between vectors. Supported values for binary embeddings include HAMMING (default) and JACCARD. For details, refer to Metric Types.
params: Additional configuration options for building the index.
    nlist: Number of clusters into which the dataset is divided.
To learn more about the building parameters available for the BIN_IVF_FLAT index, refer to Index building params.
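As an illustration of what the two metric types measure, the following sketch computes Hamming and Jaccard distances on binary vectors packed into bytes. The function names are hypothetical and the code is a plain-Python reference for the definitions, not the routine Milvus runs internally.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions at which the two vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def jaccard_distance(a: bytes, b: bytes) -> float:
    """1 - |intersection| / |union|, treating set bits as set members."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    intersection = sum(bin(x & y).count("1") for x, y in zip(a, b))
    union = sum(bin(x | y).count("1") for x, y in zip(a, b))
    # Convention assumed here: two all-zero vectors have distance 0.
    return 0.0 if union == 0 else 1.0 - intersection / union


a = bytes([0b11110000, 0b00001111])  # 16-dimensional binary vector
b = bytes([0b11000000, 0b00001111])

print(hamming_distance(a, b))  # 2 differing bits
print(jaccard_distance(a, b))  # 1 - 6/8 = 0.25
```

Hamming distance counts raw bit differences, while Jaccard distance normalizes by the number of bits set in either vector, which can matter when vectors differ widely in how many bits are set.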
Once the index parameters are configured, you can create the index by using the create_index() method directly or by passing the index params in the create_collection method. For details, refer to Create Collection.
Search on index
Once the index is built and entities are inserted, you can perform similarity searches on the index.
# Assumes `client` is a connected MilvusClient instance, for example:
# client = MilvusClient(uri="http://localhost:19530")  # address is an assumption
search_params = {
    "params": {
        "nprobe": 10,  # Number of clusters to search
    }
}

res = client.search(
    collection_name="your_collection_name",  # Collection name
    anns_field="binary_vector_field",  # Binary vector field
    data=[query_binary_vector],  # Query binary vector
    limit=3,  # TopK results to return
    search_params=search_params,
)
In this configuration:
params: Additional configuration options for searching on the index.
    nprobe: Number of clusters to search.
To learn more about the search parameters available for the BIN_IVF_FLAT index, refer to Index-specific search params.
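The search example above passes query_binary_vector without showing how such a value is built. The sketch below shows one way to pack a list of 0/1 values into bytes, assuming the collection stores each binary vector as dim / 8 packed bytes. The pack_bits helper is hypothetical, and the most-significant-bit-first ordering is an assumption chosen to match the default of numpy.packbits. Whatever ordering you pick, use the same one for stored vectors and query vectors.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into bytes, most significant bit first."""
    if len(bits) % 8 != 0:
        raise ValueError("the number of bits must be a multiple of 8")
    packed = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | (1 if bit else 0)
        packed.append(byte)
    return bytes(packed)


# A 16-dimensional binary vector becomes 2 bytes.
bits = [1, 0, 1, 0, 1, 0, 1, 0,
        0, 0, 0, 0, 1, 1, 1, 1]
query_binary_vector = pack_bits(bits)
print(query_binary_vector.hex())  # aa0f
```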
Index params
This section provides an overview of the parameters used for building an index and performing searches on the index.
Index building params
The following table lists the parameters that can be configured in params when building an index.
Parameter | Description | Value Range | Tuning Suggestion |
---|---|---|---|
nlist | The number of clusters to create using the k-means algorithm during index building. Each cluster, represented by a centroid, stores a list of vectors. Increasing this parameter reduces the number of vectors in each cluster, creating smaller, more focused partitions. | Type: Integer. Range: [1, 65536]. Default value: | Larger |
Index-specific search params
The following table lists the parameters that can be configured in search_params.params when searching on the index.
Parameter | Description | Value Range | Tuning Suggestion |
---|---|---|---|
nprobe | The number of clusters to search for candidates. Higher values allow more clusters to be searched, improving recall by expanding the search scope but at the cost of increased query latency. | Type: Integer. Range: [1, nlist]. Default value: | Increasing this value improves recall but may slow down the search. In most cases, we recommend you set a value within this range: [1, nlist]. |
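To get a rough feel for how nlist and nprobe interact, the sketch below estimates how many vectors a single query compares against. It assumes evenly sized clusters, which real data rarely produces, so treat the numbers as order-of-magnitude guidance only. The estimated_candidates function and the dataset size are hypothetical.

```python
def estimated_candidates(num_vectors: int, nlist: int, nprobe: int) -> float:
    """Approximate number of vectors scanned per query, assuming balanced clusters."""
    if not 1 <= nlist <= 65536:
        raise ValueError("nlist must be in [1, 65536]")
    if not 1 <= nprobe <= nlist:
        raise ValueError("nprobe must be in [1, nlist]")
    return num_vectors * nprobe / nlist


num_vectors = 1_000_000  # hypothetical dataset size
nlist = 64

for nprobe in (1, 10, 64):
    scanned = estimated_candidates(num_vectors, nlist, nprobe)
    print(f"nprobe={nprobe:>2}: ~{scanned:,.0f} vectors scanned "
          f"({scanned / num_vectors:.1%} of the dataset)")
```

Under this assumption, nprobe=10 with nlist=64 scans about 156,250 of 1,000,000 vectors, and nprobe=64 scans all of them, which is why raising nprobe improves recall at the cost of latency.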