Index Vector Fields
This guide walks you through the basic operations for creating and managing indexes on vector fields in a collection.
Overview
Leveraging the metadata stored in an index file, Milvus organizes your data in a specialized structure, facilitating rapid retrieval of requested information during searches or queries.
Milvus provides several index types and metrics to sort field values for efficient similarity searches. The following table lists the supported index types and metrics for different vector field types. Currently, Milvus supports various types of vector data, including floating point embeddings (often known as floating point vectors or dense vectors), binary embeddings (also known as binary vectors), and sparse embeddings (also known as sparse vectors). For details, refer to In-memory Index and Similarity Metrics.
Metric Types | Index Types |
---|---|
|
|
Metric Types | Index Types |
---|---|
|
|
Metric Types | Index Types |
---|---|
IP |
|
It is recommended to create indexes for both the vector field and scalar fields that are frequently accessed.
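To make the metric names used in this guide more concrete, the sketch below implements textbook definitions of L2, IP, COSINE, HAMMING, and JACCARD in plain Python. It is illustrative only: the function names are hypothetical, binary vectors are modeled as lists of 0/1 integers, and Milvus's internal implementations may differ in details such as normalization or whether a square root is taken.

```python
import math

def l2(a, b):
    """Euclidean distance between two float vectors (smaller is closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ip(a, b):
    """Inner product of two float vectors (larger is closer)."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: inner product of the length-normalized vectors."""
    return ip(a, b) / (math.sqrt(ip(a, a)) * math.sqrt(ip(b, b)))

def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def jaccard(a, b):
    """Jaccard distance: 1 - |intersection| / |union| of the set bits.

    Assumes at least one bit is set in either vector.
    """
    intersection = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return 1 - intersection / union

print(l2([0.0, 0.0], [3.0, 4.0]))          # 5.0
print(ip([1.0, 2.0], [3.0, 4.0]))          # 11.0
print(cosine([1.0, 0.0], [0.0, 1.0]))      # 0.0
print(hamming([1, 0, 1, 1], [1, 1, 0, 1])) # 2
print(jaccard([1, 0, 1, 1], [1, 1, 0, 1])) # 0.5
```

The first three functions apply to floating point embeddings and the last two to binary embeddings.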
Preparations
As explained in Manage Collections, Milvus automatically generates an index and loads it into memory when creating a collection if any of the following conditions are specified in the collection creation request:
The dimensionality of the vector field and the metric type, or
The schema and the index parameters.
The code snippet below repurposes the existing code to establish a connection to a Milvus instance and create a collection without specifying its index parameters. In this case, the collection lacks an index and remains unloaded.
To prepare for indexing in Python, use MilvusClient to connect to the Milvus server and set up a collection by using create_schema(), add_field(), and create_collection().
To prepare for indexing in Java, use MilvusClientV2 to connect to the Milvus server and set up a collection by using createSchema(), addField(), and createCollection().
To prepare for indexing in Node.js, use MilvusClient to connect to the Milvus server and set up a collection by using createCollection().
from pymilvus import MilvusClient, DataType
# 1. Set up a Milvus client
client = MilvusClient(
    uri="http://localhost:19530"
)
# 2. Prepare the schema
# 2.1. Create schema
schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
)
# 2.2. Add fields to schema
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5)
# 3. Create collection (no index parameters, so it stays unindexed and unloaded)
client.create_collection(
    collection_name="customized_setup",
    schema=schema,
)
import io.milvus.v2.client.ConnectConfig;
import io.milvus.v2.client.MilvusClientV2;
import io.milvus.v2.common.DataType;
import io.milvus.v2.service.collection.request.AddFieldReq;
import io.milvus.v2.service.collection.request.CreateCollectionReq;
String CLUSTER_ENDPOINT = "http://localhost:19530";
// 1. Connect to Milvus server
ConnectConfig connectConfig = ConnectConfig.builder()
    .uri(CLUSTER_ENDPOINT)
    .build();
MilvusClientV2 client = new MilvusClientV2(connectConfig);
// 2. Create a collection
// 2.1 Create schema
CreateCollectionReq.CollectionSchema schema = client.createSchema();
// 2.2 Add fields to schema
schema.addField(AddFieldReq.builder().fieldName("id").dataType(DataType.Int64).isPrimaryKey(true).autoID(false).build());
schema.addField(AddFieldReq.builder().fieldName("vector").dataType(DataType.FloatVector).dimension(5).build());
// 3 Create a collection with the schema but without index parameters
CreateCollectionReq customizedSetupReq = CreateCollectionReq.builder()
    .collectionName("customized_setup")
    .collectionSchema(schema)
    .build();
client.createCollection(customizedSetupReq);
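const { MilvusClient, DataType } = require("@zilliz/milvus2-sdk-node")
// Assumed connection settings; replace with your own endpoint and credentials
const address = "http://localhost:19530"
const token = "root:Milvus"
// 1. Set up a Milvus Client
const client = new MilvusClient({address, token});
// 2. Define fields for the collection
const fields = [
    {
        name: "id",
        data_type: DataType.Int64,
        is_primary_key: true,
        autoID: false
    },
    {
        name: "vector",
        data_type: DataType.FloatVector,
        dim: 5
    },
]
// 3. Create a collection
let res = await client.createCollection({
    collection_name: "customized_setup",
    fields: fields,
})
console.log(res.error_code)
// Output
//
// Success
//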
Index a Collection
To index a collection in Python, use prepare_index_params() to prepare index parameters and create_index() to create the index.
To index a collection in Java, use IndexParam to prepare index parameters and createIndex() to create the index.
To index a collection in Node.js, use createIndex().
# 4.1. Set up the index parameters
index_params = MilvusClient.prepare_index_params()
# 4.2. Add an index on the vector field.
index_params.add_index(
    field_name="vector",
    metric_type="COSINE",
    index_type="IVF_FLAT",
    index_name="vector_index",
    params={ "nlist": 128 }
)
# 4.3. Create an index file
client.create_index(
    collection_name="customized_setup",
    index_params=index_params,
    sync=False # Whether to wait for index creation to complete before returning. Defaults to True.
)
import io.milvus.v2.common.IndexParam;
import io.milvus.v2.service.index.request.CreateIndexReq;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
// 4 Prepare index parameters
// 4.1 Add an index for the vector field "vector"
IndexParam indexParamForVectorField = IndexParam.builder()
    .fieldName("vector")
    .indexName("vector_index")
    .indexType(IndexParam.IndexType.IVF_FLAT)
    .metricType(IndexParam.MetricType.COSINE)
    .extraParams(Map.of("nlist", 128))
    .build();
List<IndexParam> indexParams = new ArrayList<>();
indexParams.add(indexParamForVectorField);
// 4.2 Create an index file
CreateIndexReq createIndexReq = CreateIndexReq.builder()
    .collectionName("customized_setup")
    .indexParams(indexParams)
    .build();
client.createIndex(createIndexReq);
// 4. Set up index for the collection
// 4.1. Set up the index parameters
res = await client.createIndex({
    collection_name: "customized_setup",
    field_name: "vector",
    index_type: "AUTOINDEX",
    metric_type: "COSINE",
    index_name: "vector_index",
    params: { "nlist": 128 }
})
console.log(res.error_code)
// Output
//
// Success
//
Parameter | Description |
---|---|
field_name | The name of the target field to which this object applies. |
metric_type | The algorithm that is used to measure similarity between vectors. Possible values are IP, L2, COSINE, JACCARD, HAMMING. This is available only when the specified field is a vector field. For more information, refer to Indexes supported in Milvus. |
index_type | The name of the algorithm used to arrange data in the specific field. For applicable algorithms, refer to In-memory Index and On-disk Index. |
index_name | The name of the index file generated after this object has been applied. |
params | The fine-tuning parameters for the specified index type. For details on possible keys and value ranges, refer to In-memory Index. |
collection_name | The name of an existing collection. |
index_params | An IndexParams object containing a list of IndexParam objects. |
sync | Controls how the index is built in relation to the client’s request. Valid values: True (default), the client waits until index creation completes before returning; False, the client returns immediately and the index is built in the background. |
Parameter | Description |
---|---|
fieldName | The name of the target field to which this IndexParam object applies. |
indexName | The name of the index file generated after this object has been applied. |
indexType | The name of the algorithm used to arrange data in the specific field. For applicable algorithms, refer to In-memory Index and On-disk Index. |
metricType | The distance metric to use for the index. Possible values are IP, L2, COSINE, JACCARD, HAMMING. |
extraParams | Extra index parameters. For details, refer to In-memory Index and On-disk Index. |
Parameter | Description |
---|---|
collection_name | The name of an existing collection. |
field_name | The name of the field in which to create an index. |
index_type | The type of the index to create. |
metric_type | The metric type used to measure vector distance. |
index_name | The name of the index to create. |
params | Other index-specific parameters. |
Note: Currently, you can create only one index file for each field in a collection.
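To give a feel for what an index type such as IVF_FLAT and its nlist parameter control, the following plain-Python sketch partitions vectors into nlist buckets and searches only the closest ones. It is a toy model under stated assumptions, not Milvus's implementation: build_ivf and search_ivf are hypothetical helpers, centroids are picked by simple sampling instead of k-means, and nprobe (the number of buckets to scan at search time) is a search-side setting not covered in this guide.

```python
import math

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, nlist):
    """Partition vectors into nlist buckets around sampled centroids."""
    if not 1 <= nlist <= len(vectors):
        raise ValueError("nlist must be between 1 and the number of vectors")
    # Assumption: evenly spaced vectors stand in for trained k-means centroids.
    step = len(vectors) // nlist
    centroids = [vectors[i * step] for i in range(nlist)]
    buckets = [[] for _ in range(nlist)]
    for idx, vec in enumerate(vectors):
        nearest = min(range(nlist), key=lambda c: l2(vec, centroids[c]))
        buckets[nearest].append(idx)
    return centroids, buckets

def search_ivf(query, vectors, centroids, buckets, nprobe, top_k):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in buckets[c]]
    candidates.sort(key=lambda i: l2(query, vectors[i]))
    return candidates[:top_k]

vectors = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9], [9.0, 0.0], [9.2, 0.1]]
centroids, buckets = build_ivf(vectors, nlist=3)
hits = search_ivf([5.0, 5.1], vectors, centroids, buckets, nprobe=1, top_k=2)
print(buckets)  # [[0, 1], [2, 3], [4, 5]]
print(hits)     # [2, 3]
```

A larger nlist gives smaller buckets and therefore fewer comparisons per search, at the cost of a higher chance that a true neighbor sits in a bucket that is not scanned.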
Check Index Details
Once you have created an index, you can check its details.
To check the index details in Python, use list_indexes() to list the index names and describe_index() to get the index details.
To check the index details in Java, use listIndexes() to list the index names and describeIndex() to get the index details.
To check the index details in Node.js, use describeIndex() to get the index details.
# 5. Describe index
res = client.list_indexes(
    collection_name="customized_setup"
)
print(res)
# Output
#
# [
#     "vector_index",
# ]
res = client.describe_index(
    collection_name="customized_setup",
    index_name="vector_index"
)
print(res)
# Output
#
# {
#     "index_type": "IVF_FLAT",
#     "metric_type": "COSINE",
#     "field_name": "vector",
#     "index_name": "vector_index"
# }
import io.milvus.v2.service.index.request.DescribeIndexReq;
import io.milvus.v2.service.index.request.ListIndexesReq;
import io.milvus.v2.service.index.response.DescribeIndexResp;
import com.alibaba.fastjson.JSONObject; // assumed JSON library providing JSONObject.toJSON()
import java.util.List;
// 5. Describe index
// 5.1 List the index names
ListIndexesReq listIndexesReq = ListIndexesReq.builder()
    .collectionName("customized_setup")
    .build();
List<String> indexNames = client.listIndexes(listIndexesReq);
System.out.println(indexNames);
// Output:
// [
//     "vector_index"
// ]
// 5.2 Describe an index
DescribeIndexReq describeIndexReq = DescribeIndexReq.builder()
    .collectionName("customized_setup")
    .indexName("vector_index")
    .build();
DescribeIndexResp describeIndexResp = client.describeIndex(describeIndexReq);
System.out.println(JSONObject.toJSON(describeIndexResp));
// Output:
// {
//     "metricType": "COSINE",
//     "indexType": "AUTOINDEX",
//     "fieldName": "vector",
//     "indexName": "vector_index"
// }
// 5. Describe the index
res = await client.describeIndex({
    collection_name: "customized_setup",
    index_name: "vector_index"
})
console.log(JSON.stringify(res.index_descriptions, null, 2))
// Output
//
// [
//   {
//     "params": [
//       {
//         "key": "index_type",
//         "value": "AUTOINDEX"
//       },
//       {
//         "key": "metric_type",
//         "value": "COSINE"
//       }
//     ],
//     "index_name": "vector_index",
//     "indexID": "449007919953063141",
//     "field_name": "vector",
//     "indexed_rows": "0",
//     "total_rows": "0",
//     "state": "Finished",
//     "index_state_fail_reason": "",
//     "pending_index_rows": "0"
//   }
// ]
//
You can check the index file created on a specific field, and collect the statistics on the number of rows indexed using this index file.
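When an index is created with sync=False, the request returns before the build completes, so a client may want to poll the index description until it reports completion. The sketch below shows one way to do that in plain Python. It relies on assumptions: wait_for_index and index_progress are hypothetical helpers, the keys state, indexed_rows, total_rows, and index_state_fail_reason are taken from the Node.js output above (other SDKs may return different keys), and "InProgress" is an assumed name for the not-yet-finished state. A stand-in callable replaces the real describe call so the example runs without a server.

```python
import time

def index_progress(description):
    """Return the fraction of rows already indexed, between 0.0 and 1.0."""
    total = int(description["total_rows"])
    indexed = int(description["indexed_rows"])
    # An empty collection has nothing left to index, so treat it as complete.
    return 1.0 if total == 0 else indexed / total

def wait_for_index(describe, timeout=30.0, interval=0.5):
    """Call describe() repeatedly until the index reports state 'Finished'."""
    deadline = time.monotonic() + timeout
    while True:
        description = describe()
        if description["index_state_fail_reason"]:
            raise RuntimeError(description["index_state_fail_reason"])
        if description["state"] == "Finished":
            return description
        if time.monotonic() >= deadline:
            raise TimeoutError("index build did not finish in time")
        time.sleep(interval)

# Stand-in for a real describe call: first reply is mid-build, second is done.
fake_responses = iter([
    {"state": "InProgress", "indexed_rows": "50", "total_rows": "200",
     "index_state_fail_reason": ""},
    {"state": "Finished", "indexed_rows": "200", "total_rows": "200",
     "index_state_fail_reason": ""},
])
final = wait_for_index(lambda: next(fake_responses), interval=0.0)
print(final["state"], index_progress(final))  # Finished 1.0
```

In real use, the describe argument would wrap the SDK's describe call for the index in question, after adapting the key names to what that SDK returns.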
Drop an Index
You can simply drop an index if it is no longer needed.
Before dropping an index, make sure the collection it belongs to has been released first.
To drop an index in Python, use drop_index().
To drop an index in Java, use dropIndex().
To drop an index in Node.js, use dropIndex().
# 6. Drop index
client.drop_index(
    collection_name="customized_setup",
    index_name="vector_index"
)
import io.milvus.v2.service.index.request.DropIndexReq;
// 6. Drop index
DropIndexReq dropIndexReq = DropIndexReq.builder()
    .collectionName("customized_setup")
    .indexName("vector_index")
    .build();
client.dropIndex(dropIndexReq);
// 6. Drop the index
res = await client.dropIndex({
    collection_name: "customized_setup",
    index_name: "vector_index"
})
console.log(res.error_code)
// Output
//
// Success
//
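The release-then-drop order described in this section can be wrapped in a small helper. The sketch below is a hedged illustration: release_and_drop_index is a hypothetical function, and it assumes that the release step means calling the pymilvus MilvusClient method release_collection(), which this guide does not show. A recording test double stands in for the client so the example runs without a Milvus server.

```python
def release_and_drop_index(client, collection_name, index_name):
    """Release the collection first, then drop the named index."""
    client.release_collection(collection_name=collection_name)
    client.drop_index(collection_name=collection_name, index_name=index_name)

class RecordingClient:
    """Test double that records calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    def release_collection(self, collection_name):
        self.calls.append(("release_collection", collection_name))

    def drop_index(self, collection_name, index_name):
        self.calls.append(("drop_index", collection_name, index_name))

fake_client = RecordingClient()
release_and_drop_index(fake_client, "customized_setup", "vector_index")
print(fake_client.calls)
# [('release_collection', 'customized_setup'),
#  ('drop_index', 'customized_setup', 'vector_index')]
```

Passing a real MilvusClient instance in place of the test double would issue the same two calls against the server.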