milvus-logo
LFAI
Home
  • User Guide

Index Vector Fields

This guide walks you through the basic operations on creating and managing indexes on vector fields in a collection.

Overview

Leveraging the metadata stored in an index file, Milvus organizes your data in a specialized structure, facilitating rapid retrieval of requested information during searches or queries.

Milvus provides several index types and metrics to sort field values for efficient similarity searches. The following table lists the supported index types and metrics for different vector field types. For details, refer to In-memory Index and Similarity Metrics.

Metric Types Index Types
  • Euclidean distance (L2)
  • Inner product (IP)
  • Cosine similarity (COSINE)
  • FLAT
  • IVF_FLAT
  • IVF_SQ8
  • IVF_PQ
  • GPU_IVF_FLAT
  • GPU_IVF_PQ
  • HNSW
  • DISKANN
Metric Types Index Types
  • Jaccard (JACCARD)
  • Hamming (HAMMING)
  • BIN_FLAT
  • BIN_IVF_FLAT
Metric Types Index Types
IP
  • SPARSE_INVERTED_INDEX
  • SPARSE_WAND

It is recommended to create indexes for both the vector field and scalar fields that are frequently accessed.

Preparations

As explained in Manage Collections, Milvus automatically generates an index and loads it into memory when creating a collection if any of the following conditions are specified in the collection creation request:

  • The dimensionality of the vector field and the metric type, or

  • The schema and the index parameters.

The code snippet below repurposes the existing code to establish a connection to a Milvus instance and create a collection without specifying its index parameters. In this case, the collection lacks an index and remains unloaded.

To prepare for indexing, use MilvusClient to connect to the Milvus server and set up a collection by using create_schema(), add_field(), and create_collection().

To prepare for indexing, use MilvusClientV2 to connect to the Milvus server and set up a collection by using createSchema(), addField(), and createCollection().

To prepare for indexing, use MilvusClient to connect to the Milvus server and set up a collection by using createCollection().

from pymilvus import MilvusClient, DataType

# 1. Set up a Milvus client
client = MilvusClient(
    uri="http://localhost:19530"
)

# 2. Create schema
# 2.1. Create schema
schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
)

# 2.2. Add fields to schema
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5)

# 3. Create collection
client.create_collection(
    collection_name="customized_setup", 
    schema=schema, 
)
import io.milvus.v2.client.ConnectConfig;
import io.milvus.v2.client.MilvusClientV2;
import io.milvus.v2.common.DataType;
import io.milvus.v2.service.collection.request.CreateCollectionReq;

String CLUSTER_ENDPOINT = "http://localhost:19530";

// 1. Connect to Milvus server
ConnectConfig connectConfig = ConnectConfig.builder()
    .uri(CLUSTER_ENDPOINT)
    .build();

MilvusClientV2 client = new MilvusClientV2(connectConfig);

// 2. Create a collection

// 2.1 Create schema
CreateCollectionReq.CollectionSchema schema = client.createSchema();

// 2.2 Add fields to schema
schema.addField(AddFieldReq.builder().fieldName("id").dataType(DataType.Int64).isPrimaryKey(true).autoID(false).build());
schema.addField(AddFieldReq.builder().fieldName("vector").dataType(DataType.FloatVector).dimension(5).build());

// 3 Create a collection without schema and index parameters
CreateCollectionReq customizedSetupReq = CreateCollectionReq.builder()
.collectionName("customized_setup")
.collectionSchema(schema)
.build();

client.createCollection(customizedSetupReq);
// 1. Set up a Milvus Client
client = new MilvusClient({address, token});

// 2. Define fields for the collection
const fields = [
    {
        name: "id",
        data_type: DataType.Int64,
        is_primary_key: true,
        autoID: false
    },
    {
        name: "vector",
        data_type: DataType.FloatVector,
        dim: 5
    },
]

// 3. Create a collection
res = await client.createCollection({
    collection_name: "customized_setup",
    fields: fields,
})

console.log(res.error_code)  

// Output
// 
// Success
// 

Index a Collection

To create an index for a collection or index a collection, use prepare_index_params() to prepare index parameters and create_index() to create the index.

To create an index for a collection or index a collection, use IndexParam to prepare index parameters and createIndex() to create the index.

To create an index for a collection or index a collection, use createIndex().

# 4.1. Set up the index parameters
index_params = MilvusClient.prepare_index_params()

# 4.2. Add an index on the vector field.
index_params.add_index(
    field_name="vector",
    metric_type="COSINE",
    index_type="IVF_FLAT",
    index_name="vector_index",
    params={ "nlist": 128 }
)

# 4.3. Create an index file
client.create_index(
    collection_name="customized_setup",
    index_params=index_params,
    sync=False # Whether to wait for index creation to complete before returning. Defaults to True.
)
import io.milvus.v2.common.IndexParam;
import io.milvus.v2.service.index.request.CreateIndexReq;

// 4 Prepare index parameters

// 4.2 Add an index for the vector field "vector"
IndexParam indexParamForVectorField = IndexParam.builder()
    .fieldName("vector")
    .indexName("vector_index")
    .indexType(IndexParam.IndexType.IVF_FLAT)
    .metricType(IndexParam.MetricType.COSINE)
    .extraParams(Map.of("nlist", 128))
    .build();

List<IndexParam> indexParams = new ArrayList<>();
indexParams.add(indexParamForVectorField);

// 4.3 Crate an index file
CreateIndexReq createIndexReq = CreateIndexReq.builder()
    .collectionName("customized_setup")
    .indexParams(indexParams)
    .build();

client.createIndex(createIndexReq);
// 4. Set up index for the collection
// 4.1. Set up the index parameters
res = await client.createIndex({
    collection_name: "customized_setup",
    field_name: "vector",
    index_type: "AUTOINDEX",
    metric_type: "COSINE",   
    index_name: "vector_index",
    params: { "nlist": 128 }
})

console.log(res.error_code)

// Output
// 
// Success
// 
Parameter Description
field_name The name of the target file to apply this object applies.
metric_type The algorithm that is used to measure similarity between vectors. Possible values are IP, L2, COSINE, JACCARD, HAMMING. This is available only when the specified field is a vector field. For more information, refer to Indexes supported in Milvus.
index_type The name of the algorithm used to arrange data in the specific field. For applicable algorithms, refer to In-memory Index and On-disk Index.
index_name The name of the index file generated after this object has been applied.
params The fine-tuning parameters for the specified index type. For details on possible keys and value ranges, refer to In-memory Index.
collection_name The name of an existing collection.
index_params An IndexParams object containing a list of IndexParam objects.
sync Controls how the index is built in relation to the client’s request. Valid values:
  • True (default): The client waits until the index is fully built before it returns. This means you will not get a response until the process is complete.
  • False: The client returns immediately after the request is received and the index is being built in the background. To find out if index creation has been completed, use the describe_index() method.
Parameter Description
fieldName The name of the target field to apply this IndexParam object applies.
indexName The name of the index file generated after this object has been applied.
indexType The name of the algorithm used to arrange data in the specific field. For applicable algorithms, refer to In-memory Index and On-disk Index.
metricType The distance metric to use for the index. Possible values are IP, L2, COSINE, JACCARD, HAMMING.
extraParams Extra index parameters. For details, refer to In-memory Index and On-disk Index.
Parameter Description
collection_name The name of an existing collection.
field_name The name of the field in which to create an index.
index_type The type of the index to create.
metric_type The metric type used to measure vector distance.
index_name The name of the index to create.
params Other index-specific parameters.

notes

Currently, you can create only one index file for each field in a collection.

Check Index Details

Once you have created an index, you can check its details.

To check the index details, use list_indexes() to list the index names and describe_index() to get the index details.

To check the index details, use describeIndex() to get the index details.

To check the index details, use describeIndex() to get the index details.

# 5. Describe index
res = client.list_indexes(
    collection_name="customized_setup"
)

print(res)

# Output
#
# [
#     "vector_index",
# ]

res = client.describe_index(
    collection_name="customized_setup",
    index_name="vector_index"
)

print(res)

# Output
#
# {
#     "index_type": ,
#     "metric_type": "COSINE",
#     "field_name": "vector",
#     "index_name": "vector_index"
# }
import io.milvus.v2.service.index.request.DescribeIndexReq;
import io.milvus.v2.service.index.response.DescribeIndexResp;

// 5. Describe index
// 5.1 List the index names
ListIndexesReq listIndexesReq = ListIndexesReq.builder()
    .collectionName("customized_setup")
    .build();

List<String> indexNames = client.listIndexes(listIndexesReq);

System.out.println(indexNames);

// Output:
// [
//     "vector_index"
// ]

// 5.2 Describe an index
DescribeIndexReq describeIndexReq = DescribeIndexReq.builder()
    .collectionName("customized_setup")
    .indexName("vector_index")
    .build();

DescribeIndexResp describeIndexResp = client.describeIndex(describeIndexReq);

System.out.println(JSONObject.toJSON(describeIndexResp));

// Output:
// {
//     "metricType": "COSINE",
//     "indexType": "AUTOINDEX",
//     "fieldName": "vector",
//     "indexName": "vector_index"
// }
// 5. Describe the index
res = await client.describeIndex({
    collection_name: "customized_setup",
    index_name: "vector_index"
})

console.log(JSON.stringify(res.index_descriptions, null, 2))

// Output
// 
// [
//   {
//     "params": [
//       {
//         "key": "index_type",
//         "value": "AUTOINDEX"
//       },
//       {
//         "key": "metric_type",
//         "value": "COSINE"
//       }
//     ],
//     "index_name": "vector_index",
//     "indexID": "449007919953063141",
//     "field_name": "vector",
//     "indexed_rows": "0",
//     "total_rows": "0",
//     "state": "Finished",
//     "index_state_fail_reason": "",
//     "pending_index_rows": "0"
//   }
// ]
// 

You can check the index file created on a specific field, and collect the statistics on the number of rows indexed using this index file.

Drop an Index

You can simply drop an index if it is no longer needed.

Before dropping an index, make sure it has been released first.

To drop an index, use drop_index().

To drop an index, use dropIndex().

To drop an index, use dropIndex().

# 6. Drop index
client.drop_index(
    collection_name="customized_setup",
    index_name="vector_index"
)
// 6. Drop index

DropIndexReq dropIndexReq = DropIndexReq.builder()
    .collectionName("customized_setup")
    .indexName("vector_index")
    .build();

client.dropIndex(dropIndexReq);
// 6. Drop the index
res = await client.dropIndex({
    collection_name: "customized_setup",
    index_name: "vector_index"
})

console.log(res.error_code)

// Output
// 
// Success
// 
Feedback

Was this page helpful?