Home

About Milvus
Get Started
Concepts
User Guide
Models
Administration Guide
Tools
Integrations
Tutorials
FAQs
API Reference

Home
Docs
User Guide
Advanced Features
Partition Key

Use Partition Key

This guide walks you through using the partition key to accelerate data retrieval from your collection.

Overview

You can set a particular field in a collection as the partition key so that Milvus distributes incoming entities into different partitions according to their respective partition values in this field. This allows entities with the same key value to be grouped in a partition, accelerating search performance by avoiding the need to scan irrelevant partitions when filtering by the key field. When compared to traditional filtering methods, the partition key can greatly enhance query performance.

You can use the partition key to implement multi-tenancy. For details on multi-tenancy, read Multi-tenancy for more.

Enable partition key

To set a field as the partition key, specify partition_key_field when creating a collection schema.

In the example code below, num_partitions determines the number of partitions that will be created. By default, it is set to 64. We recommend you retain the default value.

For more information on parameters, refer to MilvusClient, create_schema(), and add_field() in the SDK reference.

For more information on parameters, refer to MilvusClientV2, createSchema(), and addField() in the SDK reference.

For more information on parameters, refer to MilvusClient and createCollection() in the SDK reference.

Python Java Node.js

import random, time
from pymilvus import connections, MilvusClient, DataType

SERVER_ADDR = "http://localhost:19530"

# 1. Set up a Milvus client
client = MilvusClient(
    uri=SERVER_ADDR
)

# 2. Create a collection
schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
    partition_key_field="color",
    num_partitions=64 # Number of partitions. Defaults to 64.
)

schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=5)
schema.add_field(field_name="color", datatype=DataType.VARCHAR, max_length=512)

import io.milvus.v2.client.ConnectConfig;
import io.milvus.v2.client.MilvusClientV2;
import io.milvus.v2.common.DataType;
import io.milvus.v2.common.IndexParam;
import io.milvus.v2.service.collection.request.AddFieldReq;
import io.milvus.v2.service.collection.request.CreateCollectionReq;

String CLUSTER_ENDPOINT = "http://localhost:19530";

// 1. Connect to Milvus server
ConnectConfig connectConfig = ConnectConfig.builder()
    .uri(CLUSTER_ENDPOINT)
    .build();

MilvusClientV2 client = new MilvusClientV2(connectConfig);

// 2. Create a collection in customized setup mode

// 2.1 Create schema
CreateCollectionReq.CollectionSchema schema = client.createSchema();
schema.setEnableDynamicField(true);

// 2.2 Add fields to schema
schema.addField(AddFieldReq.builder()
        .fieldName("id")
        .dataType(DataType.Int64)
        .isPrimaryKey(true)
        .autoID(false)
        .build());

schema.addField(AddFieldReq.builder()
        .fieldName("vector")
        .dataType(DataType.FloatVector)
        .dimension(5)
        .build());

schema.addField(AddFieldReq.builder()
        .fieldName("color")
        .dataType(DataType.VarChar)
        .maxLength(512)
        .isPartitionKey(true)
        .build());

const { MilvusClient, DataType, sleep } = require("@zilliz/milvus2-sdk-node")

const address = "http://localhost:19530"

async function main() {
// 1. Set up a Milvus Client
client = new MilvusClient({address}); 

// 2. Create a collection
// 2.1 Define fields
const fields = [
    {
        name: "id",
        data_type: DataType.Int64,
        is_primary_key: true,
        auto_id: false
    },
    {
        name: "vector",
        data_type: DataType.FloatVector,
        dim: 5
    },
    {
        name: "color",
        data_type: DataType.VarChar,
        max_length: 512,
        is_partition_key: true
    }
]

After you have defined the fields, set up the index parameters.

Python Java Node.js

index_params = MilvusClient.prepare_index_params()

index_params.add_index(
    field_name="id",
    index_type="STL_SORT"
)

index_params.add_index(
    field_name="color",
    index_type="Trie"
)

index_params.add_index(
    field_name="vector",
    index_type="IVF_FLAT",
    metric_type="L2",
    params={"nlist": 1024}
)

// 2.3 Prepare index parameters
Map<String, Object> params = new HashMap<>();
params.put("nlist", 1024);
IndexParam indexParamForVectorField = IndexParam.builder()
        .fieldName("vector")
        .indexType(IndexParam.IndexType.IVF_FLAT)
        .metricType(IndexParam.MetricType.IP)
        .extraParams(params)
        .build();

List<IndexParam> indexParams = new ArrayList<>();
indexParams.add(indexParamForVectorField);

// 2.2 Prepare index parameters
const index_params = [{
    field_name: "color",
    index_type: "Trie"
},{
    field_name: "id",
    index_type: "STL_SORT"
},{
    field_name: "vector",
    index_type: "IVF_FLAT",
    metric_type: "IP",
    params: { nlist: 1024}
}]

Finally, you can create a collection.

Python Java Node.js

client.create_collection(
    collection_name="test_collection",
    schema=schema,
    index_params=index_params
)

// 2.4 Create a collection with schema and index parameters
CreateCollectionReq customizedSetupReq = CreateCollectionReq.builder()
        .collectionName("test_collection")
        .collectionSchema(schema)
        .indexParams(indexParams)
        .build();

client.createCollection(customizedSetupReq);

// 2.3 Create a collection with fields and index parameters
res = await client.createCollection({
    collection_name: "test_collection",
    fields: fields, 
    index_params: index_params,
})

console.log(res.error_code)

// Output
// 
// Success
//

List partitions

Once a field of a collection is used as the partition key, Milvus creates the specified number of partitions and manages them on your behalf. Therefore, you cannot manipulate the partitions in this collection anymore.

The following snippet demonstrates that 64 partitions in a collection once one of its fields is used as the partition key.

Insert data

Once the collection is ready, start inserting data as follows:

Prepare data

Python Java Node.js

# 3. Insert randomly generated vectors 
colors = ["green", "blue", "yellow", "red", "black", "white", "purple", "pink", "orange", "brown", "grey"]
data = []

for i in range(1000):
    current_color = random.choice(colors)
    current_tag = random.randint(1000, 9999)
    data.append({
        "id": i,
        "vector": [ random.uniform(-1, 1) for _ in range(5) ],
        "color": current_color,
        "tag": current_tag,
        "color_tag": f"{current_color}_{str(current_tag)}"
    })

print(data[0])

// 3. Insert randomly generated vectors
List<String> colors = Arrays.asList("green", "blue", "yellow", "red", "black", "white", "purple", "pink", "orange", "brown", "grey");
List<JsonObject> data = new ArrayList<>();

Gson gson = new Gson();
Random rand = new Random();
for (int i=0; i<1000; i++) {
    String current_color = colors.get(rand.nextInt(colors.size()-1));
    int current_tag = rand.nextInt(8999) + 1000;
    JsonObject row = new JsonObject();
    row.addProperty("id", (long) i);
    row.add("vector", gson.toJsonTree(Arrays.asList(rand.nextFloat(), rand.nextFloat(), rand.nextFloat(), rand.nextFloat(), rand.nextFloat())));
    row.addProperty("color", current_color);
    row.addProperty("tag", current_tag);
    row.addProperty("color_tag", current_color + "_" + (rand.nextInt(8999) + 1000));
    data.add(row);
}

System.out.println(data.get(0));

// 3. Insert randomly generated vectors 
const colors = ["green", "blue", "yellow", "red", "black", "white", "purple", "pink", "orange", "brown", "grey"]
var data = []

for (let i = 0; i < 1000; i++) {
    const current_color = colors[Math.floor(Math.random() * colors.length)]
    const current_tag = Math.floor(Math.random() * 8999 + 1000)
    data.push({
        id: i,
        vector: [Math.random(), Math.random(), Math.random(), Math.random(), Math.random()],
        color: current_color,
        tag: current_tag,
        color_tag: `${current_color}_${current_tag}`
    })
}

console.log(data[0])

You can view the structure of the generated data by checking its first entry.

{
    id: 0,
    vector: [
        0.1275656405044483,
        0.47417858592773277,
        0.13858264437643286,
        0.2390904907020377,
        0.8447862593689635
    ],
    color: 'blue',
    tag: 2064,
    color_tag: 'blue_2064'
}

Insert data

Use the insert() method to insert the data into the collection.

Python Java Node.js

res = client.insert(
    collection_name="test_collection",
    data=data
)

print(res)

# Output
#
# {
#     "insert_count": 1000,
#     "ids": [
#         0,
#         1,
#         2,
#         3,
#         4,
#         5,
#         6,
#         7,
#         8,
#         9,
#         "(990 more items hidden)"
#     ]
# }

// 3.1 Insert data into the collection
InsertReq insertReq = InsertReq.builder()
        .collectionName("test_collection")
        .data(data)
        .build();

InsertResp insertResp = client.insert(insertReq);

System.out.println(insertResp.getInsertCnt());

// Output:
// 1000

res = await client.insert({
    collection_name: "test_collection",
    data: data,
})

console.log(res.insert_cnt)

// Output
// 
// 1000
//

Use partition key

Once you have indexed and loaded the collection as well as inserted data, you can conduct a similarity search using the partition key.

For more information on parameters, refer to search() in the SDK reference.

notes

To conduct a similarity search using the partition key, you should include either of the following in the boolean expression of the search request:

expr='<partition_key>=="xxxx"'
expr='<partition_key> in ["xxx", "xxx"]'

Do replace <partition_key> with the name of the field that is designated as the partition key.

Python Java Node.js

# 4. Search with partition key
query_vectors = [[0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]]

res = client.search(
    collection_name="test_collection",
    data=query_vectors,
    filter="color == 'green'",
    search_params={"metric_type": "L2", "params": {"nprobe": 10}},
    output_fields=["id", "color_tag"],
    limit=3
)

print(res)

# Output
#
# [
#     [
#         {
#             "id": 970,
#             "distance": 0.5770174264907837,
#             "entity": {
#                 "id": 970,
#                 "color_tag": "green_9828"
#             }
#         },
#         {
#             "id": 115,
#             "distance": 0.6898155808448792,
#             "entity": {
#                 "id": 115,
#                 "color_tag": "green_4073"
#             }
#         },
#         {
#             "id": 899,
#             "distance": 0.7028976678848267,
#             "entity": {
#                 "id": 899,
#                 "color_tag": "green_9897"
#             }
#         }
#     ]
# ]

// 4. Search with partition key
List<BaseVector> query_vectors = Collections.singletonList(new FloatVec(new float[]{0.3580376395471989f, -0.6023495712049978f, 0.18414012509913835f, -0.26286205330961354f, 0.9029438446296592f}));

SearchReq searchReq = SearchReq.builder()
        .collectionName("test_collection")
        .data(query_vectors)
        .filter("color == \"green\"")
        .topK(3)
        .outputFields(Collections.singletonList("color_tag"))
        .build();

SearchResp searchResp = client.search(searchReq);

List<List<SearchResp.SearchResult>> searchResults = searchResp.getSearchResults();
for (List<SearchResp.SearchResult> results : searchResults) {
    System.out.println("TopK results:");
    for (SearchResp.SearchResult result : results) {
        System.out.println(result);
    }
}

// Output:
// SearchResp.SearchResult(entity={color_tag=green_4945}, score=1.192079, id=542)
// SearchResp.SearchResult(entity={color_tag=green_4633}, score=0.9138917, id=144)
// SearchResp.SearchResult(entity={color_tag=green_8038}, score=0.8381896, id=962)

// 4. Search with partition key
const query_vectors = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]

res = await client.search({
    collection_name: "test_collection",
    data: query_vectors,
    filter: "color == 'green'",
    output_fields: ["color_tag"],
    limit: 3
})

console.log(res.results)

// Output
// 
// [
//   { score: 2.402090549468994, id: '135', color_tag: 'green_2694' },
//   { score: 2.3938629627227783, id: '326', color_tag: 'green_7104' },
//   { score: 2.3235254287719727, id: '801', color_tag: 'green_3162' }
// ]
//

Typical use cases

You can utilize the partition key feature to achieve better search performance and enable multi-tenancy. This can be done by assigning a tenant-specific value as the partition key field for each entity. When searching or querying the collection, you can filter entities by the tenant-specific value by including the partition key field in the boolean expression. This approach ensures data isolation by tenants and avoids scanning unnecessary partitions.

Use Partition Key
Overview
Enable partition key
List partitions
Insert data
Use partition key
Typical use cases

Try Managed Milvus for Free

Zilliz Cloud is hassle-free, powered by Milvus and 10x faster.

Get Started

Feedback

Was this page helpful?

Use Partition Key

Overview

Enable partition key

List partitions

Insert data

Prepare data

Insert data

Use partition key

Typical use cases

Table of contents

Try Managed Milvus for Free

Feedback