Load a Collection

This topic describes how to load a collection into memory before a search or a query. Milvus executes all search and query operations in memory, so a collection must be loaded before it can be searched. You must create an index on the collection before you can load it.

Milvus allows users to load a collection as multiple replicas to utilize the CPU and memory resources of extra query nodes. This feature boosts the overall QPS and throughput without extra hardware. The following constraints apply when loading a collection:

  • The volume of data to load must be under 90% of the total memory of all query nodes, so that memory is reserved for the execution engine.
  • All online query nodes are divided into replica groups according to the replica number that you specify. Every replica group must have at least enough memory to load one replica of the collection. Otherwise, an error is returned.
  • Create an index before loading a collection. To run searches, create at least an IVF_FLAT index on the collection.
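
The memory constraints above can be sketched as two small checks. This is a simplified model, not how Milvus itself decides: the function names are hypothetical, and the round-robin assignment of query nodes to replica groups is an assumption made only for illustration.

```python
from typing import List


def fits_memory_budget(load_bytes: int, node_memory_bytes: List[int]) -> bool:
    """Return True if the data to load is under 90% of total query-node memory."""
    total = sum(node_memory_bytes)
    # Integer arithmetic avoids float rounding at the 90% boundary.
    return load_bytes * 10 < total * 9


def groups_can_hold_replica(replica_bytes: int,
                            node_memory_bytes: List[int],
                            replica_number: int) -> bool:
    """Check that every replica group has room for one replica.

    Assumption: nodes are assigned to groups round-robin. The real
    assignment strategy is not described in this topic.
    """
    if replica_number < 1 or replica_number > len(node_memory_bytes):
        return False
    group_memory = [0] * replica_number
    for i, mem in enumerate(node_memory_bytes):
        group_memory[i % replica_number] += mem
    return all(mem >= replica_bytes for mem in group_memory)
```

For example, with three query nodes of 50 units each and two replicas, the hypothetical split gives groups of 100 and 50 units, so a 40-unit replica fits in both groups while a 60-unit replica does not.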

When interacting with Milvus using Python code, you have the flexibility to choose between PyMilvus and MilvusClient (new). For more information, refer to Python SDK.

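from pymilvus import Collection, utility

# Get an existing collection.
collection = Collection("book")
# Load the collection into memory.
collection.load()

# Check the loading progress and loading status
print(utility.load_state("book"))
# Output: <LoadState: Loaded>

print(utility.loading_progress("book"))
# Output: {'loading_progress': '100%'}
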
await milvusClient.loadCollection({
  collection_name: "book",
});

err := milvusClient.LoadCollection(
  context.Background(),   // ctx
  "book",                 // CollectionName
  false,                  // async
)
if err != nil {
  log.Fatal("failed to load collection:", err.Error())
}

// To get the load status
loadStatus, err := milvusClient.GetLoadState(
  context.Background(),             // ctx
  "book",                           // CollectionName
  []string{"Default partition"},    // List of partitions
)
if err != nil {
  log.Fatal("failed to get the load state", err.Error())
}

// To get the loading progress
percentage, err := milvusClient.GetLoadingProgress(
  context.Background(),           // ctx
  "book",                         // CollectionName
  []string{"Default partition"},  // List of partitions
)
if err != nil {
  log.Fatal("failed to get the loading progress", err.Error())
}

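// You can check the loading status

GetLoadStateParam stateParam = GetLoadStateParam.newBuilder()
    .withCollectionName("book")
    .build();
R<GetLoadStateResponse> stateResponse = client.getLoadState(stateParam);
if (stateResponse.getStatus() != R.Status.Success.getCode()) {
    System.out.println(stateResponse.getMessage());
}

// and loading progress as well

GetLoadingProgressParam progressParam = GetLoadingProgressParam.newBuilder()
    .withCollectionName("book")
    .build();
R<GetLoadingProgressResponse> progressResponse = client.getLoadingProgress(progressParam);
if (progressResponse.getStatus() != R.Status.Success.getCode()) {
    System.out.println(progressResponse.getMessage());
}
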
```shell
load -c book
```

curl -X 'POST' \
  'http://localhost:9091/api/v1/collection/load' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "collection_name": "book"
  }'

Parameter                   Description
partition_name (optional)   Name of the partition to load.
replica_number (optional)   Number of replicas to load.

Parameter         Description
collection_name   Name of the collection to load.

Parameter        Description
ctx              Context to control the API invocation process.
CollectionName   Name of the collection to load.
async            Switch to control sync/async behavior. The context deadline is not applied in a sync load.

Parameter        Description
CollectionName   Name of the collection to load.
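
When a load runs asynchronously, the loading progress check shown earlier in this topic can be wrapped in a polling loop. The following is a minimal sketch: `parse_progress` and `wait_until_loaded` are hypothetical helpers, not part of any SDK, and the progress source is passed in as a callable that is assumed to return a dict shaped like the `loading_progress` output above.

```python
import time
from typing import Callable, Dict


def parse_progress(result: Dict[str, str]) -> int:
    """Turn a result such as {'loading_progress': '100%'} into the integer 100."""
    return int(str(result["loading_progress"]).rstrip("%"))


def wait_until_loaded(get_progress: Callable[[], Dict[str, str]],
                      timeout_s: float = 60.0,
                      interval_s: float = 0.5) -> bool:
    """Poll get_progress until it reports 100% or the timeout expires.

    Returns True if loading finished in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if parse_progress(get_progress()) >= 100:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

With PyMilvus, the callable could be something like `lambda: utility.loading_progress("book")`, assuming that call returns a dict in the form shown in the output above.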

Get replica information

You can check the information of the loaded replicas.

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.load(replica_number=2)    # Load collection as 2 replicas
result = collection.get_replicas()
print(result)                        # Print the replica groups

Below is an example of the output.

Replica groups:
- Group: <group_id:435309823872729305>, <group_nodes:(21, 20)>, <shards:[Shard: <channel_name:milvus-zong-rootcoord-dml_27_435367661874184193v0>, <shard_leader:21>, <shard_nodes:[21]>, Shard: <channel_name:milvus-zong-rootcoord-dml_28_435367661874184193v1>, <shard_leader:20>, <shard_nodes:[20, 21]>]>
- Group: <group_id:435309823872729304>, <group_nodes:(25,)>, <shards:[Shard: <channel_name:milvus-zong-rootcoord-dml_28_435367661874184193v1>, <shard_leader:25>, <shard_nodes:[25]>, Shard: <channel_name:milvus-zong-rootcoord-dml_27_435367661874184193v0>, <shard_leader:25>, <shard_nodes:[25]>]>
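
To make the structure of this output easier to read, the sketch below extracts the group ID, the query nodes, and the shard leaders of each replica group from the printed text. It works only on text in the format shown above. `summarize_replicas` is a hypothetical helper, and the SDK may expose the same fields as object attributes, which this sketch does not rely on.

```python
import re
from typing import Dict, List

# Patterns matched against the printed replica output shown above.
GROUP_RE = re.compile(r"<group_id:(\d+)>, <group_nodes:\(([^)]*)\)>")
LEADER_RE = re.compile(r"<shard_leader:(\d+)>")

SAMPLE = (
    "- Group: <group_id:435309823872729305>, <group_nodes:(21, 20)>, "
    "<shards:[Shard: <channel_name:milvus-zong-rootcoord-dml_27_435367661874184193v0>, "
    "<shard_leader:21>, <shard_nodes:[21]>, "
    "Shard: <channel_name:milvus-zong-rootcoord-dml_28_435367661874184193v1>, "
    "<shard_leader:20>, <shard_nodes:[20, 21]>]>\n"
    "- Group: <group_id:435309823872729304>, <group_nodes:(25,)>, "
    "<shards:[Shard: <channel_name:milvus-zong-rootcoord-dml_28_435367661874184193v1>, "
    "<shard_leader:25>, <shard_nodes:[25]>, "
    "Shard: <channel_name:milvus-zong-rootcoord-dml_27_435367661874184193v0>, "
    "<shard_leader:25>, <shard_nodes:[25]>]>"
)


def summarize_replicas(text: str) -> List[Dict]:
    """Return one summary dict per replica group line in the printed output."""
    summary = []
    for line in text.splitlines():
        match = GROUP_RE.search(line)
        if not match:
            continue  # skip lines that do not describe a group
        nodes = [int(n) for n in match.group(2).split(",") if n.strip()]
        leaders = [int(n) for n in LEADER_RE.findall(line)]
        summary.append({
            "group_id": int(match.group(1)),
            "nodes": nodes,
            "shard_leaders": leaders,
        })
    return summary


for group in summarize_replicas(SAMPLE):
    print(group)
```

In the sample, the first group spans query nodes 21 and 20 with one shard leader on each, while the second group runs entirely on query node 25.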


  • An error is returned when you attempt to load partitions while the parent collection is already loaded. Future releases will support releasing partitions from a loaded collection and then, if needed, loading other partitions.
  • "Load successfully" is returned when you attempt to load a collection that is already loaded.
  • An error is returned when you attempt to load a collection while one or more of its partitions are already loaded. Future releases will support loading the collection when some of its partitions are already loaded.
  • Loading different partitions of the same collection via separate RPCs is not allowed.
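
The rules above can be read as a small decision table. The following sketch models them in plain Python so the outcomes are easy to compare. `check_load` and `LoadError` are hypothetical names, and the behavior for repeating a load of the same partition set is an assumption, since the rules above do not cover that case.

```python
from typing import Iterable, Optional, Set


class LoadError(Exception):
    """Raised when a load request breaks one of the rules listed above."""


def check_load(collection_loaded: bool,
               loaded_partitions: Set[str],
               requested_partitions: Optional[Iterable[str]] = None) -> str:
    """Model the outcome of a load request.

    requested_partitions=None means "load the whole collection".
    """
    if requested_partitions is None:
        if collection_loaded:
            # Loading an already loaded collection succeeds.
            return "Load successfully"
        if loaded_partitions:
            raise LoadError("partitions of this collection are already loaded")
        return "Load successfully"

    requested = set(requested_partitions)
    if collection_loaded:
        raise LoadError("parent collection is already loaded")
    if loaded_partitions and requested != loaded_partitions:
        raise LoadError("different partitions cannot be loaded via separate RPCs")
    # Assumption: a first partition load, or a repeat of the same set, succeeds.
    return "Load successfully"
```

For example, `check_load(True, set())` returns "Load successfully", while `check_load(True, set(), ["p1"])` raises `LoadError` because the parent collection is already loaded.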
