Milvus 2.2 Benchmark Test Report

This report presents the major test results for Milvus 2.2.0. It aims to give a picture of Milvus 2.2.0 search performance, especially its capability to scale up and scale out.

Milvus Performance Evaluation 2023

We have recently run a benchmark against Milvus 2.2.3 and have the following key findings:

  • A 2.5x reduction in search latency
  • A 4.5x increase in QPS
  • Billion-scale similarity search with little performance degradation
  • Linear scalability when using multiple replicas

For details, refer to this whitepaper and the related benchmark test code.

Summary

  • Compared with Milvus 2.1, the QPS of Milvus 2.2.0 increases by over 48% in cluster mode and by over 75% in standalone mode.
  • Milvus 2.2.0 has an impressive capability to scale up and scale out:
    • QPS increases linearly when expanding CPU cores from 8 to 32.
    • QPS increases linearly when expanding Querynode replicas from 1 to 8.

Terminology

Term  Description
nq    Number of vectors to be searched in one search request
topk  Number of the nearest vectors to be retrieved for each vector (in nq) in a search request
ef    A search parameter specific to HNSW index
RT    Response time from sending the request to receiving the response
QPS   Number of search requests that are successfully processed per second
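
To make the RT and QPS terms concrete, the following is a minimal sketch of how such metrics could be derived from raw per-request timings. The function names, the nearest-rank percentile method, and the synthetic sample data are assumptions for illustration; the report does not state how its benchmark tool computes TP50 and TP99.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, failures, duration_s):
    """Derive QPS, RT(TP50), RT(TP99), and fail/s for one measurement window.

    latencies_ms holds the response times of successful requests only,
    so QPS counts successfully processed requests per second.
    """
    return {
        "qps": len(latencies_ms) / duration_s,
        "tp50_ms": percentile(latencies_ms, 50),
        "tp99_ms": percentile(latencies_ms, 99),
        "fail_per_s": failures / duration_s,
    }


# Synthetic data (not from the report): 100 successful requests
# with response times of 1..100 ms, observed over a 2-second window.
latencies = [float(i) for i in range(1, 101)]
stats = summarize(latencies, failures=0, duration_s=2.0)
print(stats)
```

Other percentile definitions (for example, linear interpolation) give slightly different values on small samples, so the method should be fixed before comparing runs.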

Test environment

All tests are performed in the following environment.

Hardware environment

Hardware  Specification
CPU       Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz
Memory    16*32 GB RDIMM, 3200 MT/s
SSD       SATA 6 Gbps

Software environment

Software       Version
Milvus         v2.2.0
Milvus GO SDK  v2.2.0

Deployment scheme

  • Milvus instances (standalone or cluster) are deployed via Helm on a Kubernetes cluster based on physical or virtual machines.
  • Tests differ only in the number of CPU cores, the memory size, and the number of replicas (worker nodes). The number of replicas applies only to Milvus clusters.
  • Any configuration that is not specified uses its default value.
  • Milvus dependencies (MinIO, Pulsar and Etcd) store data on the local SSD in each node.
  • Search requests are sent to the Milvus instances via Milvus GO SDK.

Data sets

The test uses the open-source dataset SIFT (128 dimensions) from ANN-Benchmarks.

Test pipeline

  1. Start a Milvus instance with Helm, using the server configurations listed in each test.
  2. Connect to the Milvus instance via Milvus GO SDK and get the corresponding test results.
  3. Create a collection.
  4. Insert 1 million SIFT vectors. Build an HNSW index and configure the index parameters by setting M to 8 and efConstruction to 200.
  5. Load the collection.
  6. Search at different concurrency levels with the search parameters nq=1, topk=1, ef=64. Each concurrency level runs for at least 1 hour.
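
Step 6 can be sketched as a closed-loop load generator: a fixed number of concurrent workers each send one search request after another until the time limit is reached. The sketch below is an assumption about the general shape of such a driver, not the benchmark tool used for this report. `run_load` and `fake_search` are hypothetical names, and `fake_search` merely sleeps in place of a real SDK search call with nq=1, topk=1, ef=64.

```python
import threading
import time


def run_load(search_fn, concurrency, duration_s):
    """Call search_fn in a closed loop from `concurrency` threads for about duration_s seconds."""
    latencies_ms = []
    failures = 0
    lock = threading.Lock()
    deadline = time.perf_counter() + duration_s

    def worker():
        nonlocal failures
        local_latencies, local_failures = [], 0
        while time.perf_counter() < deadline:
            start = time.perf_counter()
            try:
                search_fn()
                local_latencies.append((time.perf_counter() - start) * 1000)
            except Exception:
                local_failures += 1
        # Merge per-thread results once to keep lock contention out of the hot loop.
        with lock:
            latencies_ms.extend(local_latencies)
            failures += local_failures

    threads = [threading.Thread(target=worker) for _ in range(concurrency)]
    started = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - started

    return {
        "requests": len(latencies_ms),
        "failures": failures,
        "qps": len(latencies_ms) / elapsed,
        "latencies_ms": latencies_ms,
    }


def fake_search():
    """Hypothetical stand-in for one search request; a real test would call the SDK here."""
    time.sleep(0.001)


# Short demo run; the report runs each concurrency level for at least 1 hour.
result = run_load(fake_search, concurrency=4, duration_s=0.2)
print(result["requests"], result["failures"], round(result["qps"]))
```

In a closed-loop design each worker waits for a response before sending the next request, so the concurrency level caps the number of in-flight requests.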

Test results

Milvus 2.2.0 vs. Milvus 2.1.0

Cluster

Server configurations (cluster)

queryNode:
  replicas: 1
  resources:
    limits:
      cpu: "12.0"
      memory: 8Gi
    requests:
      cpu: "12.0"
      memory: 8Gi

Search performance

Milvus  QPS    RT(TP99) / ms  RT(TP50) / ms  fail/s
2.1.0   6904   59             28             0
2.2.0   10248  63             24             0

Figure: Cluster search performance

Standalone

Server configurations (standalone)

standalone:
  replicas: 1
  resources:
    limits:
      cpu: "12.0"
      memory: 16Gi
    requests:
      cpu: "12.0"
      memory: 16Gi

Search performance

Milvus  QPS   RT(TP99) / ms  RT(TP50) / ms  fail/s
2.1.0   4287  104            76             0
2.2.0   7522  127            79             0

Figure: Standalone search performance
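
The gains quoted in the Summary follow directly from the QPS values in the two comparisons above (cluster: 6904 vs. 10248; standalone: 4287 vs. 7522). The short calculation below reproduces them; the helper name `qps_gain_percent` is illustrative only.

```python
def qps_gain_percent(baseline_qps, new_qps):
    """Relative QPS increase of new_qps over baseline_qps, in percent."""
    return (new_qps - baseline_qps) / baseline_qps * 100


# QPS values taken from the search performance results above.
cluster_gain = qps_gain_percent(6904, 10248)     # cluster: 2.1.0 -> 2.2.0
standalone_gain = qps_gain_percent(4287, 7522)   # standalone: 2.1.0 -> 2.2.0

print(f"cluster:    +{cluster_gain:.1f}%")
print(f"standalone: +{standalone_gain:.1f}%")
```

This gives roughly 48.4% for the cluster and 75.5% for standalone, consistent with "over 48%" and "over 75%".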

Milvus 2.2.0 Scale-up

Expand the CPU cores in one Querynode to check the capability to scale up.

Server configurations (cluster)

queryNode:
  replicas: 1
  resources:
    limits:
      cpu: "8.0"  # one of "8.0" / "12.0" / "16.0" / "32.0", depending on the test
      memory: 8Gi
    requests:
      cpu: "8.0"  # same value as limits.cpu
      memory: 8Gi

Search performance

CPU cores  Concurrent Number  QPS    RT(TP99) / ms  RT(TP50) / ms  fail/s
8          500                7153   127            83             0
12         300                10248  63             24             0
16         600                14135  85             42             0
32         600                20281  63             28             0

Figure: Search performance by Querynode CPU cores
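
One way to quantify how closely the scale-up results track linear scaling is to compare the measured speedup with the ideal speedup implied by the added CPU cores. The sketch below does this relative to the 8-core run, using the QPS values above. The efficiency metric and the function name are assumptions for illustration. The runs also use different concurrency levels, so the comparison is approximate.

```python
# (CPU cores, QPS) pairs taken from the scale-up results above.
SCALE_UP = [(8, 7153), (12, 10248), (16, 14135), (32, 20281)]


def scaling_efficiency(points):
    """Return (resources, speedup, efficiency) per point, relative to the first point.

    speedup    = QPS / baseline QPS
    efficiency = speedup / (resources / baseline resources); 1.0 means perfectly linear.
    """
    base_resources, base_qps = points[0]
    rows = []
    for resources, qps in points:
        speedup = qps / base_qps
        ideal = resources / base_resources
        rows.append((resources, speedup, speedup / ideal))
    return rows


rows = scaling_efficiency(SCALE_UP)
for cores, speedup, efficiency in rows:
    print(f"{cores:>2} cores: speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
```

An efficiency close to 1.0 indicates near-linear scaling, and lower values indicate diminishing returns from additional cores.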

Milvus 2.2.0 Scale-out

Add more replicas, backed by more Querynodes, to check the capability to scale out.

Note: the number of Querynodes equals the replica_number when loading the collection.

Server configurations (cluster)

queryNode:
  replicas: 1  # one of 1 / 2 / 4 / 8, depending on the test
  resources:
    limits:
      cpu: "8.0"
      memory: 8Gi
    requests:
      cpu: "8.0"
      memory: 8Gi

Replicas  Concurrent Number  QPS    RT(TP99) / ms  RT(TP50) / ms  fail/s
1         500                7153   127            83             0
2         500                15903  105            27             0
4         800                19281  109            40             0
8         1200               30655  93             38             0

Figure: Search performance by Querynode replicas
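
To summarize the scale-out trend in a single number, a least-squares line can be fitted to the QPS values above as a function of the replica count. The sketch below is one possible analysis under that assumption and is not part of the original test. The slope estimates the QPS gained per additional replica, and R² indicates how well a straight line describes the four points.

```python
# (replicas, QPS) pairs taken from the scale-out results above.
SCALE_OUT = [(1, 7153), (2, 15903), (4, 19281), (8, 30655)]


def linear_fit(points):
    """Ordinary least-squares fit y = slope * x + intercept; returns (slope, intercept, r_squared)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(SCALE_OUT)
print(f"QPS per added replica: {slope:.0f}")
print(f"intercept: {intercept:.0f}")
print(f"R^2: {r_squared:.3f}")
```

With only four data points, measured at different concurrency levels, the fit is a rough indicator rather than a precise model.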

What's next

  • Try running the Milvus 2.2.0 benchmark tests on your own by following this guide, but use Milvus 2.2 and Pymilvus 2.2 instead of the versions it lists.