Milvus Metrics Dashboard

Milvus outputs a list of detailed time-series metrics during runtime. You can use Prometheus and Grafana to visualize the metrics. This topic introduces the monitoring metrics displayed in the Grafana Milvus Dashboard.

Latency values in this topic are in milliseconds. The “99th percentile” of a latency is the value that 99 percent of the observed latencies do not exceed.
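To make the percentile definition concrete, the following minimal sketch computes a 99th percentile from raw samples with the nearest-rank method. The function name and the sample latencies are hypothetical and chosen only for illustration. Prometheus itself estimates quantiles from histogram buckets rather than from raw samples, so dashboard values are approximations of this idea.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the smallest sample that at least pct percent of samples do not exceed."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # 1-based nearest rank; multiply before dividing to keep the arithmetic exact here.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: 99 fast requests and one slow outlier.
latencies_ms = [10] * 99 + [500]
p99 = percentile_nearest_rank(latencies_ms, 99)
worst = percentile_nearest_rank(latencies_ms, 100)
```

In this example the single 500 ms outlier does not affect the 99th percentile, which is why the panels show both the average and the p99.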

We recommend reading Milvus monitoring framework overview to understand Prometheus metrics first.

Proxy

Panel	Panel description	PromQL (Prometheus query language)	The Milvus metrics used	Milvus metrics description
Search Vector Count Rate	The average number of vectors queried per second by each proxy within the past two minutes.	`sum(increase(milvus_proxy_search_vectors_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id)`	`milvus_proxy_search_vectors_count`	The accumulated number of vectors queried.
Insert Vector Count Rate	The average number of vectors inserted per second by each proxy within the past two minutes.	`sum(increase(milvus_proxy_insert_vectors_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id)`	`milvus_proxy_insert_vectors_count`	The accumulated number of vectors inserted.
Search Latency	The average latency and the 99th percentile of the latency of receiving search and query requests by each proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_sq_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)`	`milvus_proxy_sq_latency`	The latency of search and query requests.
Collection Search Latency	The average latency and the 99th percentile of the latency of receiving search and query requests to a specific collection by each proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_collection_sq_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])))` avg: `sum(increase(milvus_proxy_collection_sq_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_collection_sq_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])) by (pod, node_id, query_type)`	`milvus_proxy_collection_sq_latency`	The latency of search and query requests to a specific collection.
Mutation Latency	The average latency and the 99th percentile of the latency of receiving mutation requests by each proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, msg_type, pod, node_id) (rate(milvus_proxy_mutation_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_mutation_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type) / sum(increase(milvus_proxy_mutation_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type)`	`milvus_proxy_mutation_latency`	The latency of mutation requests.
Collection Mutation Latency	The average latency and the 99th percentile of the latency of receiving mutation requests to a specific collection by each proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, msg_type, pod, node_id) (rate(milvus_proxy_collection_mutation_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])))` avg: `sum(increase(milvus_proxy_collection_mutation_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])) by (pod, node_id, msg_type) / sum(increase(milvus_proxy_collection_mutation_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])) by (pod, node_id, msg_type)`	`milvus_proxy_collection_mutation_latency`	The latency of mutation requests to a specific collection.
Wait Search Result Latency	The average latency and the 99th percentile of the latency between sending search and query requests and receiving results by proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_wait_result_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_sq_wait_result_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_wait_result_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)`	`milvus_proxy_sq_wait_result_latency`	The latency between sending search and query requests and receiving results.
Reduce Search Result Latency	The average latency and the 99th percentile of the latency of aggregating search and query results by proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_reduce_result_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_sq_reduce_result_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_reduce_result_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)`	`milvus_proxy_sq_reduce_result_latency`	The latency of aggregating search and query results returned by each query node.
Decode Search Result Latency	The average latency and the 99th percentile of the latency of decoding search and query results by proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_decode_result_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_sq_decode_result_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_decode_result_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)`	`milvus_proxy_sq_decode_result_latency`	The latency of decoding each search and query result.
Msg Stream Object Num	The average, maximum, and minimum number of the msgstream objects created by each proxy on its corresponding physical topic within the past two minutes.	`avg(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) max(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) min(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_proxy_msgstream_obj_num`	The number of msgstream objects created on each physical topic.
Mutation Send Latency	The average latency and the 99th percentile of the latency of sending insertion or deletion requests by each proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, msg_type, pod, node_id) (rate(milvus_proxy_mutation_send_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_mutation_send_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type) / sum(increase(milvus_proxy_mutation_send_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type)`	`milvus_proxy_mutation_send_latency`	The latency of sending insertion or deletion requests.
Cache Hit Rate	The average cache hit rate of operations including `GetCollectionID`, `GetCollectionInfo`, and `GetCollectionSchema` per second within the past two minutes.	`sum(increase(milvus_proxy_cache_hit_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", cache_state="hit"}[2m])/120) by(cache_name, pod, node_id) / sum(increase(milvus_proxy_cache_hit_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(cache_name, pod, node_id)`	`milvus_proxy_cache_hit_count`	The statistics of hit and failure rate of each cache reading operation.
Cache Update Latency	The average latency and the 99th percentile of cache update latency by proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_cache_update_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_cache_update_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_cache_update_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)`	`milvus_proxy_cache_update_latency`	The latency of updating cache each time.
Sync Time	The average, maximum, and minimum number of epoch time synced by each proxy in its corresponding physical channel.	`avg(milvus_proxy_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) max(milvus_proxy_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) min(milvus_proxy_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_proxy_sync_epoch_time`	Each physical channel’s epoch time (Unix time, the milliseconds passed ever since January 1, 1970). There is a default `ChannelName` apart from the physical channels.
Apply PK Latency	The average latency and the 99th percentile of primary key application latency by each proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_apply_pk_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_apply_pk_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_apply_pk_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)`	`milvus_proxy_apply_pk_latency`	The latency of applying primary key.
Apply Timestamp Latency	The average latency and the 99th percentile of timestamp application latency by each proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_apply_timestamp_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_apply_timestamp_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_apply_timestamp_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)`	`milvus_proxy_apply_timestamp_latency`	The latency of applying timestamp.
Request Success Rate	The number of successful requests received per second by each proxy within the past two minutes, with a detailed breakdown of each request type. Possible request types are DescribeCollection, DescribeIndex, GetCollectionStatistics, HasCollection, Search, Query, ShowPartitions, Insert, etc.	`sum(increase(milvus_proxy_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", status="success"}[2m])/120) by(function_name, pod, node_id)`	`milvus_proxy_req_count`	The number of received requests of all types.
Request Failed Rate	The number of failed requests received per second by each proxy within the past two minutes, with a detailed breakdown of each request type. Possible request types are DescribeCollection, DescribeIndex, GetCollectionStatistics, HasCollection, Search, Query, ShowPartitions, Insert, etc.	`sum(increase(milvus_proxy_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", status="fail"}[2m])/120) by(function_name, pod, node_id)`	`milvus_proxy_req_count`	The number of received requests of all types.
Request Latency	The average latency and the 99th percentile of the latency of all types of received requests by each proxy within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id, function_name) (rate(milvus_proxy_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_proxy_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, function_name) / sum(increase(milvus_proxy_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, function_name)`	`milvus_proxy_req_latency`	The latency of received requests of all types.
Insert/Delete Request Byte Rate	The number of bytes of insert and delete requests received per second by proxy within the past two minutes.	`sum(increase(milvus_proxy_receive_bytes_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(pod, node_id)`	`milvus_proxy_receive_bytes_count`	The number of bytes of insert and delete requests received.
Send Byte Rate	The number of bytes per second sent back to the client while each proxy is responding to search and query requests within the past two minutes.	`sum(increase(milvus_proxy_send_bytes_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(pod, node_id)`	`milvus_proxy_send_bytes_count`	The number of bytes sent back to the client while each proxy is responding to search and query requests.
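Most panels in the table above follow one of two patterns: a per-second rate (`increase(...[2m])/120`) or an average latency (`increase(..._sum[2m]) / increase(..._count[2m])`). The following sketch reproduces that arithmetic on two samples of a counter taken two minutes apart. The function names and sample values are hypothetical, and the sketch deliberately ignores the counter-reset handling and extrapolation that Prometheus applies in `increase`.

```python
WINDOW_SECONDS = 120  # matches the [2m] range used by the dashboard queries


def per_second_rate(counter_start, counter_end, window_seconds=WINDOW_SECONDS):
    """Mirror increase(counter[2m]) / 120: counter growth divided by the window length."""
    return (counter_end - counter_start) / window_seconds


def average_latency(sum_start, sum_end, count_start, count_end):
    """Mirror increase(x_sum[2m]) / increase(x_count[2m]) for a histogram metric."""
    observations = count_end - count_start
    if observations == 0:
        # No requests were observed in the window, so no average can be computed.
        return None
    return (sum_end - sum_start) / observations


# Hypothetical samples: the vector counter grows from 6000 to 8400 in two minutes.
vectors_per_second = per_second_rate(6000, 8400)

# Hypothetical samples: 150 requests add 3000 ms of total latency in two minutes.
avg_latency_ms = average_latency(1500.0, 4500.0, 100, 250)
```

With these numbers the rate is 20 vectors per second and the average latency is 20 ms per request.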

Root coordinator

Panel	Panel description	PromQL (Prometheus query language)	The Milvus metrics used	Milvus metrics description
Proxy Node Num	The number of proxies created.	`sum(milvus_rootcoord_proxy_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_rootcoord_proxy_num`	The number of proxies.
Sync Time	The average, maximum, and minimum number of epoch time synced by each root coord in each physical channel (PChannel).	`avg(milvus_rootcoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) max(milvus_rootcoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) min(milvus_rootcoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_rootcoord_sync_epoch_time`	Each physical channel’s epoch time (Unix time, the milliseconds passed ever since January 1, 1970).
DDL Request Rate	The status and number of DDL requests per second within the past two minutes.	`sum(increase(milvus_rootcoord_ddl_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, function_name)`	`milvus_rootcoord_ddl_req_count`	The total number of DDL requests including `CreateCollection`, `DescribeCollection`, `DescribeSegments`, `HasCollection`, `ShowCollections`, `ShowPartitions`, and `ShowSegments`.
DDL Request Latency	The average latency and the 99th percentile of DDL request latency within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, function_name) (rate(milvus_rootcoord_ddl_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_rootcoord_ddl_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name) / sum(increase(milvus_rootcoord_ddl_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name)`	`milvus_rootcoord_ddl_req_latency`	The latency of all types of DDL requests.
Sync Timetick Latency	The average latency and the 99th percentile of the time used by root coord to sync all timestamps to PChannel within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le) (rate(milvus_rootcoord_sync_timetick_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_rootcoord_sync_timetick_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_rootcoord_sync_timetick_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))`	`milvus_rootcoord_sync_timetick_latency`	The time used by root coord to sync all timestamps to PChannel.
ID Alloc Rate	The number of IDs assigned by root coord per second within the past two minutes.	`sum(increase(milvus_rootcoord_id_alloc_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120)`	`milvus_rootcoord_id_alloc_count`	The accumulated number of IDs assigned by root coord.
Timestamp	The latest timestamp of root coord.	`milvus_rootcoord_timestamp{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}`	`milvus_rootcoord_timestamp`	The latest timestamp of root coord.
Timestamp Saved	The pre-assigned timestamps that root coord saves in meta storage.	`milvus_rootcoord_timestamp_saved{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}`	`milvus_rootcoord_timestamp_saved`	The pre-assigned timestamps that root coord saves in meta storage. The timestamps are assigned 3 seconds earlier, and the timestamp is updated and saved in meta storage every 50 milliseconds.
Collection Num	The total number of collections.	`sum(milvus_rootcoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_rootcoord_collection_num`	The total number of collections existing in Milvus currently.
Partition Num	The total number of partitions.	`sum(milvus_rootcoord_partition_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_rootcoord_partition_num`	The total number of partitions existing in Milvus currently.
DML Channel Num	The total number of DML channels.	`sum(milvus_rootcoord_dml_channel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_rootcoord_dml_channel_num`	The total number of DML channels existing in Milvus currently.
Msgstream Num	The total number of msgstreams.	`sum(milvus_rootcoord_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_rootcoord_msgstream_obj_num`	The total number of msgstreams in Milvus currently.
Credential Num	The total number of credentials.	`sum(milvus_rootcoord_credential_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_rootcoord_credential_num`	The total number of credentials in Milvus currently.
Time Tick Delay	The sum of the maximum time tick delay of the flow graphs on all DataNodes and QueryNodes.	`sum(milvus_rootcoord_time_tick_delay{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_rootcoord_time_tick_delay`	The maximum time tick delay of the flow graphs on each DataNode and QueryNode.
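The Sync Time panel reports Unix time in milliseconds, which is hard to read as a raw number. The sketch below converts such a value to a UTC datetime and computes how far it lags behind a reference time. The helper names and the sample value are hypothetical, and comparing the synced time with the current time is only one possible way to read this panel.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(epoch_ms):
    """Convert Unix time in milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)


def lag_ms(sync_epoch_ms, now_epoch_ms):
    """Return how many milliseconds the synced time is behind the reference time."""
    return now_epoch_ms - sync_epoch_ms


# Hypothetical metric value, as it might be read from milvus_rootcoord_sync_epoch_time.
sample_epoch_ms = 1_700_000_000_000
readable = epoch_ms_to_utc(sample_epoch_ms).isoformat()

# Hypothetical reference time that is 250 ms later than the synced time.
delay = lag_ms(sample_epoch_ms, sample_epoch_ms + 250)
```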

Query coordinator

Panel	Panel description	PromQL (Prometheus query language)	The Milvus metrics used	Milvus metrics description
Collection Loaded Num	The number of collections that are currently loaded into memory.	`sum(milvus_querycoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_querycoord_collection_num`	The number of collections that are currently loaded by Milvus.
Entity Loaded Num	The number of entities that are currently loaded into memory.	`sum(milvus_querycoord_entity_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_querycoord_entity_num`	The number of entities that are currently loaded by Milvus.
Load Request Rate	The number of load requests per second within the past two minutes.	`sum(increase(milvus_querycoord_load_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status)`	`milvus_querycoord_load_req_count`	The accumulated number of load requests.
Release Request Rate	The number of release requests per second within the past two minutes.	`sum(increase(milvus_querycoord_release_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status)`	`milvus_querycoord_release_req_count`	The accumulated number of release requests.
Load Request Latency	The average latency and the 99th percentile of load request latency within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_load_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querycoord_load_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_load_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))`	`milvus_querycoord_load_latency`	The time used to complete a load request.
Release Request Latency	The average latency and the 99th percentile of release request latency within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_release_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querycoord_release_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_release_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))`	`milvus_querycoord_release_latency`	The time used to complete a release request.
Sub-Load Task	The number of sub load tasks.	`sum(milvus_querycoord_child_task_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_querycoord_child_task_num`	The number of sub load tasks. A query coord splits a load request into multiple sub load tasks.
Parent Load Task	The number of parent load tasks.	`sum(milvus_querycoord_parent_task_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_querycoord_parent_task_num`	The number of parent load tasks. Each load request corresponds to a parent task in the task queue.
Sub-Load Task Latency	The average latency and the 99th percentile of the latency of a sub load task within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_child_task_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querycoord_child_task_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_child_task_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))`	`milvus_querycoord_child_task_latency`	The latency to complete a sub load task.
Query Node Num	The number of query nodes managed by query coord.	`sum(milvus_querycoord_querynode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_querycoord_querynode_num`	The number of query nodes managed by query coord.
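The p99 expressions in the table above apply `histogram_quantile` to `_bucket` series, which hold cumulative counts per upper bound (`le`). The following sketch shows the general idea of estimating a quantile from such buckets by linear interpolation inside the bucket that contains the target rank. It is a simplified illustration and not the Prometheus implementation; the function name and the bucket values are hypothetical.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs sorted by
    upper_bound, and the last pair must use math.inf as its upper bound.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in the range (0, 1]")
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("the last bucket must have an upper bound of +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, so there is nothing to estimate
    rank = q * total
    # Find the first bucket whose cumulative count reaches the target rank.
    for index, (upper, cumulative) in enumerate(buckets):
        if cumulative >= rank:
            break
    if upper == math.inf:
        # The rank falls in the +Inf bucket: fall back to the highest finite bound.
        return buckets[-2][0] if len(buckets) > 1 else math.nan
    lower = buckets[index - 1][0] if index > 0 else 0.0
    below = buckets[index - 1][1] if index > 0 else 0
    in_bucket = cumulative - below
    # Assume observations are spread evenly between the bucket bounds.
    return lower + (upper - lower) * (rank - below) / in_bucket


# Hypothetical load-latency buckets in milliseconds: (le, cumulative count).
load_latency_buckets = [(100, 50), (500, 90), (1000, 100), (math.inf, 100)]
p99_estimate = estimate_quantile(0.99, load_latency_buckets)
p50_estimate = estimate_quantile(0.5, load_latency_buckets)
```

Here the 99th percentile lands in the 500 to 1000 ms bucket and is estimated at roughly 950 ms. Because the result is interpolated, its accuracy depends on how finely the bucket bounds are spaced.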

Query node

Panel	Panel description	PromQL (Prometheus query language)	The Milvus metrics used	Milvus metrics description
Collection Loaded Num	The number of collections loaded into memory by each query node.	`sum(milvus_querynode_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_collection_num`	The number of collections loaded by each query node.
Partition Loaded Num	The number of partitions loaded into memory by each query node.	`sum(milvus_querynode_partition_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_partition_num`	The number of partitions loaded by each query node.
Segment Loaded Num	The number of segments loaded into memory by each query node.	`sum(milvus_querynode_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_segment_num`	The number of segments loaded by each query node.
Queryable Entity Num	The number of queryable and searchable entities on each query node.	`sum(milvus_querynode_entity_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_entity_num`	The number of queryable and searchable entities on each query node.
DML Virtual Channel	The number of DML virtual channels watched by each query node.	`sum(milvus_querynode_dml_vchannel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_dml_vchannel_num`	The number of DML virtual channels watched by each query node.
Delta Virtual Channel	The number of delta channels watched by each query node.	`sum(milvus_querynode_delta_vchannel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_delta_vchannel_num`	The number of delta channels watched by each query node.
Consumer Num	The number of consumers in each query node.	`sum(milvus_querynode_consumer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_consumer_num`	The number of consumers in each query node.
Search Request Rate	The total number of search and query requests received per second by each query node and the number of successful search and query requests within the past two minutes.	`sum(increase(milvus_querynode_sq_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (query_type, status, pod, node_id)`	`milvus_querynode_sq_req_count`	The accumulated number of search and query requests.
Search Request Latency	The average latency and the 99th percentile of the time used in search and query requests by each query node within the past two minutes. This panel displays the latency of search and query requests whose status is “success” or “total”.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_sq_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_sq_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)`	`milvus_querynode_sq_req_latency`	The search request latency of query node.
Search in Queue Latency	The average latency and the 99th percentile of the latency of search and query requests in queue within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id, query_type) (rate(milvus_querynode_sq_queue_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_sq_queue_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_queue_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)`	`milvus_querynode_sq_queue_latency`	The latency of the search and query requests received by query node.
Search Segment Latency	The average latency and the 99th percentile of the time each query node takes to search and query a segment within the past two minutes. The status of a segment can be sealed or growing.	p99: `histogram_quantile(0.99, sum by (le, query_type, segment_state, pod, node_id) (rate(milvus_querynode_sq_segment_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_sq_segment_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type, segment_state) / sum(increase(milvus_querynode_sq_segment_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type, segment_state)`	`milvus_querynode_sq_segment_latency`	The time each query node takes to search and query each segment.
Segcore Request Latency	The average latency and the 99th percentile of the time each query node takes to search and query in segcore within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_querynode_sq_core_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_sq_core_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_core_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)`	`milvus_querynode_sq_core_latency`	The time each query node takes to search and query in segcore.
Search Reduce Latency	The average latency and the 99th percentile of the time used by each query node during the reduce stage of a search or query within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id, query_type) (rate(milvus_querynode_sq_reduce_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_sq_reduce_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_reduce_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)`	`milvus_querynode_sq_reduce_latency`	The time each query node spends in the reduce stage of a search or query.
Load Segment Latency	The average latency and the 99th percentile of the time each query node takes to load a segment within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_load_segment_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_load_segment_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_load_segment_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_querynode_load_segment_latency`	The time each query node takes to load a segment.
Flowgraph Num	The number of flowgraphs in each query node.	`sum(milvus_querynode_flowgraph_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_flowgraph_num`	The number of flowgraphs in each query node.
Unsolved Read Task Length	The length of the queue of unsolved read requests in each query node.	`sum(milvus_querynode_read_task_unsolved_len{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_read_task_unsolved_len`	The length of the queue of unsolved read requests.
Ready Read Task Length	The length of the queue of read requests to be executed in each query node.	`sum(milvus_querynode_read_task_ready_len{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_read_task_ready_len`	The length of the queue of read requests to be executed.
Parallel Read Task Num	The number of concurrent read requests currently executed in each query node.	`sum(milvus_querynode_read_task_concurrency{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_read_task_concurrency`	The number of concurrent read requests currently executed.
Estimate CPU Usage	The CPU usage by each query node estimated by the scheduler.	`sum(milvus_querynode_estimate_cpu_usage{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_querynode_estimate_cpu_usage`	The CPU usage by each query node estimated by the scheduler. When the value is 100, this means a whole virtual CPU (vCPU) is used.
Search Group Size	The average number and the 99th percentile of the search group size (i.e., the total number of original search requests in the combined search requests executed by each query node) within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_size_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_search_group_size_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_size_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_querynode_search_group_size`	The number of original search tasks among the combined search tasks from different buckets (i.e., the search group size).
Search NQ	The average number and the 99th percentile of the number of queries (NQ) done while each query node executes search requests within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_nq_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_search_nq_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_nq_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_querynode_search_nq`	The number of queries (NQ) of search requests.
Search Group NQ	The average number and the 99th percentile of NQ of search requests combined and executed by each query node within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_nq_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_search_group_nq_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_nq_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_querynode_search_group_nq`	The NQ of search requests combined from different buckets.
Search Top_K	The average number and the 99th percentile of the `Top_K` of search requests executed by each query node within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_topk_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_search_topk_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_topk_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_querynode_search_topk`	The `Top_K` of search requests.
Search Group Top_K	The average number and the 99th percentile of the `Top_K` of search requests combined and executed by each query node within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_topk_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_querynode_search_group_topk_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_topk_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_querynode_search_group_topk`	The `Top_K` of search requests combined from different buckets.
Evicted Read Requests Rate	The number of read requests evicted per second by each query node within the past two minutes.	`sum(increase(milvus_querynode_read_evicted_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id)`	`milvus_querynode_read_evicted_count`	The accumulated number of read requests evicted by query node due to traffic restriction.
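
The p99 and avg expressions in the latency panels above follow the general Prometheus histogram pattern: a quantile is estimated from cumulative bucket counts, and the average is the increase of `_sum` divided by the increase of `_count`. The sketch below illustrates that arithmetic in Python. It is a simplified approximation of what `histogram_quantile` does (linear interpolation inside the matching bucket), not the Prometheus implementation itself, and the function names and sample bucket values are hypothetical.

```python
import math
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Assumes buckets are sorted by upper bound and end with the +Inf bucket,
    mirroring the `le` label of a Prometheus histogram.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # The quantile lies beyond the last finite bucket (or the
                # bucket is empty): fall back to the previous upper bound.
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


def average_latency(sum_increase: float, count_increase: float) -> float:
    """Average = increase of `_sum` divided by increase of `_count`."""
    if count_increase == 0:
        return math.nan
    return sum_increase / count_increase


# Hypothetical bucket increases over a two-minute window (latency in ms).
sample_buckets = [(10.0, 50.0), (50.0, 90.0), (100.0, 99.0), (math.inf, 100.0)]
p99 = estimate_quantile(0.99, sample_buckets)
median = estimate_quantile(0.5, sample_buckets)
avg = average_latency(2400.0, 100.0)
```

Because the estimate interpolates within a bucket, its precision depends on how finely the bucket boundaries are spaced around the value of interest.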

Data coordinator

Panel	Panel description	PromQL (Prometheus query language)	The Milvus metrics used	Milvus metrics description
Data Node Num	The number of data nodes managed by data coord.	`sum(milvus_datacoord_datanode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_datacoord_datanode_num`	The number of data nodes managed by data coord.
Segment Num	The number of all types of segments recorded in metadata by data coord.	`sum(milvus_datacoord_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (segment_state)`	`milvus_datacoord_segment_num`	The number of all types of segments recorded in metadata by data coord. Types of segment include: dropped, flushed, flushing, growing, and sealed.
Collection Num	The number of collections recorded in metadata by data coord.	`sum(milvus_datacoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_datacoord_collection_num`	The number of collections recorded in metadata by data coord.
Stored Rows	The accumulated number of rows of valid and flushed data in data coord.	`sum(milvus_datacoord_stored_rows_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_datacoord_stored_rows_num`	The accumulated number of rows of valid and flushed data in data coord.
Stored Rows Rate	The average number of rows flushed per second within the past two minutes.	`sum(increase(milvus_datacoord_stored_rows_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id)`	`milvus_datacoord_stored_rows_count`	The accumulated number of rows flushed by data coord.
Sync Time	The average, maximum, and minimum epoch time synced by data coord in each physical channel.	avg: `avg(milvus_datacoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)` max: `max(milvus_datacoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)` min: `min(milvus_datacoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_datacoord_sync_epoch_time`	Each physical channel's epoch time (Unix time, the number of milliseconds elapsed since January 1, 1970).
Stored Binlog Size	The total size of stored binlog.	`sum(milvus_datacoord_stored_binlog_size{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_datacoord_stored_binlog_size`	The total size of binlog stored in Milvus.
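
The Sync Time panel reports a raw epoch timestamp in milliseconds, which is hard to read at a glance. One way to interpret it, shown in the sketch below, is to subtract it from the current time to get how far behind "now" the last sync is. Treating this difference as a lag indicator is an assumption made for illustration, and the function name is hypothetical.

```python
import time
from typing import Optional


def sync_lag_ms(sync_epoch_ms: float, now_ms: Optional[float] = None) -> float:
    """Return how many milliseconds the synced epoch time trails the clock.

    `sync_epoch_ms` is a value such as the one exposed by
    `milvus_datacoord_sync_epoch_time` (Unix time in milliseconds).
    `now_ms` can be passed explicitly to make the result reproducible.
    """
    if now_ms is None:
        now_ms = time.time() * 1000.0
    return now_ms - sync_epoch_ms


# Hypothetical sample: the channel last synced 2.5 seconds before "now".
lag = sync_lag_ms(1_700_000_000_000, now_ms=1_700_000_002_500)
```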

Data node

Panel	Panel description	PromQL (Prometheus query language)	The Milvus metrics used	Milvus metrics description
Flowgraph Num	The number of flowgraph objects that correspond to each data node.	`sum(milvus_datanode_flowgraph_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_datanode_flowgraph_num`	The number of flowgraph objects. Each shard in a collection corresponds to a flowgraph object.
Msg Rows Consume Rate	The number of rows of streaming messages consumed per second by each data node within the past two minutes.	`sum(increase(milvus_datanode_msg_rows_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (msg_type, pod, node_id)`	`milvus_datanode_msg_rows_count`	The number of rows of streaming messages consumed. Currently, streaming messages counted by data node only include insertion and deletion messages.
Flush Data Size Rate	The size of data flushed per second by each data node within the past two minutes.	`sum(increase(milvus_datanode_flushed_data_size{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (msg_type, pod, node_id)`	`milvus_datanode_flushed_data_size`	The size of each flushed message. Currently, streaming messages counted by data node only include insertion and deletion messages.
Consumer Num	The number of consumers created on each data node.	`sum(milvus_datanode_consumer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_datanode_consumer_num`	The number of consumers created on each data node. Each flowgraph corresponds to a consumer.
Producer Num	The number of producers created on each data node.	`sum(milvus_datanode_producer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_datanode_producer_num`	The number of producers created on each data node. Each shard in a collection corresponds to a delta channel producer and a timetick channel producer.
Sync Time	The average, maximum, and minimum epoch time synced by each data node in all physical topics.	avg: `avg(milvus_datanode_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)` max: `max(milvus_datanode_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)` min: `min(milvus_datanode_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_datanode_sync_epoch_time`	The epoch time (Unix time, the number of milliseconds elapsed since January 1, 1970) of each physical topic on a data node.
Unflushed Segment Num	The number of unflushed segments created on each data node.	`sum(milvus_datanode_unflushed_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)`	`milvus_datanode_unflushed_segment_num`	The number of unflushed segments created on each data node.
Encode Buffer Latency	The average latency and the 99th percentile of the time used to encode a buffer by each data node within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_encode_buffer_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_datanode_encode_buffer_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_encode_buffer_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_datanode_encode_buffer_latency`	The time each data node takes to encode a buffer.
Save Data Latency	The average latency and the 99th percentile of the time used to write a buffer into the storage layer by each data node within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_save_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_datanode_save_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_save_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_datanode_save_latency`	The time each data node takes to write a buffer into the storage layer.
Flush Operate Rate	The number of times each data node flushes a buffer per second within the past two minutes.	`sum(increase(milvus_datanode_flush_buffer_op_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id)`	`milvus_datanode_flush_buffer_op_count`	The accumulated number of times a data node flushes a buffer.
Autoflush Operate Rate	The number of times each data node auto-flushes a buffer per second within the past two minutes.	`sum(increase(milvus_datanode_autoflush_buffer_op_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id)`	`milvus_datanode_autoflush_buffer_op_count`	The accumulated number of times a data node auto-flushes a buffer.
Flush Request Rate	The number of times each data node receives a buffer flush request per second within the past two minutes.	`sum(increase(milvus_datanode_flush_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id)`	`milvus_datanode_flush_req_count`	The accumulated number of times a data node receives a flush request from a data coord.
Compaction Latency	The average latency and the 99th percentile of the time each data node takes to execute a compaction task within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_compaction_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_datanode_compaction_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_compaction_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_datanode_compaction_latency`	The time each data node takes to execute a compaction task.
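
The rate panels above (for example Flush Operate Rate) divide the increase of an accumulated counter over a two-minute window by 120 to obtain a per-second value. The following Python sketch shows that calculation on a list of raw counter samples. It is a simplification: it treats a drop in value as a counter reset but omits the window-boundary extrapolation that Prometheus applies in `increase()`. The function names and sample values are hypothetical.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Sum the growth between consecutive samples of an accumulated counter.

    A sample lower than its predecessor is treated as a counter reset
    (for example after a pod restart), so its full value counts as growth.
    """
    increase = 0.0
    for prev, cur in zip(samples, samples[1:]):
        increase += cur - prev if cur >= prev else cur
    return increase


def per_second_rate(samples: Sequence[float], window_seconds: float = 120.0) -> float:
    """Average per-second rate over the window, as in `increase(x[2m]) / 120`."""
    return counter_increase(samples) / window_seconds


# Hypothetical flush counter scraped during a two-minute window.
steady = per_second_rate([100, 160, 220])
# The same kind of counter with a restart between the 2nd and 3rd scrape.
with_reset = counter_increase([100, 130, 10, 40])
```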

Index coordinator

Panel	Panel description	PromQL (Prometheus query language)	The Milvus metrics used	Milvus metrics description
Index Request Rate	The average number of index building requests received per second within the past two minutes.	`sum(increase(milvus_indexcoord_indexreq_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status)`	`milvus_indexcoord_indexreq_count`	The number of index building requests received.
Index Task Count	The count of all indexing tasks recorded in index metadata.	`sum(milvus_indexcoord_indextask_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (index_task_status)`	`milvus_indexcoord_indextask_count`	The count of all indexing tasks recorded in index metadata.
Index Node Num	The number of managed index nodes.	`sum(milvus_indexcoord_indexnode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)`	`milvus_indexcoord_indexnode_num`	The number of managed index nodes.

Index node

Panel	Panel description	PromQL (Prometheus query language)	The Milvus metrics used	Milvus metrics description
Index Task Rate	The average number of index building tasks received by each index node per second within the past two minutes.	`sum(increase(milvus_indexnode_index_task_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id)`	`milvus_indexnode_index_task_count`	The number of index building tasks received.
Load Field Latency	The average latency and the 99th percentile of the time used by each index node to load segment field data each time within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_load_field_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_indexnode_load_field_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_load_field_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_indexnode_load_field_latency`	The time used by index node to load segment field data.
Decode Field Latency	The average latency and the 99th percentile of the time used by each index node to decode field data each time within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_decode_field_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_indexnode_decode_field_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_decode_field_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_indexnode_decode_field_latency`	The time used to decode field data.
Build Index Latency	The average latency and the 99th percentile of the time used by each index node to build indexes within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_build_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_indexnode_build_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_build_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_indexnode_build_index_latency`	The time used to build indexes.
Encode Index Latency	The average latency and the 99th percentile of the time used by each index node to encode index files within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_encode_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_indexnode_encode_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_encode_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_indexnode_encode_index_latency`	The time used to encode index files.
Save Index Latency	The average latency and the 99th percentile of the time used by each index node to save index files within the past two minutes.	p99: `histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_save_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))` avg: `sum(increase(milvus_indexnode_save_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_save_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)`	`milvus_indexnode_save_index_latency`	The time used to save index files.
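
All latency panels in these tables share one query shape: the p99 expression reads the `_bucket` series and the avg expression divides `_sum` by `_count`, each filtered by the same three dashboard variables. As a convenience for building custom panels, the sketch below generates both expressions from a base metric name. The helper names are hypothetical and the label filter is copied from the queries in this topic; adjust it if your deployment uses different labels.

```python
from typing import Sequence

# Label filter shared by the dashboard queries in this topic.
LABELS = (
    'app_kubernetes_io_instance=~"$instance", '
    'app_kubernetes_io_name="$app_name", '
    'namespace="$namespace"'
)


def p99_query(metric: str, group_by: Sequence[str] = ("pod", "node_id"),
              window: str = "2m") -> str:
    """Build the 99th-percentile expression for a histogram metric."""
    by = ", ".join(["le", *group_by])
    return (
        f"histogram_quantile(0.99, sum by ({by}) "
        f"(rate({metric}_bucket{{{LABELS}}}[{window}])))"
    )


def avg_query(metric: str, group_by: Sequence[str] = ("pod", "node_id"),
              window: str = "2m") -> str:
    """Build the average expression: increase of _sum over increase of _count."""
    by = ", ".join(group_by)
    total = f"sum(increase({metric}_sum{{{LABELS}}}[{window}])) by({by})"
    count = f"sum(increase({metric}_count{{{LABELS}}}[{window}])) by({by})"
    return f"{total} / {count}"


# Reproduces the Save Index Latency panel queries.
p99_expr = p99_query("milvus_indexnode_save_index_latency")
avg_expr = avg_query("milvus_indexnode_save_index_latency")
```

Passing extra grouping labels, such as `("pod", "node_id", "query_type")`, yields the variants used by the query node search panels.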
