Milvus monitoring framework overview

This topic explains how Milvus uses Prometheus to monitor metrics and Grafana to visualize metrics and create alerts.

Prometheus in Milvus

Prometheus is an open-source monitoring and alerting toolkit commonly used with Kubernetes. It collects and stores metrics as time-series data: each metric is stored with the timestamp at which it was recorded, along with optional key-value pairs called labels.

Currently, Milvus uses the following components of Prometheus:

  • Prometheus endpoint to pull data from endpoints set by exporters.
  • Prometheus Operator to manage Prometheus monitoring instances effectively.
  • kube-prometheus to provide easy-to-operate, end-to-end Kubernetes cluster monitoring.

Metric names

A valid metric name in Prometheus contains three elements: namespace, subsystem, and name. These three elements are connected with "_".

The namespace of Milvus metrics monitored by Prometheus is "milvus". Depending on the role that a metric belongs to, its subsystem should be one of the following eight roles: "rootcoord", "proxy", "querycoord", "querynode", "indexcoord", "indexnode", "datacoord", "datanode".

For instance, the Milvus metric that calculates the total number of vectors queried is named milvus_proxy_search_vectors_count.
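The naming rule above can be sketched in a few lines of Python. This is only an illustration: the helper names `build_metric_name` and `split_metric_name` are hypothetical and are not part of Milvus or any Prometheus client library.

```python
# Roles listed above as valid subsystems for Milvus metrics.
MILVUS_SUBSYSTEMS = {
    "rootcoord", "proxy", "querycoord", "querynode",
    "indexcoord", "indexnode", "datacoord", "datanode",
}


def build_metric_name(subsystem, name, namespace="milvus"):
    """Join namespace, subsystem, and name with "_" (hypothetical helper)."""
    if subsystem not in MILVUS_SUBSYSTEMS:
        raise ValueError(f"unknown subsystem: {subsystem!r}")
    return "_".join([namespace, subsystem, name])


def split_metric_name(full_name):
    """Split a full metric name back into its three elements.

    Only the first two "_" separate elements; the remaining part is the
    name, which may itself contain underscores.
    """
    namespace, subsystem, name = full_name.split("_", 2)
    return namespace, subsystem, name


print(build_metric_name("proxy", "search_vectors_count"))
# prints: milvus_proxy_search_vectors_count
```

Splitting on only the first two underscores matters because the name element itself, such as search_vectors_count, usually contains underscores.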

Metric types

Prometheus supports four types of metrics:

  • Counter: a cumulative metric whose value can only increase, or be reset to zero upon restart.
  • Gauge: a metric whose value can go up or down.
  • Histogram: a metric that counts observations in configurable buckets. A common example is request duration.
  • Summary: a metric similar to a histogram that calculates configurable quantiles over a sliding time window.
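To make the difference between the first three types concrete, here is a minimal pure-Python sketch of their semantics. The classes below are illustrative stand-ins written for this page, not the API of any Prometheus client library, and the bucket bounds are arbitrary example values. The summary type is omitted because its sliding-window quantile calculation does not fit in a short sketch.

```python
import bisect


class Counter:
    """Cumulative value that can only increase (illustrative only)."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Value that can go up or down (illustrative only)."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


class Histogram:
    """Counts observations in buckets defined by upper bounds."""

    def __init__(self, upper_bounds):
        # An implicit +Inf bucket catches every remaining observation.
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.total = 0.0
        self.count = 0

    def observe(self, value):
        # Index of the first bucket whose upper bound is >= value.
        idx = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[idx] += 1
        self.total += value
        self.count += 1

    def cumulative(self):
        """Return (upper_bound, cumulative_count) pairs."""
        running, pairs = 0, []
        for bound, n in zip(self.upper_bounds, self.bucket_counts):
            running += n
            pairs.append((bound, running))
        return pairs


# Example: request durations in seconds with arbitrary bucket bounds.
durations = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.5, 2.0):
    durations.observe(seconds)
print(durations.cumulative())
```

The `cumulative` method reflects how Prometheus exposes histogram buckets: each bucket reports the number of observations less than or equal to its upper bound, so the counts never decrease from one bucket to the next.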

Metric labels

Prometheus differentiates samples with the same metric name by labeling them. A label is an attribute of a metric, expressed as a key-value pair. Metrics with the same name must have the same value for the variable_labels field (the set of label names). The following table lists the names and meanings of common labels of Milvus metrics.

| Label name | Definition | Values |
|---|---|---|
| "node_id" | The unique identity of a role. | A globally unique ID generated by Milvus. |
| "status" | The status of a processed operation or request. | "abandon", "success", or "fail". |
| "query_type" | The type of a read request. | "search" or "query". |
| "msg_type" | The type of messages. | "insert", "delete", "search", or "query". |
| "segment_state" | The status of a segment. | "Sealed", "Growing", "Flushed", "Flushing", "Dropped", or "Importing". |
| "cache_state" | The status of a cached object. | "hit" or "miss". |
| "cache_name" | The name of a cached object. This label is used together with the label "cache_state". | E.g. "CollectionID", "Schema", etc. |
| "channel_name" | Physical topics in message storage (Pulsar or Kafka). | E.g. "by-dev-rootcoord-dml_0", "by-dev-rootcoord-dml_255", etc. |
| "function_name" | The name of a function that handles certain requests. | E.g. "CreateCollection", "CreatePartition", "CreateIndex", etc. |
| "user_name" | The user name used for authentication. | A user name of your preference. |
| "index_task_status" | The status of an index task in meta storage. | "unissued", "in-progress", "failed", "finished", or "recycled". |
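The following sketch illustrates how labels separate samples that share one metric name. It is a simplified model, not how Prometheus stores data internally. The metric name `milvus_proxy_example_request_count` is hypothetical and chosen only for illustration; the label names and values come from the table above.

```python
from collections import defaultdict

# Each distinct (metric name, label set) pair identifies one time series.
series = defaultdict(float)


def series_key(name, labels):
    """Build a hashable key from a metric name and its labels."""
    return (name, tuple(sorted(labels.items())))


def record(name, labels, amount=1.0):
    """Add `amount` to the series identified by name and labels."""
    series[series_key(name, labels)] += amount


def select(name, **matchers):
    """Sum all series of `name` whose labels match every matcher."""
    total = 0.0
    for (metric, label_items), value in series.items():
        labels = dict(label_items)
        if metric == name and all(labels.get(k) == v for k, v in matchers.items()):
            total += value
    return total


# Hypothetical metric name; "node_id" and "status" are labels from the table.
METRIC = "milvus_proxy_example_request_count"
record(METRIC, {"node_id": "1", "status": "success"}, 3)
record(METRIC, {"node_id": "1", "status": "fail"}, 1)
record(METRIC, {"node_id": "2", "status": "success"}, 5)

print(len(series))                       # number of distinct series
print(select(METRIC, status="success"))  # successes across all nodes
print(select(METRIC, node_id="1"))       # all requests on node 1
```

Because every label combination is its own series, the same metric can be filtered by node or aggregated by status without defining separate metric names.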

Grafana in Milvus

Grafana is an open-source visualization platform that can connect to a wide range of data sources. By querying metrics from those sources, it helps users understand, analyze, and monitor large volumes of data.

Milvus uses Grafana's customizable dashboards for metric visualization.

What's next

After learning about the basic workflow of monitoring and alerting, learn:
