Configure and Start Prometheus
This page describes how to configure and start Prometheus, and how to connect Alertmanager to Prometheus so that you can visualize metrics and receive early warnings.
Install Prometheus
- Download and extract the Prometheus tarball for your operating system.
- Go to the directory that holds the Prometheus binary, and verify that Prometheus is properly installed:

    $ ./prometheus --version

  You can add the path to Prometheus to PATH. This makes it easy to start Prometheus from any shell.
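As a minimal sketch of the PATH tip above: the location `$HOME/prometheus` is only an assumed example of where the tarball was extracted, and `PROMETHEUS_HOME` is a hypothetical variable name used for illustration.

```shell
# Assumed install location; replace it with the directory that holds the prometheus binary.
PROMETHEUS_HOME="${PROMETHEUS_HOME:-$HOME/prometheus}"

# Append the directory to PATH for the current shell session.
export PATH="$PATH:$PROMETHEUS_HOME"
```

To keep the setting across sessions, the same two lines can go into your shell startup file, such as `~/.bashrc` if you use Bash.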
Configure and start Prometheus
- Start Pushgateway:

    $ ./pushgateway

  You must start Pushgateway before starting the Milvus server.
- Enable the Prometheus monitor in server_config.yaml and set the address and port number of Pushgateway:

    metric:
      enable: true                 # Set the value to true to enable the Prometheus monitor.
      address: <your_IP_address>   # Set the IP address of Pushgateway.
      port: 9091                   # Set the port number of Pushgateway.

  In a Kubernetes cluster, you need to set server_config.yaml for each node that you want to monitor.
- Go to the Prometheus root directory, and download the starter Prometheus configuration file for Milvus:

    $ wget https://raw.githubusercontent.com/milvus-io/docs/master/v1.1.0/assets/monitoring/prometheus.yml \
        -O prometheus.yml
- Download the starter alerting rules for Milvus into the rules directory under the Prometheus root directory:

    $ wget -P rules https://raw.githubusercontent.com/milvus-io/docs/master/v1.1.0/assets/monitoring/alert_rules.yml
- Edit the Prometheus configuration file according to your needs:

  - global: Configures parameters such as scrape_interval and evaluation_interval.

      global:
        scrape_interval: 2s       # Set the scrape interval to 2s.
        evaluation_interval: 2s   # Set the rule evaluation interval to 2s.

  - alerting: Sets the address and port of Alertmanager.

      alerting:
        alertmanagers:
          - static_configs:
              - targets: ['localhost:9093']

  - rule_files: Specifies the file that defines the alerting rules. A relative path is resolved against the directory of the configuration file, so make sure the entry matches where you saved the rules file (for example, "rules/alert_rules.yml" if the file is in the rules directory).

      rule_files:
        - "alert_rules.yml"

  - scrape_configs: Sets job_name and targets for scraping data.

      scrape_configs:
        - job_name: 'prometheus'
          static_configs:
            - targets: ['localhost:9090']
        - job_name: 'pushgateway'
          honor_labels: true
          static_configs:
            - targets: ['localhost:9091']

  See Prometheus Configuration for more information about the configuration file of Prometheus.
- Start Prometheus:

    $ ./prometheus --config.file=prometheus.yml
After Prometheus starts, you can view and graph the metrics that Milvus provides in the Prometheus web interface. See Milvus Metrics for more information.
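Before opening the interface, it can help to confirm that Prometheus has finished starting. The sketch below polls the readiness endpoint that Prometheus exposes at `/-/ready`. The helper name `wait_until_ready` is hypothetical, and the sketch assumes that `curl` is installed and that Prometheus listens on the default port 9090.

```shell
# Poll a URL until it answers with a successful HTTP status or the attempts run out.
# Arguments: URL, number of attempts (default 10), delay in seconds between attempts (default 1).
wait_until_ready() {
  url="$1"
  attempts="${2:-10}"
  delay="${3:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl exit with a non-zero status on HTTP errors.
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (assumes Prometheus runs on this machine):
#   wait_until_ready "http://localhost:9090/-/ready" && echo "Prometheus is ready"
```

The same helper could be pointed at Pushgateway or Alertmanager, which expose a similar endpoint on their own ports.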
Configure Alertmanager
Events to create alert rules
Proactively monitoring metrics helps you identify emerging issues. It is equally important to create alerting rules for events that require immediate intervention.
This section includes the most important events for which you must create alerting rules.
Server is down
- Rule: Send an alert when the Milvus server is down.
- How to detect: If the Milvus server is down, No Data is displayed for various metrics on the monitoring dashboard.
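One possible way to express this rule is sketched below. It relies on the `push_time_seconds` metric that Pushgateway records for each pushed group, and fires when no push has arrived for a while. The job label `milvus`, the 60-second threshold, and the file name are assumptions chosen for illustration, not values taken from the Milvus starter rules.

```shell
# Write an example rule file; RULES_DIR defaults to the rules directory used earlier on this page.
RULES_DIR="${RULES_DIR:-rules}"
mkdir -p "$RULES_DIR"

cat > "$RULES_DIR/server_down_example.yml" <<'EOF'
groups:
  - name: milvus_server_example
    rules:
      - alert: MilvusServerDown
        # "milvus" is a hypothetical job label; use the label your pushed metrics carry.
        expr: time() - push_time_seconds{job="milvus"} > 60
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Milvus has stopped pushing metrics to Pushgateway"
EOF
```

Prometheus only loads the file if it is listed under rule_files. You can validate it first with `promtool check rules`, which ships in the Prometheus tarball.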
CPU/GPU temperature is too high
- Rule: Send an alert when the CPU/GPU temperature exceeds 80 degrees Celsius.
- How to detect: Check the metrics CPU Temperature and GPU Temperature on the monitoring dashboard.
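A threshold rule of this kind might look like the sketch below. The metric names `cpu_temperature_celsius` and `gpu_temperature_celsius` are placeholders, not confirmed Milvus metric names; substitute the names that back the CPU Temperature and GPU Temperature panels in your setup. The file name and the `for` duration are also illustrative.

```shell
# Write an example rule file; RULES_DIR defaults to the rules directory used earlier on this page.
RULES_DIR="${RULES_DIR:-rules}"
mkdir -p "$RULES_DIR"

cat > "$RULES_DIR/temperature_example.yml" <<'EOF'
groups:
  - name: milvus_temperature_example
    rules:
      - alert: CpuTemperatureTooHigh
        # Placeholder metric name; replace it with the real CPU temperature metric.
        expr: cpu_temperature_celsius > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "CPU temperature is above 80 degrees Celsius"
      - alert: GpuTemperatureTooHigh
        # Placeholder metric name; replace it with the real GPU temperature metric.
        expr: gpu_temperature_celsius > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "GPU temperature is above 80 degrees Celsius"
EOF
```

The `for` clause keeps a brief spike from firing the alert; shorten or remove it if you want an immediate notification.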
Configuration steps
- Download the latest Alertmanager tarball for your operating system.
- Ensure that Alertmanager is properly installed:

    $ alertmanager --version

  You can add the path to Alertmanager to PATH. This makes it easy to start Alertmanager from any shell.
- Create the Alertmanager configuration file to specify the desired receivers for notifications, and add it to the Alertmanager root directory.
- Start the Alertmanager server, with the --config.file flag pointing to the configuration file:

    $ alertmanager --config.file=alertmanager.yml

- Use your browser to open http://<hostname of machine running alertmanager>:9093, and use the Alertmanager UI to define rules for muting alerts.
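For the configuration file created in the steps above, a minimal sketch could look like the following. It routes every alert to a single webhook receiver. The receiver name `example-webhook` and the URL `http://localhost:5001/` are hypothetical; replace them with the receiver type and endpoint you actually use.

```shell
# Write a minimal Alertmanager configuration into the current directory.
cat > alertmanager.yml <<'EOF'
route:
  # Every alert goes to this receiver unless a more specific route is added.
  receiver: example-webhook
  group_by: ['alertname']
receivers:
  - name: example-webhook
    webhook_configs:
      # Hypothetical endpoint that accepts the alert notifications.
      - url: 'http://localhost:5001/'
EOF
```

Before starting the server, you can validate the file with `amtool check-config alertmanager.yml`; amtool ships in the Alertmanager tarball.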