
Install Milvus Cluster with GPU Support

Milvus can now use GPU devices to build indexes and perform ANN searches, thanks to contributions from NVIDIA. This guide shows you how to install Milvus with GPU support on your machine.

Prerequisites

Before installing Milvus with GPU support, make sure you have the following prerequisites:

  • The compute capability of your GPU device is 7.0, 7.5, 8.0, 8.6, 8.9, or 9.0. To check whether your GPU device meets the requirement, check Your GPU Compute Capability on the NVIDIA developer website.

  • You have installed the NVIDIA driver for your GPU device on one of the supported Linux distributions and then the NVIDIA Container Toolkit following this guide.

    For Ubuntu 22.04 users, you can install the driver and the container toolkit with the following commands:

    $ sudo apt install --no-install-recommends nvidia-headless-545 nvidia-utils-545
    

    For other OS users, please refer to the official installation guide.

    You can check whether the driver has been installed correctly by running the following command:

    $ modinfo nvidia | grep "^version"
    version:        545.29.06
    

    We recommend using driver version 545 or later. To verify that containers can also access the GPU, see the sketch after this list.

  • You have installed a Kubernetes cluster, and the kubectl command-line tool has been configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts.
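
To double-check the driver and the container toolkit together, you can query the GPU from the host and from inside a container. This is a minimal sketch that assumes Docker is your container runtime; the compute_cap query field requires a recent driver, and the CUDA image tag is only illustrative.

$ nvidia-smi --query-gpu=name,driver_version,compute_cap --format=csv
$ docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi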

Create a K8s Cluster

If you have already deployed a K8s cluster for production, you can skip this step and proceed directly to Install Helm Chart for Milvus. If not, you can follow the steps below to quickly create a K8s cluster for testing and then use it to deploy a Milvus cluster with Helm.

1. Install minikube

See install minikube for more information.

2. Start a K8s cluster using minikube

After installing minikube, run the following command to start a K8s cluster.

$ minikube start --gpus all
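
Note that GPU passthrough in minikube generally requires the Docker driver with the Docker container runtime. If the command above does not expose the GPU, the following variant (a sketch based on the minikube GPU guide; flag support varies by minikube version) is worth trying:

$ minikube start --driver docker --container-runtime docker --gpus all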

3. Check the K8s cluster status

Run $ kubectl cluster-info to check the status of the K8s cluster you just created. Ensure that you can access the K8s cluster via kubectl. If you have not installed kubectl locally, see Use kubectl inside minikube.

Minikube ships with a default StorageClass, which Milvus depends on. Check it by running the following command. Other installation methods require manual configuration of the default StorageClass. See Change the default StorageClass for more information.

$ kubectl get sc
NAME                  PROVISIONER                  RECLAIMPOLICY    VOLUMEBINDINGMODE     ALLOWVOLUMEEXPANSION     AGE
standard (default)    k8s.io/minikube-hostpath     Delete           Immediate             false                    3m36s
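
You can also confirm that minikube has registered the GPU with the node. A sketch follows; the node is named minikube by default, and the reported count depends on how many GPUs your machine exposes.

$ kubectl describe node minikube | grep nvidia.com/gpu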

Start a Kubernetes cluster with GPU worker nodes

If you prefer to use GPU-enabled worker nodes, you can follow the steps below to create a K8s cluster with GPU worker nodes. We recommend installing Milvus on a Kubernetes cluster with GPU worker nodes and using the default storage class for provisioning.

1. Prepare GPU worker nodes

See Prepare GPU worker nodes for more information.

2. Enable GPU support on Kubernetes

See install nvidia-device-plugin with helm for more information.
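
For reference, a minimal sketch of installing the plugin with Helm, using the repository URL and chart name published in the NVIDIA k8s-device-plugin project (the namespace and release name are arbitrary choices):

$ helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
$ helm repo update
$ helm install nvdp nvdp/nvidia-device-plugin --namespace nvidia-device-plugin --create-namespace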

After setting up, run kubectl describe node <gpu-worker-node> to view the GPU resources. The command output should be similar to the following:

Capacity:
  ...
  nvidia.com/gpu:     4
  ...
Allocatable:
  ...
  nvidia.com/gpu:     4
  ...

Note: In this example, we have set up a GPU worker node with 4 GPU cards.
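
To see the GPU capacity of every node at once, you can query the allocatable resources directly. This is a sketch; the node name in the output is illustrative, and the backslash escapes the dots in the nvidia.com/gpu resource name.

$ kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
NAME           GPU
gpu-worker-1   4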

Check the default storage class

Milvus relies on the default storage class to automatically provision volumes for data persistence. Run the following command to check storage classes:

$ kubectl get sc

The command output should be similar to the following:

NAME                   PROVISIONER                                     RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path                           Delete          WaitForFirstConsumer   false                  461d
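
If no storage class is marked (default), you can mark one yourself before installing Milvus. The following sketch uses the standard default-class annotation; replace local-path with a storage class that exists in your cluster.

$ kubectl patch storageclass local-path -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'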

Install Helm Chart for Milvus

Helm is a K8s package manager that can help you deploy Milvus quickly.

  1. Add Milvus Helm repository.
$ helm repo add milvus https://zilliztech.github.io/milvus-helm/

The Milvus Helm Charts repo at https://milvus-io.github.io/milvus-helm/ has been archived and you can get further updates from https://zilliztech.github.io/milvus-helm/ as follows:

helm repo add zilliztech https://zilliztech.github.io/milvus-helm
helm repo update
# upgrade existing helm release
helm upgrade my-release zilliztech/milvus

The archived repo is still available for the charts up to 4.0.31. For later releases, use the new repo instead.

  2. Update charts locally.
$ helm repo update
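
You can verify that the chart is now available locally; the chart and app versions in the output will vary with the repository state.

$ helm search repo milvus/milvus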

Start Milvus

Once you have added the Milvus Helm repository, you can start Milvus on Kubernetes. In this section, we will guide you through the steps to start Milvus with GPU support.

You should start Milvus with Helm by specifying the release name, the chart, and the parameters you expect to change. In this guide, we use my-release as the release name. To use a different release name, replace my-release in the following commands with the one you are using.

Milvus allows you to assign one or more GPU devices to its components.

  • Assign a single GPU device

    Run the following commands to assign a single GPU device to Milvus:

    cat <<EOF > custom-values.yaml
    indexNode:
      resources:
        requests:
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"
    queryNode:
      resources:
        requests:
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"
    EOF
    
    $ helm install my-release milvus/milvus -f custom-values.yaml
    
  • Assign multiple GPU devices

    Run the following commands to assign multiple GPU devices to Milvus:

    cat <<EOF > custom-values.yaml
    indexNode:
      resources:
        requests:
          nvidia.com/gpu: "2"
        limits:
          nvidia.com/gpu: "2"
    queryNode:
      resources:
        requests:
          nvidia.com/gpu: "2"
        limits:
          nvidia.com/gpu: "2"
    EOF
    

    In the configuration above, the indexNode and queryNode share two GPUs. To assign different GPUs to the indexNode and the queryNode, you can modify the configuration accordingly by setting extraEnv in the configuration file as follows:

    cat <<EOF > custom-values.yaml
    indexNode:
      resources:
        requests:
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"
      extraEnv:
        - name: CUDA_VISIBLE_DEVICES
          value: "0"
    queryNode:
      resources:
        requests:
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"
      extraEnv:
        - name: CUDA_VISIBLE_DEVICES
          value: "1"
    EOF
    
    $ helm install my-release milvus/milvus -f custom-values.yaml
    
    • The release name should only contain letters, numbers and dashes. Dots are not allowed in the release name.
    • By default, the command above deploys the cluster version of Milvus. Additional settings are needed to install Milvus standalone.
    • According to the deprecated API migration guide of Kubernetes, the policy/v1beta1 API version of PodDisruptionBudget is no longer served as of v1.25. You are advised to migrate manifests and API clients to use the policy/v1 API version instead.
      As a workaround for users who still use the policy/v1beta1 API version of PodDisruptionBudget on Kubernetes v1.25 and later, you can instead run the following command to install Milvus:
      helm install my-release milvus/milvus --set pulsar.bookkeeper.pdb.usePolicy=false,pulsar.broker.pdb.usePolicy=false,pulsar.proxy.pdb.usePolicy=false,pulsar.zookeeper.pdb.usePolicy=false
    • See Milvus Helm Chart and Helm for more information.

    Check the status of the running pods.

    $ kubectl get pods
    

After Milvus starts, the READY column displays 1/1 for all pods.

NAME                                             READY  STATUS   RESTARTS  AGE
my-release-etcd-0                                1/1    Running   0        3m23s
my-release-etcd-1                                1/1    Running   0        3m23s
my-release-etcd-2                                1/1    Running   0        3m23s
my-release-milvus-datacoord-6fd4bd885c-gkzwx     1/1    Running   0        3m23s
my-release-milvus-datanode-68cb87dcbd-4khpm      1/1    Running   0        3m23s
my-release-milvus-indexcoord-5bfcf6bdd8-nmh5l    1/1    Running   0        3m23s
my-release-milvus-indexnode-5c5f7b5bd9-l8hjg     1/1    Running   0        3m24s
my-release-milvus-proxy-6bd7f5587-ds2xv          1/1    Running   0        3m24s
my-release-milvus-querycoord-579cd79455-xht5n    1/1    Running   0        3m24s
my-release-milvus-querynode-5cd8fff495-k6gtg     1/1    Running   0        3m24s
my-release-milvus-rootcoord-7fb9488465-dmbbj     1/1    Running   0        3m23s
my-release-minio-0                               1/1    Running   0        3m23s
my-release-minio-1                               1/1    Running   0        3m23s
my-release-minio-2                               1/1    Running   0        3m23s
my-release-minio-3                               1/1    Running   0        3m23s
my-release-pulsar-autorecovery-86f5dbdf77-lchpc  1/1    Running   0        3m24s
my-release-pulsar-bookkeeper-0                   1/1    Running   0        3m23s
my-release-pulsar-bookkeeper-1                   1/1    Running   0        98s
my-release-pulsar-broker-556ff89d4c-2m29m        1/1    Running   0        3m23s
my-release-pulsar-proxy-6fbd75db75-nhg4v         1/1    Running   0        3m23s
my-release-pulsar-zookeeper-0                    1/1    Running   0        3m23s
my-release-pulsar-zookeeper-metadata-98zbr       0/1   Completed  0        3m24s
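
To confirm that the requested GPU resources were actually granted, you can inspect the resource limits of a GPU-enabled pod. The sketch below uses the query node pod name from the listing above and assumes a single GPU was assigned; your pod name and counts will differ.

$ kubectl get pod my-release-milvus-querynode-5cd8fff495-k6gtg -o jsonpath='{.spec.containers[0].resources.limits}{"\n"}'
{"nvidia.com/gpu":"1"}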

Connect to Milvus

Verify which local port the Milvus server is listening on. Replace the pod name with your own.

$ kubectl get pod my-release-milvus-proxy-6bd7f5587-ds2xv --template='{{(index (index .spec.containers 0).ports 0).containerPort}}{{"\n"}}'
19530

Open a new terminal and run the following command to forward a local port to the port that Milvus uses. Optionally, omit the designated port and use :19530 to let kubectl allocate a local port for you so that you don't have to manage port conflicts.

$ kubectl port-forward service/my-release-milvus 27017:19530
Forwarding from 127.0.0.1:27017 -> 19530

By default, ports forwarded by kubectl listen only on localhost. Use the --address flag if you want Milvus to listen on a selected IP address or on all addresses.

$ kubectl port-forward --address 0.0.0.0 service/my-release-milvus 27017:19530
Forwarding from 0.0.0.0:27017 -> 19530
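
Before connecting a client, a quick reachability check on the forwarded port can save debugging time. nc comes from the netcat package, the port matches the forwarding above, and the exact output wording varies between netcat variants.

$ nc -zv 127.0.0.1 27017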

Uninstall Milvus

Run the following command to uninstall Milvus.

$ helm uninstall my-release
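
helm uninstall does not remove the PersistentVolumeClaims created for etcd, MinIO, and Pulsar. If you also want to delete the stored data, remove them explicitly. The label selector below is an assumption based on the release name my-release and may not match every dependency chart, so check the labels first.

$ kubectl get pvc --show-labels
$ kubectl delete pvc -l app.kubernetes.io/instance=my-release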

Stop the K8s cluster

Stop the cluster and the minikube VM without deleting the resources you created.

$ minikube stop

Run minikube start to restart the cluster.

Delete the K8s cluster

Run $ kubectl logs `pod_name` to get the stderr log of the pod before deleting the cluster and all resources.

Delete the cluster, the minikube VM, and all resources you created including persistent volumes.

$ minikube delete

What's next

Having installed Milvus, you can:
