milvus-logo
LFAI
Home
  • Integrations
    • Orchestration

Use Milvus in PrivateGPT

PrivateGPT is a production-ready AI project that enables users to ask questions about their documents using Large Language Models without an internet connection while ensuring 100% privacy. PrivateGPT offers an API divided into high-level and low-level blocks. It also provides a Gradio UI client and useful tools like bulk model download scripts and ingestion scripts. Conceptually, PrivateGPT wraps a RAG pipeline and exposes its primitives, being ready to use and providing a full implementation of the API and RAG pipeline.

In this tutorial, we will show you how to use Milvus as the backend vector database for PrivateGPT.

This tutorial is mainly referred to the PrivateGPT official installation guide. If you find that this tutorial has outdated parts, you can prioritize following the official guide and create an issue to us.

Base requirements to run PrivateGPT

1. Clone the PrivateGPT Repository

Clone the repository and navigate to it:

$ git clone https://github.com/zylon-ai/private-gpt
$ cd private-gpt

2. Install Poetry

Install Poetry for dependency management: Follow the instructions on the official Poetry website to install it.

3. (Optional) Install make

To run various scripts, you need to install make.

macOS (Using Homebrew):

$ brew install make

Windows (Using Chocolatey):

$ choco install make

Install Available Modules

PrivateGPT allows customization of the setup. We need to specify some modules settings. In this tutorial, we will use the following modules:

  • LLM: Ollama
  • Embeddings: Ollama
  • Vector Stores: Milvus
  • UI: Gradio
$ poetry install --extras "llms-ollama embeddings-ollama vector-stores-milvus ui"

Start Ollama service

Go to ollama.ai and follow the instructions to install Ollama on your machine.

After the installation, make sure the Ollama desktop app is closed.

Now, start Ollama service (it will start a local inference server, serving both the LLM and the Embeddings):

$ ollama serve

Install the models to be used, the default settings-ollama.yaml is configured to user llama3.1 8b LLM (~4GB) and nomic-embed-text Embeddings (~275MB)

By default, PrivateGPT will automatically pull models as needed. This behavior can be changed by modifying the ollama.autopull_models property.

In any case, if you want to manually pull models, run the following commands:

$ ollama pull llama3.1
$ ollama pull nomic-embed-text

You can optionally change to your favorite models in the settings-ollama.yaml file and pull them manually.

Change Milvus Settings

In the file settings-ollama.yaml, set the vectorstore to milvus:

vectorstore:
  database: milvus

You can also add some cumstom Milvus configuration to specify your settings. Like this:

milvus:
  uri: http://localhost:19530
  collection_name: my_collection

The available configuration options are:

Field OptionDescription
uriDefault is set to “local_data/private_gpt/milvus/milvus_local.db” as a local file; you can also set up a more performant Milvus server on docker or k8s e.g.http://localhost:19530, as your uri; To use Zilliz Cloud, adjust the uri and token to Endpoint and Api key in Zilliz Cloud.
tokenPair with Milvus server on docker or k8s or zilliz cloud api key.
collection_nameThe name of the collection, set to default “milvus_db”.
overwriteOverwrite the data in collection if it existed, set to default as True.

Start PrivateGPT

Once all settings are done, you can run PrivateGPT with a gradio UI.

PGPT_PROFILES=ollama make run

The UI will be available athttp://0.0.0.0:8001.

You can play around with the UI and ask questions about your documents.

Try Managed Milvus for Free

Zilliz Cloud is hassle-free, powered by Milvus and 10x faster.

Get Started
Feedback

Was this page helpful?