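Build RAG with Milvus
In this tutorial, we show you how to build a RAG (Retrieval-Augmented Generation) pipeline with Milvus.
A RAG system combines a retrieval system with a generative model to generate new text from a given prompt. The system first uses Milvus to retrieve relevant documents from a corpus, then uses a generative model to generate new text based on the retrieved documents.
Preparation
Dependencies and Environment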
$ pip install --upgrade pymilvus openai requests tqdm
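If you are using Google Colab, you may need to restart the runtime to enable the newly installed dependencies (click the "Runtime" menu at the top of the screen and select "Restart session" from the dropdown menu).
In this example, we use OpenAI as the LLM. You should prepare the API key OPENAI_API_KEY as an environment variable.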
import os
os.environ["OPENAI_API_KEY"] = "sk-***********"
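Hard-coding the key is convenient for a demo, but it is easy to leak when a notebook is shared. As a minimal sketch, the helper below reads the key from the environment and fails early with a clear message when it is missing. The name require_env is hypothetical and is not part of the OpenAI SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes the key was exported in the shell beforehand):
# api_key = require_env("OPENAI_API_KEY")
```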
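Prepare the data
We use the FAQ pages from the Milvus Documentation 2.4.x as the private knowledge in our RAG. They are a good data source for a simple RAG pipeline.
Download the zip file and extract the documents to the folder milvus_docs.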
$ wget https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip
$ unzip -q milvus_docs_2.4.x_en.zip -d milvus_docs
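Where wget and unzip are not available (for example on Windows), the same step could be done with the Python standard library. This is a sketch only: the function name download_and_extract and the temporary file name archive.zip are assumptions made for illustration.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url: str, dest_dir: str) -> list[str]:
    """Download a zip archive, extract it into dest_dir, and return member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / "archive.zip"  # temporary name chosen for this sketch
    urllib.request.urlretrieve(url, str(archive_path))
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    archive_path.unlink()  # remove the downloaded archive after extraction
    return names


# Usage with the URL from the command above:
# download_and_extract(
#     "https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip",
#     "milvus_docs",
# )
```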
We load all markdown files from the folder milvus_docs/en/faq. For each document, we simply split the content on "# ", which roughly separates the content of each main section of the markdown file.
from glob import glob
text_lines = []
for file_path in glob("milvus_docs/en/faq/*.md", recursive=True):
    with open(file_path, "r") as file:
        file_text = file.read()
    text_lines += file_text.split("# ")
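Splitting on "# " is deliberately crude: a heading such as "### Title" leaves a trailing "##" on the previous piece, and text that starts with a heading yields an empty first piece. The sketch below shows this on an in-memory sample and adds an optional cleanup step. The helper name split_markdown and the cleanup rule are assumptions, not part of the tutorial's pipeline.

```python
import re


def split_markdown(text: str) -> list[str]:
    """Split markdown on "# " and drop empty pieces and trailing heading markers."""
    chunks = []
    for piece in text.split("# "):
        # Remove a run of "#" left at the end of the piece by the next heading.
        cleaned = re.sub(r"(^|\s+)#+\s*$", "", piece).strip()
        if cleaned:
            chunks.append(cleaned)
    return chunks


sample = "# FAQ\n\n### Question A?\n\nAnswer A.\n\n### Question B?\n\nAnswer B.\n"
print(sample.split("# "))      # raw pieces, including an empty first piece
print(split_markdown(sample))  # cleaned pieces
```

Whether to clean the pieces is a design choice; the tutorial inserts the raw pieces as they are.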
Prepare the Embedding Model
We initialize the OpenAI client to prepare the embedding model.
from openai import OpenAI
openai_client = OpenAI()
Define a function that generates text embeddings with the OpenAI client. We use the text-embedding-3-small model as an example.
def emb_text(text):
    return (
        openai_client.embeddings.create(input=text, model="text-embedding-3-small")
        .data[0]
        .embedding
    )
Generate a test embedding and print its dimension and first few elements.
test_embedding = emb_text("This is a test")
embedding_dim = len(test_embedding)
print(embedding_dim)
print(test_embedding[:10])
1536
[0.00988506618887186, -0.005540902726352215, 0.0068014683201909065, -0.03810417652130127, -0.018254263326525688, -0.041231658309698105, -0.007651153020560741, 0.03220026567578316, 0.01892443746328354, 0.00010708322952268645]
Load data into Milvus
Create the Collection
from pymilvus import MilvusClient
milvus_client = MilvusClient(uri="./milvus_demo.db")
collection_name = "my_rag_collection"
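As for the arguments of MilvusClient:
- Setting the uri to a local file, e.g. ./milvus.db, is the most convenient method, as it automatically uses Milvus Lite to store all data in this file.
- If you have a large amount of data, you can set up a more performant Milvus server on Docker or Kubernetes. In this setup, use the server uri, e.g. http://localhost:19530, as your uri.
- If you want to use Zilliz Cloud, the fully managed cloud service for Milvus, adjust the uri and token, which correspond to the Public Endpoint and API key in Zilliz Cloud.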
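One way to switch between these deployment options without editing code is to read the connection settings from the environment. The sketch below only builds the keyword arguments. The variable names MILVUS_URI and MILVUS_TOKEN are hypothetical choices for this example and are not read by pymilvus itself.

```python
import os


def milvus_client_kwargs(env=os.environ) -> dict:
    """Build keyword arguments for MilvusClient from environment variables."""
    # Fall back to a local Milvus Lite file when no URI is configured.
    kwargs = {"uri": env.get("MILVUS_URI", "./milvus_demo.db")}
    token = env.get("MILVUS_TOKEN")
    if token:  # only needed for deployments that require authentication
        kwargs["token"] = token
    return kwargs


# Usage: milvus_client = MilvusClient(**milvus_client_kwargs())
print(milvus_client_kwargs({}))
```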
Check whether the collection already exists, and drop it if it does.
if milvus_client.has_collection(collection_name):
    milvus_client.drop_collection(collection_name)
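Create a new collection with the specified parameters.
If we do not specify any field information, Milvus automatically creates a default id field as the primary key and a vector field to store the vector data. A reserved JSON field stores fields that are not defined in the schema, together with their values.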
milvus_client.create_collection(
    collection_name=collection_name,
    dimension=embedding_dim,
    metric_type="IP",  # Inner product distance
    consistency_level="Strong",  # Strong consistency level
)
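With the IP metric, a candidate is scored by its inner product with the query vector, and a larger score means a closer match. The following brute-force sketch illustrates only that scoring on toy two-dimensional vectors. It is not how Milvus implements search, and the function names are made up for this example.

```python
def inner_product(a, b):
    """Inner (dot) product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def top_k_by_inner_product(query, vectors, k):
    """Return (index, score) pairs for the k highest-scoring vectors."""
    scored = [(i, inner_product(query, v)) for i, v in enumerate(vectors)]
    # With the IP metric a larger score means a closer match.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


vectors = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
print(top_k_by_inner_product([1.0, 0.0], vectors, k=2))  # [(0, 1.0), (1, 0.6)]
```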
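Insert data
Iterate through the text lines, create embeddings, and then insert the data into Milvus.
Here text is a new field that is not defined in the collection schema. It is automatically added to the reserved JSON dynamic field, which can be treated as a normal field at a high level.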
from tqdm import tqdm
data = []
for i, line in enumerate(tqdm(text_lines, desc="Creating embeddings")):
    data.append({"id": i, "vector": emb_text(line), "text": line})
milvus_client.insert(collection_name=collection_name, data=data)
Creating embeddings: 100%|██████████| 72/72 [00:27<00:00, 2.67it/s]
{'insert_count': 72,
'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71],
'cost': 0}
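The loop above issues one embedding request per text piece. Because the OpenAI embeddings endpoint also accepts a list of strings as input, the requests could be grouped into batches. The sketch below shows only the batching logic with a stand-in embedder. The names batched and embed_in_batches and the batch size of 32 are assumptions for illustration.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts, embed_batch, batch_size=32):
    """Embed texts batch by batch; embed_batch maps a list of strings to a list of vectors."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


# Stand-in embedder for demonstration. A real one could pass the whole batch
# as `input` to openai_client.embeddings.create and read each item's .embedding.
def fake_embed(batch):
    return [[float(len(text))] for text in batch]


print(embed_in_batches(["a", "bb", "ccc"], fake_embed, batch_size=2))  # [[1.0], [2.0], [3.0]]
```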
Build RAG
Retrieve data for a query
Let's specify a frequently asked question about Milvus.
question = "How is data stored in milvus?"
Search the collection for the question and retrieve the top 3 semantic matches.
search_res = milvus_client.search(
    collection_name=collection_name,
    data=[
        emb_text(question)
    ],  # Use the `emb_text` function to convert the question to an embedding vector
    limit=3,  # Return top 3 results
    search_params={"metric_type": "IP", "params": {}},  # Inner product distance
    output_fields=["text"],  # Return the text field
)
Let's take a look at the search results for the query.
import json
retrieved_lines_with_distances = [
    (res["entity"]["text"], res["distance"]) for res in search_res[0]
]
print(json.dumps(retrieved_lines_with_distances, indent=4))
[
[
" Where does Milvus store data?\n\nMilvus deals with two types of data, inserted data and metadata. \n\nInserted data, including vector data, scalar data, and collection-specific schema, are stored in persistent storage as incremental log. Milvus supports multiple object storage backends, including [MinIO](https://min.io/), [AWS S3](https://aws.amazon.com/s3/?nc1=h_ls), [Google Cloud Storage](https://cloud.google.com/storage?hl=en#object-storage-for-companies-of-all-sizes) (GCS), [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs), [Alibaba Cloud OSS](https://www.alibabacloud.com/product/object-storage-service), and [Tencent Cloud Object Storage](https://www.tencentcloud.com/products/cos) (COS).\n\nMetadata are generated within Milvus. Each Milvus module has its own metadata that are stored in etcd.\n\n###",
0.7883545756340027
],
[
"How does Milvus handle vector data types and precision?\n\nMilvus supports Binary, Float32, Float16, and BFloat16 vector types.\n\n- Binary vectors: Store binary data as sequences of 0s and 1s, used in image processing and information retrieval.\n- Float32 vectors: Default storage with a precision of about 7 decimal digits. Even Float64 values are stored with Float32 precision, leading to potential precision loss upon retrieval.\n- Float16 and BFloat16 vectors: Offer reduced precision and memory usage. Float16 is suitable for applications with limited bandwidth and storage, while BFloat16 balances range and efficiency, commonly used in deep learning to reduce computational requirements without significantly impacting accuracy.\n\n###",
0.6757288575172424
],
[
"How much does Milvus cost?\n\nMilvus is a 100% free open-source project.\n\nPlease adhere to [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0) when using Milvus for production or distribution purposes.\n\nZilliz, the company behind Milvus, also offers a fully managed cloud version of the platform for those that don't want to build and maintain their own distributed instance. [Zilliz Cloud](https://zilliz.com/cloud) automatically maintains data reliability and allows users to pay only for what they use.\n\n###",
0.6421123147010803
]
]
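In the output above, only the first hit answers the question directly; the other two score lower and are less relevant. One optional refinement is to drop hits below a score threshold before building the context. The sketch below works on a hand-written stand-in shaped like one entry of search_res. The helper name filter_hits and the threshold 0.65 are arbitrary assumptions, not recommended values.

```python
def filter_hits(hits, min_score):
    """Keep (text, score) pairs whose score is at least min_score."""
    return [
        (hit["entity"]["text"], hit["distance"])
        for hit in hits
        if hit["distance"] >= min_score
    ]


# Hand-written stand-in shaped like one entry of the search result.
sample_hits = [
    {"id": 0, "distance": 0.79, "entity": {"text": "Where does Milvus store data?"}},
    {"id": 1, "distance": 0.68, "entity": {"text": "How does Milvus handle vector data types?"}},
    {"id": 2, "distance": 0.64, "entity": {"text": "How much does Milvus cost?"}},
]
print(filter_hits(sample_hits, min_score=0.65))
```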
Use LLM to get a RAG response
Convert the retrieved documents into a string format.
context = "\n".join(
    [line_with_distance[0] for line_with_distance in retrieved_lines_with_distances]
)
Define the system and user prompts for the Language Model. The user prompt is assembled with the documents retrieved from Milvus.
SYSTEM_PROMPT = """
Human: You are an AI assistant. You are able to find answers to the questions from the contextual passage snippets provided.
"""
USER_PROMPT = f"""
Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags.
<context>
{context}
</context>
<question>
{question}
</question>
"""
Use OpenAI ChatGPT to generate a response based on the prompts.
response = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT},
    ],
)
print(response.choices[0].message.content)
Milvus stores data in persistent storage as incremental logs, including inserted data (vector data, scalar data, and collection-specific schema) and metadata. Inserted data is stored in various object storage backends like MinIO, AWS S3, Google Cloud Storage, Azure Blob Storage, Alibaba Cloud OSS, and Tencent Cloud Object Storage. Metadata generated within Milvus is stored in etcd.
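Because USER_PROMPT is an f-string, it is filled in once, when it is defined, so asking a second question means repeating the retrieval and prompt steps. A minimal sketch of wrapping those steps in one function follows. The retrieve and generate callables are hypothetical wrappers around the Milvus search and the chat completion call, replaced here by stubs so the flow can be run without any service. The system prompt wording is also simplified for this example.

```python
SYSTEM_PROMPT = "You are an AI assistant. You answer questions from the context passages provided."
USER_TEMPLATE = """
Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags.
<context>
{context}
</context>
<question>
{question}
</question>
"""


def answer_question(question, retrieve, generate, top_k=3):
    """Retrieve passages for a question, build the prompt, and generate an answer.

    retrieve(question, top_k) returns a list of passage strings;
    generate(system_prompt, user_prompt) returns the answer string.
    """
    passages = retrieve(question, top_k)
    user_prompt = USER_TEMPLATE.format(context="\n".join(passages), question=question)
    return generate(SYSTEM_PROMPT, user_prompt)


# Stubs stand in for the Milvus search and the chat completion call.
def stub_retrieve(question, top_k):
    return ["Each Milvus module has its own metadata that are stored in etcd."][:top_k]


def stub_generate(system_prompt, user_prompt):
    return "etcd" if "etcd" in user_prompt else "unknown"


print(answer_question("Where is metadata stored?", stub_retrieve, stub_generate))  # etcd
```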
Quick Deploy
To learn how to start an online demo with this tutorial, please refer to the example application.