Vertex AI

Compatible with Milvus 2.6.x
Google Cloud Vertex AI is Google's managed machine learning platform, which provides high-performance text embedding models as a service. This guide explains how to use Google Cloud Vertex AI with Milvus for efficient text embedding generation.
Vertex AI supports several embedding models for different use cases:
gemini-embedding-001 (State-of-the-art performance across English, multilingual and code tasks)
text-embedding-005 (Latest text embedding model)
text-multilingual-embedding-002 (Latest multilingual text embedding model)
For more information, refer to Vertex AI text embedding models.
Prerequisites
Ensure you meet these requirements before configuring Vertex AI:
Run Milvus version 2.6 or higher - Verify your deployment meets the minimum version requirement.
Create a Google Cloud service account - At a minimum, you’ll likely need roles like “Vertex AI User” or other more specific roles. For details, refer to Create service accounts.
Download the service account’s JSON key file - Securely store this credential file on your server or local machine. For details, refer to Create a service account key.
Configure credentials
Before Milvus can call Vertex AI, it needs access to your GCP service account JSON key. Milvus supports two methods for providing this credential; choose one based on your deployment and operational needs.
| Option | Priority | Best For |
|---|---|---|
| Configuration file (`milvus.yaml`) | High | Cluster-wide, persistent settings |
| Environment variables (`MILVUSAI_GOOGLE_APPLICATION_CREDENTIALS`) | Low | Container workflows, quick tests |
Option 1: Configuration file (recommended & higher priority)
Milvus will always prefer credentials declared in `milvus.yaml` over any environment variables for the same provider.
Base64-encode your JSON key:

```shell
cat credentials.json | jq . | base64
```
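If `jq` or `base64` is not available on your platform, the same encoding can be produced with a few lines of Python. This helper is a hedged equivalent of the pipeline above, not an official Milvus tool:

```python
import base64
import json

def encode_credentials(path: str) -> str:
    """Read a service-account JSON key, validate it, and return it
    Base64-encoded, mirroring the `jq . | base64` pipeline."""
    with open(path, "r", encoding="utf-8") as f:
        key = json.load(f)  # fails fast if the file is not valid JSON
    return base64.b64encode(json.dumps(key, indent=2).encode("utf-8")).decode("ascii")
```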
Declare credentials in `milvus.yaml`:

```yaml
# milvus.yaml
credential:
  gcp_vertex:  # arbitrary label
    credential_json: |
      <YOUR_BASE64_ENCODED_JSON>
```
Bind the credential to the Vertex AI provider:

```yaml
# milvus.yaml
function:
  textEmbedding:
    providers:
      vertexai:
        credential: gcp_vertex  # must match the label above
        url: <optional: custom Vertex AI endpoint>
```
If you later need to rotate keys, just update the Base64 string under `credential_json` and restart Milvus. No changes to your environment or containers are required.
Option 2: Environment variables
Use this method when you prefer injecting secrets at deploy time. Milvus falls back to environment variables only if no matching entry exists in `milvus.yaml`.
The configuration steps depend on your Milvus deployment mode (standalone vs. distributed cluster) and orchestration platform (Docker Compose vs. Kubernetes).
Docker Compose

To obtain your Milvus configuration file (docker-compose.yaml), refer to Download an installation file.
Mount your key into the container
Edit your `docker-compose.yaml` file to include the credential volume mapping:

```yaml
services:
  standalone:
    volumes:
      # Map host credential file to container path
      - /path/to/your/credentials.json:/milvus/configs/google_application_credentials.json:ro
```
In the preceding configuration:
- Use absolute paths for reliable file access (`/home/user/credentials.json`, not `~/credentials.json`).
- The container path must end with the `.json` extension.
- The `:ro` flag ensures read-only access for security.
Set environment variable
In the same `docker-compose.yaml` file, add the environment variable pointing to the credential path:

```yaml
services:
  standalone:
    environment:
      # Essential for Vertex AI authentication
      MILVUSAI_GOOGLE_APPLICATION_CREDENTIALS: /milvus/configs/google_application_credentials.json
```
Apply changes
Restart your Milvus container to activate the configuration:
```shell
docker-compose down && docker-compose up -d
```
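After the restart, you can sanity-check the wiring from inside the container (for example via `docker exec -it <container> python3`). The helper below is a hypothetical convenience, not part of Milvus: it confirms the environment variable is set and points at a readable service-account key.

```python
import json
import os

def check_vertexai_credentials(env_var: str = "MILVUSAI_GOOGLE_APPLICATION_CREDENTIALS") -> bool:
    """Return True if the env var is set, points at a readable file,
    and that file parses as a GCP service-account key."""
    path = os.environ.get(env_var)
    if not path or not os.path.isfile(path):
        return False
    try:
        with open(path, "r", encoding="utf-8") as f:
            key = json.load(f)
    except (OSError, json.JSONDecodeError):
        return False
    return key.get("type") == "service_account"
```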
Kubernetes (Helm)

To obtain your Milvus configuration file (values.yaml), refer to Configure Milvus via configuration file.
Create a Kubernetes Secret
Execute this on your control machine (where kubectl is configured):
```shell
kubectl create secret generic vertex-ai-secret \
  --from-file=credentials.json=/path/to/your/credentials.json \
  -n <your-milvus-namespace>
```
In the preceding command:
- `vertex-ai-secret`: Name for your secret (customizable).
- `/path/to/your/credentials.json`: Local path to your GCP credential file.
- `<your-milvus-namespace>`: Kubernetes namespace hosting Milvus.
Configure Helm values
Update your `values.yaml` based on your deployment type:

For standalone deployment

```yaml
standalone:
  extraEnv:
    - name: MILVUSAI_GOOGLE_APPLICATION_CREDENTIALS
      value: /milvus/configs/credentials.json  # Container path
  volumes:
    - name: vertex-ai-credentials-vol
      secret:
        secretName: vertex-ai-secret  # Must match Step 1
  volumeMounts:
    - name: vertex-ai-credentials-vol
      mountPath: /milvus/configs/credentials.json  # Must match extraEnv value
      subPath: credentials.json  # Must match secret key name
      readOnly: true
```
For distributed deployment (add to each component)

```yaml
proxy:
  extraEnv:
    - name: MILVUSAI_GOOGLE_APPLICATION_CREDENTIALS
      value: /milvus/configs/credentials.json
  volumes:
    - name: vertex-ai-credentials-vol
      secret:
        secretName: vertex-ai-secret
  volumeMounts:
    - name: vertex-ai-credentials-vol
      mountPath: /milvus/configs/credentials.json
      subPath: credentials.json
      readOnly: true
# Repeat the same configuration for dataNode, etc.
```
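Because the distributed configuration repeats verbatim for every component, a small script can generate the blocks instead of hand-copying them. The component list below is an assumption; names vary by chart version, so adjust it for your deployment:

```python
# Assumed component names; check your Helm chart version for the actual set.
COMPONENTS = ["proxy", "dataNode", "queryNode", "indexNode", "mixCoordinator"]

def credential_block(secret_name: str = "vertex-ai-secret") -> dict:
    """Return the extraEnv/volumes/volumeMounts stanza shown above as a dict."""
    return {
        "extraEnv": [{
            "name": "MILVUSAI_GOOGLE_APPLICATION_CREDENTIALS",
            "value": "/milvus/configs/credentials.json",
        }],
        "volumes": [{
            "name": "vertex-ai-credentials-vol",
            "secret": {"secretName": secret_name},
        }],
        "volumeMounts": [{
            "name": "vertex-ai-credentials-vol",
            "mountPath": "/milvus/configs/credentials.json",
            "subPath": "credentials.json",
            "readOnly": True,
        }],
    }

# One identical block per component, ready to serialize into values.yaml.
values_fragment = {component: credential_block() for component in COMPONENTS}
```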
Apply Helm configuration
Deploy the updated configuration to your cluster:
```shell
helm upgrade milvus milvus/milvus -f values.yaml -n <your-milvus-namespace>
```
Use embedding function
Once Vertex AI is configured, follow these steps to define and use embedding functions.
Step 1: Define schema fields
To use an embedding function, create a collection with a specific schema. This schema must include at least three necessary fields:
The primary field that uniquely identifies each entity in a collection.
A scalar field that stores raw data to be embedded.
A vector field reserved to store vector embeddings that the function will generate for the scalar field.
```python
from pymilvus import MilvusClient, DataType, Function, FunctionType

# Assume you have connected to Milvus
# client = MilvusClient(uri="http://localhost:19530")

# 1. Create schema
schema = MilvusClient.create_schema()

# 2. Add fields
schema.add_field("id", DataType.INT64, is_primary=True, auto_id=False)
schema.add_field("document", DataType.VARCHAR, max_length=9000)  # Store text data

# IMPORTANT: Set dim to match the output dimension of the model and parameters
schema.add_field("dense_vector", DataType.FLOAT_VECTOR, dim=768)  # Store embedding vectors (example dimension)
```
Step 2: Add embedding function to schema
The Function module in Milvus automatically converts raw data stored in a scalar field into embeddings and stores them into the explicitly defined vector field.
```python
# 3. Define Vertex AI embedding function
text_embedding_function = Function(
    name="vert_func",  # Unique identifier for this embedding function
    function_type=FunctionType.TEXTEMBEDDING,  # Indicates a text embedding function
    input_field_names=["document"],  # Scalar field(s) containing text data to embed
    output_field_names=["dense_vector"],  # Vector field(s) for storing embeddings
    params={  # Vertex AI specific parameters (function-level)
        "provider": "vertexai",  # Must be set to "vertexai"
        "model_name": "text-embedding-005",  # Required: Specifies the Vertex AI model to use
        "projectid": "your-gcp-project-id",  # Required: Your Google Cloud project ID
        # Optional parameters (include these only if necessary):
        # "location": "us-central1",  # Optional: Vertex AI service region (default us-central1)
        # "task": "DOC_RETRIEVAL",  # Optional: Embedding task type (default DOC_RETRIEVAL)
        # "dim": 768,  # Optional: Output vector dimension (1-768)
    },
)

# Add the configured embedding function to your existing collection schema
schema.add_function(text_embedding_function)
```
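A common pitfall is a mismatch between the vector field's `dim` and the function's optional `"dim"` parameter. The tiny helper below is hypothetical (not part of pymilvus) and simply encodes that consistency rule so you can check it before creating the collection:

```python
def dims_consistent(field_dim: int, function_params: dict) -> bool:
    """True when the function params either omit "dim" or request exactly
    the dimension declared on the schema's vector field."""
    requested = function_params.get("dim")
    return requested is None or int(requested) == field_dim
```

For example, `dims_consistent(768, {"dim": 768})` passes, while requesting a different output dimension than the field declares fails.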
| Parameter | Description | Required? | Example Value |
|---|---|---|---|
| `provider` | The embedding model provider. Set to `"vertexai"`. | Yes | `"vertexai"` |
| `model_name` | Specifies which Vertex AI embedding model to use. | Yes | `"text-embedding-005"` |
| `projectid` | Your Google Cloud project ID. | Yes | `"your-gcp-project-id"` |
| `location` | The region for the Vertex AI service. Currently, Vertex AI embeddings primarily support `us-central1`. Defaults to `us-central1`. | No | `"us-central1"` |
| `task` | Specifies the embedding task type, affecting embedding results. Accepted values: `DOC_RETRIEVAL` (default), `CODE_RETRIEVAL` (only text-embedding-005 supported), `STS` (Semantic Textual Similarity). | No | `"DOC_RETRIEVAL"` |
| `dim` | The dimension of the output embedding vectors. Accepts integers between 1 and 768. Note: If specified, ensure the `dim` of the vector field in the schema matches this value. | No | `768` |
Next steps
After configuring the embedding function, refer to the Function Overview for additional guidance on index configuration, data insertion examples, and semantic search operations.