Hugging Face TEI

Compatible with Milvus 2.6.x

Hugging Face Text Embeddings Inference (TEI) is a high-performance inference server specifically designed for text embedding models. This guide explains how to use Hugging Face TEI with Milvus for efficient text embedding generation.

TEI works with many text embedding models from the Hugging Face Hub, including:

BAAI/bge-* series
sentence-transformers/* series
E5 models
GTE models
And many more

For the latest list of supported models, refer to the TEI GitHub repository and Hugging Face Hub.

TEI deployment

Before configuring the TEI function in Milvus, you need a running TEI service. Milvus supports two approaches for TEI deployment:

Standard deployment (external)

You can deploy TEI as a standalone service using the official methods from Hugging Face. This approach gives you maximum flexibility and control over your TEI service.

For detailed instructions on deploying TEI using Docker or other methods, refer to the Hugging Face Text Embeddings Inference official documentation.

After deployment, make note of your TEI service endpoint (e.g., http://localhost:8080) as you’ll need it when using the TEI function in Milvus.
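Before wiring the endpoint into Milvus, it can help to confirm that the service answers and to see how long its vectors are, since that length is the value the vector field's dim has to match when you define the schema. The sketch below is one way to do this with only the Python standard library. It assumes TEI's /embed route, which accepts a JSON body of the form {"inputs": [...]} and returns one vector per input, and it assumes that a server started with --api-key expects the key as a Bearer token. The helper names (build_embed_request, embedding_dim, probe) are illustrative and not part of TEI or Milvus.

```python
import json
import urllib.request


def build_embed_request(endpoint, texts, api_key=None):
    """Build a POST request for TEI's /embed route (assumed route name)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: TEI started with --api-key expects a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        endpoint.rstrip("/") + "/embed",
        data=body,
        headers=headers,
        method="POST",
    )


def embedding_dim(vectors):
    """Return the shared length of the returned vectors, or raise if they differ."""
    dims = {len(v) for v in vectors}
    if len(dims) != 1:
        raise ValueError(f"expected a single embedding size, got {sorted(dims)}")
    return dims.pop()


def probe(endpoint, api_key=None, timeout=10):
    """Send one short text to TEI and return the embedding dimension."""
    request = build_embed_request(endpoint, ["hello from Milvus"], api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return embedding_dim(json.load(response))
```

With a TEI service listening locally, calling probe("http://localhost:8080") should return the model's output dimension; an HTTP error or timeout here usually means the endpoint or key needs attention before Milvus is configured.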

Milvus Helm Chart deployment (integrated)

For Kubernetes environments, Milvus offers an integrated deployment option through its Helm chart. This simplifies the process by deploying and configuring TEI alongside Milvus.

To enable TEI in your Milvus Helm deployment:

Configure values.yaml to enable TEI:

tei:
  enabled: true
  image:
    repository: ghcr.io/huggingface/text-embeddings-inference
    tag: "1.7" # Modify based on hardware
  model: "BAAI/bge-large-en-v1.5" # Modify based on requirements
  # revision: "main"
  # hfTokenSecretName: "my-huggingface-token-secret"
  # apiKey: "your_secure_api_key"
  # apiKeySecret:
  #   name: "my-tei-api-key-secret"
  #   key: "api-key"
  resources:
    requests:
      cpu: "1"
      memory: "4Gi"
      # nvidia.com/gpu: "1" # For GPU
    limits:
      cpu: "2"
      memory: "8Gi"
      # nvidia.com/gpu: "1" # For GPU
  extraArgs: []

Deploy or Upgrade Milvus:
```
helm install my-release milvus/milvus -f values.yaml -n <your-milvus-namespace>
# or
helm upgrade my-release milvus/milvus -f values.yaml --reset-then-reuse-values -n <your-milvus-namespace>
```
When using the Helm chart deployment, the TEI service is reachable inside your Kubernetes cluster at http://<release-name>-milvus-tei:80, for example http://my-release-milvus-tei:80 for the release installed above. Use this address as the endpoint in the TEI function configuration.

Configuration in Milvus

After deploying your TEI service, you’ll need to provide its endpoint when defining a TEI embedding function. In most cases, no additional configuration is required as TEI is enabled by default in Milvus.

If your TEI service was deployed with API key authentication (the --api-key flag), you also need to configure Milvus to use that key:

Define API keys in the credential section:

# milvus.yaml
credential:
  tei_key:  # You can use any label name
    apikey: <YOUR_TEI_API_KEY>

Reference the credential in milvus.yaml:

function:
  textEmbedding:
    providers:
      tei:
        credential: tei_key  # Label defined in the credential section above
        enable: true         # Enabled by default; no action required

Use embedding function

Once the TEI service is configured, follow these steps to define and use embedding functions.

Step 1: Define schema fields

To use an embedding function, create a collection with a specific schema. The schema must include at least three fields:

The primary field that uniquely identifies each entity in a collection.
A scalar field that stores raw data to be embedded.
A vector field reserved to store vector embeddings that the function will generate for the scalar field.

The following example defines a schema with one scalar field "document" for storing textual data and one vector field "dense_vector" for storing embeddings to be generated by the Function module. Remember to set the vector dimension (dim) to match the output of your chosen embedding model.

from pymilvus import MilvusClient, DataType, Function, FunctionType

# Assume you have connected to Milvus
# client = MilvusClient(uri="http://localhost:19530")

# 1. Create Schema
schema = MilvusClient.create_schema()

# 2. Add fields
schema.add_field("id", DataType.INT64, is_primary=True, auto_id=False)
schema.add_field("document", DataType.VARCHAR, max_length=9000) # Store text data
# IMPORTANT: Set dim to exactly match the TEI model's output dimension
schema.add_field("dense_vector", DataType.FLOAT_VECTOR, dim=1024) # Store embedding vectors (example dimension)

Step 2: Add embedding function to schema

The Function module in Milvus automatically converts raw data stored in a scalar field into embeddings and stores them into the explicitly defined vector field.

The example below adds a Function module (tei_func) that converts the scalar field "document" into embeddings, storing the resulting vectors in the "dense_vector" vector field defined earlier.

Once you have defined your embedding function, add it to your collection schema. This instructs Milvus to use the specified embedding function to process and store embeddings from your text data.

# 3. Define TEI embedding function
text_embedding_function = Function(
    name="tei_func",                            # Unique identifier for this embedding function
    function_type=FunctionType.TEXTEMBEDDING,   # Indicates a text embedding function
    input_field_names=["document"],             # Scalar field(s) containing text data to embed
    output_field_names=["dense_vector"],        # Vector field(s) for storing embeddings
    params={                                    # TEI specific parameters (function-level)
        "provider": "TEI",                      # Must be set to "TEI"
        "endpoint": "http://your-tei-service-endpoint:80", # Required: Points to your TEI service address
        # Optional parameters:
        # "truncate": "true",                   # Optional: Whether to truncate long input (default false)
        # "truncation_direction": "right",      # Optional: Truncation direction (default right)
        # "max_client_batch_size": 64,          # Optional: Client max batch size (default 32)
        # "ingestion_prompt": "passage: ",      # Optional: (Advanced) Ingestion phase prompt
        # "search_prompt": "query: "            # Optional: (Advanced) Search phase prompt
    }
)

# Add the configured embedding function to your existing collection schema
schema.add_function(text_embedding_function)

| Parameter | Required? | Description | Example Value |
| --- | --- | --- | --- |
| `provider` | Yes | The embedding model provider. Set to "TEI". | "TEI" |
| `endpoint` | Yes | The network address of your deployed TEI service. If deployed via the Milvus Helm chart, this is usually the internal Service address. | "http://localhost:8080", "http://my-release-milvus-tei:80" |
| `truncate` | No | Whether to truncate input texts that exceed the model's maximum length. Defaults to false. | "true" |
| `truncation_direction` | No | Effective when truncate is true. Specifies whether to truncate from the left or the right. Defaults to right. | "left" |
| `max_client_batch_size` | No | The maximum batch size the Milvus client sends to TEI. Defaults to 32. | 64 |
| `prompt_name` | No | (Advanced) Specifies a key in the sentence-transformers configuration prompts dictionary. Used for models that require specific prompt formats. TEI support might be limited and depends on the model's configuration on the Hub. | "your_prompt_key" |
| `ingestion_prompt` | No | (Advanced) Specifies the prompt to use during the data insertion (ingestion) phase. Depends on the TEI model used; the model must support prompts. | "passage: " |
| `search_prompt` | No | (Advanced) Specifies the prompt to use during the search phase. Depends on the TEI model used; the model must support prompts. | "query: " |
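To keep these parameters consistent across several collections, one option is to assemble the params dictionary in a small helper that applies the rules from the table. The sketch below is a hypothetical convenience function (tei_params is not part of pymilvus). It follows the value formats shown in the example above, with truncate passed as the string "true" or "false" and max_client_batch_size as an integer; the http/https check on the endpoint is an assumption made for illustration.

```python
def tei_params(
    endpoint,
    truncate=None,
    truncation_direction=None,
    max_client_batch_size=None,
    ingestion_prompt=None,
    search_prompt=None,
):
    """Build the params dict for a TEI text embedding Function (hypothetical helper)."""
    # Assumption for this sketch: the endpoint is a plain HTTP(S) URL.
    if not endpoint.startswith(("http://", "https://")):
        raise ValueError("endpoint must start with http:// or https://")

    # provider and endpoint are the two required parameters.
    params = {"provider": "TEI", "endpoint": endpoint}

    if truncate is not None:
        # The documented example passes this flag as a string.
        params["truncate"] = "true" if truncate else "false"
    if truncation_direction is not None:
        if truncation_direction not in ("left", "right"):
            raise ValueError("truncation_direction must be 'left' or 'right'")
        params["truncation_direction"] = truncation_direction
    if max_client_batch_size is not None:
        if max_client_batch_size < 1:
            raise ValueError("max_client_batch_size must be positive")
        params["max_client_batch_size"] = max_client_batch_size
    if ingestion_prompt is not None:
        params["ingestion_prompt"] = ingestion_prompt
    if search_prompt is not None:
        params["search_prompt"] = search_prompt
    return params
```

The returned dictionary can be passed as the params argument of Function; optional keys that are left unset are omitted so that the defaults listed in the table apply.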

Next steps

After configuring the embedding function, refer to the Function Overview for additional guidance on index configuration, data insertion examples, and semantic search operations.
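As a rough preview of those steps, the sketch below creates a collection from the schema, inserts raw text, and searches with a text query, leaving the embedding work to the TEI function. It assumes client is a connected MilvusClient and schema is the schema defined in the steps above with tei_func attached. The collection name tei_demo, the AUTOINDEX and COSINE index settings, and the helper function names are illustrative choices rather than requirements; the Function Overview remains the authoritative reference.

```python
def create_collection_with_index(client, schema, collection_name="tei_demo"):
    """Create the collection with an index on the function's output field."""
    index_params = client.prepare_index_params()
    index_params.add_index(
        field_name="dense_vector",
        index_type="AUTOINDEX",   # Assumption: let Milvus choose the index type
        metric_type="COSINE",     # Assumption: pick the metric your model expects
    )
    client.create_collection(
        collection_name=collection_name,
        schema=schema,
        index_params=index_params,
    )


def insert_documents(client, texts, collection_name="tei_demo"):
    """Insert raw text only; the dense_vector field is filled by the function."""
    rows = [{"id": i, "document": text} for i, text in enumerate(texts, start=1)]
    return client.insert(collection_name=collection_name, data=rows)


def search_text(client, query, collection_name="tei_demo", limit=3):
    """Search with a raw text query; Milvus embeds it through the same function."""
    return client.search(
        collection_name=collection_name,
        data=[query],
        anns_field="dense_vector",
        limit=limit,
        output_fields=["document"],
    )
```

Note that neither the inserted rows nor the query carry vectors: the input field "document" is the only text supplied, and the function writes to or searches against "dense_vector" on your behalf.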
