

Model Ranker Overview

Compatible with Milvus 2.6.x

Traditional vector search ranks results purely by mathematical similarity—how closely vectors match in high-dimensional space. While efficient, this approach often misses true semantic relevance. Consider searching for “best practices for database optimization”: you might receive documents with high vector similarity that mention these terms frequently, but don’t actually provide actionable optimization strategies.

Model Ranker transforms Milvus search by integrating advanced language models that understand semantic relationships between queries and documents. Instead of relying solely on vector similarity, it evaluates content meaning and context to deliver more intelligent, relevant results.

Limits

  • Model rankers cannot be used with grouping searches.

  • Fields used for model reranking must be text type (VARCHAR).

  • Each model ranker can use only one VARCHAR field at a time for evaluation.

How it works

Model rankers integrate language model understanding capabilities into the Milvus search process through a well-defined workflow:

(Figure: Model Ranker workflow diagram)

  1. Initial query: Your application sends a query to Milvus

  2. Vector search: Milvus performs standard vector search to identify candidate documents

  3. Candidate retrieval: The system identifies the initial set of candidate documents based on vector similarity

  4. Model evaluation: The Model Ranker Function processes query-document pairs:

    • Sends the original query and candidate documents to an external model service

    • The language model evaluates semantic relevance between query and each document

    • Each document receives a relevance score based on semantic understanding

  5. Intelligent reranking: Documents are reordered based on model-generated relevance scores

  6. Enhanced results: Your application receives results ranked by semantic relevance rather than just vector similarity
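The steps above can be sketched as a toy simulation in plain Python. This is not the pymilvus API: `toy_vector_search` and `toy_model_score` are hypothetical stand-ins for Milvus ANN search and the external model service, used only to show why model-based reranking can reorder the candidates that vector similarity produced.

```python
# Toy simulation of the Model Ranker workflow (hypothetical helpers,
# not the pymilvus API): vector search proposes candidates, a model
# service rescores them, and results are reordered by model score.

def toy_vector_search(query, docs, limit=3):
    # Stand-in for Milvus ANN search: rank by naive term overlap.
    def overlap(doc):
        return len(set(query.split()) & set(doc.split()))
    return sorted(docs, key=overlap, reverse=True)[:limit]

def toy_model_score(query, doc):
    # Stand-in for the external model service: reward actionable content,
    # the kind of semantic signal a language model can pick up.
    score = len(set(query.split()) & set(doc.split()))
    if "create an index" in doc:
        score += 10
    return score

docs = [
    "database optimization terms mentioned frequently",
    "to optimize a slow database query, create an index",
    "history of database systems",
]
query = "best practices for database optimization"

candidates = toy_vector_search(query, docs)                    # steps 2-3
scored = [(toy_model_score(query, d), d) for d in candidates]  # step 4
reranked = [d for _, d in sorted(scored, reverse=True)]        # step 5
print(reranked[0])  # the actionable document wins after reranking
```

Note that the term-heavy document ranks first on raw overlap (the vector-similarity stand-in), but the actionable document rises to the top once the model score is applied.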

Choose a model provider for your needs

Milvus supports the following model service providers for reranking, each with distinct characteristics:

| Provider | Best For | Characteristics | Example Use Case |
|---|---|---|---|
| vLLM | Complex applications requiring deep semantic understanding and customization | Supports various large language models; flexible deployment options; higher computational requirements; greater customization potential | Legal research platform deploying domain-specific models that understand legal terminology and case law relationships |
| TEI | Quick implementation with efficient resource usage | Lightweight service optimized for text operations; easier deployment with lower resource requirements; pre-optimized reranking models; minimal infrastructure overhead | Content management system needing efficient reranking capabilities with standard requirements |

For detailed information about implementing each model service, refer to the dedicated vLLM Ranker and TEI Ranker documentation.

Implementation

Before implementing Model Ranker, ensure you have:

  • A Milvus collection with a VARCHAR field containing the text to be reranked

  • A running external model service (vLLM or TEI) accessible to your Milvus instance

  • Appropriate network connectivity between Milvus and your chosen model service

Model rankers integrate seamlessly with both standard vector search and hybrid search operations. The implementation involves creating a Function object that defines your reranking configuration and passing it to search operations.

Create a model ranker

To implement model reranking, first define a Function object with the appropriate configuration:

from pymilvus import MilvusClient, Function, FunctionType

# Connect to your Milvus server
client = MilvusClient(
    uri="http://localhost:19530"  # Replace with your Milvus server URI
)

# Create a model ranker function
model_ranker = Function(
    name="semantic_ranker",  # Function identifier
    input_field_names=["document"],  # VARCHAR field to use for reranking
    function_type=FunctionType.RERANK,  # Must be set to RERANK
    params={
        "reranker": "model",  # Specify model reranker. Must be "model"
        "provider": "tei",  # Choose provider: "tei" or "vllm"
        "queries": ["machine learning for time series"],  # Query text
        "endpoint": "http://model-service:8080",  # Model service endpoint
        # "maxBatch": 32  # Optional: batch size for processing
    }
)

| Parameter | Required? | Description | Value / Example |
|---|---|---|---|
| `name` | Yes | Identifier for your function, used when executing searches. | `"semantic_ranker"` |
| `input_field_names` | Yes | Name of the text field to use for reranking. Must be a `VARCHAR` field. | `["document"]` |
| `function_type` | Yes | Type of function being created. Must be set to `RERANK` for all model rankers. | `FunctionType.RERANK` |
| `params` | Yes | Dictionary containing the configuration for the model-based reranking function. The available keys vary by provider (`tei` or `vllm`). Refer to vLLM Ranker or TEI Ranker for details. | `{…}` |
| `params.reranker` | Yes | Must be set to `"model"` to enable model reranking. | `"model"` |
| `params.provider` | Yes | The model service provider to use for reranking. | `"tei"` or `"vllm"` |
| `params.queries` | Yes | List of query strings the reranking model uses to calculate relevance scores. The number of query strings must exactly match the number of queries in your search operation (even when using query vectors instead of text); otherwise, an error is reported. | `["search query"]` |
| `params.endpoint` | Yes | URL of the model service. | `"http://localhost:8080"` |
| `params.maxBatch` | No | Maximum number of documents to process in a single batch. Larger values increase throughput but require more memory. | `32` (default) |
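The batching behavior implied by `maxBatch` can be illustrated with a small stdlib sketch. This is illustrative only; the actual batching happens inside Milvus when it calls the model service:

```python
def batches(docs, max_batch=32):
    # Split candidate documents into chunks of at most max_batch,
    # mirroring how candidates are grouped into per-request batches.
    return [docs[i:i + max_batch] for i in range(0, len(docs), max_batch)]

docs = [f"doc-{i}" for i in range(70)]
chunks = batches(docs, max_batch=32)
print([len(c) for c in chunks])  # → [32, 32, 6]
```

A larger `max_batch` means fewer round trips to the model service at the cost of more memory per request.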

After defining your model ranker, you can apply it during search operations by passing it to the ranker parameter:

# Use the model ranker in standard vector search
results = client.search(
    collection_name="my_collection",  # Replace with your collection name
    data=["machine learning for time series"],  # Number of queries must match model_ranker.params["queries"]
    anns_field="vector_field",
    limit=10,
    output_fields=["document"],  # Include the text field in outputs
    ranker=model_ranker,  # Apply the model ranker here
    consistency_level="Strong"
)

Model rankers can also be applied to hybrid search operations that combine multiple vector fields:

from pymilvus import AnnSearchRequest

# Define search requests for different vector fields
dense_request = AnnSearchRequest(
    data=["machine learning for time series"],
    anns_field="dense_vector",
    param={},
    limit=20
)

sparse_request = AnnSearchRequest(
    data=["machine learning for time series"],
    anns_field="sparse_vector",
    param={},
    limit=20
)

# Apply model ranker to hybrid search
hybrid_results = client.hybrid_search(
    collection_name="my_collection",  # Replace with your collection name
    reqs=[dense_request, sparse_request],
    ranker=model_ranker,  # Same model ranker works with hybrid search
    limit=10,
    output_fields=["document"]
)
