

Model Ranker Overview

Compatible with Milvus 2.6.x

Traditional vector search ranks results purely by mathematical similarity—how closely vectors match in high-dimensional space. While efficient, this approach often misses true semantic relevance. Consider searching for “best practices for database optimization”: you might receive documents with high vector similarity that mention these terms frequently, but don’t actually provide actionable optimization strategies.

Model Ranker transforms Milvus search by integrating advanced language models that understand semantic relationships between queries and documents. Instead of relying solely on vector similarity, it evaluates content meaning and context to deliver more intelligent, relevant results.

Limits

  • Model rankers cannot be used with grouping searches.

  • Fields used for model reranking must be text type (VARCHAR).

  • Each model ranker can use only one VARCHAR field at a time for evaluation.

How it works

Model rankers integrate language model understanding capabilities into the Milvus search process through a well-defined workflow:

(Figure: Model Ranker workflow diagram)

  1. Initial query: Your application sends a query to Milvus

  2. Vector search: Milvus performs standard vector search to identify candidate documents

  3. Candidate retrieval: The system identifies the initial set of candidate documents based on vector similarity

  4. Model evaluation: The Model Ranker Function processes query-document pairs:

    • Sends the original query and candidate documents to an external model service

    • The language model evaluates semantic relevance between query and each document

    • Each document receives a relevance score based on semantic understanding

  5. Intelligent reranking: Documents are reordered based on model-generated relevance scores

  6. Enhanced results: Your application receives results ranked by semantic relevance rather than just vector similarity
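The steps above can be sketched as a toy simulation in plain Python. This is not the pymilvus API: `toy_vector_search` and `toy_model_score` are hypothetical stand-ins for Milvus ANN search and the external model service, used only to show why model-based reranking can reorder the candidates that vector similarity produced.

```python
# Toy simulation of the Model Ranker workflow (hypothetical helpers,
# not the pymilvus API): vector search proposes candidates, a model
# service rescores them, and results are reordered by model score.

def toy_vector_search(query, docs, limit=3):
    # Stand-in for Milvus ANN search: rank by naive term overlap.
    def overlap(doc):
        return len(set(query.split()) & set(doc.split()))
    return sorted(docs, key=overlap, reverse=True)[:limit]

def toy_model_score(query, doc):
    # Stand-in for the external model service: reward actionable content,
    # the kind of semantic signal a language model can pick up.
    score = len(set(query.split()) & set(doc.split()))
    if "create an index" in doc:
        score += 10
    return score

docs = [
    "database optimization terms mentioned frequently",
    "to optimize a slow database query, create an index",
    "history of database systems",
]
query = "best practices for database optimization"

candidates = toy_vector_search(query, docs)                    # steps 2-3
scored = [(toy_model_score(query, d), d) for d in candidates]  # step 4
reranked = [d for _, d in sorted(scored, reverse=True)]        # step 5
print(reranked[0])  # the actionable document wins after reranking
```

Note that the term-heavy document ranks first on raw overlap (the vector-similarity stand-in), but the actionable document rises to the top once the model score is applied.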

Choose a model provider for your needs

Milvus supports the following model service providers for reranking, each with distinct characteristics:

| Provider | Best For | Characteristics | Example Use Case |
|---|---|---|---|
| vLLM | Complex applications requiring deep semantic understanding and customization | Supports various large language models; flexible deployment options; higher computational requirements; greater customization potential | Legal research platform deploying domain-specific models that understand legal terminology and case law relationships |
| TEI | Quick implementation with efficient resource usage | Lightweight service optimized for text operations; easier deployment with lower resource requirements; pre-optimized reranking models; minimal infrastructure overhead | Content management system needing efficient reranking capabilities with standard requirements |

For detailed information about implementing each model service, refer to the dedicated vLLM Ranker and TEI Ranker documentation.

Implementation

Before implementing Model Ranker, ensure you have:

  • A Milvus collection with a VARCHAR field containing the text to be reranked

  • A running external model service (vLLM or TEI) accessible to your Milvus instance

  • Appropriate network connectivity between Milvus and your chosen model service

Model rankers integrate seamlessly with both standard vector search and hybrid search operations. The implementation involves creating a Function object that defines your reranking configuration and passing it to search operations.

Create a model ranker

To implement model reranking, first define a Function object with the appropriate configuration:

from pymilvus import MilvusClient, Function, FunctionType

# Connect to your Milvus server
client = MilvusClient(
    uri="http://localhost:19530"  # Replace with your Milvus server URI
)

# Create a model ranker function
model_ranker = Function(
    name="semantic_ranker",  # Function identifier
    input_field_names=["document"],  # VARCHAR field to use for reranking
    function_type=FunctionType.RERANK,  # Must be set to RERANK
    params={
        "reranker": "model",  # Specify model reranker. Must be "model"
        "provider": "tei",  # Choose provider: "tei" or "vllm"
        "queries": ["machine learning for time series"],  # Query text
        "endpoint": "http://model-service:8080",  # Model service endpoint
        # "maxBatch": 32  # Optional: batch size for processing
    }
)

| Parameter | Required? | Description | Value / Example |
|---|---|---|---|
| `name` | Yes | Identifier for your function, used when executing searches. | `"semantic_ranker"` |
| `input_field_names` | Yes | Name of the text field to use for reranking. Must be a `VARCHAR` field. | `["document"]` |
| `function_type` | Yes | Type of function being created. Must be set to `RERANK` for all model rankers. | `FunctionType.RERANK` |
| `params` | Yes | Dictionary containing the configuration for the model-based reranking function. The available keys vary by provider (`tei` or `vllm`). Refer to vLLM Ranker or TEI Ranker for details. | `{…}` |
| `params.reranker` | Yes | Must be set to `"model"` to enable model reranking. | `"model"` |
| `params.provider` | Yes | The model service provider to use for reranking. | `"tei"` or `"vllm"` |
| `params.queries` | Yes | List of query strings the reranking model uses to calculate relevance scores. The number of query strings must exactly match the number of queries in your search operation (even when using query vectors instead of text); otherwise, an error is reported. | `["search query"]` |
| `params.endpoint` | Yes | URL of the model service. | `"http://localhost:8080"` |
| `params.maxBatch` | No | Maximum number of documents to process in a single batch. Larger values increase throughput but require more memory. | `32` (default) |
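The batching behavior implied by `maxBatch` can be illustrated with a small stdlib sketch. This is illustrative only; the actual batching happens inside Milvus when it calls the model service:

```python
def batches(docs, max_batch=32):
    # Split candidate documents into chunks of at most max_batch,
    # mirroring how candidates are grouped into per-request batches.
    return [docs[i:i + max_batch] for i in range(0, len(docs), max_batch)]

docs = [f"doc-{i}" for i in range(70)]
chunks = batches(docs, max_batch=32)
print([len(c) for c in chunks])  # → [32, 32, 6]
```

A larger `max_batch` means fewer round trips to the model service at the cost of more memory per request.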

After defining your model ranker, you can apply it during search operations by passing it to the ranker parameter:

# Use the model ranker in standard vector search
results = client.search(
    collection_name="my_collection",  # Replace with your collection name
    data=["machine learning for time series"],  # Number of queries must match model_ranker.params["queries"]
    anns_field="vector_field",
    limit=10,
    output_fields=["document"],  # Include the text field in outputs
    ranker=model_ranker,  # Apply the model ranker here
    consistency_level="Strong"
)

Model rankers can also be applied to hybrid search operations that combine multiple vector fields:

from pymilvus import AnnSearchRequest

# Define search requests for different vector fields
dense_request = AnnSearchRequest(
    data=["machine learning for time series"],
    anns_field="dense_vector",
    param={},
    limit=20
)

sparse_request = AnnSearchRequest(
    data=["machine learning for time series"],
    anns_field="sparse_vector",
    param={},
    limit=20
)

# Apply model ranker to hybrid search
hybrid_results = client.hybrid_search(
    collection_name="my_collection",  # Replace with your collection name
    reqs=[dense_request, sparse_request],
    ranker=model_ranker,  # Same model ranker works with hybrid search
    limit=10,
    output_fields=["document"]
)
