SiliconFlow Ranker

Compatible with Milvus 2.6.x

The SiliconFlow Ranker leverages SiliconFlow’s comprehensive reranking models to enhance search relevance through semantic reranking. It provides flexible document chunking capabilities and supports a wide range of specialized reranking models from various providers.

SiliconFlow Ranker is particularly valuable for applications requiring:

- Advanced document chunking with configurable overlap for handling long documents
- Access to diverse reranking models, including the BAAI/bge-reranker series and other specialized models
- Flexible chunk-based scoring, where the highest-scoring chunk represents the document score
- Cost-effective reranking with support for both standard and pro model variants

Prerequisites

Before implementing SiliconFlow Ranker in Milvus, ensure you have:

- A Milvus collection with a VARCHAR field containing the text to be reranked
- A valid SiliconFlow API key with access to reranking models. Sign up at SiliconFlow’s platform to obtain your API credentials. You can either:
  - Set the `SILICONFLOW_API_KEY` environment variable, or
  - Specify the API key directly in the ranker configuration
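
The two credential options above can be sketched as a small helper that builds the ranker configuration. This is only an illustration: `siliconflow_params` is a hypothetical function, not part of pymilvus, and the parameter keys are the ones documented later on this page. When no credential is passed, the key is left out so that the `SILICONFLOW_API_KEY` environment variable is used instead.

```python
def siliconflow_params(queries, model_name="BAAI/bge-reranker-v2-m3", credential=None):
    """Build the params dict for a SiliconFlow ranker function (hypothetical helper)."""
    params = {
        "reranker": "model",
        "provider": "siliconflow",
        "model_name": model_name,
        "queries": list(queries),
    }
    # Option 2: pass the API key explicitly in the ranker configuration.
    # Option 1: omit it and rely on the SILICONFLOW_API_KEY environment variable.
    if credential is not None:
        params["credential"] = credential
    return params


# Relies on the environment variable
env_based = siliconflow_params(["renewable energy developments"])

# Passes the key directly
explicit = siliconflow_params(
    ["renewable energy developments"],
    credential="your-siliconflow-api-key",
)
```

The resulting dictionary is what you would pass as `params` when creating the ranker function in the next section.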

Create a SiliconFlow ranker function

To use SiliconFlow Ranker in your Milvus application, create a Function object that specifies how the reranking should operate. This function will be passed to Milvus search operations to enhance result ranking.

Python:

from pymilvus import MilvusClient, Function, FunctionType

# Connect to your Milvus server
client = MilvusClient(
    uri="http://localhost:19530"  # Replace with your Milvus server URI
)

# Configure SiliconFlow Ranker
siliconflow_ranker = Function(
    name="siliconflow_semantic_ranker",     # Unique identifier for your ranker
    input_field_names=["document"],         # VARCHAR field containing text to rerank
    function_type=FunctionType.RERANK,      # Must be RERANK for reranking functions
    params={
        "reranker": "model",                # Enables model-based reranking
        "provider": "siliconflow",          # Specifies SiliconFlow as the service provider
        "model_name": "BAAI/bge-reranker-v2-m3", # SiliconFlow reranking model to use
        "queries": ["renewable energy developments"], # Query text for relevance evaluation
        "max_client_batch_size": 128,       # Optional: batch size for model service requests (default: 128)
        "max_chunks_per_doc": 5,            # Optional: max chunks per document for supported models
        "overlap_tokens": 50,               # Optional: token overlap between chunks for supported models
        # "credential": "your-siliconflow-api-key" # Optional: if not set, uses SILICONFLOW_API_KEY env var
    }
)

Java:

import java.util.Collections;

import io.milvus.v2.client.ConnectConfig;
import io.milvus.v2.client.MilvusClientV2;
import io.milvus.common.clientenum.FunctionType;
import io.milvus.v2.service.collection.request.CreateCollectionReq;

// Connect to your Milvus server
MilvusClientV2 client = new MilvusClientV2(ConnectConfig.builder()
        .uri("http://localhost:19530")
        .build());

// Configure SiliconFlow Ranker; parameter values are passed as strings
CreateCollectionReq.Function ranker = CreateCollectionReq.Function.builder()
        .functionType(FunctionType.RERANK)
        .name("siliconflow_semantic_ranker")
        .inputFieldNames(Collections.singletonList("document"))
        .param("reranker", "model")
        .param("provider", "siliconflow")
        .param("model_name", "BAAI/bge-reranker-v2-m3")
        .param("queries", "[\"renewable energy developments\"]")
        .param("max_client_batch_size", "32")
        .param("max_chunks_per_doc", "5")
        .param("overlap_tokens", "50")
        .build();


SiliconFlow ranker-specific parameters

The following parameters are specific to the SiliconFlow ranker:

Parameter	Required?	Description	Value / Example
`reranker`	Yes	Must be set to `"model"` to enable model reranking.	`"model"`
`provider`	Yes	The model service provider to use for reranking.	`"siliconflow"`
`model_name`	Yes	The SiliconFlow reranking model to use from supported models on SiliconFlow platform. For a list of rerank models available, refer to SiliconFlow documentation.	`"BAAI/bge-reranker-v2-m3"`
`queries`	Yes	List of query strings used by the rerank model to calculate relevance scores. The number of query strings must exactly match the number of queries in your search operation (even when using query vectors instead of text); otherwise, an error is reported.	`["search query"]`
`max_client_batch_size`	No	Since model services may not process all data at once, this sets the batch size for accessing the model service in multiple requests.	`128` (default)
`max_chunks_per_doc`	No	Maximum number of chunks generated from a single document. Long documents are divided into multiple chunks for calculation, and the highest score among the chunks is taken as the document's score. Only supported by specific models: `BAAI/bge-reranker-v2-m3`, `Pro/BAAI/bge-reranker-v2-m3`, and `netease-youdao/bce-reranker-base_v1`.	`5`, `10`
`overlap_tokens`	No	Number of overlapping tokens between adjacent chunks when documents are chunked. This ensures continuity across chunk boundaries for better semantic understanding. Only supported by specific models: `BAAI/bge-reranker-v2-m3`, `Pro/BAAI/bge-reranker-v2-m3`, and `netease-youdao/bce-reranker-base_v1`.	`50`
`credential`	No	Authentication credential for accessing SiliconFlow API services. If not specified, the system will look for the `SILICONFLOW_API_KEY` environment variable.	`"your-siliconflow-api-key"`

Model-specific feature support: The `max_chunks_per_doc` and `overlap_tokens` parameters are only supported by specific models. When using other models, these parameters will be ignored.
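
To make the chunk-based scoring concrete, here is a minimal sketch of how a long document might be split into overlapping chunks and scored by its best chunk. It is an illustration under stated assumptions, not SiliconFlow's actual implementation: the whitespace tokenizer, the `chunk_size` argument, the truncation of extra chunks, and the `toy_score` function are all hypothetical stand-ins for what the model service does internally.

```python
def chunk_tokens(tokens, chunk_size, overlap_tokens, max_chunks_per_doc):
    """Split tokens into overlapping chunks, keeping at most max_chunks_per_doc."""
    if overlap_tokens >= chunk_size:
        raise ValueError("overlap_tokens must be smaller than chunk_size")
    if len(tokens) <= chunk_size:
        return [tokens]
    step = chunk_size - overlap_tokens  # each new chunk repeats overlap_tokens tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the document is covered or the chunk limit is reached (assumed behavior)
        if start + chunk_size >= len(tokens) or len(chunks) == max_chunks_per_doc:
            break
    return chunks


def document_score(tokens, score_chunk, chunk_size=4, overlap_tokens=1, max_chunks_per_doc=5):
    """The document's score is the highest score among its chunks."""
    chunks = chunk_tokens(tokens, chunk_size, overlap_tokens, max_chunks_per_doc)
    return max(score_chunk(chunk) for chunk in chunks)


QUERY_TERMS = {"solar", "energy"}


def toy_score(chunk):
    """Toy relevance score: fraction of chunk tokens that appear in the query."""
    if not chunk:
        return 0.0
    return sum(token in QUERY_TERMS for token in chunk) / len(chunk)


tokens = "wind power grows while solar energy costs fall".split()
print(chunk_tokens(tokens, 4, 1, 5))
print(document_score(tokens, toy_score))  # 0.5, from the chunk containing "solar energy"
```

In this toy run, only the middle chunk mentions the query terms, so it alone determines the document's score. The overlap keeps the token "while" in both the first and second chunks, so text at a chunk boundary is seen in context on both sides.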

For general parameters shared across all model rankers (e.g., `provider`, `queries`), refer to Create a model ranker.
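
Because `queries` must contain exactly one string per query in the search operation, a client-side check can catch a mismatch before the request reaches Milvus. The following is a minimal sketch; `check_query_count` is a hypothetical helper and not part of pymilvus.

```python
def check_query_count(ranker_queries, search_data):
    """Raise ValueError unless there is one ranker query string per search query."""
    if len(ranker_queries) != len(search_data):
        raise ValueError(
            f"ranker has {len(ranker_queries)} query string(s), "
            f"but the search has {len(search_data)} query vector(s)"
        )
    return True


# One query string for one query vector: passes
check_query_count(["renewable energy developments"], [[0.1, 0.2, 0.3]])
```

You could call such a check with the same `queries` list and `data` argument that you later pass to the ranker and to `client.search()`.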

Apply to standard vector search

To apply SiliconFlow Ranker to a standard vector search:

Python:

# Execute search with SiliconFlow reranking
results = client.search(
    collection_name="your_collection",
    data=[your_query_vector],  # Replace with your query vector
    anns_field="dense_vector",                   # Vector field to search
    limit=5,                                     # Number of results to return
    output_fields=["document"],                  # Include text field for reranking
    ranker=siliconflow_ranker,                  # Apply SiliconFlow reranking
    consistency_level="Bounded"
)

Java:

import java.util.Collections;

import io.milvus.v2.common.ConsistencyLevel;
import io.milvus.v2.service.vector.request.FunctionScore;
import io.milvus.v2.service.vector.request.SearchReq;
import io.milvus.v2.service.vector.request.data.EmbeddedText;
import io.milvus.v2.service.vector.response.SearchResp;

SearchReq searchReq = SearchReq.builder()
        .collectionName("your_collection")
        // One query, matching the single entry in the ranker's "queries" parameter
        .data(Collections.singletonList(new EmbeddedText("renewable energy developments")))
        .annsField("vector_field")
        .limit(10)
        .outputFields(Collections.singletonList("document"))
        .functionScore(FunctionScore.builder()
                .addFunction(ranker)
                .build())
        .consistencyLevel(ConsistencyLevel.BOUNDED)
        .build();
SearchResp searchResp = client.search(searchReq);


Apply to hybrid search

SiliconFlow Ranker can also be used with hybrid search to combine dense and sparse retrieval methods:

Python:

from pymilvus import AnnSearchRequest

# Configure dense vector search
dense_search = AnnSearchRequest(
    data=[your_query_vector_1], # Replace with your query vector
    anns_field="dense_vector",
    param={},
    limit=5
)

# Configure sparse vector search  
sparse_search = AnnSearchRequest(
    data=[your_query_vector_2], # Replace with your query vector
    anns_field="sparse_vector", 
    param={},
    limit=5
)

# Execute hybrid search with SiliconFlow reranking
hybrid_results = client.hybrid_search(
    collection_name="your_collection",
    reqs=[dense_search, sparse_search],         # Multiple search requests
    ranker=siliconflow_ranker,                 # Apply SiliconFlow reranking to combined results
    limit=5,                                   # Final number of results
    output_fields=["document"]
)

Java:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import io.milvus.v2.service.vector.request.AnnSearchReq;
import io.milvus.v2.service.vector.request.HybridSearchReq;
import io.milvus.v2.service.vector.request.data.EmbeddedText;
import io.milvus.v2.service.vector.request.data.FloatVec;
import io.milvus.v2.service.vector.response.SearchResp;

// Each request carries one query, matching the single entry in the ranker's "queries" parameter
List<AnnSearchReq> searchRequests = new ArrayList<>();
searchRequests.add(AnnSearchReq.builder()
        .vectorFieldName("dense_vector")
        .vectors(Collections.singletonList(new FloatVec(embedding1))) // Replace with your query vector
        .limit(5)
        .build());
searchRequests.add(AnnSearchReq.builder()
        .vectorFieldName("sparse_vector")
        .vectors(Collections.singletonList(new EmbeddedText("renewable energy developments")))
        .limit(5)
        .build());

HybridSearchReq hybridSearchReq = HybridSearchReq.builder()
        .collectionName("your_collection")
        .searchRequests(searchRequests)
        .ranker(ranker)
        .limit(5)
        .outputFields(Collections.singletonList("document"))
        .build();
SearchResp searchResp = client.hybridSearch(hybridSearchReq);

