Cohere Ranker
Compatible with Milvus 2.6.x
The Cohere Ranker leverages Cohere’s powerful rerank models to enhance search relevance through semantic reranking. It provides enterprise-grade reranking capabilities with robust API infrastructure and optimized performance for production environments.
Cohere Ranker is particularly valuable for applications requiring:
High-quality semantic understanding with state-of-the-art rerank models
Enterprise-grade reliability and scalability for production workloads
Multilingual reranking capabilities across diverse content types
Consistent API performance with built-in rate limiting and error handling
Prerequisites
Before implementing Cohere Ranker in Milvus, ensure you have:
A Milvus collection with a VARCHAR field containing the text to be reranked
A valid Cohere API key with access to reranking models. Sign up at Cohere’s platform to obtain your API credentials. You can either:
Set the COHERE_API_KEY environment variable, or
Specify the API key directly in the credential parameter of the ranker configuration (see the sketch below)
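If you go the credential route, the key travels inside the ranker configuration itself. A minimal sketch of the relevant params entries, assuming placeholder values (the full Function definition appears in the next section):
# Ranker params with the API key supplied directly; when "credential" is present,
# it is used instead of the COHERE_API_KEY environment variable.
ranker_params = {
    "reranker": "model",
    "provider": "cohere",
    "model_name": "rerank-english-v3.0",
    "queries": ["renewable energy developments"],
    "credential": "your-cohere-api-key",  # placeholder value, not a real key
}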
Create a Cohere ranker function
To use Cohere Ranker in your Milvus application, create a Function object that specifies how the reranking should operate. This function will be passed to Milvus search operations to enhance result ranking.
from pymilvus import MilvusClient, Function, FunctionType

# Connect to your Milvus server
client = MilvusClient(
    uri="http://localhost:19530"  # Replace with your Milvus server URI
)

# Configure Cohere Ranker
cohere_ranker = Function(
    name="cohere_semantic_ranker",       # Unique identifier for your ranker
    input_field_names=["document"],      # VARCHAR field containing text to rerank
    function_type=FunctionType.RERANK,   # Must be RERANK for reranking functions
    params={
        "reranker": "model",                           # Enables model-based reranking
        "provider": "cohere",                          # Specifies Cohere as the service provider
        "model_name": "rerank-english-v3.0",           # Cohere rerank model to use
        "queries": ["renewable energy developments"],  # Query text for relevance evaluation
        "max_client_batch_size": 128,                  # Optional: batch size for model service requests (default: 128)
        "max_tokens_per_doc": 4096,                    # Optional: max tokens per document (default: 4096)
        # "credential": "your-cohere-api-key"          # Optional: authentication credential for Cohere API
    }
)
import io.milvus.v2.client.ConnectConfig;
import io.milvus.v2.client.MilvusClientV2;
import io.milvus.common.clientenum.FunctionType;
import io.milvus.v2.service.collection.request.CreateCollectionReq;

import java.util.Collections;

MilvusClientV2 client = new MilvusClientV2(ConnectConfig.builder()
        .uri("http://localhost:19530")
        .build());

CreateCollectionReq.Function ranker = CreateCollectionReq.Function.builder()
        .functionType(FunctionType.RERANK)
        .name("cohere_semantic_ranker")
        .inputFieldNames(Collections.singletonList("document"))
        .param("reranker", "model")
        .param("provider", "cohere")
        .param("model_name", "rerank-english-v3.0")
        .param("queries", "[\"renewable energy developments\"]")
        .param("max_client_batch_size", "128")
        .param("max_tokens_per_doc", "4096")
        .build();
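The examples above use Cohere’s English rerank model. For the multilingual use case mentioned earlier, the same configuration can point at a multilingual model instead. A minimal sketch, assuming the model name below is available to your Cohere account (verify it against Cohere’s published model list):
from pymilvus import Function, FunctionType

# Same ranker definition, switched to a multilingual rerank model.
# The model name is an assumption; confirm it in Cohere's documentation before use.
multilingual_ranker = Function(
    name="cohere_multilingual_ranker",
    input_field_names=["document"],
    function_type=FunctionType.RERANK,
    params={
        "reranker": "model",
        "provider": "cohere",
        "model_name": "rerank-multilingual-v3.0",  # assumed model name
        "queries": ["renewable energy developments"],
    },
)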
Cohere ranker-specific parameters
The following parameters are specific to the Cohere ranker:
| Parameter | Required? | Description | Value / Example |
|---|---|---|---|
| reranker | Yes | Must be set to "model" to enable model-based reranking. | "model" |
| provider | Yes | The model service provider to use for reranking. | "cohere" |
| model_name | Yes | The Cohere rerank model to use from the supported models on the Cohere platform. For a list of available rerank models, refer to the Cohere documentation. | "rerank-english-v3.0" |
| queries | Yes | List of query strings used by the rerank model to calculate relevance scores. The number of query strings must exactly match the number of queries in your search operation (even when using query vectors instead of text); otherwise, an error is reported. | ["search query"] |
| max_client_batch_size | No | Since model services may not process all data at once, this sets the batch size for accessing the model service in multiple requests. | 128 (default) |
| max_tokens_per_doc | No | Maximum number of tokens per document. Longer documents are automatically truncated to this limit. | 4096 (default) |
| credential | No | Authentication credential for accessing the Cohere API. If not specified, the system looks for the COHERE_API_KEY environment variable. | "your-cohere-api-key" |
For general parameters shared across all model rankers (e.g., provider, queries), refer to Create a model ranker.
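The queries requirement is the rule that most often trips people up: the ranker needs exactly one query string per query in the search call, even when the search itself is driven by vectors. A minimal sketch with two query vectors in a single request (collection, field names, and the second query text are illustrative; client is the MilvusClient created earlier):
from pymilvus import Function, FunctionType

# Two search queries -> two entries in the ranker's "queries" list, in the same order
batch_ranker = Function(
    name="cohere_batch_ranker",
    input_field_names=["document"],
    function_type=FunctionType.RERANK,
    params={
        "reranker": "model",
        "provider": "cohere",
        "model_name": "rerank-english-v3.0",
        "queries": ["renewable energy developments", "grid storage technologies"],
    },
)

results = client.search(
    collection_name="your_collection",
    data=[query_vector_1, query_vector_2],  # two query vectors, matching the two query strings above
    anns_field="dense_vector",
    limit=5,
    output_fields=["document"],
    ranker=batch_ranker,
)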
Apply to standard vector search
To apply Cohere Ranker to a standard vector search:
# Execute search with Cohere reranking
results = client.search(
    collection_name="your_collection",
    data=[your_query_vector],      # Replace with your query vector
    anns_field="dense_vector",     # Vector field to search
    limit=5,                       # Number of results to return
    output_fields=["document"],    # Include text field for reranking
    ranker=cohere_ranker,          # Apply Cohere reranking
    consistency_level="Bounded"
)
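The returned object reads like any other MilvusClient search result; a minimal sketch of printing the reranked hits, assuming the collection and field names used above:
# Each inner list corresponds to one query; hits arrive in reranked order
for hits in results:
    for hit in hits:
        print(hit["id"], hit["distance"], hit["entity"]["document"])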
import io.milvus.v2.common.ConsistencyLevel;
import io.milvus.v2.service.vector.request.SearchReq;
import io.milvus.v2.service.vector.response.SearchResp;
import io.milvus.v2.service.vector.request.data.EmbeddedText;

import java.util.Arrays;
import java.util.Collections;

SearchReq searchReq = SearchReq.builder()
        .collectionName(COLLECTION_NAME)
        .data(Arrays.asList(new EmbeddedText("AI Research Progress"), new EmbeddedText("What is AI")))
        .annsField("vector_field")
        .limit(10)
        .outputFields(Collections.singletonList("document"))
        .functionScore(FunctionScore.builder()
                .addFunction(ranker)
                .build())
        .consistencyLevel(ConsistencyLevel.BOUNDED)
        .build();
SearchResp searchResp = client.search(searchReq);
Apply to hybrid search
Cohere Ranker can also be used with hybrid search to combine dense and sparse retrieval methods:
from pymilvus import AnnSearchRequest

# Configure dense vector search
dense_search = AnnSearchRequest(
    data=[your_query_vector_1],  # Replace with your query vector
    anns_field="dense_vector",
    param={},
    limit=5
)

# Configure sparse vector search
sparse_search = AnnSearchRequest(
    data=[your_query_vector_2],  # Replace with your query vector
    anns_field="sparse_vector",
    param={},
    limit=5
)

# Execute hybrid search with Cohere reranking
hybrid_results = client.hybrid_search(
    collection_name="your_collection",
    reqs=[dense_search, sparse_search],  # Multiple search requests
    ranker=cohere_ranker,                # Apply Cohere reranking to combined results
    limit=5,                             # Final number of results
    output_fields=["document"]
)
import io.milvus.v2.service.vector.request.AnnSearchReq;
import io.milvus.v2.service.vector.request.HybridSearchReq;
import io.milvus.v2.service.vector.request.data.EmbeddedText;
import io.milvus.v2.service.vector.request.data.FloatVec;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

List<AnnSearchReq> searchRequests = new ArrayList<>();
searchRequests.add(AnnSearchReq.builder()
        .vectorFieldName("dense_vector")
        .vectors(Arrays.asList(new FloatVec(embedding1), new FloatVec(embedding2)))
        .limit(5)
        .build());
searchRequests.add(AnnSearchReq.builder()
        .vectorFieldName("sparse_vector")
        .vectors(Arrays.asList(new EmbeddedText("AI Research Progress"), new EmbeddedText("What is AI")))
        .limit(5)
        .build());

HybridSearchReq hybridSearchReq = HybridSearchReq.builder()
        .collectionName("your_collection")
        .searchRequests(searchRequests)
        .ranker(ranker)
        .limit(5)
        .outputFields(Collections.singletonList("document"))
        .build();
SearchResp searchResp = client.hybridSearch(hybridSearchReq);