Conduct a Hybrid Search
This topic describes how to conduct a hybrid search.
A hybrid search is essentially a vector search with attribute filtering. By specifying boolean expressions that filter the scalar fields or the primary key field, you can restrict the search to entities that meet specific conditions.
The following example shows how to perform a hybrid search on the basis of a regular vector search. Suppose you want to search for certain books based on their vectorized introductions, but you only want those within a specific range of word count. You can then specify a boolean expression that filters the word_count field in the search parameters. Milvus will search for similar vectors only among those entities that match the expression.
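To make the idea of "vector search with attribute filtering" concrete, the following is a minimal brute-force sketch in plain Python. It is not how Milvus implements hybrid search (Milvus relies on vector indexes), and the sample entities, the l2_distance helper, and the hybrid_search function are hypothetical names introduced only for illustration. The sketch applies the scalar filter and ranks only the matching entities by Euclidean distance.

```python
import math

# Hypothetical in-memory stand-in for the "book" collection.
# Each entity has a primary key, a scalar field, and a 2-dimensional vector.
books = [
    {"book_id": 1, "word_count": 9000, "book_intro": [0.1, 0.2]},
    {"book_id": 2, "word_count": 12000, "book_intro": [0.1, 0.21]},
    {"book_id": 3, "word_count": 10500, "book_intro": [0.4, 0.1]},
    {"book_id": 4, "word_count": 8000, "book_intro": [0.9, 0.9]},
]


def l2_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hybrid_search(entities, query, predicate, limit):
    """Keep entities that satisfy the predicate, then return the
    primary keys of the `limit` nearest vectors among them."""
    candidates = [e for e in entities if predicate(e)]
    candidates.sort(key=lambda e: l2_distance(e["book_intro"], query))
    return [e["book_id"] for e in candidates[:limit]]


# Equivalent in spirit to expr = "word_count <= 11000" with limit = 2.
top = hybrid_search(books, [0.1, 0.2], lambda e: e["word_count"] <= 11000, 2)
print(top)  # [1, 3]
```

Note that book 2 has the second-closest vector but is excluded because its word count exceeds the filter, so book 3 takes its place in the results.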
Load collection
All search and query operations within Milvus are executed in memory. Load the collection to memory before conducting a vector search.
from pymilvus import Collection
collection = Collection("book") # Get an existing collection.
collection.load()
await milvusClient.collectionManager.loadCollection({
collection_name: "book",
});
err := milvusClient.LoadCollection(
context.Background(), // ctx
"book", // CollectionName
false, // async
)
if err != nil {
log.Fatal("failed to load collection:", err.Error())
}
milvusClient.loadCollection(
LoadCollectionParam.newBuilder()
.withCollectionName("book")
.build()
);
load -c book
Conduct a hybrid vector search
By specifying a boolean expression, you can filter on the scalar fields of the entities during the vector search. The following example limits the scope of the search to the vectors whose word_count value falls within a specified range.
search_param = {
"data": [[0.1, 0.2]],
"anns_field": "book_intro",
"param": {"metric_type": "L2", "params": {"nprobe": 10}},
"limit": 2,
"expr": "word_count <= 11000",
}
res = collection.search(**search_param)
const results = await milvusClient.dataManager.search({
collection_name: "book",
expr: "word_count <= 11000",
vectors: [[0.1, 0.2]],
search_params: {
anns_field: "book_intro",
topk: "2",
metric_type: "L2",
params: JSON.stringify({ nprobe: 10 }),
},
vector_type: 101, // DataType.FloatVector,
});
sp, _ := entity.NewIndexFlatSearchParam( // NewIndex*SearchParam func
10, // searchParam
)
searchResult, err := milvusClient.Search(
context.Background(), // ctx
"book", // CollectionName
[]string{}, // partitionNames
"word_count <= 11000", // expr
[]string{"book_id"}, // outputFields
[]entity.Vector{entity.FloatVector([]float32{0.1, 0.2})}, // vectors
"book_intro", // vectorField
entity.L2, // metricType
2, // topK
sp, // sp
)
if err != nil {
log.Fatal("fail to search collection:", err.Error())
}
final Integer SEARCH_K = 2;
final String SEARCH_PARAM = "{\"nprobe\":10}";
List<String> search_output_fields = Arrays.asList("book_id");
List<List<Float>> search_vectors = Arrays.asList(Arrays.asList(0.1f, 0.2f));
SearchParam searchParam = SearchParam.newBuilder()
.withCollectionName("book")
.withMetricType(MetricType.L2)
.withOutFields(search_output_fields)
.withTopK(SEARCH_K)
.withVectors(search_vectors)
.withVectorFieldName("book_intro")
.withExpr("word_count <= 11000")
.withParams(SEARCH_PARAM)
.build();
R<SearchResults> respSearch = milvusClient.search(searchParam);
search
Collection name (book): book
The vectors of search data(the length of data is number of query (nq), the dim of every vector in data must be equal to vector field’s of collection. You can also import a csv file without headers): [[0.1, 0.2]]
The vector field used to search of collection (book_intro): book_intro
Metric type: L2
Search parameter nprobe's value: 10
The max number of returned record, also known as topk: 2
The boolean expression used to filter attribute []: word_count <= 11000
The names of partitions to search (split by "," if multiple) ['_default'] []:
timeout []:
Guarantee Timestamp(It instructs Milvus to see all operations performed before a provided timestamp. If no such timestamp is provided, then Milvus will search all operations performed to date) [0]:
Travel Timestamp(Specify a timestamp in a search to get results based on a data view) [0]:
Parameter | Description |
---|---|
data | Vectors to search with. |
anns_field | Name of the field to search on. |
param | Search parameter(s) specific to the index. See Vector Index for more information. |
limit | Number of the most similar results to return. |
expr | Boolean expression used to filter attributes. See Boolean Expression Rules for more information. |
partition_names (optional) | List of names of the partitions to search in. |
output_fields (optional) | Names of the fields to return. Vector fields are not supported in the current release. |
timeout (optional) | Duration in seconds to allow for the RPC. When set to None, the client waits until the server responds or an error occurs. |
round_decimal (optional) | Number of decimal places of the returned distance. |
Parameter | Description |
---|---|
collection_name | Name of the collection to search in. |
search_params | Parameters (as an object) used for search. |
vectors | Vectors to search with. |
vector_type | Pre-check of binary or float vectors. 100 for binary vectors and 101 for float vectors. |
partition_names (optional) | List of names of the partitions to search in. |
expr (optional) | Boolean expression used to filter attributes. See Boolean Expression Rules for more information. |
output_fields (optional) | Names of the fields to return. Vector fields are not supported in the current release. |
Parameter | Description | Options |
---|---|---|
NewIndex*SearchParam func | Function to create entity.SearchParam according to different index types. | For floating point vectors: |
searchParam | Search parameter(s) specific to the index. | See Vector Index for more information. |
ctx | Context to control API invocation process. | N/A |
CollectionName | Name of the collection to search in. | N/A |
partitionNames | List of names of the partitions to search in. All partitions will be searched if it is left empty. | N/A |
expr | Boolean expression used to filter attributes. | See Boolean Expression Rules for more information. |
outputFields | Names of the fields to return. | Vector fields are not supported in the current release. |
vectors | Vectors to search with. | N/A |
vectorField | Name of the field to search on. | N/A |
metricType | Metric type used for search. | This parameter must be identical to the metric type used for index building. |
topK | Number of the most similar results to return. | N/A |
sp | entity.SearchParam specific to the index. | N/A |
Parameter | Description | Options |
---|---|---|
CollectionName | Name of the collection to search in. | N/A |
MetricType | Metric type used for search. | This parameter must be identical to the metric type used for index building. |
OutFields | Names of the fields to return. | Vector fields are not supported in the current release. |
TopK | Number of the most similar results to return. | N/A |
Vectors | Vectors to search with. | N/A |
VectorFieldName | Name of the field to search on. | N/A |
Expr | Boolean expression used to filter attributes. | See Boolean Expression Rules for more information. |
Params | Search parameter(s) specific to the index. | See Vector Index for more information. |
Option | Full name | Description |
---|---|---|
--help | n/a | Displays help for using the command. |
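The examples above filter with a single upper bound, while the scenario describes a range of word counts and the introduction also mentions filtering on the primary key field. The following sketch shows one way to assemble such expression strings in Python before passing them as expr. The helper names range_expr and in_expr and the lower bound of 1000 are assumptions made for illustration, and the and and in operators should be checked against Boolean Expression Rules for your Milvus version.

```python
def range_expr(field, low, high):
    """Build an expression matching low <= field <= high (hypothetical helper)."""
    if low > high:
        raise ValueError("low must not exceed high")
    return f"{field} >= {low} and {field} <= {high}"


def in_expr(field, values):
    """Build an expression matching any of the given integer values,
    for example to filter on primary keys (hypothetical helper)."""
    return f"{field} in [{', '.join(str(v) for v in values)}]"


# Same search parameters as the Python example, with a two-sided range filter.
# The lower bound 1000 is an illustrative value, not taken from the example.
search_param = {
    "data": [[0.1, 0.2]],
    "anns_field": "book_intro",
    "param": {"metric_type": "L2", "params": {"nprobe": 10}},
    "limit": 2,
    "expr": range_expr("word_count", 1000, 11000),
}

print(search_param["expr"])          # word_count >= 1000 and word_count <= 11000
print(in_expr("book_id", [1, 2, 3]))  # book_id in [1, 2, 3]
```

The resulting dictionary can be passed to collection.search(**search_param) in the same way as the Python example above.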
Check the returned results.
assert len(res) == 1
hits = res[0]
assert len(hits) == 2
print(f"- Total hits: {len(hits)}, hits ids: {hits.ids} ")
print(f"- Top1 hit id: {hits[0].id}, distance: {hits[0].distance}, score: {hits[0].score} ")
console.log(results.results)
fmt.Printf("%#v\n", searchResult)
for _, sr := range searchResult {
fmt.Println(sr.IDs)
fmt.Println(sr.Scores)
}
SearchResultsWrapper wrapperSearch = new SearchResultsWrapper(respSearch.getData().getResults());
System.out.println(wrapperSearch.getIDScore(0));
System.out.println(wrapperSearch.getFieldData("book_id", 0));
# Milvus CLI automatically returns the primary key values of the most similar vectors and their distances.
What’s next
Explore API references for Milvus SDKs: