Conduct a Hybrid Search

In addition to vectors, Milvus supports data types such as boolean, integers, floating-point numbers, and more. A collection in Milvus can hold multiple fields for accommodating different data features or properties. Milvus is a flexible vector database that pairs scalar filtering with powerful vector similarity search.

Parameters marked with * are specific to the Python SDK, and those marked with ** are specific to the Node.js SDK.

A hybrid search is a vector similarity search, during which you can filter the scalar data by specifying a boolean expression.
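To make the definition concrete, the following is a minimal pure-Python sketch of what a hybrid search computes conceptually: entities are first restricted by a scalar predicate and then ranked by vector distance. It does not call Milvus. The `hybrid_search` helper, the sample entities, and the use of squared Euclidean distance for ranking are assumptions made for illustration, and the way Milvus actually executes the query may differ.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hybrid_search(
    entities: List[Dict],
    query: Sequence[float],
    predicate: Callable[[Dict], bool],
    limit: int,
) -> List[Tuple[int, float]]:
    """Hypothetical helper: filter on scalar fields, then rank by distance."""
    candidates = [e for e in entities if predicate(e)]
    ranked = sorted(candidates, key=lambda e: squared_l2(e["films"], query))
    return [(e["film_id"], squared_l2(e["films"], query)) for e in ranked[:limit]]


# Illustrative data only, not taken from a real collection.
entities = [
    {"film_id": 1, "films": [0.0, 0.0]},
    {"film_id": 2, "films": [1.0, 1.0]},
    {"film_id": 3, "films": [1.0, 0.9]},
    {"film_id": 4, "films": [3.0, 3.0]},
]

hits = hybrid_search(
    entities,
    query=[1.0, 1.0],
    predicate=lambda e: e["film_id"] in [2, 4, 6, 8],
    limit=2,
)
print(hits)  # [(2, 0.0), (4, 8.0)]
```

Entity 3 is closer to the query than entity 4, but it is excluded because it does not satisfy the filter. This is the behavior that distinguishes a hybrid search from a plain similarity search.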

  1. Connect to the Milvus server:
from pymilvus import connections
connections.connect("default", host='localhost', port='19530')
import { MilvusClient } from "@zilliz/milvus2-sdk-node";
const milvusClient = new MilvusClient("localhost:19530");
Detailed Description
| Parameter | Description | Note |
|---|---|---|
| alias* | Alias for the Milvus server | Data type: String. Mandatory |
| host* | IP address of the Milvus server | Mandatory |
| port* | Port of the Milvus server | Mandatory |
| address** | Address of the Milvus server | "server_IP:server_port". Mandatory |
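The Python SDK takes `host` and `port` separately, while the Node.js SDK takes a single `address` in the form "server_IP:server_port". As a small sketch of how the two relate, the hypothetical helper below (not part of either SDK) joins a host and port into that form and rejects ports outside the valid TCP range.

```python
def to_address(host: str, port) -> str:
    """Hypothetical helper: join host and port into 'server_IP:server_port'."""
    port_number = int(port)  # accepts 19530 or "19530"
    if not 0 < port_number < 65536:
        raise ValueError(f"port out of range: {port_number}")
    return f"{host}:{port_number}"


address = to_address("localhost", "19530")
print(address)  # localhost:19530
```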
  2. Prepare collection parameters and create a collection:
>>> from pymilvus import Collection, FieldSchema, CollectionSchema, DataType
>>> collection_name = "test_collection_search"
>>> schema = CollectionSchema([
...     FieldSchema("film_id", DataType.INT64, is_primary=True),
...     FieldSchema("films", dtype=DataType.FLOAT_VECTOR, dim=2)
... ])
>>> collection = Collection(collection_name, schema, using='default', shards_num=2)
const COLLECTION_NAME = "test_collection_search";
await milvusClient.collectionManager.createCollection({
  collection_name: COLLECTION_NAME,
  fields: [
    {
      name: "films",
      description: "vector field",
      data_type: DataType.FloatVector,
      type_params: {
        dim: "2",
      },
    },
    {
      name: "film_id",
      data_type: DataType.Int64,
      autoID: false,
      is_primary_key: true,
      description: "",
    },
  ],
});
Detailed Description
| Parameter | Description | Note |
|---|---|---|
| collection_name | Name of the collection to create | Data type: String |
| field_name | Name of the field in the collection | Data type: String |
| Schema | Schema used to create a collection and the fields within. Refer to field schema and collection schema for a detailed description. | |
| description | Description of the collection | Data type: String |
| using* | Alias of the Milvus server in which to create the collection | Optional |
| shards_num* | Number of shards for the collection to create | Optional |
  3. Insert random vectors into the newly created collection:
>>> import random
>>> data = [
...     [i for i in range(10)],
...     [[random.random() for _ in range(2)] for _ in range(10)],
... ]
>>> collection.insert(data)
>>> collection.num_entities
10
let id = 1;
const entities = Array.from({ length: 10 }, () => ({
  films: Array.from({ length: 2 }, () => Math.random() * 10),
  film_id: id++,
}));

await milvusClient.collectionManager.insert({
  collection_name: COLLECTION_NAME,
  fields_data: entities,
});
Detailed Description
| Parameter | Description | Note |
|---|---|---|
| data | Data to insert into Milvus | Mandatory |
| partition_name | Name of the partition to insert data into | Optional |
| timeout* | Timeout (in seconds) to allow for RPC. When it is set to None, clients wait until the server responds or an error occurs. | Optional |
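Note that the two examples above lay out the same data differently: the Python example passes one list per field, with the columns in the same order as the fields in the schema, while the Node.js example passes one object per entity. The sketch below converts between the two layouts in plain Python. The helper names `rows_to_columns` and `columns_to_rows` are hypothetical and are not part of either SDK.

```python
from typing import Any, Dict, List


def rows_to_columns(rows: List[Dict[str, Any]], field_order: List[str]) -> List[List[Any]]:
    """Turn row-based entities into one list per field, in field_order."""
    return [[row[name] for row in rows] for name in field_order]


def columns_to_rows(columns: List[List[Any]], field_order: List[str]) -> List[Dict[str, Any]]:
    """Inverse operation: zip the columns back into one dict per entity."""
    lengths = {len(col) for col in columns}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of entries")
    return [dict(zip(field_order, values)) for values in zip(*columns)]


# Illustrative entities using the field names from this page.
rows = [
    {"film_id": 0, "films": [0.1, 0.2]},
    {"film_id": 1, "films": [0.3, 0.4]},
]
columns = rows_to_columns(rows, ["film_id", "films"])
print(columns)  # [[0, 1], [[0.1, 0.2], [0.3, 0.4]]]
```

Checking that every column has the same length before inserting is a cheap way to catch mismatched data early.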
  4. Load the collection into memory and conduct a hybrid search:
>>> collection.load()
>>> search_param = {
...     "data": [[1.0, 1.0]],
...     "anns_field": "films",
...     "param": {"metric_type": "L2"},
...     "limit": 2,
...     "expr": "film_id in [2,4,6,8]",
... }
>>> res = collection.search(**search_param)
await milvusClient.collectionManager.loadCollection({
  collection_name: COLLECTION_NAME,
});
await milvusClient.dataManager.search({
  collection_name: COLLECTION_NAME,
  // partition_names: [],
  expr: "film_id in [1,4,6,8]",
  vectors: [entities[0].films],
  search_params: {
    anns_field: "films",
    topk: "4",
    metric_type: "L2",
    params: JSON.stringify({ nprobe: 10 }),
  },
  vector_type: 101, // 101 = float vector, 100 = binary vector
});
Detailed Description
| Parameter | Description | Note |
|---|---|---|
| collection_name** | Name of the collection to load and search | Mandatory |
| vectors | Vectors to search with. The number of vectors is the number of queries (nq). | Mandatory |
| anns_field | Name of the field to search on | Mandatory |
| params* | Search parameter(s) specific to the index | Find more parameter details of different indexes in Index Selection. Mandatory |
| limit* | Number of the most similar results to return | Mandatory |
| expr | Boolean expression used to filter attributes | Find more expression details in Predicate Expressions. Optional |
| partition_names | Names of the partitions to search on | Optional |
| output_fields | Names of the fields to return (vector fields are not supported in the current release) | Optional |
| timeout* | Timeout (in seconds) to allow for RPC. When it is set to None, clients wait until the server responds or an error occurs. | Optional |
| vector_type** | Pre-check of binary/float vectors. 100 for binary vectors and 101 for float vectors. | Mandatory |
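The `expr` value in the search examples is an ordinary string, so it can be assembled from a list of IDs at runtime. The following sketch shows one way to do that. `build_in_expr` is a hypothetical helper, not an SDK function, and it only covers the integer `in [...]` form used on this page.

```python
from typing import Iterable


def build_in_expr(field: str, values: Iterable[int]) -> str:
    """Hypothetical helper: format a filter such as 'film_id in [2,4]'."""
    ids = [int(v) for v in values]  # int() raises on values that are not numbers
    if not ids:
        raise ValueError("at least one value is required")
    return f"{field} in [{','.join(str(i) for i in ids)}]"


expr = build_in_expr("film_id", [2, 4, 6, 8])
print(expr)  # film_id in [2,4,6,8]
```

Converting every value with `int()` before formatting keeps arbitrary strings out of the expression.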
  5. Check the returned results:
>>> assert len(res) == 1
>>> hits = res[0]
>>> assert len(hits) == 2
>>> print(f"- Total hits: {len(hits)}, hits ids: {hits.ids} ")
- Total hits: 2, hits ids: [2, 4]
>>> print(f"- Top1 hit id: {hits[0].id}, distance: {hits[0].distance}, score: {hits[0].score} ")
- Top1 hit id: 2, distance: 0.10143111646175385, score: 0.101431116461
// search result will be like:
{
  status: { error_code: 'Success', reason: '' },
  results: [
    { score: 0, id: '1' },
    { score: 9.266796112060547, id: '4' },
    { score: 28.263811111450195, id: '8' },
    { score: 41.055686950683594, id: '6' }
  ]
}
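With the L2 metric, a smaller value means a closer match, so the hits should be ordered by ascending score, and every returned ID should satisfy the filter expression. The sketch below runs both checks against the sample Node.js result shown above. The `results` list is copied from that sample output, and `allowed_ids` mirrors the `film_id in [1,4,6,8]` expression. This is a sanity check for illustration, not an SDK feature.

```python
# Sample hits copied from the Node.js search result on this page.
results = [
    {"score": 0, "id": "1"},
    {"score": 9.266796112060547, "id": "4"},
    {"score": 28.263811111450195, "id": "8"},
    {"score": 41.055686950683594, "id": "6"},
]
allowed_ids = {1, 4, 6, 8}  # IDs permitted by the filter expression

scores = [hit["score"] for hit in results]
is_sorted = scores == sorted(scores)  # ascending distance for L2
all_allowed = all(int(hit["id"]) in allowed_ids for hit in results)
print(is_sorted, all_allowed)  # True True
```

The first hit has a score of 0 because the query vector in the Node.js example is the vector of the entity with ID 1 itself.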