search()
This operation conducts a vector similarity search with an optional scalar filtering expression.
Request syntax
search(
collection_name: str,
data: Union[List[list], list],
filter: str = "",
limit: int = 10,
output_fields: Optional[List[str]] = None,
search_params: Optional[dict] = None,
timeout: Optional[float] = None,
partition_names: Optional[List[str]] = None,
**kwargs,
) -> List[dict]
PARAMETERS:
collection_name (str) -
[REQUIRED]
The name of an existing collection.
data (List[list], list]) -
[REQUIRED]
A list of vector embeddings.
Milvus searches for the most similar vector embeddings to the specified ones.
anns_field (str) -
The name of the target vector field of the current search.
filter (str) -
A scalar filtering condition to filter matching entities.
The value defaults to an empty string, indicating that no condition applies.
You can set this parameter to an empty string to skip scalar filtering. To build a scalar filtering condition, refer to Boolean Expression Rules.
limit (int) -
The total number of entities to return.
You can use this parameter in combination with offset in param to enable pagination.
The sum of this value and offset in param should be less than 16,384.
output_fields (l_ist[str]_) -
A list of field names to include in each entity in return.
The value defaults to None. If left unspecified, only the primary field is included.
search_params (dict) -
The parameter settings specific to this operation.
metric_type (str) -
The metric type applied to this operation. This should be the same as the one used when you index the vector field specified above.
Possible values are L2, IP, and COSINE.
params (dict) -
Additional parameters
radius (float) -
Determines the threshold of least similarity. When setting
metric_type
toL2
, ensure that this value is greater than that of range_filter. Otherwise, this value should be lower than that of range_filter.range_filter (float) -
Refines the search to vectors within a specific similarity range. When setting
metric_type
toIP
orCOSINE
, ensure that this value is greater than that of radius. Otherwise, this value should be lower than that of radius.
For details on other applicable search parameters, refer to In-memory Index and On-disk Index.
timeout (float | None) -
The timeout duration for this operation. Setting this to None indicates that this operation timeouts when any response arrives or any error occurs.
partition_names (list) -
A list of partition names.
The value defaults to None. If specified, only the specified partitions are involved in queries.
kwargs -
offset (int) -
The number of records to skip in the search result.
You can use this parameter in combination with
limit
to enable pagination.The sum of this value and
limit
should be less than 16,384.round_decimal (int) -
The number of decimal places that Milvus rounds the calculated distances to.
The value defaults to -1, indicating that Milvus skips rounding the calculated distances and returns the raw value.
RETURN TYPE:
list[dict]
RETURNS: A list of dictionaries that contains the searched entities with specified output fields.
EXCEPTIONS:
MilvusException
This exception will be raised when any error occurs during this operation.
Examples
from pymilvus import MilvusClient
# 1. Set up a milvus client
client = MilvusClient(
uri="http://localhost:19530",
token="root:Milvus"
)
# 2. Create a collection
client.create_collection(
collection_name="test_collection",
dimension=5
)
# 3. Insert data
client.insert(
collection_name="test_collection",
data=[
{"id": 0, "vector": [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592], "color": "pink_8682"},
{"id": 1, "vector": [0.19886812562848388, 0.06023560599112088, 0.6976963061752597, 0.2614474506242501, 0.838729485096104], "color": "red_7025"},
{"id": 2, "vector": [0.43742130801983836, -0.5597502546264526, 0.6457887650909682, 0.7894058910881185, 0.20785793220625592], "color": "orange_6781"},
{"id": 3, "vector": [0.3172005263489739, 0.9719044792798428, -0.36981146090600725, -0.4860894583077995, 0.95791889146345], "color": "pink_9298"},
{"id": 4, "vector": [0.4452349528804562, -0.8757026943054742, 0.8220779437047674, 0.46406290649483184, 0.30337481143159106], "color": "red_4794"},
{"id": 5, "vector": [0.985825131989184, -0.8144651566660419, 0.6299267002202009, 0.1206906911183383, -0.1446277761879955], "color": "yellow_4222"},
{"id": 6, "vector": [0.8371977790571115, -0.015764369584852833, -0.31062937026679327, -0.562666951622192, -0.8984947637863987], "color": "red_9392"},
{"id": 7, "vector": [-0.33445148015177995, -0.2567135004164067, 0.8987539745369246, 0.9402995886420709, 0.5378064918413052], "color": "grey_8510"},
{"id": 8, "vector": [0.39524717779832685, 0.4000257286739164, -0.5890507376891594, -0.8650502298996872, -0.6140360785406336], "color": "white_9381"},
{"id": 9, "vector": [0.5718280481994695, 0.24070317428066512, -0.3737913482606834, -0.06726932177492717, -0.6980531615588608], "color": "purple_4976"}
],
)
# {'insert_count': 10}
# 4. Conduct a search
search_params = {
"metric_type": "IP",
"params": {}
}
# Search with limit
res = client.search(
collection_name="test_collection",
data=[[0.05, 0.23, 0.07, 0.45, 0.13]],
limit=3,
search_params=search_params
)
# [[{'id': 7, 'distance': 0.4801957309246063, 'entity': {}},
# {'id': 2, 'distance': 0.3205878734588623, 'entity': {}},
# {'id': 1, 'distance': 0.2993225157260895, 'entity': {}}]]
# Search with filter
res = client.search(
collection_name="test_collection",
data=[[0.05, 0.23, 0.07, 0.45, 0.13]],
limit=3,
filter='color like "red%"',
search_params=search_params
)
# [[{'id': 1, 'distance': 0.2993225157260895, 'entity': {}},
# {'id': 4, 'distance': 0.12666261196136475, 'entity': {}},
# {'id': 6, 'distance': -0.3535143733024597, 'entity': {}}]]
# Search with an offset
res = client.search(
collection_name="test_collection",
data=[[0.05, 0.23, 0.07, 0.45, 0.13]],
limit=3,
offset=3,
search_params=search_params
)
# [[{'id': 4, 'distance': 0.12666261196136475, 'entity': {}},
# {'id': 3, 'distance': 0.11930042505264282, 'entity': {}},
# {'id': 5, 'distance': -0.05843167006969452, 'entity': {}}]]
# Search with output fields
res = client.search(
collection_name="test_collection",
data=[[0.05, 0.23, 0.07, 0.45, 0.13]],
limit=3,
output_fields=["vector", "color"],
search_params=search_params
)
# [[{'id': 7,
# 'distance': 0.4801957309246063,
# 'entity': {'color': 'grey_8510',
# 'vector': [-0.33445146679878235,
# -0.25671350955963135,
# 0.8987540006637573,
# 0.9402995705604553,
# 0.537806510925293]}},
# {'id': 2,
# 'distance': 0.3205878734588623,
# 'entity': {'color': 'orange_6781',
# 'vector': [0.4374213218688965,
# -0.5597502589225769,
# 0.6457887887954712,
# 0.789405882358551,
# 0.20785793662071228]}},
# {'id': 1,
# 'distance': 0.2993225157260895,
# 'entity': {'color': 'red_7025',
# 'vector': [0.19886812567710876,
# 0.060235604643821716,
# 0.697696328163147,
# 0.2614474594593048,
# 0.8387295007705688]}}]]
# Conduct a range search
search_params = {
"metric_type": "IP",
"params": {
"radius": 0.1,
"range_filter": 0.8
}
}
res = client.search(
collection_name="test_collection",
data=[[0.05, 0.23, 0.07, 0.45, 0.13]],
limit=3,
search_params=search_params
)
# [[{'id': 7, 'distance': 0.4801957309246063, 'entity': {}},
# {'id': 2, 'distance': 0.3205878734588623, 'entity': {}},
# {'id': 1, 'distance': 0.2993225157260895, 'entity': {}}]]