hybrid_search()

This operation performs multi-vector search on a collection and returns search results after reranking.

Request Syntax

hybrid_search(
    reqs: List,
    rerank: BaseRanker,
    limit: int,
    partition_names: Optional[List[str]] = None,
    output_fields: Optional[List[str]] = None,
    timeout: Optional[float] = None,
    round_decimal: int = -1,
)

PARAMETERS:

  • reqs (List[AnnSearchRequest]) -

    A list of search requests, where each request is an AnnSearchRequest object. Each request corresponds to a different vector field and a different set of search parameters.

    • AnnSearchRequest: A class representing an ANN search request.

      ├── AnnSearchRequest
      │   ├── data
      │   ├── anns_field
      │   ├── param
      │   ├── limit
      │   └── expr
      
      • data: The query vector to search in the request.

      • anns_field: The vector field to use in the request.

      • param: A dictionary of search parameters for the request. For details, refer to Search parameters.

      • limit: The maximum number of results to return in the request.

      • expr: (Optional) The expression to filter the results.

  • rerank (BaseRanker) -

    The reranking strategy to use for hybrid search. Valid values: WeightedRanker and RRFRanker.

    • WeightedRanker: The Average Weighted Scoring reranking strategy, which combines the scores from each search request using the weights you assign, so that requests with higher weights contribute more to the final ranking.

    • RRFRanker: The Reciprocal Rank Fusion (RRF) reranking strategy, which merges results from multiple searches, favoring entities that rank well across several result lists.

  • limit (int) -

    The total number of entities to return.

    You can use this parameter in combination with offset in param to enable pagination.

    The sum of this value and offset in param should be less than 16,384.

  • partition_names (List[str]) -

    A list of partition names.

    The value defaults to None. If specified, the search covers only the specified partitions.

  • output_fields (List[str]) -

    A list of names of the fields to include in each returned entity.

    The value defaults to None. If left unspecified, only the primary field is included.

  • timeout (float) -

    The timeout duration for this operation. Setting this to None indicates that this operation waits until a response arrives or an error occurs.

  • round_decimal (int) -

    The number of decimal places that Milvus rounds the calculated distances to.

    The value defaults to -1, indicating that Milvus skips rounding the calculated distances and returns the raw value.
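
To make the two strategies described under rerank more concrete, here is a minimal pure-Python sketch of weighted score fusion and reciprocal rank fusion. It is an illustration only, not Milvus's implementation: the function names are hypothetical, the weighted version skips any score normalization that Milvus may apply, and the constant k=60 is a commonly used RRF value assumed here.

```python
from collections import defaultdict

def weighted_fusion(result_lists, weights):
    """Sum weight * score per entity ID across result lists (no normalization)."""
    fused = defaultdict(float)
    for hits, weight in zip(result_lists, weights):
        for entity_id, score in hits:
            fused[entity_id] += weight * score
    # Highest fused score first
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

def rrf_fusion(result_lists, k=60):
    """Sum 1 / (k + rank) per entity ID; k=60 is an assumed constant."""
    fused = defaultdict(float)
    for hits in result_lists:
        for rank, (entity_id, _score) in enumerate(hits, start=1):
            fused[entity_id] += 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Hypothetical (id, score) results from two searches, best match first
text_hits = [(1, 0.9), (2, 0.5)]
image_hits = [(2, 0.8), (3, 0.7)]

weighted = weighted_fusion([text_hits, image_hits], [0.8, 0.2])
rrf = rrf_fusion([text_hits, image_hits])

print([entity_id for entity_id, _ in weighted])  # weighted order of IDs
print([entity_id for entity_id, _ in rrf])       # RRF order of IDs
```

With these inputs, the weighted variant ranks entity 1 first because the text search carries most of the weight, while the RRF variant ranks entity 2 first because it appears in both result lists.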

RETURN TYPE:

SearchResult

RETURNS:

A SearchResult object that contains a list of Hits objects.

  • Response structure

    notes

    A SearchResult object contains a list of Hits objects, each corresponding to a query vector in the search request.

    A Hits object contains a list of Hit objects, each corresponding to an entity hit by the search.

    ├── SearchResult
    │   └── Hits  
    │       ├── ids
    │       ├── distances
    │       └── Hit
    │           ├── id
    │           ├── distance
    │           ├── score
    │           ├── vector
    │           └── get()
    
  • Properties and methods

    • A Hits object has the following fields:

      • ids (list[int] | list[str])

        A list containing the IDs of the hit entities.

      • distances (list[float])

        A list of distances from the hit entities' vector fields to the query vector.

    • A Hit object has the following fields:

      • id (int | str)

        The ID of a hit entity.

      • distance (float)

        The distance from a hit entity's vector field to the query vector.

      • score (float)

        An alias of distance.

      • vector (list[float])

        The vector field of a hit entity.

      • get(field_name: str)

        A function to get the value of the specified field in a hit entity.
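
The nesting described above (SearchResult, then Hits, then Hit) can be walked with two loops. The sketch below assumes that a SearchResult and each Hits object can be iterated like lists, as the structure suggests. The helper flatten_hits is hypothetical, and the stand-in objects exist only so the sketch runs without a Milvus server.

```python
from types import SimpleNamespace

def flatten_hits(search_result):
    """Return (query_index, id, distance) tuples from a SearchResult-like object."""
    rows = []
    # One Hits object per query vector
    for query_index, hits in enumerate(search_result):
        # One Hit object per matched entity
        for hit in hits:
            rows.append((query_index, hit.id, hit.distance))
    return rows

# Stand-in for a real result: one query vector with two hit entities
fake_result = [
    [SimpleNamespace(id=7, distance=0.91), SimpleNamespace(id=3, distance=0.42)],
]

rows = flatten_hits(fake_result)
print(rows)
```

Against a real result, the same loops would also allow calling hit.get(field_name) for any field listed in output_fields.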

EXCEPTIONS:

  • MilvusException

    This exception will be raised when any error occurs during this operation.

Examples

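from pymilvus import AnnSearchRequest, Collection, RRFRanker, WeightedRanker

# Assumes a connection to Milvus has already been established, e.g. via connections.connect()
collection = Collection(name='{your_collection_name}') # Replace with the actual name of your collection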

res = collection.hybrid_search(
    reqs=[
        AnnSearchRequest(
            data=[['{your_text_query_vector}']],  # Replace with your text vector data
            anns_field='{text_vector_field_name}',  # Textual data vector field
            param={"metric_type": "IP", "params": {"nprobe": 10}}, # Search parameters
            limit=2
        ),
        AnnSearchRequest(
            data=[['{your_image_query_vector}']],  # Replace with your image vector data
            anns_field='{image_vector_field_name}',  # Image data vector field
            param={"metric_type": "IP", "params": {"nprobe": 10}}, # Search parameters
            limit=2
        )
    ],
    # Use WeightedRanker to combine results with specified weights
    rerank=WeightedRanker(0.8, 0.2), # Assign weights of 0.8 to text search and 0.2 to image search
    # Alternatively, use RRFRanker for reciprocal rank fusion reranking
    # rerank=RRFRanker(),
    limit=2
)
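
For the pagination described under limit, a minimal sketch of deriving limit and offset from a page number might look like the following. The helper page_window is hypothetical and not part of pymilvus. It only enforces the documented constraint that the sum of limit and offset stays below 16,384.

```python
MAX_WINDOW = 16384  # limit + offset must stay below this value

def page_window(page, page_size):
    """Return (limit, offset) for a 1-based page number, validating the bound."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    if offset + page_size >= MAX_WINDOW:
        raise ValueError("limit + offset must be less than 16,384")
    return page_size, offset

# Third page with ten results per page
limit, offset = page_window(page=3, page_size=10)
print(limit, offset)
```

The resulting values would then be supplied as limit and as offset in param, as the limit description indicates.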