GPU_CAGRA

The GPU_CAGRA index is a graph-based index optimized for GPUs. Running the GPU version of Milvus on inference-grade GPUs can be more cost-effective than running it on expensive training-grade GPUs.

Build index

To build a GPU_CAGRA index on a vector field in Milvus, use the add_index() method, specifying the index_type, metric_type, and additional parameters for the index.

from pymilvus import MilvusClient

# Prepare index building params
index_params = MilvusClient.prepare_index_params()

index_params.add_index(
    field_name="your_vector_field_name", # Name of the vector field to be indexed
    index_type="GPU_CAGRA", # Type of the index to create
    index_name="vector_index", # Name of the index to create
    metric_type="L2", # Metric type used to measure similarity
    params={
        "intermediate_graph_degree": 64, # Affects recall and build time by determining the graph’s degree before pruning
        "graph_degree": 32, # Affects search performance and recall by setting the graph’s degree after pruning; must be smaller than intermediate_graph_degree
        "build_algo": "IVF_PQ", # Selects the graph generation algorithm before pruning
        "cache_dataset_on_device": "true", # Decides whether to cache the original dataset in GPU memory
        "adapt_for_cpu": "false", # Decides whether to use GPU for index-building and CPU for search
    } # Index building params
)

In this configuration:

index_type: The type of index to be built. In this example, set the value to GPU_CAGRA.
metric_type: The method used to calculate the distance between vectors. For details, refer to Metric Types.
params: Additional configuration options for building the index. To learn more about the building parameters available for the GPU_CAGRA index, refer to Index building params.

Once the index parameters are configured, you can create the index either by calling the create_index() method directly or by passing the index params to the create_collection() method. For details, refer to Create Collection.
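
The two options above can be sketched as a small helper. This is a minimal sketch, not the only way to do it: the function name apply_index is hypothetical, client is assumed to be a connected MilvusClient instance, and schema is assumed to be a collection schema you have already prepared.

```python
def apply_index(client, collection_name, index_params, schema=None):
    """Apply prepared index params to a collection (hypothetical helper).

    client is assumed to be a connected MilvusClient instance.
    - Without a schema, the collection is assumed to exist already and the
      index is created on it with create_index().
    - With a schema, the collection is created and indexed in one step by
      passing index_params to create_collection().
    """
    if schema is None:
        client.create_index(
            collection_name=collection_name,
            index_params=index_params,
        )
    else:
        client.create_collection(
            collection_name=collection_name,
            schema=schema,
            index_params=index_params,
        )
```

For example, apply_index(client, "your_collection_name", index_params) builds the index on an existing collection, while also passing schema=your_schema creates the collection and the index together.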

Search on index

Once the index is built and entities are inserted, you can perform similarity searches on the index.

from pymilvus import MilvusClient

search_params = {
    "params": {
        "itopk_size": 16, # Determines the size of intermediate results kept during the search
        "search_width": 8, # Specifies the number of entry points into the CAGRA graph during the search
    }
}

# Assumes a Milvus instance reachable at this URI; replace it with your own endpoint
client = MilvusClient(uri="http://localhost:19530")

res = client.search(
    collection_name="your_collection_name", # Collection name
    anns_field="your_vector_field_name", # Vector field name
    data=[[0.1, 0.2, 0.3, 0.4, 0.5]],  # Query vector
    limit=3,  # TopK results to return
    search_params=search_params
)

In this configuration:

params: Additional configuration options for searching on the index. To learn more about the search parameters available for the GPU_CAGRA index, refer to Index-specific search params.

Index params

This section provides an overview of the parameters used for building an index and performing searches on the index.

Index building params

The following table lists the parameters that can be configured in params when building an index.

Parameter	Description	Default Value
`intermediate_graph_degree`	Affects recall and build time by determining the graph’s degree before pruning. Recommended values are `32` or `64`.	`128`
`graph_degree`	Affects search performance and recall by setting the graph’s degree after pruning. A larger difference between `intermediate_graph_degree` and `graph_degree` results in a longer build time. Its value must be smaller than the value of `intermediate_graph_degree`.	`64`
`build_algo`	Selects the graph generation algorithm before pruning. Possible values: `IVF_PQ`: Offers higher quality but slower build time. `NN_DESCENT`: Provides a quicker build with potentially lower recall.	`IVF_PQ`
`cache_dataset_on_device`	Decides whether to cache the original dataset in GPU memory. Possible values: `"true"`: Caches the original dataset to enhance recall by refining search results. `"false"`: Does not cache the original dataset, which saves GPU memory.	`"false"`
`adapt_for_cpu`	Decides whether to use GPU for index-building and CPU for search. Setting this parameter to `"true"` requires the presence of the `ef` parameter in the search requests.	`"false"`
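
The constraints in the table above can be checked before the params are sent to Milvus. The following is a minimal sketch in plain Python: the helper name validate_cagra_build_params is hypothetical and not part of pymilvus, and the defaults and allowed values are taken only from the table above.

```python
# Defaults as listed in the index building params table
CAGRA_BUILD_DEFAULTS = {
    "intermediate_graph_degree": 128,
    "graph_degree": 64,
    "build_algo": "IVF_PQ",
    "cache_dataset_on_device": "false",
    "adapt_for_cpu": "false",
}


def validate_cagra_build_params(params):
    """Fill in documented defaults and check the documented constraints.

    Hypothetical helper: returns a new dict, raises ValueError on a violation.
    """
    unknown = set(params) - set(CAGRA_BUILD_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown build params: {sorted(unknown)}")

    merged = {**CAGRA_BUILD_DEFAULTS, **params}

    # graph_degree must be smaller than intermediate_graph_degree
    if merged["graph_degree"] >= merged["intermediate_graph_degree"]:
        raise ValueError("graph_degree must be smaller than intermediate_graph_degree")

    if merged["build_algo"] not in ("IVF_PQ", "NN_DESCENT"):
        raise ValueError("build_algo must be IVF_PQ or NN_DESCENT")

    # The boolean-like options are passed as the strings "true" / "false"
    for flag in ("cache_dataset_on_device", "adapt_for_cpu"):
        if merged[flag] not in ("true", "false"):
            raise ValueError(f'{flag} must be "true" or "false"')

    return merged
```

The returned dict can be passed as the params argument of add_index(). Checking on the client side is a design choice that surfaces a misconfigured degree pair before an index build is attempted.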

Index-specific search params

The following table lists the parameters that can be configured in search_params.params when searching on the index.

Parameter	Description	Default Value
`itopk_size`	Determines the size of intermediate results kept during the search. A larger value may improve recall at the expense of search performance. It should be at least equal to the final top-k (limit) value and is typically a power of 2 (e.g., 16, 32, 64, 128).	Empty
`search_width`	Specifies the number of entry points into the CAGRA graph during the search. Increasing this value can enhance recall but may impact search performance. Typical values are powers of 2 (e.g., 1, 2, 4, 8, 16, 32).	Empty
`min_iterations` / `max_iterations`	Control the search iteration process. By default, they are set to `0`, and CAGRA automatically determines the number of iterations based on `itopk_size` and `search_width`. Adjusting these values manually can help balance performance and accuracy.	`0`
`team_size`	Specifies the number of CUDA threads used for calculating metric distance on the GPU. Common values are a power of 2 up to 32 (e.g. 2, 4, 8, 16, 32). It has a minor impact on search performance. The default value is `0`, where Milvus automatically selects the `team_size` based on the vector dimension.	`0`
`ef`	Specifies the query time/accuracy trade-off. A higher `ef` value leads to more accurate but slower search. This parameter is mandatory if you set `adapt_for_cpu` to `"true"` when you build the index.	Value range: `[top_k, int_max]`
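
The rules in the table above can be folded into a small builder for search_params. This is a minimal sketch under stated assumptions: the helper name build_cagra_search_params is hypothetical, rounding itopk_size up to a power of 2 follows the "typically a power of 2" guidance rather than a hard requirement, and sending ef alongside the other params when adapt_for_cpu is enabled is an assumption.

```python
def build_cagra_search_params(limit, search_width=None, adapt_for_cpu=False, ef=None):
    """Build a search_params dict for GPU_CAGRA (hypothetical helper).

    limit is the top-k value that will be passed to search().
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")

    # itopk_size should be at least limit; pick the smallest power of 2 that is
    itopk_size = 1
    while itopk_size < limit:
        itopk_size *= 2

    params = {"itopk_size": itopk_size}

    if search_width is not None:
        if search_width < 1:
            raise ValueError("search_width must be at least 1")
        params["search_width"] = search_width

    # ef is mandatory when the index was built with adapt_for_cpu set to "true"
    if adapt_for_cpu:
        if ef is None:
            raise ValueError("ef is required when adapt_for_cpu is enabled")
        if ef < limit:
            raise ValueError("ef must be at least the top-k (limit) value")
        params["ef"] = ef

    return {"params": params}
```

For instance, build_cagra_search_params(limit=3, search_width=8) yields an itopk_size of 4, the smallest power of 2 that covers the requested top-k. A larger itopk_size may still be preferable when recall matters more than latency.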
