Warm Up
Compatible with Milvus 2.6.4+
In Milvus, Warm Up complements Tiered Storage by alleviating first-hit latency that occurs when cold data is accessed for the first time. Once configured, Warm Up preloads selected types of fields or indexes into the cache before a segment becomes queryable, ensuring that frequently accessed data is available immediately after loading.
Why warm up
Lazy Load in Tiered Storage improves efficiency by loading only metadata initially. However, this can cause latency on the first query to cold data, since required chunks or indexes must be fetched from object storage.
Warm Up solves this problem by proactively caching critical data during segment initialization.
It is especially beneficial when:
Certain scalar indexes are frequently used in filter conditions.
Vector indexes are essential for search performance and must be ready immediately.
Cold-start latency after QueryNode restart or new segment load is unacceptable.
In contrast, Warm Up is not recommended for fields or indexes queried infrequently. Disabling Warm Up shortens segment load time and conserves cache space—ideal for large vector fields or non-critical scalar fields.
Configuration
Warm Up is controlled under queryNode.segcore.tieredStorage.warmup in milvus.yaml. You can configure it separately for scalar fields, scalar indexes, vector fields, and vector indexes. Each target supports two modes:
| Mode | Description | Typical scenario |
|---|---|---|
| `sync` | Preload before the segment becomes queryable. Load time increases slightly, but the first query incurs no on-demand loading latency. | Use for performance-critical data that must be immediately available, such as high-frequency scalar indexes or key vector indexes used in search. |
| `disable` | Skip preloading. The segment becomes queryable faster, but the first query may trigger on-demand loading. | Use for infrequently accessed or large data such as raw vector fields or non-critical scalar fields. |
Example YAML:
queryNode:
  segcore:
    tieredStorage:
      warmup:
        # options: sync, disable.
        # Specifies the timing for warming up the Tiered Storage cache.
        # - `sync`: data will be loaded into the cache before a segment is considered loaded.
        # - `disable`: data will not be proactively loaded into the cache, and loaded only if needed by search/query tasks.
        # Defaults to `sync`, except for vector field which defaults to `disable`.
        scalarField: sync
        scalarIndex: sync
        vectorField: disable # cache warmup for vector field raw data is by default disabled.
        vectorIndex: sync
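| Parameter | Values | Description | Recommended use case |
|---|---|---|---|
| `scalarField` | `sync`, `disable` | Controls whether scalar field data is preloaded. | Use `sync` for scalar fields that are accessed frequently; use `disable` for non-critical scalar fields. |
| `scalarIndex` | `sync`, `disable` | Controls whether scalar indexes are preloaded. | Use `sync` for scalar indexes that are frequently used in filter conditions. |
| `vectorField` | `sync`, `disable` | Controls whether vector field data is preloaded. | Generally `disable`, because raw vector data is large and takes up cache space. |
| `vectorIndex` | `sync`, `disable` | Controls whether vector indexes are preloaded. | Use `sync` for vector indexes that are essential for search performance. |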
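As a rough illustration of combining these per-target settings, the fragment below sketches one possible `milvus.yaml` configuration for a hypothetical workload in which scalar and vector indexes serve every search but raw field data is rarely read. The workload is an assumption made for illustration; only the keys and values already described in this section are used.

```yaml
queryNode:
  segcore:
    tieredStorage:
      warmup:
        # Assumed workload: indexes are on the hot path, raw field data is not.
        scalarField: disable  # raw scalar data is loaded on demand
        scalarIndex: sync     # filter indexes are cached before the segment is queryable
        vectorField: disable  # raw vectors are large, so they are left out of the cache
        vectorIndex: sync     # search indexes are cached before the segment is queryable
```

With this setting, the first read of raw scalar or vector data after a segment loads may still trigger on-demand loading.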
Best practices
Warm Up only affects the initial load. If cached data is later evicted, the next query will reload it on demand.
Avoid overusing `sync`. Preloading too many fields increases load time and cache pressure.
Start conservatively: enable Warm Up only for fields and indexes that are frequently accessed.
Monitor query latency and cache metrics, then expand preloading as needed.
For mixed workloads, apply `sync` to performance-sensitive collections and `disable` to capacity-oriented ones.
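For the capacity-oriented end of a mixed workload, a minimal sketch is to disable Warm Up for every target, so that segments become queryable as quickly as possible and nothing is cached until a query needs it. This assumes the fragment is placed in the `milvus.yaml` read by the QueryNodes that serve those collections; how a deployment separates its workloads is not covered in this section.

```yaml
queryNode:
  segcore:
    tieredStorage:
      warmup:
        # Capacity-oriented sketch: skip all preloading and load on demand.
        scalarField: disable
        scalarIndex: disable
        vectorField: disable
        vectorIndex: disable
```

The trade-off is the one described in the mode table: the first query that touches a cold field or index may trigger on-demand loading.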