queryNode-related Configurations Milvus v2.4.x documentation

Configurations related to queryNode, which runs hybrid search across vector and scalar data.

`queryNode.stats.publishInterval`

Description	Default Value
The interval at which the query node publishes node statistics, including segment status, CPU usage, memory usage, and health status. Unit: ms.	1000
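The dotted parameter names on this page correspond to nested keys in the Milvus configuration file. The following is a minimal sketch of that mapping, assuming the usual nested layout of `milvus.yaml`; the helper name `nest` is hypothetical and not part of Milvus.

```python
import json


def nest(flat):
    """Expand dotted keys such as 'queryNode.stats.publishInterval' into nested dicts."""
    root = {}
    for dotted_key, value in flat.items():
        node = root
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # Walk down, creating intermediate levels as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


# Values are the defaults listed on this page.
overrides = nest({
    "queryNode.stats.publishInterval": 1000,  # ms
    "queryNode.segcore.chunkRows": 128,
})
print(json.dumps(overrides, indent=2))
```

The printed structure shows `publishInterval` nested under `queryNode` and `stats`, which is the shape the same setting would take in a YAML file.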

`queryNode.segcore.knowhereThreadPoolNumRatio`

Description	Default Value
The number of threads in Knowhere's thread pool. If disk is enabled, the pool size is multiplied by knowhereThreadPoolNumRatio, which must be in the range [1, 32].	4

`queryNode.segcore.chunkRows`

Description	Default Value
Row count by which Segcore divides a segment into chunks.	128

`queryNode.segcore.interimIndex.enableIndex`

Description	Default Value
Whether to create a temporary index for growing segments and for sealed segments that are not yet indexed, which improves search performance. Milvus eventually seals and indexes all segments, but enabling this option speeds up searches issued immediately after data insertion. Defaults to true, meaning that Milvus creates a temporary index for growing segments and for not-yet-indexed sealed segments when they are searched.	true

`queryNode.segcore.interimIndex.nlist`

Description	Default Value
nlist of the temporary (interim) index. The recommended value is sqrt(chunkRows); it must be smaller than chunkRows/8.	128

`queryNode.segcore.interimIndex.nprobe`

Description	Default Value
nprobe used when searching the interim index. Set it according to your accuracy requirement; it must be smaller than nlist.	16
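The constraints stated for `nlist` and `nprobe` can be checked before a configuration is applied. The sketch below only encodes the rules as written in the descriptions above; the function names and the sample value `chunk_rows=1024` are hypothetical, and rounding the square root down is an assumption.

```python
import math


def recommended_nlist(chunk_rows):
    """Recommended nlist: sqrt(chunkRows), rounded down (rounding is an assumption)."""
    return math.isqrt(chunk_rows)


def check_interim_index(chunk_rows, nlist, nprobe):
    """Return a list of violated constraints; an empty list means the values pass."""
    problems = []
    if not nlist < chunk_rows / 8:
        problems.append(f"nlist={nlist} must be smaller than chunkRows/8={chunk_rows / 8}")
    if not nprobe < nlist:
        problems.append(f"nprobe={nprobe} must be smaller than nlist={nlist}")
    return problems


# Hypothetical values chosen for illustration only.
chunk_rows = 1024
nlist = recommended_nlist(chunk_rows)
print(nlist, check_interim_index(chunk_rows, nlist, nprobe=16))
```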

`queryNode.segcore.interimIndex.memExpansionRate`

Description	Default Value
Extra memory needed to build the interim index, expressed as an expansion rate.	1.15

`queryNode.segcore.interimIndex.buildParallelRate`

Description	Default Value
Parallelism for building the interim index, expressed as a ratio of the number of CPUs.	0.5

`queryNode.segcore.knowhereScoreConsistency`

Description	Default Value
Whether to enable Knowhere's strong-consistency score computation logic.	false

`queryNode.loadMemoryUsageFactor`

Description	Default Value
The multiplication factor applied when calculating memory usage while loading segments.	1

`queryNode.enableDisk`

Description	Default Value
Whether to allow the query node to load disk indexes and search on them.	false

`queryNode.cache.memoryLimit`

Description	Default Value
Memory limit of the cache, in bytes. The default is 2 GB (2 * 1024 * 1024 * 1024).	2147483648

`queryNode.cache.readAheadPolicy`

Description	Default Value
The read ahead policy of chunk cache, options: `normal, random, sequential, willneed, dontneed`	willneed

`queryNode.cache.warmup`

Description	Default Value
Whether and how to warm up the chunk cache. Options: async, sync, disable. 1. If set to "sync" or "async", the original vector data is loaded into the chunk cache synchronously or asynchronously during the load process. This can substantially reduce query/search latency for a period after loading, at the cost of increased disk usage. 2. If set to "disable", the original vector data is loaded into the chunk cache only during search/query.	disable

`queryNode.mmap.mmapEnabled`

Description	Default Value
Enable mmap for loading data	false

`queryNode.mmap.growingMmapEnabled`

Description	Default Value
Whether to enable mmap for the raw data of growing segments.	false

`queryNode.mmap.fixedFileSizeForMmapAlloc`

Description	Default Value
tmp file size for mmap chunk manager	1

`queryNode.mmap.maxDiskUsagePercentageForMmapAlloc`

Description	Default Value
disk percentage used in mmap chunk manager	50

`queryNode.lazyload.enabled`

Description	Default Value
Enable lazyload for loading data	false

`queryNode.lazyload.waitTimeout`

Description	Default Value
Maximum time to wait, in milliseconds, before starting a lazy-load search or retrieve.	30000

`queryNode.lazyload.requestResourceTimeout`

Description	Default Value
Maximum time, in milliseconds, to wait for a resource request during lazy load. Defaults to 5 s.	5000

`queryNode.lazyload.requestResourceRetryInterval`

Description	Default Value
Retry interval, in milliseconds, when waiting for a resource request during lazy load. Defaults to 2 s.	2000

`queryNode.lazyload.maxRetryTimes`

Description	Default Value
max retry times for lazy load, 1 by default	1

`queryNode.lazyload.maxEvictPerRetry`

Description	Default Value
max evict count for lazy load, 1 by default	1

`queryNode.scheduler.maxReadConcurrentRatio`

Description	Default Value
The concurrency ratio for read tasks (search and query tasks). The maximum read concurrency is hardware.GetCPUNum * maxReadConcurrentRatio; for example, a ratio of 2.0 yields a maximum read concurrency of hardware.GetCPUNum * 2. The maximum read concurrency must be greater than or equal to 1 and less than or equal to hardware.GetCPUNum * 100. Valid range for the ratio: (0, 100].	1
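As a worked example of the formula above, the following sketch computes the maximum read concurrency for a given CPU count and ratio. It is an illustration rather than the Milvus implementation: the function name is hypothetical, and truncating to an integer and clamping to the stated bounds are assumptions about how the limits are enforced.

```python
def max_read_concurrency(cpu_num, ratio):
    """Compute cpu_num * ratio, kept within [1, cpu_num * 100]."""
    if not 0 < ratio <= 100:
        raise ValueError("maxReadConcurrentRatio must be in (0, 100]")
    concurrency = int(cpu_num * ratio)  # assumption: fractional results are truncated
    return max(1, min(concurrency, cpu_num * 100))


print(max_read_concurrency(8, 1.0))  # ratio of 1 on an 8-CPU host
print(max_read_concurrency(8, 2.0))  # ratio of 2.0 doubles the limit
print(max_read_concurrency(4, 0.1))  # small ratios are raised to the minimum of 1
```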

`queryNode.scheduler.cpuRatio`

Description	Default Value
ratio used to estimate read task cpu usage.	10

`queryNode.scheduler.scheduleReadPolicy.name`

Description	Default Value
fifo: Tasks are scheduled from a FIFO queue. user-task-polling: Each user's tasks are polled in turn and scheduled, so scheduling is fair at task granularity. The policy is keyed on the username used for authentication, and an empty username is treated as a single user. When there is only one user, the policy degenerates into FIFO.	fifo
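The difference between the two policies can be illustrated with a small round-robin queue. This is a simplified sketch of the idea behind user-task-polling, not the scheduler used by Milvus; the class name and its methods are hypothetical, and empty per-user queues are dropped immediately here instead of being retained for a period.

```python
from collections import OrderedDict, deque


class UserTaskPollingQueue:
    """Round-robin over per-user FIFO queues, one task per user per turn."""

    def __init__(self):
        self._queues = OrderedDict()  # username -> deque of pending tasks

    def push(self, username, task):
        # An empty username is simply another key, so all anonymous tasks share one queue.
        self._queues.setdefault(username, deque()).append(task)

    def pop(self):
        if not self._queues:
            return None
        username, queue = next(iter(self._queues.items()))
        task = queue.popleft()
        del self._queues[username]
        if queue:
            # Move the user to the back so the next user is polled first.
            self._queues[username] = queue
        return task


q = UserTaskPollingQueue()
for task in ("a1", "a2", "a3"):
    q.push("alice", task)
q.push("bob", "b1")
order = [q.pop() for _ in range(4)]
print(order)
```

With a single user, the same structure returns tasks in insertion order, which matches the statement that the policy reduces to FIFO when there are no multiple users.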

`queryNode.scheduler.scheduleReadPolicy.taskQueueExpire`

Description	Default Value
How long, in seconds, a queue is retained after it becomes empty.	60

`queryNode.scheduler.scheduleReadPolicy.enableCrossUserGrouping`

Description	Default Value
Whether to enable cross-user grouping when using the user-task-polling policy. Disable it if tasks from different users cannot be merged with each other.	false

`queryNode.scheduler.scheduleReadPolicy.maxPendingTaskPerUser`

Description	Default Value
Max pending task per user in scheduler	1024

`queryNode.dataSync.flowGraph.maxQueueLength`

Description	Default Value
The maximum size of task queue cache in flow graph in query node.	16

`queryNode.dataSync.flowGraph.maxParallelism`

Description	Default Value
Maximum number of tasks executed in parallel in the flowgraph	1024

`queryNode.enableSegmentPrune`

Description	Default Value
use partition stats to prune data in search/query on shard delegator	false

`queryNode.bloomFilterApplyParallelFactor`

Description	Default Value
Parallel factor used when applying primary keys to the bloom filter. Defaults to 4, that is, 4 * CPU_CORE_NUM.	4

`queryNode.queryStreamBatchSize`

Description	Default Value
Batch size of the results returned by a stream query.	4194304

`queryNode.workerPooling.size`

Description	Default Value
The size of the worker query node client pool.	10

`queryNode.ip`

Description	Default Value
TCP/IP address of queryNode. If not specified, the first unicastable address is used.

`queryNode.port`

Description	Default Value
TCP port of queryNode	21123

`queryNode.grpc.serverMaxSendSize`

Description	Default Value
The maximum size of each RPC request that the queryNode can send, unit: byte	536870912

`queryNode.grpc.serverMaxRecvSize`

Description	Default Value
The maximum size of each RPC request that the queryNode can receive, unit: byte	268435456

`queryNode.grpc.clientMaxSendSize`

Description	Default Value
The maximum size of each RPC request that the clients on queryNode can send, unit: byte	268435456

`queryNode.grpc.clientMaxRecvSize`

Description	Default Value
The maximum size of each RPC request that the clients on queryNode can receive, unit: byte	536870912
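The gRPC limits above are given in bytes, which can be hard to read at a glance. The short sketch below converts the four defaults listed on this page to MiB; the dictionary is only an illustration built from those values.

```python
MIB = 1024 * 1024

# Default values copied from the tables above (unit: byte).
grpc_limits = {
    "queryNode.grpc.serverMaxSendSize": 536870912,
    "queryNode.grpc.serverMaxRecvSize": 268435456,
    "queryNode.grpc.clientMaxSendSize": 268435456,
    "queryNode.grpc.clientMaxRecvSize": 536870912,
}

for key, size in grpc_limits.items():
    print(f"{key}: {size // MIB} MiB")
```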
