This topic introduces the configuration items related to quotas and limits in Milvus.
Some of these configuration items are used to set thresholds for Milvus to proactively throttle DDL/DML/DQL requests related to collections, partitions, indexes, etc.
Some of them are used to set backpressure signals that force Milvus to lower the rate of DDL/DML/DQL requests.
quotaAndLimits.limits.maxCollectionNumPerDB
Description
Default Value
Maximum number of collections per database.
64
quotaAndLimits.ddl.enabled
Description
Default Value
Whether DDL request throttling is enabled.
False
quotaAndLimits.ddl.collectionRate
Description
Default Value
Maximum number of collection-related DDL requests per second.
Setting this item to 10 indicates that Milvus processes no more than 10 collection-related DDL requests per second, including collection creation requests, collection drop requests, collection load requests, and collection release requests.
To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
∞
quotaAndLimits.ddl.partitionRate
Description
Default Value
Maximum number of partition-related DDL requests per second.
Setting this item to 10 indicates that Milvus processes no more than 10 partition-related DDL requests per second, including partition creation requests, partition drop requests, partition load requests, and partition release requests.
To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
∞
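Each item name on this page is a dotted path. The following is a minimal sketch, assuming the configuration file nests one level per path segment (an assumption about the file layout, not something this page states), of how the DDL items expand into a nested mapping. The helper name `nest` is hypothetical and is not part of Milvus.

```python
from typing import Any, Dict


def nest(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Expand dotted item names into a nested mapping (illustrative helper)."""
    nested: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        # Walk (or create) one mapping level per path segment.
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Enable DDL throttling and cap both DDL rates at 10 requests per second.
ddl_settings = nest({
    "quotaAndLimits.ddl.enabled": True,
    "quotaAndLimits.ddl.collectionRate": 10,
    "quotaAndLimits.ddl.partitionRate": 10,
})
```

Note that the rate items sit next to `enabled` under the same `ddl` parent, which is why the rate settings only take effect when that switch is on.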
quotaAndLimits.indexRate.enabled
Description
Default Value
Whether index-related request throttling is enabled.
False
quotaAndLimits.indexRate.max
Description
Default Value
Maximum number of index-related requests per second.
Setting this item to 10 indicates that Milvus processes no more than 10 index-related requests per second, including index creation requests and index drop requests.
To use this setting, set quotaAndLimits.indexRate.enabled to true at the same time.
∞
quotaAndLimits.flushRate.enabled
Description
Default Value
Whether flush request throttling is enabled.
False
quotaAndLimits.flushRate.max
Description
Default Value
Maximum number of flush requests per second.
Setting this item to 10 indicates that Milvus processes no more than 10 flush requests per second.
To use this setting, set quotaAndLimits.flushRate.enabled to true at the same time.
∞
quotaAndLimits.compaction.enabled
Description
Default Value
Whether manual-compaction request throttling is enabled.
False
quotaAndLimits.compaction.max
Description
Default Value
Maximum number of manual-compaction requests per second.
Setting this item to 10 indicates that Milvus processes no more than 10 manual-compaction requests per second.
To use this setting, set quotaAndLimits.compaction.enabled to true at the same time.
∞
quotaAndLimits.dml.enabled
Description
Default Value
Whether DML request throttling is enabled.
False
quotaAndLimits.dml.insertRate.max
Description
Default Value
Highest data insertion rate per second.
Setting this item to 5 indicates that Milvus only allows data insertion at the rate of 5 MB/s.
To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
∞
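To make the MB/s semantics concrete, here is a minimal sketch of a generic token bucket measured in megabytes. It is an illustration of rate-based throttling in general, not Milvus's actual limiter; the class name `ByteRateLimiter`, the one-second burst capacity, and the explicit `now` argument are all assumptions made for the example.

```python
class ByteRateLimiter:
    """Generic token bucket counted in megabytes (illustrative only)."""

    def __init__(self, rate_mb_per_s: float, now: float = 0.0) -> None:
        self.rate = rate_mb_per_s
        self.capacity = rate_mb_per_s  # assumption: at most one second of burst
        self.tokens = rate_mb_per_s    # start with a full bucket
        self.last = now

    def allow(self, size_mb: float, now: float) -> bool:
        """Return True if a write of size_mb is admitted at time `now` (seconds)."""
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size_mb <= self.tokens:
            self.tokens -= size_mb
            return True
        return False


# A limit of 5 corresponds to 5 MB of inserts admitted per second.
limiter = ByteRateLimiter(rate_mb_per_s=5.0)
```

With this sketch, a 5 MB insert at time 0 is admitted, an immediate follow-up insert is rejected, and capacity returns gradually as time passes.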
quotaAndLimits.dml.insertRate.collection.max
Description
Default Value
Highest data insertion rate per collection per second.
Setting this item to 5 indicates that Milvus only allows data insertion to any collection at the rate of 5 MB/s.
To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
∞
quotaAndLimits.dml.deleteRate.max
Description
Default Value
Highest data deletion rate per second.
Setting this item to 0.1 indicates that Milvus only allows data deletion at the rate of 0.1 MB/s.
To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
∞
quotaAndLimits.dml.deleteRate.collection.max
Description
Default Value
Highest data deletion rate per collection per second.
Setting this item to 0.1 indicates that Milvus only allows data deletion from any collection at the rate of 0.1 MB/s.
To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
∞
quotaAndLimits.dql.enabled
Description
Default Value
Whether DQL request throttling is enabled.
False
quotaAndLimits.dql.searchRate.max
Description
Default Value
Maximum number of vectors to search per second.
Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second no matter whether these 100 vectors are all in one search or scattered across multiple searches.
To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
∞
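The search limit counts query vectors rather than requests. A minimal sketch of that accounting follows, assuming each search request is represented only by the number of vectors it carries; the function names are hypothetical and the check is a simplification of per-second enforcement.

```python
from typing import Iterable


def vectors_consumed(vectors_per_request: Iterable[int]) -> int:
    """Total query vectors across all search requests in a one-second window."""
    return sum(vectors_per_request)


def within_search_limit(vectors_per_request: Iterable[int], max_vectors: int) -> bool:
    """True if the window's total vector count does not exceed the limit."""
    return vectors_consumed(vectors_per_request) <= max_vectors


# One search with 100 vectors costs the same as 100 searches with one vector each.
one_large_search = [100]
many_small_searches = [1] * 100
```

Under a limit of 100, both workloads above use the whole budget, while two searches carrying 60 and 50 vectors in the same second would exceed it.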
quotaAndLimits.dql.searchRate.collection.max
Description
Default Value
Maximum number of vectors to search per collection per second.
Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second per collection no matter whether these 100 vectors are all in one search or scattered across multiple searches.
To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
∞
quotaAndLimits.dql.queryRate.max
Description
Default Value
Maximum number of queries per second.
Setting this item to 100 indicates that Milvus only allows 100 queries per second.
To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
∞
quotaAndLimits.dql.queryRate.collection.max
Description
Default Value
Maximum number of queries per collection per second.
Setting this item to 100 indicates that Milvus only allows 100 queries per collection per second.
To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
∞
quotaAndLimits.limitWriting.ttProtection.enabled
Description
Default Value
Whether the backpressure based on time tick delay is enabled.
Maximum time tick delay. A time tick delay is the difference between RootCoord TSO and the minimum time tick of all flow graphs on DataNodes and QueryNodes.
Setting this item to 300 indicates that Milvus reduces the DML request rate as the delay increases and drops all DML requests once the delay reaches 300 seconds.
To use this setting, set quotaAndLimits.limitWriting.ttProtection.enabled to true at the same time.
300
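The time tick backpressure can be pictured as a scaling factor applied to the allowed DML rate. The sketch below assumes the reduction is linear in the delay, which this page does not state; it only states that the rate falls as the delay grows and reaches zero at the maximum. The function name `dml_rate_factor` and its parameters are hypothetical.

```python
def dml_rate_factor(delay_s: float, max_delay_s: float) -> float:
    """Fraction of the normal DML rate still allowed at a given time tick delay.

    Assumes a linear ramp: 1.0 with no delay, 0.0 at or beyond max_delay_s.
    """
    if max_delay_s <= 0:
        raise ValueError("max_delay_s must be positive")
    remaining = 1.0 - delay_s / max_delay_s
    # Clamp so negative delays do not boost the rate and large delays stop writes.
    return min(1.0, max(0.0, remaining))
```

With a maximum of 300 seconds, this sketch halves the allowed rate at a delay of 150 seconds and rejects all DML requests at 300 seconds or more.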
quotaAndLimits.limitWriting.memProtection.enabled
Description
Default Value
Whether the backpressure based on memory water level is enabled.
Maximum number of search vectors or queries. Note that a search request containing multiple search vectors is regarded as multiple searches, while a query is the same as a search request containing only one search vector.
Setting this item to 10000 indicates that Milvus reduces the DQL request rate once the number of searches and queries reaches 10000, and the backpressure is resolved when the number decreases below the set value. The reduction rate is determined by quotaAndLimits.limitReading.coolOffSpeed.
To use this setting, set quotaAndLimits.limitReading.queueProtection.enabled to true at the same time.
Average latency of the queued searches and queries. Note that a search request containing multiple search vectors is regarded as multiple searches, while a query is the same as a search request containing only one search vector.
Setting this item to 200 indicates that Milvus reduces the DQL request rate once the average latency reaches 200 milliseconds, and the backpressure is resolved when the latency decreases below the set value. The reduction rate is determined by quotaAndLimits.limitReading.coolOffSpeed.
To use this setting, set quotaAndLimits.limitReading.queueProtection.enabled to true at the same time.
Setting this item to 2 indicates that Milvus reduces the DQL request rate once the data rate reaches 2 MB/s, and the backpressure is resolved when the data rate decreases below the set value. The reduction rate is determined by quotaAndLimits.limitReading.coolOffSpeed.
To use this setting, set quotaAndLimits.limitReading.resultProtection.enabled to true at the same time.
∞
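The read-side protections all defer to quotaAndLimits.limitReading.coolOffSpeed for how fast the DQL rate is reduced. A minimal sketch follows, assuming the value is a multiplicative factor in (0, 1] applied once per evaluation while a threshold is exceeded. That interpretation is an assumption for illustration, and the function names are hypothetical.

```python
from typing import List


def cool_off(current_rate: float, cool_off_speed: float) -> float:
    """Apply one reduction step to the allowed DQL rate."""
    if not 0.0 < cool_off_speed <= 1.0:
        raise ValueError("cool_off_speed is assumed to lie in (0, 1]")
    return current_rate * cool_off_speed


def throttled_rates(initial_rate: float, cool_off_speed: float, steps: int) -> List[float]:
    """Allowed rate after each consecutive evaluation in which a threshold is exceeded."""
    rates: List[float] = []
    rate = initial_rate
    for _ in range(steps):
        rate = cool_off(rate, cool_off_speed)
        rates.append(rate)
    return rates
```

Under this reading, a smaller value throttles reads more aggressively, and a value of 1 leaves the rate unchanged.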
quotaAndLimits.limitWriting.forceDeny
Description
Default Value
Whether to manually configure Milvus to drop all DML requests.