Quota- and Limit-related configurations

This topic introduces the configuration items related to quotas and limits in Milvus.

Some of these configuration items are used to set thresholds for Milvus to proactively throttle DDL/DML/DQL requests related to collections, partitions, indexes, etc.

Some of them are used to set backpressure signals that force Milvus to lower the rate of DDL/DML/DQL requests.

quotaAndLimits.ddl.enabled

Description Default Value
Whether DDL request throttling is enabled. False

quotaAndLimits.ddl.collectionRate

Description Default Value
  • Maximum number of collection-related DDL requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 collection-related DDL requests per second, including collection creation requests, collection drop requests, collection load requests, and collection release requests.
  • To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
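
The per-second cap described above can be pictured as a token bucket. The following is a minimal Python sketch, not the Milvus implementation: `RateLimiter` and its parameters are hypothetical names, and the choice of a token bucket is an assumption made for illustration.

```python
class RateLimiter:
    """Token bucket that admits at most `rate` requests per second (illustrative only)."""

    def __init__(self, rate, enabled=True):
        self.rate = rate
        self.enabled = enabled          # mirrors the role of quotaAndLimits.ddl.enabled
        self.tokens = float(rate)       # start with a full bucket
        self.last = 0.0                 # time of the previous refill, in seconds

    def allow(self, now):
        if not self.enabled:
            return True                 # throttling is off, every request passes
        # Refill in proportion to elapsed time, capped at one second's worth.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


limiter = RateLimiter(rate=10)
# 15 collection-related requests arrive at the same instant; only 10 pass.
admitted = sum(limiter.allow(now=0.0) for _ in range(15))
print(admitted)
```

With the flag off (`enabled=False`), the same limiter admits every request, which matches the note that the rate only takes effect when throttling is enabled.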

quotaAndLimits.ddl.partitionRate

Description Default Value
  • Maximum number of partition-related DDL requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 partition-related requests per second, including partition creation requests, partition drop requests, partition load requests, and partition release requests.
  • To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
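
The dotted item names on this page can be read as paths into a nested configuration. The sketch below expands them into nested dictionaries, assuming the settings are ultimately stored as nested keys (for example in a YAML file). `nest_settings` is a hypothetical helper, and the values are illustrative rather than recommended.

```python
def nest_settings(flat):
    """Expand dotted configuration keys into nested dictionaries."""
    root = {}
    for dotted_key, value in flat.items():
        node = root
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


# Illustrative values only: enable DDL throttling and cap both rates at 10 per second.
settings = nest_settings({
    "quotaAndLimits.ddl.enabled": True,
    "quotaAndLimits.ddl.collectionRate": 10,
    "quotaAndLimits.ddl.partitionRate": 10,
})
print(settings)
```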

quotaAndLimits.indexRate.enabled

Description Default Value
Whether index-related request throttling is enabled. False

quotaAndLimits.indexRate.max

Description Default Value
  • Maximum number of index-related requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 index-related requests per second, including index creation requests and index drop requests.
  • To use this setting, set quotaAndLimits.indexRate.enabled to true at the same time.

quotaAndLimits.flushRate.enabled

Description Default Value
Whether flush request throttling is enabled. False

quotaAndLimits.flushRate.max

Description Default Value
  • Maximum number of flush requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 flush requests per second.
  • To use this setting, set quotaAndLimits.flushRate.enabled to true at the same time.

quotaAndLimits.compaction.enabled

Description Default Value
Whether manual-compaction request throttling is enabled. False

quotaAndLimits.compaction.max

Description Default Value
  • Maximum number of manual-compaction requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 manual-compaction requests per second.
  • To use this setting, set quotaAndLimits.compaction.enabled to true at the same time.

quotaAndLimits.dml.enabled

Description Default Value
Whether DML request throttling is enabled. False

quotaAndLimits.dml.insertRate.max

Description Default Value
  • Highest data insertion rate per second.
  • Setting this item to 5 indicates that Milvus only allows data insertion at the rate of 5 MB/s.
  • To use this setting, set quotaAndLimits.dml.enabled to true at the same time.

quotaAndLimits.dml.deleteRate.max

Description Default Value
  • Highest data deletion rate per second.
  • Setting this item to 0.1 indicates that Milvus only allows data deletion at the rate of 0.1 MB/s.
  • To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
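
Unlike the DDL limits, the insert and delete limits are expressed in MB/s rather than requests per second. A minimal sketch of a byte-based cap follows. It assumes a fixed one-second window and 1 MB = 1024 × 1024 bytes; `ByteRateLimiter` is a hypothetical name, and none of this is taken from the Milvus source.

```python
MB = 1024 * 1024  # assumption: binary megabytes


class ByteRateLimiter:
    """Fixed one-second window that admits at most `max_mb_per_s` MB of payload."""

    def __init__(self, max_mb_per_s):
        self.budget = max_mb_per_s * MB
        self.window = None   # index of the current one-second window
        self.used = 0        # bytes admitted in the current window

    def allow(self, payload_bytes, now):
        window = int(now)
        if window != self.window:
            # A new second has started, so the budget resets.
            self.window, self.used = window, 0
        if self.used + payload_bytes > self.budget:
            return False
        self.used += payload_bytes
        return True


inserts = ByteRateLimiter(max_mb_per_s=5)    # as if insertRate.max were 5
deletes = ByteRateLimiter(max_mb_per_s=0.1)  # as if deleteRate.max were 0.1

print(inserts.allow(4 * MB, now=0.2))  # fits in the 5 MB budget
print(inserts.allow(2 * MB, now=0.5))  # would exceed 5 MB within the same second
print(inserts.allow(2 * MB, now=1.1))  # next second, budget is fresh
```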

quotaAndLimits.dql.enabled

Description Default Value
Whether DQL request throttling is enabled. False

quotaAndLimits.dql.searchRate.max

Description Default Value
  • Maximum number of vectors to search per second.
  • Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second no matter whether these 100 vectors are all in one search or scattered across multiple searches.
  • To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
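
Because the search limit counts vectors rather than requests, a single search carrying many vectors consumes as much budget as many single-vector searches. The sketch below illustrates that accounting for searches arriving within the same second. `admit_searches` is a hypothetical function, and rejecting a search that does not fit in the remaining budget is an assumption.

```python
def admit_searches(vectors_per_search, max_vectors_per_second):
    """Decide, in arrival order, which searches in one second fit the vector budget."""
    remaining = max_vectors_per_second
    decisions = []
    for nq in vectors_per_search:
        if nq <= remaining:
            remaining -= nq          # each search costs as many units as it has vectors
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


# One search with 100 vectors uses the whole budget of 100.
print(admit_searches([100, 1], max_vectors_per_second=100))
# The same budget spread across several smaller searches.
print(admit_searches([60, 30, 20, 10], max_vectors_per_second=100))
```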

quotaAndLimits.dql.queryRate.max

Description Default Value
  • Maximum number of queries per second.
  • Setting this item to 100 indicates that Milvus only allows 100 queries per second.
  • To use this setting, set quotaAndLimits.dql.enabled to true at the same time.

quotaAndLimits.limitWriting.ttProtection.enabled

Description Default Value
Whether the backpressure based on time tick delay is enabled. False

quotaAndLimits.limitWriting.ttProtection.maxTimeTickDelay

Description Default Value
  • Maximum time tick delay. A time tick delay is the difference between RootCoord TSO and the minimum time tick of all flow graphs on DataNodes and QueryNodes.
  • Setting this item to 300 indicates that Milvus reduces the DML request rate as the delay increases and drops all DML requests once the delay reaches the set maximum in milliseconds.
  • To use this setting, set quotaAndLimits.limitWriting.ttProtection.enabled to true at the same time.
  • Default value: 300
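
The time tick backpressure can be sketched as a factor applied to the normal DML rate. The text only says that the rate falls as the delay grows and reaches zero at the maximum, so the linear ramp below is an assumption, and `dml_rate_factor` is a hypothetical name.

```python
def dml_rate_factor(tt_delay, max_tt_delay=300):
    """Fraction of the normal DML rate to allow for a given time tick delay.

    Both arguments use the same unit as maxTimeTickDelay.
    The linear shape is an assumption made for illustration.
    """
    if tt_delay >= max_tt_delay:
        return 0.0                      # delay at or past the maximum: drop all DML
    if tt_delay <= 0:
        return 1.0                      # no delay: no throttling
    return 1.0 - tt_delay / max_tt_delay


for delay in (0, 150, 300):
    print(delay, dml_rate_factor(delay))
```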

quotaAndLimits.limitWriting.memProtection.enabled

Description Default Value
Whether the backpressure based on memory water level is enabled. False

quotaAndLimits.limitWriting.memProtection.dataNodeMemoryLowWaterLevel

Description Default Value
  • Low memory water level on DataNodes. The memory water level is the ratio between the used memory and total memory on DataNodes.
  • Setting this item to 0.85 indicates that Milvus reduces the DML request rate as the memory water level on DataNodes reaches the set value.
  • To use this setting, set quotaAndLimits.limitWriting.memProtection.enabled to true at the same time.
  • Default value: 0.85

quotaAndLimits.limitWriting.memProtection.queryNodeMemoryLowWaterLevel

Description Default Value
  • Low memory water level on QueryNodes. The memory water level is the ratio between the used memory and total memory on QueryNodes.
  • Setting this item to 0.85 indicates that Milvus reduces the DML request rate as the memory water level on QueryNodes reaches the set value.
  • To use this setting, set quotaAndLimits.limitWriting.memProtection.enabled to true at the same time.
  • Default value: 0.85

quotaAndLimits.limitWriting.memProtection.dataNodeMemoryHighWaterLevel

Description Default Value
  • High memory water level on DataNodes. The memory water level is the ratio between the used memory and total memory on DataNodes.
  • Setting this item to 0.95 indicates that Milvus drops all DML requests as the memory water level on DataNodes reaches the set value.
  • To use this setting, set quotaAndLimits.limitWriting.memProtection.enabled to true at the same time.
  • Default value: 0.95

quotaAndLimits.limitWriting.memProtection.queryNodeMemoryHighWaterLevel

Description Default Value
  • High memory water level on QueryNodes. The memory water level is the ratio between the used memory and total memory on QueryNodes.
  • Setting this item to 0.95 indicates that Milvus drops all DML requests as the memory water level on QueryNodes reaches the set value.
  • To use this setting, set quotaAndLimits.limitWriting.memProtection.enabled to true at the same time.
  • Default value: 0.95
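
The low and high water levels together define a band: below the low level writes are unaffected, at or above the high level all DML requests are dropped, and in between the rate is reduced. The sketch below assumes a linear reduction inside the band, which the text does not specify; `memory_rate_factor` is a hypothetical name.

```python
def memory_rate_factor(used, total, low=0.85, high=0.95):
    """Fraction of the normal DML rate to allow at a given memory water level.

    The water level is used / total. The linear ramp between `low` and
    `high` is an assumption made for illustration.
    """
    level = used / total
    if level < low:
        return 1.0                          # below the low water level: no throttling
    if level >= high:
        return 0.0                          # at or above the high water level: drop all DML
    return (high - level) / (high - low)    # inside the band: reduce the rate


print(memory_rate_factor(used=50, total=100))
print(memory_rate_factor(used=90, total=100))
print(memory_rate_factor(used=96, total=100))
```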

quotaAndLimits.limitWriting.diskProtection.enabled

Description Default Value
Whether the backpressure based on disk quota is enabled. False

quotaAndLimits.limitWriting.diskProtection.diskQuota

Description Default Value
  • Disk quota allocated to binlog.
  • Setting this item to 8192 indicates that Milvus drops all DML requests as the size of binlog reaches the set value.
  • To use this setting, set quotaAndLimits.limitWriting.diskProtection.enabled to true at the same time.

quotaAndLimits.limitWriting.forceDeny

Description Default Value
Whether to manually configure Milvus to drop all DML requests. False

quotaAndLimits.limitReading.queueProtection.enabled

Description Default Value
Whether the backpressure based on the lengths of the search and query queue is enabled. False

quotaAndLimits.limitReading.queueProtection.nqInQueueThreshold

Description Default Value
  • Maximum number of queued search vectors or queries. Note that a search request containing multiple search vectors is regarded as multiple searches, while a query is the same as a search request containing only one search vector.
  • Setting this item to 10000 indicates that Milvus reduces the DQL request rate once the number of queued searches and queries reaches 10000, and the backpressure is resolved when the number decreases below the set value. The reduction rate is determined by quotaAndLimits.limitReading.coolOffSpeed.
  • To use this setting, set quotaAndLimits.limitReading.queueProtection.enabled to true at the same time.

quotaAndLimits.limitReading.queueProtection.queueLatencyThreshold

Description Default Value
  • Average latency of the queued searches and queries. Note that a search request containing multiple search vectors is regarded as multiple searches, while a query is the same as a search request containing only one search vector.
  • Setting this item to 200 indicates that Milvus reduces the DQL request rate once the average latency reaches the set maximum in milliseconds, and the backpressure is resolved when the latency decreases below the set value. The reduction rate is determined by quotaAndLimits.limitReading.coolOffSpeed.
  • To use this setting, set quotaAndLimits.limitReading.queueProtection.enabled to true at the same time.

quotaAndLimits.limitReading.resultProtection.enabled

Description Default Value
Whether the backpressure based on the rate of the query results is enabled. False

quotaAndLimits.limitReading.resultProtection.maxReadResultRate

Description Default Value
  • Rate of the data returned to the client.
  • Setting this item to 2 indicates that Milvus reduces the DQL request rate once the data rate reaches the set maximum in MB/s, and the backpressure is resolved when the data rate decreases below the set value. The reduction rate is determined by quotaAndLimits.limitReading.coolOffSpeed.
  • To use this setting, set quotaAndLimits.limitReading.resultProtection.enabled to true at the same time.
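
The read-side protections share one pattern: while a monitored metric (queue length, queue latency, or result rate) is at or above its threshold, the DQL rate is reduced, and once the metric falls below the threshold the backpressure is lifted. The sketch below assumes that quotaAndLimits.limitReading.coolOffSpeed acts as a multiplier applied on each evaluation. That reading is an assumption, as are the function name `next_dql_rate` and the value 0.5 used in the example.

```python
def next_dql_rate(current_rate, normal_rate, metric, threshold, cool_off_speed):
    """Return the DQL rate to allow after one evaluation of a protection rule.

    Assumption: the rate is multiplied by `cool_off_speed` while the metric
    is at or above the threshold, and restored once it drops below.
    """
    if metric >= threshold:
        return current_rate * cool_off_speed
    return normal_rate


rate = 100.0
# Average queue latency of 250 ms against a threshold of 200 ms: rate is reduced.
rate = next_dql_rate(rate, normal_rate=100.0, metric=250, threshold=200, cool_off_speed=0.5)
print(rate)
# Latency recovers to 150 ms: backpressure is resolved.
rate = next_dql_rate(rate, normal_rate=100.0, metric=150, threshold=200, cool_off_speed=0.5)
print(rate)
```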

quotaAndLimits.limitReading.forceDeny

Description Default Value
Whether to manually configure Milvus to drop all DQL requests. False