quotaAndLimits-related Configurations

QuotaConfig contains the configurations of Milvus quotas and limits.

By default, the following protections are enabled:

  1. Time tick (TT) protection;

  2. Memory protection;

  3. Disk quota protection.

You can also enable:

  1. DML throughput limitation;

  2. DDL and DQL QPS/RPS limitation;

  3. DQL queue length/latency protection;

  4. DQL result rate protection.

If necessary, you can also manually force Milvus to deny read/write (RW) requests.

quotaAndLimits.enabled

Description: `true` to enable quota and limits, `false` to disable.
Default Value: true

quotaAndLimits.quotaCenterCollectInterval

Description: quotaCenterCollectInterval is the time interval at which quotaCenter collects metrics from Proxies, the Query cluster, and the Data cluster. In seconds, (0 ~ 65536).
Default Value: 3
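
As a rough sketch, the two top-level settings above might appear as follows in a Milvus configuration file. The file name (milvus.yaml) and the mapping of each dot in a key name to one level of YAML nesting are assumptions, not something this page states.

```yaml
# Assumed layout: each dot in a key name is one level of nesting.
quotaAndLimits:
  enabled: true                   # documented default
  quotaCenterCollectInterval: 3   # seconds, documented default
```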

quotaAndLimits.limits.allocRetryTimes

Description: Number of retries when a delete request fails to allocate forward data from the rate limiter.
Default Value: 15

quotaAndLimits.limits.allocWaitInterval

Description: Wait duration between retries when a delete request fails to allocate forward data from the rate limiter, in milliseconds.
Default Value: 1000

quotaAndLimits.limits.complexDeleteLimitEnable

Description: Whether complex deletes check forward data against the rate limiter.
Default Value: false

quotaAndLimits.limits.maxCollectionNumPerDB

Description: Maximum number of collections per database.
Default Value: 65536

quotaAndLimits.limits.maxInsertSize

Description: Maximum size of a single insert request, in bytes. -1 means no limit.
Default Value: -1

quotaAndLimits.limits.maxResourceGroupNumOfQueryNode

Description: Maximum number of resource groups of query nodes.
Default Value: 1024
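
The hard limits above could be tightened along these lines. This is a sketch only: the nested YAML layout is assumed from the dotted key names, and the non-default values are hypothetical examples.

```yaml
quotaAndLimits:
  limits:
    maxCollectionNumPerDB: 1000            # hypothetical value; documented default is 65536
    maxInsertSize: 16777216                # hypothetical value: 16 * 1024 * 1024 bytes; -1 means no limit
    maxResourceGroupNumOfQueryNode: 1024   # documented default
```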

quotaAndLimits.ddl.enabled

Description: Whether DDL request throttling is enabled.
Default Value: false

quotaAndLimits.ddl.collectionRate

Description:
  • Maximum number of collection-related DDL requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 collection-related DDL requests per second, including collection creation requests, collection drop requests, collection load requests, and collection release requests.
  • To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
Default Value: -1

quotaAndLimits.ddl.partitionRate

Description:
  • Maximum number of partition-related DDL requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 partition-related requests per second, including partition creation requests, partition drop requests, partition load requests, and partition release requests.
  • To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
Default Value: -1

quotaAndLimits.ddl.db.collectionRate

Description: Database-level rate limit, in qps, for CreateCollection, DropCollection, LoadCollection, and ReleaseCollection requests. Default: no limit.
Default Value: -1

quotaAndLimits.ddl.db.partitionRate

Description: Database-level rate limit, in qps, for CreatePartition, DropPartition, LoadPartition, and ReleasePartition requests. Default: no limit.
Default Value: -1
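
To illustrate how the DDL settings combine, the fragment below turns throttling on and uses the value 10 from the descriptions above. The nested YAML layout is assumed from the dotted key names, and the database-level values are hypothetical.

```yaml
quotaAndLimits:
  ddl:
    enabled: true         # the descriptions above require this for collectionRate and partitionRate
    collectionRate: 10    # at most 10 collection-related DDL requests per second
    partitionRate: 10     # at most 10 partition-related DDL requests per second
    db:
      collectionRate: 5   # hypothetical database-level limit, in qps
      partitionRate: 5    # hypothetical database-level limit, in qps
```

This page does not say whether the database-level rates also depend on quotaAndLimits.ddl.enabled, so treat that as something to verify for your version.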

quotaAndLimits.indexRate.enabled

Description: Whether index-related request throttling is enabled.
Default Value: false

quotaAndLimits.indexRate.max

Description:
  • Maximum number of index-related requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 index-related requests per second, including index creation requests and index drop requests.
  • To use this setting, set quotaAndLimits.indexRate.enabled to true at the same time.
Default Value: -1

quotaAndLimits.indexRate.db.max

Description: Database-level rate limit, in qps, for CreateIndex and DropIndex requests. Default: no limit.
Default Value: -1

quotaAndLimits.flushRate.enabled

Description: Whether flush request throttling is enabled.
Default Value: true

quotaAndLimits.flushRate.max

Description:
  • Maximum number of flush requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 flush requests per second.
  • To use this setting, set quotaAndLimits.flushRate.enabled to true at the same time.
Default Value: -1

quotaAndLimits.flushRate.collection.max

Description: Collection-level rate limit for flush requests, in qps.
Default Value: 0.1

quotaAndLimits.flushRate.db.max

Description: Database-level rate limit for flush requests, in qps. Default: no limit.
Default Value: -1
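
The flush settings above could be written as the following fragment, assuming the dotted key names map to nested YAML. Only the value 10 is an example taken from the description; the rest are the documented defaults.

```yaml
quotaAndLimits:
  flushRate:
    enabled: true   # documented default
    max: 10         # example: at most 10 flush requests per second
    collection:
      max: 0.1      # documented default, in qps
    db:
      max: -1       # documented default
```

Read as a rate, 0.1 qps works out to one flush every 10 seconds per collection.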

quotaAndLimits.compactionRate.enabled

Description: Whether manual compaction request throttling is enabled.
Default Value: false

quotaAndLimits.compactionRate.max

Description:
  • Maximum number of manual-compaction requests per second.
  • Setting this item to 10 indicates that Milvus processes no more than 10 manual-compaction requests per second.
  • To use this setting, set quotaAndLimits.compactionRate.enabled to true at the same time.
Default Value: -1

quotaAndLimits.compactionRate.db.max

Description: Database-level rate limit for manualCompaction requests, in qps. Default: no limit.
Default Value: -1
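
The index and manual-compaction limits follow the same pattern: an enabled switch plus a max rate. A minimal sketch, assuming the nested YAML layout implied by the key names and using the example value 10 from the descriptions:

```yaml
quotaAndLimits:
  indexRate:
    enabled: true   # needed for indexRate.max to apply
    max: 10         # at most 10 index creation/drop requests per second
  compactionRate:
    enabled: true   # needed for compactionRate.max to apply
    max: 10         # at most 10 manual-compaction requests per second
    db:
      max: -1       # documented default
```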

quotaAndLimits.dml.enabled

Description: Whether DML request throttling is enabled.
Default Value: false

quotaAndLimits.dml.insertRate.max

Description:
  • Highest data insertion rate per second.
  • Setting this item to 5 indicates that Milvus only allows data insertion at the rate of 5 MB/s.
  • To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
Default Value: -1

quotaAndLimits.dml.insertRate.db.max

Description: MB/s, default no limit.
Default Value: -1

quotaAndLimits.dml.insertRate.collection.max

Description:
  • Highest data insertion rate per collection per second.
  • Setting this item to 5 indicates that Milvus only allows data insertion to any collection at the rate of 5 MB/s.
  • To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
Default Value: -1

quotaAndLimits.dml.insertRate.partition.max

Description: MB/s, default no limit.
Default Value: -1

quotaAndLimits.dml.upsertRate.max

Description: MB/s, default no limit.
Default Value: -1

quotaAndLimits.dml.upsertRate.db.max

Description: MB/s, default no limit.
Default Value: -1

quotaAndLimits.dml.upsertRate.collection.max

Description: MB/s, default no limit.
Default Value: -1

quotaAndLimits.dml.upsertRate.partition.max

Description: MB/s, default no limit.
Default Value: -1

quotaAndLimits.dml.deleteRate.max

Description:
  • Highest data deletion rate per second.
  • Setting this item to 0.1 indicates that Milvus only allows data deletion at the rate of 0.1 MB/s.
  • To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
Default Value: -1

quotaAndLimits.dml.deleteRate.db.max

Description: MB/s, default no limit.
Default Value: -1

quotaAndLimits.dml.deleteRate.collection.max

Description:
  • Highest data deletion rate per collection per second.
  • Setting this item to 0.1 indicates that Milvus only allows data deletion from any collection at the rate of 0.1 MB/s.
  • To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
Default Value: -1

quotaAndLimits.dml.deleteRate.partition.max

Description: MB/s, default no limit.
Default Value: -1
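
A sketch of DML throttling using the example rates from the descriptions above (5 MB/s for inserts, 0.1 MB/s for deletes). The nested YAML layout is assumed from the dotted key names; the upsert value is hypothetical.

```yaml
quotaAndLimits:
  dml:
    enabled: true    # the descriptions above require this for the rates to apply
    insertRate:
      max: 5         # MB/s overall
      collection:
        max: 5       # MB/s per collection
    upsertRate:
      max: 5         # hypothetical value, MB/s
    deleteRate:
      max: 0.1       # MB/s overall
      collection:
        max: 0.1     # MB/s per collection
```

Levels left out of the fragment (db, partition) would keep their documented default of -1.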

quotaAndLimits.dml.bulkLoadRate.max

Description: MB/s, default no limit. Not supported yet. TODO: limit bulkLoad rate.
Default Value: -1

quotaAndLimits.dml.bulkLoadRate.db.max

Description: MB/s, default no limit. Not supported yet. TODO: limit db bulkLoad rate.
Default Value: -1

quotaAndLimits.dml.bulkLoadRate.collection.max

Description: MB/s, default no limit. Not supported yet. TODO: limit collection bulkLoad rate.
Default Value: -1

quotaAndLimits.dml.bulkLoadRate.partition.max

Description: MB/s, default no limit. Not supported yet. TODO: limit partition bulkLoad rate.
Default Value: -1

quotaAndLimits.dql.enabled

Description: Whether DQL request throttling is enabled.
Default Value: false

quotaAndLimits.dql.searchRate.max

Description:
  • Maximum number of vectors to search per second.
  • Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second no matter whether these 100 vectors are all in one search or scattered across multiple searches.
  • To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
Default Value: -1

quotaAndLimits.dql.searchRate.db.max

Description: vps (vectors per second), default no limit.
Default Value: -1

quotaAndLimits.dql.searchRate.collection.max

Description:
  • Maximum number of vectors to search per collection per second.
  • Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second per collection no matter whether these 100 vectors are all in one search or scattered across multiple searches.
  • To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
Default Value: -1

quotaAndLimits.dql.searchRate.partition.max

Description: vps (vectors per second), default no limit.
Default Value: -1

quotaAndLimits.dql.queryRate.max

Description:
  • Maximum number of queries per second.
  • Setting this item to 100 indicates that Milvus only allows 100 queries per second.
  • To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
Default Value: -1

quotaAndLimits.dql.queryRate.db.max

Description: qps, default no limit.
Default Value: -1

quotaAndLimits.dql.queryRate.collection.max

Description:
  • Maximum number of queries per collection per second.
  • Setting this item to 100 indicates that Milvus only allows 100 queries per collection per second.
  • To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
Default Value: -1

quotaAndLimits.dql.queryRate.partition.max

Description: qps, default no limit.
Default Value: -1
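
The search and query limits use different units, which a side-by-side fragment makes easier to see. This sketch assumes the nested YAML layout implied by the key names and reuses the example value 100 from the descriptions.

```yaml
quotaAndLimits:
  dql:
    enabled: true    # the descriptions above require this for the rates to apply
    searchRate:
      max: 100       # vectors per second (vps), not requests per second
      collection:
        max: 100     # vps per collection
    queryRate:
      max: 100       # queries per second (qps)
      collection:
        max: 100     # qps per collection
```

Under this fragment, one search carrying 100 vectors and 100 searches carrying one vector each would consume the same one-second search budget.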

quotaAndLimits.limitWriting.forceDeny

Description: When forceDeny is false, DML requests are allowed (except under some specific conditions, such as node memory reaching the water level). When it is true, all DML requests are always rejected.
Default Value: false

quotaAndLimits.limitWriting.ttProtection.maxTimeTickDelay

Description:
  • maxTimeTickDelay indicates the backpressure for DML operations.
  • DML rates are reduced according to the ratio of the time tick delay to maxTimeTickDelay.
  • If the time tick delay is greater than maxTimeTickDelay, all DML requests are rejected.
  • In seconds.
Default Value: 300
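
For reference, the two write-side settings above with their documented defaults, written as nested YAML (the nesting is an assumption based on the dotted key names):

```yaml
quotaAndLimits:
  limitWriting:
    forceDeny: false          # documented default: DML requests are allowed
    ttProtection:
      maxTimeTickDelay: 300   # seconds; a larger time tick delay rejects all DML requests
```

Setting forceDeny to true would reject every DML request regardless of the time tick delay, according to its description.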

quotaAndLimits.limitWriting.memProtection.enabled

Description:
  • When memory usage > memoryHighWaterLevel, all DML requests are rejected.
  • When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, the DML rate is reduced.
  • When memory usage < memoryLowWaterLevel, no action is taken.
Default Value: true

quotaAndLimits.limitWriting.memProtection.dataNodeMemoryLowWaterLevel

Description: (0, 1], memoryLowWaterLevel in DataNodes.
Default Value: 0.85

quotaAndLimits.limitWriting.memProtection.dataNodeMemoryHighWaterLevel

Description: (0, 1], memoryHighWaterLevel in DataNodes.
Default Value: 0.95

quotaAndLimits.limitWriting.memProtection.queryNodeMemoryLowWaterLevel

Description: (0, 1], memoryLowWaterLevel in QueryNodes.
Default Value: 0.85

quotaAndLimits.limitWriting.memProtection.queryNodeMemoryHighWaterLevel

Description: (0, 1], memoryHighWaterLevel in QueryNodes.
Default Value: 0.95
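
The memory protection settings above, collected into one fragment with their documented defaults. The nested YAML layout is assumed from the dotted key names.

```yaml
quotaAndLimits:
  limitWriting:
    memProtection:
      enabled: true
      dataNodeMemoryLowWaterLevel: 0.85    # below this: no action
      dataNodeMemoryHighWaterLevel: 0.95   # above this: all DML requests rejected
      queryNodeMemoryLowWaterLevel: 0.85
      queryNodeMemoryHighWaterLevel: 0.95
```

With these defaults, the DML rate is reduced only while memory usage sits between 0.85 and 0.95. This page does not describe how steeply the rate falls within that band.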

quotaAndLimits.limitWriting.growingSegmentsSizeProtection.enabled

Description:
  • No action is taken if the growing segments size is less than the low watermark.
  • When the growing segments size exceeds the low watermark, the DML rate is reduced, but not below minRateRatio * dmlRate.
Default Value: false

quotaAndLimits.limitWriting.diskProtection.enabled

Description: When the total file size of object storage is greater than `diskQuota`, all DML requests are rejected.
Default Value: true

quotaAndLimits.limitWriting.diskProtection.diskQuota

Description: MB, (0, +inf), default no limit.
Default Value: -1

quotaAndLimits.limitWriting.diskProtection.diskQuotaPerDB

Description: MB, (0, +inf), default no limit.
Default Value: -1

quotaAndLimits.limitWriting.diskProtection.diskQuotaPerCollection

Description: MB, (0, +inf), default no limit.
Default Value: -1

quotaAndLimits.limitWriting.diskProtection.diskQuotaPerPartition

Description: MB, (0, +inf), default no limit.
Default Value: -1
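
A sketch of a disk quota configuration, assuming the nested YAML layout implied by the key names. All quota values are hypothetical and expressed in MB, the unit given above.

```yaml
quotaAndLimits:
  limitWriting:
    diskProtection:
      enabled: true                   # documented default
      diskQuota: 102400               # hypothetical overall quota, in MB
      diskQuotaPerDB: 51200           # hypothetical per-database quota, in MB
      diskQuotaPerCollection: 10240   # hypothetical per-collection quota, in MB
      diskQuotaPerPartition: -1       # documented default
```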

quotaAndLimits.limitWriting.l0SegmentsRowCountProtection.enabled

Description: Switch to enable the L0 segment row count quota.
Default Value: false

quotaAndLimits.limitWriting.l0SegmentsRowCountProtection.lowWaterLevel

Description: L0 segment row count quota, low water level.
Default Value: 32768

quotaAndLimits.limitWriting.l0SegmentsRowCountProtection.highWaterLevel

Description: L0 segment row count quota, high water level.
Default Value: 65536
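
Enabling the L0 segment row count quota might look like the fragment below, which keeps the documented water levels. The nested YAML layout is an assumption based on the dotted key names.

```yaml
quotaAndLimits:
  limitWriting:
    l0SegmentsRowCountProtection:
      enabled: true           # documented default is false
      lowWaterLevel: 32768    # rows, documented default
      highWaterLevel: 65536   # rows, documented default
```

This page does not state what happens between the two levels, so the throttling behavior in that range should be confirmed against the documentation for your Milvus version.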

quotaAndLimits.limitReading.forceDeny

Description: When forceDeny is false, DQL requests are allowed (except under some specific conditions, such as the collection having been dropped). When it is true, all DQL requests are always rejected.
Default Value: false

quotaAndLimits.limitReading.queueProtection.nqInQueueThreshold

Description:
  • nqInQueueThreshold indicates when the system is under backpressure on the Search/Query path.
  • If the NQ in any QueryNode's queue is greater than nqInQueueThreshold, search and query rates gradually cool off until the NQ in the queue no longer exceeds nqInQueueThreshold. The NQ of a query request is counted as 1.
  • Integer, default no limit.
Default Value: -1

quotaAndLimits.limitReading.queueProtection.queueLatencyThreshold

Description:
  • queueLatencyThreshold indicates when the system is under backpressure on the Search/Query path.
  • If the DQL queuing latency is greater than queueLatencyThreshold, search and query rates gradually cool off until the queuing latency no longer exceeds queueLatencyThreshold.
  • The latency here refers to the latency averaged over a period of time.
  • Milliseconds, default no limit.
Default Value: -1

quotaAndLimits.limitReading.resultProtection.maxReadResultRate

Description:
  • maxReadResultRate indicates when the system is under backpressure on the Search/Query path.
  • If the DQL result rate is greater than maxReadResultRate, search and query rates gradually cool off until the read result rate no longer exceeds maxReadResultRate.
  • MB/s, default no limit.
Default Value: -1

quotaAndLimits.limitReading.coolOffSpeed

Description: coolOffSpeed is the speed at which search and query rates cool off. Range: (0, 1].
Default Value: 0.9
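
The read-side protections above can be sketched together as follows. The nested YAML layout is assumed from the dotted key names, and the three threshold values are hypothetical.

```yaml
quotaAndLimits:
  limitReading:
    forceDeny: false                # documented default: DQL requests are allowed
    queueProtection:
      nqInQueueThreshold: 1000      # hypothetical value; -1 means no limit
      queueLatencyThreshold: 500    # hypothetical value, in milliseconds
    resultProtection:
      maxReadResultRate: 100        # hypothetical value, in MB/s
    coolOffSpeed: 0.9               # documented default, range (0, 1]
```

This page lists no separate enable switch for queueProtection or resultProtection, so whether thresholds alone activate them is worth checking in the configuration file shipped with your version.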