Release Notes

Find out what’s new in Milvus! This page summarizes information about new features, improvements, known issues, and bug fixes in each release. You can find the release notes for each released version after v2.0.0-RC1 in this section. We suggest that you regularly visit this page to learn about updates.

v2.0.0-RC7

Release date: 2021-10-11

Compatibility

Milvus version | Python SDK version | Java SDK version | Go SDK version | Node SDK version
2.0.0-RC7 | 2.0.0rc7 | Coming soon | Coming soon | 1.0.18

Milvus 2.0.0-RC7 is a preview version of Milvus 2.0.0-GA. It supports collection alias, shares msgstream on physical channel, and changes the default MinIO and Pulsar dependencies to cluster version. Several resource leaks and deadlocks were fixed.

Note that Milvus 2.0.0-RC7 is NOT compatible with previous versions of Milvus 2.0.0 because of changes to the storage format.

Improvements

  • #8215 Adds max number of retries for interTask in query coord.

  • #9459 Applies collection start position.

  • #8721 Adds Node ID to Log Name.

  • #8940 Adds streaming segments memory to used memory in checkLoadMemory.

  • #8542 Replaces proto.MarshalTextString with proto.Marshal.

  • #8770 Refactors flowgraph and related invocation.

  • #8666 Changes CMake version.

  • #8653 Updates getCompareOpType.

  • #8697 #8682 #8657 Applies collection start position when opening segment.

  • #8608 Changes segment replica structure.

  • #8565 Refactors buffer size calculation.

  • #8262 Adds segcore logger.

  • #8138 Adds BufferData in insertBufferNode.

  • #7738 Implements allocating msgstream from pool when creating collections.

  • #8054 Improves code in insertBufferNode.

  • #7909 Upgrades pulsar-client-go to 0.6.0.

  • #7913 Moves segcore rows_per_chunk configuration to query_node.yaml.

  • #7792 Removes ctx from LongTermChecker.

  • #9269 Changes == to is when comparing to None in expression.

  • #8159 Makes FlushSegments async.

  • #8278 Refactors rocksmq close logic and improves code coverage.

  • #7797 Uses definitional type instead of raw type.

Features

  • #9579 Uses replica memory size and cacheSize in getSystemInfoMetrics.

  • #9556 Adds ProduceMark interface to return message ID.

  • #9554 Supports LoadPartial interface for DataKV.

  • #9471 Supports DescribeCollection by collection ID.

  • #9451 Stores index parameters to descriptor event.

  • #8574 Adds a round_decimal parameter for precision control to search function.

  • #8947 Rocksmq supports SubscriptionPositionLatest.

  • #8919 Splits blob into several string rows when index file is large.

  • #8914 Binlog parser tool supports index files.

  • #8514 Refactors the index file format.

  • #8765 Adds cacheSize to prevent OOM in query node.

  • #8673 #8420 #8212 #8272 #8166 Supports multiple Milvus clusters sharing Pulsar and MinIO.

  • #8654 Adds BroadcastMark for Msgstream returning Message IDs.

  • #8586 Adds Message ID return value into producers.

  • #8408 #8363 #8454 #8064 #8480 Adds session liveness check.

  • #8264 Adds description event extras.

  • #8341 Replaces MarshalTextString with Marshal in root coord.

  • #8228 Supports healthz check API.

  • #8276 Initializes the SIMD type when initializing an index node.

  • #7967 Adds knowhere.yaml to support knowhere configuration.

  • #7974 Supports setting max task number of task queue.

  • #7948 #7975 Adds suffixSnapshot to implement SnapshotKV.

  • #7942 Supports configuring SIMD type.

  • #7814 Supports bool field filter in search and query expression.

  • #7635 Supports setting segcore rows_per_chunk via configuration file.
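
The round_decimal parameter from #8574 controls how many decimal places are kept in the distances that a search returns. The following is a minimal sketch of the idea in plain Python, not the Milvus implementation. The helper name round_distances and the convention that a negative value disables rounding are assumptions made for illustration.

```python
from typing import List


def round_distances(distances: List[float], round_decimal: int = -1) -> List[float]:
    """Round each distance to `round_decimal` decimal places.

    Assumption for this sketch: a negative value leaves distances untouched.
    """
    if round_decimal < 0:
        return list(distances)
    return [round(d, round_decimal) for d in distances]


raw = [0.123456, 1.987654, 2.5]
print(round_distances(raw, 3))  # distances kept to three decimal places
print(round_distances(raw))     # distances returned unchanged
```

Rounding only affects how results are reported. It does not change which entities are returned or their order.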

Bug Fixes

  • #9572 Rocksdb does not delete the end key after DeleteRange is called.

  • #8735 Acked information takes up memory resources.

  • #9454 Data race in query service.

  • #8850 SDK raises error with a message about index when dropping collection by alias.

  • #8930 Flush occasionally gets stuck when SaveBinlogPath fails due to instant buffer removal from insertBuf.

  • #8868 Trace log catches the wrong file name and line number.

  • #8844 SearchTask result is nil.

  • #8835 Root coord crashes because of bug in pulsar-client-go.

  • #8780 #8268 #7255 Collection alias-related issues.

  • #8744 Rocksdb_kv error process.

  • #8752 Data race in mqconsumer.

  • #8686 Flush after auto-flush will not finish.

  • #8564 #8405 #8743 #8798 #9509 #8884 Rocksdb memory leak.

  • #8671 Objects are not removed in MinIO when dropped.

  • #8050 #8545 #8567 #8582 #8562 tsafe-related issues.

  • #8137 Time goes backward because TSO does not load last timestamp.

  • #8461 Potential data race in data coord.

  • #8386 Incomplete logic when allocating dm channel to data node.

  • #8206 Incorrect reduce algorithm in proxy search task.

  • #8120 Potential data race in root coord.

  • #8068 Query node crashes when query result is empty and optional retrieve_ret_ is not initialized.

  • #8060 Query task panicking.

  • #8091 Data race in proxy gRPC client.

  • #8078 Data race in root coord gRPC client.

  • #7730 Topic and ConsumerGroup remain after CloseRocksMQ.

  • #8188 Logic error in releasing collections.

v2.0.0-RC6

Release date: 2021-09-10

Compatibility

Milvus version | Python SDK version | Java SDK version | Go SDK version | Node SDK version
2.0.0-RC6 | 2.0.0rc7 | Coming soon | Coming soon | 1.0.18

Milvus 2.0.0-RC6 is a preview version of Milvus 2.0.0. It supports specifying the shard number when creating collections and querying by expression, and it exposes more cluster metrics through the API. RC6 also increases unit test coverage to 80% and fixes a series of issues such as resource leaks and system panics.

Improvements

  • Increases unit test coverage to 80%.

Features

  • #7482 Supports specifying shard number when creating a collection.
  • #7386 Supports query by expression.
  • Exposes system metrics through API:
    • #7400 Proxy metrics integrate with other coordinators.
    • #7177 Exposes metrics of data node and data coord.
    • #7228 Exposes metrics of root coord.
    • #7472 Exposes more detailed metrics information.
    • #7436 Supports caching the system information metrics.
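
To make the shard number from #7482 concrete: a collection's rows are spread across a fixed number of shards chosen at creation time. The sketch below routes rows by hashing the primary key modulo the shard count. It is a generic illustration under that assumption, not Milvus's actual routing code, and the names shard_for, route, and shard_num are hypothetical.

```python
import zlib
from collections import defaultdict
from typing import Dict, List


def shard_for(primary_key: int, shard_num: int) -> int:
    """Map a primary key to a shard index in [0, shard_num)."""
    if shard_num <= 0:
        raise ValueError("shard_num must be positive")
    # crc32 is stable across processes, which keeps routing deterministic.
    digest = zlib.crc32(str(primary_key).encode("utf-8"))
    return digest % shard_num


def route(primary_keys: List[int], shard_num: int) -> Dict[int, List[int]]:
    """Group primary keys by the shard they are assigned to."""
    shards: Dict[int, List[int]] = defaultdict(list)
    for pk in primary_keys:
        shards[shard_for(pk, shard_num)].append(pk)
    return dict(shards)


print(route(list(range(10)), shard_num=2))
```

Because the mapping depends on the shard count, a scheme like this fixes the number of shards when the collection is created.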

Bug Fixes

  • #7434 Query node OOM when loading a collection that exceeds the memory limit.
  • #7678 Standalone OOM when recovering from existing storage.
  • #7636 Standalone panic when sending message to a closed channel.
  • #7631 Milvus panic when closing flowgraph.
  • #7605 Milvus crashed with panic when running nightly CI tests.
  • #7596 Nightly cases failed because rootcoord disconnected from etcd.
  • #7557 Wrong search result returned when the term content in expression is not in order.
  • #7536 Incorrect MqMsgStream Seek logic.
  • #7527 Dataset's memory leak in knowhere when searching.
  • #7444 Deadlock of channels time ticker.
  • #7428 Possible deadlock when MqMsgStream broadcast fails.
  • #7715 Query request overwritten by concurrent operations on the same slice.

v2.0.0-RC5

Release date: 2021-08-30

Compatibility

Milvus version | Python SDK version | Java SDK version | Go SDK version | Node SDK version
2.0.0-RC5 | 2.0.0rc7 | Coming soon | Coming soon | 1.0.18

Milvus 2.0.0-RC5 is a preview version of Milvus 2.0.0. It supports a message queue data retention mechanism and etcd data cleanup, exposes cluster metrics through the API, and prepares for delete operation support. RC5 also makes great progress on system stability: it fixes a series of resource leaks and operation hangs, as well as the misconfiguration of standalone Pulsar under Milvus cluster.

Improvements

  • #7226 Refactors data coord allocator.
  • #6867 Adds connection manager.
  • #7172 Adds a seal policy to restrict the lifetime of a segment.
  • #7163 Increases the timeout for gRPC connection when creating index.
  • #6996 Adds a minimum interval for segment flush.
  • #6590 Saves binlog path in SegmentInfo.
  • #6848 Removes RetrieveRequest and RetrieveTask.
  • #7102 Supports vector field as output.
  • #7075 Refactors NewEtcdKV API.
  • #6965 Adds channel for data node to watch etcd.
  • #7066 Optimizes search reduce logic.
  • #6993 Enhances the log when parsing gRPC recv/send parameters.
  • #7331 Changes context to correct package.
  • #7278 Enables etcd auto compaction for every 1000 revisions.
  • #7355 Cleans up fmt.Println in util/flowgraph.

Features

  • #7112 #7174 Imports an embedded etcdKV (part 1).
  • #7231 Adds a segment filter interface.
  • #7157 Exposes metrics of index coord and index nodes.
  • #7137 #7157 Exposes system topology information by proxy.
  • #7113 #7157 Exposes metrics of query coord and query nodes.
  • #7134 Allows users to get vectors using memory instead of local storage.
  • #6617 Supports retention for rocksmq.
  • #7303 Adds query node segment filter.
  • #7304 Adds delete API into proto.
  • #7261 Adds delete node.
  • #7268 Constructs Bloom filter when inserting.

Bug Fixes

  • #7272 #7352 #7335 Failure to start new docker container with existing volumes if index was created: proxy is not healthy.
  • #7243 Failure to create index in a new version of Milvus for data that were inserted in an old version.
  • #7253 Search gets empty results after releasing a different partition.
  • #7244 #7227 Proxy crashes when receiving empty search results.
  • #7203 Connection gets stuck when gRPC server is down.
  • #7188 Incomplete unit test logic.
  • #7175 Unspecific error message is returned when calculating distances using collection IDs without loading.
  • #7151 Data node flowgraph does not close because of missing DropCollection.
  • #7167 Failure to load IVF_FLAT index.
  • #7123 Timestamp goes backward for timeticksync.
  • #7140 calc_distance returns wrong results for binary vectors when using TANIMOTO metrics.
  • #7143 The state of memory and etcd is inconsistent if KV operation fails.
  • #7141 #7136 Index building gets stuck when the index node pod is frequently killed and pulled up.
  • #7119 Pulsar msgStream may get stuck when subscribed with the same topic and sub name.
  • #6971 Exception occurs when searching with index (HNSW).
  • #7104 Search gets stuck if query nodes only load sealed segment without watching insert channels.
  • #7085 Segments do not auto flush.
  • #7074 Index nodes wait for index coord to start to complete.
  • #7061 Segment allocation does not expire if data coord does not receive timetick message from data node.
  • #7059 Query nodes get producer leakage.
  • #7005 Query nodes do not return error to query coord when loadSegmentInternal fails.
  • #7054 Query nodes return incorrect IDs when topk is larger than row_num.
  • #7053 Incomplete allocation logic.
  • #7044 Lack of check on unindexed vectors in memory before retrieving vectors in local storage.
  • #6862 Memory leaks in flush cache of data node.
  • #7346 Query coord container exited in less than 1 minute when re-installing Milvus cluster.
  • #7339 Incorrect expression boundary.
  • #7311 Collection nil when adding query collection.
  • #7266 Flowgraph released incorrectly.
  • #7310 Excessive timeout when searching after releasing and loading a partition.
  • #7320 Port conflicts between embedded etcd and external etcd.
  • #7336 Data node corner cases.

v2.0.0-RC4

Release date: 2021-08-13

Compatibility

Milvus version | Python SDK version | Java SDK version | Go SDK version
2.0.0-RC4 | 2.0.0rc7 | Coming soon | Coming soon

Milvus 2.0.0-RC4 is a preview version of Milvus 2.0.0. It mainly focuses on fixing stability issues. It also offers functionality to retrieve vector data from object storage and to specify output fields by wildcard matching.

Improvements

  • #6984 #6772 #6704 #6652 #6536 #6522 Unit test improvements.

  • #6859 Increases the MaxCallRecvMsgSize and MaxCallSendMsgSize of gRPC client.

  • #6796 Fixes MsgStream exponential retry.

  • #6897 #6899 #6681 #6766 #6768 #6597 #6501 #6477 #6478 #6935 #6871 #6671 #6682 Log improvements.

  • #6440 Refactors segment manager.

  • #6421 Splits raw vectors to several smaller binlog files when creating index.

  • #6466 Separates the idea of query and search.

  • #6505 Changes output_fields to out_fields_id for RetrieveRequest.

  • #6427 Refactors the logic of assigning tasks in index coord.

  • #6529 #6599 Refactors the snapshot of timestamp statistics.

  • #6692 #6343 Shows/Describes collections/partitions with created timestamps.

  • #6629 Adds the WatchWithVersion interface for etcdKV.

  • #6666 Refactors expression executor to use single bitsets.

  • #6664 Auto creates new segments when allocating rows that exceed the maximum number of rows per segment.

  • #6786 Refactors RangeExpr and CompareExpr.

  • #6497 Loosens the lower limit of dimension when searching on a binary vector field.

Features

  • #6706 Supports reading vectors from disk.

  • #6299 #6598 Supports query vector field.

  • #5210 Extends the grammar of Boolean expressions.

  • #6411 #6650 Supports wildcards and wildcard matching on search/query output fields.

  • #6464 Adds a vector chunk manager to support vector file local storage.

  • #6701 Supports data persistence with docker compose deployments.

  • #6767 Adds a Grafana dashboard .json file for Milvus.

Bug Fixes

  • #5443 CalcDistance returns wrong results when fetching vectors from collection.

  • #7004 Pulsar consumer causes goroutine leakage.

  • #6946 Data race occurs when a flow graph calls close() immediately after start().

  • #6903 Uses proto marshal instead of marshalTextString in querycoord to avoid a crash triggered by an unknown field name.

  • #6374 #6849 Load collection failure.

  • #6977 Search returns wrong limit after a partition or collection is dropped.

  • #6515 #6567 #6552 #6483 Data node BackGroundGC does not work and causes memory leak.

  • #6943 The MinIOKV GetObject method does not close the client and leaks a goroutine per call.

  • #6370 Search is stuck due to wrong semantics offered by load partition.

  • #6831 Data node crashes in meta service.

  • #6469 Search binary results are wrong with metrics of Hamming when limit (topK) is bigger than the quantity of inserted entities.

  • #6693 Timeout caused by segment race condition.

  • #6097 Load hangs after frequently restarting query node within a short period of time.

  • #6464 Data sorter edge cases.

  • #6419 Milvus crashes when inserting empty vectors.

  • #6477 Different components repeatedly create buckets in MinIO.

  • #6377 Query results get incorrect global sealed segments from etcd.

  • #6499 TSO allocates wrong timestamps.

  • #6501 Channels are lost after data node crashes.

  • #6527 Task info of watchQueryChannels can't be deleted from etcd.

  • #6576 #6526 Duplicate primary field IDs are added when retrieving entities.

  • #6627 #6569 std::sort does not work properly to filter search results when the distance of new record is NaN.

  • #6655 Proxy crashes when retrieve task is called.

  • #6762 Incorrect created timestamp of collections and partitions.

  • #6644 Data node fails to restart automatically.

  • #6641 Failure to stop data coord when disconnecting from etcd.

  • #6621 Milvus throws an exception when the inserted data size is larger than the segment.

  • #6436 #6573 #6507 Incorrect handling of time synchronization.

  • #6732 Failure to create IVF_PQ index.

v2.0.0-RC2

Release date: 2021-07-13

Compatibility

Milvus version | Python SDK version | Java SDK version | Go SDK version
2.0.0-RC2 | 2.0.0rc7 | Coming soon | Coming soon

Milvus 2.0.0-RC2 is a preview version of Milvus 2.0.0. It fixes stability and performance issues and refactors code for node and storage management.

Improvements

  • #6356 Refactors code for cluster in data coordinator.
  • #6300 Refactors code for meta management in data coordinator.
  • #6289 Adds collectionID and partitionID to SegmentIndexInfo.
  • #6258 Clears the corresponding searchMsgStream in proxy when calling releaseCollection().
  • #6227 Merges code related to retrieve and search in query node.
  • #6196 Adds candidate management for data coordinator to manage data node cluster.
  • #6188 Adds Building Milvus with Docker docs.

Features

  • #6386 Adds the fget_objects() method for loading files from MinIO to the local device.
  • #6253 Adds the GetFlushedSegments() method in data coordinator.
  • #6213 Adds the GetIndexStates() method.

Bug Fixes

  • #6184 Search accuracy worsens when dataset gets larger.
  • #6308 The server crashes if the KNNG in NSG is not full.
  • #6212 Search hangs after restarting query nodes.
  • #6265 The server does not check node status when detecting nodes are online.
  • #6359 #6334 An error occurs when compiling Milvus on CentOS.

v2.0.0-RC1

Release date: 2021-06-28

Compatibility

Milvus version | Python SDK version | Java SDK version | Go SDK version
2.0.0-RC1 | 2.0.0rc7 | Coming soon | Coming soon

Milvus 2.0.0-RC1 is the preview version of 2.0.0. It introduces Golang as the distributed layer development language and a new cloud-native distributed design. The latter brings significant improvements to scalability, elasticity, and functionality.

Architecture

Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility.

The system breaks down into four levels:

  • Access layer
  • Coordinator service
  • Worker nodes
  • Storage

Access layer: The front layer of the system and endpoint to users. It comprises peer proxies for forwarding requests and gathering results.

Coordinator service: The coordinator service assigns tasks to the worker nodes and functions as the system's brain. It has four coordinator types: root coord, data coord, query coord, and index coord.

Worker nodes: Worker nodes are dumb executors that follow the instructions from the coordinator service. There are three types of worker nodes, each responsible for a different job: data nodes, query nodes, and index nodes.

Storage: The cornerstone of the system that all other functions depend on. It has three storage types: meta storage, log broker, and object storage. Kudos to the open-source communities of etcd, Pulsar, MinIO, and RocksDB for building this fast, reliable storage.
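
The four levels and their component types can be summarized as a small lookup table. This is only a restatement of the descriptions above as a data structure. The names ARCHITECTURE and layer_of are hypothetical and do not correspond to anything in the Milvus codebase.

```python
from typing import Dict, List

# Layer -> component types, taken from the descriptions above.
ARCHITECTURE: Dict[str, List[str]] = {
    "access layer": ["proxy"],
    "coordinator service": ["root coord", "data coord", "query coord", "index coord"],
    "worker nodes": ["data node", "query node", "index node"],
    "storage": ["meta storage", "log broker", "object storage"],
}


def layer_of(component: str) -> str:
    """Return the layer that a component type belongs to."""
    for layer, components in ARCHITECTURE.items():
        if component in components:
            return layer
    raise KeyError(f"unknown component: {component}")


print(layer_of("query coord"))  # coordinator service
```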

For more information about how the system works, see Milvus 2.0 Architecture.

New Features

SDK

  • Object-relational mapping (ORM) PyMilvus

    The PyMilvus APIs operate directly on collections, partitions, and indexes, helping users focus on the building of an effective data model rather than the detailed implementation.

Core Features

  • Hybrid Search between scalar and vector data

    Milvus 2.0 supports storing scalar data. Operators such as GREATER, LESS, EQUAL, NOT, IN, AND, and OR can be used to filter scalar data before a vector search is conducted. Currently supported data types include bool, int8, int16, int32, int64, float, and double. Support for string/VARBINARY data will be offered in a later version.

  • Match query

    Unlike the search operation, which returns similar results, the match query operation returns exact matches. Match query can be used to retrieve vectors by ID or by condition.

  • Tunable consistency

    Distributed databases make tradeoffs between consistency and availability/latency. Milvus offers four consistency levels (from strongest to weakest): strong, bounded staleness, session, and consistent prefix. You can define your own read consistency by specifying the read timestamp. As a rule of thumb, the weaker the consistency level, the higher the availability and the higher the performance.

  • Time travel

    Time travel allows you to access historical data at any point within a specified time period, making it possible to query, restore, and back up data from the past.
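
The core features above can be illustrated with a toy in-memory collection. This is a conceptual sketch in plain Python, not PyMilvus code and not how Milvus is implemented. The ToyCollection class, the Row type, and the read_ts argument are hypothetical names. Here read_ts plays the role of the read timestamp mentioned under tunable consistency and time travel.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Row:
    id: int
    vector: List[float]
    scalars: Dict[str, float]
    ts: int  # logical timestamp at which the row was inserted


class ToyCollection:
    """In-memory stand-in for a collection; for illustration only."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def insert(self, row: Row) -> None:
        self.rows.append(row)

    def _visible(self, read_ts: Optional[int]) -> List[Row]:
        # Time travel: only rows inserted at or before read_ts are visible.
        if read_ts is None:
            return list(self.rows)
        return [r for r in self.rows if r.ts <= read_ts]

    def query(self, predicate: Callable[[Row], bool],
              read_ts: Optional[int] = None) -> List[Row]:
        # Match query: exact matches only, no similarity ranking.
        return [r for r in self._visible(read_ts) if predicate(r)]

    def search(self, target: List[float], top_k: int,
               predicate: Callable[[Row], bool] = lambda r: True,
               read_ts: Optional[int] = None) -> List[Row]:
        # Hybrid search: filter on scalar data first, then rank by L2 distance.
        candidates = self.query(predicate, read_ts)
        candidates.sort(key=lambda r: math.dist(r.vector, target))
        return candidates[:top_k]


books = ToyCollection()
books.insert(Row(1, [0.0, 0.0], {"price": 10}, ts=100))
books.insert(Row(2, [1.0, 1.0], {"price": 30}, ts=200))
books.insert(Row(3, [0.1, 0.1], {"price": 50}, ts=300))

# Hybrid search: nearest neighbors of [0, 0] among rows with price > 20.
print([r.id for r in books.search([0.0, 0.0], top_k=2,
                                  predicate=lambda r: r.scalars["price"] > 20)])
# Match query by ID.
print([r.id for r in books.query(lambda r: r.id in (1, 3))])
# Time travel: the same search as of timestamp 250 does not see row 3.
print([r.id for r in books.search([0.0, 0.0], top_k=2, read_ts=250)])
```

The three calls print [3, 2], [1, 3], and [1, 2]. The last result differs from an unrestricted search because row 3 was inserted after the chosen read timestamp.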

Miscellaneous

  • Supports installing Milvus 2.0 with Helm or Docker-compose.

  • Supports Prometheus and Grafana for monitoring and alerts.

  • Milvus Insight

    Milvus Insight is a graphical management system for Milvus. It features visualization of cluster states, meta management, data queries and more. Milvus Insight will eventually be open sourced.

Breaking Changes

Milvus 2.0 uses an entirely different programming language, data format, and distributed architecture compared with previous versions. This means prior versions of Milvus cannot be upgraded to 2.x. However, Milvus 1.x is receiving long-term support and data migration tools will be made available as soon as possible.

Specific breaking changes include:

  • Java, Go, and C++ SDKs are not yet supported.

  • Delete or update is not yet supported.

  • PyMilvus-ORM does not support force flush.

  • Data format is incompatible with all prior versions.

  • Mishards is deprecated because Milvus 2.0 is distributed and sharding middleware is no longer necessary.

  • Local file system and distributed system storage are not yet supported.
