An Introduction to Milvus Python SDK and API
By Xuan Yang
The following illustration depicts the interaction between SDKs and Milvus through gRPC. Imagine that Milvus is a black box. Protocol Buffers are used to define the interfaces of the server and the structure of the information they carry. Therefore, all operations in the black box Milvus are defined by the Protocol API.
Milvus Protocol API consists of milvus.proto, common.proto, and schema.proto, which are Protocol Buffers files suffixed with .proto. To ensure proper operation, SDKs must interact with Milvus through these Protocol Buffers files.
milvus.proto is the vital component of Milvus Protocol API because it defines the MilvusService, which further defines all RPC interfaces of Milvus.
The following code sample shows the interface CreatePartitionRequest. It has two major string-type parameters, collection_name and partition_name, based on which you can start a partition creation request.
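To make the shape of this request concrete, here is a minimal Python sketch that mirrors the request's two string-type parameters, `collection_name` and `partition_name`. The dataclass and the `validate` helper are illustrative stand-ins, not the classes generated from milvus.proto; the real message is defined in Protocol Buffers and may carry further fields. The sample values "books" and "novels" are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreatePartitionRequest:
    """Hypothetical stand-in for the Protocol Buffers message of the same name."""

    collection_name: str  # collection that will own the new partition
    partition_name: str   # name of the partition to create


def validate(request: CreatePartitionRequest) -> None:
    """Reject requests whose string fields are empty (an assumed check)."""
    if not request.collection_name:
        raise ValueError("collection_name must not be empty")
    if not request.partition_name:
        raise ValueError("partition_name must not be empty")


# Build a request from the two strings and check it before sending.
request = CreatePartitionRequest(collection_name="books", partition_name="novels")
validate(request)
```

An SDK in any language ultimately has to fill in the same fields before it sends the request over gRPC.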
Check an example of Protocol in PyMilvus GitHub Repository on line 19.
You can find the definition of
Contributors who wish to develop a feature of Milvus or an SDK in a different programming language are welcome to find all interfaces Milvus offers via RPC.
common.proto defines the common types of information, including
schema.proto defines the schema in the parameters. The following code sample is an example of
milvus.proto, common.proto, and schema.proto together constitute the API of Milvus, representing all operations that can be called via RPC.
If you dig into the source code and observe carefully, you will find that when an interface like create_index is called, it actually calls multiple RPC interfaces, such as describe_index. Many of the outward interfaces of Milvus are combinations of multiple RPC interfaces.
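The idea of one outward interface being built from several RPCs can be sketched as follows. This is a toy model, not the PyMilvus implementation: `FakeRpcStub`, its methods, the call order, and the existence check are all assumptions made for illustration.

```python
from typing import List


class FakeRpcStub:
    """Hypothetical stub that records which RPCs were invoked, in order."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def describe_index(self, collection_name: str) -> dict:
        self.calls.append("describe_index")
        return {"collection_name": collection_name, "exists": False}

    def create_index_rpc(self, collection_name: str, field_name: str) -> str:
        self.calls.append("create_index")
        return f"index on {collection_name}.{field_name}"


def create_index(stub: FakeRpcStub, collection_name: str, field_name: str) -> str:
    """Outward-facing call composed of two RPCs (assumed ordering)."""
    info = stub.describe_index(collection_name)  # first RPC: inspect current state
    if info["exists"]:
        raise RuntimeError("index already exists")
    return stub.create_index_rpc(collection_name, field_name)  # second RPC


stub = FakeRpcStub()
result = create_index(stub, "books", "embedding")
```

After the call, `stub.calls` lists both RPCs, which is the pattern to look for when reading how an SDK method is assembled.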
Having understood the behaviors of RPC, you can then develop new features for Milvus by combining them. You are more than welcome to use your imagination and creativity and contribute to the Milvus community.
In a nutshell, object-relational mapping (ORM) means that when you operate on a local object, the operation affects the corresponding object on the server. The PyMilvus ORM-style API has the following characteristics:
- It operates directly on objects.
- It isolates service logic and data access details.
- It hides the complexity of implementation, and you can run the same scripts across different Milvus instances regardless of their deployment approaches or implementation.
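These characteristics can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: the `Server` interface, the two toy deployments, and the `Collection` class are simplified stand-ins and do not reflect how PyMilvus or Milvus is implemented. The point is only that the script works on objects and runs unchanged against either deployment.

```python
from typing import List, Protocol, Set


class Server(Protocol):
    """Operations the script needs from any deployment (hypothetical interface)."""

    def create_collection(self, name: str) -> None: ...
    def drop_collection(self, name: str) -> None: ...
    def list_collections(self) -> List[str]: ...


class StandaloneServer:
    """Toy single-node deployment: all names live in one set."""

    def __init__(self) -> None:
        self._names: Set[str] = set()

    def create_collection(self, name: str) -> None:
        self._names.add(name)

    def drop_collection(self, name: str) -> None:
        self._names.discard(name)

    def list_collections(self) -> List[str]:
        return sorted(self._names)


class ClusterServer:
    """Toy distributed deployment: names are spread across shards."""

    def __init__(self, shards: int = 2) -> None:
        self._shards: List[Set[str]] = [set() for _ in range(shards)]

    def _shard_for(self, name: str) -> Set[str]:
        # Deterministic placement based on the bytes of the name.
        return self._shards[sum(name.encode()) % len(self._shards)]

    def create_collection(self, name: str) -> None:
        self._shard_for(name).add(name)

    def drop_collection(self, name: str) -> None:
        self._shard_for(name).discard(name)

    def list_collections(self) -> List[str]:
        return sorted(name for shard in self._shards for name in shard)


class Collection:
    """Local object whose operations are forwarded to the server."""

    def __init__(self, name: str, server: Server) -> None:
        self.name = name
        self._server = server
        server.create_collection(name)  # creating the object creates it remotely

    def drop(self) -> None:
        self._server.drop_collection(self.name)


def run_script(server: Server) -> List[str]:
    """The same script, independent of how the server is deployed."""
    Collection("books", server)
    drafts = Collection("drafts", server)
    drafts.drop()  # operating on the local object changes the server's state
    return server.list_collections()


standalone_result = run_script(StandaloneServer())
cluster_result = run_script(ClusterServer())
```

Both runs leave only the "books" collection on the server, even though the two toy deployments store their data differently.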
One essential aspect of the ORM-style API is its control of Milvus connections. For example, you can specify aliases for multiple Milvus servers, and connect to or disconnect from them using only their aliases. You can even delete a locally stored server address, and control specific objects precisely through a specific connection.
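A toy alias registry shows the bookkeeping this implies. It is a sketch under stated assumptions, not the PyMilvus connection manager: the `Connections` class and its method names are illustrative, no network traffic takes place, and the hosts are placeholders.

```python
from typing import Dict, List, Set


class Connections:
    """Toy alias registry; not the PyMilvus implementation."""

    def __init__(self) -> None:
        self._addresses: Dict[str, str] = {}  # alias -> "host:port"
        self._connected: Set[str] = set()     # aliases with a live connection

    def add_connection(self, alias: str, host: str, port: int) -> None:
        """Store a server address locally under an alias."""
        self._addresses[alias] = f"{host}:{port}"

    def connect(self, alias: str) -> None:
        if alias not in self._addresses:
            raise KeyError(f"unknown alias: {alias}")
        self._connected.add(alias)

    def disconnect(self, alias: str) -> None:
        self._connected.discard(alias)

    def remove_connection(self, alias: str) -> None:
        """Disconnect and delete the locally stored address."""
        self.disconnect(alias)
        self._addresses.pop(alias, None)

    def is_connected(self, alias: str) -> bool:
        return alias in self._connected

    def list_aliases(self) -> List[str]:
        return sorted(self._addresses)


connections = Connections()
connections.add_connection("dev", "localhost", 19530)
connections.add_connection("prod", "10.0.0.5", 19530)
connections.connect("dev")
connections.connect("prod")
connections.disconnect("dev")          # address is kept, connection is closed
connections.remove_connection("prod")  # address is deleted as well
```

Keeping addresses and live connections in separate structures is what lets an alias be disconnected without being forgotten.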
Another feature of ORM-style API is that, after abstraction, all operations can be performed directly on objects, including collection, partition, and index.
You can abstract a collection object by getting an existing one or creating a new one. You can also assign a Milvus connection to specific objects using connection alias, so that you can operate on these objects locally.
To create a partition object, you can either create it through its parent collection object, or construct it directly, just as you would a collection object. The same approaches apply to index objects.
If these partition or index objects already exist, you can get them through their parent collection object.
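The relationship between collection, partition, and index objects can be sketched with three small classes. These are simplified, hypothetical models: the constructor arguments, the `using` alias parameter, and the lookup methods are assumptions for illustration and are not the PyMilvus classes themselves.

```python
from typing import Dict, Optional


class Partition:
    """Toy partition that registers itself with its parent collection."""

    def __init__(self, collection: "Collection", name: str) -> None:
        self.collection = collection
        self.name = name
        collection._partitions[name] = self


class Index:
    """Toy index on one field of its parent collection."""

    def __init__(self, collection: "Collection", field_name: str, params: dict) -> None:
        self.collection = collection
        self.field_name = field_name
        self.params = params
        collection._indexes[field_name] = self


class Collection:
    """Toy collection bound to a connection alias."""

    def __init__(self, name: str, using: str = "default") -> None:
        self.name = name
        self.using = using  # alias of the connection this object operates through
        self._partitions: Dict[str, Partition] = {}
        self._indexes: Dict[str, Index] = {}

    def create_partition(self, name: str) -> Partition:
        return Partition(self, name)

    def create_index(self, field_name: str, params: dict) -> Index:
        return Index(self, field_name, params)

    def partition(self, name: str) -> Optional[Partition]:
        """Return an existing partition, or None if it does not exist."""
        return self._partitions.get(name)

    def index(self, field_name: str) -> Optional[Index]:
        """Return an existing index, or None if it does not exist."""
        return self._indexes.get(field_name)


books = Collection("books", using="dev")
novels = books.create_partition("novels")  # created through the parent
essays = Partition(books, "essays")        # constructed directly
ivf = Index(books, "embedding", {"index_type": "IVF_FLAT"})
```

Whichever way a partition or index is created, it can afterwards be fetched through the parent, for example with `books.partition("essays")`.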
With the official announcement of general availability of Milvus 2.0, we orchestrated this Milvus Deep Dive blog series to provide an in-depth interpretation of the Milvus architecture and source code.