This is a basic introduction to Milvus using PyMilvus.

For a runnable Python script, check out the PyMilvus GitHub repository, or hello milvus on the official Milvus website. Either is a good starting point for Milvus and PyMilvus.


This tutorial uses float vectors as the example vector field data. For an example with binary vectors, see binary vector example.


Before you start, make sure that:

  • You have a running Milvus instance.

  • PyMilvus is correctly installed; see Installation.
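
The first prerequisite can be checked roughly from Python before using PyMilvus. The sketch below only tests whether a TCP port accepts connections; it does not confirm that the listener is actually Milvus. The helper name is_port_open is hypothetical, and localhost:19530 is assumed to be the address of your instance (it is the Milvus default).

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # localhost:19530 is the default Milvus address; adjust it for your setup.
    print("Milvus port reachable:", is_port_open("localhost", 19530))
```

A result of False usually means the server is not running or is listening on a different address.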

Connect to Milvus

First, import connections from pymilvus.

>>> from pymilvus import connections

Then connect to the Milvus server. By default, Milvus runs on localhost at port 19530, so you can connect with the default values.

>>> connections.connect()  # connect with the default values

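Or you can register another Milvus server address under an alias and then connect to it:

>>> host = ''  # set this to the address of your Milvus server
>>> port = '19530'
>>> connections.add_connection(dev={"host": host, "port": port})
>>> connections.connect(alias="dev")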

After connecting, you can communicate with Milvus in the following ways. If you are confused about the terminology, see Milvus Terminology for explanations.


Now it's time to create a new collection. First, list all the collections that already exist. For a brand-new Milvus instance, the result should be empty.

>>> from pymilvus import utility
>>> utility.list_collections()

Create Collection

To create a collection, you need to provide a schema for it.

In this tutorial, you will create a collection with three fields: id, year and embedding.

  • The id field is of type INT64 and is set as the primary field.

  • The year field is of type INT64, and the embedding field is of type FLOAT_VECTOR with a dim of 128.

>>> from pymilvus import Collection, DataType, FieldSchema, CollectionSchema
>>> dim = 128
>>> fields = [
...     FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
...     FieldSchema(name="year", dtype=DataType.INT64, description="year"),
...     FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim),
... ]
>>> schema = CollectionSchema(fields, description='desc of collection')
>>> collection_name = "tutorial"
>>> tutorial = Collection(collection_name, schema, consistency_level="Strong")

Then you can list collections and 'tutorial' will be in the result.

>>> utility.list_collections()
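
The schema in this section relies on two rules: a collection has exactly one primary field, and a vector field needs a dim. As a rough illustration, these rules can be checked in plain Python without a server. This is only a sketch: the dict layout and the helper name check_field_specs are hypothetical and are not a PyMilvus format or API.

```python
# Plain-data description of the three fields; the dict keys are hypothetical
# and are not a PyMilvus format.
field_specs = [
    {"name": "id", "dtype": "INT64", "is_primary": True},
    {"name": "year", "dtype": "INT64"},
    {"name": "embedding", "dtype": "FLOAT_VECTOR", "dim": 128},
]


def check_field_specs(specs):
    """Return the primary field name, or raise ValueError if a rule is broken."""
    primary = [spec["name"] for spec in specs if spec.get("is_primary")]
    if len(primary) != 1:
        raise ValueError(f"expected exactly one primary field, got {primary}")
    for spec in specs:
        if spec["dtype"] == "FLOAT_VECTOR" and spec.get("dim", 0) <= 0:
            raise ValueError(f"vector field {spec['name']!r} needs a positive dim")
    return primary[0]


print(check_field_specs(field_specs))  # id
```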

This is a basic introductory tutorial, so building an index is not covered. To go further into Milvus with indexes, check our index examples.

If you already know about indexes from the index examples and want a full list of the params supported by PyMilvus, check out the Index chapter of the PyMilvus documentation.

Furthermore, for a thorough view of indexes, see Vector Index on our official website.

Create Partition

If you don't create a partition, there will be a default one called "_default", and all entities will be inserted into it. You can check this with the Collection.partitions property.

>>> tutorial.partitions
[{"name": "_default", "description": "", "num_entities": 0}]

You can provide a partition name to create a new partition.

>>> tutorial.create_partition("comedy")
>>> tutorial.partitions
[{"name": "_default", "description": "", "num_entities": 0}, {"name": "comedy", "description": "", "num_entities": 0}]

Insert Entities

An entity is a group of fields that corresponds to a real-world object. In this tutorial, the collection has three fields. Here is an example of 30 entities structured as a list of lists, with one inner list per field.

>>> import random
>>> num_entities = 30
>>> entities = [
...     list(range(num_entities)),  # field id
...     [random.randrange(1949, 2021) for _ in range(num_entities)],  # field year
...     [[random.random() for _ in range(dim)] for _ in range(num_entities)],  # field embedding
... ]
>>> insert_result = tutorial.insert(entities)
>>> insert_result
(insert count: 30, delete count: 0, upsert count: 0, timestamp: 430704946903515140)
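
If your data starts out as one record per entity, it has to be rearranged into the per-field layout used above before it is inserted. The following is a minimal sketch of that conversion in plain Python. The helper name rows_to_columns and the sample records are hypothetical, and the field order is assumed to match the schema of this tutorial (id, year, embedding).

```python
import random

DIM = 128  # must match the dim of the embedding field in the schema
FIELD_ORDER = ["id", "year", "embedding"]  # same order as the schema fields


def rows_to_columns(rows, field_order=FIELD_ORDER, dim=DIM):
    """Turn a list of per-entity dicts into one list per field."""
    for row in rows:
        if len(row["embedding"]) != dim:
            raise ValueError(f"entity {row['id']}: expected {dim} dimensions")
    return [[row[name] for row in rows] for name in field_order]


# Hypothetical sample records, one dict per entity.
rows = [
    {
        "id": i,
        "year": random.randrange(1949, 2021),
        "embedding": [random.random() for _ in range(DIM)],
    }
    for i in range(30)
]

entities = rows_to_columns(rows)
print(len(entities), [len(column) for column in entities])  # 3 [30, 30, 30]
```

The resulting entities list has the same shape as the one passed to tutorial.insert() in this section.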