Tutorial

This is a basic introduction to Milvus using pymilvus-orm.

For a runnable Python script, check out example.py on the pymilvus-orm GitHub repository, or hello milvus on the official Milvus website. Either is a good way to get started with Milvus and pymilvus-orm.

Note

Here we use float vectors as the example vector field data. If you want an example that uses binary vectors, see binary vector example.

Prerequisites

Before we start, there are some prerequisites.

Make sure that:

  • You have a running Milvus instance.

  • pymilvus-orm is correctly installed (see Installation).

Connect to Milvus

First of all, we need to import pymilvus-orm.

>>> from pymilvus_orm import connections

Then we can connect to the Milvus server. By default, Milvus runs on localhost on port 19530, so you can use the default values to connect.

>>> host = '127.0.0.1'
>>> port = '19530'
>>> connections.create_connection("default", {"host": host, "port": port})
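
If the connection fails, it can help to first confirm that something is listening on the Milvus port at all. The following is a minimal sketch using only the Python standard library; is_port_open is a hypothetical helper name, not part of pymilvus-orm, and a successful TCP connection only shows that the port is open, not that the service behind it is Milvus.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it is not part of pymilvus-orm.
    """
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (including timeouts and refused connections) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_port_open('127.0.0.1', 19530) should return True when the Milvus instance from the prerequisites is running locally on the default port.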

After connecting, we can communicate with Milvus in the following ways. If you are confused about the terminology, see Milvus Terminology for explanations.

Collection

Now let's create a new collection. Before we start, we can list all the collections that already exist. For a brand new Milvus instance, the result should be empty.

>>> from pymilvus_orm import list_collections
>>> list_collections()
[]

Create Collection

To create a collection, we need to provide a schema for it.

In this tutorial, we will create a collection with two fields: year and embedding.

The type of the year field is int64, and the type of the embedding field is FLOAT_VECTOR with a dim of 128.
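
To make these types concrete, the arithmetic below works out the raw payload of a single entity under this schema. It is a standard-library sketch that assumes each vector component is stored as a 32-bit float; raw_entity_bytes is a hypothetical helper, and the figure ignores any index, ID, or metadata overhead on the server.

```python
import struct

dim = 128  # dimension of the embedding field in this tutorial
INT64_BYTES = struct.calcsize("<q")    # 8 bytes for one int64 year
FLOAT32_BYTES = struct.calcsize("<f")  # 4 bytes, assuming 32-bit floats


def raw_entity_bytes(dim: int) -> int:
    """Raw payload of one entity: one int64 plus dim 32-bit floats."""
    return INT64_BYTES + dim * FLOAT32_BYTES


print(raw_entity_bytes(dim))  # 8 + 128 * 4 = 520
```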

Now we can create a collection:

>>> from pymilvus_orm import Collection, DataType, FieldSchema, CollectionSchema
>>> dim = 128
>>> year_field = FieldSchema(name="year", dtype=DataType.INT64, is_primary=False, description="year")
>>> embedding_field = FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim)
>>> schema = CollectionSchema(fields=[year_field, embedding_field], auto_id=True, description="desc of collection")
>>> collection_name = 'tutorial'
>>> collection = Collection(name=collection_name, schema=schema)

Then you can list collections and 'tutorial' will be in the result.

>>> list_collections()
['tutorial']

You can also get the description of the collection.

>>> collection.description
"desc of collection"

This is a basic introductory tutorial, so building indexes is not covered here. If you want to go further into Milvus with indexes, we recommend checking our index examples.

If you already know about indexes from the index examples and want a full list of the params supported by PyMilvus, check out the Index chapter of the PyMilvus documentation.

Furthermore, if you want a thorough view of indexes, see Vector Index on our official website.

Create Partition

If you don't create a partition, Milvus uses a default one called "_default", and all entities are inserted into it. You can check this with Collection.partitions():

>>> collection.partitions()

You can provide a partition name to create a new partition.

>>> collection.partition("new_partition")
>>> collection.partitions()

Insert Entities

An entity is a group of fields that corresponds to a real-world object. In this tutorial, the collection has two fields. Here is an example of 3 entities structured as a list of lists:

>>> import random
>>> nb = 3
>>> years = [i for i in range(nb)]
>>> embeddings = [[random.random() for _ in range(dim)] for _ in range(nb)]
>>> entities = [years, embeddings]
>>> collection.insert(entities)
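
The list passed to insert above is column-oriented: one inner list per field, in schema order, each holding one value per entity. If your data starts out as one record per entity, a small helper can transpose and sanity-check it first. This is a standard-library sketch; rows_to_columns is a hypothetical helper, not a pymilvus-orm function, and its checks reflect only this tutorial's schema (an integer year and a vector of dim floats).

```python
import random
from typing import List, Sequence, Tuple


def rows_to_columns(
    rows: Sequence[Tuple[int, Sequence[float]]], dim: int
) -> List[list]:
    """Turn (year, embedding) records into [years, embeddings] columns.

    Hypothetical helper for illustration; not part of pymilvus-orm.
    """
    years, embeddings = [], []
    for index, (year, embedding) in enumerate(rows):
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(year, int) or isinstance(year, bool):
            raise TypeError(f"row {index}: year must be an int")
        if not -2**63 <= year < 2**63:
            raise ValueError(f"row {index}: year does not fit in int64")
        if len(embedding) != dim:
            raise ValueError(
                f"row {index}: expected {dim} values, got {len(embedding)}"
            )
        years.append(year)
        embeddings.append([float(x) for x in embedding])
    return [years, embeddings]


# Three records, mirroring the entities built in the tutorial.
dim = 128
rows = [(i, [random.random() for _ in range(dim)]) for i in range(3)]
entities = rows_to_columns(rows, dim)
print(len(entities), len(entities[0]), len(entities[1][0]))  # prints: 2 3 128
```

The resulting entities list has the same shape as the one built by hand above, so it could be passed to collection.insert in the same way.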