
Create a Collection

This topic describes how to create a collection in Milvus.

A collection consists of one or more partitions. When you create a new collection, Milvus also creates a default partition named _default. See Glossary - Collection for more information.

The following example builds a two-shard collection named book, with a primary key field named book_id, an INT64 scalar field named word_count, and a two-dimensional, floating-point vector field named book_intro. Real applications typically use vectors with far more dimensions than this example.

Milvus supports setting a consistency level when creating a collection (currently only in PyMilvus). In this example, the consistency level of the collection is set to Strong, meaning Milvus reads the latest data view at the exact point in time when a search or query request arrives. A collection created without a specified consistency level defaults to Bounded, under which Milvus reads a slightly older data view (usually several seconds earlier) when a search or query request arrives. You can also set the consistency level for an individual search or query (currently only in PyMilvus). The consistency level set in a search or query request overrides the one set when the collection was created, so we recommend setting the consistency level at search or query time. For the other consistency levels supported by Milvus, see Guarantee Timestamp in Search Requests.
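The precedence described above can be sketched in plain Python. This is only an illustration of the rule, not part of PyMilvus: the function name effective_consistency_level and its parameters are hypothetical, and the list of levels is taken from this guide.

```python
from typing import Optional

# Levels named in this guide; Bounded is the documented default.
CONSISTENCY_LEVELS = ("Strong", "Bounded", "Session", "Eventually", "Customized")
DEFAULT_LEVEL = "Bounded"


def effective_consistency_level(
    collection_level: Optional[str] = None,
    request_level: Optional[str] = None,
) -> str:
    """Return the level a search or query would use (hypothetical helper)."""
    for level in (collection_level, request_level):
        if level is not None and level not in CONSISTENCY_LEVELS:
            raise ValueError(f"unknown consistency level: {level!r}")
    # A level set on the request takes priority over the collection's level.
    if request_level is not None:
        return request_level
    # Otherwise fall back to the collection's level, then to the default.
    if collection_level is not None:
        return collection_level
    return DEFAULT_LEVEL


print(effective_consistency_level())                        # default
print(effective_consistency_level("Strong"))                # collection-level setting
print(effective_consistency_level("Strong", "Eventually"))  # request-level override
```

Because the request-level value always takes priority, setting it explicitly on each search or query keeps the behavior visible at the call site.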

Prepare Schema

  • Connect to the Milvus server before performing any operation.
  • The collection to create must contain a primary key field and a vector field. INT64 is the only supported data type for the primary key field in the current release of Milvus.
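The schema requirements above can be checked before anything is sent to the server. The following is a minimal sketch in plain Python, assuming a hypothetical dictionary format for fields (it is not the PyMilvus FieldSchema type) and using the [1, 32768] dimension range given in the parameter tables of this guide.

```python
MAX_DIM = 32768  # upper bound of the documented vector dimension range
VECTOR_TYPES = {"FLOAT_VECTOR", "BINARY_VECTOR"}


def validate_fields(fields):
    """Return a list of problems found in a list of field dicts (hypothetical format)."""
    problems = []

    # A primary key field is required, and it must be INT64.
    primary = [f for f in fields if f.get("is_primary")]
    if not primary:
        problems.append("a primary key field is required")
    for f in primary:
        if f["dtype"] != "INT64":
            problems.append(f"primary key {f['name']!r} must be INT64")

    # A vector field is required, and its dimension must be in range.
    vectors = [f for f in fields if f["dtype"] in VECTOR_TYPES]
    if not vectors:
        problems.append("a vector field is required")
    for f in vectors:
        dim = f.get("dim")
        if not isinstance(dim, int) or not 1 <= dim <= MAX_DIM:
            problems.append(f"vector field {f['name']!r} needs dim in [1, {MAX_DIM}]")
    return problems


# The book schema from this guide, expressed in the hypothetical dict format.
book_fields = [
    {"name": "book_id", "dtype": "INT64", "is_primary": True},
    {"name": "word_count", "dtype": "INT64"},
    {"name": "book_intro", "dtype": "FLOAT_VECTOR", "dim": 2},
]
print(validate_fields(book_fields))
```

An empty list means the field definitions satisfy the two requirements; the server remains the authority on what it actually accepts.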

First, prepare the necessary parameters, including the field schema, collection schema, and collection name.

# Python (PyMilvus)
from pymilvus import CollectionSchema, FieldSchema, DataType

# Primary key field; INT64 is the only supported primary key type.
book_id = FieldSchema(
  name="book_id",
  dtype=DataType.INT64,
  is_primary=True,
)
# INT64 scalar field.
word_count = FieldSchema(
  name="word_count",
  dtype=DataType.INT64,
)
# Two-dimensional floating-point vector field.
book_intro = FieldSchema(
  name="book_intro",
  dtype=DataType.FLOAT_VECTOR,
  dim=2,
)
# The collection schema groups the three fields.
schema = CollectionSchema(
  fields=[book_id, word_count, book_intro],
  description="Test book search",
)
collection_name = "book"
// Node.js SDK
const params = {
  collection_name: "book",
  description: "Test book search",
  fields: [
    {
      name: "book_intro",
      description: "",
      data_type: 101,  // DataType.FloatVector
      type_params: {
        dim: "2",
      },
    },
    {
      name: "book_id",
      data_type: 5,   //DataType.Int64
      is_primary_key: true,
      description: "",
    },
    {
      name: "word_count",
      data_type: 5,    //DataType.Int64
      description: "",
    },
  ],
};
// Go SDK
var (
    collectionName = "book"
)
schema := &entity.Schema{
  CollectionName: collectionName,
  Description:    "Test book search",
  Fields: []*entity.Field{
    {
      Name:       "book_id",
      DataType:   entity.FieldTypeInt64,
      PrimaryKey: true,
      AutoID:     false,
    },
    {
      Name:       "word_count",
      DataType:   entity.FieldTypeInt64,
      PrimaryKey: false,
      AutoID:     false,
    },
    {
      Name:     "book_intro",
      DataType: entity.FieldTypeFloatVector,
      TypeParams: map[string]string{
          "dim": "2",
      },
    },
  },
}
// Java SDK
FieldType fieldType1 = FieldType.newBuilder()
        .withName("book_id")
        .withDataType(DataType.Int64)
        .withPrimaryKey(true)
        .withAutoID(false)
        .build();
FieldType fieldType2 = FieldType.newBuilder()
        .withName("word_count")
        .withDataType(DataType.Int64)
        .build();
FieldType fieldType3 = FieldType.newBuilder()
        .withName("book_intro")
        .withDataType(DataType.FloatVector)
        .withDimension(2)
        .build();
CreateCollectionParam createCollectionReq = CreateCollectionParam.newBuilder()
        .withCollectionName("book")
        .withDescription("Test book search")
        .withShardsNum(2)
        .addFieldType(fieldType1)
        .addFieldType(fieldType2)
        .addFieldType(fieldType3)
        .build();
# Milvus CLI
create collection -c book -f book_id:INT64 -f word_count:INT64 -f book_intro:FLOAT_VECTOR:2 -p book_id
Parameter (PyMilvus) | Description | Option
FieldSchema | Schema of the fields within the collection to create. Refer to Schema for more information. | N/A
name | Name of the field to create. | N/A
dtype | Data type of the field to create. | Primary key field: DataType.INT64 (numpy.int64). Scalar field: DataType.BOOL (Boolean), DataType.INT64 (numpy.int64), DataType.FLOAT (numpy.float32), DataType.DOUBLE (numpy.double). Vector field: DataType.BINARY_VECTOR (Binary vector), DataType.FLOAT_VECTOR (Float vector).
is_primary | (Mandatory for primary key field) Switch to control whether the field is the primary key field. | True or False
auto_id | (Mandatory for primary key field) Switch to enable or disable automatic ID (primary key) allocation. | True or False
dim | (Mandatory for vector field) Dimension of the vector. | [1, 32768]
description | (Optional) Description of the field. | N/A
CollectionSchema | Schema of the collection to create. Refer to Schema for more information. | N/A
fields | Fields of the collection to create. | N/A
description | (Optional) Description of the collection to create. | N/A
collection_name | Name of the collection to create. | N/A
Parameter (Go) | Description | Option
collectionName | Name of the collection to create. | N/A
description | Description of the collection to create. | N/A
Fields | Schema of the fields within the collection to create. Refer to Schema for more information. | N/A
Name | Name of the field to create. | N/A
DataType | Data type of the field to create. | Primary key field: entity.FieldTypeInt64 (64-bit integer). Scalar field: entity.FieldTypeBool (Boolean), entity.FieldTypeInt64 (64-bit integer), entity.FieldTypeFloat (32-bit float), entity.FieldTypeDouble (64-bit float). Vector field: entity.FieldTypeBinaryVector (Binary vector), entity.FieldTypeFloatVector (Float vector).
PrimaryKey | (Mandatory for primary key field) Switch to control whether the field is the primary key field. | True or False
AutoID | (Mandatory for primary key field) Switch to enable or disable automatic ID (primary key) allocation. | True or False
dim | (Mandatory for vector field) Dimension of the vector. | [1, 32768]
Parameter (Node.js) | Description | Option
collection_name | Name of the collection to create. | N/A
description | Description of the collection to create. | N/A
fields | Schema of the fields within the collection to create. Refer to Schema for more information. | N/A
data_type | Data type of the field to create. Refer to data type reference number for more information. | N/A
is_primary_key | (Mandatory for primary key field) Switch to control whether the field is the primary key field. | True or False
auto_id | Switch to enable or disable automatic ID (primary key) allocation. | True or False
dim | (Mandatory for vector field) Dimension of the vector. | [1, 32768]
description | (Optional) Description of the field. | N/A
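The Node.js example passes data types as reference numbers (5 for Int64, 101 for FloatVector). A small lookup makes those numbers readable. The sketch below is plain Python with a hypothetical class name; the two codes used in the example come from this guide, while the remaining codes are stated here as an assumption and should be checked against the data type reference.

```python
from enum import IntEnum


class DataTypeCode(IntEnum):
    """Numeric data type codes (hypothetical name; verify values against the reference)."""
    BOOL = 1
    INT64 = 5            # used for book_id and word_count in the example
    FLOAT = 10
    DOUBLE = 11
    BINARY_VECTOR = 100
    FLOAT_VECTOR = 101   # used for book_intro in the example


def describe(code: int) -> str:
    """Translate a numeric code into its symbolic name; raises ValueError if unknown."""
    return DataTypeCode(code).name


print(describe(5), describe(101))  # prints "INT64 FLOAT_VECTOR"
```

Using named constants in place of bare numbers also makes a mistyped code fail early instead of silently creating a field of the wrong type.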
Parameter (Java) | Description | Option
Name | Name of the field to create. | N/A
Description | Description of the field to create. | N/A
DataType | Data type of the field to create. | Primary key field: DataType.Int64 (64-bit integer). Scalar field: DataType.Bool (Boolean), DataType.Int64 (64-bit integer), DataType.Float (32-bit float), DataType.Double (64-bit float). Vector field: DataType.BinaryVector (Binary vector), DataType.FloatVector (Float vector).
PrimaryKey | (Mandatory for primary key field) Switch to control whether the field is the primary key field. | True or False
AutoID | Switch to enable or disable automatic ID (primary key) allocation. | True or False
Dimension | (Mandatory for vector field) Dimension of the vector. | [1, 32768]
CollectionName | Name of the collection to create. | N/A
Description | (Optional) Description of the collection to create. | N/A
ShardsNum | Number of shards for the collection to create. | [1, 256]
Option (Milvus CLI) | Description
-c | The name of the collection.
-f | (Multiple) The field schema in the <fieldName>:<dataType>:<dimOfVector/desc> format, for example book_intro:FLOAT_VECTOR:2.
-p | The name of the primary key field.
-a | (Optional) Flag to generate IDs automatically.
-d | (Optional) The description of the collection.
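When the CLI command is generated from a script, it can help to assemble it from the same field list used elsewhere. The following is a minimal sketch in plain Python; build_create_collection_command is a hypothetical helper, not part of the Milvus CLI, and the shell-style quoting of the description is an assumption about how the CLI parses its arguments.

```python
import shlex


def build_create_collection_command(name, fields, primary_key,
                                    auto_id=False, description=None):
    """Assemble a 'create collection' command string (hypothetical helper).

    Each field is a tuple such as ("book_id", "INT64") or
    ("book_intro", "FLOAT_VECTOR", 2); its parts are joined with ':'.
    """
    parts = ["create collection", "-c", name]
    for field in fields:
        parts += ["-f", ":".join(str(p) for p in field)]
    parts += ["-p", primary_key]
    if auto_id:
        parts.append("-a")
    if description:
        # Assumption: the description is quoted like a shell argument.
        parts += ["-d", shlex.quote(description)]
    return " ".join(parts)


command = build_create_collection_command(
    name="book",
    fields=[
        ("book_id", "INT64"),
        ("word_count", "INT64"),
        ("book_intro", "FLOAT_VECTOR", 2),
    ],
    primary_key="book_id",
)
print(command)
```

With the book fields, the helper reproduces the create collection command shown earlier in this step.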

Create a collection with the schema

Then, create a collection with the Strong consistency level and the schema you specified above.

# Python (PyMilvus)
from pymilvus import Collection

collection = Collection(
    name=collection_name,
    schema=schema,
    using='default',
    shards_num=2,
    consistency_level="Strong",
)
// Node.js SDK
await milvusClient.collectionManager.createCollection(params);
// Go SDK
err = milvusClient.CreateCollection(
    context.Background(), // ctx
    schema,
    2, // shardNum
)
if err != nil {
    log.Fatal("failed to create collection:", err.Error())
}
// Java SDK
milvusClient.createCollection(createCollectionReq);
# Milvus CLI: no extra step is needed; the create collection command in the previous step creates the collection.
Parameter (PyMilvus) | Description | Option
using (optional) | By specifying the server alias here, you can choose in which Milvus server you create a collection. | N/A
shards_num (optional) | Number of shards for the collection to create. | [1, 256]
consistency_level (optional) | Consistency level of the collection to create. | Strong, Bounded, Session, Eventually, Customized
Parameter (Go) | Description | Option
ctx | Context to control API invocation process. | N/A
shardNum | Number of shards for the collection to create. | [1, 256]

Limits

Feature | Maximum limit
Length of a collection name | 255 characters
Number of partitions in a collection | 4,096
Number of fields in a collection | 256
Number of shards in a collection | 256
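These limits are easy to check on the client side before sending a request. The sketch below is plain Python with a hypothetical helper name, check_limits; it only encodes the four numbers listed above and does not replace the checks the server performs.

```python
# Maximum limits listed in this guide.
MAX_NAME_LENGTH = 255
MAX_PARTITIONS = 4096
MAX_FIELDS = 256
MAX_SHARDS = 256


def check_limits(collection_name, num_fields, shards_num, num_partitions=1):
    """Return a list of limit violations (hypothetical helper).

    num_partitions defaults to 1 because a new collection starts
    with the _default partition only.
    """
    violations = []
    if len(collection_name) > MAX_NAME_LENGTH:
        violations.append(f"collection name exceeds {MAX_NAME_LENGTH} characters")
    if num_partitions > MAX_PARTITIONS:
        violations.append(f"more than {MAX_PARTITIONS} partitions")
    if num_fields > MAX_FIELDS:
        violations.append(f"more than {MAX_FIELDS} fields")
    if not 1 <= shards_num <= MAX_SHARDS:
        violations.append(f"shards_num must be in [1, {MAX_SHARDS}]")
    return violations


# The book collection: three fields, two shards, one default partition.
print(check_limits("book", num_fields=3, shards_num=2))
```

An empty list means the planned collection stays within the documented limits.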
