Create a Collection
This topic describes how to create a collection in Milvus.
A collection consists of one or more partitions. When you create a new collection, Milvus also creates a default partition named _default. See Glossary - Collection for more information.
The following example builds a two-shard collection named book, with a primary key field named book_id, an INT64 scalar field named word_count, and a two-dimensional floating-point vector field named book_intro. The Python example additionally defines a VARCHAR field named book_name. Real applications will likely use vectors of much higher dimension than this example.
Prepare Schema
First, prepare the necessary parameters: the field schemas, the collection schema, and the collection name.
from pymilvus import CollectionSchema, FieldSchema, DataType

book_id = FieldSchema(
    name="book_id",
    dtype=DataType.INT64,
    is_primary=True,
)
book_name = FieldSchema(
    name="book_name",
    dtype=DataType.VARCHAR,
    max_length=200,
)
word_count = FieldSchema(
    name="word_count",
    dtype=DataType.INT64,
)
book_intro = FieldSchema(
    name="book_intro",
    dtype=DataType.FLOAT_VECTOR,
    dim=2,
)
schema = CollectionSchema(
    fields=[book_id, book_name, word_count, book_intro],
    description="Test book search",
)
collection_name = "book"
const params = {
  collection_name: "book",
  description: "Test book search",
  fields: [
    {
      name: "book_intro",
      description: "",
      data_type: 101, // DataType.FloatVector
      type_params: {
        dim: "2",
      },
    },
    {
      name: "book_id",
      data_type: 5, // DataType.Int64
      is_primary_key: true,
      description: "",
    },
    {
      name: "word_count",
      data_type: 5, // DataType.Int64
      description: "",
    },
  ],
};
var (
    collectionName = "book"
)
schema := &entity.Schema{
    CollectionName: collectionName,
    Description:    "Test book search",
    Fields: []*entity.Field{
        {
            Name:       "book_id",
            DataType:   entity.FieldTypeInt64,
            PrimaryKey: true,
            AutoID:     false,
        },
        {
            Name:       "word_count",
            DataType:   entity.FieldTypeInt64,
            PrimaryKey: false,
            AutoID:     false,
        },
        {
            Name:     "book_intro",
            DataType: entity.FieldTypeFloatVector,
            TypeParams: map[string]string{
                "dim": "2",
            },
        },
    },
}
FieldType fieldType1 = FieldType.newBuilder()
        .withName("book_id")
        .withDataType(DataType.Int64)
        .withPrimaryKey(true)
        .withAutoID(false)
        .build();
FieldType fieldType2 = FieldType.newBuilder()
        .withName("word_count")
        .withDataType(DataType.Int64)
        .build();
FieldType fieldType3 = FieldType.newBuilder()
        .withName("book_intro")
        .withDataType(DataType.FloatVector)
        .withDimension(2)
        .build();
CreateCollectionParam createCollectionReq = CreateCollectionParam.newBuilder()
        .withCollectionName("book")
        .withDescription("Test book search")
        .withShardsNum(2)
        .addFieldType(fieldType1)
        .addFieldType(fieldType2)
        .addFieldType(fieldType3)
        .build();
create collection -c book -f book_id:INT64:book_id -f word_count:INT64:word_count -f book_intro:FLOAT_VECTOR:2 -p book_id
curl -X 'POST' \
  'http://localhost:9091/api/v1/collection' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "collection_name": "book",
    "schema": {
      "autoID": false,
      "description": "Test book search",
      "fields": [
        {
          "name": "book_id",
          "description": "book id",
          "is_primary_key": true,
          "autoID": false,
          "data_type": 5
        },
        {
          "name": "word_count",
          "description": "count of words",
          "is_primary_key": false,
          "data_type": 5
        },
        {
          "name": "book_intro",
          "description": "embedded vector of book introduction",
          "data_type": 101,
          "is_primary_key": false,
          "type_params": [
            {
              "key": "dim",
              "value": "2"
            }
          ]
        }
      ],
      "name": "book"
    }
  }'
{}
Parameter | Description | Option |
---|---|---|
FieldSchema | Schema of the fields within the collection to create. Refer to Schema for more information. | N/A |
name | Name of the field to create. | N/A |
dtype | Data type of the field to create. | For primary key field: |
is_primary (Mandatory for primary key field) | Switch to control whether the field is the primary key field. | True or False |
auto_id (Mandatory for primary key field) | Switch to enable or disable automatic ID (primary key) allocation. | True or False |
max_length (Mandatory for VARCHAR field) | Maximum length of strings allowed to be inserted. | [1, 65,535] |
dim (Mandatory for vector field) | Dimension of the vector. | [1, 32,768] |
description (Optional) | Description of the field. | N/A |
CollectionSchema | Schema of the collection to create. Refer to Schema for more information. | N/A |
fields | Fields of the collection to create. | N/A |
description (Optional) | Description of the collection to create. | N/A |
collection_name | Name of the collection to create. | N/A |
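The ranges in the table above can be checked before any request reaches Milvus. The following is a minimal sketch in plain Python, not part of pymilvus: `FieldSpec` and `check_fields` are hypothetical names introduced only for illustration, the data types are represented as strings, and the "exactly one primary key field" rule is an assumption of this sketch rather than something the table states.

```python
from dataclasses import dataclass
from typing import List, Optional

# Ranges copied from the parameter table above.
MAX_LENGTH_RANGE = (1, 65_535)  # max_length for VARCHAR fields
DIM_RANGE = (1, 32_768)         # dim for vector fields


@dataclass
class FieldSpec:
    """Hypothetical stand-in for a field definition; not a pymilvus class."""
    name: str
    dtype: str  # e.g. "INT64", "VARCHAR", "FLOAT_VECTOR"
    is_primary: bool = False
    max_length: Optional[int] = None
    dim: Optional[int] = None


def check_fields(fields: List[FieldSpec]) -> List[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    # Assumption of this sketch: a schema carries exactly one primary key field.
    primary = [f.name for f in fields if f.is_primary]
    if len(primary) != 1:
        problems.append(
            f"expected exactly one primary key field, found {len(primary)}"
        )
    for f in fields:
        if f.dtype == "VARCHAR":
            lo, hi = MAX_LENGTH_RANGE
            if f.max_length is None or not lo <= f.max_length <= hi:
                problems.append(f"{f.name}: max_length must be in [{lo}, {hi}]")
        if f.dtype.endswith("_VECTOR"):
            lo, hi = DIM_RANGE
            if f.dim is None or not lo <= f.dim <= hi:
                problems.append(f"{f.name}: dim must be in [{lo}, {hi}]")
    return problems


# The four fields of the Python "book" example.
book_fields = [
    FieldSpec("book_id", "INT64", is_primary=True),
    FieldSpec("book_name", "VARCHAR", max_length=200),
    FieldSpec("word_count", "INT64"),
    FieldSpec("book_intro", "FLOAT_VECTOR", dim=2),
]
print(check_fields(book_fields))  # prints []
```

A check like this only catches range mistakes early; the server remains the authority on what it accepts.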
Parameter | Description | Option |
---|---|---|
collectionName | Name of the collection to create. | N/A |
description | Description of the collection to create. | N/A |
Fields | Schema of the fields within the collection to create. Refer to Schema for more information. | N/A |
Name | Name of the field to create. | N/A |
DataType | Data type of the field to create. | For primary key field: |
PrimaryKey (Mandatory for primary key field) | Switch to control whether the field is the primary key field. | True or False |
AutoID (Mandatory for primary key field) | Switch to enable or disable automatic ID (primary key) allocation. | True or False |
dim (Mandatory for vector field) | Dimension of the vector. | [1, 32768] |
Parameter | Description | Option |
---|---|---|
collection_name | Name of the collection to create. | N/A |
description | Description of the collection to create. | N/A |
fields | Schema of the fields and the collection to create. | Refer to Schema for more information. |
data_type | Data type of the field to create. | Refer to data type reference number for more information. |
is_primary (Mandatory for primary key field) | Switch to control whether the field is the primary key field. | True or False |
auto_id | Switch to enable or disable automatic ID (primary key) allocation. | True or False |
dim (Mandatory for vector field) | Dimension of the vector. | [1, 32768] |
description (Optional) | Description of the field. | N/A |
Parameter | Description | Option |
---|---|---|
Name | Name of the field to create. | N/A |
Description | Description of the field to create. | N/A |
DataType | Data type of the field to create. | For primary key field: |
PrimaryKey (Mandatory for primary key field) | Switch to control whether the field is the primary key field. | True or False |
AutoID | Switch to enable or disable automatic ID (primary key) allocation. | True or False |
Dimension (Mandatory for vector field) | Dimension of the vector. | [1, 32768] |
CollectionName | Name of the collection to create. | N/A |
Description (Optional) | Description of the collection to create. | N/A |
ShardsNum | Number of shards for the collection to create. | [1,256] |
Option | Description |
---|---|
-c | The name of the collection. |
-f (Multiple) | The field schema in the ` |
-p | The name of the primary key field. |
-a (Optional) | Flag to generate IDs automatically. |
-d (Optional) | The description of the collection. |
Parameter | Description | Option |
---|---|---|
collection_name | Name of the collection to create. | N/A |
name (schema) | Must be the same as collection_name; this duplicated field is kept for historical reasons. | Same as collection_name |
autoID (schema) | Switch to enable or disable automatic ID (primary key) allocation. | True or False |
description (schema) | Description of the collection to create. | N/A |
fields | Schema of the fields within the collection to create. Refer to Schema for more information. | N/A |
name (field) | Name of the field to create. | N/A |
description (field) | Description of the field to create. | N/A |
is_primary_key (Mandatory for primary key field) | Switch to control whether the field is the primary key field. | True or False |
autoID (field) (Mandatory for primary key field) | Switch to enable or disable automatic ID (primary key) allocation. | True or False |
data_type | Data type of the field to create. | Enums: 1: "Bool", 2: "Int8", 3: "Int16", 4: "Int32", 5: "Int64", 10: "Float", 11: "Double", 20: "String", 21: "VarChar", 100: "BinaryVector", 101: "FloatVector". For primary key field: |
dim (Mandatory for vector field) | Dimension of the vector. | [1, 32,768] |
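Because the RESTful request uses numeric data type codes and repeats the collection name inside the schema, it can help to assemble the body programmatically. The sketch below uses only the Python standard library and reproduces the JSON body of the curl example; `field` and `build_payload` are hypothetical helper names, not part of any Milvus SDK, and the code only builds the body without sending it.

```python
import json

# Numeric data_type codes, copied from the enum list in the table above.
DATA_TYPE_CODES = {
    "Bool": 1, "Int8": 2, "Int16": 3, "Int32": 4, "Int64": 5,
    "Float": 10, "Double": 11, "String": 20, "VarChar": 21,
    "BinaryVector": 100, "FloatVector": 101,
}


def field(name, type_name, description="", is_primary_key=False,
          auto_id=None, dim=None):
    """Build one entry of schema.fields (hypothetical helper)."""
    entry = {
        "name": name,
        "description": description,
        "is_primary_key": is_primary_key,
        "data_type": DATA_TYPE_CODES[type_name],
    }
    if auto_id is not None:
        entry["autoID"] = auto_id
    if dim is not None:
        # The curl example passes dim as a string in a key/value list.
        entry["type_params"] = [{"key": "dim", "value": str(dim)}]
    return entry


def build_payload(collection_name, description, fields, auto_id=False):
    """Build the request body (hypothetical helper)."""
    return {
        "collection_name": collection_name,
        "schema": {
            "autoID": auto_id,
            "description": description,
            "fields": fields,
            # Duplicated on purpose: must match collection_name.
            "name": collection_name,
        },
    }


payload = build_payload(
    "book",
    "Test book search",
    [
        field("book_id", "Int64", "book id", is_primary_key=True, auto_id=False),
        field("word_count", "Int64", "count of words"),
        field("book_intro", "FloatVector",
              "embedded vector of book introduction", dim=2),
    ],
)
print(json.dumps(payload, indent=2))
```

Deriving `schema.name` from the same argument as `collection_name` removes the chance that the two values drift apart.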
Create a collection with the schema
Then, create a collection with the schema you specified above.
from pymilvus import Collection
collection = Collection(
    name=collection_name,
    schema=schema,
    using='default',
    shards_num=2,
)
await milvusClient.collectionManager.createCollection(params);
err = milvusClient.CreateCollection(
    context.Background(), // ctx
    schema,
    2, // shardNum
)
if err != nil {
    log.Fatal("failed to create collection:", err.Error())
}
milvusClient.createCollection(createCollectionReq);
# Follow the previous step.
# Follow the previous step.
Parameter | Description | Option |
---|---|---|
using (optional) | Alias of the Milvus server on which to create the collection. | N/A |
shards_num (optional) | Number of shards for the collection to create. | [1,256] |
Parameter | Description | Option |
---|---|---|
ctx | Context to control the API invocation process. | N/A |
shardNum | Number of shards for the collection to create. | [1,256] |
Limits
Feature | Maximum limit |
---|---|
Length of a collection name | 255 characters |
Number of partitions in a collection | 4,096 |
Number of fields in a collection | 256 |
Number of shards in a collection | 256 |
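The limits above lend themselves to a quick client-side check before a collection is created. The following is a minimal sketch in plain Python; `check_collection_limits` is a hypothetical helper, not a Milvus API, and the lower bound of 1 for the shard count is taken from the [1,256] range given for shards_num earlier on this page.

```python
# Limits copied from the table above.
MAX_COLLECTION_NAME_LENGTH = 255
MAX_PARTITIONS = 4096
MAX_FIELDS = 256
MAX_SHARDS = 256


def check_collection_limits(name, num_fields, shards_num, num_partitions=1):
    """Return a list of violated limits; an empty list means none.

    num_partitions defaults to 1 because a new collection starts with
    the single default partition.
    """
    violations = []
    if len(name) > MAX_COLLECTION_NAME_LENGTH:
        violations.append(
            f"collection name exceeds {MAX_COLLECTION_NAME_LENGTH} characters"
        )
    if num_partitions > MAX_PARTITIONS:
        violations.append(f"more than {MAX_PARTITIONS} partitions")
    if num_fields > MAX_FIELDS:
        violations.append(f"more than {MAX_FIELDS} fields")
    if not 1 <= shards_num <= MAX_SHARDS:
        violations.append(f"shards_num must be in [1, {MAX_SHARDS}]")
    return violations


# The "book" example: four fields in the Python schema, two shards.
print(check_collection_limits("book", num_fields=4, shards_num=2))  # prints []
```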
What’s next
- Learn more basic operations of Milvus: