Schema

This topic introduces schemas in Milvus. A schema defines the properties of a collection and of the fields within it.

Field schema

A field schema is the logical definition of a field. It is the first thing you need to define before defining a collection schema and creating a collection.

Milvus supports only one primary key field in a collection.

Field schema properties

Properties	Description	Note
`name`	Name of the field in the collection to create	Data type: String. Mandatory
`dtype`	Data type of the field	Mandatory
`description`	Description of the field	Data type: String. Optional
`is_primary`	Whether to set the field as the primary key field or not	Data type: Boolean (`true` or `false`). Mandatory for the primary key field
`auto_id`	Switch to enable or disable automatic ID (primary key) allocation	Data type: Boolean (`True` or `False`). Mandatory for the primary key field
`max_length`	Maximum length of strings allowed to be inserted	Data type: Integer ∈[1, 65535]. Mandatory for the VARCHAR field
`dim`	Dimension of the vector	Data type: Integer ∈[1, 32768]. Mandatory for the vector field
`is_partition_key`	Whether this field is a partition-key field.	Data type: Boolean (`true` or `false`).
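
The constraints in the table above can be checked on the client before a schema is ever sent to Milvus. The following is a minimal sketch in plain Python, not part of pymilvus: `check_field_spec` and `_is_int_in` are hypothetical helpers, and the dict keys simply mirror the property names in the table.

```python
def _is_int_in(value, low, high):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, int) and not isinstance(value, bool) and low <= value <= high


def check_field_spec(spec):
    """Return a list of problems found in a plain-dict field description."""
    errors = []
    name = spec.get("name")
    if not isinstance(name, str) or not name:
        errors.append("name is mandatory and must be a string")
    dtype = spec.get("dtype")
    if dtype is None:
        errors.append("dtype is mandatory")
    # max_length is mandatory for VARCHAR fields
    if dtype == "VARCHAR" and not _is_int_in(spec.get("max_length"), 1, 65535):
        errors.append("max_length must be an integer in [1, 65535]")
    # dim is mandatory for vector fields
    if dtype in ("FLOAT_VECTOR", "BINARY_VECTOR") and not _is_int_in(spec.get("dim"), 1, 32768):
        errors.append("dim must be an integer in [1, 32768]")
    return errors


print(check_field_spec({"name": "position", "dtype": "VARCHAR", "max_length": 0}))
```

An empty list means the description passed every check in this sketch. Milvus performs its own validation, which may cover more rules than the ones shown here.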

Create a field schema

To simplify data inserts, Milvus allows you to specify a default value for each scalar field, except the primary key field, when you create the field schema. If you leave such a field empty when inserting data, the default value you specified applies.

Create a regular field schema:

from pymilvus import FieldSchema, DataType
id_field = FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, description="primary id")
age_field = FieldSchema(name="age", dtype=DataType.INT64, description="age")
embedding_field = FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=128, description="vector")

# The following creates a field and uses it as the partition key
position_field = FieldSchema(name="position", dtype=DataType.VARCHAR, max_length=256, is_partition_key=True)

Create a field schema with default field values:

from pymilvus import FieldSchema, DataType

fields = [
  FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
  # configure default value `25` for field `age`
  FieldSchema(name="age", dtype=DataType.INT64, default_value=25, description="age"),
  FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=128, description="vector")
]
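
The effect of a default value can be pictured with a small client-side sketch. This is an illustration only: Milvus applies defaults itself, and `apply_defaults` is a hypothetical helper that is not part of pymilvus. The vector field is omitted to keep the rows short.

```python
def apply_defaults(row, defaults):
    """Return a copy of `row` with missing fields filled from `defaults`."""
    filled = dict(defaults)   # start from the default values
    filled.update(row)        # values present in the row take precedence
    return filled


defaults = {"age": 25}  # same default as the `age` field above

rows = [
    {"id": 1, "age": 31},  # `age` provided, so the default is ignored
    {"id": 2},             # `age` left empty, so the default applies
]
filled_rows = [apply_defaults(row, defaults) for row in rows]
print(filled_rows)
```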

Supported data types

DataType defines the kind of data a field contains. Different fields support different data types.

Primary key field supports:
- INT64: numpy.int64
- VARCHAR: VARCHAR
Scalar field supports:
- BOOL: Boolean (true or false)
- INT8: numpy.int8
- INT16: numpy.int16
- INT32: numpy.int32
- INT64: numpy.int64
- FLOAT: numpy.float32
- DOUBLE: numpy.double
- VARCHAR: VARCHAR
- JSON: JSON
- ARRAY: Array
Vector field supports:
- BINARY_VECTOR: Binary vector
- FLOAT_VECTOR: Float vector

JSON is available as a composite data type. A JSON field comprises key-value pairs. Each key is a string, and each value can be a number, string, boolean value, array, or list. For details, refer to JSON: a new data type.
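
As a rough illustration of the shape described above, the sketch below builds a value that could go into a JSON field and checks it with Python's standard `json` module. The key names and values are made up for the example.

```python
import json

# A hypothetical value for a JSON field: string keys, with values that
# are numbers, strings, booleans, or lists.
article_meta = {
    "title": "Schema basics",
    "views": 1024,
    "rating": 4.5,
    "published": True,
    "tags": ["milvus", "schema"],
}

# Every key must be a string for the value to be valid JSON.
keys_are_strings = all(isinstance(key, str) for key in article_meta)

# Round-tripping through the json module confirms the value is serializable.
encoded = json.dumps(article_meta)
decoded = json.loads(encoded)
print(keys_are_strings, decoded == article_meta)
```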

Collection schema

A collection schema is the logical definition of a collection. Usually you need to define the field schema before defining a collection schema and creating a collection.

Collection schema properties

Properties	Description	Note
`fields`	Fields in the collection to create	Mandatory
`description`	Description of the collection	Data type: String. Optional
`partition_key_field`	Name of a field that is designed to act as the partition key.	Data type: String. Optional
`enable_dynamic_field`	Whether to enable dynamic schema or not	Data type: Boolean (`true` or `false`). Optional, defaults to `False`. For details on dynamic schema, refer to Dynamic Schema and the user guides for managing collections.
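
The collection-level rules in this topic (a single primary key field, and a partition key that names an existing field) can be sketched as a plain-Python check. `check_collection_spec` is a hypothetical helper, not a pymilvus API, and the fields are described as plain dicts for the sake of the example.

```python
def check_collection_spec(fields, partition_key_field=None):
    """Return a list of problems found in a list of plain-dict field descriptions."""
    errors = []
    if not fields:
        errors.append("fields is mandatory")
    # A collection has one primary key field
    primary = [f["name"] for f in fields if f.get("is_primary")]
    if len(primary) != 1:
        errors.append(f"expected exactly one primary key field, found {len(primary)}")
    # The partition key, if given, must name one of the fields
    names = [f["name"] for f in fields]
    if partition_key_field is not None and partition_key_field not in names:
        errors.append(f"partition_key_field {partition_key_field!r} is not a field")
    return errors


example_fields = [
    {"name": "id", "dtype": "INT64", "is_primary": True},
    {"name": "embedding", "dtype": "FLOAT_VECTOR", "dim": 128},
    {"name": "position", "dtype": "VARCHAR", "max_length": 256},
]
print(check_collection_spec(example_fields, partition_key_field="position"))
```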

Create a collection schema

Define the field schemas before defining a collection schema.

from pymilvus import FieldSchema, CollectionSchema, DataType
id_field = FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, description="primary id")
age_field = FieldSchema(name="age", dtype=DataType.INT64, description="age")
embedding_field = FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=128, description="vector")

# Enable partition key on a field if you need to implement multi-tenancy based on the partition-key field
position_field = FieldSchema(name="position", dtype=DataType.VARCHAR, max_length=256, is_partition_key=True)

# Set enable_dynamic_field to True if you need to use dynamic fields.
schema = CollectionSchema(fields=[id_field, age_field, embedding_field, position_field], auto_id=False, enable_dynamic_field=True, description="desc of a collection")

Create a collection with the schema specified:

from pymilvus import Collection
collection_name1 = "tutorial_1"
collection1 = Collection(name=collection_name1, schema=schema, using='default', shards_num=2)

You can define the shard number with `shards_num`.
You can define the Milvus server on which you wish to create a collection by specifying the alias in `using`.
You can enable the partition key feature on a field by setting `is_partition_key` to `True` on the field if you need to implement partition-key-based multi-tenancy.
You can enable dynamic schema by setting `enable_dynamic_field` to `True` in the collection schema if you need to use dynamic fields.

You can also create a collection with `Collection.construct_from_dataframe`, which automatically generates a collection schema from a pandas DataFrame and creates a collection.

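import random
import pandas as pd
from pymilvus import Collection

# Example sizes, chosen here for illustration: number of rows and vector dimension
nb = 3000
dim = 128
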
df = pd.DataFrame({
    "id": [i for i in range(nb)],
    "age": [random.randint(20, 40) for i in range(nb)],
    "embedding": [[random.random() for _ in range(dim)] for _ in range(nb)],
    "position": "test_pos"
})

collection, ins_res = Collection.construct_from_dataframe(
    'my_collection',
    df,
    primary_field='id',
    auto_id=False
    )
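
To give a feel for what generating a schema from data involves, the sketch below guesses a data type from a single Python value per column. It is a simplified illustration under assumed rules: `infer_dtype` is a hypothetical helper, and the inference that pymilvus actually performs may differ.

```python
def infer_dtype(value):
    """Guess a Milvus data type name from one sample value (simplified)."""
    if isinstance(value, bool):   # check bool before int: bool is an int subclass
        return {"dtype": "BOOL"}
    if isinstance(value, int):
        return {"dtype": "INT64"}
    if isinstance(value, float):
        return {"dtype": "DOUBLE"}
    if isinstance(value, str):
        return {"dtype": "VARCHAR"}
    if isinstance(value, list) and value and all(isinstance(x, float) for x in value):
        # A non-empty list of floats is treated as a vector; its length is the dimension
        return {"dtype": "FLOAT_VECTOR", "dim": len(value)}
    raise TypeError(f"cannot infer a data type for {value!r}")


# One sample row with the same columns as the DataFrame above
first_row = {"id": 0, "age": 27, "embedding": [0.1, 0.2, 0.3, 0.4], "position": "test_pos"}
inferred = {column: infer_dtype(value) for column, value in first_row.items()}
print(inferred)
```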

What’s next

Learn how to prepare schema when creating a collection.
Read more about dynamic schema.
Read more about partition-key in Multi-tenancy.
