Collection

A Collection instance represents a Milvus collection.

class pymilvus.Collection

Constructor

Constructs a collection by name, schema, and other parameters.

Collection(
    name: str,
    schema: CollectionSchema = None,
    using: str = "default",
    **kwargs
)

PARAMETERS:

  • name (string) -

    [REQUIRED]

    The name of the collection to create.

  • schema (CollectionSchema) -

    The schema used to create the collection.

    The default value is None. In this case, the schema is taken from the existing collection with the given name.

    What is a schema?

    The schema defines how data is organized in the target collection. A valid schema contains multiple fields, which must include a primary key and a vector field, and can include scalar fields.

  • using (string) -

    The alias of the employed connection.

    The default value is "default", indicating that this operation uses the connection registered under the alias "default".

  • num_shards (int) -

    The number of shards to create along with the creation of this collection.

    The value defaults to 2, indicating that two shards are created along with this collection.

    What is sharding?

    Sharding distributes write operations across different nodes so that a Milvus cluster can write data in parallel.

  • consistency_level (int | str) -

    The consistency level of the target collection.

    The value defaults to Bounded (1), with options of Strong (0), Bounded (1), Session (2), and Eventually (3).

    What is the consistency level?

    Consistency in a distributed database refers to the property that ensures every node or replica has the same view of the data when writing or reading at a given time.

    Milvus supports four consistency levels: Strong, Bounded Staleness, Session, and Eventually. The default consistency level in Milvus is Bounded Staleness.

    You can also set the consistency level when conducting a vector similarity search or query so that it suits your application.

  • timeout (float | None) -

    The timeout duration for this operation, in seconds. Setting this to None indicates that the operation waits until a response arrives or an error occurs.
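The optional parameters above (num_shards, consistency_level, and timeout) are passed as keyword arguments. As a minimal sketch, the helper below collects them into a dictionary and checks them against the values documented in this section. The names build_collection_kwargs and CONSISTENCY_LEVELS are hypothetical and not part of pymilvus, which performs its own validation.

```python
# Documented consistency levels and their integer codes (from the list above).
CONSISTENCY_LEVELS = {"Strong": 0, "Bounded": 1, "Session": 2, "Eventually": 3}


def build_collection_kwargs(num_shards=2, consistency_level="Bounded", timeout=None):
    """Hypothetical helper: validate and bundle the optional Collection arguments."""
    # num_shards must be a positive integer (bool is excluded on purpose).
    if isinstance(num_shards, bool) or not isinstance(num_shards, int) or num_shards < 1:
        raise ValueError("num_shards must be a positive integer")

    # consistency_level may be given by name (str) or by code (int).
    if isinstance(consistency_level, str):
        if consistency_level not in CONSISTENCY_LEVELS:
            raise ValueError(f"unknown consistency level: {consistency_level!r}")
    elif consistency_level not in CONSISTENCY_LEVELS.values():
        raise ValueError(f"unknown consistency level code: {consistency_level!r}")

    # timeout is either None (no limit set here) or a positive number.
    if timeout is not None and timeout <= 0:
        raise ValueError("timeout must be positive or None")

    return {
        "num_shards": num_shards,
        "consistency_level": consistency_level,
        "timeout": timeout,
    }
```

The resulting dictionary could then be unpacked into the constructor, for example Collection(name, schema, **build_collection_kwargs(consistency_level="Strong")).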

RETURN TYPE:

Collection

RETURNS:

A collection object.

EXCEPTIONS:

  • SchemaNotReadyException

    This exception is raised when the provided schema is invalid.
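The exception above can be handled around the constructor call. The following is a minimal sketch, assuming a connection is already registered under the given alias. try_get_collection is a hypothetical wrapper name, and which situations trigger the exception may vary between pymilvus versions.

```python
def try_get_collection(name, schema=None, using="default"):
    """Hypothetical wrapper: return a Collection, or None if the schema is rejected."""
    # Imported inside the function so that defining it does not require pymilvus.
    from pymilvus import Collection
    from pymilvus.exceptions import SchemaNotReadyException

    try:
        return Collection(name=name, schema=schema, using=using)
    except SchemaNotReadyException as exc:
        # Report the schema problem instead of letting the exception propagate.
        print(f"Schema problem for collection {name!r}: {exc}")
        return None
```

Returning None keeps the sketch short. An application might prefer to re-raise the exception or log it instead.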

Examples

from pymilvus import Collection, CollectionSchema, FieldSchema, DataType

# Assumes a connection is already registered under the alias "default".
# Create a collection using a user-defined schema.
primary_key = FieldSchema(
    name="id",
    dtype=DataType.INT64,
    is_primary=True,
)

vector = FieldSchema(
    name="vector",
    dtype=DataType.FLOAT_VECTOR,
    dim=768,
)

schema = CollectionSchema(
    fields=[primary_key, vector]
)

collection = Collection(
    name="test_01",
    schema=schema,
    using="default"
)
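The example above relies on an existing connection and uses only the two required fields. As a hedged sketch of the full flow, the function below first registers a connection, then adds a scalar field to the schema, and passes the optional parameters described earlier. The host, port, collection name "test_02", field name "title", and the function name create_demo_collection are all assumptions made for illustration.

```python
def create_demo_collection(host="localhost", port="19530", alias="default"):
    """Connect to Milvus, build a schema with one scalar field, create a collection."""
    # Imported inside the function so that defining it does not require pymilvus.
    from pymilvus import (
        Collection,
        CollectionSchema,
        DataType,
        FieldSchema,
        connections,
    )

    # Register the connection that the `using` argument refers to.
    # The host and port are assumptions for a local deployment.
    connections.connect(alias=alias, host=host, port=port)

    fields = [
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
        FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=768),
        # Optional scalar field (hypothetical name) stored alongside each vector.
        FieldSchema(name="title", dtype=DataType.VARCHAR, max_length=256),
    ]
    schema = CollectionSchema(fields=fields, description="demo collection")

    # num_shards and consistency_level use the defaults documented above.
    return Collection(
        name="test_02",
        schema=schema,
        using=alias,
        num_shards=2,
        consistency_level="Bounded",
    )
```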