upsert()
This operation inserts new records into the database or updates existing ones.
notes
An upsert is a data-level operation that will overwrite an existing entity if a specified field already exists in a collection, and insert a new entity if the specified value doesn’t already exist.
Request Syntax
upsert(
data: List | pandas.DataFrame | Dict,
partition_name: str | None,
timeout: float,
)
PARAMETERS:
data (list | dict | pandas.DataFrame) -
[REQUIRED]
The data to insert into the current collection.
The data to insert should match the schema of the current collection. You can organize your data into:
A list of columns
Each column is a list of values of all entities in that column.
data = [ [0,1,2,3,4], # id [ # vector [0.1,0.2,-0.3,-0.4,0.5], [0.3,-0.1,-0.2,-0.6,0.7], [-0.6,-0.3,0.2,0.8,0.7], [0.6,0.2,-0.3,-0.8,0.5], [0.3,0.1,-0.2,-0.6,-0.7], ], ]
A pandas.DataFrame
You can form a data frame in any way, as demonstrated in the Example section on this page.
data = pd.DataFrame({ "id": [5,6,7,8,9], "vector": [ [0.1,0.2,-0.3,-0.4,0.5], [0.3,-0.1,-0.2,-0.6,0.7], [-0.6,-0.3,0.2,0.8,0.7], [0.6,0.2,-0.3,-0.8,0.5], [0.3,0.1,-0.2,-0.6,-0.7], ] })
A list of rows or just a row
Each row is a dictionary that represents an entity.
data = [ {"id": 10, "vector": [0.1,0.2,-0.3,-0.4,0.5]}, {"id": 11, "vector": [0.3,-0.1,-0.2,-0.6,0.7]}, {"id": 12, "vector": [-0.6,-0.3,0.2,0.8,0.7]}, {"id": 13, "vector": [0.6,0.2,-0.3,-0.8,0.5]}, {"id": 14, "vector": [0.3,0.1,-0.2,-0.6,-0.7]}, ] # or data = {"id": 15, "vector": [0.3,0.1,-0.2,-0.6,-0.7]},
partition_name (string | None) -
The name of a partition in the current collection.
If specified, the data is to be inserted into the specified partition.
timeout (float | None)
The timeout duration for this operation. Setting this to None indicates that this operation timeouts when any response arrives or any error occurs.
RETURN TYPE:
MutationResult
RETURNS:
A MutationResult object that contains the following fields:
insert_count (int)
The count of inserted entities.
delete_count (int)
The count of deleted entities.
upsert_count (int)
The count of upserted entities.
succ_count (int)
The count of successful executions during this operation.
succ_index (list)
A list of index numbers starting from 0, each indicating a successful operation.
err_count (int)
The count of failed executions during this operation.
err_index (list)
A list of index numbers starting from 0, each indicating a failed operation.
primary_keys (list)
A list of primary keys for the inserted entities.
timestamp (int)
The timestamp at which this operation is completed.
EXCEPTIONS:
MilvusException
This exception will be raised when any error occurs during this operation.
Examples
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType
schema = CollectionSchema([
FieldSchema("id", DataType.INT64, is_primary=True),
FieldSchema("vector", DataType.FLOAT_VECTOR, dim=5)
])
# Create a collection
collection = Collection(
name="test_collection",
schema=schema
)
# Upsert a list of columns
res = collection.upsert(
data=[
[0,1,2,3,4], # id
[ # vector
[0.1,0.2,-0.3,-0.4,0.5],
[0.3,-0.1,-0.2,-0.6,0.7],
[-0.6,-0.3,0.2,0.8,0.7],
[0.6,0.2,-0.3,-0.8,0.5],
[0.3,0.1,-0.2,-0.6,-0.7],
],
]
)
# Upsert a data frame
import pandas as pd
res = collection.upsert(
data=pd.DataFrame({
"id": [5,6,7,8,9],
"vector": [
[0.1,0.2,-0.3,-0.4,0.5],
[0.3,-0.1,-0.2,-0.6,0.7],
[-0.6,-0.3,0.2,0.8,0.7],
[0.6,0.2,-0.3,-0.8,0.5],
[0.3,0.1,-0.2,-0.6,-0.7],
]
})
)
# Upsert a list of dictionaries
res = collection.upsert(
data=[
{"id": 10, "vector": [0.1,0.2,-0.3,-0.4,0.5]},
{"id": 11, "vector": [0.3,-0.1,-0.2,-0.6,0.7]},
{"id": 12, "vector": [-0.6,-0.3,0.2,0.8,0.7]},
{"id": 13, "vector": [0.6,0.2,-0.3,-0.8,0.5]},
{"id": 14, "vector": [0.3,0.1,-0.2,-0.6,-0.7]},
]
)
# Upsert a dictionary
res = collection.upsert(
data={"id": 16, "vector": [0.3,0.1,-0.2,-0.6,-0.7]},
)