About Milvus
Get Started
User Guide
Administration Guide
Integrations
Benchmarks
Tools
Reference
Example Applications
FAQs
API Reference

Upsert Entities

This topic describes how to upsert entities in Milvus.

Upserting is a combination of insert and delete operations. In the context of a Milvus vector database, an upsert is a data-level operation that will overwrite an existing entity if a specified field already exists in a collection, and insert a new entity if the specified value doesn’t already exist.

The following example upserts 3,000 rows of randomly generated data as the example data. When performing upsert operations, it's important to note that the operation may compromise performance. This is because the operation involves deleting data during execution.

Prepare data

First, prepare the data to upsert. The type of data to upsert must match the schema of the collection, otherwise Milvus will raise an exception.

Milvus supports default values for scalar fields, excluding a primary key field. This indicates that some fields can be left empty during data inserts or upserts. For more information, refer to Create a Collection.

When interacting with Milvus using Python code, you have the flexibility to choose between PyMilvus and MilvusClient (new). For more information, refer to Python SDK.

Python GO

# Generate data to upsert
import random
nb = 3000
dim = 8
vectors = [[random.random() for _ in range(dim)] for _ in range(nb)]
data = [
    [i for i in range(nb)],
    [str(i) for i in range(nb)],
    [i for i in range(10000, 10000+nb)],
    vectors,
    [str("dy"*i) for i in range(nb)]
]

nEntities:= 3000
dim:= 8
idList:= make([]int64, 0, nEntities)
randomList:= make([]float64, 0, nEntities)
embeddingList := make([][]float32, 0, nEntities)

for i := 0; i < nEntities; i++ {
    idList = append(idList, int64(i))
}
    
for i := 0; i < nEntities; i++ {
    randomList = append(randomList, rand.Float64())
}
  
for i := 0; i < nEntities; i++ {
    vec := make([]float32, 0, dim)
for j := 0; j < dim; j++ {
        vec = append(vec, rand.Float32())
    }
    embeddingList = append(embeddingList, vec)
}
idColData := entity.NewColumnInt64("ID", idList)
randomColData := entity.NewColumnDouble("random", randomList)
embeddingColData := entity.NewColumnFloatVector("embeddings", dim, embeddingList)

Upsert data

Upsert the data to the collection.

Python GO

from pymilvus import Collection
collection = Collection("book") # Get an existing collection.
mr = collection.upsert(data)

if _, err := c.Upsert(ctx, collectionName, "", idColData, embeddingColData);
err != nil {
        log.Fatalf("failed to upsert data, err: %v", err)
}

Parameter	Description
`data`	Data to upsert into Milvus.
`partition_name` (optional)	Name of the partition to upsert data into.
`timeout` (optional)	An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or error occurs.

Parameter	Description
`ctx`	Context to control API invocation process.
`collectionName`	Name of the collection to upsert data into.
`partitionName`	Name of the partition to upsert data into. Data will be upserted in the default partition if left blank.
`idColData`	Data to upsert into each field.

After upserting entities into a collection that has previously been indexed, you do not need to re-index the collection, as Milvus will automatically create an index for the newly upserted data. For more information, refer to Can indexes be created after inserting vectors?

Flush data

When data is upserted into Milvus it is updated and inserted into segments. Segments have to reach a certain size to be sealed and indexed. Unsealed segments will be searched brute force. In order to avoid this with any remainder data, it is best to call flush(). The flush() call will seal any remaining segments and send them for indexing. It is important to only call this method at the end of an upsert session. Calling it too often will cause fragmented data that will need to be cleaned later on.

Limits

Updating primary key fields is not supported by upsert().
upsert() is not applicable and an error can occur if autoID is set to True for primary key fields.

What's next

Learn more basic operations of Milvus:

Upsert Entities
Prepare data
Upsert data
Flush data
Limits
What's next

Feedback

Was this page helpful?