Connect Apache Kafka® with Milvus/Zilliz Cloud for Real-Time Vector Data Ingestion

This quick start guide shows how to set up open-source Kafka and Zilliz Cloud to ingest vector data.

This tutorial explains how to use Apache Kafka® to stream and ingest vector data into the Milvus vector database and Zilliz Cloud (fully managed Milvus), enabling advanced real-time applications such as semantic search, recommendation systems, and AI-powered analytics.

Apache Kafka is a distributed event streaming platform designed for high-throughput, low-latency pipelines. It is widely used to collect, store, and process real-time data streams from sources like databases, IoT devices, mobile apps, and cloud services. Kafka’s ability to handle large volumes of data makes it an important data source for vector databases such as Milvus and Zilliz Cloud.

For example, Kafka can capture real-time data streams, such as user interactions and sensor readings, together with their embeddings from machine learning models, and publish these streams directly to Milvus or Zilliz Cloud. Once in the vector database, this data can be indexed, searched, and analyzed efficiently.

The Kafka integration with Milvus and Zilliz Cloud provides a seamless way to build powerful pipelines for unstructured data workflows. The connector works with both open-source Kafka deployments and hosted services such as Confluent and StreamNative.

In this tutorial, we use Zilliz Cloud for demonstration:

Step 1: Download the kafka-connect-milvus plugin

Download the latest plugin zip file, zilliz-kafka-connect-milvus-xxx.zip, from here.

Step 2: Download Kafka

Download the latest Kafka release from here.
Unzip the downloaded file and go to the kafka directory.

$ tar -xzf kafka_2.13-3.6.1.tgz
$ cd kafka_2.13-3.6.1

Step 3: Start the Kafka Environment

NOTE: Your local environment must have Java 8+ installed.
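
The Java requirement can be checked before starting any service. The following is a minimal sketch, not part of the connector or Kafka tooling: the helper name java_major is hypothetical, and it assumes the usual `java -version` banner, which prints a quoted version string such as "1.8.0_392" or "17.0.9" to stderr.

```shell
# Hypothetical helper: reduce a Java version string to its major version.
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}"; echo "${v%%.*}" ;;   # legacy scheme: 1.8.0_392 -> 8
    *)   echo "${v%%[.+-]*}" ;;            # modern scheme: 17.0.9 -> 17
  esac
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr; take the first quoted string as the version.
  raw=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  major=$(java_major "$raw")
  if [ "${major:-0}" -ge 8 ] 2>/dev/null; then
    echo "Java $major found: OK"
  else
    echo "Java 8+ required, found: ${raw:-unknown}" >&2
  fi
else
  echo "java not found on PATH" >&2
fi
```

The script only reports the result and does not exit on failure, so it can be pasted into an interactive terminal safely.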

Run the following commands in order so that the services start in the correct sequence:

Start the ZooKeeper service

$ bin/zookeeper-server-start.sh config/zookeeper.properties

Start the Kafka broker service

Open another terminal session and run:
```
$ bin/kafka-server-start.sh config/server.properties
```

Once all services have successfully launched, you will have a basic Kafka environment running and ready to use.

For details, see the official Kafka quick start guide: https://kafka.apache.org/quickstart

Step 4: Configure Kafka and Zilliz Cloud

Ensure you have Kafka and Zilliz Cloud set up and properly configured.

If you do not already have a Kafka topic, create one (e.g. topic_0):
```
$ bin/kafka-topics.sh --create --topic topic_0 --bootstrap-server localhost:9092
```
If you do not already have a collection in Zilliz Cloud, create a collection with a vector field (in this example the vector has dimension=8). The sample message in Step 6 uses the fields id, title, title_vector (the 8-dimensional vector), and link, so the collection schema should define the same fields.

Note: The schemas on both sides must match. The schema must contain exactly one vector field, and the field names must be exactly the same on both sides.

Step 5: Load the kafka-connect-milvus plugin to Kafka Instance

Unzip the zilliz-kafka-connect-milvus-xxx.zip file you downloaded in Step 1.
Copy the zilliz-kafka-connect-milvus directories to the libs directory of your Kafka installation.

Modify the connect-standalone.properties file in the config directory of your Kafka installation as follows:

key.converter.schemas.enable=false
value.converter.schemas.enable=false
plugin.path=libs/zilliz-kafka-connect-milvus-xxx

Create a milvus-sink-connector.properties file in the config directory of your Kafka installation and configure it as follows:

name=zilliz-kafka-connect-milvus
connector.class=com.milvus.io.kafka.MilvusSinkConnector
public.endpoint=https://<public.endpoint>:port
token=*****************************************
collection.name=topic_0
topics=topic_0
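
To avoid typing the token into the file by hand, the same configuration can be generated from environment variables. This is only a sketch under assumptions: the variable names ZILLIZ_ENDPOINT and ZILLIZ_TOKEN are hypothetical (the connector does not read them itself), the defaults are placeholders you must replace, and the script is assumed to run from the root of your Kafka installation.

```shell
# Hypothetical variable names; export real values before running.
: "${ZILLIZ_ENDPOINT:=https://<public.endpoint>:port}"
: "${ZILLIZ_TOKEN:=<your-token>}"

# Write the connector configuration with the same keys as shown above.
mkdir -p config
cat > config/milvus-sink-connector.properties <<EOF
name=zilliz-kafka-connect-milvus
connector.class=com.milvus.io.kafka.MilvusSinkConnector
public.endpoint=${ZILLIZ_ENDPOINT}
token=${ZILLIZ_TOKEN}
collection.name=topic_0
topics=topic_0
EOF

# The file contains a credential, so restrict it to the current user.
chmod 600 config/milvus-sink-connector.properties
```

Keeping the token in the environment rather than in a shared file makes it less likely to end up in version control.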

Step 6: Launch the connector

Start the connector with the configuration files from the previous step:

$ bin/connect-standalone.sh config/connect-standalone.properties config/milvus-sink-connector.properties

Try producing a message to the Kafka topic you just created:

$ bin/kafka-console-producer.sh --topic topic_0 --bootstrap-server localhost:9092
>{"id": 0, "title": "The Reported Mortality Rate of Coronavirus Is Not Important", "title_vector": [0.041732933, 0.013779674, -0.027564144, -0.013061441, 0.009748648, 0.00082446384, -0.00071647146, 0.048612226], "link": "https://medium.com/swlh/the-reported-mortality-rate-of-coronavirus-is-not-important-369989c8d912"}
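
The interactive prompt above can also be scripted, since the console producer reads messages line by line from standard input. The sketch below is illustrative rather than part of the connector: the helper vector_dim is hypothetical and uses simple text matching (not a real JSON parser) to confirm that title_vector has the 8 dimensions the collection expects before the message is sent.

```shell
# The same sample message as above, kept on one line (one line = one Kafka record).
MESSAGE='{"id": 0, "title": "The Reported Mortality Rate of Coronavirus Is Not Important", "title_vector": [0.041732933, 0.013779674, -0.027564144, -0.013061441, 0.009748648, 0.00082446384, -0.00071647146, 0.048612226], "link": "https://medium.com/swlh/the-reported-mortality-rate-of-coronavirus-is-not-important-369989c8d912"}'

# Hypothetical helper: count the elements of the "title_vector" array.
vector_dim() {
  printf '%s\n' "$1" \
    | sed -n 's/.*"title_vector": *\[\([^]]*\)].*/\1/p' \
    | tr ',' '\n' \
    | grep -c .
}

if [ "$(vector_dim "$MESSAGE")" -ne 8 ]; then
  echo "title_vector must have 8 dimensions" >&2
elif [ -x bin/kafka-console-producer.sh ]; then
  # Only send when run from the Kafka installation directory.
  printf '%s\n' "$MESSAGE" \
    | bin/kafka-console-producer.sh --topic topic_0 --bootstrap-server localhost:9092
fi
```

Checking the dimension first helps catch a mismatch with the collection schema before the record reaches the connector.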

Check whether the entity has been inserted into the collection in Zilliz Cloud. If the insertion succeeds, the entity appears in the collection on Zilliz Cloud.

Support

If you need assistance or have questions about the Kafka Connect Milvus Connector, contact the maintainer of the connector by email at support@zilliz.com.
