What is zero-shot learning with embeddings?

Zero-shot learning with embeddings is a machine learning approach where a model makes predictions for classes it hasn’t explicitly seen during training. This is achieved by leveraging semantic embeddings—vector representations that capture relationships between data points. For example, a model trained to recognize animals like “cat” and “dog” could infer a new class like “wolf” by understanding its similarity to known classes through embeddings. The embeddings act as a bridge between seen and unseen classes, enabling generalization without additional training data.
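
To make this concrete, here is a minimal sketch of the inference step, assuming class embeddings are already available. The toy vectors and the `cosine_similarity` helper below are illustrative stand-ins for the output of a real encoder; the point is that prediction reduces to picking the class whose embedding lies closest to the input’s embedding.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy class embeddings (in practice these come from a trained encoder).
class_embeddings = {
    "cat":  np.array([0.9, 0.1, 0.0]),
    "dog":  np.array([0.8, 0.3, 0.1]),
    "wolf": np.array([0.7, 0.4, 0.2]),  # class never seen during training
}

# Embedding of an input image that actually shows a wolf (illustrative values).
input_embedding = np.array([0.72, 0.38, 0.18])

# Predict the class whose embedding is closest to the input embedding.
prediction = max(
    class_embeddings,
    key=lambda label: cosine_similarity(input_embedding, class_embeddings[label]),
)
print(prediction)  # -> "wolf"
```

Adding a new class here is just another entry in `class_embeddings`; the model itself never changes.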

Embeddings are created by mapping data (like text or images) into a high-dimensional space where similar items are closer together. In zero-shot learning, both input data (e.g., images) and class labels (e.g., “wolf”) are converted into embeddings. For instance, a text encoder might map class names to vectors, while an image encoder maps photos to the same space. During inference, the model compares the input embedding (e.g., an image of a wolf) to all class embeddings and selects the closest match. This avoids retraining because the relationships between classes are encoded in the embedding space. Tools like CLIP (Contrastive Language-Image Pretraining) use this approach by aligning image and text embeddings, allowing image classification based on textual descriptions of unseen classes.
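
The snippet below sketches that workflow using the Hugging Face `transformers` implementation of CLIP. The `openai/clip-vit-base-patch32` checkpoint is a real public model, while `wolf.jpg` stands in for whatever image you want to classify. Because the candidate classes are supplied as plain text prompts, extending the classifier to a new class is just another string in the list; no retraining is involved.

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load a pretrained CLIP model and its matching preprocessor.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Candidate classes as text prompts; "wolf" may be unseen at training time.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a wolf"]
image = Image.open("wolf.jpg")  # hypothetical local image file

# Encode the image and all text prompts into the shared embedding space.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# logits_per_image holds the image's similarity to each text prompt.
probs = outputs.logits_per_image.softmax(dim=1)
print(labels[probs.argmax().item()])
```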

A practical example is a product categorization system. Suppose a model is trained to classify electronics like “laptop” and “smartphone” using embeddings. If a new product type, “e-reader,” is introduced, the model can infer its category by comparing the new product’s embedding to those of the known categories, even though “e-reader” wasn’t in the training data. Similarly, in NLP, a chatbot trained on common queries could handle a new request like “cancel subscription” by matching its embedding to related known intents such as “unsubscribe” or “terminate service.” These class and intent embeddings are often precomputed using models like BERT or ResNet, making zero-shot learning efficient for developers who need adaptable systems without frequent retraining.
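
A hedged sketch of that intent-matching pattern, using the `sentence-transformers` library: the `all-MiniLM-L6-v2` checkpoint is a real public model, while the intent list and query are illustrative assumptions.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Known intents, embedded once and reused for every incoming query.
intents = ["unsubscribe", "terminate service", "update payment method"]
intent_embeddings = model.encode(intents, convert_to_tensor=True)

# A new query the system was never trained on.
query_embedding = model.encode("cancel subscription", convert_to_tensor=True)

# Match the query to the closest known intent by cosine similarity.
scores = util.cos_sim(query_embedding, intent_embeddings)[0]
print(intents[scores.argmax().item()])  # likely "unsubscribe" or "terminate service"
```

Because the intent embeddings are precomputed, handling a new query at runtime costs one encoder call plus a similarity search, and the precomputed vectors can be stored in a vector database such as Milvus for large intent catalogs.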
