voyage-2 takes raw text input and converts it into a numerical representation that preserves semantic meaning. Concretely, its job is to map text into a high-dimensional vector space where similar meanings are close together and unrelated meanings are far apart. This allows software systems to answer questions like “Which documents are most relevant to this query?” or “Which support ticket is most similar to this one?” without relying on exact word overlap. The model itself does not search or rank results; it only produces embeddings that make those operations possible.
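To make "close together in vector space" concrete, here is a minimal sketch of the comparison that embeddings enable. The vectors below are toy 4-dimensional stand-ins invented for illustration; real voyage-2 embeddings are much higher-dimensional and come from the API, which is not called here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: near 1.0 for similar directions, near 0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: the first two "texts" are semantically close
# (both about refunds/returns), the third is about something unrelated.
refund_policy = [0.80, 0.10, 0.05, 0.02]
return_item   = [0.75, 0.15, 0.10, 0.00]
gpu_drivers   = [0.02, 0.90, 0.01, 0.70]

print(cosine_similarity(refund_policy, return_item))  # high: similar meaning
print(cosine_similarity(refund_policy, gpu_drivers))  # low: unrelated meaning
```

The search system never compares words; it compares these directions, which is why semantically related texts match even with zero word overlap.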
In a typical workflow, voyage-2 is used in two phases: indexing and querying. During indexing, you process your dataset—documents, code snippets, chat logs, or product descriptions—through voyage-2 and store the resulting vectors. During querying, you embed the user’s input text with the same model and compare it against the stored vectors. The comparison step is usually handled by a vector database. For example, a developer building internal documentation search might embed every Markdown file once, store the embeddings, and then embed each user query on demand to retrieve the top-k most similar sections.
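The two phases above can be sketched end to end. To keep the example self-contained and offline, `fake_embed` below is a stand-in for a real voyage-2 API call: it builds a tiny normalized bag-of-words vector over a fixed vocabulary, which is purely illustrative and nothing like the real model. The documents and vocabulary are also invented for this sketch.

```python
import math

# Toy vocabulary; a real embedding model needs no such list.
VOCAB = ["reset", "password", "printer", "connection",
         "troubleshooting", "steps", "forgotten"]

def fake_embed(text: str) -> list[float]:
    """Stand-in for voyage-2: normalized bag-of-words vector (illustration only)."""
    words = text.lower().split()
    vec = [float(words.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# --- Indexing phase: embed every document once and store the vectors ---
documents = [
    "How to reset your password",
    "Troubleshooting printer connection issues",
    "Steps to reset a forgotten password",
]
index = [(doc, fake_embed(doc)) for doc in documents]

# --- Query phase: embed the query with the same model, compare, rank ---
def top_k(query: str, k: int = 2) -> list[str]:
    q = fake_embed(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), doc) for doc, vec in index]
    return [doc for _, doc in sorted(scored, reverse=True)[:k]]

print(top_k("reset password"))
```

In production the brute-force loop in `top_k` is exactly what a vector database replaces with an approximate index, and the two password-related documents would be returned ahead of the printer one.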
voyage-2 becomes most useful when paired with a vector database such as Milvus or Zilliz Cloud. These databases provide indexing structures (such as IVF or HNSW) and similarity search APIs that handle large-scale workloads efficiently. voyage-2 does not replace a database; it enables semantic comparison, while the database enables fast retrieval. Together, they form the backbone of many production systems for semantic search, recommendation, and RAG pipelines, where relevance is based on meaning rather than keywords.
For more information, see https://zilliz.com/ai-models/voyage-2