The main limitation of all-mpnet-base-v2 is that it trades efficiency for quality: at roughly 110M parameters producing 768-dimensional vectors, it is noticeably heavier than small embedding models such as the MiniLM family, so it can increase latency and cost, especially at high query throughput or in large-scale batch embedding jobs. If you need to embed tens of millions of chunks or serve very high QPS on CPU-only infrastructure, the model can become a bottleneck unless you optimize batching, switch to a faster runtime (such as ONNX), or add hardware. Another limitation is that it is trained primarily for general-purpose English text; if you need robust multilingual or cross-lingual retrieval, validate performance on your own data rather than assuming it generalizes.
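To make the batching point concrete, here is a minimal sketch using the sentence-transformers library. The batch size, the sample chunks, and the note about the ONNX backend are assumptions to adapt to your own hardware and library version, not settings from the model card.

```python
from sentence_transformers import SentenceTransformer

# Batched encoding; batch_size is the main throughput knob on CPU or GPU.
# (Newer sentence-transformers releases also accept backend="onnx" in the
#  constructor if you want an ONNX runtime instead of plain PyTorch.)
model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

chunks = [
    "How do I rotate API keys?",
    "Billing is prorated when you upgrade mid-cycle.",
    # ...tens of thousands more chunks in a real batch job
]

embeddings = model.encode(
    chunks,
    batch_size=64,              # tune per hardware; larger batches amortize overhead
    normalize_embeddings=True,  # unit vectors, so cosine similarity == inner product
    show_progress_bar=True,
)
print(embeddings.shape)  # (len(chunks), 768)
```

Measuring embeddings per second at a few batch sizes on your target hardware is usually enough to tell whether the model fits your latency and cost budget or whether you need a faster runtime.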
A second limitation is that it is a general-purpose encoder, not a domain specialist. If your corpus is full of proprietary terminology, code-mixed text, or unusual formatting (stack traces, tables, dense log lines), embeddings may not cluster the way you want without careful preprocessing. Long documents also require chunking: the model truncates inputs beyond its maximum sequence length, and even within that limit a vector that covers a huge section can blur multiple topics together, reducing retrieval precision. Semantic embeddings can also miss exact-match constraints: a query that depends on a specific version number, error code, or parameter name may retrieve content that is conceptually related but wrong in detail. That is not a model flaw so much as a reminder that semantic search should usually be combined with metadata filters or lexical checks.
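One simple way to handle the exact-match problem is to post-filter semantic hits with a lexical check. The sketch below is illustrative: the regular expression and the `enforce_exact_match` helper are hypothetical, and you would tailor the pattern to the identifiers that matter in your corpus.

```python
import re

# Hypothetical lexical post-check: after semantic retrieval, keep only candidates
# that literally contain the exact token the query depends on (version, error code).
EXACT_TOKEN = re.compile(r"v?\d+\.\d+\.\d+|\b[A-Z]{2,}-\d{3,}\b")

def enforce_exact_match(query: str, candidates: list[dict]) -> list[dict]:
    """candidates: [{"text": ..., "score": ...}] returned by the vector search."""
    required = set(EXACT_TOKEN.findall(query))
    if not required:
        return candidates  # nothing to enforce; pure semantic ranking is fine
    return [c for c in candidates if all(t in c["text"] for t in required)]

hits = [
    {"text": "Fixed in v2.4.1 by raising the connection pool size.", "score": 0.71},
    {"text": "Connection pooling overview and tuning tips.", "score": 0.69},
]
print(enforce_exact_match("timeout after upgrading to v2.4.1", hits))
# Only the first hit survives: it contains the literal string "v2.4.1".
```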
You can mitigate many of these limitations with system design. Store embeddings in a vector database such as Milvus or Zilliz Cloud and attach metadata fields like lang, product, version, and doc_type. Then filter before vector search to keep the candidate set relevant. Use chunking strategies aligned with your content (section-based chunking for docs, message-level chunking for tickets), and normalize text to remove noise that harms embeddings. For high-scale systems, tune batching and index parameters, and measure end-to-end metrics (recall, latency, cost per query). all-mpnet-base-v2 can be very strong, but it still needs good retrieval engineering to be reliable in production.
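As a rough sketch of the metadata-filtering pattern, the example below uses pymilvus's MilvusClient with the quick-setup collection path, where extra keys such as lang, product, version, and doc_type ride along as dynamic fields. The collection name, field names, and local URI are placeholders, and exact parameter names can vary across pymilvus releases, so treat this as a starting point rather than a reference implementation.

```python
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
client = MilvusClient(uri="http://localhost:19530")  # or a Zilliz Cloud URI + token

# 768 dimensions to match all-mpnet-base-v2 output vectors.
client.create_collection(collection_name="support_docs", dimension=768)

doc = "Upgrading to 2.4 requires re-creating indexes built before 2.2."
client.insert(
    collection_name="support_docs",
    data=[{
        "id": 1,
        "vector": model.encode(doc).tolist(),
        "text": doc,
        "lang": "en", "product": "milvus", "version": "2.4", "doc_type": "guide",
    }],
)

# Filter on metadata first, then rank the remaining candidates by similarity.
hits = client.search(
    collection_name="support_docs",
    data=[model.encode("how do I upgrade milvus to 2.4").tolist()],
    filter='product == "milvus" and lang == "en"',
    limit=5,
    output_fields=["text", "doc_type", "version"],
)
print(hits[0])
```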
For more information, click here: https://zilliz.com/ai-models/all-mpnet-base-v2