
Introducing the Milvus Ngram Index: Faster Keyword Matching and LIKE Queries for Agent Workloads

  • Engineering
December 16, 2025
Chenjie Tang

In agent systems, context retrieval is a foundational building block across the entire pipeline, providing the basis for downstream reasoning, planning, and action. Vector search helps agents retrieve semantically relevant context that captures intent and meaning across large and unstructured datasets. However, semantic relevance alone is often not enough. Agent pipelines also rely on full-text search to enforce exact keyword constraints—such as product names, function calls, error codes, or legally significant terms. This supporting layer ensures that retrieved context is not only relevant, but also explicitly satisfies hard textual requirements.

Real workloads consistently reflect this need:

  • Customer support assistants must find conversations mentioning a specific product or ingredient.

  • Coding copilots look for snippets containing an exact function name, API call, or error string.

  • Legal, medical, and academic agents filter documents for clauses or citations that must appear verbatim.

Traditionally, systems have handled this with the SQL LIKE operator. A query such as name LIKE '%rod%' is simple and widely supported, but under high concurrency and large data volumes, this simplicity carries major performance costs.

  • Without an index, a LIKE query scans the entire context store and applies pattern matching row by row. At millions of records, even a single query can take seconds—far too slow for real-time agent interactions.

  • Even with a conventional inverted index, wildcard patterns such as %rod% remain hard to optimize because the engine must still traverse the entire dictionary and run pattern matching on each entry. The operation avoids row scans but remains fundamentally linear, resulting in only marginal improvements.

This creates a clear gap in hybrid retrieval systems: vector search handles semantic relevance efficiently, but exact keyword filtering often becomes the slowest step in the pipeline.

Milvus natively supports hybrid vector and full-text search with metadata filtering. To address the limitations of keyword matching, Milvus introduces the Ngram Index, which improves LIKE performance by splitting text into small substrings and indexing them for efficient lookup. This dramatically reduces the amount of data examined during query execution, delivering tens to hundreds of times faster LIKE queries in real agentic workloads.

The rest of this post walks through how the Ngram Index works in Milvus and evaluates its performance in real-world scenarios.

What Is the Ngram Index?

In databases, text filtering is commonly expressed using SQL, the standard query language used to retrieve and manage data. One of its most widely used text operators is LIKE, which supports pattern-based string matching.

LIKE expressions can be broadly grouped into four common pattern types, depending on how wildcards are used:

  • Infix match (name LIKE '%rod%'): Matches records where the substring rod appears anywhere in the text.

  • Prefix match (name LIKE 'rod%'): Matches records whose text starts with rod.

  • Suffix match (name LIKE '%rod'): Matches records whose text ends with rod.

  • Wildcard match (name LIKE '%rod%aab%bc_de'): Combines multiple substring conditions (%) with single-character wildcards (_) in a single pattern.

While these patterns differ in appearance and expressiveness, the Ngram Index in Milvus accelerates all of them using the same underlying approach.

Before building the index, Milvus splits each text value into short, overlapping substrings of fixed lengths, known as n-grams. For example, when n = 3, the word “Milvus” is decomposed into the following 3-grams: “Mil”, “ilv”, “lvu”, and “vus”. Each n-gram is then stored in an inverted index that maps the substring to the set of document IDs in which it appears. At query time, LIKE conditions are translated into combinations of n-gram lookups, allowing Milvus to quickly filter out most non-matching records and evaluate the pattern against a much smaller candidate set. This is what turns expensive string scans into efficient index-based queries.
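This decomposition is easy to sketch in Python. The helper below is illustrative only, not Milvus's actual implementation:

```python
def ngrams(text: str, n: int) -> list[str]:
    """Return all contiguous substrings of length n, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(ngrams("Milvus", 3))  # ['Mil', 'ilv', 'lvu', 'vus']
```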

Two parameters control how the Ngram Index is constructed: min_gram and max_gram. Together, they define the range of substring lengths that Milvus generates and indexes.

  • min_gram: The shortest substring length to index. In practice, this also sets the minimum query substring length that can benefit from the Ngram Index.

  • max_gram: The longest substring length to index. At query time, it additionally determines the maximum window size used when splitting longer query strings into n-grams.

By indexing all contiguous substrings whose lengths fall between min_gram and max_gram, Milvus establishes a consistent and efficient foundation for accelerating all supported LIKE pattern types.

How Does the Ngram Index Work?

Milvus implements the Ngram Index in a two-phase process:

  • Build the index: Generate n-grams for each document and build an inverted index during data ingestion.

  • Accelerate queries: Use the index to narrow the search to a small candidate set, then verify exact LIKE matches on those candidates.

A concrete example makes this process easier to understand.

Phase 1: Build the index

Decompose text into n-grams:

Assume we index the text “Apple” with the following settings:

  • min_gram = 2

  • max_gram = 3

Under this setting, Milvus generates all contiguous substrings of length 2 and 3:

  • 2-grams: Ap, pp, pl, le

  • 3-grams: App, ppl, ple

Build an inverted index:

Now consider a small dataset of five records:

  • Document 0: Apple

  • Document 1: Pineapple

  • Document 2: Maple

  • Document 3: Apply

  • Document 4: Snapple

During ingestion, Milvus generates n-grams for each record and inserts them into an inverted index. In this index:

  • Keys are n-grams (substrings)

  • Values are lists of document IDs where the n-gram appears

"Ap"  -> [0, 3]
"App" -> [0, 3]
"Ma"  -> [2]
"Map" -> [2]
"Pi"  -> [1]
"Pin" -> [1]
"Sn"  -> [4]
"Sna" -> [4]
"ap"  -> [1, 2, 4]
"apl" -> [2]
"app" -> [1, 4]
"ea"  -> [1]
"eap" -> [1]
"in"  -> [1]
"ine" -> [1]
"le"  -> [0, 1, 2, 4]
"ly"  -> [3]
"na"  -> [4]
"nap" -> [4]
"ne"  -> [1]
"nea" -> [1]
"pl"  -> [0, 1, 2, 3, 4]
"ple" -> [0, 1, 2, 4]
"ply" -> [3]
"pp"  -> [0, 1, 3, 4]
"ppl" -> [0, 1, 3, 4]

Now the index is fully built.
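A minimal Python sketch of this build phase (illustrative only; Milvus builds the real index inside its storage engine) reproduces the posting lists above:

```python
from collections import defaultdict

def ngrams(text, min_gram, max_gram):
    """All contiguous substrings with lengths in [min_gram, max_gram]."""
    return {text[i:i + n]
            for n in range(min_gram, max_gram + 1)
            for i in range(len(text) - n + 1)}

def build_index(docs, min_gram=2, max_gram=3):
    """Map each n-gram to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for gram in ngrams(text, min_gram, max_gram):
            index[gram].add(doc_id)
    return {gram: sorted(ids) for gram, ids in index.items()}

docs = ["Apple", "Pineapple", "Maple", "Apply", "Snapple"]
index = build_index(docs)
print(index["pl"])  # [0, 1, 2, 3, 4]
print(index["Ap"])  # [0, 3]  (matching is case-sensitive)
```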

Phase 2: Accelerate queries

When a LIKE filter is executed, Milvus uses the Ngram Index to speed up query evaluation through the following steps:

1. Extract the query term: Contiguous substrings without wildcards are extracted from the LIKE expression (for example, '%apple%' becomes apple).

2. Decompose the query term: The query term is decomposed into n-grams based on its length (L) and the configured min_gram and max_gram.

3. Look up each gram and intersect: Milvus looks up query n-grams in the inverted index and intersects their document ID lists to produce a small candidate set.

4. Verify and return results: The original LIKE condition is applied only to this candidate set to determine the final result.
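Step 1 can be sketched with a simple split on the wildcard characters. The `extract_literals` helper below is hypothetical, not Milvus code:

```python
import re

def extract_literals(pattern: str) -> list[str]:
    """Pull the wildcard-free literal runs out of a SQL LIKE pattern."""
    return [part for part in re.split(r"[%_]+", pattern) if part]

print(extract_literals("%apple%"))         # ['apple']
print(extract_literals("%rod%aab%bc_de"))  # ['rod', 'aab', 'bc', 'de']
```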

In practice, the way a query is split into n-grams depends on the shape of the pattern itself. To see how this works, we’ll focus on two common cases: infix matches and wildcard matches. Prefix and suffix matches behave the same as infix matches, so we won’t cover them separately.

Infix match

For an infix match, execution depends on the length of the literal substring (L) relative to min_gram and max_gram.

1. min_gram ≤ L ≤ max_gram (e.g., strField LIKE '%ppl%')

The literal substring ppl falls entirely within the configured n-gram range. Milvus directly looks up the n-gram "ppl" in the inverted index, producing the candidate document IDs [0, 1, 3, 4].

Because the literal itself is an indexed n-gram, all candidates already satisfy the infix condition. The final verification step does not eliminate any records, and the result remains [0, 1, 3, 4].

2. L > max_gram (e.g., strField LIKE '%pple%')

The literal substring pple is longer than max_gram, so it is decomposed into overlapping n-grams using a window size of max_gram. With max_gram = 3, this produces the n-grams "ppl" and "ple".

Milvus looks up each n-gram in the inverted index:

  • "ppl" → [0, 1, 3, 4]

  • "ple" → [0, 1, 2, 4]

Intersecting these lists yields the candidate set [0, 1, 4]. The original LIKE '%pple%' filter is then applied to these candidates. All three satisfy the condition, so the final result remains [0, 1, 4].
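This case can be sketched in a few lines, using the posting lists from the example index above (illustrative Python, not Milvus internals):

```python
def window_grams(literal: str, max_gram: int) -> list[str]:
    """Slide a max_gram-wide window across a literal longer than max_gram."""
    return [literal[i:i + max_gram] for i in range(len(literal) - max_gram + 1)]

# Posting lists copied from the example inverted index built earlier.
postings = {"ppl": [0, 1, 3, 4], "ple": [0, 1, 2, 4]}

grams = window_grams("pple", 3)  # ['ppl', 'ple']
candidates = set(postings[grams[0]])
for gram in grams[1:]:
    candidates &= set(postings[gram])
print(sorted(candidates))        # [0, 1, 4]
```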

3. L < min_gram (e.g., strField LIKE '%pp%')

The literal substring is shorter than min_gram and therefore cannot be decomposed into indexed n-grams. In this case, the Ngram Index cannot be used, and Milvus falls back to the default execution path, evaluating the LIKE condition through a full scan with pattern matching.

Wildcard match (e.g., strField LIKE '%Ap%pple%')

This pattern contains multiple wildcards, so Milvus first splits it into contiguous literals: "Ap" and "pple".

Milvus then processes each literal independently:

  • "Ap" has length 2 and falls within the n-gram range.

  • "pple" is longer than max_gram and is decomposed into "ppl" and "ple".

This reduces the query to the following n-grams:

  • "Ap" → [0, 3]

  • "ppl" → [0, 1, 3, 4]

  • "ple" → [0, 1, 2, 4]

Intersecting these lists produces a single candidate: [0].

Finally, the original LIKE '%Ap%pple%' filter is applied to document 0 ("Apple"). Since it does not satisfy the full pattern, the final result set is empty.
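The full wildcard path, including the final verification step, can be sketched as follows. The `like_to_regex` translation is an illustrative stand-in for Milvus's pattern matcher, and the posting lists are copied from the example index:

```python
import re

def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern (% and _) into an anchored regex."""
    parts = (re.escape(c) if c not in "%_" else (".*" if c == "%" else ".")
             for c in pattern)
    return re.compile("^" + "".join(parts) + "$")

docs = ["Apple", "Pineapple", "Maple", "Apply", "Snapple"]
# Posting lists from the example index (min_gram = 2, max_gram = 3).
postings = {"Ap": [0, 3], "ppl": [0, 1, 3, 4], "ple": [0, 1, 2, 4]}

# Literals "Ap" and "pple" reduce to the grams below; intersect their postings.
candidates = set(postings["Ap"]) & set(postings["ppl"]) & set(postings["ple"])
print(sorted(candidates))  # [0]

# Final verification of LIKE '%Ap%pple%' against the candidate set.
pattern = like_to_regex("%Ap%pple%")
result = [i for i in sorted(candidates) if pattern.match(docs[i])]
print(result)              # []  ("Apple" has no "pple" after its "Ap")
```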

Limitations and Trade-offs of the Ngram Index

While the Ngram Index can significantly improve LIKE query performance, it introduces trade-offs that should be considered in real-world deployments.

  • Increased index size

The primary cost of the Ngram Index is higher storage overhead. Because the index stores all contiguous substrings whose lengths fall between min_gram and max_gram, the number of generated n-grams grows quickly as this range expands. Each additional n-gram length effectively adds another full set of overlapping substrings for every text value, increasing both the number of index keys and their posting lists. In practice, expanding the range by just one character can roughly double the index size compared to a standard inverted index.

  • Not effective for all workloads

The Ngram Index does not accelerate every workload. If query patterns are highly irregular, contain very short literals, or fail to reduce the dataset to a small candidate set in the filtering phase, the performance benefit may be limited. In such cases, query execution can still approach the cost of a full scan, even though the index is present.
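To make the storage point above concrete, the gram count per text value grows with each extra length in the [min_gram, max_gram] range. The rough arithmetic below ignores per-key and posting-list overhead, so it understates real index size, but it shows the growth pattern:

```python
def gram_count(text_len: int, min_gram: int, max_gram: int) -> int:
    """Number of n-grams generated for one text of the given length."""
    return sum(text_len - n + 1
               for n in range(min_gram, max_gram + 1)
               if text_len >= n)

# For a 1 KB text field:
print(gram_count(1000, 3, 3))  # 998  (a single gram length)
print(gram_count(1000, 2, 3))  # 1997 (one extra length roughly doubles it)
```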

Evaluating Ngram Index Performance on LIKE Queries

The goal of this benchmark is to evaluate how effectively the Ngram Index accelerates LIKE queries in practice.

Test Methodology

To put its performance in context, we compare it against two baseline execution modes:

  • Master: Brute-force execution without any index.

  • Master-inverted: Execution using a conventional inverted index.

We designed two test scenarios to cover different data characteristics:

  • Wiki text dataset: 100,000 rows, with each text field truncated to 1 KB.

  • Single-word dataset: 1,000,000 rows, where each row contains a single word.

Across both scenarios, the following settings are applied consistently:

  • Queries use the infix match pattern (%xxx%)

  • The Ngram Index is configured with min_gram = 2 and max_gram = 4

  • To isolate query execution cost and avoid result materialization overhead, all queries return count(*) instead of full result sets.

Results

Test 1: Wiki text dataset (each row is a wiki text truncated to 1,000 bytes; 100K rows)

| Mode | Literal | Time (ms) | Speedup (vs Master / vs Master-inverted) | Count |
|---|---|---|---|---|
| Master | stadium | 207.8 | – | 335 |
| Master-inverted | stadium | 2095 | – | 335 |
| Ngram | stadium | 1.09 | 190× / 1922× | 335 |
| Master | secondary school | 204.8 | – | 340 |
| Master-inverted | secondary school | 2000 | – | 340 |
| Ngram | secondary school | 1.26 | 162.5× / 1587× | 340 |
| Master | is a coeducational, secondary school sponsore | 223.9 | – | 1 |
| Master-inverted | is a coeducational, secondary school sponsore | 2100 | – | 1 |
| Ngram | is a coeducational, secondary school sponsore | 1.69 | 132.5× / 1242.6× | 1 |

Test 2: Single-word dataset (1M rows)

| Mode | Literal | Time (ms) | Speedup (vs Master / vs Master-inverted) | Count |
|---|---|---|---|---|
| Master | na | 128.6 | – | 40430 |
| Master-inverted | na | 66.5 | – | 40430 |
| Ngram | na | 1.38 | 93.2× / 48.2× | 40430 |
| Master | nat | 122 | – | 5200 |
| Master-inverted | nat | 65.1 | – | 5200 |
| Ngram | nat | 1.27 | 96× / 51.3× | 5200 |
| Master | nati | 118.8 | – | 1630 |
| Master-inverted | nati | 66.9 | – | 1630 |
| Ngram | nati | 1.21 | 98.2× / 55.3× | 1630 |
| Master | natio | 118.4 | – | 1100 |
| Master-inverted | natio | 65.1 | – | 1100 |
| Ngram | natio | 1.33 | 89× / 48.9× | 1100 |
| Master | nation | 118 | – | 1100 |
| Master-inverted | nation | 63.3 | – | 1100 |
| Ngram | nation | 1.4 | 84.3× / 45.2× | 1100 |

Note: These results are based on benchmarks conducted in May. Since then, the Master branch has undergone additional performance optimizations, so the performance gap observed here is expected to be smaller in current versions.

The benchmark results highlight a clear pattern: the Ngram Index significantly accelerates LIKE queries in all cases, and how much faster the queries run depends strongly on the structure and length of the underlying text data.

  • For long text fields, such as Wiki-style documents truncated to 1,000 bytes, the performance gains are especially pronounced. Compared to brute-force execution with no index, the Ngram Index achieves speedups of roughly 100–200×. When compared against a conventional inverted index, the improvement is even more dramatic, reaching 1,200–1,900×. This is because LIKE queries on long text are particularly expensive for traditional indexing approaches, while n-gram lookups can quickly narrow the search space to a very small set of candidates.

  • On datasets consisting of single-word entries, the gains are smaller but still substantial. In this scenario, the Ngram Index runs approximately 80–100× faster than brute-force execution and 45–55× faster than a conventional inverted index. Although shorter text is inherently cheaper to scan, the n-gram–based approach still avoids unnecessary comparisons and consistently reduces query cost.

Conclusion

The Ngram Index accelerates LIKE queries by breaking text into fixed-length n-grams and indexing them using an inverted structure. This design turns expensive substring matching into efficient n-gram lookups followed by minimal verification. As a result, full-text scans are avoided while the exact semantics of LIKE are preserved.

In practice, this approach is effective across a wide range of workloads, with especially strong results for fuzzy matching on long text fields. The Ngram Index is therefore well suited for real-time scenarios such as code search, customer support agents, legal and medical document retrieval, enterprise knowledge bases, and academic search, where precise keyword matching remains essential.

At the same time, the Ngram Index benefits from careful configuration. Choosing appropriate min_gram and max_gram values is critical to balancing index size and query performance. When tuned to reflect real query patterns, the Ngram Index provides a practical, scalable solution for high-performance LIKE queries in production systems.

For more information about the Ngram Index, see the Milvus documentation.

Have questions or want a deep dive on any feature of the latest Milvus? Join our Discord channel or file issues on GitHub. You can also book a 20-minute one-on-one session to get insights, guidance, and answers to your questions through Milvus Office Hours.
