Introducing the Milvus Ngram Index: Faster Keyword Matching and LIKE Queries for Agent Workloads

Engineering

December 16, 2025

Chenjie Tang

In agent systems, context retrieval is a foundational building block across the entire pipeline, providing the basis for downstream reasoning, planning, and action. Vector search helps agents retrieve semantically relevant context that captures intent and meaning across large and unstructured datasets. However, semantic relevance alone is often not enough. Agent pipelines also rely on full-text search to enforce exact keyword constraints—such as product names, function calls, error codes, or legally significant terms. This supporting layer ensures that retrieved context is not only relevant, but also explicitly satisfies hard textual requirements.

Real workloads consistently reflect this need:

Customer support assistants must find conversations mentioning a specific product or ingredient.
Coding copilots look for snippets containing an exact function name, API call, or error string.
Legal, medical, and academic agents filter documents for clauses or citations that must appear verbatim.

Traditionally, systems have handled this with the SQL LIKE operator. A query such as name LIKE '%rod%' is simple and widely supported, but under high concurrency and large data volumes, this simplicity carries major performance costs.

Without an index, a LIKE query scans the entire context store and applies pattern matching row by row. At millions of records, even a single query can take seconds—far too slow for real-time agent interactions.
Even with a conventional inverted index, wildcard patterns such as %rod% remain hard to optimize because the engine must still traverse the entire dictionary and run pattern matching on each entry. The operation avoids row scans but remains fundamentally linear, resulting in only marginal improvements.

This creates a clear gap in hybrid retrieval systems: vector search handles semantic relevance efficiently, but exact keyword filtering often becomes the slowest step in the pipeline.

Milvus natively supports hybrid vector and full-text search with metadata filtering. To address the limitations of keyword matching, Milvus introduces the Ngram Index, which improves LIKE performance by splitting text into small substrings and indexing them for efficient lookup. This dramatically reduces the amount of data examined during query execution, making LIKE queries tens to hundreds of times faster in real agentic workloads.

The rest of this post walks through how the Ngram Index works in Milvus and evaluates its performance in real-world scenarios.

What Is the Ngram Index?

In databases, text filtering is commonly expressed using SQL, the standard query language used to retrieve and manage data. One of its most widely used text operators is LIKE, which supports pattern-based string matching.

LIKE expressions can be broadly grouped into four common pattern types, depending on how wildcards are used:

Infix match (name LIKE '%rod%'): Matches records where the substring rod appears anywhere in the text.
Prefix match (name LIKE 'rod%'): Matches records whose text starts with rod.
Suffix match (name LIKE '%rod'): Matches records whose text ends with rod.
Wildcard match (name LIKE '%rod%aab%bc_de'): Combines multiple substring conditions (%) with single-character wildcards (_) in a single pattern.
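
To make the four pattern types concrete, here is a minimal Python sketch of LIKE semantics, assuming only the two standard wildcards (% for any run of characters, _ for exactly one character) and no escape sequences. The helper names like_to_regex and like are hypothetical and are not part of Milvus.

```python
import re

def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE pattern into a regular expression (no escape handling)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % matches any run of characters, including none
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is a literal character
    return re.compile("".join(parts), re.DOTALL)

def like(text: str, pattern: str) -> bool:
    """Return True if the whole text matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None

names = ["rod", "rodeo", "prod", "product"]
infix = [n for n in names if like(n, "%rod%")]   # rod anywhere in the text
prefix = [n for n in names if like(n, "rod%")]   # text starts with rod
suffix = [n for n in names if like(n, "%rod")]   # text ends with rod
wildcard = like("rod-aab-bcXde", "%rod%aab%bc_de")

print(infix)     # ['rod', 'rodeo', 'prod', 'product']
print(prefix)    # ['rod', 'rodeo']
print(suffix)    # ['rod', 'prod']
print(wildcard)  # True
```

Evaluating a pattern this way touches every row, which is exactly the cost the Ngram Index is meant to avoid.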

While these patterns differ in appearance and expressiveness, the Ngram Index in Milvus accelerates all of them using the same underlying approach.

Before building the index, Milvus splits each text value into short, overlapping substrings of fixed lengths, known as n-grams. For example, when n = 3, the word “Milvus” is decomposed into the following 3-grams: “Mil”, “ilv”, “lvu”, and “vus”. Each n-gram is then stored in an inverted index that maps the substring to the set of document IDs in which it appears. At query time, LIKE conditions are translated into combinations of n-gram lookups, allowing Milvus to quickly filter out most non-matching records and evaluate the pattern against a much smaller candidate set. This is what turns expensive string scans into efficient index-based queries.
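
The decomposition step can be sketched in a few lines of Python. This is an illustration of the idea rather than Milvus's implementation, and the function name ngrams is hypothetical.

```python
def ngrams(text: str, n: int) -> list[str]:
    """Return every contiguous substring of length n, left to right."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(ngrams("Milvus", 3))  # ['Mil', 'ilv', 'lvu', 'vus']
```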

Two parameters control how the Ngram Index is constructed: min_gram and max_gram. Together, they define the range of substring lengths that Milvus generates and indexes.

min_gram: The shortest substring length to index. In practice, this also sets the minimum query substring length that can benefit from the Ngram Index.
max_gram: The longest substring length to index. At query time, it additionally determines the maximum window size used when splitting longer query strings into n-grams.

By indexing all contiguous substrings whose lengths fall between min_gram and max_gram, Milvus establishes a consistent and efficient foundation for accelerating all supported LIKE pattern types.

How Does the Ngram Index Work?

Milvus implements the Ngram Index in a two-phase process:

Build the index: Generate n-grams for each document and build an inverted index during data ingestion.
Accelerate queries: Use the index to narrow the search to a small candidate set, then verify exact LIKE matches on those candidates.

A concrete example makes this process easier to understand.

Phase 1: Build the index

Decompose text into n-grams:

Assume we index the text “Apple” with the following settings:

min_gram = 2
max_gram = 3

Under this setting, Milvus generates all contiguous substrings of length 2 and 3:

2-grams: Ap, pp, pl, le
3-grams: App, ppl, ple
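
The same result can be reproduced with a short Python sketch. The function name ngrams_in_range is hypothetical; it simply enumerates substrings for every length from min_gram to max_gram.

```python
def ngrams_in_range(text: str, min_gram: int, max_gram: int) -> dict[int, list[str]]:
    """Group the contiguous substrings of text by length n, for min_gram <= n <= max_gram."""
    return {
        n: [text[i:i + n] for i in range(len(text) - n + 1)]
        for n in range(min_gram, max_gram + 1)
    }

grams = ngrams_in_range("Apple", min_gram=2, max_gram=3)
print(grams[2])  # ['Ap', 'pp', 'pl', 'le']
print(grams[3])  # ['App', 'ppl', 'ple']
```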

Build an inverted index:

Now consider a small dataset of five records:

Document 0: Apple
Document 1: Pineapple
Document 2: Maple
Document 3: Apply
Document 4: Snapple

During ingestion, Milvus generates n-grams for each record and inserts them into an inverted index. In this index:

Keys are n-grams (substrings)
Values are lists of document IDs where the n-gram appears

"Ap"  -> [0, 3]
"App" -> [0, 3]
"Ma"  -> [2]
"Map" -> [2]
"Pi"  -> [1]
"Pin" -> [1]
"Sn"  -> [4]
"Sna" -> [4]
"ap"  -> [1, 2, 4]
"apl" -> [2]
"app" -> [1, 4]
"ea"  -> [1]
"eap" -> [1]
"in"  -> [1]
"ine" -> [1]
"le"  -> [0, 1, 2, 4]
"ly"  -> [3]
"na"  -> [4]
"nap" -> [4]
"ne"  -> [1]
"nea" -> [1]
"pl"  -> [0, 1, 2, 3, 4]
"ple" -> [0, 1, 2, 4]
"ply" -> [3]
"pp"  -> [0, 1, 3, 4]
"ppl" -> [0, 1, 3, 4]

Now the index is fully built.
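
For readers who want to reproduce the listing above, here is a minimal in-memory sketch in Python. It assumes a plain dictionary of posting lists; the real index in Milvus is a persistent structure, and build_ngram_index is a hypothetical name.

```python
from collections import defaultdict

docs = ["Apple", "Pineapple", "Maple", "Apply", "Snapple"]

def build_ngram_index(texts, min_gram, max_gram):
    """Map every n-gram to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(texts):
        for n in range(min_gram, max_gram + 1):
            for i in range(len(text) - n + 1):
                postings[text[i:i + n]].add(doc_id)
    return {gram: sorted(ids) for gram, ids in postings.items()}

index = build_ngram_index(docs, min_gram=2, max_gram=3)
print(index["Ap"])   # [0, 3]
print(index["ppl"])  # [0, 1, 3, 4]
print(index["ple"])  # [0, 1, 2, 4]
print(len(index))    # 26
```

Using a set per n-gram keeps each document ID at most once per posting list, even when an n-gram repeats inside one text.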

Phase 2: Accelerate queries

When a LIKE filter is executed, Milvus uses the Ngram Index to speed up query evaluation through the following steps:

1. Extract the query term: Contiguous substrings without wildcards are extracted from the LIKE expression (for example, '%apple%' becomes apple).

2. Decompose the query term: The query term is decomposed into n-grams based on its length (L) and the configured min_gram and max_gram.

3. Look up each n-gram and intersect: Milvus looks up the query n-grams in the inverted index and intersects their document ID lists to produce a small candidate set.

4. Verify and return results: The original LIKE condition is applied only to this candidate set to determine the final result.
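
The four steps can be sketched end to end in Python. This is a simplified model under stated assumptions, not Milvus's implementation: the function ngram_like is hypothetical, escape sequences are ignored, and a literal shorter than min_gram is simply skipped, so a pattern with no usable literal degenerates to a full scan.

```python
import re

docs = ["Apple", "Pineapple", "Maple", "Apply", "Snapple"]
MIN_GRAM, MAX_GRAM = 2, 3

# Build the index: n-gram -> set of document IDs.
index = {}
for doc_id, text in enumerate(docs):
    for n in range(MIN_GRAM, MAX_GRAM + 1):
        for i in range(len(text) - n + 1):
            index.setdefault(text[i:i + n], set()).add(doc_id)

def like_to_regex(pattern):
    """Translate % and _ into regex equivalents; escape handling is omitted."""
    return re.compile(
        "".join(".*" if c == "%" else "." if c == "_" else re.escape(c) for c in pattern),
        re.DOTALL,
    )

def ngram_like(pattern):
    """Return (candidate IDs, verified IDs) for a LIKE pattern."""
    # Step 1: extract the wildcard-free literals.
    literals = [lit for lit in re.split(r"[%_]", pattern) if lit]
    # Step 2: decompose each usable literal into n-grams.
    grams = []
    for lit in literals:
        if len(lit) < MIN_GRAM:
            continue  # assumption: too short to look up, so it is skipped
        n = min(len(lit), MAX_GRAM)
        grams += [lit[i:i + n] for i in range(len(lit) - n + 1)]
    # Step 3: intersect posting lists; with no n-grams, every row stays a candidate.
    candidates = set(range(len(docs)))
    for gram in grams:
        candidates &= index.get(gram, set())
    # Step 4: apply the original LIKE condition to the candidates only.
    regex = like_to_regex(pattern)
    verified = [i for i in sorted(candidates) if regex.fullmatch(docs[i])]
    return sorted(candidates), verified

print(ngram_like("%apple%"))  # ([1, 4], [1, 4])
```

Here the literal apple is split into app, ppl, and ple, and only documents 1 and 4 (Pineapple and Snapple) survive the intersection, so the final LIKE check runs on two rows instead of five.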

In practice, the way a query is split into n-grams depends on the shape of the pattern itself. To see how this works, we’ll focus on two common cases: infix matches and wildcard matches. Prefix and suffix matches behave the same as infix matches, so we won’t cover them separately.

Infix match

For an infix match, execution depends on the length of the literal substring (L) relative to min_gram and max_gram.

1. min_gram ≤ L ≤ max_gram (e.g., strField LIKE '%ppl%')

The literal substring ppl falls entirely within the configured n-gram range. Milvus directly looks up the n-gram "ppl" in the inverted index, producing the candidate document IDs [0, 1, 3, 4].

Because the literal itself is an indexed n-gram, all candidates already satisfy the infix condition. The final verification step does not eliminate any records, and the result remains [0, 1, 3, 4].

2. L > max_gram (e.g., strField LIKE '%pple%')

The literal substring pple is longer than max_gram, so it is decomposed into overlapping n-grams using a window size of max_gram. With max_gram = 3, this produces the n-grams "ppl" and "ple".

Milvus looks up each n-gram in the inverted index:

"ppl" → [0, 1, 3, 4]
"ple" → [0, 1, 2, 4]

Intersecting these lists yields the candidate set [0, 1, 4]. The original LIKE '%pple%' filter is then applied to these candidates. All three satisfy the condition, so the final result remains [0, 1, 4].

3. L < min_gram (e.g., strField LIKE '%p%')

The literal substring is shorter than min_gram and therefore cannot be decomposed into indexed n-grams. In this case, the Ngram Index cannot be used, and Milvus falls back to the default execution path, evaluating the LIKE condition through a full scan with pattern matching.
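
The three cases can be summarized as a small decision function. The sketch below is illustrative only; query_grams is a hypothetical name, and returning None stands in for the fallback to a full scan.

```python
from typing import Optional

def query_grams(literal: str, min_gram: int, max_gram: int) -> Optional[list[str]]:
    """Return the n-grams to look up for a literal, or None if the index cannot help."""
    if len(literal) < min_gram:
        return None       # case 3: too short, fall back to a scan
    if len(literal) <= max_gram:
        return [literal]  # case 1: the literal is itself an indexed n-gram
    # Case 2: slide a window of size max_gram across the literal.
    return [literal[i:i + max_gram] for i in range(len(literal) - max_gram + 1)]

print(query_grams("ppl", 2, 3))   # ['ppl']
print(query_grams("pple", 2, 3))  # ['ppl', 'ple']
print(query_grams("p", 2, 3))     # None
```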

Wildcard match (e.g., strField LIKE '%Ap%pple%')

This pattern contains multiple wildcards, so Milvus first splits it into contiguous literals: "Ap" and "pple".

Milvus then processes each literal independently:

"Ap" has length 2 and falls within the n-gram range.
"pple" is longer than max_gram and is decomposed into "ppl" and "ple".

This reduces the query to the following n-grams:

"Ap" → [0, 3]
"ppl" → [0, 1, 3, 4]
"ple" → [0, 1, 2, 4]

Intersecting these lists produces a single candidate: [0].

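Finally, the original LIKE '%Ap%pple%' filter is applied to document 0 ("Apple"). In "Apple", the only occurrence of "Ap" overlaps the only occurrence of "pple" (they share the first "p"), so no "pple" appears after "Ap" and the document does not satisfy the full pattern. The final result set is therefore empty, which shows why the verification step is required: the n-gram intersection can produce false positives.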
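
This false positive is easy to reproduce. The sketch below hard-codes the three posting lists from the example and uses a hand-written regular expression as a stand-in for the LIKE check; it is an illustration, not Milvus code.

```python
import re

docs = ["Apple", "Pineapple", "Maple", "Apply", "Snapple"]

# Posting lists copied from the example index.
postings = {"Ap": {0, 3}, "ppl": {0, 1, 3, 4}, "ple": {0, 1, 2, 4}}

# Filtering phase: intersect the posting lists.
candidates = sorted(set.intersection(*postings.values()))
print(candidates)  # [0]

# Verification phase: '%Ap%pple%' needs "Ap" followed later by a separate "pple".
pattern = re.compile(r".*Ap.*pple.*", re.DOTALL)
results = [i for i in candidates if pattern.fullmatch(docs[i])]
print(results)  # []
```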

Limitations and Trade-offs of the Ngram Index

While the Ngram Index can significantly improve LIKE query performance, it introduces trade-offs that should be considered in real-world deployments.

Increased index size

The primary cost of the Ngram Index is higher storage overhead. Because the index stores all contiguous substrings whose lengths fall between min_gram and max_gram, the number of generated n-grams grows quickly as this range expands. Each additional n-gram length effectively adds another full set of overlapping substrings for every text value, increasing both the number of index keys and their posting lists. In practice, expanding the range by just one character can roughly double the index size compared to a standard inverted index.

Not effective for all workloads

The Ngram Index does not accelerate every workload. If query patterns are highly irregular, contain very short literals, or fail to reduce the dataset to a small candidate set in the filtering phase, the performance benefit may be limited. In such cases, query execution can still approach the cost of a full scan, even though the index is present.

Evaluating Ngram Index Performance on LIKE Queries

The goal of this benchmark is to evaluate how effectively the Ngram Index accelerates LIKE queries in practice.

Test Methodology

To put its performance in context, we compare it against two baseline execution modes:

Master: Brute-force execution without any index.
Master-inverted: Execution using a conventional inverted index.

We designed two test scenarios to cover different data characteristics:

Wiki text dataset: 100,000 rows, with each text field truncated to 1 KB.
Single-word dataset: 1,000,000 rows, where each row contains a single word.

Across both scenarios, the following settings are applied consistently:

Queries use the infix match pattern (%xxx%)
The Ngram Index is configured with min_gram = 2 and max_gram = 4
To isolate query execution cost and avoid result materialization overhead, all queries return count(*) instead of full result sets.

Results

Wiki text dataset: 100K rows, each row a wiki text truncated to a length of 1,000

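Mode	Literal	Time (ms)	Ngram speedup (vs. Master / vs. Master-inverted)	Count
Master	stadium	207.8		335
Master-inverted	stadium	2095		335
Ngram	stadium	1.09	190 / 1922	335
Master	secondary school	204.8		340
Master-inverted	secondary school	2000		340
Ngram	secondary school	1.26	162.5 / 1587	340
Master	is a coeducational, secondary school sponsore	223.9		1
Master-inverted	is a coeducational, secondary school sponsore	2100		1
Ngram	is a coeducational, secondary school sponsore	1.69	132.5 / 1242.6	1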

Single-word dataset: 1M rows, each row a single word

Mode	Literal	Time (ms)	Ngram speedup (vs. Master / vs. Master-inverted)	Count
Master	na	128.6		40430
Master-inverted	na	66.5		40430
Ngram	na	1.38	93.2 / 48.2	40430
Master	nat	122		5200
Master-inverted	nat	65.1		5200
Ngram	nat	1.27	96 / 51.3	5200
Master	nati	118.8		1630
Master-inverted	nati	66.9		1630
Ngram	nati	1.21	98.2 / 55.3	1630
Master	natio	118.4		1100
Master-inverted	natio	65.1		1100
Ngram	natio	1.33	89 / 48.9	1100
Master	nation	118		1100
Master-inverted	nation	63.3		1100
Ngram	nation	1.4	84.3 / 45.2	1100
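
The two numbers in each Speedup cell are the Master time and the Master-inverted time divided by the Ngram time. The short sketch below recomputes two rows from the tables as a sanity check; the helper name speedups is hypothetical.

```python
def speedups(master_ms: float, inverted_ms: float, ngram_ms: float) -> tuple[float, float]:
    """Return (speedup vs. Master, speedup vs. Master-inverted), rounded to one decimal."""
    return round(master_ms / ngram_ms, 1), round(inverted_ms / ngram_ms, 1)

print(speedups(207.8, 2095, 1.09))  # "stadium" row: (190.6, 1922.0)
print(speedups(128.6, 66.5, 1.38))  # "na" row: (93.2, 48.2)
```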

Note: These results are based on benchmarks conducted in May. Since then, the Master branch has undergone additional performance optimizations, so the performance gap observed here is expected to be smaller in current versions.

The benchmark results highlight a clear pattern: the Ngram Index significantly accelerates LIKE queries in every case tested, and the size of the speedup depends strongly on the structure and length of the underlying text data.

For long text fields, such as Wiki-style documents truncated to 1,000 bytes, the performance gains are especially pronounced. Compared to brute-force execution with no index, the Ngram Index achieves speedups of roughly 130–190×. When compared against a conventional inverted index, the improvement is even more dramatic, reaching 1,200–1,900×. This is because LIKE queries on long text are particularly expensive for traditional indexing approaches, while n-gram lookups can quickly narrow the search space to a very small set of candidates.
On datasets consisting of single-word entries, the gains are smaller but still substantial. In this scenario, the Ngram Index runs approximately 80–100× faster than brute-force execution and 45–55× faster than a conventional inverted index. Although shorter text is inherently cheaper to scan, the n-gram–based approach still avoids unnecessary comparisons and consistently reduces query cost.

Conclusion

The Ngram Index accelerates LIKE queries by breaking text into fixed-length n-grams and indexing them using an inverted structure. This design turns expensive substring matching into efficient n-gram lookups followed by minimal verification. As a result, full-text scans are avoided while the exact semantics of LIKE are preserved.

In practice, this approach is effective across a wide range of workloads, with especially strong results for substring matching on long text fields. The Ngram Index is therefore well suited for real-time scenarios such as code search, customer support agents, legal and medical document retrieval, enterprise knowledge bases, and academic search, where precise keyword matching remains essential.

At the same time, the Ngram Index benefits from careful configuration. Choosing appropriate min_gram and max_gram values is critical to balancing index size and query performance. When tuned to reflect real query patterns, the Ngram Index provides a practical, scalable solution for high-performance LIKE queries in production systems.

For more information about the Ngram Index, check the documentation below:

Ngram Index

Have questions or want a deep dive on any feature of the latest Milvus? Join our Discord channel or file issues on GitHub. You can also book a 20-minute one-on-one session to get insights, guidance, and answers to your questions through Milvus Office Hours.