
Standard​

The standard analyzer is the default analyzer in Milvus: it is applied automatically to text fields when no analyzer is specified. It uses grammar-based tokenization, making it effective for most languages.

Definition​

The standard analyzer consists of:​

  • Tokenizer: Uses the standard tokenizer to split text into discrete word units based on grammar rules. For more information, refer to the Standard tokenizer reference.

  • Filter: Uses the lowercase filter to convert all tokens to lowercase, enabling case-insensitive searches. For more information, refer to ​lowercase filter.

The functionality of the standard analyzer is equivalent to the following custom analyzer configuration:​

analyzer_params = {​
    "tokenizer": "standard",​
    "filter": ["lowercase"]​
}​

Configuration​

To apply the standard analyzer to a field, simply set type to standard in analyzer_params, and include optional parameters as needed.​

analyzer_params = {​
    "type": "standard", # Specifies the standard analyzer type​
}​

The standard analyzer accepts the following optional parameters: ​

  • stop_words: An array of stop words to remove from the tokenized output. Defaults to _english_, a built-in set of common English stop words. The details of _english_ can be found here.

Example configuration of custom stop words:​

analyzer_params = {​
    "type": "standard", # Specifies the standard analyzer type​
    "stop_words": ["of"] # Optional: List of words to exclude from tokenization
}​
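
To make the effect of stop_words concrete, here is a minimal pure-Python sketch of stop-word filtering. It is only an illustration of the idea, not Milvus's implementation: the helper name remove_stop_words and the sample tokens are hypothetical, and exact matching against already-lowercased tokens is an assumption.

```python
def remove_stop_words(tokens, stop_words):
    """Drop every token that appears in stop_words (exact match)."""
    blocked = set(stop_words)  # set lookup keeps the filter linear in len(tokens)
    return [t for t in tokens if t not in blocked]


# Hypothetical, already-lowercased tokens for the phrase "state of the art".
tokens = ["state", "of", "the", "art"]
filtered = remove_stop_words(tokens, ["of"])
print(filtered)  # ['state', 'the', 'art']
```

With a custom list such as ["of"], only the listed words are dropped in this sketch; other common words like "the" pass through unchanged.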

After defining analyzer_params, you can apply them to a VARCHAR field when defining a collection schema. This allows Milvus to process the text in that field using the specified analyzer for efficient tokenization and filtering. For more information, refer to Example use.​
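
As a rough sketch of that step, the snippet below attaches analyzer_params to a VARCHAR field with the pymilvus client. It assumes a pymilvus release that supports analyzers on VARCHAR fields (2.5 or later); the field names id and text, the max_length value, and the helper build_schema are illustrative choices, not taken from this page.

```python
analyzer_params = {
    "type": "standard",    # use the built-in standard analyzer
    "stop_words": ["of"],  # optional custom stop words
}


def build_schema(params):
    """Return a collection schema whose 'text' field uses the given analyzer."""
    # Imported inside the function so the snippet loads even where
    # pymilvus is not installed; calling build_schema still requires it.
    from pymilvus import DataType, MilvusClient

    schema = MilvusClient.create_schema(auto_id=True, enable_dynamic_field=False)
    schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
    schema.add_field(
        field_name="text",            # illustrative field name
        datatype=DataType.VARCHAR,
        max_length=1000,              # illustrative length limit
        enable_analyzer=True,         # turn on text analysis for this field
        analyzer_params=params,       # analyzer configuration defined above
    )
    return schema
```

Calling build_schema(analyzer_params) returns a schema object that can then be passed to collection creation; a real collection would typically also need a vector field.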

Example output​

Here’s how the standard analyzer processes text.​

Original text:​

"The Milvus vector database is built for scale!"

Expected output:​

["the", "milvus", "vector", "database", "is", "built", "for", "scale"]​
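
The tokenizer-plus-lowercase pipeline can be approximated in a few lines of plain Python. This is a simplified stand-in for illustration only: it uses a regular expression for word characters rather than the grammar-based rules of the standard tokenizer, and the function name approx_standard_tokens is hypothetical.

```python
import re


def approx_standard_tokens(text):
    """Split text into runs of word characters, then lowercase each token."""
    return [match.group(0).lower() for match in re.finditer(r"\w+", text)]


tokens = approx_standard_tokens("The Milvus vector database is built for scale!")
print(tokens)
# ['the', 'milvus', 'vector', 'database', 'is', 'built', 'for', 'scale']
```

For this sentence the approximation matches the expected output above; on text with apostrophes, hyphens, or non-Latin scripts the real tokenizer may split differently.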
