Agentic RAG agents evaluate retrieved documents and iteratively re-query or rewrite prompts if results are irrelevant.
Agent strategies:
1. Relevance scoring: The agent uses an LLM to score whether the retrieved documents answer the original query. If the scores fall below a threshold, the agent re-queries with different search terms.
2. Query rewriting: If retrieval fails, the agent reframes the question. Example:
- Original: “What is our Q4 supply chain resilience?”
- Rewritten: “Which suppliers had >10 days delivery delays in October–December?”
3. Multi-step iteration: The agent chains retrievals. If the first retrieval finds related documents, the agent uses those results to inform a second, more targeted query.
4. Fallback strategies: If vector search returns sparse results, the agent switches to metadata filtering, keyword search, or broader semantic queries.
5. Metadata constraints: The agent applies dynamic metadata filters (date ranges, document types, source systems) to prune irrelevant results before generation.
This iterative loop is why low-latency retrieval matters: each user question can trigger multiple round-trips to Milvus, and every one of them must complete without blocking the user-facing response.
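The core of this loop (strategies 1 and 2, with the round budget acting as a fallback) can be sketched in a few lines of Python. Everything here is illustrative: `retrieve`, `score_relevance`, and `rewrite` are hypothetical stand-ins for a vector-store query, an LLM relevance judge, and an LLM query rewriter, not a real client API.

```python
# Hypothetical in-memory corpus standing in for a vector store such as Milvus.
CORPUS = {
    "logistics-report": "Suppliers A and B had 12-day delivery delays in November.",
    "hr-handbook": "Vacation policy and onboarding checklist.",
}

def retrieve(query: str) -> list[str]:
    """Naive keyword retrieval: return docs sharing at least one term with the query."""
    terms = set(query.lower().split())
    return [doc for doc in CORPUS.values() if terms & set(doc.lower().split())]

def score_relevance(query: str, doc: str) -> float:
    """Stand-in for an LLM relevance judge: fraction of query terms found in the doc."""
    terms = set(query.lower().split())
    return len(terms & set(doc.lower().split())) / max(len(terms), 1)

def rewrite(query: str) -> str:
    """Stand-in for LLM query rewriting (strategy 2): reframe a vague question
    into concrete, retrievable terms."""
    rewrites = {
        "What is our Q4 supply chain resilience?":
            "Which suppliers had delivery delays in November?",
    }
    return rewrites.get(query, query)

def agentic_retrieve(query: str, threshold: float = 0.3, max_rounds: int = 3):
    """Retrieve, score, and re-query with a rewrite until a document clears
    the relevance threshold or the round budget is exhausted."""
    for _ in range(max_rounds):
        docs = retrieve(query)
        scored = sorted(((score_relevance(query, d), d) for d in docs), reverse=True)
        if scored and scored[0][0] >= threshold:
            return scored[0][1]  # relevant enough: stop iterating
        query = rewrite(query)   # strategy 2: reframe and retry
    return None                  # signal sparse results; caller picks a fallback
```

In a real deployment, `max_rounds` is the knob that bounds the latency cost described above: each extra round is another retrieval round-trip plus an LLM scoring call.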
Related Resources: