How can you specify or adjust the amount of time DeepResearch spends on a query, if at all?

To control how much time DeepResearch spends processing a query, you can adjust parameters that influence its workflow stages, such as data retrieval, analysis, and response generation. These parameters are typically exposed through configuration options in the API or SDK. For example, you might set a max_time limit to cap total processing duration or configure the number of data sources or iterations the system uses. These settings let you balance speed against depth of analysis, depending on your use case.
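As a minimal sketch of this idea, the snippet below groups such settings into a single configuration object. The field names (max_time_s, max_sources, max_iterations) are illustrative placeholders, not documented DeepResearch parameters; check your API or SDK reference for the actual option names.

```python
from dataclasses import dataclass

@dataclass
class ResearchConfig:
    """Illustrative time/effort settings for a deep-research run.

    Field names are placeholders, not documented DeepResearch options.
    """
    max_time_s: float = 120.0   # cap on total processing time, in seconds
    max_sources: int = 3        # how many data sources to query
    max_iterations: int = 2     # refinement passes before returning an answer

# A fast, shallow profile versus a slower, more thorough one.
realtime_profile = ResearchConfig(max_time_s=30.0, max_sources=2, max_iterations=1)
thorough_profile = ResearchConfig(max_time_s=600.0, max_sources=8, max_iterations=5)
```

Grouping the limits this way makes it easy to switch between a low-latency profile and a research-heavy one without touching the rest of the pipeline.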

One practical approach is to limit the scope of data retrieval. DeepResearch often starts by gathering information from external databases, APIs, or internal datasets. By specifying a search_depth parameter (e.g., restricting the number of sources queried) or setting shorter timeouts for external requests, you can reduce the time spent in this phase. For instance, if your query requires real-time results, you might configure the system to check only two databases instead of five, sacrificing some comprehensiveness for faster output. Similarly, adjusting the max_iterations parameter in algorithms like optimization loops or recursive analysis steps can directly control computational time. A lower iteration count stops the process earlier, yielding quicker but potentially less refined results.
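The sketch below shows what limiting retrieval scope can look like if you orchestrate retrieval yourself in Python: it trims the source list to a search_depth, applies a short per-request timeout, and skips sources that stall. The source URLs and helper names are hypothetical, not part of any DeepResearch API.

```python
import requests  # third-party HTTP library

# Hypothetical source list; in practice these would be your databases or APIs.
ALL_SOURCES = [
    "https://example.com/api/source1",
    "https://example.com/api/source2",
    "https://example.com/api/source3",
    "https://example.com/api/source4",
    "https://example.com/api/source5",
]

def retrieve(query: str, search_depth: int = 2, request_timeout_s: float = 3.0) -> list[str]:
    """Query only the first `search_depth` sources, each with a short timeout."""
    documents = []
    for url in ALL_SOURCES[:search_depth]:
        try:
            resp = requests.get(url, params={"q": query}, timeout=request_timeout_s)
            resp.raise_for_status()
            documents.append(resp.text)
        except requests.RequestException:
            # Skip slow or failing sources instead of stalling the whole run.
            continue
    return documents

# Real-time use case: check only two sources, three seconds each at most.
docs = retrieve("latest results on HNSW recall", search_depth=2, request_timeout_s=3.0)
```

Raising search_depth or the timeout trades latency for coverage, which mirrors the comprehensiveness-versus-speed trade-off described above.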

However, adjusting time constraints requires understanding trade-offs. For example, a shorter max_time might force DeepResearch to return incomplete answers or skip verification steps, increasing the risk of errors. Conversely, allowing more time enables thorough cross-referencing and validation, which is critical for research-heavy tasks. Developers can implement fallback mechanisms, such as returning partial results with a warning when time limits are hit. Monitoring tools like execution logs or performance metrics (e.g., time_per_stage) can help fine-tune these parameters. For instance, if logs show data retrieval consumes 80% of the processing time, you might optimize by caching frequently accessed data or prioritizing faster sources. Ultimately, the goal is to align time adjustments with the application’s accuracy and latency requirements.
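One way to implement such a fallback outside the service itself is sketched below: a hypothetical multi-stage research function runs in a worker thread via concurrent.futures, whatever partial results exist are returned with a warning if the deadline passes, and per-stage timings are recorded for tuning. The stage names and sleep calls are stand-ins, not real DeepResearch internals.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

timings: dict[str, float] = {}      # per-stage durations (a time_per_stage-style metric)
partial_results: list[str] = []     # filled in as each stage completes

def deep_research(query: str) -> str:
    """Hypothetical multi-stage pipeline; the sleeps stand in for real work."""
    for stage, work_s in [("retrieval", 2.0), ("analysis", 3.0), ("generation", 1.0)]:
        start = time.perf_counter()
        time.sleep(work_s)                          # placeholder for the real stage
        timings[stage] = time.perf_counter() - start
        partial_results.append(f"{stage} complete")
    return "final, fully verified answer"

pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(deep_research, "example query")
try:
    answer = future.result(timeout=4.0)             # hard cap on total processing time
except FutureTimeout:
    # Fallback: return whatever finished so far, flagged as partial.
    answer = "; ".join(partial_results) + " -- WARNING: time limit reached, result is partial"
finally:
    pool.shutdown(wait=False)                       # note: the worker itself keeps running

print(answer)
print(timings)   # if retrieval dominates, consider caching or faster sources
```

Inspecting the recorded timings after a few runs shows which stage to optimize first, which is exactly the kind of signal the execution logs mentioned above would provide.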
