You decide by matching the objective to the cost of being wrong and the nature of uncertainty in your environment. Pure Minimax optimizes worst-case outcomes: it’s conservative and robust, but it can be overly cautious. Average-case objectives (expected value) are better when uncertainty is genuinely stochastic and the downside of occasional mistakes is acceptable. A hybrid objective is often best in real systems: you want strong average performance, but you also want to cap catastrophic failures. In games, this might mean playing for a win but avoiding forced losses; in decision systems, it might mean being helpful but refusing risky actions when evidence is weak.
Implementation-wise, you can formalize this with a utility function that includes both expected value and risk. One simple hybrid is a “risk-adjusted” score: score = E[value] - λ * Risk, where Risk could be variance, worst-case tail loss, or a penalty for low-confidence states, and λ sets how much expected value you are willing to trade for lower risk. Another hybrid is “maximize expected value subject to a worst-case constraint,” like “never take actions whose worst-case score is below -X.” This is essentially a constrained optimization: you allow risk, but only up to a boundary you’re comfortable with. In Minimax search, you can approximate this by combining expectiminimax at chance nodes with a worst-case guardrail for uncertain evidence, or by using Minimax at high-risk branches and expectation elsewhere.
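The two hybrids described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the action names, the outcome values, the choice of variance as the Risk term, and the assumption that each listed outcome is equally likely are all hypothetical.

```python
from statistics import mean, pvariance

def risk_adjusted_score(outcomes, lam=0.5):
    """score = E[value] - lam * Risk, with population variance as Risk (an assumption)."""
    return mean(outcomes) - lam * pvariance(outcomes)

def best_with_floor(actions, floor):
    """Maximize expected value among actions whose worst case is at least `floor`."""
    safe = {name: outs for name, outs in actions.items() if min(outs) >= floor}
    if not safe:
        # No action satisfies the constraint: fall back to pure minimax
        # (pick the action with the best worst case).
        return max(actions, key=lambda name: min(actions[name]))
    return max(safe, key=lambda name: mean(safe[name]))

# Hypothetical actions; each list holds equally likely outcome values.
actions = {
    "aggressive": [12, 10, -4],
    "balanced": [5, 4, -2],
    "cautious": [1, 1, 0],
}

for floor in (-5, -3, -1):
    print(floor, best_with_floor(actions, floor))

for lam in (0.0, 0.5):
    best = max(actions, key=lambda name: risk_adjusted_score(actions[name], lam))
    print(lam, best)
```

Tightening the floor moves the choice from the highest-mean action toward the safest one, and raising λ does the same for the risk-adjusted score. The constrained form is often easier to explain to stakeholders because the boundary is stated in the same units as the outcome.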
A concrete example: imagine a help system that retrieves policy documents and decides whether to give a definitive answer or escalate to human review. If wrong answers are expensive, a worst-case objective makes sense when evidence is thin. If evidence is strong, optimizing average helpfulness may be better. With Milvus or Zilliz Cloud, you can implement this by using retrieval confidence signals to switch objectives: when the top-k candidates are consistent and high-confidence, use an expectation-like score that prioritizes helpfulness; when candidates conflict or confidence is low, use a Minimax-like robust score that prioritizes safety and asks for clarification or escalates. The “hybrid” is not hand-wavy: it is a policy rule grounded in measurable signals that determine when worst-case robustness should dominate and when average-case performance should take over.
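One way such a switching rule might look is sketched below. It works on plain Python objects rather than calling any vector database API, so the `Hit` class, the `decide` function, the thresholds, and the payoff values are all hypothetical. It also assumes that similarity scores are normalized to [0, 1] and that each retrieved passage has already been mapped to the answer it supports.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Hit:
    score: float  # similarity, assumed normalized to [0, 1], higher is better
    answer: str   # hypothetical label: the answer this passage supports

# Hypothetical payoffs and thresholds; tune them on your own evaluation data.
GAIN_RIGHT = 1.0
COST_WRONG = -3.0
VALUE_ESCALATE = -0.2
MIN_TOP_SCORE = 0.80
MIN_AGREEMENT = 0.75

def decide(hits):
    """Return (action, answer), where action is "answer" or "escalate"."""
    if not hits:
        return "escalate", None
    answer, votes = Counter(h.answer for h in hits).most_common(1)[0]
    agreement = votes / len(hits)
    top_score = max(h.score for h in hits)
    strong = top_score >= MIN_TOP_SCORE and agreement >= MIN_AGREEMENT

    if strong:
        # Expectation mode: use agreement as a crude proxy for P(correct).
        value_answer = agreement * GAIN_RIGHT + (1 - agreement) * COST_WRONG
    else:
        # Minimax mode: judge answering by its worst possible outcome.
        value_answer = COST_WRONG

    if value_answer > VALUE_ESCALATE:
        return "answer", answer
    return "escalate", None

consistent = [Hit(0.91, "30-day refund"), Hit(0.88, "30-day refund"),
              Hit(0.85, "30-day refund"), Hit(0.83, "30-day refund")]
conflicting = [Hit(0.90, "30-day refund"), Hit(0.89, "no refund"),
               Hit(0.70, "30-day refund"), Hit(0.68, "no refund")]

print(decide(consistent))
print(decide(conflicting))
```

With consistent, high-scoring hits the rule answers directly; with conflicting hits it treats answering as a worst-case bet and escalates. Treating agreement as a probability of correctness is a simplification, and a calibrated confidence estimate would be a better input if one is available.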