What is mean reciprocal rank (MRR)?

Mean Reciprocal Rank (MRR) is a metric for evaluating systems that return ranked results, such as search engines or recommendation models. It measures how close to the top of the list a system places the first correct answer. For each query, take the rank at which the first relevant result appears and compute its reciprocal (inverse): if the correct answer is ranked 1st, the reciprocal is 1/1 = 1; if it’s ranked 3rd, the reciprocal is 1/3 ≈ 0.33. MRR is the average of these reciprocals across all queries. Because the reciprocal shrinks quickly as the rank grows, queries answered near the top contribute far more to the score than queries answered further down, and scores closer to 1 are better.
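The definition above can be written compactly as a formula. The symbols here are notation chosen for illustration rather than taken from the article: Q is the set of evaluation queries and rank_i is the 1-based position of the first relevant result for query i. A common convention, assumed here, is that a query with no relevant result in the list contributes 0 to the sum.

```latex
\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}
```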

Example and Calculation

Suppose you test a search engine with three queries. For Query 1, the correct result is ranked 1st (reciprocal = 1). For Query 2, the correct result is ranked 3rd (reciprocal = 1/3). For Query 3, the correct result is ranked 2nd (reciprocal = 1/2). The MRR is the average of these reciprocals: (1 + 1/3 + 1/2) / 3 = 11/18 ≈ 0.61. A perfect system, which always ranks the correct answer 1st, would score 1.0, so 0.61 indicates that the first correct answer usually lands near the top of the results but not consistently in first place. MRR is particularly useful when finding at least one correct result quickly matters most, such as in question-answering systems where users expect the answer right away.
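The three-query calculation can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `mean_reciprocal_rank` and its input format (one 1-based rank per query, or `None` when no relevant result was returned) are assumptions made for illustration, as is the convention that a missing result scores 0.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_relevant_ranks: Sequence[Optional[int]]) -> float:
    """Average the reciprocal of each query's first relevant rank (1-based)."""
    if not first_relevant_ranks:
        raise ValueError("need at least one query")
    total = 0.0
    for rank in first_relevant_ranks:
        if rank is None:
            # Assumed convention: no relevant result contributes 0.
            continue
        if rank < 1:
            raise ValueError("ranks are 1-based and must be >= 1")
        total += 1.0 / rank
    # Divide by the number of queries, including those that scored 0.
    return total / len(first_relevant_ranks)


# Ranks of the first correct result for Query 1, Query 2, and Query 3.
ranks = [1, 3, 2]
print(f"MRR = {mean_reciprocal_rank(ranks):.2f}")  # prints "MRR = 0.61"
```

Note that the average is taken over all queries, so a query with no correct result still lowers the score.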

Use Cases and Limitations

MRR is widely used in information retrieval and recommendation tasks where the goal is to surface the most relevant item early. For instance, in a chatbot that retrieves answers from a knowledge base, MRR could measure how close to the top of the retrieved list the correct response appears. However, MRR has limitations: only the first correct result affects the score, so it gives no credit for additional correct results and does not penalize irrelevant results that appear after the first correct one. Metrics like Mean Average Precision (MAP) or Normalized Discounted Cumulative Gain (NDCG) are better suited to scenarios with multiple relevant results per query, but MRR remains a simple, effective tool when each query has a single correct answer.
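The limitation can be seen by scoring two result lists that both start with a correct result but differ afterwards. The sketch below is illustrative only: the helper names and the boolean relevance lists are hypothetical, and `average_precision` (the per-query quantity that MAP averages) is simplified by assuming every relevant item for the query appears somewhere in the list.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Reciprocal of the 1-based position of the first relevant result."""
    for position, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / position
    return 0.0  # assumed convention: no relevant result scores 0


def average_precision(relevance: Sequence[bool]) -> float:
    """Mean of precision@k over the positions k that hold a relevant result.

    Simplifying assumption: all relevant items for the query are in the list.
    """
    hits = 0
    precision_sum = 0.0
    for position, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


# Both lists put a relevant result first; they differ in what follows.
list_a = [True, True, True, False]
list_b = [True, False, False, True]

for name, results in (("A", list_a), ("B", list_b)):
    rr = reciprocal_rank(results)
    ap = average_precision(results)
    print(f"list {name}: RR = {rr:.2f}, AP = {ap:.2f}")
# list A: RR = 1.00, AP = 1.00
# list B: RR = 1.00, AP = 0.75
```

Both lists receive the same reciprocal rank, while average precision separates them, which is why a multi-relevance metric is the better fit when more than one result per query matters.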
