The effect of pooling and evaluation depth on IR metrics

X Lu; A Moffat; JS Culpepper

Journal article

Information Retrieval Journal | SPRINGER | Published : 2016

DOI: 10.1007/s10791-016-9282-6

Abstract

Batch IR evaluations are usually performed in a framework that consists of a document collection, a set of queries, a set of relevance judgments, and one or more effectiveness metrics. A large number of evaluation metrics have been proposed, with two primary families having emerged: recall-based metrics and utility-based metrics. In both families, the pragmatics of forming judgments mean that it is usual to evaluate the metric to some chosen depth, such as k = 20 or k = 100, without necessarily fully considering the ramifications associated with that choice. Our aim in this paper is to explore the relative risks arising with fixed-depth evaluation in the two families, and document the complex..

University of Melbourne Researchers

Alistair Moffat Author

Grants

Awarded by Australian Research Council

Awarded by Australian Research Council DECRA Research Fellowship

Funding Acknowledgements

This work was supported by the Australian Research Council's Discovery Projects Scheme (DP140101587). Shane Culpepper is the recipient of an Australian Research Council DECRA Research Fellowship (DE140100275).

Citation metrics

56 Scopus

37 Web of Science

50 Dimensions

Keywords

46 Information and Computing Sciences

Evaluation Metrics Comparison

Generic Health Relevance

Experimentation

Pooling and Evaluation Depth

Computer Science, Information Systems

Science & Technology

4605 Data Management and Data Science

Technology

Computer Science