Design-phase estimation of query-set size in Information Retrieval evaluation

William Webber

Department of Computer Science and Software Engineering, The University of Melbourne

Date and time: 11.30am - 12.30pm, Friday 6th March, 2009

Venue: 10.08.04 (Building 10, Level 8, Room 4)

Abstract:

Standard Information Retrieval (IR) test collections are too small to reliably detect incremental improvements, and too generalist for use in specialised sub-fields.  Researchers should create their own test collections.  Statistical power analysis addresses the question of how many queries to include, which is crucial because of the expense of assessing them.  In this paper, we examine power analysis in IR evaluation, particularly the estimation of the variability of score deltas.  We demonstrate that there is no single population of score deltas, and so variability must be separately predicted for each pair of systems to be compared.  Both past experience and trial experiments set very wide bounds on this prediction.  Alternatively, the query set can be iteratively increased until power is achieved.  We demonstrate empirically that the iterative method is biased towards exaggerating power and significance, but only mildly so.  We therefore propose a hybrid methodology: an initial best estimate of query set size, supplemented by iterative expansion if power is found to be insufficient.  Finally, we ask whether IR evaluation has become so stereotyped as to become a hindrance to genuine improvement.
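
For readers unfamiliar with power analysis, the following is a minimal sketch of the kind of design-phase calculation the abstract refers to. It is not the speaker's method. It assumes a two-sided paired test under a normal approximation, and the function name queries_needed, its parameters, and the example values are all hypothetical. Given a predicted mean score delta between two systems and a predicted standard deviation of the per-query deltas, it returns an approximate number of queries.

```python
import math
from statistics import NormalDist


def queries_needed(delta, sd, alpha=0.05, power=0.8):
    """Approximate query-set size for a two-sided paired test.

    delta: predicted mean per-query score difference between two systems
    sd:    predicted standard deviation of the per-query score deltas
    alpha: significance level; power: desired probability of detection

    Assumption: normal approximation, n = ((z_{1-alpha/2} + z_power) * sd / delta)^2.
    A t-based calculation would give a slightly larger n for small query sets.
    """
    if delta == 0 or sd <= 0:
        raise ValueError("delta must be non-zero and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) * sd / abs(delta)) ** 2)


# Illustrative values only (not taken from the talk):
n = queries_needed(delta=0.02, sd=0.1)
```

Because the required size scales with the square of sd / delta, a wide bound on the predicted variability translates into an even wider bound on the number of queries, which is why the accuracy of the variability estimate matters so much.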

About the speaker:

William Webber is a research associate in the Department of Computer Science and Software Engineering at the University of Melbourne.  His research interests include information retrieval evaluation and distributed information retrieval. He is currently completing his PhD.


Seminar Organisation

Seminars are free and open to the general public. No booking is necessary. If you would like to give a presentation in this seminar series, or to suggest speakers, please contact Xiaodong Li, the seminar co-ordinator.