Do Typical IR System Experiments Measure Anything Useful?

Dr Andrew Turpin

RMIT University

Date and time: 11.30am - 12.30pm, Friday 28 July, 2006

Venue: 10.08.03 (Building 10, Level 8, Room 3)

Abstract:

Nearly all experiments reported in international journals and conferences comparing information retrieval systems use the metric Mean Average Precision (MAP): a measure of how many relevant documents are ranked highly in a results list returned by a search engine. Systems with high MAP scores are promoted as being superior to systems with low MAP scores.
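To make the metric concrete, here is a minimal sketch of how MAP is commonly computed. It is not taken from the talk. The function names and the input representation (a list of relevance flags in rank order, plus the number of relevant documents known for the topic) are assumptions made for illustration.

```python
def average_precision(ranked_relevance, total_relevant):
    """Average precision for one topic.

    ranked_relevance: list of booleans in rank order (True = relevant).
    total_relevant:   number of relevant documents known for the topic
                      (hypothetical input; in practice it comes from
                      the relevance judgments).
    """
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant
            # document appears.
            precision_sum += hits / rank
    return precision_sum / total_relevant


def mean_average_precision(topics):
    """Mean of the per-topic average precision values.

    topics: list of (ranked_relevance, total_relevant) pairs.
    """
    return sum(average_precision(r, n) for r, n in topics) / len(topics)
```

For example, a results list whose first and third documents are the only two relevant ones for a topic scores (1/1 + 2/3) / 2, or about 0.83, for that topic.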

Several recent studies have demonstrated that a system with a high MAP score does not necessarily satisfy users more than a system with a low MAP score. That is, while the IR research community believes that advances are being made in IR system effectiveness, users are just as happy with the old technology. As is the case with most studies involving human subjects, these previous studies involve compromises that make the case against MAP a little grey.

In this study, we attempt to remove some of the confounds, and evaluate two different information retrieval tasks on TREC Web-track data: a precision-based user task, measured by the length of time that users need to find a single document that is relevant to a TREC topic; and a simple recall-based task, represented by the total number of relevant documents that users can identify within five minutes. Users employ search engines with controlled mean average precision (MAP) of between 55% and 95%.

Our results show that there is no significant relationship between system effectiveness measured by MAP and user performance on the precision-based task. That is, users performed equally well using either a search engine with high MAP or low MAP, confirming earlier studies. A significant but weak relationship is present for the precision-at-one-document-returned metric. A weak relationship is present between MAP and the simple recall-based task.

About the speaker:

Dr Andrew Turpin is an ARC Queen Elizabeth II Senior Research Fellow in the School of Computer Science and Information Technology at RMIT University, Melbourne. He completed his PhD on data compression at The University of Melbourne in 1999, then spent several years at Devers Eye Institute and Oregon Health and Sciences University in Portland, Oregon, followed by four years teaching computer science at Curtin University of Technology in Perth. After a short stint as Senior Lecturer at The University of Melbourne, he has spent the last 18 months at RMIT as part of The Search Engine Group. His research interests include computational problems in ophthalmology, information retrieval, and string algorithms.

This work is joint with Dr Falk Scholer of RMIT, and will be presented at the SIGIR conference in Seattle, August 2006.


Seminar Organisation

Seminars are free and open to the general public. No booking is necessary. If you are interested in giving a presentation in this seminar series, or would like to suggest speakers, please contact Xiaodong Li, the seminar co-ordinator.