Text Queries on Document Time Series

Dr. Rosie Jones, Sr. Research Scientist

Information Retrieval Yahoo! Matching Sciences

Date and time: 11.30am-12.30pm, Friday 9th September, 2005

Venue: 10.08.04

Chair: Xiaodong Li

Abstract:

Many documents, such as emails and news stories, have timestamps corresponding to the date the document was created or the date the information was sent. Using the timestamp, each document can be represented as a point on a timeline. We can then represent the entire document collection as a function giving the number of documents at each point on the timeline. A text query on a collection of documents on a timeline selects the subset of documents that are relevant to the query. The resulting set of documents can also be represented on a timeline. For example, a query for "earthquake" would lead to documents clustered in peaks on the timeline, near each earthquake described in the news. We describe experiments on ways of using this timeline to identify temporal query ambiguity, as well as to predict when queries are likely to lead to poor search results.
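
As a rough illustration of the timeline idea in the abstract, the following minimal Python sketch counts documents per date and then counts only those matching a query term. It is not the speaker's method: the toy collection DOCS, the function names timeline and query_timeline, and the use of exact word matching as a stand-in for relevance are all assumptions made for this example.

```python
from collections import Counter
from datetime import date

# Hypothetical toy collection of (timestamp, text) pairs, invented for illustration.
DOCS = [
    (date(2005, 1, 3), "Earthquake strikes coastal region"),
    (date(2005, 1, 3), "Rescue teams respond after earthquake"),
    (date(2005, 1, 4), "Markets close higher"),
    (date(2005, 3, 28), "Strong earthquake felt offshore"),
    (date(2005, 3, 29), "Election results announced"),
]


def timeline(docs):
    """Map each date to the number of documents timestamped on that date."""
    return Counter(day for day, _ in docs)


def query_timeline(docs, term):
    """Timeline restricted to documents containing the term as a whole word.

    Assumption: case-insensitive word matching stands in for a real
    relevance judgement by a retrieval system.
    """
    term = term.lower()
    matches = [(day, text) for day, text in docs if term in text.lower().split()]
    return timeline(matches)


if __name__ == "__main__":
    # Print the query timeline in date order; dates with several matches are "peaks".
    for day, count in sorted(query_timeline(DOCS, "earthquake").items()):
        print(day.isoformat(), count)
```

In this toy data the "earthquake" query concentrates on two dates, while a term that matches no document yields an empty timeline.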

About the speaker:

Rosie Jones is a research scientist in information retrieval at Yahoo! Her research interests include information retrieval, time series modeling, machine learning, and text mining. She received her PhD in Language Technologies from the School of Computer Science at Carnegie Mellon University, and her BSc in Computer Science from the University of Sydney.


Seminar Organisation

Seminars are free and open to the general public. No booking is necessary. If you are interested in giving a presentation in this seminar series, or would like to suggest speakers, please contact Xiaodong Li, the seminar co-ordinator.