Text Reuse

Professor Bruce Croft

Distinguished Professor and Chair, Department of Computer Science, and Director, Center for Intelligent Information Retrieval, Department of Computer Science, University of Massachusetts

Date and time: 11.30am - 12.30pm, Friday 4th April, 2008

Venue: 10.08.04 (Building 10, Level 8, Room 4)

Abstract:

Many people are familiar with the issue of plagiarism using information available on the Web. Detecting duplicate documents has also been studied for some time because of its importance in Web search engines. More generally, text reuse occurs whenever somebody "borrows" and modifies facts or statements from a source and uses them in another document. Initial experiments at RMIT and UMass indicated that there was little difference between similarity measures in their ability to detect reuse. In this talk, I will describe more recent experiments that use similarity measures and a fingerprint technique to detect how much reuse there is in news and blog databases.

About the speaker:

Professor Bruce Croft is a Distinguished Professor and Chair of the Department of Computer Science, and Director of the Center for Intelligent Information Retrieval, at the University of Massachusetts. His research interests include information retrieval and digital libraries. For more information, please go to his homepage.


Seminar Organisation

Seminars are free and open to the general public. No booking is necessary. If you are interested in giving a presentation in this seminar series, or would like to suggest a speaker, please contact Xiaodong Li, the seminar co-ordinator.