Conference Proceedings

Learning Information Extraction Patterns Using WordNet

Mark Stevenson, Mark A Greenwood, P Sojka (ed.), KS Choi (ed.), C Fellbaum (ed.), P Vossen (ed.)

GWC 2006: THIRD INTERNATIONAL WORDNET CONFERENCE, PROCEEDINGS | MASARYKOVA UNIV | Published: 2005

Abstract

Information Extraction (IE) systems often use patterns to identify relevant information in text, but these patterns are difficult and time-consuming to generate manually. This paper presents a new approach to the automatic learning of IE patterns which uses WordNet to judge the similarity between patterns. The algorithm starts with a small set of sample extraction patterns and uses a similarity metric, based on a version of the vector space model augmented with information from WordNet, to learn similar patterns. This approach is found to perform better than a previously reported method which relied on information about the distribution of patterns in a corpus and did not make use of WordNet. © Masar..
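
The idea of a vector space similarity augmented with lexical knowledge can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the vocabulary, the word-to-word scores (which stand in for a WordNet-based measure), the scoring formula a·W·b / (|a||b|), the threshold, and the function names are all assumptions made for the example.

```python
import math

# Hypothetical toy vocabulary of pattern elements (not taken from the paper).
VOCAB = ["chairman", "president", "resign", "quit", "company"]

# Hypothetical word-to-word similarity scores standing in for a WordNet-based
# measure. Values are invented for illustration; unlisted pairs score 0.0.
SIM = {
    ("chairman", "president"): 0.8,
    ("resign", "quit"): 0.9,
}


def word_sim(a, b):
    """Symmetric lookup of the toy similarity; identical words score 1.0."""
    if a == b:
        return 1.0
    return SIM.get((a, b), SIM.get((b, a), 0.0))


def to_vector(pattern):
    """Binary vector over VOCAB marking which elements the pattern contains."""
    return [1.0 if w in pattern else 0.0 for w in VOCAB]


def pattern_similarity(p, q):
    """Cosine-style score a.W.b / (|a||b|), where W holds word similarities."""
    a, b = to_vector(p), to_vector(q)
    n = len(VOCAB)
    num = sum(
        a[i] * word_sim(VOCAB[i], VOCAB[j]) * b[j]
        for i in range(n)
        for j in range(n)
    )
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return num / norm if norm else 0.0


def learn(seeds, candidates, threshold=0.5):
    """Keep candidates whose best similarity to any seed meets the threshold."""
    return [
        c for c in candidates
        if max((pattern_similarity(s, c) for s in seeds), default=0.0) >= threshold
    ]


seeds = [{"chairman", "resign"}]
candidates = [{"president", "quit"}, {"company"}]
accepted = learn(seeds, candidates)
```

In this toy setting the candidate {"president", "quit"} shares no words with the seed, so a plain cosine measure would score it zero; the similarity-weighted score lets it be accepted, while {"company"} is rejected.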

University of Melbourne Researchers