Conference Proceedings
Interpretation of Compound Nominalisations using Corpus and Web Statistics
J Nicholson, T Baldwin
COLING/ACL 2006 Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, Proceedings of the Workshop | Association for Computational Linguistics | Published: 2006
Abstract
We present two novel paraphrase tests for automatically predicting the inherent semantic relation of a given compound nominalisation as one of subject, direct object, or prepositional object. We compare these to the usual verb-argument paraphrase test using corpus statistics and frequencies obtained by scraping the Google search engine interface. We also implement a statistical measure that is more robust than maximum likelihood estimation: the confidence interval. A significant reduction in data sparseness was achieved, but this alone is insufficient to provide a substantial performance improvement.
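To illustrate why a confidence interval can be more robust than maximum likelihood estimation when counts are sparse, here is a minimal sketch. The abstract does not say which interval the authors use or how it is applied, so the Wilson score interval here is an assumption, the function names are hypothetical, and the counts are invented for illustration only.

```python
import math


def mle(hits, trials):
    """Maximum likelihood estimate of a proportion: hits / trials."""
    return hits / trials if trials else 0.0


def wilson_lower_bound(hits, trials, z=1.96):
    """Lower end of the Wilson score interval for a binomial proportion.

    Assumption: the Wilson interval stands in for "the confidence interval";
    the abstract does not name a specific interval. z=1.96 gives roughly
    95% coverage.
    """
    if trials == 0:
        return 0.0
    p = hits / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom


# Invented counts: paraphrase hits supporting one relation, out of all hits.
sparse = (2, 2)      # 2 of 2 hits: very little evidence
dense = (60, 100)    # 60 of 100 hits: much more evidence

for name, (hits, trials) in (("sparse", sparse), ("dense", dense)):
    print(name, round(mle(hits, trials), 3),
          round(wilson_lower_bound(hits, trials), 3))
```

Under these invented counts, the maximum likelihood estimate ranks the sparse case higher (1.0 against 0.6), while the interval's lower bound ranks the dense case higher (roughly 0.34 against 0.50), because it discounts estimates built on very few observations.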