Conference Proceedings
An empirical model of multiword expression decomposability
Timothy Baldwin, Colin Bannard, Takaaki Tanaka, Dominic Widdows
Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment | Association for Computational Linguistics | Published: 2003
Open access
Abstract
This paper presents a construction-inspecific model of multiword expression decomposability based on latent semantic analysis. We use latent semantic analysis to determine the similarity between a multiword expression and its constituent words, and claim that higher similarities indicate greater decomposability. We test the model over English noun-noun compounds and verb-particles, and evaluate its correlation with similarities and hyponymy values in WordNet. Based on mean hyponymy over partitions of data ranked on similarity, we furnish evidence for the calculated similarities being correlated with the semantic relational content of WordNet.
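The core idea in the abstract, comparing a multiword expression with each of its constituent words in a latent semantic space, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy corpus, the convention of pre-merging the expression into a single token (`house_boat`), the context window of 2, the choice of 2 latent dimensions, and all function names. It uses a plain truncated SVD over a word co-occurrence matrix as a stand-in for latent semantic analysis.

```python
import numpy as np

def cooccurrence_matrix(sentences, window=2):
    """Count how often each token appears within `window` positions of another."""
    vocab = sorted({tok for sent in sentences for tok in sent})
    index = {tok: i for i, tok in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for sent in sentences:
        for i, tok in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[tok], index[sent[j]]] += 1
    return counts, index

def lsa_vectors(counts, k=2):
    """Project each token onto the top-k singular directions (truncated SVD)."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def decomposability(mwe, parts, vectors, index):
    """Similarity between the expression and each of its constituent words."""
    return {p: cosine(vectors[index[mwe]], vectors[index[p]]) for p in parts}

# Hypothetical toy corpus; the expression is pre-merged into one token.
sentences = [
    ["the", "house_boat", "floats", "on", "the", "river"],
    ["the", "boat", "floats", "on", "the", "river"],
    ["the", "house", "stands", "on", "the", "hill"],
    ["a", "house_boat", "is", "a", "boat", "with", "rooms"],
]

counts, index = cooccurrence_matrix(sentences, window=2)
vectors = lsa_vectors(counts, k=2)
scores = decomposability("house_boat", ["house", "boat"], vectors, index)
print(scores)
```

Under the abstract's claim, a higher score for a constituent would be read as that constituent contributing more transparently to the meaning of the whole expression. On a corpus this small the numbers carry no linguistic weight; the sketch only shows the shape of the computation.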