Conference Proceedings
A Plethora of Methods for Learning English Countability
T Baldwin, F Bond
Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP 2003) | Association for Computational Linguistics | Published: 2003
Abstract
This paper compares a range of methods for classifying words based on linguistic diagnostics, focusing on the task of learning countabilities for English nouns. We propose two basic approaches to feature representation: distribution-based representation, which simply looks at the distribution of features in the corpus data, and agreement-based representation, which analyses the level of token-wise agreement between multiple pre-processor systems. We additionally compare a single multiclass classifier architecture with a suite of binary classifiers, and combine analyses from multiple pre-processors. Finally, we present and evaluate a feature selection method.
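The two feature representations named in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the function names, the feature labels ("sg", "pl") and the choice of a simple match ratio as the agreement measure are all assumptions made for illustration.

```python
from collections import Counter


def distribution_features(observations):
    """Distribution-based view: relative frequency of each feature value
    among the corpus occurrences of one noun (hypothetical feature labels)."""
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {feature: n / total for feature, n in counts.items()}


def tokenwise_agreement(labels_a, labels_b):
    """Agreement-based view: share of tokens on which two pre-processors
    assign the same label. Assumes both lists are aligned token by token."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must be aligned and of equal length")
    if not labels_a:
        return 0.0
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical per-token analyses of one noun from two pre-processors.
system_a = ["sg", "pl", "sg", "sg"]
system_b = ["sg", "sg", "sg", "pl"]

dist = distribution_features(system_a)
agreement = tokenwise_agreement(system_a, system_b)
```

In this sketch the distribution-based features depend on a single system's output, while the agreement score requires at least two systems run over the same tokens.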
Grants
Awarded by National Science Foundation