Conference Proceedings
Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality
JH Lau, D Newman, TJ Baldwin
ACL Anthology | Published: 2014
DOI: 10.3115/v1/e14-1056
Abstract
Topic models based on latent Dirichlet allocation and related methods are used in a range of user-focused tasks including document navigation and trend analysis, but evaluation of the intrinsic quality of the topic model and topics remains an open research area. In this work, we explore the two tasks of automatic evaluation of single topics and automatic evaluation of whole topic models, and provide recommendations on the best strategy for performing the two tasks, in addition to providing an open-source toolkit for topic and topic model evaluation. © 2014 Association for Computational Linguistics.