Conference Proceedings

Huntsville, hospitals, and hockey teams: Names can reveal your location

B Salehi, D Hovy, E Hovy, A Søgaard

Proceedings of the 3rd Workshop on Noisy User-generated Text | Association for Computational Linguistics | Published: 2017


Geolocation is the task of identifying a social media user’s primary location, and a growing body of work in natural language processing examines to what extent automated analysis of social media posts can help. However, not all content features are equally revealing of a user’s location. In this paper, we evaluate nine named entity (NE) types. Using various metrics, we find that GEO-LOC, FACILITY and SPORT-TEAM are more informative for geolocation than other NE types. Using these types, we improve geolocation accuracy and reduce distance error over several well-known text-based methods.
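
To make the two evaluation quantities named in the abstract concrete, here is a minimal sketch of how geolocation accuracy and distance error might be computed from predicted and true coordinates. The abstract does not define its metrics, so everything here is an assumption for illustration: the function names `haversine_km` and `distance_error_summary` are hypothetical, distance is taken as great-circle (haversine) distance, and accuracy is taken as the share of users predicted within a threshold, with 161 km used only as a placeholder default.

```python
import math
from statistics import mean, median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a standard approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def distance_error_summary(predicted, gold, threshold_km=161.0):
    """Summarise per-user distance errors (hypothetical helper).

    threshold_km is an assumed cut-off, not a value taken from the abstract.
    """
    errors = [haversine_km(p, g) for p, g in zip(predicted, gold)]
    return {
        "mean_km": mean(errors),
        "median_km": median(errors),
        "acc_within_threshold": sum(e <= threshold_km for e in errors) / len(errors),
    }
```

Calling `distance_error_summary(predicted, gold)` with two equal-length lists of `(lat, lon)` pairs returns the mean error, the median error, and the fraction of users placed within the threshold, so a method that improves accuracy and reduces distance error would raise the last value and lower the first two.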