Conference Proceedings

Detecting misflagged duplicate questions in community question-answering archives

D Hoogeveen, A Bennett, Y Li, KM Verspoor, T Baldwin

12th International AAAI Conference on Web and Social Media, ICWSM 2018 | AAAI Press | Published : 2018


Copyright © 2018, Association for the Advancement of Artificial Intelligence. All rights reserved.

In this paper we introduce the task of misflagged duplicate question detection for question pairs in community question-answer (cQA) archives and compare it to the more standard task of detecting valid duplicate questions. A misflagged duplicate is a question that has been erroneously hand-flagged by the community as a duplicate of an archived one, where the two questions are not actually the same. We find that for misflagged duplicate detection, metadata features that capture user authority, question quality, and relational data between questions outperform pure text-based met..

