Generation of Synthetic Query Auto Completion Logs
U Krishnan, A Moffat, J Zobel, B Billerbeck
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Springer | Published: 2020
Privacy concerns can prohibit research access to large-scale commercial query logs. Here we focus on generating a synthetic log from a publicly available dataset, suitable for evaluating query auto completion (QAC) systems. The synthetic log contains plausible string sequences reflecting how users enter their queries in a QAC interface. Properties that would influence experimental outcomes are compared between the synthetic log and a real QAC log through a set of side-by-side experiments, and the results confirm the applicability of the generated log for benchmarking the performance of QAC methods.