A Comparative Analysis of Linguistic and Retrieval Diversity in LLM-Generated Search Queries

O Zendel; SFD Al Lawati; L Rashidi; F Scholer; M Sanderson

Conference Proceedings

CIKM 2025 Proceedings of the 34th ACM International Conference on Information and Knowledge Management | ACM | Published: 2025

DOI: 10.1145/3746252.3761382

Abstract

Large Language Models (LLMs) are increasingly used to generate search queries for various Information Retrieval (IR) tasks. However, it remains unclear how these machine-generated queries compare to human-written ones, particularly in terms of diversity and alignment with real user behavior. This paper presents an empirical comparison of LLM- and human-generated queries across multiple dimensions, including lexical diversity, linguistic variation, and retrieval effectiveness. We analyze queries produced by several LLMs and compare them with human queries from two datasets collected five years apart. Our findings show that while LLMs can generate diverse queries, their patterns differ from th..

University of Melbourne Researchers

Lida Rashidi (Author)

Related Projects (1)

ARC Centre of Excellence in Automated Decision Making and Society (CE200100005)

Grants

Awarded by Royal Melbourne Institute of Technology

Citation metrics

Scopus: 1

Dimensions: 1

Keywords

46 Information and Computing Sciences

4609 Information Systems

4605 Data Management and Data Science