Conference Proceedings
Gaze and Speech in Multimodal Human-Computer Interaction: A Scoping Review
Anam Ahmad Khan, Florian Weidner, Jungwoo Rhee, Yasmeen Abdrabou, Andrea Bianchi, Eduardo Velloso, Hans Gellersen, Joshua Newn
Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems | ACM | Published : 2026
Abstract
Multimodal interaction has long promised to make interfaces more intuitive and effective by combining complementary inputs. Among these, gaze and speech form a compelling pairing: gaze provides rapid spatial grounding, while speech conveys rich semantic information. Together, they offer strong cues for understanding user behaviour and intent. Yet despite decades of exploration, the research remains fragmented, making this synthesis timely as these inputs mature and are integrated into consumer-ready devices. This scoping review examined 103 studies published between 1991 and 2025, organised into explicit, where users intentionally provide gaze and speech, and implicit, where systems leverage u…
Grants
Awarded by IITP (Institute of Information & Communications Technology Planning & Evaluation)–ITRC (Information Technology Research Center)