Conference Proceedings
Integrating Gaze and Speech for Enabling Implicit Interactions
AA Khan, J Newn, J Bailey, E Velloso
Conference on Human Factors in Computing Systems Proceedings | Published: 2022
Abstract
Gaze and speech are rich contextual sources of information that, when combined, can enable effective and expressive multimodal interactions. This paper proposes a machine learning-based pipeline that leverages and combines users' natural gaze activity, the semantic knowledge from their vocal utterances, and the synchronicity between gaze and speech data to facilitate users' interactions. We evaluated our proposed approach on an existing dataset in which 32 participants recorded voice notes while reading an academic paper. Using a Logistic Regression classifier, we demonstrate that our proposed multimodal approach maps voice notes to the correct text passages with an average F1-Score of 0.9.
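The pipeline described above could be sketched as follows: a Logistic Regression classifier scores (voice note, passage) pairs using combined gaze, semantic, and synchronicity features. This is a minimal illustrative sketch only; the feature names, synthetic data, and labeling rule are invented here and are not the paper's actual features or dataset.

```python
# Hypothetical sketch of a gaze + speech fusion classifier.
# Features and data are synthetic stand-ins, not the paper's own.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# One row per (voice note, candidate passage) pair:
# [gaze dwell time on passage, utterance-passage semantic similarity,
#  gaze-speech synchronicity score] -- all assumed features.
n = 400
X = rng.random((n, 3))
# Toy label: the passage is the note's target when semantic similarity
# and synchronicity are jointly high (invented rule for the demo).
y = (X[:, 1] + X[:, 2] > 1.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print(f"F1: {f1_score(y_te, clf.predict(X_te)):.2f}")
```

At inference time, the highest-scoring passage for each voice note would be selected as its mapped target.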
Grants
Awarded by Australian Government
Funding Acknowledgements
We wish to thank Bayu Trisedya for the useful discussion on this work. Eduardo Velloso is the recipient of an Australian Research Council Discovery Early Career Researcher Award (Project Number: DE180100315) funded by the Australian Government. Anam Ahmad Khan is supported under the Melbourne Graduate Research Scholarship.