Conference Proceedings

Anlirika: An LSTM–CNN Flow Twister for Spoken Language Identification

Andreas Scherbakov, Liam Whittle, Ritesh Kumar, Siddharth Singh, Matthew Coleman, Ekaterina Vylomova

Proceedings of the Third Workshop on Computational Typology and Multilingual NLP | Association for Computational Linguistics | Published: 2021

Abstract

The paper presents Anlirika's submission to the SIGTYP 2021 Shared Task on Robust Spoken Language Identification. The task aims at building a robust system that generalizes well across different domains and speakers. The training data is limited to a single domain, with predominantly a single speaker per language, while the validation and test samples are drawn from diverse datasets and multiple speakers. We experiment with a neural system that combines dense, convolutional, and recurrent layers designed to generalize better and to obtain speaker-invariant representations. We demonstrate that the task in its constrained form (without making use of external ..
