Conference Proceedings
Anlirika: An LSTM–CNN Flow Twister for Spoken Language Identification
Andreas Scherbakov, Liam Whittle, Ritesh Kumar, Siddharth Singh, Matthew Coleman, Ekaterina Vylomova
Proceedings of the Third Workshop on Computational Typology and Multilingual NLP | Association for Computational Linguistics | Published: 2021
Abstract
The paper presents Anlirika's submission to the SIGTYP 2021 Shared Task on Robust Spoken Language Identification. The task aims at building a robust system that generalizes well across different domains and speakers. The training data is limited to a single domain, with predominantly a single speaker per language, while the validation and test samples are derived from diverse datasets and multiple speakers. We experiment with a neural system that combines dense, convolutional, and recurrent layers designed to generalize better and to obtain speaker-invariant representations. We demonstrate that the task in its constrained form (without making use of external ..