Conference Proceedings
Learning a Lexicon and Translation Model from Phoneme Lattices
O Adams, G Neubig, T Cohn, S Bird, QT Do, S Nakamura
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP) Short papers | Association for Computational Linguistics | Published: 2016
DOI: 10.18653/v1/d16-1263
Abstract
Language documentation begins by gathering speech. Manual or automatic transcription at the word level is typically not possible because of the absence of an orthography or prior lexicon, and though manual phonemic transcription is possible, it is prohibitively slow. On the other hand, translations of the minority language into a major language are more easily acquired. We propose a method to harness such translations to improve automatic phoneme recognition. The method assumes no prior lexicon or translation model, instead learning them from phoneme lattices and translations of the speech being transcribed. Experiments demonstrate phoneme error rate improvements against two baselines and th..