Detail
Article: Maximum Entropy Language Modeling for Russian ASR
By: Shin, Evgeniy; Stüker, Sebastian; Kilgour, Kevin; Fügen, Christian; Waibel, Alex
Type: Article from Proceedings
In collection: Proceedings of the 10th International Workshop on Spoken Language Translation (IWSLT 2013), Heidelberg, Germany: Dec. 5-6, 2013
Fulltext: Maximum Entropy Language.pdf (11.79MB)
Abstract: Russian is a challenging language for automatic speech recognition systems due to its rich morphology. This rich morphology stems from Russian's highly inflectional nature and the frequent use of prefixes and suffixes. Russian also has a very free word order, and changes in word order are used to convey connotations of a sentence. These phenomena are difficult for traditional n-gram models to handle. In this paper we therefore investigate a maximum entropy language model for Russian whose features are specifically designed to deal with Russian's inflections as well as its loose word order. We combine this with a subword-based language model in order to alleviate the problem of the large vocabulary sizes necessary for dealing with highly inflecting languages. Applying the maximum entropy language model during re-scoring improves the word error rate of our recognition system by 1.2% absolute, while the use of the subword-based language model reduces the vocabulary size from 120k to 40k and the OOV rate from 4.8% to 2.1%.
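The abstract's core technique, maximum entropy language modeling, scores a word given its history via weighted feature functions normalized over the vocabulary: P(w | h) = exp(Σᵢ λᵢ fᵢ(h, w)) / Z(h). The sketch below is a minimal illustration of that scoring rule only; the feature functions, weights, and toy vocabulary are invented for illustration and are not the paper's actual feature set.

```python
import math

def features(history, word):
    # Illustrative feature functions (not the paper's):
    # - a bigram-style indicator on the previous word
    # - a suffix indicator, a crude stand-in for inflection-aware features
    return {
        ("prev", history[-1], word): 1.0,
        ("suffix", word[-2:]): 1.0,
    }

def maxent_prob(history, word, vocab, weights):
    """P(word | history) under a log-linear (maximum entropy) model."""
    def score(w):
        return math.exp(sum(weights.get(f, 0.0) * v
                            for f, v in features(history, w).items()))
    z = sum(score(w) for w in vocab)  # partition function Z(h)
    return score(word) / z

# Hypothetical toy vocabulary and hand-set weights, purely for demonstration.
vocab = ["дома", "дом", "книга"]
weights = {("prev", "в", "дома"): 1.5, ("suffix", "га"): 0.3}
p = maxent_prob(["в"], "дома", vocab, weights)
```

Because the model normalizes over the vocabulary, the probabilities of all candidate words sum to one, and words whose active features carry positive weight are preferred; in a real system the weights λᵢ would be learned from training data rather than set by hand.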