Abstract
This paper describes a study of factored language models (FLMs) for Russian speech recognition. The FLM was applied at the N-best list rescoring stage, and its parameters were optimized with a genetic algorithm. The best models used four factors: lemma, morphological tag, stem, and word. Experiments on large-vocabulary continuous Russian speech recognition showed a relative word error rate (WER) reduction of 8% when the FLM was interpolated with the baseline 3-gram model.
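The rescoring step summarized in the abstract can be illustrated with a minimal sketch. It assumes linear interpolation of the two language-model probabilities and applies it to whole-hypothesis scores for brevity; practical systems usually interpolate per word. The function names, the dictionary keys, and the weights `lam` and `lm_weight` are hypothetical and are not taken from the paper.

```python
import math


def interpolate_logprob(logp_flm, logp_ngram, lam):
    """Log of lam * P_flm + (1 - lam) * P_ngram, computed from natural-log inputs.

    Assumes 0 < lam < 1. Uses the log-sum-exp trick to avoid underflow.
    """
    a = math.log(lam) + logp_flm
    b = math.log(1.0 - lam) + logp_ngram
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def rescore_nbest(hypotheses, lam=0.5, lm_weight=1.0):
    """Return the N-best hypotheses sorted best-first by combined score.

    Each hypothesis is a dict with hypothetical keys:
    'text', 'acoustic', 'flm', 'ngram' (the last three are log scores).
    """
    def total(h):
        lm = interpolate_logprob(h["flm"], h["ngram"], lam)
        return h["acoustic"] + lm_weight * lm

    return sorted(hypotheses, key=total, reverse=True)


def relative_wer_reduction(wer_baseline, wer_new):
    """Relative WER reduction as a fraction of the baseline WER."""
    return (wer_baseline - wer_new) / wer_baseline
```

In this sketch, the first element of the list returned by `rescore_nbest` is the hypothesis selected after rescoring, and `lam` controls how much weight the FLM receives relative to the 3-gram model.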
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kipyatkova, I., Karpov, A. (2014). Study of Morphological Factors of Factored Language Models for Russian ASR. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_56
DOI: https://doi.org/10.1007/978-3-319-11581-8_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11580-1
Online ISBN: 978-3-319-11581-8
eBook Packages: Computer Science (R0)