Abstract
This article describes a connectionist language model, which may be used as an alternative to the well-known n-gram models. A comparison experiment between n-gram and connectionist language models is performed on a Polish text corpus. Statistical language modeling is based on estimating the joint probability function of word sequences in a given language. This task is made difficult by a phenomenon commonly known as the “curse of dimensionality”: the word sequences used to test the model are most likely different from anything present in the training data. Classic solutions to this problem use n-grams, which obtain generalization by concatenating short overlapping word sequences seen in the training data. Connectionist models, on the other hand, accomplish this by learning a distributed representation for words. They simultaneously learn both the distributed representation of each word in the dictionary and the synaptic weights used to model the joint probability of word sequences. Generalization is obtained because a previously unseen word sequence receives a higher probability if it is composed of words similar to those forming sequences already seen in training. In the experiments, perplexity is used as the measure of language model quality.
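To make the mechanism concrete, the sketch below (not taken from the chapter; the vocabulary size, layer sizes and toy corpus are illustrative assumptions) shows a Bengio-style feed-forward connectionist language model in Python: each context word is mapped to a learned distributed representation, the concatenated representations feed a tanh hidden layer and a softmax output estimating P(w_t | w_{t-n+1}, ..., w_{t-1}), and perplexity is computed as the exponential of the average negative log-probability assigned to the test words.

# Minimal sketch (not the authors' implementation) of a Bengio-style
# feed-forward connectionist language model. All sizes and the toy
# corpus are assumptions chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)

V = 50        # vocabulary size (assumed)
m = 16        # dimension of the distributed word representation (assumed)
n = 3         # model order: predict a word from the previous n-1 words
h = 32        # hidden layer size (assumed)

# Parameters learned jointly: the word representations C and the weights.
C = rng.normal(0, 0.1, (V, m))            # distributed word representations
H = rng.normal(0, 0.1, ((n - 1) * m, h))  # input-to-hidden weights
U = rng.normal(0, 0.1, (h, V))            # hidden-to-output weights
b = np.zeros(V)

def predict(context_ids):
    """Return P(next word | context) for a context of n-1 word ids."""
    x = np.concatenate([C[i] for i in context_ids])   # look up representations
    a = np.tanh(x @ H)                                 # hidden activations
    logits = a @ U + b
    e = np.exp(logits - logits.max())                  # numerically stable softmax
    return e / e.sum()

def perplexity(token_ids):
    """Perplexity = exp of the average negative log-probability per word."""
    nll, count = 0.0, 0
    for t in range(n - 1, len(token_ids)):
        p = predict(token_ids[t - n + 1:t])
        nll -= np.log(p[token_ids[t]])
        count += 1
    return float(np.exp(nll / count))

# Toy usage with random, untrained parameters: the resulting perplexity
# should be close to the vocabulary size V, i.e. no better than guessing.
corpus = rng.integers(0, V, size=200)
print(perplexity(corpus))

In an actual experiment the representations C and the weights H, U, b would be trained by gradient descent to minimise the same negative log-likelihood that perplexity measures; the untrained model above only illustrates the data flow and the evaluation metric.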
Copyright information
© 2012 Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
Brocki, Ł., Marasek, K., Koržinek, D. (2012). Connectionist Language Model for Polish. In: Bembenik, R., Skonieczny, L., Rybiński, H., Niezgodka, M. (eds) Intelligent Tools for Building a Scientific Information Platform. Studies in Computational Intelligence, vol 390. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24809-2_15
DOI: https://doi.org/10.1007/978-3-642-24809-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24808-5
Online ISBN: 978-3-642-24809-2