Abstract
This article describes a connectionist language model, which may be used as an alternative to the well-known n-gram models. A comparison experiment between n-gram and connectionist language models is performed on a Polish text corpus. Statistical language modeling is based on estimating the joint probability function of word sequences in a given language. This task is made difficult by a phenomenon commonly known as the “curse of dimensionality”: the word sequences used to test the model are most likely different from anything present in the training data. Classic solutions to this problem use n-grams, which obtain generalization by concatenating short overlapping word sequences seen in the training data. Connectionist models, on the other hand, accomplish this by learning a distributed representation for words. They simultaneously learn both the distributed representation of each word in the dictionary and the synaptic weights used to model the joint probability of word sequences. Generalization is obtained because a previously unseen word sequence receives a higher probability if it is composed of words similar to those forming sequences already seen in training. In the experiments, perplexity is used as the measure of language model quality.
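To make the mechanism concrete, the sketch below (not taken from the chapter; the vocabulary size, layer sizes and toy corpus are illustrative assumptions) shows a Bengio-style feed-forward connectionist language model in Python: each context word is mapped to a learned distributed representation, the concatenated representations feed a tanh hidden layer and a softmax output estimating P(w_t | w_{t-n+1}, ..., w_{t-1}), and perplexity is computed as the exponential of the average negative log-probability assigned to the test words.

# Minimal sketch (not the authors' implementation) of a Bengio-style
# feed-forward connectionist language model. All sizes and the toy
# corpus are assumptions chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)

V = 50        # vocabulary size (assumed)
m = 16        # dimension of the distributed word representation (assumed)
n = 3         # model order: predict a word from the previous n-1 words
h = 32        # hidden layer size (assumed)

# Parameters learned jointly: the word representations C and the weights.
C = rng.normal(0, 0.1, (V, m))            # distributed word representations
H = rng.normal(0, 0.1, ((n - 1) * m, h))  # input-to-hidden weights
U = rng.normal(0, 0.1, (h, V))            # hidden-to-output weights
b = np.zeros(V)

def predict(context_ids):
    """Return P(next word | context) for a context of n-1 word ids."""
    x = np.concatenate([C[i] for i in context_ids])   # look up representations
    a = np.tanh(x @ H)                                 # hidden activations
    logits = a @ U + b
    e = np.exp(logits - logits.max())                  # numerically stable softmax
    return e / e.sum()

def perplexity(token_ids):
    """Perplexity = exp of the average negative log-probability per word."""
    nll, count = 0.0, 0
    for t in range(n - 1, len(token_ids)):
        p = predict(token_ids[t - n + 1:t])
        nll -= np.log(p[token_ids[t]])
        count += 1
    return float(np.exp(nll / count))

# Toy usage with random, untrained parameters: the resulting perplexity
# should be close to the vocabulary size V, i.e. no better than guessing.
corpus = rng.integers(0, V, size=200)
print(perplexity(corpus))

In an actual experiment the representations C and the weights H, U, b would be trained by gradient descent to minimise the same negative log-likelihood that perplexity measures; the untrained model above only illustrates the data flow and the evaluation metric.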
Copyright information
© 2012 Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
Brocki, Ł., Marasek, K., Koržinek, D. (2012). Connectionist Language Model for Polish. In: Bembenik, R., Skonieczny, L., Rybiński, H., Niezgodka, M. (eds) Intelligent Tools for Building a Scientific Information Platform. Studies in Computational Intelligence, vol 390. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24809-2_15
DOI: https://doi.org/10.1007/978-3-642-24809-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24808-5
Online ISBN: 978-3-642-24809-2