Abstract
This paper studies discrete-time recurrent neural networks for predicting the next symbol in a sequence. The focus is on online prediction, a task considerably harder than classical offline grammatical inference with neural networks. The results show that recurrent networks working online perform acceptably when the sequences come from finite-state machines or even from some chaotic sources. When predicting texts in human language, however, the dynamics appear to be too complex to be learned correctly in real time by the network. Two algorithms are considered for network training: real-time recurrent learning and the decoupled extended Kalman filter.
Work supported by the Generalitat Valenciana through grant FPI-99-14-268 and the Spanish Comisión Interministerial de Ciencia y Tecnología through grant TIC97-0941. This paper completes the results in [8].
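As a concrete illustration of the online prediction protocol the abstract describes, the sketch below trains a small Elman-style network to predict the next symbol of a sequence, updating the weights after every symbol. It is a minimal sketch, not the authors' method: it substitutes a one-step truncated gradient for the full RTRL and DEKF algorithms of the paper, and the network sizes, learning rate, and toy two-symbol sequence are all illustrative assumptions.

```
# Minimal sketch of online next-symbol prediction with a simple
# recurrent network. The paper trains with RTRL and DEKF; this sketch
# uses a one-step (truncated) gradient update after every symbol,
# which follows the online protocol but not the full algorithms.
# All names, sizes and hyperparameters are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

alphabet = "ab"                      # symbol inventory (assumption)
V, H = len(alphabet), 8              # alphabet size, hidden units
lr = 0.1                             # learning rate (assumption)

# Elman-style parameters: input->hidden, hidden->hidden, hidden->output.
Wxh = rng.normal(0, 0.1, (H, V))
Whh = rng.normal(0, 0.1, (H, H))
Why = rng.normal(0, 0.1, (V, H))
bh, by = np.zeros(H), np.zeros(V)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

h = np.zeros(H)                      # recurrent state
correct, total = 0, 0

# Example sequence from a trivial two-state machine: alternation.
sequence = "abababababababab"

for t in range(len(sequence) - 1):
    x = np.zeros(V)
    x[alphabet.index(sequence[t])] = 1.0
    target = alphabet.index(sequence[t + 1])

    # Forward pass: new state and next-symbol distribution.
    h_prev = h
    h = np.tanh(Wxh @ x + Whh @ h_prev + bh)
    p = softmax(Why @ h + by)

    # Score the prediction before updating (online protocol).
    correct += int(p.argmax() == target)
    total += 1

    # One-step gradient of the cross-entropy loss. Truncated: the
    # dependence of h_prev on the weights is ignored, unlike RTRL.
    dy = p.copy()
    dy[target] -= 1.0                         # dL/d(logits)
    dh = (Why.T @ dy) * (1.0 - h * h)         # back through tanh

    Why -= lr * np.outer(dy, h);  by -= lr * dy
    Wxh -= lr * np.outer(dh, x);  bh -= lr * dh
    Whh -= lr * np.outer(dh, h_prev)

print(f"online accuracy: {correct / total:.2f}")
```

Because each prediction is scored before the corresponding weight update, the reported accuracy measures genuine online performance. RTRL would additionally propagate the sensitivity of the hidden state to the weights across time steps, and DEKF would replace the gradient step with a Kalman-filter weight update.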
References
Bengio, Y., P. Simard and P. Frasconi (1994), “Learning long-term dependencies with gradient descent is difficult”, IEEE Transactions on Neural Networks, 5(2).
Carrasco, R. C. et al. (2000), “Stable encoding of finite-state machines in discrete-time recurrent neural nets with sigmoid units”, Neural Computation, 12(9).
Cleeremans, A., D. Servan-Schreiber and J. L. McClelland (1989), “Finite state automata and simple recurrent networks”, Neural Computation, 1(3), 372–381.
Elman, J. L. (1990), “Finding structure in time”, Cognitive Science, 14, 179–211.
Haykin, S. (1999), Neural networks: a comprehensive foundation, Chapter 15, Prentice-Hall, New Jersey, 2nd edition.
Hochreiter, J. (1991), Untersuchungen zu dynamischen neuronalen Netzen, Diploma thesis, Institut für Informatik, Technische Universität München.
Nelson, M. (1991), “Arithmetic coding + statistical modeling = data compression”, Dr. Dobb’s Journal, Feb. 1991.
Pérez-Ortiz, J. A., J. Calera-Rubio and M. L. Forcada (2001), “Online text prediction with recurrent neural networks”, Neural Processing Letters, in press.
Puskorius, G. V. and L. A. Feldkamp (1991), “Decoupled extended Kalman filter training of feedforward layered networks”, in International Joint Conference on Neural Networks, volume 1, pp. 771–777.
Robinson, A. J. and F. Fallside (1991), “A recurrent error propagation speech recognition system”, Computer Speech and Language, 5, 259–274.
Schmidhuber, J. and S. Heil (1996), “Sequential neural text compression”, IEEE Transactions on Neural Networks, 7(1), 142–146.
Smith, A. W. and D. Zipser (1989), “Learning sequential structures with the real-time recurrent learning algorithm”, International Journal of Neural Systems, 1(2).
Tiňo, P. and M. Köteles (1999), “Extracting finite-state representations from recurrent neural networks trained on chaotic symbolic sequences”, IEEE Transactions on Neural Networks, 10(2), 284–302.
Williams, R. J. and D. Zipser (1989), “A learning algorithm for continually running fully recurrent neural networks”, Neural Computation, 1, 270–280.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Pérez-Ortiz, J.A., Calera-Rubio, J., Forcada, M.L. (2001). Online Symbolic-Sequence Prediction with Discrete-Time Recurrent Neural Networks. In: Dorffner, G., Bischof, H., Hornik, K. (eds) Artificial Neural Networks — ICANN 2001. ICANN 2001. Lecture Notes in Computer Science, vol 2130. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44668-0_100
DOI: https://doi.org/10.1007/3-540-44668-0_100
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42486-4
Online ISBN: 978-3-540-44668-2