Summary
Artificial Intelligence (AI) has recently become a real formal science: the new millennium brought the first mathematically sound, asymptotically optimal, universal problem solvers, providing a new, rigorous foundation for the previously largely heuristic field of general AI and embedded agents. At the same time there has been rapid progress in practical methods for learning true sequence-processing programs, as opposed to traditional methods limited to stationary pattern association. Here we will briefly review some of the new results and speculate about future developments, pointing out that the time intervals between the most notable events in over 40,000 years or 2^9 lifetimes of human history have sped up exponentially, apparently converging to zero within the next few decades. Or is this impression just a by-product of the way humans allocate memory space to past events?
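To make the convergence claim concrete, consider a minimal worked example (an illustration under our own assumptions, not a calculation from the chapter): suppose the interval between consecutive notable events shrinks geometrically, so that the n-th interval is $\Delta_n = c\,\lambda^n$ for some constant $c > 0$ and decay factor $0 < \lambda < 1$. The total time spanned by all remaining intervals is then a convergent geometric series:

\[
\sum_{n=0}^{\infty} \Delta_n \;=\; c \sum_{n=0}^{\infty} \lambda^n \;=\; \frac{c}{1-\lambda} \;<\; \infty,
\]

so the event times $t_N = t_0 + \sum_{n<N} \Delta_n$ accumulate at the finite date $t_0 + c/(1-\lambda)$. Exponentially shrinking intervals do not merely get shorter; they pile up toward a definite point in time, which is the formal content of the convergence intuition sketched above. Whether recorded history actually follows such a schedule, or whether the apparent pattern reflects how memory allocates space to the recent past, is exactly the question the summary leaves open.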
Keywords
- Neural Network
- Recurrent Neural Network
- Neural Information Processing System
- Input Stream
- Neural Computation
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
Cite this chapter
Schmidhuber, J. (2007). New Millennium AI and the Convergence of History. In: Duch, W., Mańdziuk, J. (eds) Challenges for Computational Intelligence. Studies in Computational Intelligence, vol 63. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71984-7_2
DOI: https://doi.org/10.1007/978-3-540-71984-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71983-0
Online ISBN: 978-3-540-71984-7