Maximizing Learning Progress: An Internal Reward System for Development

  • Chapter in the book Embodied Artificial Intelligence

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3139)

Abstract

This chapter presents a generic internal reward system that drives an agent to increase the complexity of its behavior. This reward system does not reinforce a predefined task. Its purpose is to drive the agent to progress in learning given its embodiment and the environment in which it is placed. The dynamics created by such a system are studied first in a simple environment and then in the context of active vision.
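The mechanism sketched in the abstract — rewarding the agent for *progress in learning* rather than for completing a predefined task — can be illustrated with a short sketch. This is not the authors' implementation; the sliding-window size and the use of the decrease in average prediction error as the reward signal are assumptions made here for illustration only.

```python
class LearningProgressReward:
    """Intrinsic reward equal to the recent decrease in prediction error.

    The agent is rewarded not for succeeding at an external task but for
    getting better at predicting the consequences of its actions: the
    reward is the drop in mean prediction error between two consecutive
    windows of experience.
    """

    def __init__(self, window: int = 5):
        self.window = window
        self.errors: list[float] = []

    def reward(self, prediction_error: float) -> float:
        """Record the latest prediction error and return the intrinsic reward."""
        self.errors.append(prediction_error)
        if len(self.errors) < 2 * self.window:
            return 0.0  # not enough history yet to estimate progress
        recent = self.errors[-self.window:]
        older = self.errors[-2 * self.window:-self.window]
        # Positive when errors are shrinking (learning progress),
        # negative when they are growing, zero when nothing changes.
        return sum(older) / self.window - sum(recent) / self.window
```

An agent maximizing such a reward is pushed toward situations that are neither already predictable (error near zero, no progress left) nor unlearnable (error stays high, again no progress) — the regime in which the dynamics studied in the chapter arise.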


Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kaplan, F., Oudeyer, P.-Y. (2004). Maximizing Learning Progress: An Internal Reward System for Development. In: Iida, F., Pfeifer, R., Steels, L., Kuniyoshi, Y. (eds.) Embodied Artificial Intelligence. Lecture Notes in Computer Science, vol. 3139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27833-7_19

  • DOI: https://doi.org/10.1007/978-3-540-27833-7_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22484-6

  • Online ISBN: 978-3-540-27833-7
