Abstract
The reward signal determines the agent’s behavior and is therefore a crucial element of the reinforcement learning (RL) paradigm. Nevertheless, mainstream RL research in recent years has concentrated on developing and analyzing learning algorithms, treating the reward signal as given and not subject to change. Now that the learning algorithms have matured, it is time to revisit the question of reward function design. This chapter reviews the history of reward function design, highlighting its links to the behavioral sciences and evolution, and surveys the most recent developments in RL. It analyzes and compares reward shaping, sparse and dense rewards, intrinsic motivation, curiosity, and a number of other approaches.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Eschmann, J. (2021). Reward Function Design in Reinforcement Learning. In: Belousov, B., Abdulsamad, H., Klink, P., Parisi, S., Peters, J. (eds) Reinforcement Learning Algorithms: Analysis and Applications. Studies in Computational Intelligence, vol 883. Springer, Cham. https://doi.org/10.1007/978-3-030-41188-6_3
DOI: https://doi.org/10.1007/978-3-030-41188-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41187-9
Online ISBN: 978-3-030-41188-6
eBook Packages: Intelligent Technologies and Robotics (R0)