A Comparative Study of Model-Free Reinforcement Learning Approaches

Conference paper in Advanced Machine Learning Technologies and Applications (AMLTA 2020)

Abstract

This study explores and compares three model-free reinforcement learning methods, namely deep Q-networks (DQN), dueling deep Q-networks (DDQN), and state-action-reward-state-action (SARSA), while detailing the mathematical principles behind each method. These methods were chosen to bring out the contrast between off-policy (DQN) and on-policy (SARSA) learners; DDQN was included because it is a modification of DQN. The methods were compared on their performance on the classic CartPole problem. Post-training test results for each of the models were as follows: DQN obtained an average per-episode reward of 496.36; its variant and improvement, DDQN, obtained a perfect score of 500; and SARSA obtained a score of 438.28. To conclude, the theoretical inferences were decisively reaffirmed by observations based on descriptive plots of the training and testing results.
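
As a concrete illustration of the off-policy versus on-policy distinction described above, the following sketch shows the tabular temporal-difference updates that underlie the two families of methods. It is a minimal, hypothetical example: the learning rate, discount factor, and toy table sizes are assumed values chosen for demonstration, and the paper itself uses neural-network function approximation (DQN/DDQN) rather than a lookup table.

```python
import numpy as np

# Assumed hyperparameters for illustration only (not taken from the paper).
GAMMA = 0.99   # discount factor
ALPHA = 0.1    # learning rate


def q_learning_update(Q, s, a, r, s_next):
    """Off-policy TD update (Q-learning, the rule behind DQN's target):
    bootstraps from the greedy action in the next state."""
    target = r + GAMMA * np.max(Q[s_next])
    Q[s, a] += ALPHA * (target - Q[s, a])


def sarsa_update(Q, s, a, r, s_next, a_next):
    """On-policy TD update (SARSA): bootstraps from the action the
    behaviour policy actually takes in the next state."""
    target = r + GAMMA * Q[s_next, a_next]
    Q[s, a] += ALPHA * (target - Q[s, a])


# Toy table with 4 states and 2 actions, purely for demonstration.
Q = np.zeros((4, 2))
q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
sarsa_update(Q, s=0, a=1, r=1.0, s_next=2, a_next=0)
print(Q)
```

The only difference between the two updates is the bootstrapping term: Q-learning maximises over next-state actions regardless of which action is subsequently taken, while SARSA uses the action actually selected, which is why exploration noise flows into SARSA's value estimates but not into Q-learning's.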

Author information

Corresponding author

Correspondence to Anant Moudgalya.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Moudgalya, A., Shafi, A., Arun, B.A. (2021). A Comparative Study of Model-Free Reinforcement Learning Approaches. In: Hassanien, A., Bhatnagar, R., Darwish, A. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2020. Advances in Intelligent Systems and Computing, vol 1141. Springer, Singapore. https://doi.org/10.1007/978-981-15-3383-9_50
