Abstract
In this paper, we consider reinforcement learning in systems with an unknown environment, where the agent must trade off efficiently between exploration (long-term optimization) and exploitation (short-term optimization). The ε-greedy algorithm uses a near-greedy action selection rule: it behaves greedily (exploitation) most of the time, but with a small probability ε it instead selects an action at random (exploration). Many previous works have shown that such undirected random exploration is inefficient at driving the agent towards poorly modeled states. This study therefore evaluates the role of heuristic-based exploration in reinforcement learning. We propose three methods: neighborhood-search-based exploration, simulated-annealing-based exploration, and tabu-search-based exploration. All three techniques follow the same rule: "explore the least-visited state". In simulation, these techniques are evaluated and compared on a discrete reinforcement learning task (robot navigation).
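To make the contrast concrete, below is a minimal sketch, not the authors' implementation: the tabular Q-values, visit counts, and known successor function are illustrative assumptions. It contrasts undirected ε-greedy selection with a count-directed rule in the spirit of "explore the least-visited state", which the proposed heuristics refine.

```python
import random
from collections import defaultdict

def epsilon_greedy(state, actions, q_values, epsilon=0.1):
    """Undirected epsilon-greedy: random action with probability epsilon,
    otherwise the action with the highest estimated value."""
    if random.random() < epsilon:
        return random.choice(list(actions))                       # undirected exploration
    return max(actions, key=lambda a: q_values[(state, a)])       # exploitation

def count_directed(state, actions, q_values, visit_counts, successor, epsilon=0.1):
    """Directed variant (illustrative): exploration steps move to the
    least-visited successor state instead of a uniformly random action."""
    if random.random() < epsilon:
        return min(actions, key=lambda a: visit_counts[successor(state, a)])
    return max(actions, key=lambda a: q_values[(state, a)])

# Minimal usage on a toy grid: tables start empty, successor dynamics assumed known.
q_values = defaultdict(float)
visit_counts = defaultdict(int)
successor = lambda s, a: (s[0] + a[0], s[1] + a[1])               # deterministic moves
actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
a = count_directed((0, 0), actions, q_values, visit_counts, successor)
visit_counts[successor((0, 0), a)] += 1
```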
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Anh Vien, N., Hoang Viet, N., Lee, S., Chung, T. (2007). Heuristic Search Based Exploration in Reinforcement Learning. In: Sandoval, F., Prieto, A., Cabestany, J., Graña, M. (eds) Computational and Ambient Intelligence. IWANN 2007. Lecture Notes in Computer Science, vol 4507. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73007-1_14
DOI: https://doi.org/10.1007/978-3-540-73007-1_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73006-4
Online ISBN: 978-3-540-73007-1