Learning a Move-Generator for Upper Confidence Trees

Couetoux, Adrien; Teytaud, Olivier; Doghmen, Hassen

doi:10.1007/978-3-642-35452-6_23

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 20)

Abstract

We experiment with introducing machine learning tools to improve Monte-Carlo Tree Search. More precisely, we propose using Direct Policy Search, a classical reinforcement learning paradigm, to learn the Monte-Carlo move generator. We test our algorithm on several forms of unit commitment problems, including a problem with both macro-level and micro-level decisions.
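
To make the idea in the abstract concrete, the following is a minimal sketch of Direct Policy Search applied to a parametric move generator, that is, the policy used to play out Monte-Carlo simulations. Everything specific here is an assumption made for illustration and is not taken from the chapter: the toy single-stock problem, the quadratic cost, the affine form of `move_generator`, and the elitist random-perturbation search in `direct_policy_search`. The search reuses a fixed set of random seeds so that candidate parameters are compared on the same simulated demands.

```python
import random

HORIZON = 10          # number of time steps in one simulation (assumed)
INITIAL_STOCK = 5.0   # energy stock available at the start (assumed)


def move_generator(theta, stock):
    """Hypothetical parametric move generator.

    Proposes a release that is affine in the current stock,
    clipped to the feasible range [0, stock].
    """
    release = theta[0] + theta[1] * stock
    return min(max(release, 0.0), stock)


def rollout_cost(theta, seed):
    """Simulate one episode with the move generator and return its cost."""
    rng = random.Random(seed)
    stock, cost = INITIAL_STOCK, 0.0
    for _ in range(HORIZON):
        demand = rng.uniform(0.0, 2.0)
        release = move_generator(theta, stock)
        stock -= release
        cost += (demand - release) ** 2  # penalise mismatch with demand
    return cost


def average_cost(theta, seeds):
    """Average rollout cost over a fixed list of seeds."""
    return sum(rollout_cost(theta, s) for s in seeds) / len(seeds)


def direct_policy_search(theta0, iterations=200, sigma=0.3, seed=0):
    """Elitist random search over the move-generator parameters.

    A Gaussian perturbation of the current best parameters is kept
    only if it does not increase the average simulated cost.
    """
    rng = random.Random(seed)
    seeds = list(range(30))  # same scenarios for every candidate
    best, best_cost = list(theta0), average_cost(theta0, seeds)
    for _ in range(iterations):
        candidate = [t + sigma * rng.gauss(0.0, 1.0) for t in best]
        cost = average_cost(candidate, seeds)
        if cost <= best_cost:
            best, best_cost = candidate, cost
    return best, best_cost
```

Calling `direct_policy_search([0.0, 0.0])` returns the tuned parameters and their average simulated cost. In a tree search, the tuned `move_generator` would then stand in for a uniform random policy whenever a simulation leaves the part of the tree already built; that integration is not shown here.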

Author information

Authors and Affiliations

TAO-INRIA, LRI, CNRS UMR 8623, Université Paris-Sud, Orsay, France
Adrien Couetoux, Olivier Teytaud & Hassen Doghmen
OASE Lab, National University of Tainan, Tainan, Taiwan
Adrien Couetoux & Olivier Teytaud
Montefiore Institute, Université de Liège, Liège, Belgium
Olivier Teytaud

Editor information

Editors and Affiliations

Department of Computer Science and, National Dong Hwa University, #1, Sec. 2, Da Hsueh Rd., Hualien, Taiwan R.O.C.
Ruay-Shiung Chang
Adjunct Professor, University of Canberra, Kirinari St. Mail Room, ACT, 2601, Australia
Lakhmi C. Jain
Department of Computer Science and, National Dong Hwa University, #1, Sec. 2, Da Hsueh Rd., Hualien, Taiwan R.O.C.
Sheng-Lung Peng

Cite this paper

Couetoux, A., Teytaud, O., Doghmen, H. (2013). Learning a Move-Generator for Upper Confidence Trees. In: Chang, RS., Jain, L., Peng, SL. (eds) Advances in Intelligent Systems and Applications - Volume 1. Smart Innovation, Systems and Technologies, vol 20. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35452-6_23

DOI: https://doi.org/10.1007/978-3-642-35452-6_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35451-9
Online ISBN: 978-3-642-35452-6
eBook Packages: Engineering, Engineering (R0)
