Abstract
AlphaZero and its extension MuZero are computer programs that use machine-learning techniques to play at a superhuman level in chess, go, and a few other games. They achieved this level of play solely with reinforcement learning from self-play, without any domain knowledge except the game rules. It is a natural idea to adapt the methods and techniques used in AlphaZero for solving search problems such as the Boolean satisfiability problem (in its search version). Given a search problem, how to represent it for an AlphaZero-inspired solver? What are the “rules of solving” for this search problem? We describe possible representations in terms of easy-instance solvers and self-reductions, and we give examples of such representations for the satisfiability problem. We also describe a version of Monte Carlo tree search adapted for search problems.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Biere, A., Heule, M., van Maaren, H., Walsh, T. (Eds.): Handbook of Satisfiability, 2nd edn. IOS Press (2021)
Buss, S., Nordström, J.: Proof complexity and SAT solving. In: Handbook of Satisfiability, 2nd edn., vol. 336, pp. 233–350. IOS Press (2021)
Chang, H.S., Fu, M.C., Hu, J., Marcus, S.I.: An adaptive sampling algorithm for solving Markov decision processes. Oper. Res. 53(1), 126–139 (2005)
Hoos, H.H., Hutter, F., Leyton-Brown, K.: Automated configuration and selection of SAT solvers. In: Handbook of Satisfiability. Volume 336 of Frontiers in Artificial Intelligence and Applications, 2nd edn., pp. 481–507. IOS Press (2021)
Kerschke, P., Hoos, H.H., Neumann, F., Trautmann, H.: Automated algorithm selection: survey and perspectives. Evol. Comput. 27(1), 3–45 (2019)
Mazyavkina, N., Sviridov, S., Ivanov, S., Burnaev, E.: Reinforcement learning for combinatorial optimization: a survey. Comput. Oper. Res. 134, 105400 (2021)
Narvekar, S., Peng, B., Leonetti, M., Sinapov, J., Taylor, M.E., Stone, P.: Curriculum learning for reinforcement learning domains: a framework and survey. J. Mach. Learn. Res. 21, 181:1–181:50 (2020)
Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., Lillicrap, T.P., Silver, D.: Mastering atari, go, chess and shogi by planning with a learned model. Nature 588, 604–609 (2020)
Selsam, D., Lamm, M., Bünz, B., Liang, P., de Moura, L., Dill, D.L.: Learning a SAT solver from single-bit supervision. In: Proceedings of the 7th International Conference on Learning Representations, ICLR 2019 (2019)
Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T., Lillicrap, T., Simonyan, K., Hassabis, D.: A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362(6419), 1140–1144 (2018)
Tseitin, G.S.: On the complexity of derivation in propositional calculus. Zapiski Nauchnykh Seminarov LOMI 8, 234–259 (1968). In Russian. Reprinted in: Siekmann, J., Wrightson, G. (eds.): Automation of Reasoning 2: Classical Papers on Computational Logic 1967–1970, pp. 466–483. Springer (1983)
Vesselinova, N., Steinert, R., Perez-Ramirez, D.F., Boman, M.: Learning combinatorial optimization on graphs: a survey with applications to networking. IEEE Access 8, 120388–120416 (2020)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dantsin, E., Kreinovich, V., Wolpert, A. (2023). An AlphaZero-Inspired Approach to Solving Search Problems. In: Ceberio, M., Kreinovich, V. (eds) Decision Making Under Uncertainty and Constraints. Studies in Systems, Decision and Control, vol 217. Springer, Cham. https://doi.org/10.1007/978-3-031-16415-6_20
Download citation
DOI: https://doi.org/10.1007/978-3-031-16415-6_20
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16414-9
Online ISBN: 978-3-031-16415-6
eBook Packages: Intelligent Technologies and RoboticsIntelligent Technologies and Robotics (R0)