Abstract
Most game programs have a large number of parameters that are crucial for their performance. Tuning these parameters by hand is rather difficult. Therefore automatic optimization algorithms in game programs are interesting research domains. However, successful applications are only known for parameters that belong to certain components (e.g., evaluation-function parameters). The SPSA (Simultaneous Perturbation Stochastic Approximation) algorithm is an attractive choice for optimizing any kind of parameters of a game program, both for its generality and its simplicity. Its disadvantage is that it can be very slow.
In this article we propose several methods to speed up SPSA, in particular, the combination with RPROP, using common random numbers, antithetic variables, and averaging. We test the resulting algorithm for tuning various types of parameters in two domains, Poker and LOA. From the experimental study, we may conclude that using SPSA is a viable approach for optimization in game programs, in particular if no good alternative exists for the types of parameters considered.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
Keywords
- Stochastic Approximation
- Perturbation Vector
- Opponent Model
- Realization Probability
- Simultaneous Perturbation Stochastic Approximation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Billings, D., Davidson, A., Shauenberg, T., Burch, N., Bowling, M., Holte, R., Schaeffer, J., Szafron, D.: Game Tree Search with Adaptation in Stochastic Imperfect Information Games. In: van den Herik, H.J., Björnsson, Y., Netanyahu, N.S. (eds.) CG 2004. LNCS, vol. 3846, pp. 21–34. Springer, Heidelberg (2006)
Björnsson, Y., Marsland, T.A.: Learning Extension Parameters in Game-Tree Search. Journal of Information Sciences 154, 95–118 (2003)
Chellapilla, K., Fogel, D.B.: Evolving Neural Networks to Play Checkers Without Expert Knowledge. IEEE Transactions on Neural Networks 10(6), 1382–1391 (1999)
Dippon, J.: Accelerated Randomized Stochastic Optimization. Annals of Statistics 31(4), 1260–1281 (2003)
Igel, C., Hüsken, M.: Empirical Evaluation of the Improved Rprop Learning Algorithm. Neurocomputing 50(C), 105–123 (2003)
Kleinman, N.L., Spall, J.C., Neiman, D.Q.: Simulation-based Optimization with Stochastic Approximation using Common Random Numbers. Management Science 45(11), 1570–1578 (1999)
Kocsis, L.: Learning Search Decisions. PhD thesis, Universiteit Maastricht, Maastricht, The Netherlands (2003)
Kocsis, L., van den Herik, H.J., Uiterwijk, J.W.H.M.: Two Learning Algorithms for Forward Pruning. ICGA Journal 26(3), 165–181 (2003)
L’Ecuyer, P., Yin, G.: Budget-dependent Convergence Rate of Stochastic Approximation. SIAM J. on Optimization 8(1), 217–247 (1998)
Levy, D.: Some Comments on Realization Probabilities and the Sex Algorithm. ICGA Journal 25(3), 167 (2002)
Riedmiller, M., Braun, H.: A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm. In: Ruspini, E.H. (ed.) Proceedings of the IEEE International Conference on Neural Networks, pp. 586–591 (1993)
Spall, J.C.: Multivariate Stochastic Approximation Using a Simultaneous Perturbation Gradient Approximation. IEEE Transactions on Automatic Control 37, 332–341 (1992)
Spall, J.C.: Adaptive Stochastic Approximation by the Simultaneous Perturbation Method. IEEE Transactions on Automatic Control 45, 1839–1853 (2000)
Tesauro, G.: Practical Issues in Temporal Difference Learning. Machine Learning 8, 257–277 (1992)
Theiler, J., Alper, J.: On the Choice of Random Directions for Stochastic Approximation Algorithms. IEEE Transactions on Automatic Control 51, 476–481 (2006)
Tsuruoka, Y., Yokoyama, D., Chikayama, T.: Game-tree Search Algorithm based on Realization Probability. ICGA Journal 25(3), 132–144 (2002)
Winands, M.H.M.: Informed Search in Complex Games. PhD thesis, Universiteit Maastricht, Maastricht, The Netherlands (2004)
Winands, M.H.M., Kocsis, L., Uiterwijk, J.W.H.M., van den Herik, H.J.: Learning in Lines of Action. In: Proceedings of BNAIC 2002, pp. 99–103 (2002)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kocsis, L., Szepesvári, C., Winands, M.H.M. (2006). RSPSA: Enhanced Parameter Optimization in Games. In: van den Herik, H.J., Hsu, SC., Hsu, Ts., Donkers, H.H.L.M.(. (eds) Advances in Computer Games. ACG 2005. Lecture Notes in Computer Science, vol 4250. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11922155_4
Download citation
DOI: https://doi.org/10.1007/11922155_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48887-3
Online ISBN: 978-3-540-48889-7
eBook Packages: Computer ScienceComputer Science (R0)