Abstract
Learning classifier systems (LCSs) are rule-based machine learning techniques designed to learn optimal decision-making policies in the form of a compact set of maximally general and accurate rules. A study of the literature reveals that most existing LCSs focus primarily on learning deterministic policies. However, a desirable policy may often be stochastic, in particular when the environment is partially observable. To fill this gap, this paper proposes a new Michigan-style LCS called Natural XCS (NXCS), based on XCS, one of the most successful accuracy-based LCSs. NXCS learns stochastic policies directly by applying a natural gradient learning technique within a policy gradient framework. Its effectiveness is compared experimentally with XCS and with one of its variants, XCSμ. Our results show that NXCS can achieve competitive performance in both deterministic and stochastic multi-step problems.
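To make the idea of a natural gradient step on a stochastic policy concrete, the following is a minimal sketch and not the NXCS algorithm itself. It assumes a hypothetical one-step problem with two actions and a single policy parameter `theta`, where action 1 is chosen with probability `sigmoid(theta)`. All names (`rewards`, `lr`, `train`, and so on) are illustrative assumptions and do not come from the paper. The sketch only shows the general principle that the ordinary policy gradient is rescaled by the inverse Fisher information of the policy.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def expected_reward(theta, rewards):
    # Stochastic policy: action 1 with probability p, action 0 otherwise.
    p = sigmoid(theta)
    return (1.0 - p) * rewards[0] + p * rewards[1]


def vanilla_gradient(theta, rewards):
    # d/d(theta) of the expected reward under the sigmoid policy.
    p = sigmoid(theta)
    return p * (1.0 - p) * (rewards[1] - rewards[0])


def fisher_information(theta):
    # E[(d log pi(a) / d theta)^2]; the score is (1 - p) for action 1
    # and -p for action 0, which simplifies to p * (1 - p).
    p = sigmoid(theta)
    return p * (1.0 - p) ** 2 + (1.0 - p) * p ** 2


def natural_gradient(theta, rewards):
    # Natural gradient = inverse Fisher information times vanilla gradient.
    return vanilla_gradient(theta, rewards) / fisher_information(theta)


def train(rewards, theta=0.0, lr=0.1, steps=50):
    # Plain gradient ascent on theta using the natural gradient direction.
    for _ in range(steps):
        theta += lr * natural_gradient(theta, rewards)
    return theta
```

In this toy setting the natural gradient reduces to the reward difference `rewards[1] - rewards[0]`, so the step size does not shrink as the policy becomes nearly deterministic, whereas the vanilla gradient vanishes there. Because the toy problem is fully observable, the learned policy drifts toward a deterministic choice; the partially observable case motivating the paper is not modelled here.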
Keywords
- Natural Gradient
- Learn Classifier System
- Deterministic Policy
- Policy Gradient
- Reinforcement Learning Problem
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, G., Zhang, M., Pang, S., Douch, C. (2014). Stochastic Decision Making in Learning Classifier Systems through a Natural Policy Gradient Method. In: Loo, C.K., Yap, K.S., Wong, K.W., Beng Jin, A.T., Huang, K. (eds) Neural Information Processing. ICONIP 2014. Lecture Notes in Computer Science, vol 8836. Springer, Cham. https://doi.org/10.1007/978-3-319-12643-2_37
DOI: https://doi.org/10.1007/978-3-319-12643-2_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12642-5
Online ISBN: 978-3-319-12643-2
eBook Packages: Computer Science (R0)