Abstract
In this paper, we are interested in optimal decisions in a partially observable universe. Our approach is to directly approximate an optimal strategic tree depending on the observation. This approximation is made by means of a parameterized probabilistic law. A particular family of Hidden Markov Models (HMMs), with input and output, is considered as a policy model. A method for optimizing the parameters of these HMMs is proposed and applied. This optimization is based on the cross-entropy (CE) method for rare-event simulation developed by Rubinstein.
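The CE method the abstract refers to iterates between sampling candidate parameters from a probabilistic law and refitting that law to the best-scoring samples. The following is a minimal, generic sketch of that loop, assuming a Gaussian sampling law and a toy objective; the function names, parameters, and objective are illustrative and not taken from the paper.

```python
import numpy as np

def cross_entropy_optimize(score, dim, n_samples=100, n_elite=10,
                           n_iters=50, seed=0):
    """Generic cross-entropy (CE) optimization loop (illustrative sketch).

    At each iteration: sample parameter vectors from a Gaussian,
    keep the elite fraction with the highest scores, and refit the
    Gaussian mean and standard deviation to those elites.
    """
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(n_iters):
        samples = rng.normal(mu, sigma, size=(n_samples, dim))
        scores = np.array([score(x) for x in samples])
        elites = samples[np.argsort(scores)[-n_elite:]]
        # Refit the sampling law to the elite samples; the small
        # constant keeps sigma strictly positive.
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mu

# Toy objective: maximize -||x - target||^2, whose optimum is `target`.
target = np.array([1.0, -2.0])
best = cross_entropy_optimize(lambda x: -np.sum((x - target) ** 2), dim=2)
```

In the paper's setting the sampled objects are HMM parameters and the score is the expected reward of the induced policy, but the elite-selection-and-refit structure of the loop is the same.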
References
Bakker, B., Schmidhuber, J.: Hierarchical reinforcement learning based on subgoal discovery and subpolicy specialization. In: Proceedings of the 8th Conference on Intelligent Autonomous Systems, pp. 438–445. Amsterdam, The Netherlands (2004)
Bellman, R.: Dynamic Programming. Princeton University Press, Princeton, New Jersey (1957)
de Boer, P.-T., Kroese, D.P., Mannor, S., Rubinstein, R.Y.: A tutorial on the cross-entropy method. http://www.cs.utwente.nl/~ptdeboer/ce/
Cassandra, A.R.: Exact and approximate algorithms for partially observable Markov decision processes. PhD thesis, Brown University, Providence, Rhode Island (1998)
Fine, S., Singer, Y., Tishby, N.: The hierarchical hidden Markov model: analysis and application. Machine Learning 32(1), 41–62 (1998)
Homem-de-Mello, T., Rubinstein, R.Y.: Rare event estimation for static models via cross-entropy and importance sampling. http://users.iems.nwu.edu/~tito/list.htm
Meuleau, N., Peshkin, L., Kim, K.-E., Kaelbling, L.P.: Learning finite-state controllers for partially observable environments. In: Proceedings of UAI-99, pp. 427–436. Stockholm (1999)
Murphy, K., Paskin, M.: Linear time inference in hierarchical HMMs. In: Proceedings of Neural Information Processing Systems, Vancouver, Canada (2001)
Rubinstein, R.Y., Kroese, D.P.: The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning. Information Science & Statistics. Springer, Berlin (2004)
Sondik, E.J.: The optimal control of partially observable Markov processes. PhD thesis, Stanford University, Stanford, California (1971)
Sutton, R.S., Barto, A.G.: Reinforcement Learning. MIT Press, Cambridge, MA (2000)
Theocharous, G.: Hierarchical learning and planning in partially observable Markov decision processes. PhD thesis, Michigan State University (2002)
Cite this article
Dambreville, F. Cross-entropic learning of a machine for the decision in a partially observable universe. J Glob Optim 37, 541–555 (2007). https://doi.org/10.1007/s10898-006-9061-9