Exponential Lower Bounds for Policy Iteration

Fearnley, John

doi:10.1007/978-3-642-14162-1_46

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6199)

Included in the following conference series:

International Colloquium on Automata, Languages, and Programming

Abstract

We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown that policy-iteration-style algorithms have exponential lower bounds in a two-player game setting. We extend these lower bounds to Markov decision processes with the total-reward and average-reward optimality criteria.
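
For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal sketch of a generic policy iteration loop, not the construction from the chapter. It makes several assumptions: a discounted criterion is used for simplicity (the chapter concerns the total-reward and average-reward criteria), every state is switched to its best action in each round (one common improvement rule), and the names `solve`, `evaluate`, `policy_iteration`, and the two-state example MDP are hypothetical. The quantity the lower bounds concern is the number of improvement rounds, returned here as `rounds`.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact Fractions."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = 1 / M[col][col]
        M[col] = [x * inv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def evaluate(mdp, policy, gamma):
    """Policy evaluation: solve (I - gamma * P_policy) v = r_policy."""
    n = len(mdp)
    A = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    b = []
    for s in range(n):
        reward, trans = mdp[s][policy[s]]
        b.append(Fraction(reward))
        for t, p in trans.items():
            A[s][t] -= gamma * p
    return solve(A, b)


def policy_iteration(mdp, gamma, policy=None):
    """Alternate evaluation and improvement until no action is switched.

    Assumption: discounted criterion with 0 < gamma < 1, and every state
    is switched to its best action in each round.
    """
    n = len(mdp)
    policy = list(policy) if policy is not None else [0] * n
    rounds = 0
    while True:
        v = evaluate(mdp, policy, gamma)

        def q(s, a):
            reward, trans = mdp[s][a]
            return reward + gamma * sum(p * v[t] for t, p in trans.items())

        new_policy = []
        for s in range(n):
            best = policy[s]
            for a in range(len(mdp[s])):
                # Switch only on strict improvement so the loop terminates.
                if q(s, a) > q(s, best):
                    best = a
            new_policy.append(best)
        if new_policy == policy:
            return policy, v, rounds
        policy = new_policy
        rounds += 1


# Hypothetical two-state MDP: mdp[state][action] = (reward, {next_state: prob}).
half = Fraction(1, 2)
mdp = [
    [(0, {0: Fraction(1)}), (1, {0: half, 1: half})],  # state 0: stay or try to leave
    [(2, {1: Fraction(1)})],                           # state 1: single self-loop
]
policy, values, rounds = policy_iteration(mdp, gamma=half)
```

Exact `Fraction` arithmetic is used so that the strict-improvement test is not affected by floating-point rounding; for larger instances a numerical linear solver with a tolerance would be the more usual choice.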

Author information

Authors and Affiliations

Department of Computer Science, University of Warwick, UK
John Fearnley

Editor information

Editors and Affiliations

Computing Laboratory, Oxford University, Wolfson Building, Parks Road, OX1 3QD, Oxford, UK
Samson Abramsky
Université de Bordeaux (LaBRI) & INRIA, 351, cours de la Libération, 33405, Talence Cedex, France
Cyril Gavoille
INRIA, Centre de Recherche Bordeaux – Sud-Ouest, 351 cours de la Libération, 33405, Talence Cedex, France
Claude Kirchner
Heinz Nixdorf Institute, University of Paderborn, Fürstenallee 11, 33102, Paderborn, Germany
Friedhelm Meyer auf der Heide
University of Patras and RACTI, 26500, Patras, Greece
Paul G. Spirakis

About this paper

Cite this paper

Fearnley, J. (2010). Exponential Lower Bounds for Policy Iteration. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds) Automata, Languages and Programming. ICALP 2010. Lecture Notes in Computer Science, vol 6199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14162-1_46

DOI: https://doi.org/10.1007/978-3-642-14162-1_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14161-4
Online ISBN: 978-3-642-14162-1
eBook Packages: Computer Science, Computer Science (R0)