Abstract
We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown policy iteration style algorithms have exponential lower bounds in a two player game setting. We extend these lower bounds to Markov decision processes with the total reward and average-reward optimality criteria.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
References
Andersson, D.: Extending Friedmann’s lower bound to the Hoffman-Karp algorithm. Preprint (June 2009)
Andersson, D., Hansen, T.D., Miltersen, P.B.: Toward better bounds on policy iteration. Preprint (June 2009)
Ash, R.B., Doléans-Dade, C.A.: Probability and Measure Theory. Academic Press, London (2000)
Friedmann, O.: A super-polynomial lower bound for the parity game strategy improvement algorithm as we know it. In: Logic in Computer Science (LICS). IEEE, Los Alamitos (2009)
Howard, R.: Dynamic Programming and Markov Processes. Technology Press and Wiley (1960)
Mansour, Y., Singh, S.P.: On the complexity of policy iteration. In: Laskey, K.B., Prade, H. (eds.) UAI 1999: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, pp. 401–408. Morgan Kaufmann, San Francisco (1999)
Melekopoglou, M., Condon, A.: On the complexity of the policy improvement algorithm for Markov decision processes. ORSA Journal on Computing 6, 188–192 (1994)
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York (1994)
Vöge, J., Jurdziński, M.: A discrete strategy improvement algorithm for solving parity games (Extended abstract). In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 202–215. Springer, Heidelberg (2000)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fearnley, J. (2010). Exponential Lower Bounds for Policy Iteration. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds) Automata, Languages and Programming. ICALP 2010. Lecture Notes in Computer Science, vol 6199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14162-1_46
Download citation
DOI: https://doi.org/10.1007/978-3-642-14162-1_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14161-4
Online ISBN: 978-3-642-14162-1
eBook Packages: Computer ScienceComputer Science (R0)