Abstract
This paper is concerned with the study of the asymptotic behavior of dynamic programming recursions of the form
where ℜ denotes a set of matrices, generated by all possible interchanges of corresponding rows, taken from a fixed finite set of nonnegative square matrices. These recursions arise in a number of well-known and frequently studied problems, e.g. in the theory of controlled Markov chains, Leontief substitution systems, controlled branching processes, etc. Results concerning the asymptotic behavior ofx(n), forn→∞, are established in terms of the maximal spectral radius, the maximal index, and a set of generalized eigenvectors. A key role in the analysis is played by a geometric convergence result for value iteration in undiscounted multichain Markov decision processes. A new proof of this result is also presented.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Bellman, R.,Dynamic Programming, Princeton University Press, Princeton, New Jersey, 1957.
Howard, R. A.,Dynamic Programming and Markov Processes, MIT Press, Cambridge, Massachusetts, 1960.
Howard, R. A., andMatheson, J. E.,Risk-Sensitive Markov Decision Processes, Management Sciences, Vol. 18, pp. 356–369, 1972.
Pliska, S. R.,Optimization of Multitype Branching Processes, Management Sciences, Vol. 23, pp. 117–124, 1976.
Burmeister, E., andDobell, R.,Mathematical Theories of Economic Growth, Macmillan, New York, New York, 1970.
Bellman, R.,On a Class of Quasi-Linear Equations, Canadian Journal of Mathematics, Vol. 8, pp. 198–202, 1956.
Zijm, W. H. M.,Nonnegative Matrices in Dynamic Programming, PhD Thesis, Eindhoven University of Technology, Eindhoven, Holland, 1982.
Pease, M. C.,Methods of Matrix Algebra, Academic Press, New York, New York, 1965.
Zijm, W. H. M.,Generalized Eigenvectors and Sets of Nonnegative Matrices, Linear Algebra and Applications, Vol. 59, pp. 91–113, 1984.
Schweitzer, P., andFedergruen, A.,Geometric Convergence of Value-Iteration in Multichain Markov Decision Problems, Advances in Applied Probability, Vol. 11, pp. 188–217, 1979.
Sladky, K.,On Dynamic Programming Recursions for Multiplicative Markov Decision Chains, Mathematical Programming Study, Vol. 6, pp. 216–226, 1976.
Sladky, K.,Successive Approximation Methods for Dynamic Programming Models, Proceedings of the 3rd Formator Symposium on Mathematical Methods for the Analysis of Large-Scale Systems, Prague, Czechoslovakia, pp. 171–189, 1979.
Sladky, K.,Bounds on Discrete Dynamic Programming Recursions, I: Models with Nonnegative Matrices, Kybernetica, Vol. 16, pp. 526–547, 1980.
Rothblum, U. G.,Sensitive Growth Analysis of Multiplicative Systems, I: The Dynamic Approach, SIAM Journal of Algebraic Discrete Mathematics, Vol. 2, pp. 25–34, 1981.
Rothblum, U. G.,Expansions of Sums of Matrix Powers, SIAM Review, Vol. 23, pp. 143–164, 1981.
Gantmacher, F. R.,The Theory of Matrices, Vol. 2 (Translated by K. A. Hirsch), Chelsea, New York, New York, 1959.
Seneta, E.,Nonnegative Matrices, Allen and Unwin, London, England, 1973.
Karlin, S.,A First Course in Stochastic Processes, Academic Press, New York, New York, 1966.
Rothblum, U. G.,Algebraic Eigenspaces of Nonnegative Matrices, Linear Algebra and Applications, Vol. 12, pp. 281–292, 1975.
Mandl, P., andSeneta, E.,The Theory of Nonnegative Matrices in a Dynamic Programming Problem, Australian Journal of Statistics, Vol. 11, pp. 85–96, 1969.
Rothblum, U. G., andWhittle, P.,Growth Optimality for Branching Markov Decision Chains, Mathematics of Operations Research, Vol. 7, pp. 582–601, 1982.
Denardo, E. V.,Contraction Mappings in the Theory Underlying Dynamic Programming, SIAM Review, Vol. 9, pp. 165–177, 1967.
Lanery, E., Etude Asymptotique des Systèmes Markoviens à Commande, Revue d'Informatique et des Recherches Operationnelles, Vol. 1, pp. 3–56, 1967.
Federgruen, A., andSchweitzer, P. J.,Discounted and Undiscounted Value Iteration in Markov Decision Processes: A Survey, Dynamic Programming and Its Applications, Edited by M. L. Puterman, Academic Press, New York, New York, pp. 23–52, 1978.
Miller, B. L., andVeinott, A. F.,Discrete Dynamic Programming with Small Interest Rate, Annals of Mathematical Statistics, Vol. 40, pp. 366–370, 1966.
Sladky, K.,On the Set of Optimal Controls for Markov Chains with Rewards, Kybernetica, Vol. 10, pp. 350–367, 1974.
Van Der Wal, J.,Stochastic Dynamic Programming, Mathematical Center, Amsterdam, Holland, 1981.
Zijm, W. H. M.,Continuous-Time Dynamic Programming Models, Proceedings of the 4th Formator Symposium on Mathematical Methods for the Analysis of Large-Scale Systems, Prague, Czechoslovakia, pp. 589–601, 1982.
Zijm, W. H. M.,Exponential Convergence in Undiscounted Continuous-Time Markov Decision Chains, CQM-Note No. 023, Philips, Eindhoven, Holland, 1984.
Zijm, W. H. M.,R-Theory for Countable Reducible Nonnegative Matrices, Stochastics, Vol. 10, pp. 243–271, 1983.
Derman, C.,Finite State Markovian Decision Processes, Academic Press, New York, New York, 1970.
Rothblum, U. G.,Multiplicative Markov Decision Chains, Mathematics of Operations Research, Vol. 9, pp. 6–24, 1984.
Denardo, E. V.,A Markov Decision Problem, Mathematical Programming, Edited by T. C. Hu and S. M. Robinson, Academic Press, New York, New York, pp. 33–68, 1973.
Author information
Authors and Affiliations
Additional information
Communicated by R. A. Howard
The author wishes to express his gratitude to Professor J. Wessels, his dissertation advisor, for his encouragement and advice during the research which led to this paper. Furthermore, he is indebted to an unknown referee who suggested many substantial improvements with respect to the organization of the paper.
Rights and permissions
About this article
Cite this article
Zijm, W.H.M. Asymptotic expansions for dynamic programming recursions with general nonnegative matrices. J Optim Theory Appl 54, 157–191 (1987). https://doi.org/10.1007/BF00940410
Issue Date:
DOI: https://doi.org/10.1007/BF00940410