Abstract
This work is concerned with controlled Markov chains with finite state and action spaces. It is assumed that the decision maker has an arbitrary but constant risk-sensitivity coefficient, and that the performance of a control policy is measured by the long-run average cost criterion. Within this framework, the existence of solutions to the corresponding risk-sensitive optimality equation for an arbitrary cost function is characterized in terms of communication properties of the transition law.
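For orientation, the optimality equation mentioned above is commonly written in the form sketched below. The notation is an assumption introduced here for illustration and does not appear on this page: S is the state space, A(x) the set of actions admissible at state x, C(x,a) the cost function, p_{xy}(a) the transition law, \lambda the risk-sensitivity coefficient, g a constant average cost, and h a relative value function.

```latex
% Assumed notation (not from the source page): S, A(x), C, p_{xy}(a), \lambda, g, h.
% Risk-sensitive case (\lambda \neq 0):
g + h(x) = \min_{a \in A(x)} \Big[ C(x,a)
    + \frac{1}{\lambda} \log \sum_{y \in S} p_{xy}(a)\, e^{\lambda h(y)} \Big],
    \qquad x \in S.
% Risk-neutral case (\lambda = 0):
g + h(x) = \min_{a \in A(x)} \Big[ C(x,a)
    + \sum_{y \in S} p_{xy}(a)\, h(y) \Big],
    \qquad x \in S.
```

Under these assumptions, multiplying the first equation by \lambda > 0 and exponentiating gives the equivalent multiplicative form e^{\lambda (g + h(x))} = \min_{a \in A(x)} e^{\lambda C(x,a)} \sum_{y \in S} p_{xy}(a) e^{\lambda h(y)}.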
Additional information
Dedicated to Professor Onésimo Hernández-Lerma, on the occasion of his sixtieth birthday.
This work was supported by the PSF Organization under Grant No. 08-04, and in part by CONACYT under Grant 25357.
About this article
Cite this article
Cavazos-Cadena, R. Solutions of the average cost optimality equation for finite Markov decision chains: risk-sensitive and risk-neutral criteria. Math Meth Oper Res 70, 541–566 (2009). https://doi.org/10.1007/s00186-008-0277-y
DOI: https://doi.org/10.1007/s00186-008-0277-y
Keywords
- Closed set
- Arrival time
- Constant average cost
- Strong simultaneous Doeblin condition
- Multiplicative optimality equation