Abstract
This note concerns controlled Markov chains on a denumerable state space. The performance of a control policy is measured by the risk-sensitive average criterion, and it is assumed that (a) the simultaneous Doeblin condition holds, and (b) the system is communicating under the action of each stationary policy. If the cost function is bounded below, it is established that the optimal average cost is characterized by an optimality inequality, and it is shown that, even for bounded costs, such an inequality may be strict at every state. Also, for a nonnegative cost function with compact support, the existence and uniqueness of bounded solutions of the optimality equation is proved, and an example is provided to show that such a conclusion generally fails when the cost is negative at some state.
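For orientation, the objects named in the abstract can be written out in the form commonly used in the risk-sensitive literature. This is only a sketch under assumed notation, which the abstract itself does not fix: S is the state space, A(x) the set of admissible actions at state x, C the cost function, p_{xy}(a) the transition law, and λ > 0 the risk-sensitivity coefficient. The exact statements and hypotheses are those of the article.

```latex
% Risk-sensitive average cost of a policy \pi starting at state x
% (notation assumed for illustration, not taken from the abstract):
\[
  J(x,\pi) \;=\; \limsup_{n\to\infty} \frac{1}{\lambda n}
  \log E_x^{\pi}\!\left[ \exp\!\Big( \lambda \sum_{t=0}^{n-1} C(X_t, A_t) \Big) \right].
\]

% Optimality equation for a constant g and a function h on S:
\[
  e^{\lambda (g + h(x))} \;=\; \inf_{a \in A(x)}
  \Big[ e^{\lambda C(x,a)} \sum_{y \in S} p_{xy}(a)\, e^{\lambda h(y)} \Big],
  \qquad x \in S.
\]

% Optimality inequality, as usually stated: equality is relaxed to
\[
  e^{\lambda (g + h(x))} \;\ge\; \inf_{a \in A(x)}
  \Big[ e^{\lambda C(x,a)} \sum_{y \in S} p_{xy}(a)\, e^{\lambda h(y)} \Big],
  \qquad x \in S.
\]
```

In this notation, the statement that the inequality "may be strict at every state" means that the left-hand side can exceed the right-hand side for all x in S simultaneously.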
Additional information
Dedicated to Professor Onésimo Hernández-Lerma, on the occasion of his 60th birthday.
This work was supported by the PSF Organisation under Grant No. 08-05(450), and in part by CONACYT under Grant 25357.
About this article
Cite this article
Cavazos-Cadena, R. Optimality equations and inequalities in a class of risk-sensitive average cost Markov decision chains. Math Meth Oper Res 71, 47–84 (2010). https://doi.org/10.1007/s00186-009-0285-6
DOI: https://doi.org/10.1007/s00186-009-0285-6
Keywords
- First arrival time
- Stopping problem with total cost index
- Relative value function
- Constant average cost
- Stochastic matrix associated with a multiplicative Poisson equation