Abstract
This paper surveys work that makes use of nonstandard Markov decision process criteria, i.e., criteria which do not seek simply to optimize the expected return per unit time or the expected discounted return. It covers infinite-horizon nondiscounted formulations, infinite-horizon discounted formulations, and finite-horizon formulations. For problem formulations stated solely in terms of the probabilities of being in each state and taking each action, policy equivalence results are given which allow policies to be restricted to the class of Markov policies or to randomizations of deterministic Markov policies. For problems which cannot be stated in such terms on the primitive state set, formulations involving a redefinition of the states are examined.
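As a concrete illustration of the quantities the abstract refers to (this sketch is not from the paper; the transition probabilities, policy, and rewards below are all hypothetical), the following code computes the finite-horizon state-action probabilities P(X_t = s, A_t = a) induced by a randomized Markov policy in a small finite MDP. Any criterion expressible purely in these probabilities, such as the mean discounted return shown at the end, is of the kind covered by the policy equivalence results mentioned above.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP: P[a, s, s'] is the probability of
# moving to state s' when action a is taken in state s.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.6, 0.4]],   # action 1
])
pi = np.array([[0.7, 0.3], [0.4, 0.6]])  # pi[s, a]: randomized Markov policy
mu0 = np.array([1.0, 0.0])               # initial state distribution

def state_action_probs(P, pi, mu0, T):
    """Return q[t, s, a] = P(X_t = s, A_t = a) for t = 0, ..., T-1."""
    n_a, n_s, _ = P.shape
    q = np.zeros((T, n_s, n_a))
    mu = mu0.copy()
    for t in range(T):
        q[t] = mu[:, None] * pi                 # joint state-action distribution at t
        mu = np.einsum('sa,ast->t', q[t], P)    # mu'[s'] = sum_{s,a} q[s,a] P[a,s,s']
    return q

T = 5
q = state_action_probs(P, pi, mu0, T)

# A criterion depending only on q: expected discounted return over the horizon.
r = np.array([[1.0, 0.0], [0.0, 2.0]])   # hypothetical rewards r[s, a]
beta = 0.9
mean_return = sum(beta**t * (q[t] * r).sum() for t in range(T))
```

Because `mean_return` is a function of the probabilities `q` alone, two policies inducing the same `q` are equivalent under this criterion, which is the sense in which attention may be restricted to (randomized) Markov policies.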
References
Markowitz, H., Portfolio Selection, Wiley, New York, New York, 1959.
Charnes, A., and Cooper, W. W., Chance Constrained Programming, Management Science, Vol. 6, pp. 73–79, 1959.
Hogan, A. J., Morris, J. G., and Thompson, H. E., Decision Problems under Risk and Chance Constrained Programming: Dilemmas in the Transition, Management Science, Vol. 27, pp. 698–716, 1981.
Jacquette, S. C., A Utility Criterion for Markov Decision Processes, Management Science, Vol. 23, pp. 43–49, 1979.
Jacquette, S. C., Markov Decision Processes with a New Optimality Criterion: Small Interest Rates, Annals of Mathematical Statistics, Vol. 1, pp. 1894–1901, 1973.
Porteus, E. L., On the Optimality of Structured Policies in Countable Stage Decision Processes, Management Science, Vol. 22, pp. 148–157, 1975.
White, C. C., The Optimality of Isotone Strategies for Markov Decision Problems with Utility Criterion, Recent Developments in Markov Decision Processes, Edited by R. Hartley, L. C. Thomas, and D. J. White, Academic Press, New York, New York, 1980.
Howard, R. A., and Matheson, J. E., Risk-Sensitive Markov Decision Processes, Management Science, Vol. 18, pp. 356–369, 1972.
Kreps, D. M., Decision Problems with Expected Utility Criteria, I: Upper and Lower Convergent Utility, Mathematics of Operations Research, Vol. 2, pp. 45–53, 1977.
Kreps, D. M., Decision Problems with Expected Utility Criteria, II: Stationarity, Mathematics of Operations Research, Vol. 2, pp. 266–274, 1977.
Rothblum, U. G., Multiplicative Markov Decision Chains, Mathematics of Operations Research, Vol. 9, pp. 6–24, 1984.
Sobel, M. J., Ordinal Dynamic Programming, Management Science, Vol. 21, pp. 967–975, 1975.
Kallenberg, L. C. M., Linear Programming and Finite Markovian Control Problems, Mathematisch Centrum, Amsterdam, Holland, 1983.
Sobel, M. J., The Variance of Discounted Markov Decision Processes, Journal of Applied Probability, Vol. 19, pp. 774–802, 1982.
Miller, B., On Dynamic Programming for a Stochastic Markovian Process with an Application to the Mean Variance Models, Management Science, Vol. 24, p. 1779, 1978.
White, D. J., Probabilistic Constraints and Variance in Markov Decision Processes, University of Manchester, Department of Decision Theory, Notes in Decision Theory, No. 149, 1984.
Derman, C., Finite State Markovian Decision Processes, Academic Press, New York, New York, 1970.
Van Der Wal, J., Stochastic Dynamic Programming, Mathematisch Centrum, Amsterdam, Holland, 1981.
Derman, C., On Sequential Control Procedures, Annals of Mathematical Statistics, Vol. 35, pp. 341–349, 1964.
Derman, C., and Strauch, R., A Note on Memoryless Rules for Controlling Sequential Control Processes, Annals of Mathematical Statistics, Vol. 37, pp. 276–278, 1966.
Hartley, R., Finite, Discounted, Vector Markov Decision Processes, University of Manchester, Department of Decision Theory, Notes in Decision Theory, No. 85, 1979.
Derman, C., Stable Sequential Control Rules and Markov Chains, Journal of Mathematical Analysis and Applications, Vol. 6, pp. 257–265, 1963.
Hordijk, A., and Kallenberg, L. C. M., Constrained Stochastic Dynamic Programming, Mathematics of Operations Research, Vol. 9, pp. 276–289, 1984.
Derman, C., and Veinott, A. F., Constrained Markov Decision Chains, Management Science, Vol. 19, pp. 389–390, 1972.
Strauch, R., and Veinott, A., A Property of Sequential Control Processes, The Rand Corporation, Santa Monica, California, Research Memorandum No. RM 14772, 1966.
White, D. J., Utility, Probabilistic Constraints, Mean, and Variance in Markov Decision Processes, University of Manchester, Notes in Decision Theory, No. 163, 1985.
Derman, C., and Klein, M., Some Remarks on Finite-Horizon Markovian Decision Models, Operations Research, Vol. 13, pp. 272–278, 1965.
White, D. J., Dynamic Programming with Probabilistic Constraints, Operations Research, Vol. 22, pp. 654–664, 1972.
Derman, C., Optimal Replacement under Markovian Deterioration with Probability Bounds on Failure, Management Science, Vol. 9, pp. 478–481, 1963.
Dantzig, G. B., and Wolfe, P., The Decomposition Algorithm for Linear Programming, Econometrica, Vol. 29, pp. 767–778, 1961.
Howard, R. A., Dynamic Programming and Markov Processes, Massachusetts Institute of Technology, PhD Thesis, 1960.
Filar, J. A., and Lee, H. M., Gain Variability Tradeoffs in Undiscounted Markov Decision Processes, Proceedings of the 24th IEEE Conference on Decision and Control, pp. 1106–1112, 1985.
White, D. J., Optimality and Efficiency, Wiley, New York, New York, 1982.
Mendelssohn, R., A Systematic Approach to Determining Mean Variance Tradeoffs when Managing Randomly Varying Populations, Mathematical Biosciences, Vol. 50, pp. 75–84, 1980.
Filar, J. A., Percentiles and Markovian Decision Processes, Operations Research Letters, Vol. 2, pp. 13–15, 1980.
White, D. J., Fundamentals of Decision Theory, North-Holland, New York, New York, 1976.
White, D. J., Minimizing Threshold Probabilities in Infinite-Horizon Discounted Markov Decision Processes, University of Manchester, Department of Decision Theory, Notes in Decision Theory, No. 165, 1985.
Henig, M., Optimality in Dynamic Programming with Deterministic Transitions and Stochastic Rewards, Tel Aviv University, Faculty of Management, Working Paper No. 721/82, 1982.
Henig, M., Target and Percentile Criteria in Dynamic Programming with Deterministic Transitions and Stochastic Rewards, University of Illinois at Urbana-Champaign, Department of Business Administration, 1984.
Charnes, A., and Cooper, W. W., Chance Constraints and Normal Deviates, Journal of the American Statistical Association, Vol. 57, pp. 134–148, 1962.
Goldwerger, J., Dynamic Programming of a Stochastic Markovian Process with an Application to the Mean Variance Models, Management Science, Vol. 23, pp. 612–620, 1977.
Parks, M. S., and Steinberg, E., A Preference Order Dynamic Program for a Knapsack Problem with Stochastic Rewards, Journal of the Operational Research Society, Vol. 30, pp. 141–147, 1979.
Sniedovich, M., Preference Order Stochastic Knapsack Problems: Methodological Issues, Journal of the Operational Research Society, Vol. 31, pp. 1025–1032, 1980.
Sniedovich, M., A Class of Variance Constrained Problems, Operations Research, Vol. 31, pp. 338–353, 1983.
Greenberg, H., Dynamic Programming with Linear Uncertainty, Operations Research, Vol. 16, pp. 675–678, 1968.
Beja, A., Probability Bounds in Replacement Policies for Markov Systems, Management Science, Vol. 16, pp. 253–264, 1969.
Bouakiz, M., Risk Sensitivity in Stochastic Optimization with Applications, Georgia Institute of Technology, PhD Thesis, 1985.
Chung, K. J., Some Topics in Risk-Sensitive Stochastic Dynamic Models, Georgia Institute of Technology, PhD Thesis, 1985.
Filar, J. A., and Lee, H. M., Gain Variability Tradeoffs in Discounted Markov Decision Processes, Johns Hopkins University, Department of Mathematical Sciences, Technical Report No. 408, 1985.
Lee, H. M., Gain Variability Tradeoffs in Markovian Decision Processes and Related Problems, Johns Hopkins University, Department of Mathematical Sciences, PhD Thesis, 1985.
Sobel, M. J., Mean-Variance Tradeoffs in an Undiscounted MDP, Georgia Institute of Technology, Research Memorandum, 1984.
Sobel, M. J., Maximal Mean/Variance Ratio in an Undiscounted MDP, Georgia Institute of Technology, Research Memorandum, 1985.
Additional information
Communicated by P. L. Yu
The author would like to thank two referees for their very thorough and helpful refereeing of the original article and for the extra references (Refs. 47–52) now added to the original reference list.
White, D.J. Mean, variance, and probabilistic criteria in finite Markov decision processes: A review. J Optim Theory Appl 56, 1–29 (1988). https://doi.org/10.1007/BF00938524