Abstract
Given a non-zero sum discounted stochastic game with finitely many states and actions one can form a bimatrix game whose pure strategies are the pure stationary strategies of the players and whose penalty payoffs consist of the total discounted costs over all states at any pure stationary pair. It is shown that any Nash equilibrium point of this bimatrix game can be used to find a Nash equilibrium point of the stochastic game whenever the law of motion is controlled by one player. The theorem is extended to undiscounted stochastic games with irreducible transitions when the law of motion is controlled by one player. Examples are worked out to illustrate the algorithm proposed.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
D. Blackwell, “Discrete dynamic programming,”Annals of Mathematical Statistics 33 (1962) 719–726.
J.A. Filar, “Ordered field property for stochastic games when the player who controls transition changes from state to state,”Journal of Optimization Theory and Applications 34 (1981) 503–513.
J.A. Filar, “On stationary equilibria of a single-controller stochastic game,”Mathematical Programming 30 (1984) 313–325.
J.A. Filar and T.E.S. Raghavan, “A matrix game solution of a single controller stochastic game,”Mathematics of Operations Research 9 (1984) 356–362.
A.M. Fink, “Equilibrium in a stochasticn-person game,”Journal of Science of Hiroshima University, Series, A-I 28 (1964) 89–93.
D. Gillette, “Stochastic games with zero stop probabilities,” in:Contributions to the Theory of Games III. Annals of Mathematical Studies No. 39 (Princeton University Press, Princeton, NJ, 1957) pp. 179–187.
A. Hordijk and L.C.M. Kallenberg, “Linear programming and Markov games I, II,” in: O. Moeschlin and D. Pallaschke, eds.,Game Theory and Mathematical Economics (North-Holland, Amsterdam, 1981).
T. Parthasarathy and T.E.S. Raghavan,Some Topics in Two-Person Games (American Elsevier Publishing Corporation, New York, 1971).
T. Parthasarathy and T.E.S. Raghavan, “An orderfield property for stochastic games when one player controls transition probabilities,”Journal of Optimization Theory and Applications 33 (1981) 375–392.
T.E.S. Raghavan and J.A. Filar, “Algorithms for stochastic games—a survey,” to appear in:Zeitschrift fur Operations Research.
S.M. Ross, “Non-discounted denumerable Markovian decision models,”Annals of Mathematical Statistics 39 (1968) 412–423.
L.S. Shapley, “Stochastic games,”Proceedings of the National Academy of Sciences of the U.S.A. 39 (1953) 1095–1100.
M.J. Sobel, “Noncooperative stochastic games,”Annals of Mathematical Statistics 42 (1971) 1930–1935.
M. Takahashi, “Equilibrium points of stochastic noncooperativen-person games,”Journal of Science of Hiroshima University, Series A-I 28 (1964) 95–99.
O.J. Vrieze, “Linear programming and undiscounted stochastic games in which one player controls transitions,”OR Spektrum 3 (1981) 29–35.
Author information
Authors and Affiliations
Additional information
The work of this author was supported in part by the NSF grants DMS-9024408 and DMS 8802260.
Rights and permissions
About this article
Cite this article
Nowak, A.S., Raghavan, T.E.S. A finite step algorithm via a bimatrix game to a single controller non-zero sum stochastic game. Mathematical Programming 59, 249–259 (1993). https://doi.org/10.1007/BF01581246
Received:
Revised:
Issue Date:
DOI: https://doi.org/10.1007/BF01581246