Abstract
We consider a two-person stochastic game of resource extraction. It is assumed that players have identical preferences. The novelty lies in the fact that each player is equipped with the same risk coefficient and calculates his discounted utility in the infinite time horizon in a recursive way by applying the entropic risk measure parametrized by this risk coefficient. Under two alternative sets of assumptions, we prove the existence of a symmetric stationary Markov perfect equilibrium.
1 Introduction
A common assumption used in Markov decision processes as well as stochastic games is that the decision makers have preferences represented by an overall utility parametrized by an expectation operator with respect to the current information. More precisely, if u is an instantaneous utility of the agent and \(c_t\) is the consumption level in period t, then the discounted lifetime utility \(V_t\) from period t onwards is defined in a recursive way as
$$\begin{aligned} V_t=u(c_t)+\beta E_t V_{t+1}, \end{aligned}$$
where \(\beta \in (0,1)\) is a discount factor and \(E_t\) is the expectation operator with respect to the information in period t. However, taking only an expectation of \(V_{t+1}\) means that the agent is risk neutral to future discounted utility. In real life, this assumption is very often violated. For example, in an optimal growth model, the agent may have a higher risk aversion, which generates precautionary saving. Therefore, we propose to equip the agent with a constant absolute risk aversion coefficient, say \(\gamma >0,\) and assume that he/she uses the entropic risk measure, known also as the certainty equivalent for the exponential function. In other words, the lifetime utility of the agent is defined now as
$$\begin{aligned} \tilde{V}_t=u(c_t)-\frac{\beta }{\gamma }\ln E_t \exp \left( -\gamma \tilde{V}_{t+1}\right) . \qquad (1) \end{aligned}$$
Within this framework, the agent is risk averse in future utility \(\tilde{V}_{t+1},\) in addition to being risk averse in future consumption levels \(c_{t+1}, c_{t+2},\ldots .\) The latter risk attitude is reflected in the concave function u of the agent. According to the properties of the entropic risk measure listed in Sect. 3, the agent takes into account not only the expected value of the future lifetime utility, but also all further moments with appropriate weights (see also Section 3.4 in Bäuerle and Rieder (2011)). These preferences have drawn the attention of many authors. For instance, Hansen and Sargent (1995) applied them to a linear quadratic Gaussian control model, and Weil (1993) used them to examine precautionary savings and the permanent income hypothesis. Moreover, these preferences found applications in the problems of Pareto optimal allocations (Anderson 2005) as well as in the study of Markov decision processes (Asienkiewicz and Jaśkiewicz 2017) and in a one-sector optimal growth model with an unbounded felicity function (Bäuerle and Jaśkiewicz 2018). As argued by Hansen and Sargent (1995), the preferences in (1) are also attractive because they can be viewed as robustness preferences. In this context, \(\gamma \) denotes the degree of robustness of the agent. This fact is a consequence of the robust representation of the entropic risk measure via the relative entropy as a penalty function, see Chapter 4 in Föllmer and Schied (2001).
In this paper, we study a strategic version of the discrete-time one-sector optimal growth model. Specifically, we deal with two players who own a natural resource and consume a certain amount of the available stock in each time period. We assume that each player possesses the same risk coefficient \(\gamma \) and the same felicity function. Moreover, each player defines, using the aforementioned risk measure, his non-expected discounted utility. Our objective is to prove the existence of a symmetric Nash equilibrium in non-randomized strategies. In their seminal paper, Levhari and Mirman (1980) studied such a strategic optimal growth model with the same logarithmic felicity functions for each agent and the deterministic Cobb–Douglas production function. Their model has been extended in Sundaram (1989) to arbitrary production and felicity functions. Further generalizations to stochastic production functions were reported in Majumdar and Sundaram (1991), Dutta and Sundaram (1992) and Jaśkiewicz and Nowak (2018a). Other models of capital accumulation or resource extraction games with risk-neutral agents can be found in Jaśkiewicz and Nowak (2018b) and Balbus et al. (2016). Moreover, it is worth mentioning that there exist some iterative procedures (under special conditions) for finding Nash equilibria in such games, which were developed in Balbus and Nowak (2004) and Szajowski (2006). Finally, we wish to stress that games with risk-sensitive players have already been examined in the literature, but not with non-expected discounted payoffs, see for instance Bäuerle and Rieder (2017), Jaśkiewicz and Nowak (2014), Klompstra (2000) and the references cited therein. Namely, Bäuerle and Rieder (2017) dealt with zero-sum stochastic games, where the players take the expectation of the exponential function of accumulated discounted payoffs. Such an approach leads to a non-stationary model. Klompstra (2000), on the other hand, studied Nash equilibria for a two-person non-zero-sum game with a quadratic-exponential cost criterion, whilst in Jaśkiewicz and Nowak (2014) the authors treated intergenerational models with risk-sensitive generations. In addition, Başar (1999) dealt with risk-sensitive players playing a differential game. Therefore, to the best of our knowledge, this work is the first to study recursive utilities in dynamic games.
To show the existence of an equilibrium, we need to impose some conditions on the felicity function and the transition probabilities. Our assumptions are borrowed from Balbus et al. (2015a) and Jaśkiewicz and Nowak (2018a). Namely, we present two alternative sets of conditions. We assume either non-atomic transition probabilities or transition probabilities that allow atoms and embrace the purely deterministic case. These assumptions allow us to prove the existence of an equilibrium in the class of stationary Markov strategies.
The paper is organized as follows. Section 2 is devoted to a model description. In Sect. 3, we carefully define a non-expected discounted utility in the infinite time horizon. The assumptions and the main result are formulated in Sect. 4, whereas Sect. 5 contains the proof. Examples are placed in Sect. 6.
2 The model
Put \(\mathbb {R}_+=[0,+\infty ).\) Consider a two-person stochastic game with the following objects:
(i) \(S= \mathbb {R}_+\) is the state space, i.e., the space of available resource stocks;
(ii) \(A_i(s)=[0,s]\) is the space of actions available for player \(i \in \{1,2\}\), when the current resource stock is \(s\in S\);
(iii) \(u_i:S\times S\times S\rightarrow \mathbb {R}_+\) is a felicity function for player \(i \in \{1,2\}\); we assume that for every \(s\in S\), \(a\in A_1(s)\) and \(b\in A_2(s),\) \(u_1(s,a,b)=u(a)\) and \(u_2(s,a,b)=u(b)\), where \(u:S\mapsto \mathbb {R}_+\) is a temporal utility for both agents; note that the utility for player 1 depends only on his/her consumption; the same remark applies to agent 2;
(iv) \(q(\cdot | s-a-b)\) is a Borel measurable transition probability on S for the given feasible pair of actions \((a,b) \in A_1(s)\times A_2(s),\) \(a+b\le s\) and the current resource stock \(s\in S;\)
(v) we define
$$\begin{aligned} D:=\left\{ (s,a,b)\in S\times S\times S: a+b \le s \right\} \end{aligned}$$
and
$$\begin{aligned} D(s):=\left\{ (a,b)\in S\times S:(s,a,b)\in D\right\} ; \end{aligned}$$
(vi) \(\gamma > 0\) is a risk coefficient;
(vii) \(\beta \in (0,1)\) is a discount factor.
We assume that \(u(s) \le d\) for every \(s\in S\) and some constant \(d >0\). In each period, both agents observe the state \(s \in S\) and simultaneously choose their actions \((a,b)\in A_1(s)\times A_2(s)\) provided that the actions are feasible, i.e., \((a,b) \in D(s)\). Immediately, player 1 enjoys the utility u(a), whereas player 2 enjoys u(b). The next state of the game \(s'\) has the distribution \(q(\cdot |s-a-b)\). If the pair of actions (a, b) is infeasible in state s, then the players choose their actions again. Therefore, we restrict our attention only to strategies generating feasible action pairs during the play. Next, we define a history of the game as follows:
where \(s_k \in S\), \(a_k+b_k \le s_k \) for all \(k=1,\ldots ,t\). Let \(H_t\) be the set of all histories up to the t-th step. We endow \(H_t\) with the natural product topology. We shall consider only pure strategies.
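The course of play described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate`, the "consume a quarter of the stock" strategies and the starting stock are hypothetical, and the transition is the multiplicative shock \(y^{\alpha }\omega \) borrowed from Example 1 with \(\omega \) uniform on [0, 1].

```python
import random

def simulate(s0, phi1, phi2, sample_next, periods, rng):
    """Play the game for a fixed number of periods under stationary strategies."""
    s, path = s0, []
    for _ in range(periods):
        a, b = phi1(s), phi2(s)
        # Feasibility: (a, b) must lie in D(s), i.e. a, b >= 0 and a + b <= s.
        if a < 0 or b < 0 or a + b > s:
            raise ValueError(f"infeasible pair ({a}, {b}) in state {s}")
        path.append((s, a, b))
        # The next stock is drawn from q(. | s - a - b).
        s = sample_next(s - a - b, rng)
    return path

# Illustrative assumptions (not taken from the model itself):
# each player consumes a quarter of the stock, and the stock evolves
# through the multiplicative shock y**alpha * omega, omega ~ U[0, 1).
alpha = 0.5
quarter = lambda s: 0.25 * s
shock = lambda y, rng: (y ** alpha) * rng.random()

path = simulate(4.0, quarter, quarter, shock, periods=5, rng=random.Random(0))
for s, a, b in path:
    print(f"stock={s:.4f}  a={a:.4f}  b={b:.4f}")
```

Each recorded triple \((s,a,b)\) belongs to D, so the resulting path is a feasible history in the sense used here.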
Definition 1
A strategy \(\pi \) for player 1 is a sequence \((\pi _{t})_{t=1}^{\infty }\) such that each \(\pi _{t}\) is a Borel measurable mapping from the history space to the space of actions available to player 1. The set of all strategies for player 1 is denoted by \(\varPi \). Similarly, we define a strategy \(\sigma \) for player 2 and denote the set of all his/her strategies by \(\varSigma \).
Furthermore, we introduce the following set of functions
Definition 2
A stationary Markov strategy for player 1 is a sequence \((\pi _{t})_{t=1}^{\infty }\) such that \(\pi _{t}=\phi \) for all \(t\in \mathbb {N}\) and some \(\phi \in F_1\). Analogously, we define a stationary Markov strategy for player 2 as a sequence \((\sigma _{t})_{t=1}^{\infty }\) such that \(\sigma _{t}=\hat{\phi }\) for all \(t\in \mathbb {N}\) and some \(\hat{\phi }\in F_2\). Further, we shall identify a stationary Markov strategy with the constant element of the sequence.
3 Non-expected \(\beta \)-discounted utility function
In this section, we define the non-expected utilities for the players. We assume that each player is equipped with the risk coefficient \(\gamma >0.\) Before giving a formal definition of the discounted utility in the infinite time horizon for each player, we introduce the notion of the entropic risk measure. Let \((\varOmega ,\mathcal {F},P)\) be a probability space and let \(X \in L^{\infty }(\varOmega ,\mathcal {F},P)\) be a random payoff. Then the entropic risk measure is defined as follows:
$$\begin{aligned} \rho (X)=-\frac{1}{\gamma }\ln \int _{\varOmega } e^{-\gamma X(\omega )}P(\mathrm{d}\omega ). \qquad (2) \end{aligned}$$
Let X and Y be random variables from \(L^{\infty }(\varOmega ,\mathcal {F},P)\). Then \(\rho (\cdot )\) satisfies the following properties:
(P1) monotonicity: if \(X\le Y,\) then \(\rho (X)\le \rho (Y)\);
(P2) translation invariance: if \(k\in \mathbb {R},\) then \(\rho (X+k)=\rho (X)+k\);
(P3) \(\rho (X)\le E(X)\), a consequence of Jensen’s inequality.
Using the Taylor expansion for the exponential and logarithmic functions, for \(\gamma \) sufficiently close to 0, we obtain the following approximation:
$$\begin{aligned} \rho (X)\approx E(X)-\frac{\gamma }{2}Var(X). \qquad (3) \end{aligned}$$
This means that the risk-sensitive player, when evaluating his random payoff, takes into account not only the expected value of this random payoff but also its variance. Formula (2) is also known in the literature as the certainty equivalent of the exponential function (Weil 1993). For further properties of \(\rho \), the reader is referred to Föllmer and Schied (2001).
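As a numerical sanity check, the following Python sketch evaluates the entropic risk measure for a finite-valued payoff, assuming its standard form \(\rho (X)=-\frac{1}{\gamma }\ln E\,e^{-\gamma X}\), and compares it with the mean-variance approximation \(E(X)-\frac{\gamma }{2}Var(X)\). The three-point payoff distribution and the value of \(\gamma \) are hypothetical choices made only for illustration.

```python
import math

def entropic_risk(values, probs, gamma):
    """rho(X) = -(1/gamma) * ln E[exp(-gamma * X)] for a finite-valued X."""
    # Shift by the minimum for numerical stability; by translation
    # invariance (P2) this does not change the result.
    m = min(values)
    s = sum(p * math.exp(-gamma * (x - m)) for x, p in zip(values, probs))
    return m - math.log(s) / gamma

def mean(values, probs):
    return sum(p * x for x, p in zip(values, probs))

def variance(values, probs):
    mu = mean(values, probs)
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))

# Hypothetical payoff: 0, 1 or 2 with probabilities 0.25, 0.5, 0.25.
xs, ps = [0.0, 1.0, 2.0], [0.25, 0.5, 0.25]
gamma = 0.05

rho = entropic_risk(xs, ps, gamma)
approx = mean(xs, ps) - 0.5 * gamma * variance(xs, ps)
print(rho, approx)
```

For small \(\gamma \) the two numbers nearly coincide, and \(\rho (X)\) stays below \(E(X)\) as required by (P3); for larger \(\gamma \) the higher moments of X become visible and the approximation degrades.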
Let \((\pi ,\sigma )\in \varPi \times \varSigma \). By \(\mathcal {B}(H_t)\) we denote the set of all Borel measurable bounded non-negative real-valued functions defined on \(H_t\) equipped with the supremum norm \(||\cdot ||\). For \(v_{t+1}\in \mathcal {B}(H_{t+1})\) and \(h_t \in H_t\), we set
By properties (P1) and (P3), we have that
Next, for any \(v_{t+1}\in \mathcal {B}(H_{t+1})\), we define the operator \(L_{\pi _t,\sigma _t}^i\) for player i as follows:
Note that \(L_{\pi _t,\sigma _t}^i: \mathcal {B}(H_{t+1})\rightarrow \mathcal {B}(H_{t}).\) Indeed, observe that for every player i
Further, we define an N-stage total discounted utility for player i by
where \(\mathbf {0}\) is a function that assigns 0 for any argument. For instance, for player 1 and stage 2 we have
Similarly, we can define \(U_2^2(s,\pi ,\sigma )\) for player 2.
From the monotonicity of \(\rho \), the sequence \(\big ( U_N^i(s,\pi ,\sigma )\big )_{N\in \mathbb {N}}\) is non-decreasing and bounded from below by 0 for every \(s\in S\) and \((\pi ,\sigma ) \in \varPi \times \varSigma \). Moreover, by properties (P1)–(P3) it follows that
for all \(s\in S\) and \((\pi ,\sigma )\in \varPi \times \varSigma \). The reader is referred to Asienkiewicz and Jaśkiewicz (2017), where (5) and further details are proved. Hence, \(\lim \nolimits _{N\rightarrow \infty } U_N^i(s,\pi ,\sigma )\) exists and we denote this limit by \(U^i(s,\pi ,\sigma )\). By the aforementioned discussion, it follows that each player is cautious about his unknown future continuation function. Therefore, at every stage he uses the entropic risk measure, parametrized by his risk aversion coefficient \(\gamma ,\) to calculate the discounted utility in the infinite time horizon.
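To illustrate the recursive construction, the sketch below computes N-stage utilities for player 1 under stationary strategies on a toy model with finitely many stock levels. We assume that, for stationary strategies \((\phi _1,\phi _2)\), the recursion takes the form \(U_N(s)=u(\phi _1(s))-\frac{\beta }{\gamma }\ln \int e^{-\gamma U_{N-1}(z)}q(\mathrm{d}z|s-\phi _1(s)-\phi _2(s))\) with \(U_0=\mathbf {0}\). The state grid, the kernel, the felicity \(u(c)=1-e^{-c}\) (so \(d=1\)) and the strategy `share` are all illustrative assumptions.

```python
import math

BETA, GAMMA = 0.9, 0.5
STATES = [0, 1, 2, 3, 4]          # toy finite stock levels (assumption)

def u(c):
    return 1.0 - math.exp(-c)     # bounded felicity, u <= d with d = 1

def kernel(y):
    """Toy q(.|y): the invested stock doubles (capped at 4) or stays, each w.p. 1/2."""
    probs = {}
    for z in (min(2 * y, 4), y):
        probs[z] = probs.get(z, 0.0) + 0.5
    return probs

def rho(v, y):
    """Entropic certainty equivalent of the continuation v under q(.|y)."""
    total = sum(p * math.exp(-GAMMA * v[z]) for z, p in kernel(y).items())
    return -math.log(total) / GAMMA

def n_stage_utility(n, phi1, phi2):
    """U_N for player 1 under stationary strategies, starting from U_0 = 0."""
    v = {s: 0.0 for s in STATES}
    for _ in range(n):
        v = {s: u(phi1(s)) + BETA * rho(v, s - phi1(s) - phi2(s)) for s in STATES}
    return v

share = lambda s: s // 3          # hypothetical stationary strategy for both players

for n in (1, 5, 50):
    print(n, n_stage_utility(n, share, share)[4])
```

In this toy model the printed values increase with N and stay below \(d/(1-\beta )=10\), in line with the monotonicity and boundedness discussed above.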
4 Existence of symmetric stationary Nash equilibria
Definition 3
A feasible profile \((\pi ^*,\sigma ^*)\in \varPi \times \varSigma \) is called a Nash equilibrium, if
for each \(s\in S\) and any \(\pi \in \varPi \) such that \((\pi ,\sigma ^*)\) is feasible and
for each \(s\in S\) and any \(\sigma \in \varSigma \) such that \((\pi ^*,\sigma )\) is feasible.
Definition 4
A Stationary Markov Perfect Equilibrium (SMPE) is a Nash equilibrium \((\phi ^*_1,\phi ^*_2)\) that belongs to the class of strategy pairs \(F_1\times F_2\). An SMPE \((\phi _1^*,\phi ^*_2)\) is symmetric if \(\phi _1^*=\phi _2^*.\)
The purpose of this section is to find a symmetric stationary pure Nash equilibrium in an appropriate class of strategies. Therefore, we define the subset of \(F_i\) as follows:
The definition of the sets \(F_i^0\) (\(i=1,2\)) given in Jaśkiewicz and Nowak (2018a) on p. 243 should be the same as above. More precisely, the function \(\varphi (s):=s-\phi (s)\) in Jaśkiewicz and Nowak (2018a) must be replaced by \(\varphi (s):=s/2-\phi (s).\)
We shall need the following assumptions imposed on the felicity function.
Assumption 1
(Felicity function) Function u is increasing, bounded, strictly concave and continuous at \(s=0\).
We also propose two alternative sets of assumptions for the transition probability.
Assumption 2
(Transition probability)
(i) q is stochastically increasing, i.e., the function
$$\begin{aligned} y\rightarrow \int _S f(z)q(\mathrm{d}z|y) \end{aligned}$$
is increasing, whenever \(f: S\rightarrow \mathbb {R}\) is increasing;
(ii) q is weakly continuous, i.e., if \(y_n \rightarrow y\) in S, then \(q(\cdot |y_n) \Rightarrow q(\cdot |y)\) as \(n \rightarrow \infty \);
(iii) For each \(s\in S\) the set \(Z_s:=\{y\in S:q(\{s\}|y)>0\}\) is countable and \(q(\{0\}|0)=1.\)
Assumption 3
(Transition probability)
(i) For each \(y \in S_+:=(0,+\infty )\) the probability measure \(q(\cdot |y)\) is non-atomic and \(q(\cdot |0)\) has no atoms in \(S_+\);
(ii) q is weakly continuous.
Theorem 1
Let either Assumptions 1 and 2 or Assumptions 1 and 3 be satisfied. Then there exists a symmetric SMPE \((\phi ^*,\phi ^*) \in F_1^0 \times F_2^0\).
Remark 1
The predecessors of our work on symmetric dynamic games of resource extraction are Dutta and Sundaram (1992) and Jaśkiewicz and Nowak (2018a). The common feature of these works is that the authors deal with standard discounted expected payoffs or utilities for the players. This in turn implies that the players care only about the expected value of the future random payoffs. In other words, when calculating the discounted expected utility in the infinite time horizon, the players take into account only the expectation of the continuation function. In our approach, we allow the agents to be risk averse towards future random payoffs in the sense that according to (3) the players care not only about the expectation but also about the variance of the continuation function. Therefore, they evaluate the discounted utility in a recursive way by using the entropic risk measure (or the exponential certainty equivalent) parametrized by the risk coefficient. As in Dutta and Sundaram (1992), a felicity function is bounded (in contrast to Jaśkiewicz and Nowak (2018a)) and as in Jaśkiewicz and Nowak (2018a) the resource stock takes values in \([0,+\infty )\) [in contrast to Dutta and Sundaram (1992)].
Our assumptions imposed on the model are borrowed from Balbus et al. (2015a) and Jaśkiewicz and Nowak (2018a). More precisely, Assumption 3 coincides with Assumption (A) in Jaśkiewicz and Nowak (2018a). However, Assumption 2, analogous to the one in Balbus et al. (2015a), is slightly stronger than Assumptions (B1)–(B3) in Jaśkiewicz and Nowak (2018a). This is because the risk measure \(\rho \) used in evaluating the discounted utility is not additive in the sense that, in general, \(\rho (X+Y)\not =\rho (X)+\rho (Y)\) for random payoffs X and Y. Therefore, the transition probability q cannot be a convex combination of stochastic kernels with coefficients depending on the investment amount as in Jaśkiewicz and Nowak (2018a).
On the other hand, our result can also be viewed as an extension of the optimization problem (one player case), studied in Asienkiewicz and Jaśkiewicz (2017) and Bäuerle and Jaśkiewicz (2018), to a strategic version of a one-sector optimal growth model. In contrast to Bäuerle and Jaśkiewicz (2018), we examine, as mentioned above, a model with bounded felicity functions. A crucial role in the study of the unbounded case is played by the fact that both investment and consumption functions are non-decreasing. Here, this property does not hold, since the unique solution to the Bellman equation \(V_\phi \) in Lemma 5 depends on the consumption strategy \(\phi \) of the other player.
5 Proof of Theorem 1
The methods of proving Theorem 1 resemble the ones used in Jaśkiewicz and Nowak (2018a). However, most of the preceding results must be formulated in terms of the entropic risk measure. Moreover, for the sake of completeness and clarity, we decided to provide all lemmas with their proofs.
Let X be the vector space of all continuous from the right functions with bounded variation on every [0, n], \(n\in \mathbb {N}\). We endow X with the topology of weak convergence. Recall that a sequence \((\eta _t)_{t=1}^\infty \) converges weakly to \(\eta \in X\) if and only if \(\eta _t(s)\rightarrow \eta (s)\) as \(t\rightarrow \infty \) at any continuity point \(s\in S\) of \(\eta \). The weak convergence of \((\eta _t)_{t=1}^\infty \) to \(\eta \) is denoted by \(\eta _t \xrightarrow {w} \eta .\)
Let \(X^*\) be the set of all non-decreasing functions \(\eta \in X\) such that \(0 \le \eta (s) \le \frac{d}{1-\beta }\) for all \(s\in S.\) Note that each \(\eta \in X^*\) is upper semicontinuous. Furthermore, we notice that 0 is a continuity point of every function \(\eta \in X^*.\) By Proposition 1 in Jaśkiewicz and Nowak (2018a), we have that \(X^*\) is sequentially compact in X. Moreover, Proposition 2 in Jaśkiewicz and Nowak (2018a) yields that \(F_i^0\) is also a convex and sequentially compact subset of X when endowed with the topology of weak convergence.
Now we start with a sequence of preliminary lemmas.
Lemma 1
Assume that \(f_n \xrightarrow {w} f\) in \(X^*\) and \(y_n \rightarrow y\) in S as \(n \rightarrow \infty \). Then \(f(y) \ge \limsup \nolimits _{n\rightarrow \infty } f_n(y_n)\).
Proof
Let \(y_0 > y\) be a continuity point of f. Then there exists \(N \in \mathbb {N}\) such that \(y_n < y_0\) for all \(n>N\). Therefore, \(f_n(y_n) \le f_n(y_0)\) for \(n > N\) and finally \(\limsup \nolimits _{n \rightarrow \infty } f_n(y_n) \le \limsup \nolimits _{n \rightarrow \infty } f_n(y_0)=f(y_0)\). Since \(y_0\) can be chosen arbitrarily close to y and f is continuous from the right, we deduce that \(\limsup \nolimits _{n \rightarrow \infty } f_n(y_n) \le f(y)\).
Lemma 2
Let Assumption 2 or Assumption 3 hold. Assume that \(f_n \xrightarrow {w} f\) in \(X^*\) and \(y_n \rightarrow y\) in S as \(n \rightarrow \infty \). Then we have
Proof
Define
We have that
The first inequality follows from property (P1) and Lemma 3.2 in Serfozo (1966), whereas the second one is a consequence of Lemma 1 and (P1). Thus, the result follows.
Lemma 3
Let Assumption 3 hold. Assume that \(f\in X^*\) and \(y_n \rightarrow y\) in S as \(n \rightarrow \infty \). Then we obtain
Proof
For any \(z \in S\) define
The function \(f_*\) is lower semicontinuous. Furthermore, \(f_*(z)=f(z)\) for any continuity point \(z \in S\) of f. Recall that 0 is a continuity point of f. Hence, \(f_*(0)=f(0)\). By Assumption 3(i), we have that
By Lemma 3.2 in Serfozo (1966), we obtain
Combining (6) and (7) with Lemma 2, we infer that
Thus, the result follows.
Lemma 4
Let Assumption 2 hold. Assume that \(y_n \searrow y\) in S as \(n \rightarrow \infty \) and \(f \in X^*\). Then it follows
Proof
By Assumption 2(i), we infer that
Hence, the above inequality and Lemma 2 yield
These inequalities finish the proof.
Let \(\phi \in F_2^0\) and \(\varPi (\phi )\) be the set of all strategies \(\pi \) for player 1 for which the pair \((\pi ,\phi )\) is feasible. We are now ready to formulate our next lemma.
Lemma 5
Put \(\varPhi (s)=[0,s-\phi (s)]\) for each \(s\in S\). Let either Assumptions 1 and 2 or Assumptions 1 and 3 be satisfied. Then there exists a unique function \(V_\phi \in X^*\) such that
for all \(s \in S\). Moreover,
Proof
For any \(V\in X^*\), define the operator T as follows:
Observe that since \(s\rightarrow s-\phi (s)\) is upper semicontinuous and u is increasing and continuous, it follows that the function \((s,y)\rightarrow u(s-\phi (s)-y)\) is upper semicontinuous. Moreover, by Lemma 2, the function \(y\rightarrow -\frac{\beta }{\gamma }\ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|y)\) is also upper semicontinuous. Hence, by Proposition D.5 in Hernández-Lerma and Lasserre (1996), the function TV is upper semicontinuous. This fact and (4) yield that \(T:X^*\rightarrow X^*.\) We have to prove that T is contractive. Assume that \(V_1,V_2 \in X^*\). By properties (P1) and (P2) for each \(s\in S\), we have
Changing the roles of \(V_1\) and \(V_2\) we get
By the Banach fixed point theorem, there exists a unique function \(V_\phi \in X^*\) such that \(TV_\phi =V_\phi \).
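The contraction argument can be checked numerically on a toy model. In the sketch below we assume, reading off the two terms discussed in this proof, that the operator has the form \(TV(s)=\max _{y\in \varPhi (s)}\big \{u(s-\phi (s)-y)-\frac{\beta }{\gamma }\ln \int _S e^{-\gamma V(z)}q(\mathrm{d}z|y)\big \}\). The finite state grid, the kernel, the felicity function and the opponent's strategy `phi` are hypothetical choices, so this is an illustration rather than the construction used in the proof.

```python
import math

BETA, GAMMA = 0.9, 0.5
STATES = range(5)                 # toy finite stock levels (assumption)

def u(c):
    return 1.0 - math.exp(-c)     # bounded felicity with d = 1

def kernel(y):
    """Toy q(.|y): the invested stock doubles (capped at 4) or stays, each w.p. 1/2."""
    probs = {}
    for z in (min(2 * y, 4), y):
        probs[z] = probs.get(z, 0.0) + 0.5
    return probs

def phi(s):
    return s // 3                 # opponent's stationary consumption (assumption)

def bellman(v):
    """(TV)(s): maximise u(s - phi(s) - y) + beta * rho_y(V) over y in {0, ..., s - phi(s)}."""
    def rho(y):
        total = sum(p * math.exp(-GAMMA * v[z]) for z, p in kernel(y).items())
        return -math.log(total) / GAMMA
    return [max(u(s - phi(s) - y) + BETA * rho(y) for y in range(s - phi(s) + 1))
            for s in STATES]

def sup_dist(v, w):
    return max(abs(a - b) for a, b in zip(v, w))

def solve(tol=1e-10):
    """Iterate T from the zero function until successive iterates are tol-close."""
    v = [0.0] * len(STATES)
    while True:
        w = bellman(v)
        if sup_dist(v, w) < tol:
            return w
        v = w

v1 = [0.0, 1.0, 3.0, 7.0, 10.0]
v2 = [2.0, 0.5, 9.0, 4.0, 6.0]
print(sup_dist(bellman(v1), bellman(v2)), BETA * sup_dist(v1, v2))
print(solve())
```

The first printed number does not exceed the second, which is the contraction inequality with modulus \(\beta \); the iteration in `solve` is the usual successive approximation of the fixed point.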
Now we prove that \(V_\phi (s)=\sup \nolimits _{\pi \in \varPi (\phi )} U^1(s,\pi ,\phi )\). We have
for every feasible consumption a for agent 1 (that means \(a+\phi (s) \le s\) for every \(s \in S\)). Proceeding along similar lines as in Asienkiewicz and Jaśkiewicz (2017) (see formula (3.6)) we obtain by iteration that for every \(N \in \mathbb {N}\) and \(\pi \in \varPi (\phi )\)
Letting N tend to infinity, we have that
Hence,
From Proposition D.5 in Hernández-Lerma and Lasserre (1996), there exists \(\psi \in F_1\) such that
Put \(\phi ^*(s)=s-\phi (s)-\psi (s)\). Hence, for every \(s \in S\) we get
Again, by iteration of this equation and making use of properties (P1)–(P3), we obtain that for every \(s \in S\)
Letting N go to infinity, we have
for all \(s \in S\) and, consequently,
Inequalities (8) and (9) imply that
Define
For any \(s \in S\) we set \(g(\phi )(s):= \max A_\phi (s)\).
Lemma 6
The correspondence \(s \rightarrow A_\phi (s)\) is ascending, i.e., if \(s_1 < s_2\) and \(y_1 \in A_\phi (s_1)\), \(y_2 \in A_\phi (s_2)\), then \(y_1 \le y_2\).
Proof
Suppose that \(s \rightarrow A_\phi (s)\) is not ascending. This means that there exist \(s_1 <s_2\) and \(y_1 \in A_\phi (s_1)\), \(y_2 \in A_\phi (s_2)\) such that \(y_1 >y_2\). Observe that the set \({{\mathcal {L}}}:=\{(s,y):\ s\in S, \ y\in \varPhi (s)\}\) is a lattice with the usual component-wise order on \(\mathbb {R}^2.\) Consequently, the points \((s_1,y_2)\) and \((s_2,y_1)\) belong to \({{\mathcal {L}}}.\) From Assumption 1, u is strictly concave. From the proof of Lemma 2 in Nowak (2006) and the fact that \(s_2-\phi (s_2) > s_1-\phi (s_1)\), we infer
Adding \(- \frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|y_1)- \big (- \frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|y_2)\big )\) to both sides it follows that
Thus, we have a contradiction.
Lemma 7
Let \(\psi \) be any selector of the correspondence \(s \rightarrow A_\phi (s)\), i.e., \(\psi (s) \in A_\phi (s)\) for all \(s \in S\). If \(\psi \) is continuous at \(s_0\), then \(A_\phi (s_0)\) is a singleton.
Proof
Clearly, \(A_\phi (0)\) is a singleton. Assume that \(s_0>0\) and \(y_1,\) \(y_2\) are elements of \(A_\phi (s_0)\) such that \(y_1<y_2\). Since \(s \rightarrow A_\phi (s)\) is ascending, we conclude that
But \(\psi \) is continuous at \(s_0 \in S\). Thus, we have a contradiction.
Lemma 8
The function \(g(\phi )\) is a unique non-decreasing and continuous from the right selector of the correspondence \(s \rightarrow A_\phi (s)\).
Proof
From Lemma 6 the function \(g(\phi )\) is non-decreasing. Moreover, we observe that the graph of the correspondence \(s \rightarrow A_{\phi }(s)\) is closed from the right. Indeed, take \(s_n\searrow s\) and \(y_n\in A_\phi (s_n).\) From Lemma 6 it follows that \(y_n\) is non-increasing and let \(y_n\) converge to some y. Lemma 3 (under Assumption 3) or Lemma 4 (under Assumption 2) and Assumption 1 imply that \(y\in A_\phi (s).\) Therefore, \(g(\phi )\) is continuous from the right. Hence, \(g(\phi )\) is an upper semicontinuous selector of the correspondence \(s \rightarrow A_\phi (s)\). The uniqueness follows from Lemma 7.
Proof of Theorem 1
Define the operator L as follows: \(L\phi (s):=\frac{s-g(\phi )(s)}{2}\) for \(s\in S\) and \(\phi \in F_2^0\). Lemma 8 implies that \(L\phi \in F_1^0\). Hence, \(L: F_2^0\rightarrow F_1^0.\) We have to show that the operator L is continuous when \(F_1^0\) and \(F_2^0\) are equipped with the topology of weak convergence. Suppose that \(\phi _n \xrightarrow {w} \phi \) as \(n \rightarrow \infty \). From the fact that the set \(X^*\) is sequentially compact in X, we infer that there exists a subsequence of \((V_{\phi _n})_{n=1}^\infty \) converging to some V in \(X^*\). Without loss of generality we may assume that \(V_n:= V_{\phi _n} \xrightarrow {w} V\) in \(X^*\) as \(n\rightarrow \infty .\) Analogously, we may assume that \(\psi _n:=g(\phi _n) \xrightarrow {w} \psi \) in \(F^0_1\). Thus, for each \(n \in \mathbb {N}\), we obtain from Lemma 5 that
Let \(S_1 \subset S\) be the set of all continuity points of the functions V, \(\phi \) and \(\psi \). For any \(s\in S_1\) we get \(V_n(s)\rightarrow V(s)\), \(\phi _n(s) \rightarrow \phi (s)\) and \(\psi _n(s) \rightarrow \psi (s)\). Using Assumption 1, Lemma 2 and the last display, we obtain that
Let \(s \notin S_1\). Since \(S_1\) is dense in S and the functions V, \(\psi \) and \(\phi \) are continuous from the right, we may choose a sequence \((s_m)_{m=1}^\infty \) in S such that \(s_m \searrow s\) as \(m \rightarrow \infty \). Therefore, we get
From Lemma 2 and letting \(m \rightarrow \infty \) we conclude that (10) holds for all \(s \in S\). On the other hand, for any \(n\in \mathbb {N},\) \(y\in [0,s-\phi _n(s)]\) and \(s\in S\), by Lemma 5 we have
Now we define the following sets:
- \(S_d\) is a countable set of discontinuity points of the function V;
- \(S_2\) is the set of all continuity points of the functions V and \(\phi \);
- \(S_3\) is the set of all \(y \in S\) such that \(q(S_d |y)=0\).
Recall that \(0 \notin S_d\). Clearly, the set \(S_2\) is dense in S. The set \(S_3\) is also dense in S and contains the state 0. These two facts follow from either Assumption 2(iii) or Assumption 3(i). Choose any \(s \in S_2 \cap S_+\) and \(y \in S_3 \cap [0, s-\phi (s))\). Then there exists some \(N \in \mathbb {N}\) such that \(y \in [0,s- \phi _n(s)]\) for all \(n >N\). Hence, we have the following inequality:
By the dominated convergence theorem and the fact that \(y \in S_3\), we obtain
Thus, we can conclude that
for \(y \in [0,s-\phi (s))\cap S_3\) and \(s\in S_2 \cap S_+\). Let us consider \(s_0 \in S\) and \(y_0 \in [0,s_0-\phi (s_0)]\). Now we choose two sequences \((s_m)_{m=1}^\infty \) and \((y_m)_{m=1}^\infty \) such that \(s_m \searrow s_0,\) \(y_m \searrow y_0\) as \(m \rightarrow \infty \) and \(s_m \in S_2 \cap S_+,\) \(y_m \in S_3 \cap [0, s_m-\phi (s_m))\) for all \(m \in \mathbb {N}\). Obviously, \(s_m-\phi (s_m) \ge s_0-\phi (s_0)\). Therefore, by (11), we obtain
Letting \(m\rightarrow \infty \) and making use of Lemma 3 in case of Assumption 3 or Lemma 4 in case of Assumption 2, the continuity of u, and the continuity from the right of the functions V and \(s\rightarrow s-\phi (s)\), we infer that inequality (11) holds for \(s_0\in S\) and \(y_0 \in [0,s_0-\phi (s_0)]\). Finally, inequalities (10) and (11) yield that
Since \(\psi \) is non-decreasing and upper semicontinuous, it follows by Lemma 8 that \(g(\phi )=\psi \). Thus, the operator L is continuous. By the Schauder–Tychonoff fixed point theorem (see Corollary 17.56 in Aliprantis and Border (2006)), there exists \(\phi ^*\in F_2^0\) such that \(L{\phi ^*}=\phi ^*\). This implies that \((\phi ^*,\phi ^*) \in F_1^0 \times F_2^0\) is a symmetric SMPE.
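The operator L can be mimicked on a toy grid as follows. We stress that this is only a heuristic sketch: the theorem is obtained from the Schauder–Tychonoff fixed point theorem, which is not constructive, and the naive iteration \(\phi \mapsto L\phi \) below is not guaranteed to converge (it may cycle). The grid, the kernel, the felicity function, the number of sweeps and the starting guess are hypothetical, and we assume that \(V_\phi \) solves the Bellman equation with the maximand \(u(s-\phi (s)-y)-\frac{\beta }{\gamma }\ln \int _S e^{-\gamma V_\phi (z)}q(\mathrm{d}z|y)\).

```python
import math

BETA, GAMMA, TOP = 0.9, 0.5, 8
STATES = range(TOP + 1)           # toy grid of stock levels (assumption)

def u(c):
    return 1.0 - math.exp(-c)

def kernel(y):
    """Toy q(.|y): the invested stock doubles (capped at TOP) or stays, each w.p. 1/2."""
    probs = {}
    for z in (min(2 * y, TOP), y):
        probs[z] = probs.get(z, 0.0) + 0.5
    return probs

def certainty_equivalent(v, y):
    total = sum(p * math.exp(-GAMMA * v[z]) for z, p in kernel(y).items())
    return -math.log(total) / GAMMA

def feasible(s, phi):
    """Grid investments y with y <= s - phi[s]."""
    return range(int(math.floor(s - phi[s] + 1e-12)) + 1)

def best_investment(phi, sweeps=300):
    """Approximate V_phi by value iteration, then return the largest maximiser g(phi)."""
    v = [0.0] * (TOP + 1)
    def payoff(s, y):
        return u(max(s - phi[s] - y, 0.0)) + BETA * certainty_equivalent(v, y)
    for _ in range(sweeps):
        v = [max(payoff(s, y) for y in feasible(s, phi)) for s in STATES]
    g = []
    for s in STATES:
        best = max(payoff(s, y) for y in feasible(s, phi))
        g.append(max(y for y in feasible(s, phi) if payoff(s, y) >= best - 1e-12))
    return g

def L(phi):
    """Grid analogue of L phi(s) = (s - g(phi)(s)) / 2."""
    g = best_investment(phi)
    return [(s - g[s]) / 2 for s in STATES]

phi = [s / 4 for s in STATES]     # arbitrary starting guess with phi(s) <= s/2
for _ in range(20):
    phi = L(phi)
print(phi)
print(max(abs(a - b) for a, b in zip(phi, L(phi))))   # residual of the fixed-point equation
```

A residual of zero would indicate a symmetric fixed point on the grid; a positive residual only shows that this particular iteration has not settled, and says nothing about existence.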
6 Examples
In this section, we provide two examples satisfying our assumptions. Further examples can be found in Balbus et al. (2015a, b), Brock and Mirman (1972) and Jaśkiewicz and Nowak (2018b).
Example 1
Let \(\varOmega :=[0,1]\) and let \(\lambda \) be the standard Lebesgue measure. Let \(F:S\times \varOmega \mapsto S\) be Borel measurable, and non-decreasing and continuous in the first argument, such that \(F(0,\omega )=0\) for each \(\omega \in \varOmega \). Let \(F_y(\omega ):=F(y,\omega )\) for each \((y,\omega )\in S\times \varOmega \). Let q have the form \(q(\cdot |y):=\lambda F_y^{-1}(\cdot ).\) Obviously q is weakly continuous, hence Assumption 3 (ii) is satisfied. Clearly, if \(F(y,\cdot )\) is one-to-one for each \(y\in S\setminus \{0\}\), then \(q(\cdot |y)\) is non-atomic. Hence, Assumption 3 (i) is also satisfied. For example, we can consider a multiplicative shock \(F(y,\omega )=y^{\alpha }\omega \) with \(\alpha \in (0,1)\). The utility function for the agent can be, for instance, of the form \(u(c)=1-e^{-c}\) for \(c\in S\). Clearly u is increasing, bounded, strictly concave and continuous at 0. Hence, Assumption 1 is satisfied.
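For the multiplicative shock of this example, the certainty equivalent \(y\rightarrow -\frac{1}{\gamma }\ln \int _S e^{-\gamma f(z)}q(\mathrm{d}z|y)\) can be approximated by integrating over \(\omega \in [0,1]\). The sketch below uses the midpoint rule with \(f=u\); the values of \(\alpha \) and \(\gamma \), the number of quadrature nodes and the test points are arbitrary illustrative choices.

```python
import math

ALPHA, GAMMA = 0.5, 1.0           # illustrative parameter values (assumption)

def u(c):
    return 1.0 - math.exp(-c)

def F(y, omega):
    """Multiplicative shock: next stock from investment y and shock omega."""
    return (y ** ALPHA) * omega

def certainty_equivalent(f, y, n=2000):
    """-(1/gamma) * ln of the integral of exp(-gamma * f(F(y, omega))) over omega in [0, 1]."""
    total = sum(math.exp(-GAMMA * f(F(y, (k + 0.5) / n))) for k in range(n)) / n
    return -math.log(total) / GAMMA

values = [certainty_equivalent(u, y) for y in (0.0, 0.5, 1.0, 2.0, 4.0)]
print(values)
```

Because \(F(\cdot ,\omega )\) and u are increasing, the printed values increase with the investment y, and they stay below the bound \(d=1\) of u.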
Example 2
Let \(\mu (\cdot |y)\) be a non-atomic measure for each \(y\in S\setminus \{0\}\) and \(\mu (\{0\}|0)=1\). Furthermore, assume that \(\mu \) is stochastically increasing and weakly continuous. Suppose that \(f_j\) is increasing, continuous and \(f_j(0)=0\) for each \(j=1,\ldots ,m.\) Assume that \(\sum _{j=1}^m\alpha _j+\alpha _0=1,\) where \(\alpha _0,\alpha _j\in [0,1]\) for \(j=1,\ldots ,m.\) Let
Observe that q satisfies Assumption 2 (i) and (ii). For proving that Assumption 2 (iii) is satisfied, observe that
for each \(s\in S\). Hence, the cardinality of \(Z_s\) is at most m. As a result, q obeys Assumption 2. Here, we may assume that the utility function for both players has the following form:
Clearly u is increasing, strictly concave and continuous at 0. As a result, Assumption 1 is satisfied.
References
Aliprantis CD, Border KC (2006) Infinite dimensional analysis. Springer, Berlin
Anderson EW (2005) The dynamics of risk-sensitive allocations. J Econ Theory 125:93–150
Asienkiewicz H, Jaśkiewicz A (2017) A note on a new class of recursive utilities in Markov decision processes. Appl Math 44:149–161. https://doi.org/10.4064/am2317-1-2017
Balbus Ł, Jaśkiewicz A, Nowak AS (2015a) Stochastic bequest games. Games Econ Behav 90:247–256
Balbus Ł, Jaśkiewicz A, Nowak AS (2015b) Existence of stationary Markov perfect equilibria in stochastic altruistic growth economies. J Optim Theory Appl 165:295–315
Balbus Ł, Nowak AS (2004) Construction of Nash equilibria in symmetric stochastic games of capital accumulation. Math Methods Oper Res 60:267–277
Balbus Ł, Reffett K, Woźny Ł (2016) Dynamic games in macroeconomics. In: Başar T, Zaccour G (eds) Handbook of dynamic game theory. Springer, Cham, pp 729–778
Başar T (1999) Nash equilibria of risk-sensitive nonlinear stochastic differential games. J Optim Theory Appl 100:478–498
Bäuerle N, Jaśkiewicz A (2018) Stochastic optimal growth model with risk sensitive preferences. J Econ Theory 173:181–200
Bäuerle N, Rieder U (2011) Markov decision processes with applications to finance. Springer, Berlin
Bäuerle N, Rieder U (2017) Zero-sum risk-sensitive stochastic games. Stoch Process Appl 127:622–642
Brock WA, Mirman LJ (1972) Optimal economic growth and uncertainty: the discounted case. J Econ Theory 4:479–533
Dutta P, Sundaram R (1992) Markovian equilibrium in a class of stochastic games: existence theorems for discounted and undiscounted models. Econ Theory 2:197–214
Föllmer H, Schied A (2001) Stochastic finance. Walter de Gruyter, Berlin
Hansen LP, Sargent TJ (1995) Discounted linear exponential quadratic Gaussian control. IEEE Trans Autom Control 40:968–971
Hernández-Lerma O, Lasserre JB (1996) Discrete-time Markov control processes: basic optimality criteria. Springer, New York
Jaśkiewicz A, Nowak AS (2014) Stationary Markov perfect equilibria in risk sensitive stochastic overlapping generations models. J Econ Theory 151:411–447
Jaśkiewicz A, Nowak AS (2018a) On symmetric stochastic games of resource extraction with weakly continuous transitions. TOP 26:239–256
Jaśkiewicz A, Nowak AS (2018b) Non-zero-sum stochastic games. In: Başar T, Zaccour G (eds) Handbook of dynamic game theory. Springer, Cham, pp 281–344
Klompstra MB (2000) Nash equilibria in risk-sensitive dynamic games. IEEE Trans Autom Control 45:1397–1401
Levhari D, Mirman LJ (1980) The great fish war: an example using a dynamic Cournot–Nash solution. Bell J Econ 11:322–334
Majumdar M, Sundaram R (1991) Symmetric stochastic games of resource extraction: the existence of non-randomized stationary equilibrium. In: Raghavan TES, Ferguson TS, Parthasarathy T, Vrieze OJ (eds) Stochastic games and related topics. Kluwer, Dordrecht, pp 175–190
Nowak AS (2006) On perfect equilibria in stochastic models of growth with intergenerational altruism. Econ Theory 28:73–83
Serfozo R (1966) Convergence of Lebesgue integrals with varying measures. Sankhya A Indian J Stat (Ser A) 44:871–890
Sundaram R (1989) Perfect equilibrium in non-randomized strategies in a class of symmetric dynamic games. J Econ Theory 47:380–402. Corrigendum: vol 49, pp 385–387
Szajowski P (2006) Constructions of Nash equilibria in stochastic games of resource extraction with additive transition structure. Math Methods Oper Res 63:239–260
Weil P (1993) Precautionary savings and the permanent income hypothesis. Rev Econ Stud 60:367–383
Acknowledgements
We thank Anna Jaśkiewicz and Andrzej S. Nowak for helpful discussions. We are also grateful to two anonymous referees for their constructive comments. Hubert Asienkiewicz acknowledges the financial support from the National Science Centre in Poland under Grant 2016/23/B/ST1/00425.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Asienkiewicz, H., Balbus, Ł. Existence of Nash equilibria in stochastic games of resource extraction with risk-sensitive players. TOP 27, 502–518 (2019). https://doi.org/10.1007/s11750-019-00516-2
DOI: https://doi.org/10.1007/s11750-019-00516-2