Time consistent Markov policies in dynamic economies with quasi-hyperbolic consumers

Balbus, Łukasz; Reffett, Kevin; Woźny, Łukasz

doi:10.1007/s00182-014-0420-3

Time consistent Markov policies in dynamic economies with quasi-hyperbolic consumers

Published: 04 April 2014

Volume 44, pages 83–112, (2015)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

International Journal of Game Theory Aims and scope Submit manuscript

Time consistent Markov policies in dynamic economies with quasi-hyperbolic consumers

Download PDF

Łukasz Balbus¹,
Kevin Reffett² &
Łukasz Woźny³

501 Accesses
24 Citations
Explore all metrics

Abstract

We study the question of existence and computation of time-consistent Markov policies of quasi-hyperbolic consumers under a stochastic transition technology in a general class of economies with multidimensional action spaces and uncountable state spaces. Under standard complementarity assumptions on preferences, as well as a mild geometric condition on transition probabilities, we prove existence of time-consistent solutions in Markovian policies, and provide conditions for the existence of continuous and monotone equilibria. We present applications of our methods to habit formation models, environmental policies, and models of consumption under borrowing constraints, and hence show how our methods extend the results obtained by Harris and Laibson (Econometrica 69:935–957, 2001) to a broad class of dynamic economies. We also present a simple successive approximation scheme for computing extremal equilibrium, and provide some results on the existence of monotone equilibrium comparative statics in the model’s deep parameters.

Dynamic preferences and the behavioral case against sin taxes

Article 12 March 2021

Existence of Walrasian equilibria with discontinuous, non-ordered, interdependent and price-dependent preferences, without free disposal, and without compact consumption sets

Article Open access 24 November 2021

Closing the invisible hand: a rehabilitation of tâtonnement dynamics

Article 16 December 2021

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Since it was introduced in Strotz (1956), and further developed in Phelps and Pollak (1968) and Peleg and Yaari (1973), the problem of dynamic consistency in economic models has played an important role in work in many fields in economics. In particular, this problem has appeared in recent papers in such diverse topics as the theory of optimal consumption/savings, the role of liquidity constraints in dynamic asset markets, the behavioral foundations of economic choice, the role of commitment devices in dynamic models of self-control, the design of dynamic time-consistent environmental policies, models of social discounting and cost-benefit analysis, and various other papers studying the welfare implications of public policy in dynamic models.^{Footnote 1} The classical toolkit for studying these problems has emphasized the language of recursive decision theory^{Footnote 2}, which was first introduced in Strotz (1956). As observed by many researchers in subsequent discussions [e.g., Peleg and Yaari (1973) and Bernheim and Ray (1986)], a key problem with this recursive decision theory is that optimal dynamically consistent (Markov) plans need not exist, let alone be simple to characterize or compute. One key reason for the failure of existence lies in the seemingly inherent presence of discontinuities in intertemporal preferences that arises naturally in the dynamic structure of these problems when recursive decision theory approaches are attempted. The source of this lack of continuity is the lack of commitment between the current ”versions” of the decision maker and all her continuation “selves”.^{Footnote 3} Due to this discontinuity, the optimal level of “commitment” may be nonexistent, and the dynamic maximization problem can turn out to be poorly defined [see, for example, Caplin and Leahy (2006) for an excellent discussion of this fact].^{Footnote 4}

As a way of circumventing these problems, Peleg and Yaari (1973) proposed a dynamic game interpretation of the time-consistency problem.^{Footnote 5} In this view, one envisions the decision maker playing a dynamic game between one’s current self and each of her future ”selves”, with the appropriate solution concept in the game being a subgame-perfect Nash equilibrium (SPNE, henceforth). A SPNE of the appropriate game need not be an optimal time-consistent policy, however. This fact is due, in part, to the dynamic decision theoretic approach being proposed itself, where future ties are broken in favor of a current self, and that observation is not necessarily true for a SPNE of a dynamic game. Additionally, the set of SPNE may be very large, and most importantly, not necessarily possess the element with the greatest value. Hence, an optimal SPNE (i.e., a SPNE that corresponds to some optimal time-consistent policy) may simply not exist. Moreover, even if the question of existence of SPNE is resolved, the existence of Stationary Markov Nash Equilibria^{Footnote 6} (henceforth, SMNE) is still not guaranteed [see Bernheim and Ray (1986) and Leininger (1986)].

In this paper, we develop a new approach in the tradition of the classical Strotz (1956) recursive approach to studying equilibrium in the Phelps and Pollak (1968) game theoretic representation of the problem, but with an emphasis on developing constructive methods for characterizing SMNE, as well as methods for computing them. The underlying game theoretic structure is that of a stochastic game. In our setting, we seek conditions under which a simple class of stable iterative algorithms exists that can both (i) characterize the existence of SMNE from a theoretical perspective; (ii) provide explicit and accurate algorithms for computing particular elements of this set; (iii) characterize the optimal time consistent policy among the set of SMNE solutions, and (iv) provide methods that are stable (in some well-defined sense) under perturbation of deep parameters. We provide conditions under which affirmative answers to all of these questions can be given.

More specifically, under standard assumptions on preferences and certain geometric condition on a transition probability, we are able to show existence of the greatest and the least SMNE, as well as provide conditions where they are (Lipschitz) continuous or monotone. Further, and equally as important, we characterize the set of all values corresponding to time-consistent policies, showing that the set of SMNE is a countably chain complete poset^{Footnote 7} containing the least and the greatest elements. This characterization of the set of SMNE allows us to verify the existence, and compute, the greatest value function associated with SMNE, and hence compute the optimal time-consistent policy. This fact, along with our constructive methods, allows us to link directly the game-theoretic analysis of our problem with that predicted by recursive decision theory.

We next turn to the question of computation of SMNE, as well as addressing the question of equilibrium comparative statics relative to ordered perturbations of the deep parameters of the game. This latter set of questions is also critical, as it allows us to develop a theory of computable equilibrium comparative statics. That is, we are able to construct a simple approximation scheme that is able to compute monotone comparative statics relative to extremal time-consistent SMNE policies with respect to the model parameters. These comparative statics and computation/approximation results are important for applied research in the field.^{Footnote 8}

From a technical perspective, our methods complement the ideas found in the important papers of Bernheim and Ray (1986) and Harris and Laibson (2001), where the authors add noise with invariant support to the problem, which in turn allows them to develop conditions that guarantee the existence of a time-consistent policy in spaces of functions with locally bounded variation or that are Lipschitzian for sufficiently small hyperbolic discount factor.^{Footnote 9} What is critical in understanding the differences between the approach in Harris and Laibson (2001) as opposed to those developed in the present paper is that our methods do not rely on so-called “generalized Euler equation” (GEE) methods.^{Footnote 10} Rather, our methods emphasize value iteration type methods, and hence are more in the spirit of “promised utility methods” but defined in spaces of functions [as opposed to spaces of correspondences in the promised utility literature as in the APS type approaches found in Bernheim et al. (1999) and Chade et al. (2008)]. What is equally as important is that our methods are therefore able to link the underlying stochastic game studied in Harris and Laibson (2001) with a recursive or value function methods suggested by Strotz (1956) [and further developed by Caplin and Leahy (2006)], and therefore it provides in the context of our stochastic framework a unification of both of decision-theoretic and game-theoretic approaches that have been taken in the existing literature.

The rest of the paper is organized as follows. We start in Sect. 2 presenting a general fixed point result that extends the theorems of Tarski (1955) per existence and Veinott (1992) per fixed point comparative statics to countably chain complete posets. We need these results as neither Tarski (1955) nor Veinott (1992)’s results can be applied in our problem at hand. In Sect. 3, we then specify our general model, state our assumptions, while in 4 discuss our main theorems on existence and computation of SMNE in such models. In Sect. 5, we present three examples showing how our general model and tools can be used in applications, as well how they can be extended to more general problems. Finally, Sect. 6 concludes by discussing in details the related results.

2 Preliminary result

We begin by stating a new fixed point result that is essential in all the subsequent analysis in this paper. The theorem is related to a well-known result characterizing the set of fixed points for monotone transformations of complete lattices^{Footnote 11} due to Tarski (1955).^{Footnote 12} Recall, Tarski’s theorem says that an isotone transformation of a nonempty complete lattice has a nonempty complete lattice of fixed points. Tarski’s theorem was later generalized by Markowsky (1976) to the case of isotone transformations of chain complete partially ordered sets (i.e., an isotone transformation of a nonempty chain complete partially order sets has a nonempty chain complete set of fixed points). Unfortunately, we cannot work with either of these theorems in this paper as our isotone maps transform domains that are neither complete lattices nor chain complete partially ordered sets. Rather, our mappings transform countable chain complete partially ordered sets.

Therefore, we need to begin by proving a new result that is an analog to the Tarski/Markowsky theorems for countably chain complete partially ordered sets.^{Footnote 13} We also need to extend well-known fixed point comparative statics result due to Veinott (1992) to this new context.^{Footnote 14} We start with an important definition^{Footnote 15}.

Definition 1

A function $F:X\rightarrow X$, where $X$ is a poset, is monotonically-sup-preserving, if for any monotone sequence $\{x_{n}\}_{n=0}^{\infty }$ we have: $F\left( \bigvee x_{n}\right) =\bigvee F(x_{n})$. We define monotonically inf-preserving functions analogously. $F$ is said to be monotonically-sup-inf-preserving if and only if, it is both monotonically-sup and monotonically-inf-preserving.

The property of monotonically sup (resp., inf) preserving mapping is a type of sequential “order continuity” property of a mapping in the Scott topology. For example, a mapping that is monotonically sup-inf preserving is also referred to in the literature as a sigma-order continuous mapping [e.g., Dugundji and Granas (1982, p.15)]. It bears mentioning that such order continuity properties for operators play an essential role in the computation of fixed point for isotone maps in countably chain complete partially ordered sets (i.e., obtaining convergence of successive approximation schemes where iterations are indexed on the natural numbers).^{Footnote 16}

We now state our new theorem which characterizes the order structure of the set of fixed points of a parameterized monotone increasing self map defined on a countably chain complete partially ordered set. We begin with some useful definitions. Let ($X,\ge )$ be a partially ordered set (i.e., $X$ is equipped with an order relation $\ge :X\times X\rightarrow X $ that is reflexive, antisymmetric and transitive). If every element of a poset $X$ is comparable in order, then $X$ is chain. If $X$ is a chain and countable, $X$ is a countable chain. In a poset $X$, if every chain $C\subset $ $X$ is complete, then $X\ $is referred to as a chain complete partially ordered set. If every countable chain $C\subset X$ is complete, then $X$ is referred to as a countably chain complete poset.

Our result has three parts: (a) characterization of the set of fixed points, (b) fixed point comparative statics, and (c) result on the computation of fixed points via successive approximations. Our contribution is part (a) and (b) of the theorem (not part (c), which is the Tarski–Kantorovich theorem). The proof is technical, and is found in the appendix.

Theorem 1

Let $F:X\times T\rightarrow X$ be a parameterized monotone increasing operator with $T$ a poset, $X$ a countably chain complete poset with the greatest and least element, $X\times T$ given the product order. Let the fixed point set of $F(\cdot ,t)$ be denoted by $ \Phi (t)$. If for every $t\in T,$ the function $F(\cdot ,t)$ is monotonically sup-inf preserving, then

(a)
$\Phi (t)$ is a non-empty countably chain complete poset in induced order.
(b)
Moreover, the least and greatest fixed point selections $t\rightarrow {\underline{\Phi }}(t):=\wedge $ $\Phi (t)$ and $t\rightarrow {\overline{\Phi }}(t):=\vee $ $\Phi (t)$ are isotone.
(c)
Finally, for the greatest ${\overline{\theta }}$ (resp, least ${\underline{\theta }}$) elements of $X$, we have:
$$\begin{aligned} \inf _{n}F^{n}(\overline{\theta },t)=\overline{\Phi } (t) \quad \left( \mathrm{{resp.}} \sup _{n}F^{n}(\underline{\theta },t)=\underline{\Phi }(t)\right) . \end{aligned}$$

A few remarks on this result. As previously mentioned, part (a) generalizes Tarski (1955) and Markowsky (1976) to the context of countable chain complete poset. The key additional fact to notice is that this result requires a stronger property for the mapping $F$ in $x$, for every $t\in T$ (namely, $F$ needs to be sigma-order continuous in $x$ to even obtain existence). Part (b) of the theorem is essentially Veinott’s fixed point comparative statics result adapted to the context of a countably chain complete partially ordered set. Finally, part (c) is related to the computational results for $\sigma -$complete lattices (resp, countably chain complete partially ordered sets) found in Vulikh (1967), Lemma XII.2.1 [resp., Tarski-Kantorovich, see Theorem 4.2 in Dugundji and Granas (1982)].

Now, obviously, weakening of conditions in our results relative to previous work does come at a cost. For example, per part (a), the additional assumption of order continuity implies the converse (necessity) results that are also proven relative to isotone maps for Tarski’s theorem for complete lattices [see Davis (1955), Theorem 2] and Markowsky’s theorem for chain complete partially ordered sets [see Markowsky (1976), Theorem 11] will not hold in our case. That is, we cannot characterize a countably chain complete partially ordered set using the fixed point property of the space relative to particular classes of monotone increasing mappings. So our new answers relate to sufficiency, not necessity.

Finally, it is also worth mentioning that in part (b) of the Theorem 1, we do generalize the fixed point comparative statics result of Veinott (1992) in some important directions. First, as mentioned before, $\Phi $ is now only countably chain complete valued, so we require less structure on the underlying domain of our operators. Further, in conjunction with part (c) of the theorem, we are able to compute these fixed point comparative statics (with convergence structures studied relative to the Scott topology). Of course, this also comes at the expense of added order continuity conditions. Also, it bears mentioning that as both the top and bottom elements of $\Phi $ are increasing selections, the correspondence $\Phi $ is actually directed upward and directed downward (hence, ascending in the “weak induced set order”). This is also true, for example in Veinott’s theorem for the case of complete lattices. So we obtain his fixed point comparatives statics in the weaker setting of countably chain complete partially ordered sets.

3 Benchmark model

With these results in mind, we can now describe the model we study in the paper. Our environment is a multidimensional version of $\beta -\delta $ quasi-hyperbolic discounting model that has been studied extensively in the literature. We envision an agent to be a sequence of “selves” indexed in discrete time $t\in T=\{0,1,\ldots \}$. A “current self” or “self $t$” enters the period in given state $x_{t}\in S$ , where^{Footnote 17} $S=[0,\overline{S }]\subset \mathbb {R}^{n}$ or $S=[0,\infty )\subset \mathbb {R}^{n}$, and chooses a vector of actions denoted by $a_{t}\in A\subset \mathbb {R}^{m}$. These choices, together with current state $x_{t}$, determine a stochastic transition probability on the next period state $x_{t+1}$ given by $ Q(dx_{t+1}|x_{t},a_{t})$.

The self $t$ preferences are represented by a utility function given by:

$$\begin{aligned} u(a_{t})+\beta E_{t}\sum _{i=t+1}^{\infty }\delta ^{i-t}u(a_{i}), \end{aligned}$$

(1)

where $1\ge \beta >0$ and $1>\delta \ge 0$, $u$ is an instantaneous payoff function, and expectations $E_{t}$ are taken with respect to a realization of a random variable $x_{i}$ drawn each period from a transition distribution $Q$, where this expectation is well-defined by the Ionescu–Tulcea theorem.

Under some continuity assumptions on $u$ and $Q$ (to be specified later), we can define a SMNE for the quasi-hyperbolic consumer to be an $h\in \mathcal {H }$, where $\mathcal {H}=\{h:S\rightarrow A|h\, \mathrm{{ is\,bounded\,and}}$ $\mathrm{{Borel\, measurable\, with }}\, h(x)\in A(x)\}$, that satisfies the following functional equation:

$$\begin{aligned} h(x)\in \arg \max _{a\in A(x)}u(a)+\beta \delta \int \limits _{S}V_{h}(y)Q(dy|x,a), \end{aligned}$$

(2)

where $V_{h}:S\rightarrow \mathbb {R}$ is a continuation value function for the household of “future” selves that are successors to the self $t$, and the future selves follow a stationary policy $h$ from tomorrow onward.

This statement implies that the value function in (2) that is defined for the future selves in a Markovian equilibrium must also solve the following functional equation in the continuation that is given recursively as follows:

$$\begin{aligned} V_{h}(x)=u(h(x))+\delta \int \limits _{S}V_{h}(y)Q(dy|x,h(x)). \end{aligned}$$

(3)

Therefore, if we define the value function for the self $t$ to be:

$$\begin{aligned} W_{h}(x):=u(h(x))+\beta \delta \int \limits _{S}V_{h}(y)Q(dy|x,h(x)), \end{aligned}$$

for the time consistent policy $h$ one obtains the relation

$$\begin{aligned} V_{h}(x)=\frac{1}{\beta }W_{h}(x)-\frac{1-\beta }{\beta }u(h(x)). \end{aligned}$$

(4)

Based on equation (4), we can define an operator whose fixed point, say $V^{*}$, corresponds to a value for some time-consistent Markov policy.

We need to make some assumptions on the primitive data of the game to use our parameterized fixed point results in Sect. 2. Along these lines, we make the following assumptions:

Assumption 1

Let us assume:

$A(x)\subset A\subset \mathbb {R}^m$ is compact and complete lattice valued with $A(0)=\{0\}$,
$u:A\rightarrow \mathbb {R}_+$ is continuous, increasing and supermodular with $u(0)=0$ and $u(\cdot )\le \overline{u}$,
for any $x,a\in S,$ let $Q(\cdot |x,a)=g_{0}(x,a)\delta _{0}(\cdot )+\sum _{j=1}^{J}g_{j}(x,a)\lambda _{j}(\cdot |x)$,
$(\forall j=1,\ldots ,J)\,g_{j}:S\times A\rightarrow [0,1]$ is continuous, with $g_{j}(0,a)=0$ and $\sum _{j=0}^{J}g_{j}(x,a)=1$ for all $a$ ; for all $x,$ with the function $a\rightarrow g_{j}(x,a)$ supermodular and decreasing,
$\delta _0$ is a delta Dirac measure concentrated at point $0$, while $ (\forall j=1,\ldots ,J)\,\lambda _j(\cdot |x)$ is a Borel transition distribution on $S$ for any $x\in S$.

Our assumptions on preferences are fairly standard, but require a few remarks relative to the work of Harris and Laibson (2001). Before doing that let us stress that our aim is not to weaken some conditions of their model but rather obtain new results [e.g. on computation and comparative statics]. Still, as they work is the most closely related to our paper, however, we owe the reader some specific discussion of our and their assumptions.

First, we assume bounded returns, which is not required in Harris and Laibson’s work, but we also allow for unbounded risk aversion (that is actually needed in their approach). The reason we make the assumption of bounded returns is quite natural, as we are studying a stochastic game with potentially an unbounded state space and many sources of shocks. Although, in principle, this assumption might be relaxed, it would require potentially very strong joint restrictions on payoffs and noise (especially in the case of returns unbounded below).^{Footnote 18}

Second, we allow for multidimensional choice spaces as well as state spaces. To do this, we impose supermodularity structure on the payoffs, which we need to obtain monotone operators in the quasi-hyperbolic decision-makers optimization problems (more on this in a moment). If we are solving a (single-dimensional) consumption-investment version of the model as in Harris and Laibson (2001), we obviously do not need this supermodularity condition. We also do not impose twice continuous differentiability nor strict monotonicity of a utility function.

Also, our assumptions on a transition probability require few remarks. First, we impose that the stochastic transition $Q$ is defined as a convex combination of $J$ measures $\lambda _{j}$ and one measure $\delta _{0}$ Dirac concentrated at point zero. Hence, with probability $1-\sum _{j=1}^{J}g_{j}(x,a),$ the next period state is zero, and with probability $g_{j}(x,a)$ it is drawn from $\lambda _{j}$. Also, we separate action variables $a$ and state variables $x$ in $Q$, i.e. $\lambda _{j}$ are not dependent on a decision $a$. Our mixing condition on the stochastic transitions in the game is quite common in the literature, and was first introduced in Amir (1996), and later developed extensively by Nowak (2003), Balbus and Nowak (2008) or Balbus et al. (2013). The condition has also been used to studying Markovian equilibrium in a very general class of stochastic supermodular games in Balbus et al. (2014). We should mention, even this assumption can be weakened a great deal per questions of existence (e.g., per the application of the celebrated APS procedure), but this weakening of sufficient conditions comes at the cost of not being able to computing both equilibrium values and pure strategy Markovian equilibrium.

Finally, as far as a direct comparison with Harris and Laibson (2001) per stochastic transition probabilities, our model generates more sources of noise than theirs, as in our case, not only is labor income random, but wealth (or capital) is also draw from $Q$. Also, we do not require that $Q$ has a density, let alone impose conditions on its degree of smoothness.

4 Main results

4.1 Existence

We first consider the question of existence of Markovian equilibrium. Let $\mathcal {V}$ be a space of bounded (by $0$ and $\frac{\overline{u}}{1-\delta }$), Borel measurable, real valued functions on $S$, with $V(0)=0$ equipped with a pointwise partial order. For a given value $V\in \mathcal {V}$, we construct a correspondence $T$ by:

$$\begin{aligned} TV(x)=\frac{1}{\beta }{\textit{CV}}(x)-\frac{1-\beta }{\beta }u({\textit{BV}}(x)), \end{aligned}$$

(5)

where the pair of operators $C$ and $B$ defined on space $\mathcal {V}$ are given by:

$$\begin{aligned} {\textit{CV}}(x)&= \max _{a\in A(x)}\left\{ u(a)+\beta \delta \int \limits _{S}V(y)Q(dy|x,a)\right\} , \end{aligned}$$

(6)

$$\begin{aligned} {\textit{BV}}(x)&= \arg \max _{a\in A(x)}\left\{ u(a)+\beta \delta \int \limits _{S}V(y)Q(dy|x,a)\right\} . \end{aligned}$$

(7)

Notice, in the above, we have defined the operator $B$ to map between candidates for equilibrium values $\mathcal {V}$ to spaces of pure strategy best replies $\mathcal {H}$. So in effect, we have a pair of operator equation we need to solve to construct equilibrium values $V^{*}\in \mathcal {V}$.

Surely $T$ maps $\mathcal {V}$ into $2^{\mathcal {V}}$. Further, for any fixed point $V^{*}$ of an operator $T$, this value function corresponds to a stationary, time-consistent Markov policy $h^{*}\in BV^{*}\in \mathcal {H}$. By $\overline{T}$ denote the greatest and by $\underline{T}$ the least selection from operator $T$. Equip the space of pure strategies $ \mathcal {H}$ with a pointwise partial order. In this case, we obtain:

Lemma 1

Let assumption 1 hold then $C:\mathcal {V }\rightarrow \mathcal {V}$ is increasing and $\overline{B},\underline{B}:\mathcal {V}\rightarrow \mathcal {H}$ are decreasing. Moreover, both $\overline{T}$ (resp. $\underline{T}$) are increasing and monotonically-inf (resp. sup) preserving.

Proof

$C$ is increasing by definition. To see monotonicity of $B,$ consider a function

$$\begin{aligned} G(a,x,V)=u(a)+\beta \delta \sum _{j=1}^{J}g_{j}(x,a)\int \limits _{S}V(y)\lambda _{j}(dy|x). \end{aligned}$$

Then for any $V\in \mathcal {V}$ and $x\in S,$ the function $G(\cdot ,x,V)$ is supermodular.

Moreover, $(a,V)\rightarrow g_{j}(x,a)\int _{S}V(y)\lambda _{j}(dy|x)$ has decreasing differences. To see this fact, observe we have the following inequalities:

$$\begin{aligned}&[g_{j}(x,a_{2})-g_{j}(x,a_{1})]\int \limits _{S}V_{2}(y)\lambda _{j}(dy|x), \\&\quad \le [g_{j}(x,a_{2})-g_{j}(x,a_{1})]\int \limits _{S}V_{1}(y)\lambda _{j}(dy|x), \end{aligned}$$

where $V_{2}\ge V_{1}$ and $a_{2}\ge a_{1}$. Therefore, for any $x\in S,$ the function $(a,V)\rightarrow G(a,x,V)$ has decreasing differences on $A(x)\times \mathcal {V}$. Since $A(x)$ is a lattice and $\mathcal {V}$ a poset, we obtain by Topkis (1978) theorem that the extremal selections ($\overline{B}$ and $\underline{B}$) of the best reply $BR(V)(x)=\arg \max _{a\in A(x)}G(a,x,V)$ are decreasing on $\mathcal {V}$. Since $C$ is increasing and $\overline{B},\underline{B}$ are decreasing, by definition of $\underline{T}$ and $\overline{T}$, we conclude that both extremal selections of $T$ are increasing.

We now show that $\underline{T}$ is monotonically sup preserving. Let $ \{V_{n}\}_{n=1}^{\infty }\subset \mathcal {V}$ be increasing sequence in the natural product order. Let $V_{n}\rightarrow V$ pointwise. Clearly, $V(x)=\sup \limits _{n\in \mathbf {N}}V_{n}(x)$. We need to show $\lim \limits _{n\rightarrow \infty }\underline{T}(V_{n})=\underline{T}(V)$. By Lebesgue Dominating Theorem, we immediately obtain:

$$\begin{aligned} \int \limits _{S}V_{n}(y)\lambda _{j}(dy|x)\rightarrow \int \limits _{S}V(y)\lambda _{j}(dy|x)\quad \mathrm{{as }}\,n\rightarrow \infty , \end{aligned}$$

for all $j$ and $x$. For fixed $x,$ let $a_{n}:=\overline{B}(V_{n})(x)$. Since $a_{n}$ belongs to compact set $A(x),$ without loss of generality, let us assume $a_{n}\rightarrow a_{0}$. Then by definition of $G$, we have:

$$\begin{aligned} G(a_{n},x,V_{n})\ge G(a,x,V), \end{aligned}$$

for all $a\in A(x)$. Taking limits, we obtain:

$$\begin{aligned} G(a_{0},x,V)\ge G(a,x,V), \end{aligned}$$

for all $a\in A(x)$. Hence, $a_0=\lim \limits _{n\rightarrow \infty }\overline{B}(V_{n})(x)\in B(V)(x)$. Further,

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }C(V_{n})(x)&= \lim \limits _{n\rightarrow \infty }G(\overline{B}(V_{n})(x),x,V_{n}) \\&= G(a_0,x,V)=C(V)(x). \end{aligned}$$

Therefore, $\lim \limits _{n\rightarrow \infty }\underline{T}(V_{n})(x)\ge \underline{T}(V)(x)$. Moreover, $\underline{T}(V)(x)\ge \underline{T}(V_n)(x)$ since $\underline{T}$ is isotone. As a result $\lim \limits _{n\rightarrow \infty }\underline{T}(V_{n})(x)= \underline{T}(V)(x)$. By isotonicity of $\underline{T}$, the iterations $\underline{T}(V_{n})(x)$ form an increasing sequence. Therefore, we have:

$$\begin{aligned} \sup \limits _{n\in \mathbf {N}}\underline{T}(V_{n})(x)=\lim \limits _{n\rightarrow \infty }\underline{T}(V_{n})(x)=\underline{T}(V)=\underline{T}\left( \sup \limits _{n\in \mathbf {N}}V_n(x)\right) . \end{aligned}$$

i.e., $\underline{T}$ is monotonically-sup-preserving. Analogously, we show that $\overline{T}$ is monotonically-inf-preserving. $\square $

Having lemma 1 in hand, we are now in a position to analyze fixed points of a monotone operator $T$.

Theorem 2

(Existence of extremal SMNE) Let assumption 1 hold. Then, the set of equilibrium values corresponding to stationary, time-consistent Markov policies is nonempty and possesses the greatest $\overline{h}^{*}$ and the least $\underline{h}^{*}$ elements, and they correspond to the least $v^{*}=\underline{T}v^{*}$ and the greatest $ w^{*}=\overline{T}w^{*}$ values.

Proof

By Lemma 1, the operator $\underline{T}:\mathcal {V}\rightarrow \mathcal {V} $ is increasing. Moreover, by Lemma 1, we have $\underline{T}$ is monotonically-sup preserving. As $\mathcal {V}$ is a countably chain complete poset by Theorem 1, $\underline{T}$ has a nonempty set of fixed points, with the greatest and the least elements. We conclude similar results for $\overline{T}$. $\square $

Theorem 2 is our central result on existence, and it requires a few remarks. First, aside from asserting the existence of time-consistent equilibrium Markov policy (pure strategy), it also asserts that the set of equilibrium values has a particular poset structure; namely, the set of equilibrium values has the greatest and the least elements. This result, in turn, implies that the set of time-consistent value functions is bounded. Second, for any initial state $x\in S$, the theorem indicates there exists the greatest time-consistent value (and the least equilibrium policy) that are optimal among all the time-consistent values. So, in general, some equilibrium values are ranked. Moreover, if $\underline{T}=\overline{T}$, the set of SMNE values is a countable chain complete poset.

We can relate the nature of our existence result to those that can be obtained using other approaches found in the existing literature that differ from our approach [as well as the GEE approach in Harris and Laibson (2001)]. First, notice our Theorem 2 is based in a type of value-policy iteration procedure, and resembles in an abstract sense APS type procedures as suggested by the work of Bernheim et al. (1999) and Chade et al. (2008) for sequential equilibrium strategies. Further, to deal with the complications associated with measurability, we only work in function spaces (as opposed to spaces of correspondences). In an APS type method for our problem, one would construct a different operator that map between spaces of value correspondences ordered under set inclusion, where the relevant topology for convergence issues would be the weak star topology. A critical problem for such a method for our class of games concerns handling multidimensional state spaces. In particular, as the set of measurable selections from the Nash equilibrium value set need not be weak-star closed, it is very difficult to get sufficient conditions for even existence using APS type methods unless the state space is either countable or the real line. So it is not clear how to use these methods for multistate models. Now, in the case of a single dimensional state space (or countable state space), it is very easy to check self-generation property of the APS value operator. Then noting the natural monotone structure of the operator under set inclusion order, one can show convergence to the greatest fixed point that contains all the sequential equilibrium values. Unfortunately, as this is not a repeated game, it is difficult to say anything substantial about the set of sequential equilibrium strategies (mixed or pure) that generate this set of values.^{Footnote 19} See also Chade et al. (2008) for the extensions of these methods to the multi-player case for a repeated game.

4.2 Computation

We next turn to the question of the computation of equilibrium. This question is particularly important in applied work as often researchers want to simulate/calibrate/estimate SMNE. We first use our main existence result to prove our central theorem on the computation of extremal equilibrium values (and their supporting pure strategy SMNE). We then provide additional characterizations of equilibrium strategies that achieve these values.

Theorem 3

(Pointwise approximation of extremal values) Assume 1 and consider two sequences $\{v_t\}^\infty _{t=0}$ and $\{w_t\}^\infty _{t=0}$ where^{Footnote 20} $v_0()=0$, $w_0(x)=\frac{\bar{u}}{1-\delta } \mathbf {1}_{(0,\infty ]}(x)$ and $v_t=\underline{T}v_{t-1}$ and $w_t= \overline{T}w_{t-1}$. Then $(\forall x\in S)\,\lim _{t\rightarrow \infty } v_t(x)=v^*(x)$ and $\lim _{t\rightarrow \infty }w_t(x)=w^*(x).$

Proof

Clearly $v_{1}\ge v_{0}$. Since $\underline{T}$ is monotone, we can conclude $v_{t}\ge v_{t-1}$. As a result, sequence $\{v_{t}\}$ is increasing. As it is also bounded above, it is convergent, say to $\bar{v}$. Further, it is straightforward to show by Lebesgue Dominance Convergence theorem, Lemma 1 and Kall (1986), that $v^{*}=\bar{v}$. Similarly, we show that $ \{w_{t}\}_{t=0}^{\infty }$ is decreasing and convergent to $w^{*}$. $\square $

A couple of remarks on Theorem 3. First, it provides a very simple constructive method for calculating (pointwise) two of time-consistent values, as well as their supporting policies (including those that are optimal). The theorem, though, gives us much more. In particular, it allows us to calculate the pointwise bounds for any time-consistent equilibrium strategy as well. Finally, of course, if the limits of two sequences analyzed in theorem coincide for any initial state $ x\in S$, then the uniqueness of time-consistent policy is guaranteed. ^{Footnote 21}

Second, the theorem (in conjunction with Theorem 2) also provides computable bounds on equilibrium behavior. In particular, from least (resp, greatest) values with corresponding greatest (resp., least) actions, iterations on our monotone operators can compute SMNE least (resp, greatest) values with corresponding greatest (resp., least) actions. This is particularly important in applied work when numerical implementations of our methods are constructed, if models have approximately the same extremal SMNE for the sets of parameters that are “close” (say extremal SMNE are “close” in a sup-norm topology when parameters of a given model are “close” in some metric), this gives one a chance of studying the “robustness” or “stability” of the predictions of the model at hand. Obtaining such bounds can be formalized using order. Further, in an ordered metric space (as we have in the present situation), pointwise order bounds translate into metric bounds (in our case, uniform metric bounds). But alternative $L_{p}$ metrics could also be developed. It is not clear how one could obtain a similar result using GEE methods ala Harris and Laibson (2001) (for, in the approach of Harris and Laibson (2001), the methods are inherently local, and obtain a global sensitivity analysis for SMNE would require a globalization of their analysis). Perhaps most importantly, our methods also allow one to construct bounds per the optimal time consistent policy versus other SMNE that are time consistent. Using Harris and Laibson (2001), it is not clear how to establish whether any particular equilibrium constructed using GEE is optimal (hence, no such comparison is possible)^{Footnote 22}.

The next two results identify sufficient conditions that allow us to further characterize the smoothness and monotonicity of any time consistent equilibrium policy function $h^{*}$.

Theorem 4

(Monotonicity of policies) Assume 1, and that each $g_j$ has increasing differences in $(x,a)$. Consider any time-consistent policy $ h^{*}$. If each $\lambda _{j}(\cdot |x)$ is constant with $x,$ and $x\rightarrow A(x)$ is strong set order increasing, then each time-consistent equilibrium policy $h^{*}$ is increasing.

Proof

Let $h^{*}={\textit{BV}}^{*}$ for some $V^{*}\in {\textit{TV}}^{*}$. Consider the function

$$\begin{aligned} G(a,x,V^{*})=u(a)+\beta \delta \sum _{j=1}^{J}g_{j}(x,a)\int \limits _{S}V^{*}(y)\lambda _{j}(dy). \end{aligned}$$

Observe $G$ is supermodular in $a$ on a lattice $A(x)$, and the feasible action set $A(x)$ is increasing in the Veinott’s strong set order. Moreover, by assumption on $g_{j}$, we conclude $G$ has increasing differences with $(a,x)$. By Topkis (1978) theorem argument maximizing $ h^{*}$ is increasing with $x$ on $S$. $\square $

To obtain such strong characterization of equilibrium time-consistent policies, we require that $\lambda _{j}$ are independent of state $x$. Although such assumption has been imposed in many related papers [see Nowak (2006) or Amir (2002)], a natural question to ask is whether one can obtain similar monotonicity results while still allowing the measures $ \lambda _{j}$ to be dependent on $x$. So it bears mentioning why it may be difficult to obtain such characterization, when $\lambda _{j}(\cdot |x)$ is e.g. stochastically ordered with $x$. Notice, if $V^{*}$ is increasing, and all $\lambda _{j}(\cdot |x)$ are stochastically decreasing, that is sufficient to obtain an increasing differences property between the control $ a$ and state $x$. But to assure for the Bellman operator $C$ that $V^{*}$ is increasing, one would like to assume that each of the $\lambda _{j}(\cdot |x)$ is stochastically increasing with $x$. Hence, to get monotonicity in this very general setting, we need each $\lambda _{j}$ independent on $x$ for all $j$. We also remark that the assumption that $\lambda _{j}(\cdot |x)$ is independent of $x$ means our noise is similar to that in Harris and Laibson (2001), but only for a multidimensional choice/state case.

We next turn to the question of continuous time consistent policies. For this, we impose the following Feller type property on the noise.

Assumption 2

$(\forall j=1,\ldots ,J)\,\lambda _{j}(\cdot |x)$ is strongly stochastically continuous (i.e. the function $x\rightarrow \eta _{f}^{j}(x):=\int \limits _{S}f(y)\lambda _{j}(dy|x)$ is continuous for any $ f\in \mathcal {V)}$.

With this assumption in place, we now prove a theorem that studies the continuity structure of equilibrium time consistent policies.

Theorem 5

(Continuity of policies) Let 1 and 2 hold with $u$ strictly concave and $g$ concave in $a$. Then, each time-consistent equilibrium policy $h^{*}$ is continuous.

Proof

Let $V_{h^{*}}\in \mathcal {V}$ be equilibrium payoff under time-consistent policy $h^{*}$. Then, by Assumption 1, the mapping

$$\begin{aligned} x\rightarrow \zeta _{h^{*}}^{j}(x):=\int \limits _{S}V_{h^{*}}(y)\lambda _{j}(dy|x) \end{aligned}$$

is continuous. Notice, the function

$$\begin{aligned} F_{h^{*}}(a,x):=u(a)+\beta \delta \sum _{j=1}^{J}\zeta _{h^{*}}^{j}(x)g_{j}(x,a) \end{aligned}$$

is also continuous and strictly concave with respect to $a$ for fixed $x>0$. Let $x_{n}\rightarrow x_{0}$. Since $h^{*}(x)=\mathrm{{arg}}\, \max \limits _{a\in A(x)}F_{h^{*}}(a,x)$, we have

$$\begin{aligned} F_{h^{*}}(h^{*}(x_{n}),x_{n})\ge F_{h^{*}}(a,x_{n}). \end{aligned}$$

Without loss of generality, suppose $h^{*}(x_{n})\rightarrow a_{0}$. By the continuity of $F_{h^{*}}$, we have

$$\begin{aligned} F_{h^{*}}(a_{0},x_{0})\ge F_{h^{*}}(a,x_{0}). \end{aligned}$$

By the strict concavity of $F_{h^{*}}(\cdot ,x)$ and definition of $ h^{*},$ we obtain $a_{0}=h^{*}(x_{0})=\lim \limits _{n\rightarrow \infty }h^{*}(x_{n})$. $\square $

4.3 Monotone comparative statics

Finally, motivated by the indeterminacy result in Gong and Smith (2007), as well as concerns about the possible econometric estimation of our stochastic game, we now consider the nature of monotone comparative statics in a parameterized version of our optimization problem. For a partially ordered set $\Theta $ , with $\theta \in \Theta $ a typical element$,$ define the greatest and least time-consistent policies as $\overline{h} _{\theta }^{*}$ and $\underline{h}_{\theta }^{*},$ respectively.

We make the following assumption.

Assumption 3

Let us assume:

$u:A\times \Theta \rightarrow \mathbb {R}$, $a\rightarrow u(a,\theta )$ is continuous, increasing and supermodular on $A$ with $(\forall \theta \in \Theta )\,u(0,\theta )=0$. Also $u$ has increasing differences with $ (a,\theta )$ and $\theta \rightarrow u(a,\theta )$ is decreasing.
For any $(x,a)\in (S\times A)$ and $\theta \in \Theta $ let $Q(\cdot |x,a,\theta )=(1-\sum _{j=1}^{J}g_{j}(x,a,\theta ))\delta _{0}(\cdot )+\sum _{j=1}^{J}g_{j}(x,a,\theta )\lambda _{j}(\cdot |\theta )$.
$(\forall j=1,\ldots ,j)\,g_{j}:S\times \Theta \rightarrow [0,1] $ and $a\rightarrow g_{j}(x,a,\theta )$ is continuous, decreasing and supermodular with $(\forall \theta \in \Theta )\,g_{j}(0,a,\theta )=0$. Also $g_{j}$ has increasing differences with $(a,\theta )$ and $(a,x)$. Moreover $ (x,\theta ) \rightarrow g_{j}(x,a,\theta )$ is decreasing on $S\times \Theta $.
$\delta _{0}$ is a delta Dirac measure concentrated at point $0$, while $(\forall j=1,\ldots ,j)\,\lambda _{j}(\cdot |\theta )$ is a Borel transition distribution on $S$ for any $\theta \in \Theta $, where $\lambda _{j}(\cdot |\theta )$ is stochastically increasing with $\theta $.

With Assumption 3 in place, we can now prove our main result on monotone comparative statics for extremal time consistent equilibrium policies.

Theorem 6

(Monotone comparative statics) Let Assumption 3 be satisfied. Then, the mappings $\theta \rightarrow \overline{h}_{\theta }^{*}$ and $ \theta \rightarrow \underline{h}_{\theta }^{*}$ are both increasing on $ \Theta $.

Proof

By Theorem 2, for any $\theta \in \Theta ,$ there exist top and bottom time consistent policies $\overline{h}_{\theta }^{*} $ and $\underline{h}_{\theta }^{*}$. By Theorem 4, $ x\rightarrow \overline{h}_{\theta }^{*}(x)$ and $x\rightarrow \underline{h}_{\theta }^{*}(x)$ are increasing functions of $x\in S$. As a result, for each $\theta $ both operators $\overline{T}_{\theta },\underline{T}_{\theta }$ maps $\mathcal {V}$ into decreasing functions, hence, its fixed points are decreasing functions of $x\in $ $S$.

Now, for decreasing $V\in \mathcal {V}$, consider a function

$$\begin{aligned} G(a,x,\theta ,V)=u(a,\theta )+\beta \delta \sum _{j=1}^{J}g_{j}(x,a,\theta )\int \limits _{S}V(y)\lambda _{j}(dy|\theta ), \end{aligned}$$

and observe that $G$ is decreasing with $\theta $, and has increasing differences with $(a,\theta )$. Clearly, $C_{\theta }V(x)=\max _{a\in A(x)}G(x,a,$ $\theta ,V)$ is decreasing with $\theta $. Similarly, by Topkis (1978) theorem, $\underline{B}_{\theta }V(x)$ is increasing with $\theta $ (where $ B_{\theta }V(x)=\arg \max _{a\in A(x)}G(a,x,\theta ,V))$. Consequently, we have

$$\begin{aligned} \theta \rightarrow \overline{T}_{\theta }V(x)=\frac{1}{\beta }C_{\theta }V(x)-\frac{ 1-\beta }{\beta }u(\underline{B}_{\theta }V(x)), \end{aligned}$$

is decreasing on $\Theta $. From Theorem 1, we therefore conclude the greatest $w_{\theta }^{*}$ and the least fixed point $v_{\theta }^{*}$ are decreasing with $ \theta $. Consequently, $\theta \rightarrow G(x,a,\theta ,w_{\theta }^{*})$ is decreasing and $(a,\theta )\rightarrow G(x,a,\theta , w_{\theta }^{*})$ has increasing differences with $(a,\theta ) $. Then, by Topkis (1978) theorem, $\underline{h}_{\theta }^{*}$ is increasing with $\theta $. The reasoning is similar for $v_{\theta }^{*}$.

$\square $

Finally, our strong results on comparative statics cannot be obtained using Chade et al. (2008) APS type approaches (adapted to a stochastic game). That is, Chade et al. (2008) show conditions in a repeated game under which the whole equilibrium value set is monotone (under set inclusion) in parameter. They also show example where it is not the case. In our case, we are able to provide conditions under which the greatest SMNE value increase with the change of the parameter. That is, we can characterize the comparative statics of the optimal among the set of equilibrium time consistent policies (but we cannot characterize the comparative statics of the whole equilibrium set).

5 Applications and extensions

In this section we discuss two applications and one extension that show how our results can be used in the study of optimal (among the set of time consistent) consumption policies under credit constraints, habit formation and environmental protection. In Sect. 5.4 we also present two specific examples of transition probabilities that generate non-trivial invariant distributions for any SMNE policy. We begin with the standard consumption-savings problem.

5.1 Consumption-savings with $\beta -\delta $ preferences

We first apply our results to a version of the problem studied in Harris and Laibson (2001). Here, each self $t$ has $\beta -\delta $ preferences given as in the general model [e.g., Eq. (1)]. A typical self enters the period endowed with output $x \in S=[0,\bar{S}]$ where $\bar{S}$ is finite or $S=[0,\infty )$, and she decides on the current consumption $a\in [0,x]$. Investment equals $x-a$ . Then, the level of investment parameterizes the stochastic transition technology $Q$ that generates next period output. Preferences and technologies satisfy the assumptions of the previous section, i.e. $u$ is increasing, continuous and strictly concave. For stochastic transition structures we take a special case of $g_j$ to be $ g_j(x,a):=\tilde{g}_j(x-a)$ and assume $\tilde{g}_j$ is increasing, continuous and concave.

As Assumption 1 is satisfied, so Theorem 2 holds. As the constraint set is strong set order increasing and $g_j$ has increasing differences, we also have conclusions of Theorem 4 as long as $\lambda _j$ does not depend on $x$. Finally, if we additionally impose Assumption 2, then Theorem 5 holds. For this model we can also easily show that SMNE policy is Lipschitz continuous. In any case, time consistent (and optimal) policies exist and form a nonempty countable chain complete poset. Further, as Assumption 1 holds, we can also (Fig. 1) compute optimal (among time consistent) policies via Theorem 3 (i.e., pointwise approximate the extremal time consistent equilibrium including the greatest value equilibrium, which is the optimal SMNE).

We can also provide an explicit example of how simple it is to apply our methods to compute/approximate equilibrium time-consistent strategies to this (Fig. 2) example. To see this, consider the following example from macroeconomic applications of hyperbolic discounting problems.

Example 1

Consider a power utility, Cobb–Douglas class of examples. Let the state space $S$ for the economy be given by $S=[0,1]$, the period utility function be $u(a)=a^{\alpha }$, $g(x,a)=(x-a)^{\gamma }$, while $\lambda (y|x)$ has a cdf given by: $y^{2-x}$. Let $1>\alpha >0,1>\gamma >0$.

For this economy, we can compute optimal SMNE via standard approximation methods (e.g., piecewise-constant approximation) by iterating on a simple Picard procedure based on the operator $T$. The results of our calculations are presented in the following figures.^{Footnote 23} In the first figure, we show convergence to the SMNE iterating both from above and below.

In the second figure, we present a simple set of numerical comparative statics results. Sensitivity analysis exemplifies the monotone comparative statics result from Theorem 6.

5.2 Consumption-savings with habit formation and $\beta - \delta $ preferences

In our second example, we consider an extension of our model to models with endogenous preferences. One interesting example of such a preference structure is the rational addiction model/habit formation mode of Becker and Murphy (1988), so we now consider a consumption-savings problem with quasi-hyperbolic preference and habit formation.

Along these lines, let us modify the primitive data of the model to accommodate habit formation. Let $z=a_{-1}$ where $a_{-1}$ denotes last period consumption, and $z$ denotes the level of the habit.^{Footnote 24} Let $ u(a,z) $ denote the current utility from consumption of $a\in \mathbb {R}_{+}$ , where the past consumption $z=a_{-1}\in \mathbb {R}_{+}$ parameterizes the current period utility function. In a similar manner to the assumption on preferences in Assumption 1, assume current payoff $u$ is continuous and strictly concave in its first argument, supermodular, and increasing. Also, assume that the stochastic production technology is exactly as in the previous application.

Then, under our conditions, Theorem 2 holds. If we additionally impose Assumption 2 on the noise, then Theorem 5 holds. In any case, time consistent (and optimal) policies exist and form a nonempty countable chain complete poset. Finally, as Assumption 1 holds, we can also compute optimal among time consistent policies via Theorem 3.

5.3 Environmental policy

In our final example, we apply our results to an environmental growth model with a pollution externality. This application is useful, as it is a case of the model where we have multidimensional choice spaces and multidimensional state spaces. In particular, the economy we study is based upon that studied in Jones and Manuelli (1995) and Acemoglu et al. (2012), but it also shares features found in a number of other papers including Jones and Manuelli (2001) , Brock and Taylor (2005), Karp and Tsur (2011), and Lemoine and Traeger (2012) . In this economy, there will be two sectors producing two types of consumption goods each period. One type of good will be referred to as a “clean” good, while the other type of good will be “dirty”. We again will have a stochastic capital accumulation technology that produces the goods from investment in the corresponding sectors.

More specifically, this economy has consumers deriving utility from consumption of both clean and dirty goods, with their preferences exhibiting a hyperbolic discounting. For a typical self $t$, she enters the period in state $s=(x_{c},x_{d})$, where $x_{c},x_{d}\in [0,S]$ denotes the level of clean and dirty capital, respectively. Each period self $t$ has lifetime utility given by equation (1) where preferences satisfy Assumption 1, and are defined over actions $ a=(c_{c},c_{d})\in $ $A\subset [0,S]\times [0,S]=K$ (where $ (c_{c},c_{d})$ denote consumptions of clean and dirty consumption goods).

As for production, the transition probability between states $s\in S$ given some action $a=(c_{c},c_{d})\in $ $A$ is given by $ Q(A|f_{c}(k_{c},k_{d})-c_{c},f_{d}(k_{c},k_{d})-c_{d},s)$, where $f_{i}$ for $i=c,d$ is a production function of clean and dirty consumption/capital goods. That is, each self $t$ leaves self $t+1$ clean investment goods of the amount $i_{c}=$ $f_{c}(k_{c},k_{d})-c_{c}$, and dirty investment goods of the amount $i_{d}=$ $f_{d}(k_{c},k_{d})-c_{d}$. Then, we assume the stochastic production technology is given by:

$$\begin{aligned}&Q(\cdot |f_{c}(k_{c},k_{d})-c_{c},f_{d}(k_{c},k_{d})-c_{d},s),\\&\quad =\sum ^J_{j=1}g_j(f_{c}(k_{c},k_{d})-c_{c},f_{d}(k_{c},k_{d})-c_{d})\lambda _{j}(\cdot |s)+g_0(f_{c}(k_{c},k_{d}) \\&\qquad -c_{c},f_{d}(k_{c},k_{d})-c_{d})\delta _{0}(\cdot ), \end{aligned}$$

where $g$ is continuous, increasing, concave and supermodular in $a$, $f$ monotone with $f(0,0)=0$.

To relate our result to this economy, first if the primitive data of this model satisfies Assumption 1, then by Theorem , we have existence of a nonempty countably chain complete set of time consistent equilibrium (with the greatest value equilibrium optimal). If we additionally impose Assumption , Theorem 5 holds, and time consistent (and optimal) policies are continuous. By 3, we can pointwise approximate the extremal time consistent equilibrium (including the optimal SMNE). Finally, if the production sectors are also separable (i.e., $g_{j}(\cdot ,\cdot )$ is separable in both arguments) and $\lambda _j$ does not depend on $s$, then conclusions of Theorem 4 hold.

5.4 Possibilities for stationary Markov equilibrium

In this final subsection we consider the question of the structure and computation of Stationary Markov equilibrium (SME) associated with SMNE time consistent policies. For this we return to the general model studied in the main section.

In particular, we consider the question of when for SMNE, we can construct equilibrium invariant distributions generated by stochastic transition $ Q(\cdot |x,h^{*}(x))$ for any time consistent policy function $ h^{*}$ that is not trivial (i.e., do not converge to a degenerate distribution). That is, to prevent the SMNE generating a trivial SME, we need some additional assumptions. Two examples below provide some sufficient conditions for such results.

Along these lines, first let $\Delta (S)$ denote a family of probability measures on the state space $S$. Further, define a pair of operators, namely $G:\mathcal {V}\rightarrow \mathcal {V}$, as well as $G^{*}:\Delta (S)\rightarrow \Delta (S)$ as follows:

$$\begin{aligned} G_{h}(f)(x)=\int \limits _{S}f(y)Q(dy|x,h(x)), \end{aligned}$$

and

$$\begin{aligned} G_{h}^{*}(\tau )(A)=\int \limits _{S}Q(A|x,h(x))\tau (dx). \end{aligned}$$

Notice, that the fixed points of $G_{h^{*}}^{*}$ are equilibrium invariant distributions of our economy associated with SMNE given by $h(x)$.

We now give conditions where non-trivial invariant distributions exist.

Example 2

Assume the upper bound for the state space $\bar{S}<\infty $, and the SMNE $ h^{*}$ is continuous. Let $\tau $ be some probability distribution on $ [0,\bar{S}]$, and describe it as follows:

$$\begin{aligned} \tau (\cdot )=\xi \,\tau _{N}(\cdot )+(1-\xi )\delta _{0}(\cdot ), \end{aligned}$$

(8)

where $\tau _{N}$ is probability measure which has no atom at $0$, with $\xi \in [0,1]$. If $x_{t}$ has distribution $\tau ,$ then the distribution of next state $x_{t+1}$ is given by

$$\begin{aligned} \tilde{\tau }(\cdot ):=G_{h^{*}}^{*}(\tau )(\cdot )=\sum \limits _{j=1}^{J}\int \limits _{S}g_{j}^{h^{*}}(x)\lambda _{j}(\cdot |x)\tau (dx)+\int \limits _{S}g_{0}^{h^{*}}(x)\tau (dx)\delta _{0}(\cdot ), \end{aligned}$$

where $g_{j}^{h^{*}}(x):=g_{j}(x,h^{*}(x)),$ for $h^{*}$ an equilibrium time consistent policy. Let $S_{h^{*}}:=\left\{ x:g_{0}^{h^{*}}(x)=0\right\} $. Clearly, this is compact set. We now construct an invariant distribution associated with $h^{*}$. To do this, we impose some additional assumptions in the noise:

$S_{h^*}$ is nonempty and $0\notin S_{h^*}$,
for all $j,$ $\lambda _{j}$ has a Feller property, and we have
$$\begin{aligned} (\forall x\in S_{h^{*}})\quad \sum \limits _{j=1}^{J}g_{j}^{h^{*}}(x)\lambda _{j}(S_{h^{*}}|x)=1. \end{aligned}$$

Given these assumptions, say $\tau $ is invariant: i.e., $\tilde{\tau } =G_{h^{*}}^{*}(\tau )=\tau $. We now characterize the invariant distribution under our added assumptions. That is, under the above assumptions, from equation (8), we have

$$\begin{aligned} \tilde{\tau }(\cdot )&:= \xi \sum \limits _{j=1}^{J}\int \limits _{S}g_{j}^{h^{*}}(x)\lambda _{j}(\cdot |x)\tau _{N}(dx) \!+\!\xi \int \limits _{S}g_{0}^{h^{*}}(x)\tau _{N}(dx)\delta _{0}(\cdot )\!+\!(1\!-\!\xi )g_{0}^{h^{*}}(0)\delta _{0}(\cdot ), \\&= \xi \sum \limits _{j=1}^{J}\int \limits _{S}g_{j}^{h^{*}}(x)\lambda _{j}(\cdot |x)\tau _{N}(dx) +\left( \xi \int \limits _{S}g_{0}^{h^{*}}(x)\tau _{N}(dx)+(1-\xi )\right) \delta _{0}(\cdot ). \end{aligned}$$

As $\tau $ is invariant, $g_{0}(\cdot )\ge 0,$ by (8) $ \int _{S}g_{0}^{h^{*}}(x)\tau _{N}(dx)=0$ unless $\xi =0$. But, if $\xi =0$ , then $\tau $ is trivial. Hence, we may assume $\xi \ne 0$. Since $g_{0}^{h^{*}}\ge 0$, $\tau _{N}$ must have a support in the set $ S_{h^{*}}$. By (8), we have

$$\begin{aligned} \tau _{N}(\cdot )=\sum \limits _{j=1}^{J}\int \limits _{S_{h^{*}}}g_{j}^{h^{*}}(x)\lambda _{j}(\cdot |x)\tau _{N}(dx). \end{aligned}$$

where the last equality follows from the fact that on $S_{h^{*}}$ function $g_{0}^{h^{*}}\equiv 0$.

Now, consider a set of probability distributions with a support on $ S_{h^{*}}$ (say $\Delta (S_{h^{*}})$). Since $S_{h^{*}}$ is compact, by the Prohorov Theorem [e.g., see Sect. 5 in Billingsley (1999)], the space $\Delta (S_{h^{*}})$ is compact in the weak topology. Define the operator on $\Delta (S_{h^{*}})$ as follows:

$$\begin{aligned} \mathcal {T}(\mu ):=\sum \limits _{j=1}^{J}\int \limits _{S_{h^{*}}}g_{j}^{h^{*}}(x)\lambda _{j}(\cdot |x)\mu (dx). \end{aligned}$$

We hence have:

$$\begin{aligned} \sum \limits _{j=1}^{J}\int \limits _{S_{h^{*}}}g_{j}^{h^{*}}(x)\lambda _{j}(S_{h^{*}}|x)\tau _{N}(dx)=1. \end{aligned}$$

Hence, $\mathcal {T}:\Delta (S_{h^{*}})\rightarrow \Delta (S_{h^{*}})$ . We now show $\mathcal {T}$ has a fixed point. Notice, as $\Delta (S_{h^{*}})$ is nonempty, convex and compact, to show the existence of a fixed point, it suffices to show $\mathcal {T}$ is continuous in the weak topology. Let $\mu _{n}\rightarrow \mu $ weakly, and $f:S_{h^{*}}\rightarrow S_{h^{*}}$ be a continuous function. Then, we have

$$\begin{aligned} \int \limits _{S_{h^{*}}}f(x)\mathcal {T}(\mu _{n})(dx)=\sum \limits _{j=1}^{J}\int \limits _{S_{h^{*}}}g_{j}(x)\int \limits _{S_{h^{*}}}f(y)\lambda _{j}(dy|x)\mu _{n}(dx). \end{aligned}$$

By Feller properties of $\lambda _{j},$ we have $x\rightarrow \int \limits _{S_{h^{*}}}f(y)\lambda _{j}(dy|x)$ is continuous. Hence, we have

$$\begin{aligned} \int \limits _{S_{h^{*}}}f(y)\mathcal {T}(\mu _{n})(dy)\rightarrow \int \limits _{S_{h^{*}}}f(y)\mathcal {T}(\mu )(dy), \end{aligned}$$

which implies $\mathcal {T}$ is continuous in the weak topology. Then, by the Schauder–Tykhonov Theorem, $\mathcal {T}$ has a fixed point $\tau _{N}^{*} $, and the SME invariant distribution takes the form $\tau (\cdot )=\xi \tau _{N}^{*}(\cdot )+(1-\xi )\delta _{0}(\cdot )$.

In the next example, we construct another situation where a SMNE $h^{*}$ has a non-trivial SME invariant distribution. In this case, the SME is given as a convex combination of uniform distributions on a fixed interval, and the Dirac delta centered at zero.

Example 3

Let $J=1, \bar{S}=5, \lambda (\cdot ):=\mathcal {U}(2,5)$ (i.e. $\lambda _{j}$ does not depend on either $j$ or $x$, and has a uniform distribution on the interval $[2,5]$), $u(a)=\sqrt{a}$ and $g(x,a)=\min \left( \sqrt{x-a} ,1\right) $. Assume that $\beta $ and $\delta $ satisfy: $\delta +\delta \beta \frac{14}{9}\ge 1$. First, we show that for $x\in [2,5], h^{*}(x)=x-1$. Let $x>2$ be an initial state. Let $v_{0}$ be a payoff under strategy $h^{*}(x)=x-1$ for $x>2$. Then,

$$\begin{aligned} v_{0}(x)=\sqrt{x-1}+\delta \int \limits _{S}v_{0}(y)\lambda (dy). \end{aligned}$$

(9)

Since $\mathrm{{supp}}\,(\lambda )=[2,5]$ from (9), we have

$$\begin{aligned} \int \limits _{S}v_{0}(y)\lambda (dy)=\frac{14}{9}+\delta \int \limits _{S}v_{0}(y)\lambda (dy), \end{aligned}$$

hence

$$\begin{aligned} \int \limits _{S}v_{0}(y)\lambda (dy)=\frac{\frac{14}{9}}{1-\delta }. \end{aligned}$$

Since $h^{*}$ is MSNE, it must solve a maximization problem:

$$\begin{aligned} a\in [0,x]\rightarrow \sqrt{a}+\delta \beta \int \limits _{S}v_{0}(y)\lambda (dy)\min \left( \sqrt{x-a},1\right) , \end{aligned}$$

$$\begin{aligned} =\sqrt{a}+\delta \beta \frac{\frac{14}{9}}{1-\delta }\min (\sqrt{x-a} ,1):=w(a). \end{aligned}$$

Notice that

$$\begin{aligned} w(a)&= \sqrt{a}+\delta \beta \frac{\frac{14}{9}}{1-\delta },\quad \text { for }a\le x-1,\\ w(a)&= \sqrt{a}+\beta \delta \frac{\frac{14}{9}}{1-\delta }\sqrt{x-a},\text { else}. \end{aligned}$$

Further, right derivative of $w$ at $x-1$ is

$$\begin{aligned} \left. \frac{\partial w}{\partial a}\right| _{a=x-1}=\frac{1}{2\sqrt{x-1} }-\frac{1}{2}\beta \delta \frac{\frac{14}{9}}{1-\delta }\le \frac{1}{2} \left( 1-\beta \delta \frac{\frac{14}{9}}{1-\delta }\right) \le 0, \end{aligned}$$

whenever $\delta +\delta \beta \frac{14}{9}\ge 1$. This implies that $x-1$ is optimal policy at state $x\in [2,5]$. We therefore have that $\tau ^{*}():=\xi \mathcal {U}(2,5)+(1-\xi )\delta _{0}$ is SME invariant distribution under strategy $h^{*}$ for arbitrary $\xi \in [0,1]$ . Indeed, if $\tau _{t}=^{d}\tau $ then^{Footnote 25}:

$$\begin{aligned} \tau _{t+1}=^{d}\xi \lambda +(1-\xi )\delta _{0}=^{d}\tau _{t}. \end{aligned}$$

6 Related results and conclusion

It is important to remember that equilibrium non-existence and its multiplicity, related to the class of quasi-hyperbolic games we study, have constituted a significant challenge for applied economists who sought to study models where such dynamic consistency failures play a key role. They have been equally as challenging for researchers that seek to identify tractable numerical approaches to computing SMNE in these (and related) dynamic games [e.g., see the discussion in Krusell and Smith (2003) or Judd (2004)]. On the one hand, Krusell et al. (2002a) propose a generalized Euler equation method for a version of a hyperbolic discounting consumer and obtain explicit solution for logarithmic utility and Cobb–Douglas production examples. But this is an example. On the other hand, in Judd (2004), he uses generalized Euler equation approach to analyze smooth time-consistent policies and proposes a perturbation method for calculating them. The problem here is providing conditions under which at any point in the state space the generalized Euler equations represent a sufficient first order theory for agent’s value function in the equilibrium of the game.^{Footnote 26} Concentrating on non-smooth policies, Krusell and Smith (2003) define a step function equilibrium and show its existence and resulting indeterminacy of steady state capital levels. Further, in a deterministic setting general existence result of optimal policies under quasi-geometric discounting can be provided using techniques proposed by Goldman (1980) for finite horizon economies, by Harris (1985) for infinite horizon or by Feinberg and Shwartz (1995) in the generalized discounting setting.

Summarizing, from a technical point of view, tools used to show existence and characterize Markovian policies are wide and motivated by specific applications or problems under study. Still the general framework for studying (analytically and numerically) of (possibly nonsmooth) SMNE is missing. To circumvent some of these mentioned predicaments in a unified setup authors also added noise to the decision problems or relevant dynamic games. Specifically, in a (recursive) decision approach, by adding noise (making payoff discontinuities negligible) Caplin and Leahy (2006) prove existence of recursively optimal plan for a finite horizon decision problem and general utility functions. Similarly Bernheim and Ray (1986) show that by adding enough noise to the dynamic game (to smooth discontinuities away) existence of SMNE is guaranteed. Such stochastic game approach was later developed by Harris and Laibson (2001) who characterize the set of smooth SMNE by (generalized) first order conditions. Finally Balbus and Nowak (2008) show conditions for SMNE existence in an infinite horizon, hyperbolic discounting stochastic game with many players in each generation^{Footnote 27}.

It is worth mentioning that authors have also analyzed optimal but not necessarily time-consistent policies. For infinite horizon decision problems Kydland and Prescott (1980, henceforth KP) notice that the state space of an appropriately defined value function must incorporate some pseudo-state variables like Lagrange multipliers for the problem (of finding optimal policies) to be recursive. KP method is linked to the Abreu et al. (1990, henceforth APS) type arguments. Specifically by adding appropriate noise to the time-consistency game, characterization of all sequential equilibria using APS methods can be offered. This approach is undertaken by Bernheim et al. (1999). They analyze our problem using APS type arguments. Specifically they consider a set of (bounded) values for (sequential) subgame perfect equilibria in a Phelps and Pollak (1968) self-game and analyze all subsets of such values. Later they construct a monotone (under set inclusion) operator on this set and numerically analyze its largest fixed point. Using this method they show existence of a sequential time-consistent policy and use it to analyze self-control in the context of a low asset trap.

Finally, the literature on self-control is larger than on a specific problem of time-consistency, including papers specifying preferences over menus allowing for temptation. That is, instead of taking a preference change as a primitive of the model, economist introduce preferences over menus which are time-consistent (i.e. do not change over time) but still allow for modeling of self-control (by introducing so-called set-betweenness axiom).

Specifically Gul and Pesendorfer (2001) (GP, henceforth) and Dekel et al. (2001, 2009) (DLR, henceforth) consider a general model of preferences over menus (lotteries), from which choice is made at a later date and show that preferences over menus can be used to identify an agent’s subjective beliefs regarding her future tastes and behavior. They explicitly model a cost of tomorrow’s temptation as a difference between tomorrow optimal decision and a current tempted decision^{Footnote 28}. GP introduce also an overwhelming temptation preferences or Strotz representation where the future decisions are always made according to the tempted preferences, which is exactly the case in our quasi-hyperbolic discounting problem. For application of GP see Krusell et al. (2002b) who study asset pricing puzzle.

Actually, there are more links between (stochastic) game methods used in this paper and the preference approach discussed above. Here we refer the reader to the paper of Benabou and Pycia (2002) who represent GP preferences by outcomes of the two-period game of control between a “planner” and a “doer”. Also Fudenberg and Levine (2006) paper presents a stochastic game between planner and a sequence of myopic doers. Doers choose actions and planner their costs. They show the strategies and outcomes of their game are equivalent to solutions of a “planner” maximization problem under incentive compatible constraints. Fudenberg and Levine (2006) also discusses relation between their game and GP preference representation. Hence a natural question, on applicability of our constructive (stochastic game or stochastic decision problem) methods to the Fudenberg and Levine (2006) or Benabou and Pycia (2002) game and hence GP or DLR representation, arise. This becomes especially important in the view of Dekel and Lipman (2012) random Strotz representation ^{Footnote 29}, where decision from the menu is constrained to the actions incentive compatible with (tempted) doer, but where the preferences of the doer are drawn from some probability distribution.

Finally let us note that quasi-hyperbolic discounting problem is linked to a problem of altruism towards successive generations [see Saez-Marti and Weibull (2005) for formal results]. This link can be also seen through a technical perspective, where the stochastic games methods [see Balbus et al. (2014)] can be applied for both quasi-hyperbolic discounting and intergenerational altruism [see Balbus et al. (2013)] models.

All in all, we think that our approach offers an interesting alternative to all mentioned contributions by using stochastic games framework and directly attacking existence, computation and comparative statics questions.

Notes

For a small sampling of this work, see the papers of O’Donoghue and Rabin (1999a, b), Laibson (1997) and Angeletos et al. (2001), or Eisenhauer and Ventura (2006) for an empirical evidence supporting quasi-hyperbolic preferences hypothesis, among others.
Recent approaches have appealed to related recursive optimization methods, but incorporating duality theory with policies defined on enlarged state spaces including dual variables that are associated with dynamic incentive constraints. Such contributions can be found in the work on recursive saddle-point theory that was first discussed implicitly in Kydland and Prescott (1977, 1980), and later developed in Marcet and Marimon (2011), and Messner et al. (2012a, b).
For example, from a decision theoretic perspective, when a ”current” decision maker is indifferent between some alternatives in the future, that same decision maker can still strictly prefer such an alternative in advance and be willing to commit yet lacks access to a reasonable ”commitment device” that would impose discipline on the choices of her future ”selves” when tomorrow actually arrives.
Also, in the case of recursive saddle-point methods, two critical problems exist. The first concerns conditions for the existence of saddle point stable solutions to dual programs which do not involve convexity conditions which typically are not present in quasi-hyperbolic discounting problems [e.g., see the papers of Messner et al. (2012a, b) for discussion of how these methods work when saddle points exist]. Second, often unique saddle points are needed to guarantee resulting dual solutions to Lagrangian formulations are primal feasible [e.g., to rule out the counterexample discussed in Messner and Pavoni (2004)].
See also Fudenberg and Levine (2006) for a more recent discussion of related issues.
The work of Maskin and Tirole (2001) provides an extensive set of motivations for why one might be interested in concentrating on SMNE (as opposed to SPNE).
A countably chain complete poset is a partially ordered set that is closed under sup/inf of each monotone sequence.
For example, in Sorger (2004), he proposes settings under which any twice continuously differentiable function can be supported as a policy of a time consistent hyperbolic consumer. This result can be subsequently linked to Gong and Smith (2007), where they show that a hyperbolic discounting is not observationally equivalent to exponential discounting. That is, it is always possible to calibrate an exponential model so that it predicts the same level of consumption as a hyperbolic model. However, the two models have radically different comparative statics. Hence, our approach allows us to sort out the exact nature of this question, and provide theoretical monotone comparative statics results based on the equilibrium set of the stochastic game itself. Such a result can clarify empirical questions that are asked by applied researchers.
See Harris and Laibson (2001) for a discussion of their methods and stochastic games [e.g., Harris and Laibson (2001), footnote 13].
In a GEE approach, the question of existence and characterization is closely related to solving systems of generalized Euler inequalities, and it appeals to the calculus of bounded variation for characterizing the structure of SMNE. Actually, when GEE methods have been used in the literature, they have been used in problems with a single control, and a single state. [e.g., the consumption-savings problem in Harris and Laibson (2001)]. For multidimensional problems with Euler inequalities, the calculus of bounded variation seem very difficult to implement and interpret. Our methods, in contrast, rely exclusively on value-policy iteration algorithm. This fact simplifies the problem of solving multidimensional dynamic consistency a great deal.
Mathematical definitions and notations relating to partially ordered sets and lattices are found in the appendix.
When the context is clear, in this paper, when we use the term “monotone”, we mean monotone increasing.
The complication in this paper arises as our dynamically consistent plans need to be measurable, and under pointwise partial orders, spaces of measurable functions are only countably chain complete. That is, in spaces of measurable functions, for arbitrary subsets (resp., subchains), the $ \sup $ (resp, $\inf )$ of the set need not be measurable unless the set is countable or compact in the uniform topology. So we need to prove a version of the Tarski/Markowsky theorem, but for “sigma” complete lattices (resp, countable chain complete partially ordered sets).
Veinott’s fixed point comparative statics result is found in Veinott (1992) (Theorem 14, Chap. 4). See also Topkis (1998), Theorem 2.5.2.
In this definition, recall for a poset $A\subset X,$ $\sup A$ (resp, $\inf A) $ is denoted as $\vee A$ (resp, $\wedge A).$
These concepts are also directly related to the idea of an sequential order continuous operators developed by Vulikh (1967, p.27).
Notice here the state space is not required to be bounded.
For example, even in stochastic dynamic programming problems with unbounded state spaces and returns unbounded below, let alone a stochastic game, it is not clear how to identify useful conditions to guarantee the existence of upper (resp., lower) pointwise bounds for candidate value functions needed to pose the existence problem for a unique value function. For a discussion of the complications of finding these upper and lower bounds for stochastic dynamic programs, see Marinacci and Montrucchio (2010), Matkowski and Nowak (2011), and Vailakis and Le Van (2012). See also related discussion in Martins-da Rocha and Vailakis (2010). One resolution of this problem for very particular examples of our model (e.g., asset pricing and consumption-savings versions) if players have unbounded returns below is to introduce “cheap points” into endowment processes so that lower bounded for utility can be constructed without strong joint restrictions on the growth rate of utility and shocks as consumption goes to zero. This is approach taken in Duffie et al. (1994). In our stochastic game, these upper and lower bounds are needed to find a suitable countable chain complete set of values functions in which to pose our SMNE existence problem. In the end, this seems to be a purely technical problem for our stochastic game, and we abstract from this case, and just assume positive bounded returns.
In an important recent paper by Chade et al. (2008), the authors study a repeated game using an APS type procedure. We remark that the presence of state variables in our model would greatly complicate the Footnote 19 continued
problem of extending their APS procedures to our model (especially for the case of multidimensional state space as indicated by the discussion above per measurability and weak-star closure of the Nash equilibrium set). Therefore, it is not clear how to extend either their existence or equilibrium comparative statics results to the class of models in this paper.
For each set $A$, $\mathbf {1}_A(\cdot )$ is said to be indicator of $A$.
The question of uniqueness of SMNE in this game is a critical one for numerical work, and is an important open question. We leave that problem for future work.
We are indebted to one of the referees of this paper for bring this particular issue to our attention.
MATLAB program implementing our numerical procedure is available from authors upon request.
For simplicity, we have assumed a habit that depreciates after 1 period. We can easily incorporate more “durable” habits into our framework.
$X=^{d}Y=^{d}\tau $ means that random variable $X$ has the same distribution as $Y$, and it is $\tau $.
We should elaborate a bit on this point. In a generalized Euler equation method, on an open set of any point in the state space, we can always construct an local linearization (in a space of functions) that might be valid as a linear approximation to the function that satisfies the functional equation near that point; the problem is showing the Euler equation is necessary and sufficient on that open set. For the later claim to be true, you must know the value function in the equilibrium of the game is concave. In our method, such a local expansion will be valid; but then, we do not need the generalized Euler equation to compute the models equilibrium.
See also Alj and Haurie (1983) or Nowak (2010) for related results.
See also Gul and Pesendorfer (2004) for recursive temptation driven preferences in a dynamic setting. In their Sect. 6 they use such preferences to analyze a dynamic model of temptation driven preference in a stochastic economy.
Dekel and Lipman (2012) also show that random Strotz preferences can represent GP preferences.

References

Abreu D, Pearce D, Stacchetti E (1990) Toward a theory of discounted repeated games with imperfect monitoring. Econometrica 58(5):1041–1063
Article Google Scholar
Acemoglu D, Aghion P, Bursztyn L, Hemous D (2012) The environment and directed technical change. Am Econ Rev 102(1):131–166
Article Google Scholar
Alj A, Haurie A (1983) Dynamic equilibria in multigenerational stochastic games. IEEE Trans Autom Control 28:193–203
Article Google Scholar
Amir R (1996) Strategic intergenerational bequests with stochastic convex production. Econ Theory 8: 367–376
Article Google Scholar
Amir R (2002) Complementarity and diagonal dominance in discounted stochastic games. Ann Oper Res 114:39–56
Article Google Scholar
Angeletos GM, Laibson D, Repetto A, Tobacman J, Weinberg S (2001) The hyberbolic consumption model: calibration, simulation, and empirical evaluation. J Econ Perspect 15(3):47–68
Article Google Scholar
Balbus Ł, Nowak AS (2008) Existence of perfect equilibria in a class of multigenerational stochastic games of capital accumulation. Automatica 44:1471–1479
Article Google Scholar
Balbus Ł, Reffett K, Woźny Ł (2013) A constructive geometrical approach to the uniqueness of Markov perfect equilibrium in stochastic games of intergenerational altruism. J Econ Dyn Control 37(5): 1019–1039
Google Scholar
Balbus Ł, Reffett K, Woźny Ł (2014) A constructive study of Markov equilibria in stochastic games with strategic complementarities. J Econ Theory 150:815–840
Google Scholar
Becker GS, Murphy KM (1988) A theory of rational addiction. J Polit Econ 96(4):675–700
Article Google Scholar
Benabou R, Pycia M (2002) Dynamic inconsistency and self-control: a planner–doer interpretation. Econ Lett 77(3):419–424
Article Google Scholar
Bernheim BD, Ray D (1986) On the existence of Markov-consistent plans under production uncertainty. Rev Econ Stud 53(5):877–882
Google Scholar
Bernheim D, Ray D, Yeltekin S (1999) Self-control, savings, and the low-asset trap. Stanford University
Billingsley P (1999) Convergence of Probability Measures. Wiley, New York
Book Google Scholar
Brock WA, Taylor MS (2005) Economic growth and the environment: a review of theory and empirics, Chap. 28. In: Durlauf S, Aghion P (eds) Handbook of economic growth. Elsevier, Amsterdam, pp 1749–1821
Google Scholar
Caplin A, Leahy J (2006) The recursive approach to time inconsistency. J Econ Theory 131(1):134–156
Article Google Scholar
Chade H, Prokopovych P, Smith L (2008) Repeated games with present-biased preferences. J Econ Theory 139(1):157–175
Article Google Scholar
Davis A (1955) A characterization of complete lattices. Pac J Math 5:311–319
Article Google Scholar
Dekel E, Lipman BL (2012) Costly self-control and random self-indulgence. Econometrica 80(3): 1271–1302
Article Google Scholar
Dekel E, Lipman BL, Rustichini A (2001) Representing preferences with a unique subjective state space. Econometrica 69(4):891–934
Article Google Scholar
Dekel E, Lipman BL, Rustichini A (2009) Temptation-driven preferences. Rev Econ Stud 76(3):937–971
Article Google Scholar
Duffie D, Geanakoplos J, Mas-Colell A, McLennan A (1994) Stationary Markov equilibria. Econometrica 62(4):745–781
Article Google Scholar
Dugundji J, Granas A (1982) Fixed point theory. Polish Scientific, Warsaw
Google Scholar
Eisenhauer JG, Ventura L (2006) The prevalence of hyperbolic discounting: some European evidence. Appl Econ 38(11):1223–1234
Google Scholar
Feinberg E, Shwartz A (1995) Constrained Markov decision models with weighted discounted rewards. Math Oper Res 20:302–320
Google Scholar
Fudenberg D, Levine DK (2006) A dual-self model of impulse control. Am Econ Rev 96(5):1449–1476
Article Google Scholar
Goldman SM (1980) Consistent plans. Rev Econ Stud 47(3):533–537
Article Google Scholar
Gong L, Smith W (2007) Consumption and risk with hyperbolic discounting. Econ Lett 96(2):153–160
Article Google Scholar
Gul F, Pesendorfer W (2001) Temptation and self-control. Econometrica 69(6):1403–1435
Article Google Scholar
Gul F, Pesendorfer W (2004) Self-control and the theory of consumption. Econometrica 72(1):119–158
Article Google Scholar
Harris C (1985) Existence and characterization of perfect equilibrium in games of perfect information. Econometrica 53(3):613–628
Article Google Scholar
Harris C, Laibson D (2001) Dynamic choices of hyperbolic consumers. Econometrica 69(4):935–957
Article Google Scholar
Jones LE, Manuelli RE (1995) A positive model of growth and pollution controls. Technical report, NBER working papers 5205
Jones LE, Manuelli RE (2001) Endogenous policy choice: the case of pollution and growth. Rev Econ Dyn 4(2):369–405
Article Google Scholar
Judd KL (2004) Existence, uniqueness, and computational theory for time consistent equilibria: a hyperbolic discounting example. Hoover Institution, Stanford, CA.
Kall P (1986) Approximation to optimization problems: an elementary review. Math Oper Res 11(1):9–18
Article Google Scholar
Karp L, Tsur Y (2011) Time perspective and climate change policy. J Environ Econ Manag 62(1):1–14
Article Google Scholar
Krusell P, Smith A (2003) Consumption-savings decisions with quasi-geometric discounting. Econometrica 71(1):365–375
Article Google Scholar
Krusell P, Kuruscu B, Smith AJ (2002a) Equilibrium welfare and government policy with quasi-geometric discounting. J Econ Theory 105(1):42–72
Article Google Scholar
Krusell P, Kuruscu B, Smith AJ (2002b) Time orientation and asset prices. J Monet Econ 49(1):107–135
Article Google Scholar
Kydland F, Prescott E (1977) Rules rather than discretion: the inconsistency of optimal plans. J Polit Econ 85(3):473–491
Article Google Scholar
Kydland F, Prescott E (1980) Dynamic optimal taxation, rational expectations and optimal control. J Econ Dyn Control 2(1):79–91
Article Google Scholar
Laibson D (1997) Golden eggs and hyperbolic discounting. Q J Econ 112(2):443–477
Article Google Scholar
Leininger W (1986) The existence of perfect equilibria in model of growth with altruism between generations. Rev Econ Stud 53(3):349–368
Article Google Scholar
Lemoine D, Traeger C (2012) Tipping points and ambiguity in the economics of climate change. Technical report, NBER working paper 18230.
Marcet A, Marimon R (2011) Recursive contracts. Technical report, Barcelona GSE working paper series working paper no. 552.
Marinacci M, Montrucchio L (2010) Unique solutions for stochastic recursive utilities. J Econ Theory 145(5):1776–1804
Article Google Scholar
Markowsky G (1976) Chain-complete posets and directed sets with applications. Algebra Univ 6:53–68
Article Google Scholar
Maskin E, Tirole J (2001) Markov perfect equilibrium: I. Observable actions. J Econ Theory 100(2):191–219
Article Google Scholar
Matkowski J, Nowak A (2011) On discounted dynamic programming with unbounded returns. Econ Theory 46(3):455–474
Article Google Scholar
Messner M, Pavoni N (2004) On the recursive saddlepoint methods. Technical report, MS
Messner M, Pavoni N, Sleet C (2012) Recursive methods for incentive problems. Rev Econ Stud 15(4): 501–525
Google Scholar
Messner M, Pavoni N, Sleet C (2013) The dual approach to recursive optimization: theory and examples. Technical report
Nowak AS (2003) On a new class of nonzero-sum discounted stochastic games having stationary Nash equilibrium points. Int J Game Theory 32:121–132
Google Scholar
Nowak AS (2006) On perfect equilibria in stochastic models of growth with intergenerational altruism. Econ Theory 28:73–83
Article Google Scholar
Nowak AS (2010) Existence of perfect equilibria in a class of multigenerational stochastic games of capital accumulation. J Optim Theory Appl 144:88–106
Article Google Scholar
O’Donoghue T, Rabin M (1999a) Doing it now or later. Am Econ Rev 89(1):103–124
Article Google Scholar
O’Donoghue T, Rabin M (1999b) Incentives for procrastinators. Q J Econ 114(3):769–816
Article Google Scholar
Peleg B, Yaari ME (1973) On the existence of a consistent course of action when tastes are changing. Rev Econ Stud 40(3):391–401
Article Google Scholar
Phelps E, Pollak R (1968) On second best national savings and game equilibrium growth. Rev Econ Stud 35:195–199
Article Google Scholar
Martins-da Rocha VF, Vailakis Y (2010) Existence and uniqueness of a fixed point for local contractions. Econometrica 78(3):1127–1141
Article Google Scholar
Saez-Marti M, Weibull J (2005) Discounting and altruism towards future decision-makers. J Econ Theory 122:254–266
Article Google Scholar
Sorger G (2004) Consistent planning under quasi-geometric discounting. J Econ Theory 118(1):118–129
Article Google Scholar
Strotz RH (1956) Myopia and inconsistency in dynamic utility maximization. Rev Econ Stud 23(3):165–180
Article Google Scholar
Tarski A (1955) A lattice-theoretical fixpoint theorem and its applications. Pac J Math 5:285–309
Article Google Scholar
Topkis DM (1978) Minimazing a submodular function on a lattice. Oper Res 26(2):305–321
Article Google Scholar
Topkis DM (1998) Supermodularity and complementarity. Frontiers of economic research. Princeton University Press, Princeton
Google Scholar
Vailakis Y, Van Le C (2012) Monotone concave and convex operators: applications to stochastic dynamic programming with unbounded returns.
Veinott (1992) Lattice programming: qualitative optimization and equilibria. MS Standford
Vulikh BZ (1967) Introduction to the theory of partially ordered spaces. Wolters–Noordhoff Press, Groningen
Google Scholar

Download references

Acknowledgments

We thank Robert Becker, Madhav Chandrasekher, Manjira Datta, Paweł Dziewulski, Amanda Friedenberg, Ed Green, Seppo Heikkilä, Len Mirman, Peter Streufert, and especially Ed Prescott, as well participants of our SAET 2011 session for helpful conversations on the topics of this paper. We especially thank two anonymous referees and the associate editor for their excellent comments on an earlier draft of this paper. Balbus and Woźny reserach has been supported by NCN Grant No. UMO-2012/07/D/HS4/01393. All the usual caveats apply.

Author information

Authors and Affiliations

Faculty of Mathematics, Computer Sciences and Econometrics, University of Zielona Góra, Zielona Gora, Poland
Łukasz Balbus
Department of Economics, Arizona State University, Tempe, AZ, USA
Kevin Reffett
Department of Quantitative Economics, Warsaw School of Economics, al. Niepodlełośsci 162, 02-554 , Warsaw, Poland
Łukasz Woźny

Authors

Łukasz Balbus
View author publications
You can also search for this author in PubMed Google Scholar
Kevin Reffett
View author publications
You can also search for this author in PubMed Google Scholar
Łukasz Woźny
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Łukasz Woźny.

Appendix: Proof of technical result

We begin with some useful definitions not provided earlier in the paper, but used in proofs of the proposition. An arbitrary set ($X,\ge )$ is partially ordered set (or poset) if $X$ is equipped with an order relation $\ge :X\times X\rightarrow X$ that is reflexive, antisymmetric and transitive. If every element of a poset $X$ is comparable, then $X$ is chain. If $X$ is a chain and countable, $X$ is a countable chain. An upper (respectively, lower) bound for a set $ B\subset X$ is an element $x^{u}($respectively, $x^{l})\in X$ such that for any other element $x\in B,$ $x\le x^{u}$ (respectively, $x^{l}\le x)$. If there is a point $x^{u}$ (respectively, x$^{l})$ such that $x^{u}$ is the least element in the subset of upper bounds of $B\subset X$ (respectively, the greatest element in the subset of lower bounds of $B\subset X$), we say $ x^{u}$ (respectively, $x^{l})$ is the supremum (respectively, infimum) of $B.$ Clearly if the supremum or infimum of a set $X$ exists, it must be unique.

We say a set $L\subset X$ is a lattice if for any two elements, say $x$ and $x^{\prime }$ in $L, L$ is closed under the operation of infimum (denoted by $x\wedge x^{\prime }),$ and supremum (denoted $x\vee x^{\prime }).$ The former is referred to as “the meet” of the two points, while the latter is “the join” . A subset $L_{1}$ of $L$ is a sublattice of $L$ if it contains the sup and the inf (with respect to $L$) of any pair of points in $L_{1}.$ A lattice is complete if any $L_{1}\subset L$, the least upper bound (denoted $\vee L_{1})$ and the greatest lower bound (denoted $\wedge L_{1})$ are both in $L$ . If this completeness property only holds for countable subsets $L_{c}$, the lattice is $\sigma -$ complete. In a poset $X$, if every subchain $C\subset $ $X$ is complete, then $X\ $is referred to as a chain complete poset (or equivalent, a complete partially ordered set or CPO). A set $C$ is countable if it is either finite or there is a bijection from the natural numbers onto $C.$ If every countable chain $C\subset X$ is complete, then $X$ is referred to as a countably chain complete poset.

Let $(X_{1},\ge _{X_{1}})$ and $(X_{2},\ge _{X_{2}})$ be posets. A function (or, equivalently, operator) $f:X_{1}\rightarrow X_{2}$ is monotone (or order-preserving or isotone) if $f(x^{\prime })\ge _{X_{2}}f(x),$ when $x^{\prime }\ge _{X_{1}}x,$ for $x,x^{\prime }\in X_{1}$. A sequence $\{h_{n}\}$ in $H$ is order convergent if there exists two monotonic sequences of elements from $H$, one decreasing $ \{h_{\downarrow n}\},$ and one increasing $\{h_{\uparrow n}\} $, such that $ h=\inf h_{\downarrow n}=\sup h_{\uparrow n}$ and $h_{\uparrow n}\le h_{n}\le h_{\downarrow n}.$ A necessary and sufficient condition for an increasing sequence $h_{n}\rightarrow h$ to be order convergent is $h=\sup h_{n}.$

Let $(X,\ge )$ be a countably chain complete poset, i.e. where each increasing sequence has supremum, and each decreasing sequence has infimum. Assume that $X$ has the greatest element $\overline{\theta }$ and the least element $\underline{\theta }$. For a monotone sequence $\{x_{n}\}_{n=0}^{ \infty }$, let

$$\begin{aligned} \bigvee x_{n}:=\sup \limits _{n\in \mathbf {N}}x_{n}, \end{aligned}$$

and

$$\begin{aligned} \bigwedge x_{n}:=\inf \limits _{n\in \mathbf {N}}x_{n}. \end{aligned}$$

Denote by $F^{n}(x)$ the $n$-th orbit (or iteration) of $x$ under the function $F$, i.e. $F^{n}(x)=F\circ F\circ \ldots \circ F(x)$. We have the following two theorems, with the first pertaining to fixed point existence, and the second pertaining to fixed point comparative statics. The auxiliary fixed point theorem found in Sect. 2 is a corollary of both theorems.

Theorem 7

Let $X$ be a countably chain complete poset with the greatest element $\overline{\theta }$ and the least element $\underline{\theta }$, and $F:X\rightarrow X$ an increasing function, that is monotonically sup-inf-preserving i.e.

if $x_n$ is increasing, then $F\left( \bigvee x_n\right) =\bigvee F(x_n)$ and
if $x_n$ is decreasing, then $F\left( \bigwedge x_n\right) =\bigwedge F(x_n)$.

Then:

(i)
$\overline{\Phi }:=\bigwedge F^{n}(\overline{\theta })$ is the greatest fixed point and $\underline{\Phi }:=\bigvee F^{n}(\underline{\theta } ) $ is the least fixed point.
(ii)
the set of fixed points is a nonempty countably chain complete poset with
$$\begin{aligned} \overline{\Phi }=\bigvee \left\{ x:F(x)\ge x\right\} , \end{aligned}$$
(10)
and
$$\begin{aligned} \underline{\Phi }=\bigwedge \left\{ x:F(x)\le x\right\} , \end{aligned}$$
(11)

Proof

Proof of (i): Clearly $F(\overline{\theta })\le \overline{\theta }$. If for some $n,F^{n}(\overline{\theta })\ge F^{n+1}(\overline{\theta })$, then $F^{n+1}(\overline{\theta })=F(F^{n}(\overline{\theta }))\ge F(F^{n+1}( \overline{\theta }))=F^{n+2}(\overline{\theta })$. Hence, $F^{n}(\overline{ \theta })$ is decreasing, and $\overline{\Phi }$ is well defined. Since $F$ is monotonically inf-preserving, we have

$$\begin{aligned} F(\overline{\Phi })&= F\left( \bigwedge F^{n}(\theta )\right) , \\&= \bigwedge F^{n+1}(\theta ), \\&= \overline{\Phi }. \end{aligned}$$

Therefore, $\overline{\Phi }$ is fixed point of $F$. Let us take arbitrary fixed point $e=F(e)$. Clearly, $e\le \overline{\theta }$, and $e=F(e)\le F( \overline{\theta })$. If $e=\le F^{n}(\overline{\theta }),$ then $ e=F(e)=\le F^{n+1}(\overline{\theta })$. Therefore, $e\le F^{n}(\overline{ \theta })$ for all $n$, which implies $e\le \overline{\Phi }$. Similarly, we prove that $\underline{\Phi }$ is well defined and it is the least fixed point of $F$.

Proof of (ii): Let $e_{n}$ be an increasing set of fixed points. Let $\bar{e} =\bigvee e_{n}$. Then,

$$\begin{aligned} F(\bar{e})&= F\left( \bigvee e_{n}\right) , \\&= \bigvee F(e_{n}), \\&= \bigvee e_{n}=\bar{e}. \end{aligned}$$

Similarly, we prove the thesis for decreasing sequences. Now, we finally prove equality (10). Let $x$ be arbitrary point such that $ x\le F(x)$. Clearly $x\le \overline{\theta }$. Assume $x\le F^{n}( \overline{\theta })$. Then, $x\le F(x)\le F(F^{n}(\overline{\theta } ))=F^{n+1}(\overline{\theta })$. Hence, $x\le \overline{\Phi }$. Since $ \overline{\Phi }\in \{x:F(x)\ge x\},$ equality (10) is proven. We prove (11) analogously. $\square $

We finally prove a theorem (and a corollary) on increasing selections for parameterized problems that we use in the paper to obtain our results on equilibrium monotone comparatives.

Theorem 8

Let $X$ be a countably chain complete poset with the greatest and least elements and $T$ a poset. If $F:X\times T\rightarrow X$ is increasing, and monotonically-sup-inf preserving on $X$ then $ t\rightarrow \overline{\Phi }(t)$ and $t\rightarrow \underline{\Phi }(t)$ are isotone.

Proof

Let $t_{1}\le t_{2}$. From Theorem 7 we know that $m_{i}:= \overline{\Phi }(t_{i})=\vee \Gamma _{i}:=\vee \{x:F(x,t_{i})\le x\}$. Note that by isotonicity of $F(x,\cdot )$ we obtain $m_{1}=F(m_{1},t_{1})\le F(m_{1},t_{2})$. Hence $m_{1}\in \Gamma _{2}$. Since $m_{2}$ is the greatest element of $\Gamma _{2}$, hence $m_{1}\le m_{2}$. $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Balbus, Ł., Reffett, K. & Woźny, Ł. Time consistent Markov policies in dynamic economies with quasi-hyperbolic consumers. Int J Game Theory 44, 83–112 (2015). https://doi.org/10.1007/s00182-014-0420-3

Download citation

Accepted: 20 March 2014
Published: 04 April 2014
Issue Date: February 2015
DOI: https://doi.org/10.1007/s00182-014-0420-3

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Time consistent Markov policies in dynamic economies with quasi-hyperbolic consumers

Abstract

Similar content being viewed by others

Dynamic preferences and the behavioral case against sin taxes

Existence of Walrasian equilibria with discontinuous, non-ordered, interdependent and price-dependent preferences, without free disposal, and without compact consumption sets

Closing the invisible hand: a rehabilitation of tâtonnement dynamics

1 Introduction

2 Preliminary result

Definition 1

Theorem 1

3 Benchmark model

Assumption 1

4 Main results

4.1 Existence

Lemma 1

Proof

Theorem 2

Proof

4.2 Computation

Theorem 3

Proof

Theorem 4

Proof

Assumption 2

Theorem 5

Proof

4.3 Monotone comparative statics

Assumption 3

Theorem 6

Proof

5 Applications and extensions

5.1 Consumption-savings with \(\beta -\delta \) preferences

Example 1

5.2 Consumption-savings with habit formation and \(\beta - \delta \) preferences

5.3 Environmental policy

5.4 Possibilities for stationary Markov equilibrium

Example 2

Example 3

6 Related results and conclusion

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendix: Proof of technical result

Appendix: Proof of technical result

Theorem 7

Proof

Theorem 8

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation