1 Introduction

The axiomatic foundation of intertemporal decisions is a fundamental question in economics and generates considerable research interest. Despite the fact that a number of possible ways of discounting have appeared in the literature so far, two types have been predominantly used: exponential discounting, first introduced by Samuelson (1937), and quasi-hyperbolic discounting (Phelps and Pollak 1968; Laibson 1997). The important question to be answered is which axioms allow us to say that the preferences of a decision-maker can be represented using the discounted utility model with exponential or quasi-hyperbolic discount functions? Existing axiom systems for intertemporal decisions address this question. These systems can be roughly divided into two main groups: those with preferences over deterministic consumption streams and those with preferences over stochastic consumption streams.

The first group has been the leading approach in the area, both for exponential and quasi-hyperbolic functions. In this framework, a consumption set is endowed with topological structure, and Debreu’s (1960) theorem on additive representation is a key mathematical tool.

Koopmans’ result for exponential discounting with deterministic consumption streams (Koopmans 1960, 1972; Koopmans et al. 1964) remains the best known. A revised formulation of Koopmans’ result was proposed by Bleichrodt et al. (2008), using alternative conditions on preferences. A similar approach was also suggested by Harvey (1986). The axiomatic foundation of exponential discounting for the special case of a single dated outcome was presented by Fishburn and Rubinstein (1982).

In a non-stochastic framework with a discrete time space, quasi-hyperbolic discounting has been axiomatized by Olea and Strzalecki (2014). Building on Bleichrodt et al. (2008), they provide three alternative sets of axioms. Olea and Strzalecki’s axiomatization will be discussed in more detail in Sect. 6.

All the axiomatization systems mentioned above are formulated for infinite consumption streams. The finite horizon case has rarely been discussed. For exponential discounting, however, it can be found in Fishburn (1970).

The second group of axiomatic systems considers stochastic consumption streams. To obtain an additive form, the fundamental representation theorem of von Neumann and Morgenstern (vNM) (1947) is used. The application of this approach to exponential discounting was given by Epstein (1983). A consumption stream is considered to be an outcome of a lottery. The axiomatization of quasi-hyperbolic discounting by Hayashi (2003) builds on Epstein’s (1983) axiom system. Both Hayashi and Epstein axiomatize preferences over infinite stochastic consumption streams.

In this paper, we work with preferences over streams of consumption lotteries, i.e., a setting in which there is a lottery in each period of time. In other words, we restrict Epstein and Hayashi’s framework to product measures. This framework allows us to apply Anscombe and Aumann’s (1963) result from Subjective Expected Utility Theory. The main advantage of this method is that it gives an opportunity to construct the discussed functional forms of discounting in a simpler way. Importantly, the present work establishes a unified treatment of exponential and quasi-hyperbolic discounting in both finite and infinite settings. With Fishburn (1982) and Harvey (1986) as the key sources of technical inspiration, our approach requires relatively simple axioms and facilitates proofs that are relatively straightforward.

2 Preliminaries

Assume that the objectives of a decision-maker can be expressed by a preference order \(\succcurlyeq \) on the set of alternatives \(X^n\), where n may be \(\infty \). Think of these alternatives as dated streams, for time periods \(t \in \{1, 2, \ldots , n \}\). We say that a utility function \(U :X^n \rightarrow \mathbb {R}\) represents this preference order if, for all \(\mathbf{x}, \mathbf{y} \in X^n\), \(\mathbf{x} \succcurlyeq \mathbf{y}\) if and only if \(U(\mathbf{x}) \ge U(\mathbf{y})\).

We assume that X is a mixture set. That is, for every \(x, y \in X\) and every \(\lambda \in [0, 1]\), there exists \(x \lambda y \in X\) satisfying:

  • \(x1y=x\),

  • \(x \lambda y = y (1-\lambda ) x\),

  • \((x \mu y) \lambda y = x (\lambda \mu ) y\).

Since X is a mixture set, the set \(X^n\) is easily seen to be a mixture set under the following mixture operation: \(\mathbf{x} \lambda \mathbf{y} = (x_1 \lambda y_1, \ldots , x_n \lambda y_n)\), where \(\mathbf{x}, \mathbf{y} \in X^n\) and \(\lambda \in [0, 1]\).

The utility function \(u :X \rightarrow \mathbb {R}\) is called mixture linear if for every \(x, y \in X\) we have \(u(x \lambda y)=\lambda u(x) + (1-\lambda ) u(y)\) for every \(\lambda \in [0, 1]\).
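
As a concrete illustration (not part of the formal development), finite-support lotteries form a mixture set under probability mixing, and expected utility is mixture linear on them. The sketch below assumes that lotteries are stored as Python dictionaries mapping outcomes to probabilities; the names `mix` and `expected_utility` and the outcome utility `v` are hypothetical choices made only for this example.

```python
from math import isclose

def mix(x, lam, y):
    """Return the mixture 'x lam y': weight lam on lottery x and 1 - lam on y."""
    outcomes = set(x) | set(y)
    return {o: lam * x.get(o, 0.0) + (1 - lam) * y.get(o, 0.0) for o in outcomes}

def expected_utility(lottery, v):
    """Expected value of the outcome utility v under the lottery."""
    return sum(p * v(o) for o, p in lottery.items())

v = lambda c: c ** 0.5      # illustrative utility over outcomes (assumption)
x = {0: 0.5, 100: 0.5}      # 50/50 lottery over outcomes 0 and 100
y = {25: 1.0}               # degenerate lottery on outcome 25
lam = 0.3

# Mixture linearity: u(x lam y) = lam * u(x) + (1 - lam) * u(y)
lhs = expected_utility(mix(x, lam, y), v)
rhs = lam * expected_utility(x, v) + (1 - lam) * expected_utility(y, v)
```

Here `lhs` and `rhs` agree up to floating-point error, which is the mixture linearity property for this particular u.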

The binary relation \(\succcurlyeq \) on \(X^n\) induces a binary relation (also denoted \(\succcurlyeq \)) on X in the usual way: for any \(x, y \in X\) the preference \(x \succcurlyeq y\) holds if and only if \((x, x,\ldots , x)\succcurlyeq (y, y,\ldots , y)\).

The function U is called a discounted utility function if

$$\begin{aligned} U(\mathbf{x})=\sum _{t=1}^n D(t)u(x_t), \end{aligned}$$

for some non-constant \(u :X \rightarrow \mathbb {R}\) and some \(D:\mathbb {N}\rightarrow \mathbb {R}\) with \(D(1)=1\). The function D is called the discount function. If u is mixture linear (and non-constant), then the function U is called a discounted expected utility function.

There are two types of discount functions which are commonly used in modeling of time preferences:

  • Exponential discounting: \(D(t)=\delta ^{t-1}\), where \(\delta \in (0,1)\) is called a discount factor.

  • Quasi-hyperbolic discounting:

    $$\begin{aligned} D(t) = \left\{ \begin{array}{ll} \ 1 &{} \text {if} \ t=1,\\ \ \beta \delta ^{t-1}&{}\text {if} \ t\ge 2. \end{array} \right. \end{aligned}$$

    for some \(\delta \in (0, 1)\) and \(\beta \in (0, 1]\).

The important characteristic of quasi-hyperbolic discounting is that it exhibits present bias. Present bias means that delaying two consumption streams from present (\(t=1\)) to the immediate future (\(t=2\)) can change the preferences of a decision-maker between these consumption streams.
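
A small numerical sketch may help to fix ideas. It assumes real-valued consumption with the illustrative utility u(x) = x and the parameter values \(\beta = 0.5\), \(\delta = 0.9\); the function names are hypothetical and the streams are chosen only to exhibit a reversal.

```python
def exponential(t, delta):
    """D(t) = delta**(t-1) for t = 1, 2, ..."""
    return delta ** (t - 1)

def quasi_hyperbolic(t, beta, delta):
    """D(1) = 1 and D(t) = beta * delta**(t-1) for t >= 2."""
    return 1.0 if t == 1 else beta * delta ** (t - 1)

def discounted_utility(stream, D):
    """Sum of D(t) * u(x_t), with the illustrative choice u(x) = x."""
    return sum(D(t) * x for t, x in enumerate(stream, start=1))

beta, delta = 0.5, 0.9
qh = lambda t: quasi_hyperbolic(t, beta, delta)
ex = lambda t: exponential(t, delta)

# Smaller-sooner versus larger-later, and the same pair delayed by one period.
early, late = (10, 0, 0), (0, 15, 0)
early_delayed, late_delayed = (0, 10, 0), (0, 0, 15)

qh_now = discounted_utility(early, qh) > discounted_utility(late, qh)
qh_delayed = discounted_utility(early_delayed, qh) > discounted_utility(late_delayed, qh)
ex_now = discounted_utility(early, ex) > discounted_utility(late, ex)
ex_delayed = discounted_utility(early_delayed, ex) > discounted_utility(late_delayed, ex)
```

Under the quasi-hyperbolic function the ranking of the two streams flips once both are delayed (`qh_now` differs from `qh_delayed`), whereas under the exponential function the ranking is the same before and after the delay.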

The results of the recent experiments by Chark et al. (2015) show that decision-makers are decreasingly impatient within the near future; however, they discount the remote future at a constant rate. In other words, present bias may extend over the present moment \((t=1)\) to the near future \(t>2\), with a constant discount factor from some period T. This gives a further generalization of quasi-hyperbolic discounting, which we call semi-hyperbolic discounting:

$$\begin{aligned} D(t) = \left\{ \begin{array}{ll} \ 1 &{}\ \text {if} \ t=1,\\ \ \displaystyle \prod _{i=1}^{t-1}\beta _i\delta &{}\ \text {if} \ 1<t \le T,\\ \ \delta ^{t-T}\displaystyle \prod _{i=1}^{T-1}\beta _i\delta &{} \ \text {if } \ t>T. \end{array} \right. \end{aligned}$$

We use SH(T) to denote this discount function (for given \(\delta , \beta _1, \ldots , \beta _{T-1}\)). This form of discounting was previously applied to model the time preferences of a decision-maker in a consumption-savings problem (Young 2007). Our SH(T) specification is not quite the same as the notion of semi-hyperbolic discounting used in Olea and Strzalecki (2014). They apply the term to any discount function which satisfies \(D(t)=\delta ^{t-T}D(T)\) for all \(t>T\) (for some T). This class includes SH(T), but is wider. The possibility of generalizing quasi-hyperbolic discounting was earlier suggested by Hayashi (2003). The form of the discount function he proposed is:

$$\begin{aligned} D(t) = \left\{ \begin{array}{ll} \ 1 &{}\ \text {if} \ t=1,\\ \ \displaystyle \prod _{i=1}^{t-1}\beta '_i &{}\ \text {if} \ 1<t \le T,\\ \ \delta ^{t-T}\displaystyle \prod _{i=1}^{T-1}\beta '_i &{} \ \text {if } \ t>T. \end{array} \right. \end{aligned}$$

By substituting \(\delta \beta _t = \beta '_t\) for all \(t\le {T-1}\) it is not difficult to see that semi-hyperbolic discounting SH(T) coincides with the form suggested by Hayashi (2003). It is worth mentioning that he does not provide an axiomatization of this form of discounting, pointing out that this case is somewhat complicated. In our framework, however, the axiomatization of semi-hyperbolic discounting can be obtained as a relatively straightforward extension of the axiomatization of quasi-hyperbolic discounting.
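
To spell out the substitution, reading \(\prod \beta _i\delta \) as \(\prod (\beta _i\delta )\), which is the reading consistent with the SH(T) formula above, we have for \(1<t\le T\) and for \(t>T\), respectively:

$$\begin{aligned} \prod _{i=1}^{t-1}\beta _i\delta = \prod _{i=1}^{t-1}\beta '_i, \qquad \delta ^{t-T}\prod _{i=1}^{T-1}\beta _i\delta = \delta ^{t-T}\prod _{i=1}^{T-1}\beta '_i. \end{aligned}$$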

The evidence of Chark et al. (2015) on extended present bias suggests the following restrictions on the coefficients in SH(T): \(\beta _1<\beta _2<\cdots <\beta _{T-1}\). In our version of SH(T) we will impose the weaker requirements \(\beta _1 \le \beta _2 \le \cdots \le \beta _{T-1}\), and \(\beta _t\in (0, 1]\) for all \(t\le T-1\) and \(\delta \in (0, 1)\). Imposing these restrictions gives some advantages, as it can be immediately seen that exponential and quasi-hyperbolic discounting are the special cases of semi-hyperbolic discounting: SH(1) is the exponential discount function, whereas SH(2) is the quasi-hyperbolic discount function.
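
The special cases can be checked numerically. The following is a minimal sketch of the SH(T) discount function, assuming the coefficients \(\beta _1, \ldots , \beta _{T-1}\) are passed as a list (so that T is the list length plus one); the function name `sh_discount` and the sample parameter values are illustrative only.

```python
from math import isclose, prod

def sh_discount(t, delta, betas):
    """SH(T) discount function for t = 1, 2, ..., with T = len(betas) + 1."""
    T = len(betas) + 1
    if t == 1:
        return 1.0
    if t <= T:
        # product of (beta_i * delta) for i = 1, ..., t-1
        return prod(b * delta for b in betas[: t - 1])
    # constant discount factor delta after period T
    return delta ** (t - T) * prod(b * delta for b in betas)

delta = 0.9
sh1 = [sh_discount(t, delta, []) for t in range(1, 7)]          # SH(1)
sh2 = [sh_discount(t, delta, [0.5]) for t in range(1, 7)]       # SH(2)
sh3 = [sh_discount(t, delta, [0.6, 0.8]) for t in range(1, 7)]  # SH(3)

# One-period discount factors D(t+1)/D(t) for the SH(3) example
ratios = [sh3[i + 1] / sh3[i] for i in range(len(sh3) - 1)]
```

With an empty coefficient list the values in `sh1` are \(\delta ^{t-1}\), and with a single coefficient `sh2` reproduces the quasi-hyperbolic values. In the SH(3) example the one-period factors in `ratios` rise from \(\beta _1\delta \) to \(\beta _2\delta \) and then stay at \(\delta \).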

Finally, another possible generalization of quasi-hyperbolic discounting for single dated outcomes was offered by Pan et al. (2015). The discount function they use is called two-stage exponential (TSE) discounting:

$$\begin{aligned} D(t) = \left\{ \begin{array}{ll} \ \alpha ^t &{}\ \text {if} \ t\le \lambda ,\\ \ (\frac{\alpha }{\beta })^{\lambda }\beta ^t &{}\ \text {if} \ t>\lambda , \end{array} \right. \end{aligned}$$

where \(t\in [0, T]\), \(\alpha , \beta \in [0, 1]\), and \(\lambda \in [0, T]\) is called a switch point. The key characteristics of TSE discounting are that it has a constant discount factor \(\alpha \) before a switch point \(\lambda \) and a constant discount factor \(\beta \) after a switch point \(\lambda \). The coefficient \((\frac{\alpha }{\beta })^{\lambda }\) is included to guarantee the continuity of the discount function. TSE discounting is given for continuous time. Note, however, that TSE discounting can be viewed as an alternative generalization of quasi-hyperbolic discounting and it is distinctly different from semi-hyperbolic discounting.
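
The role of the coefficient \((\frac{\alpha }{\beta })^{\lambda }\) can be seen in a short sketch. It assumes strictly positive \(\alpha \) and \(\beta \) (so that the ratio is defined); the function name `tse_discount` and the parameter values are hypothetical.

```python
from math import isclose

def tse_discount(t, alpha, beta, lam):
    """Two-stage exponential discounting in continuous time t >= 0."""
    if t <= lam:
        return alpha ** t
    return (alpha / beta) ** lam * beta ** t

alpha, beta, lam = 0.8, 0.95, 2.0

at_switch = tse_discount(lam, alpha, beta, lam)
just_after = tse_discount(lam + 1e-9, alpha, beta, lam)

# One-period discount factors before and after the switch point
factor_before = tse_discount(1.0, alpha, beta, lam) / tse_discount(0.0, alpha, beta, lam)
factor_after = tse_discount(4.0, alpha, beta, lam) / tse_discount(3.0, alpha, beta, lam)
```

The values `at_switch` and `just_after` are nearly identical, reflecting continuity at \(\lambda \), while `factor_before` and `factor_after` recover \(\alpha \) and \(\beta \) respectively.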

3 AA representations

We say that the preference order \(\succcurlyeq \) on \(X^n\) has an Anscombe and Aumann (AA) representation, if for every \(\mathbf{x}, \mathbf{y} \in X^n\):

$$\begin{aligned} \mathbf{x}\succcurlyeq \mathbf{y} \text { if and only if } \sum _{t=1}^ n w _tu(x_t) \ge \sum _{t=1}^ n w_tu(y_t), \end{aligned}$$

where \(u:X \rightarrow \mathbb {R}\) is non-constant and mixture linear and \(w_t \ge 0\) for each t with at least one \(w_t>0\). We also say that the pair \((u, \mathbf{w})\) provides an AA representation for \(\succcurlyeq \).

A pre-condition for obtaining discounting in an exponential or quasi-hyperbolic form is additive separability. In the framework of preferences over streams of lotteries, Anscombe and Aumann’s (1963) theorem provides axioms which give an additively separable representation when \(n < \infty \). Anscombe and Aumann formulated their result for acts rather than temporal streams. Here, states of the world are replaced by time periods.

3.1 Finite case \((n< \infty )\)

For \(n<\infty \) the following axioms are necessary and sufficient for an AA representation:

  • Axiom F1 (Weak order)  \(\succcurlyeq \) is a weak order on \(X^n\).

  • Axiom F2 (Non-triviality)  There exist some \({a, b} \in X\) such that

    $$\begin{aligned} (a, a, \ldots , a) \succ (b, b, \ldots , b). \end{aligned}$$
  • Axiom F3 (Mixture independence)  \(\mathbf{x}\succcurlyeq \mathbf{y}\) if and only if \(\mathbf{x} \lambda \mathbf{z}\succcurlyeq \mathbf{y} \lambda \mathbf{z}\) for every \(\lambda \in (0, 1)\) and every \(\mathbf{x}, \mathbf{y}, \mathbf{z} \in X^n\).

  • Axiom F4 (Mixture continuity)  For every \(\mathbf{x}, \mathbf{y}, \mathbf{z} \in X^n\) the sets \(\{ \alpha :\mathbf{x} \alpha \mathbf{z} \succcurlyeq \mathbf{y} \}\) and \(\{ \beta :\mathbf{y} \succcurlyeq \mathbf{x} \beta \mathbf{z} \}\) are closed subsets of the unit interval.

  • Axiom F5 (Monotonicity)  For every \(\mathbf{x}, \mathbf{y} \in X^n\) if \(x_t \succcurlyeq y_t\) for every t then \(\mathbf{x} \succcurlyeq \mathbf{y}\).

Theorem 1

(AA) The preferences \(\succcurlyeq \) on \(X^n\) satisfy axioms F1–F5 if and only if there exists an AA representation \((u, \mathbf{w})\) for \(\succcurlyeq \) on \(X^n\). Moreover, \((u^{\prime }, {\mathbf{w}}^{\prime })\) is another AA representation for \(\succcurlyeq \) on \(X^n\) if and only if there are some \(A>0\), some B and some \(C>0\) such that \(u^{\prime }=Au+B\) and \(\mathbf{w}^{\prime }=C \mathbf{w}\).

The proof of the theorem for the general mixture set environment can easily be constructed by combining the arguments in Fishburn (1982) and Ryan (2009). Evidently, the key axiom here is the condition of mixture independence. It is a strong axiom which imposes an additive structure.
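
The uniqueness statement can be illustrated on a small example. The sketch below assumes, purely for illustration, that consumption in each period is a number in {0, 1, 2} and that u is affine on the reals; `aa_value` and `same_ranking` are hypothetical helper names.

```python
from itertools import product

def aa_value(stream, u, w):
    """Weighted sum  sum_t w_t * u(x_t)  for an AA representation (u, w)."""
    return sum(wt * u(xt) for wt, xt in zip(w, stream))

def same_ranking(rep1, rep2, streams):
    """True if the two representations order every pair of streams identically."""
    (u1, w1), (u2, w2) = rep1, rep2
    return all(
        (aa_value(x, u1, w1) >= aa_value(y, u1, w1))
        == (aa_value(x, u2, w2) >= aa_value(y, u2, w2))
        for x in streams
        for y in streams
    )

u = lambda c: c
w = (4.0, 2.0, 1.0)
u_prime = lambda c: 3 * c + 5              # u' = A u + B with A = 3, B = 5
w_prime = tuple(0.5 * wt for wt in w)      # w' = C w with C = 0.5
w_other = (1.0, 2.0, 4.0)                  # not proportional to w

streams = list(product(range(3), repeat=3))
agrees = same_ranking((u, w), (u_prime, w_prime), streams)
agrees_other = same_ranking((u, w), (u, w_other), streams)
```

In this example the transformed pair induces the same ranking as the original pair, whereas the non-proportional weights `w_other` reverse the ranking of, for instance, (1, 0, 0) and (0, 0, 1).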

3.2 Infinite case (\(n=\infty \))

Anscombe and Aumann’s result may be extended to the infinite horizon case. One possible extension is given by Fishburn (1982). However, we give a slightly modified version which incorporates ideas from Harvey (1986).

Fix some \(x_0\in X\). We refer to the same \(x_0\) throughout the rest of the paper. A consumption stream \(\mathbf{x}\) is called ultimately \(x_0\)-constant if there exists T such that \(\mathbf{x}=(x_1, \ldots , x_T, x_0, x_0, \ldots )\). Note the difference between this term and the related notion of an “ultimately constant” stream in Bleichrodt et al. (2008) and Olea and Strzalecki (2014), which does not fix the value at which consumption is ultimately constant. Let \(X_T\) be the set of ultimately \(x_0\)-constant consumption streams of length T. Denote the union of the sets \(X_T\) over all T as \(X^{*}\). Let \(X^{**}\) be the union of \(X^*\) and all constant streams. It is not hard to see that both \(X^*, X^{**} \subset X^\infty \) are mixture sets.

We must mention that the fixed \(x_0\) serves two purposes: first, it will be needed to state the convergence axiom; and second, it allows us to define the class \(X^*\) of ultimately \(x_0\)-constant streams in a way that makes them a strict subset of the usually defined class. Since some of the axioms only restrict preferences over \(X^{**}\) this second aspect confers some advantages.

  • Axiom I1 (Weak order)  \(\succcurlyeq \) is a weak order on \(X^\infty \).

  • Axiom I2 (Non-triviality)  There exist some \({a, b} \in X\) such that \(a \succ x_0 \succ b\).

Axiom I2 implies that \(x_0\) is an interior point with respect to preference. It restricts both \(\succcurlyeq \) and the choice of the fixed element \(x_0\).

  • Axiom I3 (Mixture independence)  \(\mathbf{x}\succcurlyeq \mathbf{y}\) if and only if \(\mathbf{x} \lambda \mathbf{z}\succcurlyeq \mathbf{y} \lambda \mathbf{z}\) for every \(\lambda \in (0, 1)\) and every \(\mathbf{x},\mathbf{y}, \mathbf{z} \in X^{**}\).

  • Axiom I4 (Mixture continuity)  For every \(\mathbf{x}, \mathbf{z} \in X^{**}\) and every \(\mathbf{y} \in X^\infty \) the sets \(\{ \alpha :\mathbf{x} \alpha \mathbf{z} \succcurlyeq \mathbf{y} \}\) and \(\{ \beta :\mathbf{y} \succcurlyeq \mathbf{x} \beta \mathbf{z} \}\) are closed subsets of the unit interval.

  • Axiom I5 (Monotonicity)  For every \(\mathbf{x}, \mathbf{y} \in X^\infty \): if \(x_t \succcurlyeq y_t\) for every t then \(\mathbf{x} \succcurlyeq \mathbf{y}\).

We have applied a weaker version of the monotonicity axiom in comparison with the interperiod monotonicity used by Fishburn. However, Axiom I5 is sufficient to obtain an AA representation.

For the statement of the next axiom we need to introduce some notation. Let \([a]_k=(x_0, \ldots , x_0, a, x_0, \ldots )\) where \(a\in X\) is in the kth position. Using this notation, we state the following axiom:

  • Axiom I6 (Convergence)  For every \(\mathbf{x} = (x_1, x_2, \ldots ) \in X^\infty \), every \(x^+, x^- \in X\) and every k:

    • if \([x^+]_k\succ [x_k]_k\) there exists \(T^+\ge k\) such that

      $$\begin{aligned} \mathbf{x} \preccurlyeq \mathbf{x}_{k,T}^+ \ \text {for all }T \ge T^+, \end{aligned}$$

      where \(\mathbf{x}_{k,T}^+=(x_1, x_2, \ldots , x_{k-1}, x^+, x_{k+1}, \ldots , x_T, x_0, x_0, \ldots )\);

    • if \([x^-]_k \prec [x_k]_k\) there exists \(T^- \ge k\) such that

      $$\begin{aligned} \mathbf{x} \succcurlyeq \mathbf{x}_{k,T}^- \ \text {for all }T \ge T^-, \end{aligned}$$

      where \(\mathbf{x}_{k,T}^-=(x_1, x_2, \ldots , x_{k-1}, x^-, x_{k+1}, \ldots , x_T, x_0, x_0, \ldots )\).

Our convergence axiom differs from Axiom B6, which was used by Fishburn:

  • Axiom B6  For some \(\hat{x}\in X\), every \(\mathbf{x}, \mathbf{y} \in X^\infty \) and every \(\lambda \in (0, 1)\):

    • if \(\mathbf{x} \succ \mathbf{y}\), then there exists T such that \((x_1, \ldots , x_n, \hat{x}, \hat{x}, \ldots ) \succcurlyeq \mathbf{x} \lambda \mathbf{y}\) for all \(n \ge T\);

    • if \(\mathbf{x} \prec \mathbf{y}\), then there exists T such that \((x_1, \ldots , x_n, \hat{x}, \hat{x}, \ldots ) \preccurlyeq \mathbf{x} \lambda \mathbf{y}\) for all \(n \ge T\).

Instead, Axiom I6 adapts ideas from Harvey (1986). Axiom I6 is more appealing for our purposes as it not only guarantees the convergence of the AA representation, but also allows us to relax two axioms, mixture independence and mixture continuity, which are no longer required to hold on all of \(X^\infty \).

We thus obtain the following representation:

Theorem 2

(Infinite AA) The preferences \(\succcurlyeq \) on \(X^\infty \) satisfy axioms I1–I6 if and only if there exists an AA representation \((u, \mathbf{w})\) for \(\succcurlyeq \) on \(X^\infty \). Moreover, \((u^{\prime }, \mathbf{w}^{\prime })\) is another AA representation for \(\succcurlyeq \) on \(X^\infty \) if and only if there are some \(A>0\), some B and some \(C>0\) such that \(u=Au^{\prime }+B\) and \(\mathbf{w}=C \mathbf{w}^{\prime }\).

The proof of Theorem 2 is given in the Appendix. It combines elements of the arguments in Fishburn (1982), Harvey (1986) and Ryan (2009).

4 Discounted utility: finite case (\(n<\infty \))

4.1 Exponential discounting

Recall that a preference \(\succcurlyeq \) on \(X^n\) is represented by an exponentially discounted utility function if there exists a non-constant function \(u :X \rightarrow \mathbb {R}\) and a parameter \(\delta \in (0, 1)\) such that

$$\begin{aligned} U(\mathbf{x}) = \sum _{t=1}^n \delta ^{t-1}u(x_t). \end{aligned}$$

If u is mixture linear (and non-constant), then we say that the pair \((u, {\delta })\) provides an exponentially discounted expected utility representation.

Based on Theorem 1 it is easy to obtain such a representation. To do so, an adjustment of non-triviality and two additional axioms—impatience and stationarity—are required.

  • Axiom F2 \(^\prime \) (Essentiality of period 1)  There exist some \(a, b \in X\) and some \(\mathbf{x}\in X^n\) such that \((a, x_2, \ldots , x_n) \succ (b, x_2, \ldots , x_n)\).

  • Axiom F6 (Impatience)  For all \(a, b \in X\) if \(a\succ b\), then for all \(\mathbf{x} \in X^n\)

    $$\begin{aligned} (a, b, x_3,\ldots ,x_n) \succ (b, a, x_3, \ldots , x_n). \end{aligned}$$
  • Axiom F7 (Stationarity)  The preference (\(a, x_2,\ldots , x_n)\succcurlyeq (a, y_2,\ldots , y_n)\) holds if and only if \((x_2,\ldots , x_n, a)\succcurlyeq (y_2,\ldots , y_n, a)\) for every \(a\in X\) and every \(\mathbf{x}, \mathbf{y} \in X^n\).

It is not hard to see that essentiality of each period t follows from the essentiality of period 1 and the stationarity axiom.

Now the following result can be stated:

Theorem 3

(Exponential discounting) The preferences \(\succcurlyeq \) on \(X^n\) satisfy axioms F1, F2\(^\prime \), F3–F7 if and only if there exists an exponentially discounted expected utility representation \((u, {\delta })\) for \(\succcurlyeq \) on \(X^n\). Moreover, \((u^{\prime }, { \delta ^{\prime }})\) is another exponentially discounted expected utility representation for \(\succcurlyeq \) on \(X^n\) if and only if there are some \(A>0\) and some B such that \(u=Au^{\prime }+B\) and \(\delta =\delta ^\prime \).

Proof

It is straightforward to show that the axioms are implied by the representation. Conversely, suppose the axioms hold. Note that non-triviality follows from essentiality of period 1 and monotonicity.

By Theorem 1 we, therefore, know that \(\succcurlyeq \) has an AA representation \((u, \mathbf{w})\). Define \(\succcurlyeq ^{\prime }\) on \(X^{n-1}\) as follows:

$$\begin{aligned} (x_1,\ldots , x_{n-1})\succcurlyeq ^{\prime } (y_1,\ldots , y_{n-1}) \Leftrightarrow (x_0, x_1,\ldots , x_{n-1})\succcurlyeq (x_0, y_1,\ldots , y_{n-1}). \end{aligned}$$

Then \(\succcurlyeq ^{\prime }\) is represented by:

$$\begin{aligned} U^{\prime } (\mathbf{x}) =w_2 u(x_1)+\cdots +w_n u(x_{n-1}). \end{aligned}$$

Next, define \(\succcurlyeq ^{\prime \prime }\) on \(X^{n-1}\) as follows:

$$\begin{aligned} (x_1,\ldots , x_{n-1})\succcurlyeq ^{\prime \prime } (y_1,\ldots , y_{n-1}) \Leftrightarrow (x_1,\ldots , x_{n-1}, x_0)\succcurlyeq (y_1,\ldots , y_{n-1}, x_0). \end{aligned}$$

Then \(\succcurlyeq ^{\prime \prime }\) is represented by:

$$\begin{aligned} U^{\prime \prime } (\mathbf{x})=w_1 u(x_1)+\cdots +w_{n-1} u(x_{n-1}). \end{aligned}$$

According to stationarity, these preferences are equivalent \(\left( \succcurlyeq ^{\prime }\equiv \succcurlyeq ^{\prime \prime } \right) \) with two different AA representations (\(U^{\prime }\) and \(U^{\prime \prime }\)). Preference orders \(\succcurlyeq ^{\prime }\equiv \succcurlyeq ^{\prime \prime }\) satisfy the AA axioms on \(X^{n-1}\). Recall that \(w_t\) are unique up to a scale. Hence, \(w_{t+1}=\delta w_t\) for some \(\delta >0\) and it follows that

$$\begin{aligned} w_n=\delta w_{n-1}=\delta ^2 w_{n-2}=\cdots = \delta ^{n-t} w_t=\cdots =\delta ^{n-1} w_1. \end{aligned}$$

Since all periods are essential it is without loss of generality to set \(w_1=1\). Then we obtain the following representation for \(\succcurlyeq \) on \(X^n\):

$$\begin{aligned} U(\mathbf{x}) = \sum _{t=1}^n\delta ^{t-1}u(x_t), \quad \text {where}\, \delta >0. \end{aligned}$$

Since impatience holds: if \(a\succ b\), then

$$\begin{aligned} (a, b, x_3,\ldots , x_n)\succ (b, a, x_3,\ldots , x_n). \end{aligned}$$

From the representation it follows that:

$$\begin{aligned} u(a)+\delta u(b) > u(b)+\delta u(a), \end{aligned}$$

or, equivalently,

$$\begin{aligned} (1-\delta ) (u(a) - u(b)) > 0. \end{aligned}$$

As \(u(a)>u(b)\) and \(\delta >0\), we conclude that \(\delta \in (0,1)\).

We now prove the uniqueness part of the theorem. Suppose that \((u, {\delta })\) and \((u^{\prime }, {\delta ^{\prime }})\) both provide exponentially discounted expected utility representations for \(\succcurlyeq \) on \(X^n\). We need to show that \(u=Au^{\prime }+B\) for some \(A>0\) and \(\delta =\delta ^{\prime }\). Indeed, since \((u, { \delta })\) and \((u^{\prime }, {\delta ^{\prime }})\) both provide AA representations for \(\succcurlyeq \) it follows that \(u=Au^{\prime }+B\) for some \(A>0\) and some B, and there is some \(C>0\) such that \(\delta ^{t-1}=C(\delta ^{\prime })^{t-1}\) for all t. Taking \(t=1\) we obtain \(C=1\), and hence \(\delta =\delta ^{\prime }\). The sufficiency of the uniqueness conditions follows by routine arguments. \(\square \)
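
The central step of the proof, that stationarity forces consecutive AA weights to share a common ratio, can be sketched numerically. The code assumes strictly positive weights (all periods essential) supplied as a list of length at least two; `common_ratio` is a hypothetical helper, not part of the formal argument.

```python
from math import isclose

def common_ratio(w, rel_tol=1e-9):
    """Return delta if w[t+1] = delta * w[t] for every t, otherwise None."""
    ratios = [w[t + 1] / w[t] for t in range(len(w) - 1)]
    if all(isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios):
        return ratios[0]
    return None

# Exponential weights: (w_2, ..., w_n) is proportional to (w_1, ..., w_{n-1}).
w_exp = [0.9 ** t for t in range(5)]
# Quasi-hyperbolic weights with beta = 0.5: the first ratio differs from the rest.
w_qh = [1.0] + [0.5 * 0.9 ** t for t in range(1, 5)]

delta_exp = common_ratio(w_exp)
delta_qh = common_ratio(w_qh)
```

For `w_exp` the helper returns a value close to 0.9, while for `w_qh` it returns `None`, since the ratio between the first two weights is \(\beta \delta \) rather than \(\delta \).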

4.2 Semi-hyperbolic discounting

A preference \(\succcurlyeq \) on \(X^n\) has an SH(T) discounted utility representation if there exists a non-constant function \(u :X \rightarrow \mathbb {R}\) and parameters \(\beta _1 \le \beta _2 \le \cdots \le \beta _{T-1}\), and \(\beta _t\in (0, 1]\) for all \(t\le T-1\) and \(\delta \in (0, 1)\) such that the following function represents \(\succcurlyeq \):

$$\begin{aligned} U(\mathbf{x})= & {} u(x_1)+\beta _1 \delta u(x_2)+\beta _1 \beta _2 \delta ^2 u(x_3)+\cdots +\beta _1 \beta _2\cdots \beta _{T-2} \delta ^{T-2}u(x_{T-1}) \\&+\;\beta _1 \beta _2\cdots \beta _{T-1} \sum _{t=T}^n \delta ^{t-1}u(x_t). \end{aligned}$$

If u is mixture linear (and non-constant), then the function U is called an SH(T) discounted expected utility representation. In this case, we say that \((u, \varvec{\beta }, \delta )\) provides an SH(T) discounted expected utility representation, where \(\varvec{\beta }=(\beta _1, \beta _2, \ldots , \beta _{T-1})\).

To obtain this form of discounting, a number of modifications to the set of axioms are required. A stronger essentiality condition should be used:

  • Axiom F2 \(^{\prime \prime }\) (Essentiality of periods \(1,\ldots , T\)) There exist some \(a, b \in X\) and some \(\mathbf{x}\in X^n\) such that for every \(t=1,\ldots , T\):

    $$\begin{aligned} (x_1, x_2, \ldots , x_{t-1}, a, x_{t+1}, \ldots , x_n) \succ (x_1, x_2, \ldots , x_{t-1}, b, x_{t+1}, \ldots , x_n). \end{aligned}$$

The impatience axiom, which is used to guarantee \(\delta \in (0, 1)\), should be restated for the periods T and \(T+1\):

  • Axiom F6 \(^{\prime }\) (Impatience) For every \(a, b \in X\) if \(a\succ b\), then for every \(\mathbf{x} \in X^n\):

    $$\begin{aligned} (x_1,\ldots , x_{T-1}, a, b, x_{T+2},\ldots ,x_n) \succ (x_1, \ldots , x_{T-1}, b, a, x_{T+2},\ldots ,x_n). \end{aligned}$$

The generalization requires relaxing the axiom of stationarity to stationarity from period T.

  • Axiom F7 \(^{\prime }\) (Stationarity from period T) The preference

    $$\begin{aligned} (x_1, \ldots , x_{T-1}, a, x_{T+1},\ldots , x_n)\succcurlyeq (x_1, \ldots , x_{T-1}, a, y_{T+1},\ldots , y_n) \end{aligned}$$

    holds if and only if

    $$\begin{aligned} (x_1, \ldots , x_{T-1}, x_{T+1},\ldots , x_n, a)\succcurlyeq (x_1, \ldots , x_{T-1}, y_{T+1},\ldots , y_n, a) \end{aligned}$$

    for every \(a \in X\) and every \(\mathbf{x}, \mathbf{y} \in X^n\).

The axiom of present bias (Olea and Strzalecki (2014), Axiom 10) for the preference order on \(X^\infty \) involves trade-offs between two periods \(\{1, 2\}\).

  • Axiom (Present Bias (Olea and Strzalecki (2014), Axiom 10)) For every \(a, b, c, d, e \in X\) such that \(a\succ c, b\prec d\), for all \(\mathbf{x} \in X^\infty \):

    $$\begin{aligned} \text {if } (e, a, b, e, \ldots )\sim (e, c, d, e, \ldots ), \ \text {then} \ (a, b, e, \ldots ) \succcurlyeq (c, d, e, \ldots ). \end{aligned}$$

The present bias axiom can be described as follows. Suppose there are two equivalent consumption streams one of which has larger consumption at \(t=2\) but smaller consumption at \(t=3\) than the other, with consumptions at other periods being equal. Then if the consumption at period \(t=1\) is removed from both streams and both streams are shifted forward by one period, a decision-maker will prefer the stream with the bigger consumption at \(t=1\) but smaller consumption at \(t=2\), thus valuing present consumption (\(t=1\)) more highly. In our framework, this axiom can be adapted to the finite case and extended so that present bias may arise between any periods \(\{t, t+1\}\), where \(t\le T\). Suppose that there are two identical consumption streams that differ only in values at periods \(\{t, t+1\}\), where \(t\le T\). Early bias between \(\{t, t+1\}\) means that if the first stream has a bigger level of consumption at period t but smaller level of consumption at period \(t+1\) than the second stream, then shifting the consumption at period \(t-1\) to the last period and shifting all consumption from period t onwards forward by one period changes the preference in favour of the first consumption stream.

  • Axiom F8 (Early bias) For every \(a, b, c, d \in X\) such that \(a\succ c, b\prec d\), for all \(\mathbf{x} \in X^n\) and every \(t \le T\) if

    $$\begin{aligned}&(x_1, \ldots , x_{t-1}, a, b, x_{t+2}, \ldots , x_n)\sim (x_1, \ldots , x_{t-1}, c, d, x_{t+2}, \ldots , x_n), \ \text {then}\\&(x_1, \ldots , x_{t-2}, a, b, x_{t+2}, \ldots , x_n, x_{t-1}) \succcurlyeq (x_1, \ldots , x_{t-2}, c, d, x_{t+2}, \ldots , x_n, x_{t-1}). \end{aligned}$$

The early bias axiom is also referred to as the extended present bias axiom.

Theorem 4

(Semi-hyperbolic discounting) The preferences \(\succcurlyeq \) on \(X^n\) satisfy axioms F1, F2\(^{\prime \prime }\), F3, F4, F5, F6\(^{\prime }\), F7\(^{\prime }\), F8 if and only if there exists an SH(T) discounted expected utility representation \((u, \varvec{\beta }, \delta )\) for \(\succcurlyeq \) on \(X^n\). Moreover, \((u^{\prime }, {\varvec{\beta }}', \delta ^{\prime })\) is another SH(T) discounted expected utility representation for \(\succcurlyeq \) on \(X^n\) if and only if there are some \(A>0\) and some B such that \(u=Au^{\prime }+B\) and \(\delta =\delta ^\prime \), \(\varvec{\beta }={\varvec{\beta }}^\prime \).

Proof

It can be easily seen that the axioms are implied by the representation. Conversely, suppose that the axioms hold. As in the proof of Theorem 3, the conditions for an AA representation are satisfied, so \(\succcurlyeq \) has an AA representation \((u, \mathbf{w})\). Define \(\succcurlyeq '\) on \(X^{n-T}\) as follows:

$$\begin{aligned}&(x_{1},\ldots , x_{n-T})\succcurlyeq ' (y_{1},\ldots , y_{n-T})\\&\quad \Leftrightarrow (x_0, \ldots , x_0, x_{1},\ldots , x_{n-T})\succcurlyeq (x_0, \ldots , x_0, y_{1},\ldots , y_{n-T}). \end{aligned}$$

Then \(\succcurlyeq '\) is represented by:

$$\begin{aligned} U'(\mathbf{x})=w_{T+1} u(x_{1})+\cdots +w_n u(x_{n-T}). \end{aligned}$$

Next, define \(\succcurlyeq ''\) on \(X^{n-T}\) as follows:

$$\begin{aligned}&(x_{1},\ldots , x_{n-T})\succcurlyeq '' (y_{1},\ldots , y_{n-T})\\&\quad \Leftrightarrow (x_0, \ldots , x_0, x_{1},\ldots , x_{n-T}, x_0)\succcurlyeq (x_0, \ldots , x_0, y_{1},\ldots , y_{n-T}, x_0). \end{aligned}$$

Then \(\succcurlyeq ''\) is represented by:

$$\begin{aligned} U''(\mathbf{x})=w_T u(x_{1})+\cdots +w_{n-1} u(x_{n-T}). \end{aligned}$$

According to stationarity from period T, the preferences are equivalent \(\left( \succcurlyeq '\equiv \succcurlyeq ''\right) \) with two different AA representations (\(U'\) and \(U''\)).

Preference orders \(\succcurlyeq '\equiv \succcurlyeq ''\) satisfy the AA axioms on \(X^{n-T}\). Recall that \(w_t\) are unique up to a scale. Hence, as essentiality holds for all t (which follows from Axiom F2\(^{\prime \prime }\) and Axiom F7\(^{\prime }\)), we have \(w_{t+1}=\delta w_t\) for some \(\delta >0\) and all \(t = T, \ldots , n-1\), and hence

$$\begin{aligned} w_n=\delta w_{n-1}=\delta ^2 w_{n-2}=\cdots = \delta ^{n-t} w_t=\cdots =\delta ^{n-T} w_T. \end{aligned}$$

Therefore, \(w_t=\delta ^{t-T}w_T\) for all \(t\ge T+1\). We, therefore, obtain the following representation for \(\succcurlyeq \):

$$\begin{aligned} U(\mathbf{x}) = w_1 u(x_1)+\cdots + w_{T-1} u(x_{T-1}) + w_T \sum _{t=T}^n \delta ^{t-T} u(x_t). \end{aligned}$$

Because of the essentiality of the first period and uniqueness of u up to affine transformations, the function

$$\begin{aligned} {\hat{U}}(\mathbf{x}) = u(x_1)+\frac{w_2}{w_1}u(x_2)+\cdots +\frac{w_{T-1}}{w_1}u(x_{T-1})+ \frac{w_T}{w_1} \sum _{t=T}^n \delta ^{t-T} u(x_t), \end{aligned}$$

provides an alternative representation for \(\succcurlyeq \) which will be used instead of \(U(\mathbf{x})\) further in the proof.

Note that

$$\begin{aligned}&\frac{w_3}{w_1}=\frac{w_3}{w_2} \cdot \frac{w_2}{w_1},\\&\cdots , \\&\frac{w_T}{w_1}=\frac{w_T}{w_{T-1}}\cdot \frac{w_{T-1}}{w_{T-2}} \cdot \ldots \cdot \frac{w_2}{w_1}. \end{aligned}$$

Let \(\gamma _{t-1}=\frac{w_t}{w_{t-1}}\) for all \(t\le T\). Therefore,

$$\begin{aligned}&\frac{w_2}{w_1}=\gamma _1, \\&\frac{w_3}{w_1}=\gamma _1 \gamma _2, \\&\cdots , \\&\frac{w_T}{w_1}=\gamma _1 \gamma _2 \ldots \gamma _{T-1}. \end{aligned}$$

With this notation:

$$\begin{aligned} {\hat{U}}(\mathbf{x}) = u(x_1)+\gamma _1 u(x_2)+\cdots + \gamma _1\cdots \gamma _{T-2} u(x_{T-1}) + \gamma _1\cdots \gamma _{T-1}\sum _{t = T}^n \delta ^{t-T} u(x_t). \end{aligned}$$

It is necessary to show that \(\gamma _{t-1}=\beta _{t-1}\delta \) with \(\beta _{t-1}\in (0,1]\) for all \(t\le T\).
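
To see what this claim amounts to for concrete numbers, the sketch below recovers \(\gamma _t\), \(\delta \) and the implied \(\beta _t=\gamma _t/\delta \) from a list of AA weights. It assumes strictly positive weights with more than T entries and takes \(\delta \) to be the ratio \(w_{T+1}/w_T\); the helper names and the sample weights are hypothetical.

```python
from math import isclose

def gammas_from_weights(w, T):
    """gamma_j = w_{j+1} / w_j for j = 1, ..., T-1 (the list w is 0-indexed)."""
    return [w[j] / w[j - 1] for j in range(1, T)]

def betas_from_weights(w, T):
    """Return (delta, [beta_1, ..., beta_{T-1}]) with beta_j = gamma_j / delta."""
    delta = w[T] / w[T - 1]      # ratio w_{T+1} / w_T
    return delta, [g / delta for g in gammas_from_weights(w, T)]

# Sample weights generated by SH(3) with delta = 0.9 and betas = (0.6, 0.8)
w = [1.0, 0.54, 0.3888, 0.34992, 0.314928]
gammas = gammas_from_weights(w, T=3)
delta, betas = betas_from_weights(w, T=3)
```

For these sample weights the recovered coefficients are approximately (0.6, 0.8), both in \((0, 1]\) and non-decreasing; the remainder of the proof establishes that the axioms guarantee this in general.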

Suppose that \(t=T\). Choose \(a, b, c, d \in X\) such that \(u(b)<u(d)\), \(u(a)>u(c)\) and

$$\begin{aligned} \gamma _1\cdots \gamma _{T-1} u(a)+\gamma _1\cdots \gamma _{T-1}\delta u(b)=\gamma _1\cdots \gamma _{T-1} u(c)+\gamma _1\cdots \gamma _{T-1}\delta u(d). \end{aligned}$$
(1)

Since essentiality is satisfied for each period, the common factor \(\gamma _1\cdots \gamma _{T-1}\) is positive and equation (1) can be rearranged as:

$$\begin{aligned} \delta = \frac{u(a)-u(c)}{u(d)-u(b)}. \end{aligned}$$
(2)

From (1) it also follows that

$$\begin{aligned} (x_1, \ldots , x_{T-1}, a, b, x_{T+2}, \ldots , x_n)\sim (x_1, \ldots , x_{T-1}, c, d, x_{T+2}, \ldots , x_n). \end{aligned}$$

Therefore, by the early bias axiom:

$$\begin{aligned}&(x_1, \ldots , x_{T-2}, a, b, x_{T+2}, \ldots , x_n, x_{T-1}) \\&\quad \succcurlyeq (x_1, \ldots , x_{T-2}, c, d, x_{T+2}, \ldots , x_n, x_{T-1}). \end{aligned}$$

Thus, we obtain:

$$\begin{aligned} \gamma _1\cdots \gamma _{T-2} u(a)+\gamma _1\cdots \gamma _{T-1} u(b) \ge \gamma _1\cdots \gamma _{T-2} u(c)+\gamma _1\cdots \gamma _{T-1} u(d). \end{aligned}$$

Since the essentiality condition is satisfied for each period we can rearrange this inequality:

$$\begin{aligned} \gamma _{T-1} \le \frac{u(a)-u(c)}{u(d)-u(b)}. \end{aligned}$$
(3)

Comparing (2) with (3), we conclude that \(\delta \ge \gamma _{T-1}\). Therefore, \(\gamma _{T-1}=\beta _{T-1} \delta \), where \(\beta _{T-1}\in (0, 1]\).

Analogously, suppose that \(t=T-1\). Choose \(a', b', c', d' \in X\) such that \(u(b')<u(d')\), \(u(a')>u(c')\) and

$$\begin{aligned} \gamma _1\cdots \gamma _{T-2} u(a')+\gamma _1\cdots \gamma _{T-1} u(b') = \gamma _1\cdots \gamma _{T-2} u(c')+\gamma _1\cdots \gamma _{T-1} u(d'),\qquad \end{aligned}$$
(4)

where the last equality can be rewritten as follows (since essentiality is satisfied):

$$\begin{aligned} \gamma _{T-1}=\frac{u(a')-u(c')}{u(d')-u(b')}. \end{aligned}$$
(5)

Then (4), early bias and essentiality of each period imply that

$$\begin{aligned} \gamma _{T-2} \le \frac{u(a')-u(c')}{u(d')-u(b')}. \end{aligned}$$
(6)

It follows from (5) and (6) that \(\gamma _{T-2} \le \gamma _{T-1}\). Therefore, \(\gamma _{T-2} = \beta '_{T-2} \gamma _{T-1}\), where \(\beta '_{T-2}\in (0, 1]\). Recall that \(\gamma _{T-1}=\beta _{T-1} \delta \). Hence,

$$\begin{aligned} \gamma _{T-2} = \beta '_{T-2} \beta _{T-1} \delta =\beta _{T-2} \delta , \end{aligned}$$

where \(\beta _{T-2} = \beta '_{T-2} \beta _{T-1}\) and \(\beta _{T-2} \in (0,1]\) as both \(\beta '_{T-2}\in (0, 1]\) and \(\beta _{T-1}\in (0, 1]\). Note also that \(\beta _{T-2}\le \beta _{T-1}\).

Using the early bias axiom repeatedly for \(t<T-1\), we obtain \(\gamma _{t-1}=\beta _{t-1}\delta \) with \(\beta _{t-1}\in (0,1]\) for all \(t\le T\) and \(\beta _1 \le \beta _2 \le \cdots \le \beta _{T-1}\). Hence,

$$\begin{aligned} {\hat{U}}(\mathbf{x}) =&\, u(x_1)+\beta _1 \delta u(x_2)+\beta _1 \beta _2 \delta ^2 u(x_3)+\cdots +\beta _1 \beta _2\cdots \beta _{T-2} \delta ^{T-2}u(x_{T-1}) \\&+\beta _1 \beta _2\cdots \beta _{T-1}\sum _{t=T}^n \delta ^{t-1}u(x_t). \end{aligned}$$

To show that \(\delta \in (0,1)\) the impatience axiom should be applied. For every \(a, b \in X\) if \(a\succ b\), then for every \(\mathbf{x} \in X^n\)

$$\begin{aligned} (x_1,\ldots , x_{T-1}, a, b, x_{T+2},\ldots ,x_n) \succ (x_1, \ldots , x_{T-1}, b, a, x_{T+2},\ldots ,x_n). \end{aligned}$$

Then

$$\begin{aligned} \beta _1\cdots \beta _{T-1} \delta ^{T-1} u(a) \!+ \!\beta _1\cdots \beta _{T-1} \delta ^T u(b) \!> \!\beta _1\cdots \beta _{T-1} \delta ^{T-1} u(b) \!+\!\beta _1\cdots \beta _{T-1} \delta ^T u(a). \end{aligned}$$

Therefore, due to essentiality of each period:

$$\begin{aligned} (1-\delta )(u(a) - u(b)) > 0. \end{aligned}$$

Since \(a\succ b\) implies \(u(a)>u(b)\), it follows that \(\delta <1\). Hence, \(\delta \in (0, 1)\).

We now prove uniqueness. Suppose that \((u, \varvec{\beta }, \delta )\) and \((u^{\prime }, {\varvec{\beta }}', \delta ^{\prime })\) both provide SH(T) discounted expected utility representations for \(\succcurlyeq \) on \(X^n\). Let D(t) and \(D^{\prime }(t)\) be semi-hyperbolic discount functions for given \(\varvec{\beta }, \delta \) and \({\varvec{\beta }}^{\prime }, \delta ^{\prime }\), respectively. Since \((u, \varvec{\beta }, \delta )\) and \((u^{\prime }, {\varvec{\beta }}', \delta ^{\prime })\) both provide AA representations for \(\succcurlyeq \), it follows that \(u=Au^{\prime }+B\) for some \(A>0\) and some B, and there is some \(C>0\) such that \(D(t)=C\cdot D^{\prime }(t)\) for all t. Taking \(t=1\) we obtain \(C=1\) (since \(D(1)=D^{\prime }(1)=1\)), and hence, letting \(t=2, 3, \ldots , T\), we get \(\beta _{t-1} \delta =\beta ^{\prime }_{t-1} \delta ^{\prime }\) for all \(2\le t\le T\). Finally, letting \(t=T+1\) we conclude that \(\delta =\delta ^{\prime }\). Therefore, \(\varvec{\beta }={\varvec{\beta }}^{\prime }\). The sufficiency of the uniqueness conditions follows by routine arguments. \(\square \)
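The uniqueness argument can be traced numerically. The following is a minimal sketch, not part of the original proof: the function names `sh_discount` and `recover_parameters` and the parameter values are hypothetical, chosen only for illustration. It builds the semi-hyperbolic weights \(D(t)\) from \((\varvec{\beta }, \delta )\) and then inverts them, using \(D(1)=1\) to fix the scale, \(D(T+1)/D(T)\) to obtain \(\delta \), and successive ratios to obtain each \(\beta _t\).

```python
from math import isclose


def sh_discount(betas, delta, horizon):
    """Weights D(1..horizon) with D(1) = 1, where T = len(betas) + 1."""
    T = len(betas) + 1
    weights = [1.0]
    for t in range(1, horizon):
        # Moving from period t to t+1 multiplies the weight by beta_t * delta
        # before period T and by delta alone from period T onwards.
        beta_t = betas[t - 1] if t < T else 1.0
        weights.append(weights[-1] * beta_t * delta)
    return weights


def recover_parameters(weights, T):
    """Recover (betas, delta) from weights proportional to D(1..n), n >= T + 1."""
    if len(weights) < T + 1:
        raise ValueError("need weights for periods 1..T+1")
    normalised = [w / weights[0] for w in weights]  # enforces D(1) = 1, i.e. C = 1
    delta = normalised[T] / normalised[T - 1]       # D(T+1) / D(T)
    betas = [normalised[t] / (normalised[t - 1] * delta) for t in range(1, T)]
    return betas, delta


betas, delta = [0.6, 0.8], 0.9       # illustrative values only, T = 3
D = sh_discount(betas, delta, horizon=6)
rescaled = [5.0 * w for w in D]      # a different scale C yields the same parameters
recovered_betas, recovered_delta = recover_parameters(rescaled, T=3)
```

The recovery needs a weight for period \(T+1\), which mirrors the step "letting \(t=T+1\)" in the proof.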

5 Discounted utility: infinite case (\(n=\infty \))

5.1 Exponential discounting

Based on the AA representation for the preferences over infinite consumption streams (Theorem 2), with some strengthening of non-triviality (Axiom I2) and the addition of a suitable stationarity axiom, discounting functions in an exponential form can be obtained. The impatience axiom is not needed since convergence (Axiom I6) plays its role.

  • Axiom I2\(^\prime \) (Essentiality of period 1) There exist some \(a, b \in X\) such that \([a]_1 \succ x_0 \succ [b]_1\).

  • Axiom I7 (Stationarity) The preference \((a, x_1, x_2, \ldots ) \succcurlyeq (a, y_1, y_2, \ldots )\) holds if and only if \( (x_1, x_2, \ldots ) \succcurlyeq (y_1, y_2, \ldots )\) for every \(a \in X\) and every \(\mathbf{x}, \mathbf{y} \in X^\infty \).

Theorem 5

(Exponential discounting) The preferences \(\succcurlyeq \) on \(X^\infty \) satisfy axioms I1, I2\(^\prime \), I3–I7 if and only if there exists an exponentially discounted expected utility representation \((u, \delta )\) for \(\succcurlyeq \) on \(X^\infty \). Moreover, \((u^{\prime }, \delta ^{\prime })\) is another exponentially discounted expected utility representation for \(\succcurlyeq \) on \(X^{\infty }\) if and only if there are some \(A>0\) and some B such that \(u=Au^{\prime }+B\) and \(\delta =\delta ^{\prime }\).

Proof

The necessity of the axioms is straightforward. The proof of sufficiency follows the steps of the proof of Theorem 3 with \(n=\infty \). Applying Theorem 2 to the preferences satisfying the stationarity axiom, we obtain the representation:

$$\begin{aligned} U(\mathbf{x}) = \sum _{t=1}^\infty \delta ^{t-1}u(x_t), \end{aligned}$$

where \(\delta >0\) and \(\mathbf{x} \in X^\infty \).

Next, instead of using the impatience axiom as it is done in the finite case, the convergence axiom is applied. Take a constant stream \(\mathbf{a}=(a, a, \ldots )\), such that \(u(a) \ne 0\). Then,

$$\begin{aligned} U(\mathbf{a}) = \sum _{t=1}^\infty \delta ^{t-1}u(a) = u(a) \sum _{t=1}^\infty \delta ^{t-1}. \end{aligned}$$

Since \(u(a)\ne 0\), convergence of this series requires \(\delta <1\). The proof of the uniqueness claims is analogous to that of Theorem 3. \(\square \)
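The role of convergence can be illustrated with truncated sums. The sketch below is only a numerical illustration under assumed values (\(u(a)=1\), and the hypothetical helper names `discounted_value` and `constant_stream_partial_sums`); it is not a substitute for the convergence axiom itself. For \(\delta =0.5\) the partial sums settle near \(1/(1-\delta )\), whereas for \(\delta =1\) they grow with the horizon.

```python
def discounted_value(utilities, delta):
    """Sum of delta**(t-1) * u(x_t) over a finite prefix (t starts at 1)."""
    return sum(delta ** k * u for k, u in enumerate(utilities))


def constant_stream_partial_sums(u_a, delta, horizons):
    """Partial sums of U(a, a, ...) truncated at each horizon."""
    return [discounted_value([u_a] * n, delta) for n in horizons]


horizons = [10, 100, 1000]
convergent = constant_stream_partial_sums(1.0, 0.5, horizons)  # approaches 1 / (1 - 0.5)
divergent = constant_stream_partial_sums(1.0, 1.0, horizons)   # grows with the horizon
```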

5.2 Semi-hyperbolic discounting

The extension of semi-hyperbolic discounting to the case where \(n=\infty \) is easily obtained.

  • Axiom I2\(^{\prime \prime }\) (Essentiality of periods \(1,\ldots , T\)) For some \(a, b \in X\) we have \([a]_t\succ x_0 \succ [b]_t\) for every \(t=1,\ldots , T\).

The generalization requires relaxing the axiom of stationarity to stationarity from period T.

  • Axiom I7\(^{\prime }\) (Stationarity from period T) The preference

    $$\begin{aligned} (x_1, \ldots , x_{T-1}, a, x_{T+1},\ldots )\succcurlyeq (x_1, \ldots , x_{T-1}, a, y_{T+1},\ldots ) \end{aligned}$$

    holds if and only if

    $$\begin{aligned} (x_1, \ldots , x_{T-1}, x_{T+1},\ldots )\succcurlyeq (x_1, \ldots , x_{T-1}, y_{T+1},\ldots ) \end{aligned}$$

    for every \(a \in X\), and every \(\mathbf{x}, \mathbf{y} \in X^{\infty }\).

As in the finite case, the addition of the early bias axiom is needed. Consider two consumption streams that differ only in their values at periods \(\{t, t+1\}\), where \(t\le T\), and between which the decision-maker is indifferent. Early bias between \(\{t, t+1\}\) means that if the first stream has more consumption at period t but less consumption at period \(t+1\) than the second stream, then dropping the consumption at \(t-1\) from both streams and advancing consumption from period t onwards by one period results in the first consumption stream being weakly preferred to the second.

  • Axiom I8 (Early bias) For every \(a, b, c, d \in X\) such that \(a\succ c, b\prec d\), and for all \(\mathbf{x} \in X^{\infty }\) and every \(t \le T\)

    $$\begin{aligned}&\ \text {if} \ (x_1, \ldots , x_{t-1}, a, b, x_{t+2}, \ldots )\sim (x_1, \ldots , x_{t-1}, c, d, x_{t+2}, \ldots ), \ \text {then}\\&(x_1, \ldots , x_{t-2}, a, b, x_{t+2}, \ldots ) \succcurlyeq (x_1, \ldots , x_{t-2}, c, d, x_{t+2}, \ldots ). \end{aligned}$$
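To see what the axiom demands of a semi-hyperbolic discount function, the check below may help. It is a sketch under stated assumptions: preferences are taken to have an SH(T) representation already, utility differences are normalised so that \(u(a)-u(c)=1\), and the names `sh_weights` and `early_bias_gap` are hypothetical. Entries common to both streams cancel, so only the weights of the two affected periods matter.

```python
def sh_weights(betas, delta, horizon):
    """Weights D(1..horizon) of an SH(T) discount function, T = len(betas) + 1."""
    T = len(betas) + 1
    weights = [1.0]
    for t in range(1, horizon):
        weights.append(weights[-1] * (betas[t - 1] if t < T else 1.0) * delta)
    return weights


def early_bias_gap(weights, t):
    """Utility gap between the shifted streams; a value >= 0 means early bias holds at t."""
    if t < 2 or t >= len(weights):
        raise ValueError("need 2 <= t and a weight for period t + 1")
    d_prev, d_t, d_next = weights[t - 2], weights[t - 1], weights[t]
    gain_early = 1.0            # u(a) - u(c) > 0
    loss_late = d_t / d_next    # u(d) - u(b) > 0, chosen so that the indifference holds
    # After dropping x_{t-1}, a and c sit in period t-1 while b and d sit in period t.
    return d_prev * gain_early - d_t * loss_late


increasing = sh_weights([0.6, 0.8], 0.9, horizon=5)  # beta_1 <= beta_2
decreasing = sh_weights([0.8, 0.6], 0.9, horizon=5)  # beta_1 > beta_2
gaps_increasing = [early_bias_gap(increasing, t) for t in (2, 3)]
gaps_decreasing = [early_bias_gap(decreasing, t) for t in (2, 3)]
```

In this illustration the gap is non-negative at both periods when the \(\beta _t\) are non-decreasing, and it turns negative at \(t=2\) when \(\beta _1>\beta _2\).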

Theorem 6

(Semi-hyperbolic discounting) The preferences \(\succcurlyeq \) on \(X^{\infty }\) satisfy axioms I1, I2\(^{\prime \prime }\), I3–I6, I7\(^{\prime }\), I8 if and only if there exists an SH(T) discounted expected utility representation \((u, \varvec{\beta }, \delta )\) for \(\succcurlyeq \) on \(X^{\infty }\). Moreover, \((u^{\prime }, {\varvec{\beta }}', \delta ^{\prime })\) is another SH(T) discounted expected utility representation for \(\succcurlyeq \) on \(X^{\infty }\) if and only if there are some \(A>0\) and some B such that \(u=Au^{\prime }+B\) and \(\delta =\delta ^\prime \), \(\varvec{\beta }={\varvec{\beta }}^\prime \).

Proof

The necessity of the axioms is obviously implied by the representation. The proof of sufficiency is analogous to the finite case. Applying Theorem 2 and stationarity from period T, we get the representation:

$$\begin{aligned} U(\mathbf{x}) = w_1 u(x_1)+\cdots + w_{T-1} u(x_{T-1}) + w_T \sum _{t=T}^{\infty } \delta ^{t-T} u(x_t). \end{aligned}$$

Next, dividing by \(w_1>0\) and introducing the notation \(\frac{w_{t}}{w_{t-1}}=\gamma _{t-1}>0\), where \(t\le T\), the representation becomes

$$\begin{aligned} {\hat{U}}(\mathbf{x}) = u(x_1)+\gamma _1 u(x_2)+\cdots + \gamma _1\cdots \gamma _{T-2} u(x_{T-1}) + \gamma _1\cdots \gamma _{T-1} \sum _{t = T}^{\infty } \delta ^{t-T} u(x_t). \end{aligned}$$

Using essentiality of each period and the early bias axiom repeatedly, we demonstrate that \(\gamma _{t-1}=\beta _{t-1}\delta \) with \(\beta _{t-1}\in (0,1]\) for all \(t\le T\) and \(\beta _1 \le \beta _2 \le \cdots \le \beta _{T-1}\). Therefore,

$$\begin{aligned} {\hat{U}}(\mathbf{x}) =&\, u(x_1)+\beta _1 \delta u(x_2)+\beta _1 \beta _2 \delta ^2 u(x_3)+\cdots +\beta _1 \beta _2\cdots \beta _{T-2} \delta ^{T-2}u(x_{T-1}) \\&+\beta _1 \beta _2\cdots \beta _{T-1}\sum _{t=T}^{\infty } \delta ^{t-1}u(x_t). \end{aligned}$$

Finally, to show that \(\delta \in (0, 1)\), take a constant stream \(\mathbf{a}=(a, a, \ldots )\), such that \(u(a)\ne 0\). Then,

$$\begin{aligned} \hat{U}(\mathbf{a}) =&\, u(a)+\beta _1 \delta u(a)+\cdots +\beta _1\cdots \beta _{T-2}\delta ^{T-2} u(a) + \beta _1\cdots \beta _{T-1} \sum _{t = T}^{\infty } \delta ^{t-1} u(a)\\ =&\, u(a) \left( 1+\beta _1 \delta +\cdots +\beta _1\cdots \beta _{T-2}\delta ^{T-2} + \beta _1\cdots \beta _{T-1} \sum _{t = T}^{\infty } \delta ^{t-1} \right) . \end{aligned}$$

Convergence requires \(\delta <1\).

The proof of the uniqueness claims is analogous to Theorem 4. \(\square \)
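As a worked step that is not part of the original proof, the tail of the series for the constant stream is geometric, so for \(0<\delta <1\) it admits a closed form:

$$\begin{aligned} \sum _{t=T}^{\infty } \delta ^{t-1}&= \frac{\delta ^{T-1}}{1-\delta }, \\ {\hat{U}}(\mathbf{a})&= u(a) \left( 1+\beta _1 \delta +\cdots +\beta _1\cdots \beta _{T-2}\delta ^{T-2} + \beta _1\cdots \beta _{T-1} \frac{\delta ^{T-1}}{1-\delta } \right) . \end{aligned}$$

For \(\delta \ge 1\) the terms \(\delta ^{t-1}\) do not tend to zero, so the series diverges whenever \(u(a)\ne 0\).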

6 Discussion

A number of axiomatizations of exponential and quasi-hyperbolic discounting have been suggested by different authors. In fact, all the axiomatizations use different assumptions and there is no straightforward transformation from one type of discounting to another. In this paper, we provided an alternative approach to get a time separable discounted utility representation, showing that Anscombe and Aumann’s result can be exploited as a common background for axiomatizing exponential and quasi-hyperbolic discounting in both finite and infinite time horizons. In addition, we demonstrated that the axiomatization of quasi-hyperbolic discounting can be easily extended to SH(T).

A key distinguishing feature of our setup is the mixture set structure for X and the use of the mixture independence condition. An essential question, however, is whether mixture independence is normatively compelling in a time preference context, because states are mutually exclusive whereas time periods are not. It is worth mentioning that the temporal interpretation of the AA framework was also used by Wakai (2009) to axiomatize an entirely different class of preferences, which exhibit a desire to spread bad and good outcomes evenly over time.

Commonly, the condition of joint independence is used to establish additive separability in time preference models. Given \(A \subseteq T\), where \(T=\{1, \ldots , n\}\), and \(\mathbf{x}, \mathbf{y} \in X^n\), define \(\mathbf{x}_{A}{} \mathbf{y}\) as follows: \((x_{A}y)_t\) is \(x_t\) if \(t\in A\) and \(y_t\) otherwise. The preference order \(\succcurlyeq \) satisfies joint independence if for every \(A\subseteq T\) and for every \(\mathbf{x}, \mathbf{x}', \mathbf{y}, \mathbf{y}' \in X^n\):

$$\begin{aligned} \mathbf{x}_A\mathbf{y} \succcurlyeq \mathbf{x}'_A\mathbf{y} \text { if and only if } \mathbf{x}_A\mathbf{y}' \succcurlyeq \mathbf{x}'_A\mathbf{y}'. \end{aligned}$$

Debreu (1960) uses joint independence to obtain an additively separable representation, so we will sometimes refer to it as a Debreu-type independence condition. It is known that mixture independence implies joint independence (Grant and Zandt 2009), but whether joint independence (together with other plausible conditions) implies mixture independence is yet to be determined.
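The definition of joint independence can be checked by brute force on a small example. The sketch below is illustrative only: the outcome set \(\{0, 1\}\), the horizon \(n=3\), and the two utility functions are assumptions made for the example, and the function names are hypothetical. An additively separable utility passes the check, while a utility in which periods 1 and 2 interact fails it.

```python
from fractions import Fraction
from itertools import product


def splice(x, y, A):
    """Return x_A y: x_t for periods t in A and y_t otherwise (periods are 1-indexed)."""
    return tuple(xt if t in A else yt for t, (xt, yt) in enumerate(zip(x, y), start=1))


def satisfies_joint_independence(U, outcomes, n):
    """Brute-force check of joint independence for the order represented by U."""
    streams = list(product(outcomes, repeat=n))
    subsets = [{t for t in range(1, n + 1) if mask[t - 1]}
               for mask in product((0, 1), repeat=n)]
    for A in subsets:
        for x, x_alt, y, y_alt in product(streams, repeat=4):
            with_y = U(splice(x, y, A)) >= U(splice(x_alt, y, A))
            with_y_alt = U(splice(x, y_alt, A)) >= U(splice(x_alt, y_alt, A))
            if with_y != with_y_alt:
                return False
    return True


def additive(stream):
    # Exponential discounting with delta = 1/2 and u(x) = x, in exact arithmetic.
    return sum(Fraction(1, 2 ** k) * x for k, x in enumerate(stream))


def non_additive(stream):
    # Periods 1 and 2 interact, so separability across periods fails.
    return stream[0] * stream[1] + stream[2]


additive_ok = satisfies_joint_independence(additive, (0, 1), 3)
non_additive_ok = satisfies_joint_independence(non_additive, (0, 1), 3)
```

Exact rational arithmetic is used for the additive case so that ties are not disturbed by floating-point rounding.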

In fact, we are not the first to use a mixture-type independence condition in the context of time preferences. Wakai (2009) also does so, though he uses the weaker form of constant independence introduced by Gilboa and Schmeidler (1989).

A version of the mixture independence condition can also be formulated in a Savage environment (Savage 1954) without objective probabilities, as discussed in Gul (1992). Olea and Strzalecki (2014) use precisely this version of mixture independence in one of their axiomatizations of quasi-hyperbolic discounting. For every \(x, y \in X\) let us write \((x, y)\) for \((x, y, y, \ldots )\in X^\infty \). Let \(m(x_1,y_1)\) denote some \(c \in X\) satisfying \((x_1, y_1) \sim (c, c)\). For any streams \((x_1, x_2)\) and \((z_1, z_2)\) the consumption stream \((m(x_1,z_1), m(x_2,z_2))\) is called a subjective mixture of \((x_1, x_2)\) and \((z_1, z_2)\). Olea and Strzalecki’s version of the mixture independence axiom (their Axiom 12) is as follows: for every \(x_1, x_2, y_1, y_2, z_1, z_2 \in X\) if \((x_1, x_2) \succcurlyeq (y_1, y_2)\), then

$$\begin{aligned} (m(x_1,z_1), m(x_2,z_2)) \succcurlyeq (m(y_1,z_1), m(y_2,z_2)) \end{aligned}$$

and

$$\begin{aligned} (m(z_1,x_1), m(z_2,x_2)) \succcurlyeq (m(z_1,y_1), m(z_2,y_2)). \end{aligned}$$

In other words, if a consumption stream \((x_1, x_2)\) is preferred to a stream \((y_1, y_2)\), then subjectively mixing each stream with \((z_1, z_2)\) does not affect the preference.
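To see what the subjective mixture does in utility terms, suppose (as an illustrative assumption, not a step in Olea and Strzalecki’s argument) that \(\succcurlyeq \) already has a quasi-hyperbolic representation with utility u and parameters \(\beta \in (0,1]\), \(\delta \in (0,1)\), and that a suitable c exists. Writing \(\kappa =\beta \delta /(1-\delta )\), a shorthand introduced here, the indifference \((x, y)\sim (c, c)\) gives

$$\begin{aligned} u(x)+\kappa u(y)&= (1+\kappa ) u(c), \\ u(m(x,y))&= \lambda u(x)+(1-\lambda ) u(y), \qquad \lambda =\frac{1}{1+\kappa }=\frac{1-\delta }{1-\delta +\beta \delta }. \end{aligned}$$

Under this assumption, m behaves like an objective mixture that places a fixed weight \(\lambda \) on its first argument, which suggests why the axiom can play the role of mixture independence.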

In their axiomatization of quasi-hyperbolic discounting Olea and Strzalecki invoke their mixture independence condition (Axiom 12) as well as Debreu-type independence conditions. The latter are used to obtain a representation in the form

$$\begin{aligned} \mathbf{x}\succcurlyeq \mathbf{y} \text { if and only if } u(x_1)+\sum _{t=2}^{\infty } \delta ^{t-1}v(x_t) \ge u(y_1)+\sum _{t=2}^{\infty } \delta ^{t-1}v(y_t), \end{aligned}$$

then their Axiom 12 is used to ensure \(v=\beta u\).Footnote 4

Hayashi (2003) and Epstein (1983) considered preferences over lotteries over consumption streams. In their framework \(X^\infty \) is the set of non-stochastic consumption streams, where X is required to be a compact connected separable metric space. Denote the set of probability measures on the Borel \(\sigma \)-algebra defined on \(X^\infty \) as \(\Delta (X^\infty )\). It is useful to note that our setting is the restriction of the Hayashi and Epstein setup to product measures, i.e., to \(\Delta (X)^\infty \subset \Delta (X^\infty )\). The axiomatization systems by Hayashi and Epstein are based on the assumptions of expected utility theory. The existence of a continuous and bounded vNM utility index \(U :\Delta (X^\infty ) \rightarrow \mathbb {R}\) is stated as one of the axioms. A set of necessary and sufficient conditions for this is provided by Grandmont (1972), and includes the usual vNM independence condition on \(\Delta (X^\infty )\): for every \( \mathbf{x}, \mathbf{y}, \mathbf{z} \in \Delta (X^\infty )\) and any \(\alpha \in [0, 1]\), \(\mathbf{x} \sim \mathbf{y}\) implies \(\alpha \mathbf{x} + (1-\alpha ) \mathbf{z} \sim \alpha \mathbf{y} + (1-\alpha ) \mathbf{z}\).

Obviously, this independence condition is not strong enough to deliver joint independence of time periods, which is why additional assumptions of separability are needed. Two further Debreu-type independence conditions are required for exponential discounting:

  • independence of stochastic outcomes in periods \(\{1, 2\}\) from deterministic outcomes in \(\{3, 4, \ldots \}\),

  • independence of stochastic outcomes in periods \(\{2, 3, \ldots \}\) from deterministic outcomes in period \(\{1\}\).

To obtain quasi-hyperbolic discounting two additional Debreu-type independence conditions should be satisfied:

  • independence of stochastic outcomes in periods \(\{2, 3\}\) from deterministic outcomes in periods \(\{1\}\) and \(\{4, \ldots \}\),

  • independence of stochastic outcomes in periods \(\{3, 4, \ldots \}\) from deterministic outcomes in periods \(\{1, 2\}\).

It is easy to see that these axioms applied to the non-stochastic consumption streams are analogous to the Debreu-type independence conditions used in Bleichrodt et al. (2008) and Olea and Strzalecki (2014).

In summary, to obtain a discounted utility representation with the discount function in either exponential or quasi-hyperbolic form, separability must be assumed. The mixture independence axiom appears to be a strong assumption; however, it delivers the desired separability without the need for additional Debreu-type independence conditions.