1 Introduction

The problem of intertemporal choice arises when outcomes are received in different moments of time. Samuelson (1937) proposed to evaluate intertemporal prospects with discounted utility that is also known as constant or exponential discounting. However, already Samuelson (1937, p. 159) acknowledged that “… it is completely arbitrary to assume that the individual behaves so as to maximize [discounted utility]. This involves the assumption that at every instant of time the individual’s satisfaction depends only upon the consumption at that time, and that, furthermore, the individual tries to maximize the sum of instantaneous satisfactions reduced to some comparable base by time discount. As has been suggested, we might assume that the individual maximizes an integral which contains not only consumption per unit of time but also the rate of change of consumption per unit of time…”. This paper goes essentially in this direction by considering a decision maker who derives utility not only from current consumption but also from “residual” consumption in the previous moment of time.

“Complete arbitrariness” of Samuelson (1937) model became more acceptable in neoclassical microeconomic theory after Koopmans (1960) provided the preference foundation (axiomatization) of discounted utility. Yet, Koopmans (1960, p. 292) also was rather skeptical about the descriptive realism of the model: “… we are willing to postulate that the particular bundle of commodities to be consumed in the first period has no effect on the preference between alternative sequences of bundles in the remaining future, and conversely. One cannot claim a high degree of realism for such a postulate, because there is no clear reason why complementarity of goods could not extend over more than one time period.”

Baucells and Sarin (2007, p. 170) nicely summarize this descriptive problem: “Simply stated, consumption independence requires that the utility of current consumption does not depend on past consumption. It is easy to see that the utility of current consumption (spicy food today) may depend on past consumption (spicy food yesterday), especially when the time interval between periods is small. For some consumption goods, such as a vacation or a particular movie, consumption independence may not hold even when time periods are separated by as much as a year.” Baucells and Sarin (2007) proposed a generalization of discounted utility theory where a decision maker derives utility not only from current consumption but also from the “satiation level” of consumption, which is the cumulative discounted consumption in all previous periods. Our model differs from the model of Baucells and Sarin (2007) in two ways. First, in our model, only the consumption in the previous moment of time contributes to utility evaluation of consumption in the current moment whereas in Baucells and Sarin (2007) the consumption in all past moments of time contributes to utility evaluation of consumption in the current moment. Second, in our model utilities are discounted with weights that are essentially a quasi-hyperbolic discounting function whereas Baucells and Sarin (2007) use constant (exponential) discounting as in discounted utility.

Koopmans (1960, pp. 293–294) showed that constant (exponential) discounting in discounted utility is essentially due to the assumption of stationarity: “We … require that the preference ordering be the same as the ordering of corresponding programs obtained by advancing the timing of each future consumption vector by one period (and, of course, forgetting about the common first-period vector originally stipulated). This expresses the idea that the passage of time does not have an effect on preferences.” Thaler (1981, p. 202) was one of the first to show that the passage of time does, in fact, affect time preferences, which is known as dynamic inconsistency. Thaler (1981, p. 202) argued that a decision maker may prefer to consume one apple today over two apples tomorrow and have a reversed preference when both consumptions are delayed for 1 year. This particular descriptive limitation of discounted utility became known as the common difference effect (Loewenstein and Prelec 1992, section II.1, p. 574). Quasi-hyperbolic discounting (Elster 1979; Laibson 1997; Phelps and Pollak 1968) generalizes constant discounting by allowing the discount factor between the present and the following moment of time to be different from the constant discount factor thereafter. Inter alia, this accounts for the common difference effect.
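The apple example can be sketched numerically. The snippet below is only an illustration: the linear utility, the daily time unit, and the parameter values are assumptions rather than figures from Thaler (1981), and the function names are hypothetical. In line with the notation used later in this paper, δ stands for the present bias and β for the per-period discount factor.

```python
def qh_weight(t, delta, beta):
    """Quasi-hyperbolic discount weight: 1 at t = 0 and delta * beta**t for t >= 1."""
    return 1.0 if t == 0 else delta * beta ** t


def prefers_sooner(t, delta, beta, u=lambda apples: apples):
    """True if one apple on day t is preferred to two apples on day t + 1."""
    sooner = qh_weight(t, delta, beta) * u(1)
    later = qh_weight(t + 1, delta, beta) * u(2)
    return sooner > later


# Assumed parameters: strong present bias, mild daily discounting.
delta, beta = 0.4, 0.999
today = prefers_sooner(0, delta, beta)        # one apple today vs two tomorrow
in_a_year = prefers_sooner(365, delta, beta)  # the same choice delayed by one year
```

With these assumed values the sooner apple is chosen today but not when both dates are pushed back by a year, whereas setting δ = 1 (constant discounting) yields the same choice at both horizons.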

This paper shows that quasi-hyperbolic discounting emerges in a model of intertemporal choice with one period lag. If consumption in the previous moment of time contributes to utility evaluation of consumption in the current moment of time, we are forced to weaken Koopmans’ stationarity principle: if two consumption streams have the same consumption in the first and the second period then a decision maker’s preference between these streams does not change when all consumption is advanced by one period. Somewhat surprisingly, this weaker version of Koopmans’ stationarity results in essentially a quasi-hyperbolic discounting function.

Duesenberry (1952) pioneered the idea of current consumption being affected by “habits” of past consumption. The models of habit formation (e.g., Pollak, 1970; Ryder and Heal, 1973) assume that habits are captured by the sum of (constantly) discounted consumption in all past moments of time. Becker and Murphy (1988) study the consumption of addictive products in the framework of habit formation. Loewenstein (1987) proposes a reversed model where current utility depends not only on the current consumption but also on the “anticipatory” future consumption (constantly discounted sum of future consumption).

Wathieu (1997, 2004) proposed another model where a decision maker derives utility not only from current consumption but also from consumption in the past moments of time. Wathieu (1997, 2004) postulates that a decision maker derives utility from increment of the current utility over a reference point, which is a weighted sum of all past consumptions and the initial (given) reference point. Wathieu (1997, 2004) aggregates these utilities with constant (exponential) discounting.

Blavatskyy (2016) proposed another model where past consumption matters for utility evaluation of the current consumption. In the rank-dependent discounted utility model, “… a decision maker behaves as if maximizing the sum of discounted incremental utilities of future payoffs. A decision maker aggregates the stock of payoffs and subsequent future payoffs are evaluated by their contribution to the overall utility of this stock” (Blavatskyy 2016, p. 788). This model does not generalize Samuelson (1937) discounted utility: the two models overlap only if the utility function is linear. Blavatskyy (2016, p. 790) allows for non-constant discounting.

The remainder of the paper is organized as follows. Section 2 presents our proposed model. Section 3 details behavioral characterization (axiomatization) of this model. Section 4 applies the model to the problem of intertemporal consumption/savings. Section 5 concludes.

2 The model

Notation \({\varvec{x}} \overset{\text{def}}{=} \left( {x_{0} , x_{1} , \ldots , x_{T} } \right) \in {\mathbb{R}}^{T + 1}\) denotes a stream of intertemporal payoffs \(x_{t} \in {\mathbb{R}}\) that a decision maker receives in moments of time t ∊ {0, 1, …, T}, for some T ≥ 1. These payoffs are numbered in chronological order, i.e., t = 0 denotes the present moment of time and t < s means that the moment of time t ∊ {0, 1, …, T − 1} precedes the moment of time s ∊ {1, …, T}. We consider objective time but the model could be adapted to a nonlinear subjective time perception (cf. Bradford et al. 2019).

A decision maker has a preference relation ≽ on the set of streams of intertemporal payoffs. As usual, the symmetric part of ≽ is denoted by ∼ and the asymmetric part of ≽ is denoted by ≻. The preference relation ≽ is represented by utility function U(.) if x ≽ y implies U(x) ≥ U(y) and vice versa for all streams x and y.

We consider preferences represented by the following utility function (1):

$$ U\left( x \right)\, = \,u\left( {\left( {1 - \alpha } \right)x_{0} } \right)\, + \,\mathop \sum \limits_{t = 1}^{T - 1} \frac{{\beta_{2}^{t - 1} }}{{\beta_{1}^{t - 2} }}u\left( {\left( {1 - \alpha } \right)x_{t} \, + \,\alpha x_{t - 1} } \right)\, + \,\beta_{T} u\left( {x_{T} \, + \,\alpha x_{T - 1} } \right) $$
(1)

In utility function (1), parameters β1, β2, and βT ∊ [0,1] denote (subjective) discount factors correspondingly in the first, second and last moment of time; parameter α ∊ [0,½] denotes the fraction of the consumption in the previous moment of time that contributes to utility evaluation of the current consumption; and u(.) denotes a continuous real-valued utility function that is generically unique up to a positive affine transformation. For simplicity, we assume zero consumption before the present moment of time (t = 0) and we assume that consumption in the last period fully contributes to the utility in the last period.

Samuelson (1937) discounted utility is a special case of utility function (1) when \(\beta_{2} = \beta_{1}^{2}\), \(\beta_{T} = \beta_{1}^{T}\), and α = 0. Quasi-hyperbolic discounting is a special case of utility function (1) when \(\beta_{1} = \delta \beta\), \(\beta_{2} = \delta \beta^{2}\), \(\beta_{T} = \delta \beta^{T}\), and α = 0.
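The two special cases can be checked numerically. The following is a minimal sketch, assuming β1 > 0; the function name `discount_weights` and all numerical values are hypothetical and serve only to show that the weights in utility function (1) collapse to \(\beta_{1}^{t}\) and to \(\delta \beta^{t}\), respectively.

```python
import math


def discount_weights(T, beta1, beta2, betaT):
    """Weights attached to u(.) in utility function (1) for t = 0, 1, ..., T.

    t = 0 is undiscounted, t = 1, ..., T - 1 uses beta2**(t-1) / beta1**(t-2),
    and the last moment t = T uses betaT. Requires beta1 > 0.
    """
    middle = [beta2 ** (t - 1) / beta1 ** (t - 2) for t in range(1, T)]
    return [1.0] + middle + [betaT]


T = 5
b = 0.9  # assumed constant discount factor
exp_w = discount_weights(T, b, b ** 2, b ** T)

delta, beta = 0.8, 0.95  # assumed present bias and long-run discount factor
qh_w = discount_weights(T, delta * beta, delta * beta ** 2, delta * beta ** T)
```

Here `exp_w` reproduces the exponential weights 1, b, b², … and `qh_w` reproduces 1, δβ, δβ², …, up to floating-point rounding.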

The intuition behind model (1) is that α ∊ [0, ½] fraction of the consumption in the previous moment of time carries on as the “residual” consumption to the subsequent moment of time. One could reasonably argue that fraction α is relatively small (large) when moments of time are temporally far away from each other (close to each other).

The main motivation for model (1) is to explain well-documented violations of independence when a decision maker exhibits intertemporal substitution, complementarity or wealth effects. For example, Loewenstein (1987) and Loewenstein and Prelec (1993, example 3, p. 95) found that subjects prefer to go to a fancy lobster dinner two weekends after a fancy dinner at a French restaurant rather than the next weekend after the weekend when they had a fancy French dinner. Indeed, if a fancy French dinner yields “residual savoring” even one weekend after it happened, under the conventional assumption of diminishing marginal utility, adding a fancy lobster dinner to this lagged satisfaction from the previous weekend may not be as rewarding as postponing the lobster dinner one more weekend away (when there is no more “residual savoring”).

Urminsky and Kivetz (2011) report a violation of independence, which they label the “mere token” effect. Urminsky and Kivetz (2011) found that subjects reveal a preference for stream A that yields x in 1 day plus $900 in 1 year over stream B that yields x in 1 day plus $300 in 1 week as x increases from zero to $10. If t = 0 denotes tomorrow, t = 1 denotes a moment of time in 1 week and T denotes a moment of time in 1 year, then utility (1) of the first stream is given by \(U\left( A \right) = u\left( {\left( {1 - \alpha } \right)x} \right) + \beta_{1} u\left( {\alpha x} \right) + \beta_{T} u\left( {900} \right)\) and utility of the second stream is given by \(U\left( B \right) = u\left( {\left( {1 - \alpha } \right)x} \right) + \beta_{1} u\left( {300\left( {1 - \alpha } \right) + \alpha x} \right) + \beta_{2} u\left( {300\alpha } \right)\). Normalize u(0) = 0, which is without loss of generality since u(.) is unique up to a positive affine transformation. If α = 0 then increasing x from zero to $10 increases the utility of both streams by the same amount \(u\left( x \right)\). Yet, if α ≠ 0, utility of the first stream increases by \(u\left( {\left( {1 - \alpha } \right)x} \right) + \beta_{1} u\left( {\alpha x} \right)\) whereas utility of the second stream increases only by \(u\left( {\left( {1 - \alpha } \right)x} \right) + \beta_{1} u\left( {300\left( {1 - \alpha } \right) + \alpha x} \right) - \beta_{1} u\left( {300\left( {1 - \alpha } \right)} \right)\). Thus, increasing x from zero to $10 increases the utility of A more than the utility of B if \(u\left( {\alpha x} \right) + u\left( {300\left( {1 - \alpha } \right)} \right) > u\left( {300\left( {1 - \alpha } \right) + \alpha x} \right)\). This inequality is always satisfied for a strictly concave utility function u(.).
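A small numerical sketch of this argument follows. The square-root utility (concave, with u(0) = 0) and the values of α, β1, β2, and βT are assumptions chosen for illustration only; they are not estimates from Urminsky and Kivetz (2011).

```python
import math


def u(c):
    """Assumed concave utility with u(0) = 0 (square root)."""
    return math.sqrt(c)


def utility_A(x, alpha, beta1, betaT):
    # Stream A: x tomorrow (t = 0) and $900 in one year (t = T).
    return u((1 - alpha) * x) + beta1 * u(alpha * x) + betaT * u(900)


def utility_B(x, alpha, beta1, beta2):
    # Stream B: x tomorrow (t = 0) and $300 in one week (t = 1).
    return (u((1 - alpha) * x)
            + beta1 * u(300 * (1 - alpha) + alpha * x)
            + beta2 * u(300 * alpha))


alpha, beta1, beta2, betaT = 0.1, 0.95, 0.90, 0.50  # assumed values

# Utility gain of each stream when the token x rises from zero to $10.
gain_A = utility_A(10, alpha, beta1, betaT) - utility_A(0, alpha, beta1, betaT)
gain_B = utility_B(10, alpha, beta1, beta2) - utility_B(0, alpha, beta1, beta2)

# The same gains without any residual consumption (alpha = 0).
gain_A0 = utility_A(10, 0.0, beta1, betaT) - utility_A(0, 0.0, beta1, betaT)
gain_B0 = utility_B(10, 0.0, beta1, beta2) - utility_B(0, 0.0, beta1, beta2)
```

Under these assumptions the token raises the utility of A by more than the utility of B when α > 0, while the two gains coincide when α = 0.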

Read and Scholten (2012, experiment 1) report another violation of independence in intertemporal choice, sometimes known as the common consequence effect (e.g., Scholten et al. 2016, p. 1199). Their subjects prefer to receive $300 today rather than $400 in 50 weeks but this preference diminishes when both options offer an extra $350 to be received in 4 weeks. Similarly to the previous example, if $300 today yields some residual satisfaction even after 4 weeks and the Bernoulli utility function over money is concave then receiving $350 in 4 weeks is less rewarding (in terms of utility) in the first option compared to the second option.

3 Behavioral characterization

Utility function (1) can be described as generalized additively separable utility. Axiomatic characterization of additively separable utility is relatively well known so we shall employ the results from the existing literature. First, as for any real-valued utility function, we need to assume that preferences over intertemporal streams are rational.

Axiom 1 (Completeness) For any two streams x and y either x ≽ y or y ≽ x (or both).

Axiom 2 (Transitivity) For any three streams x, y, and z if x ≽ y and y ≽ z then x ≽ z.

Next, we assume that preferences are continuous. We use the continuity axiom 3 because it is relatively well known. Alternatively, we can also derive additively separable utility representation by assuming two implications of continuity that are known as solvability and Archimedean axiom (cf. Wakker 1988; Köbberling and Wakker 2003, p. 398).

Axiom 3 (Continuity) For any stream x the sets {\( {\varvec{z}} \in {\mathbb{R}}^{T + 1}\): x ≽ z} and {\( {\varvec{z}} \in {\mathbb{R}}^{T + 1}\): z ≽ x} are closed with respect to the product topology on \({\mathbb{R}}^{T + 1}\).

One of the traditional preference characterizations of additively separable utility is to assume tradeoff consistency (Wakker 1984, 1989) that is also known as Reidemeister closure condition in geometry (Blaschke and Bol 1938). Blavatskyy (2013) provided a new preference characterization of additively separable utility that relies on a weaker axiom known as cardinal independence or standard sequence invariance (e.g., Krantz et al. 1971, Section 6.11.2). To introduce this axiom, we need the following notation. Let \(a_{t} {\varvec{x}}\) denote a stream that results from stream x when we replace outcome \(x_{t}\) with outcome a for some moment of time t. A moment of time t is null (or inessential) if \(a_{t} {\varvec{x}}\) ∼ \(b_{t} {\varvec{x}}\) for any two outcomes a and b and any stream x. Otherwise, a moment of time is nonnull (or essential). Finally, for any stream x and constant α ∊ [0, ½] let \({\varvec{x}}^{{\varvec{\alpha}}} \overset{\text{def}}{=} \left( {\left( {1 - \alpha } \right)x_{0} , \left( {1 - \alpha } \right)x_{1} + \alpha x_{0} , \ldots , x_{T} + \alpha x_{T - 1} } \right)\).

Axiom 4 (Cardinal Independence) There is α ∊ [0, ½] such that whenever \(a_{t} {\varvec{x}}^{{\varvec{\alpha}}}\) ≽ \(b_{t} {\varvec{y}}^{{\varvec{\alpha}}}\), \(a_{t} {\varvec{y}}^{{\varvec{\alpha}}}\) ≽ \(c_{t} {\varvec{x}}^{{\varvec{\alpha}}}\), and \(b_{s} {\varvec{z}}^{{\varvec{\alpha}}}\) ≽ \(a_{s} {\varvec{w}}^{{\varvec{\alpha}}}\) then \(a_{s} {\varvec{z}}^{{\varvec{\alpha}}}\) ≽ \(c_{s} {\varvec{w}}^{{\varvec{\alpha}}}\) for any three outcomes a, b, and c; any streams x, y, z, and w; any nonnull time period t and any time period s.

Intuitively, if replacing outcome a with outcome c in stream \(a_{t} {\varvec{x}}^{{\varvec{\alpha}}}\) and replacing outcome b with outcome a in stream \(b_{t} {\varvec{y}}^{{\varvec{\alpha}}}\) reverses the preference between these two streams, then the same replacement should not change the preference between streams \(b_{s} {\varvec{z}}^{{\varvec{\alpha}}}\) and \(a_{s} {\varvec{w}}^{{\varvec{\alpha}}}\) if the decision maker already prefers the former to the latter.

Proposition 1 A preference relation ≽ satisfies axioms 1–4 if and only if it admits representation (2), where βt ∊ [0,1] for all t ∊ {0, …, T}, and function u: ℝ → ℝ is continuous. Function u(.) is unique up to a positive affine transformation if at least two moments of time are nonnull.

$$ U\left( x \right)\, = \,\mathop \sum \limits_{t = 0}^{T} \beta_{t} u\left( {x_{t}^{\alpha } } \right) $$
(2)

The proof is presented in the appendix.
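Representation (2) is straightforward to evaluate once the transformed stream \({\varvec{x}}^{{\varvec{\alpha}}}\) is computed. The following sketch assumes T ≥ 1; the function names, the payoffs, the discount factors, and the square-root utility are hypothetical choices made for illustration.

```python
import math


def residual_transform(x, alpha):
    """Map a stream x to the transformed stream x^alpha defined before Axiom 4.

    The first entry is (1 - alpha) * x_0, interior entries are
    (1 - alpha) * x_t + alpha * x_{t-1}, and the last entry is
    x_T + alpha * x_{T-1}.
    """
    T = len(x) - 1
    out = [(1 - alpha) * x[0]]
    for t in range(1, T):
        out.append((1 - alpha) * x[t] + alpha * x[t - 1])
    out.append(x[T] + alpha * x[T - 1])
    return out


def utility_2(x, alpha, betas, u):
    """Representation (2): the sum over t of beta_t * u(x_t^alpha)."""
    return sum(b * u(c) for b, c in zip(betas, residual_transform(x, alpha)))


stream = [100.0, 50.0, 80.0]  # assumed payoffs, T = 2
betas = [1.0, 0.9, 0.8]       # assumed discount factors beta_0, beta_1, beta_2
transformed = residual_transform(stream, 0.25)
value = utility_2(stream, 0.25, betas, math.sqrt)
```

With α = 0 the transformation leaves the stream unchanged, so (2) reduces to an additively separable utility with general discount factors.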

In models of intertemporal choice, it is conventional to assume that utility of any outcome in the present moment of time t = 0 is not discounted, i.e., we have constant β0 = 1. Koopmans (1960, postulate 2, p. 291) assumed that the present moment of time is nonnull. Under this assumption, constant β0 in utility representation (2) cannot be zero. Since utility is unique up to a positive affine transformation, we can divide all constants βt in (2) by β0 ≠ 0 to obtain a conventional utility \(U\left( {\varvec{x}} \right) = u\left( {x_{0}^{\alpha } } \right) + \sum\nolimits_{t = 1}^{T} {\beta^{\prime}_{t} u\left( {x_{t}^{\alpha } } \right)}\), where \(\beta^{\prime}_{t} = \beta_{t} /\beta_{0}\).

According to the stationarity axiom, if two streams have the same consumption in the current moment t = 0 then a decision maker’s preference between these streams does not change when all consumption is advanced by one moment of time and consumption in t = 0 is shifted to the last moment of time (e.g., Krantz et al., 1971, Definition 15, p. 304).

Axiom 5 (Stationarity) If \(\left( {x_{0} , x_{1} , \ldots , x_{T} } \right)\) ≽ \(\left( {x_{0} , y_{1} , \ldots , y_{T} } \right)\) then \(\left( {x_{1} , \ldots , x_{T} ,x_{0} } \right)\) ≽ \(\left( {y_{1} , \ldots , y_{T} ,x_{0} } \right)\).

In the context of utility function (1), stationarity axiom 5 is less appealing since consumption in the current moment has a “residual” effect on utility evaluation of consumption in the subsequent moment of time. However, one could reasonably argue that consumption in the current moment is irrelevant for preference between intertemporal streams that happen to have the same consumption in both periods t = 0 and t = 1. Hence, we propose a new axiom 6 that weakens classic stationarity axiom 5.

Axiom 6 (Restricted stationarity) If \(\left( {x_{0} , x_{1} , x_{2} , \ldots , x_{T - 1} ,x_{T} } \right)\) ≽ \(\left( {x_{0} , x_{1} ,y_{2} , \ldots , y_{T - 1} ,x_{T} } \right)\) then \(\left( {x_{1} , x_{2} , \ldots , x_{T - 1} ,x_{T} ,x_{0} } \right)\) ≽ \(\left( {x_{1} ,y_{2} , \ldots , y_{T - 1} ,x_{T} ,x_{0} } \right)\).

Proposition 2 A preference relation ≽ satisfies axioms 1–4 and 6 if and only if it admits representation (1) with uniqueness results the same as in Proposition 1.

The proof is presented in the Appendix.

4 Application: intertemporal consumption/savings

Let us consider the classic problem of intertemporal consumption/savings. A decision maker is endowed with income Y > 0 at the present moment t = 0 and decides how to consume/save this income across T + 1 moments of time. In all but the last moment of time, any saved income that is not consumed is transferred to the subsequent moment of time earning gross interest rate R > 1. Any income that is not consumed at the last moment of time t = T perishes. Thus, a decision maker solves the optimization problem \(\mathop {\max }\limits_{{x_{0} , \ldots ,x_{T} }} U\left( {x_{0} , \ldots ,x_{T} } \right)\) subject to the budget constraint \(x_{T} = YR^{T} - \mathop \sum \limits_{t = 0}^{T - 1} x_{t} R^{T - t}\).

In the classic case of discounted utility (i.e., when \(\beta_{2} = \beta_{1}^{2}\), \(\beta_{T} = \beta_{1}^{T}\), and α = 0), assuming a differentiable utility function, the first-order condition is \(u^{\prime}\left( {x_{t} } \right) = R\beta_{1} u^{\prime}\left( {x_{t + 1} } \right)\). For example, in case of constant relative risk aversion (CRRA) utility function \(u\left( {x_{t} } \right) = x_{t}^{1 - \theta } /\left( {1 - \theta } \right)\) when θ ≠ 1 and \(u\left( {x_{t} } \right) = \ln x_{t}\) when θ = 1, the closed form solution is given by

$$ \begin{array}{*{20}c} {x_{t} \, = \,Y\frac{{\left( {R\beta_{1} } \right)^{{\frac{t}{\theta }}} }}{{\mathop \sum \nolimits_{s = 0}^{T} R^{ - s} \left( {R\beta_{1} } \right)^{{\frac{s}{\theta }}} }},} & {t \in \left\{ {0,1, \ldots ,T} \right\}} \\ \end{array} $$

Figure 1 illustrates this closed form solution for several parameterizations.

Fig. 1 Optimal consumption path under discounted CRRA utility for several values of Rβ1 and coefficient of relative risk aversion θ
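The closed form for discounted CRRA utility can be evaluated and checked against the budget constraint and the first-order condition. This is a minimal sketch; the function name and the parameter values are hypothetical and do not reproduce the parameterizations of Fig. 1.

```python
def crra_path_exponential(Y, R, beta1, theta, T):
    """Closed-form optimal consumption x_0, ..., x_T under discounted CRRA utility."""
    growth = (R * beta1) ** (1.0 / theta)  # ratio x_{t+1} / x_t
    denom = sum(R ** (-s) * growth ** s for s in range(T + 1))
    return [Y * growth ** t / denom for t in range(T + 1)]


Y, R, beta1, theta, T = 100.0, 1.05, 0.97, 2.0, 10  # assumed values
path = crra_path_exponential(Y, R, beta1, theta, T)

# The present value of the path should exhaust the endowment Y.
present_value = sum(x * R ** (-t) for t, x in enumerate(path))

# Residuals of u'(x_t) = R * beta1 * u'(x_{t+1}) with u'(x) = x**(-theta).
euler_gaps = [path[t] ** (-theta) - R * beta1 * path[t + 1] ** (-theta)
              for t in range(T)]
```

Since Rβ1 > 1 for the assumed values, the resulting path is increasing over time.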

In case of quasi-hyperbolic discounting (i.e., when \(\beta_{1} = \delta \beta\), \(\beta_{2} = \delta \beta^{2}\), \(\beta_{T} = \delta \beta^{T}\), and α = 0), the first-order condition is \(u^{\prime}\left( {x_{t} } \right) = R\beta u^{\prime}\left( {x_{t + 1} } \right)\) for all \(t \in \left\{ {1, \ldots ,T - 1} \right\}\) and \(u^{\prime}\left( {x_{0} } \right) = R\delta \beta u^{\prime}\left( {x_{1} } \right)\). For example, in case of CRRA utility function the closed form solution is given by

$$ \begin{array}{*{20}c} {x_{t} \, = \,Y\frac{{\delta^{{\frac{1}{\theta }}} \left( {R\beta } \right)^{{\frac{t}{\theta }}} }}{{1 + \delta^{{\frac{1}{\theta }}} \mathop \sum \nolimits_{s = 1}^{T} R^{ - s} \left( {R\beta } \right)^{{\frac{s}{\theta }}} }},} & {t \in \left\{ {1, \ldots ,T} \right\}} \\ \end{array} $$
$$ x_{0} \, = \,\frac{Y}{{1\, + \,\delta^{{\frac{1}{\theta }}} \mathop \sum \nolimits_{s = 1}^{T} R^{ - s} \left( {R\beta } \right)^{{\frac{s}{\theta }}} }} $$

Figure 2 illustrates this closed form solution for several parameterizations. Essentially, under quasi-hyperbolic discounting the optimal consumption path resembles that of discounted utility (cf. Figure 1) except that consumption in the present moment of time is boosted when the present bias δ is less than one. Except for the present moment t = 0, optimal consumption path is constant/increasing/decreasing when parameter Rβ happens to be equal to one/greater than one/less than one. The lower the coefficient of relative risk aversion, the faster the optimal consumption increases or decreases.

Fig. 2 Optimal consumption path under quasi-hyperbolic discounting with CRRA utility for several values of Rβ and coefficient of relative risk aversion θ with δ = 0.9
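The boost in present consumption can be seen directly from the closed form. The sketch below uses δ = 0.9 as in Fig. 2, while the remaining values (Y, R, β, θ, T) and the function name are assumptions made for illustration.

```python
def crra_path_quasi_hyperbolic(Y, R, delta, beta, theta, T):
    """Closed-form optimal path under quasi-hyperbolic CRRA utility (alpha = 0)."""
    boost = delta ** (1.0 / theta)        # wedge between x_0 and later periods
    growth = (R * beta) ** (1.0 / theta)  # ratio x_{t+1} / x_t for t >= 1
    denom = 1.0 + boost * sum(R ** (-s) * growth ** s for s in range(1, T + 1))
    x0 = Y / denom
    return [x0] + [x0 * boost * growth ** t for t in range(1, T + 1)]


# Assumed case R * beta = 1: consumption is flat after the present moment.
R = 1.02
qh_path = crra_path_quasi_hyperbolic(100.0, R, 0.9, 1.0 / R, 0.5, 10)
qh_present_value = sum(x * R ** (-t) for t, x in enumerate(qh_path))

# With delta = 1 the formula should coincide with classic discounted utility.
no_bias_path = crra_path_quasi_hyperbolic(100.0, 1.05, 1.0, 0.97, 2.0, 10)
```

For the assumed values, consumption at t = 0 exceeds the constant level consumed in every later period.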

In the general case when consumption in the previous moment of time contributes to decision maker’s satisfaction in the subsequent moment of time (i.e., when α ≠ 0), the first-order conditions are given by

$$ \left( {1 - \alpha } \right)u^{\prime}\left( {\left( {1 - \alpha } \right)x_{0} } \right)\, + \,\alpha \beta_{1} u^{\prime}\left( {\left( {1 - \alpha } \right)x_{1} \, + \,\alpha x_{0} } \right)\, = \,\lambda R^{T} $$
$$ \begin{array}{*{20}c} {\left( {1 - \alpha } \right)\frac{{\beta_{2}^{t - 1} }}{{\beta_{1}^{t - 2} }}u^{\prime}\left( {\left( {1 - \alpha } \right)x_{t} \, + \,\alpha x_{t - 1} } \right)\, + \,\alpha \frac{{\beta_{2}^{t} }}{{\beta_{1}^{t - 1} }}u^{\prime}\left( {\left( {1 - \alpha } \right)x_{t + 1} \, + \,\alpha x_{t} } \right)\, = \,\lambda R^{T - t} ,} & {t \in \left\{ {1, \ldots ,T - 1} \right\}} \\ \end{array} $$

where \(\lambda = \beta_{T} u^{\prime}\left( {x_{T} + \alpha x_{T - 1} } \right)\) is the Lagrange multiplier.
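For CRRA utility these conditions can be solved without a numerical optimizer. The sketch below is one possible approach and rests on several assumptions: T ≥ 2, positive discount factors, and an interior solution. The conditions are obtained by differentiating utility function (1) directly, so the lagged term in the condition for t = T − 1 carries weight βT. The function name, the endowment Y = 100, and the horizon T = 20 are hypothetical; the remaining values of the first example follow the constant-discounting case described for Fig. 4.

```python
def solve_path(Y, R, alpha, beta1, beta2, betaT, theta, T):
    """Optimal x_0..x_T for utility (1) with CRRA u, via first-order conditions.

    Assumes T >= 2, 0 <= alpha < 1, positive discount factors, and parameters
    for which all weighted marginal utilities stay positive.
    """
    # Discount weights of utility function (1).
    w = [1.0] + [beta2 ** (t - 1) / beta1 ** (t - 2) for t in range(1, T)] + [betaT]
    # Step 1: fix lambda = 1 and recover m_t = w_t * u'(c_t) backwards, where
    # c_t is the argument of u(.) at moment t. m_T = lambda and, for t < T,
    # (1 - alpha) * m_t + alpha * m_{t+1} = lambda * R**(T - t).
    m = [0.0] * (T + 1)
    m[T] = 1.0
    for t in range(T - 1, -1, -1):
        m[t] = (R ** (T - t) - alpha * m[t + 1]) / (1 - alpha)
    if min(m) <= 0:
        raise ValueError("no interior solution for these parameters")
    # Step 2: invert the CRRA marginal utility u'(c) = c**(-theta).
    c = [(m[t] / w[t]) ** (-1.0 / theta) for t in range(T + 1)]
    # Step 3: unwind the one-period lag to obtain consumption itself.
    x = [c[0] / (1 - alpha)]
    for t in range(1, T):
        x.append((c[t] - alpha * x[t - 1]) / (1 - alpha))
    x.append(c[T] - alpha * x[T - 1])
    # Step 4: the CRRA conditions are homogeneous in x, so rescale the path
    # until its present value equals the endowment Y.
    scale = Y / sum(x[t] * R ** (-t) for t in range(T + 1))
    return [scale * v for v in x]


R, T = 1.02, 20          # T is an assumed horizon
b1 = 0.95 / R            # constant discounting: beta2 = b1**2, betaT = b1**T
cyclic = solve_path(100.0, R, 1.0 / 3.0, b1, b1 ** 2, b1 ** T, 0.5, T)

# Sanity-check input: alpha = 0 should reproduce classic discounted utility.
baseline = solve_path(100.0, 1.05, 0.0, 0.97, 0.97 ** 2, 0.97 ** 10, 2.0, 10)
```

With α = 1/3 the first few entries of `cyclic` alternate between drops and rises, whereas `baseline` matches the closed form for discounted CRRA utility.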

Figure 3 illustrates the solution for CRRA utility function with θ = 0.5 (square root) and share α = 0.1. When Rβ1 = 1 under constant discounting (solid black line on Fig. 3) or, more generally, when Rβ2/β1 = 1 under quasi-hyperbolic discounting (grey line on Fig. 3) optimal consumption remains on the same level except for the present and the last moment of time. In the present moment of time, consumption is higher due to our simplifying assumption that there is no “carryover” consumption from the past at the initial moment of time t = 0. In the last moment of time, consumption is lower due to our assumption that there is no “carryover” effect in the last moment. As before, optimal consumption path is asymptotically increasing (decreasing) over time when parameter Rβ2/β1 is greater (smaller) than one.

Fig. 3 Optimal consumption path with CRRA utility function (θ = 0.5) and share α = 0.1

Optimal consumption path is monotone under classic discounted utility (cf. Figure 1) as well as under quasi-hyperbolic discounting, except for the initial moment of time (cf. Figure 2). In contrast, under model (1) with positive share α optimal consumption path is cyclic. Figure 4 illustrates optimal consumption path for CRRA utility function with θ = 0.5 (square root) and a relatively large share α = 1/3, interest rate R = 1.02 and constant discounting with β1 = 0.95/R (solid curve) as well as quasi-hyperbolic discounting with the present bias δ = 0.9, β1 = 0.95δ/R (dashed curve). In the present moment of time, a decision maker significantly increases consumption since there is no “savoring” of any past consumption at t = 0. As before, this effect is amplified even more under quasi-hyperbolic discounting. Yet, as Fig. 4 illustrates, consumption in the next moment t = 1 drastically drops only to increase again in moment t = 2 and so on. Thus, the model with utility satisfaction from both instantaneous consumption as well as a share of consumption in the previous moment of time generates consumption cycles on the optimal consumption path. Arguably, this is a more realistic representation of actual consumer behavior compared to generically monotone paths.

Fig. 4 Optimal consumption path with CRRA utility function (θ = 0.5) and share α = 1/3

In the classic discounted utility or quasi-hyperbolic discounting model, optimal consumption path is geometric in t (under CRRA utility) since the decision maker attempts to smooth marginal utility of consumption across all time periods. In contrast, consumption changes at nearly linear rate (after the decay of initial consumption cycles) when consumption in the previous moment of time has a carryover effect on utility satisfaction from consumption in the subsequent moment of time. This carryover effect gives the decision maker a different channel of intertemporal consumption smoothing: he or she no longer needs to seek ever higher/lower consumption to equilibrate marginal utilities of consumption across time periods.

5 Conclusion

Discounted utility and many of its generalizations assume that intertemporal choice decisions are independent of any outcome that all available choice alternatives yield in the same moment of time (e.g., Bleichrodt et al. 2008, p. 342). This resembles the independence axiom of expected utility theory in choice under risk. The independence axiom itself was challenged by numerous behavioral paradoxes (e.g., Allais 1953) but the assumption of independence in intertemporal choice is even more problematic. While in choice under risk the independence is assumed between possible mutually exclusive outcomes, in intertemporal choice the independence is assumed between outcomes that are consumed one after another (e.g., Frederick et al. 2002, Section 3.3, p. 357). Clearly, one can reasonably assume substitution or complementarity effects in the latter case. Already the founding father of discounted utility Samuelson (1952, p. 674) acknowledged that wine consumption yesterday may directly influence the consumption of wine and milk today. As another example, a decision maker who chose to go on vacation to France rather than Hawaii last year may have exactly the opposite preference this year to avoid repetition (Blavatskyy 2016, p. 809).

This paper generalizes classic discounted utility to allow for the possibility that consumption in the previous moment of time has a certain carryover effect on utility satisfaction from consumption in the subsequent moment of time. We limit analysis to a lag of only one period since the most recent consumption arguably has the strongest effect on the evaluation of the subsequent consumption. Weakening the independence assumption of classic discounted utility also necessitates weakening of the stationarity principle of discounted utility. If consumption in the current moment influences utility satisfaction in the subsequent moment, then one cannot reasonably argue that common consumption in the current moment can be dropped altogether unless consumption is also the same in the subsequent moment of time. Somewhat surprisingly, such weakening of the stationarity principle results in essentially a quasi-hyperbolic discounting function. Thus, our proposed model improves the descriptive realism of discounted utility simultaneously from two sides: by weakening the independence assumption between intertemporal outcomes and by weakening the constant discounting of utilities of these outcomes. This illustrates that improving one specific descriptive property of a model can have a multiplier synergy effect on the other building blocks of the same model. The proposed model can rationalize violations of independence such as the “mere token” effect (Urminsky and Kivetz, 2011) or the common consequence effect (Scholten et al. 2016, p. 1199) as well as violations of stationarity (the present bias).

Applying the proposed model to standard consumption/savings problem reveals that the optimal consumption path exhibits decaying consumption cycles. In contrast, for example, the model of Baucells and Sarin (2007, Fig. 3, p. 176) generates a generically monotone optimal consumption path (except for the initial and terminal period effects). Consumption cycles become more pronounced the larger is the fraction of “residual” consumption that contributes to the satisfaction from consumption in the subsequent moment of time. Apart from consumption cycles, the proposed model has similar implications as most of the literature: quasi-hyperbolic discounting boosts consumption in the initial period and optimal consumption path is asymptotically increasing (decreasing) when gross interest rate multiplied by a discount factor is greater (smaller) than one. This result could be interpreted as a positive message for standard microeconomic theory. Classic discounted utility theory predicts optimal consumption path that, in general terms, is coherent with the optimal path generated by more descriptively accurate models of intertemporal choice.

Like any model, our proposed model also has its limitations, and it cannot rationalize certain choice patterns. For example, it cannot rationalize a preference for decreasing streams. Loewenstein and Sicherman (1991) found that many museum visitors prefer (hypothetical) increasing streams of wages over constant or decreasing streams of wages with the same cumulative payoff. However, Gigliotti and Sopher (1997, Table III, p. 51) reported that subjects choose a decreasing stream over a constant stream, and the latter over an increasing stream of real monetary payoffs (with the same cumulative payoff). Manzini et al. (2010, p. 338) also found that “a majority of subjects prefers decreasing to increasing sequences” when choosing between streams of real monetary payoffs.

The proposed model respects the basic consequentialist premise. It cannot account for “delay-speed up” asymmetry when subjects reveal a willingness to pay for a quicker delivery significantly lower than their willingness to accept a compensation for delayed delivery (of the same product over the same time period) documented in Loewenstein (1988). Similarly, the proposed model cannot rationalize the hidden-zero effect (Magen et al. 2008; Read and Scholten 2012; Scholten et al. 2016, p. 1178) when revealed preferences are affected by explicit presentation of zero outcomes.

This paper assumes that decision makers consider objective (chronological) time. A promising descriptive extension of this model is to consider nonlinear time perception. Takahashi (2005) demonstrates that a classical discounted utility maximizer, who transforms objective (chronological) time into subjective (mental) time with a logarithmic function (according to the Weber-Fechner law), behaves as a hyperbolic discounter. Ebert and Prelec (2007) and Killeen (2009) consider a decision maker with a diminishing sensitivity to time, who transforms objective time through a power function. Similarly, Kim and Zauberman (2009) and Zauberman et al. (2009) consider a decision maker who maps objective time into subjective (mental) time with a power function (according to Stevens’ power law).

Utility function (1) assumes that consumption in the preceding moment of time affects the evaluation of consumption in the subsequent moment of time. Alternatively, we could consider a model where the utility of consumption in the preceding moment of time has a “residual” influence on the evaluation of consumption in the subsequent moment of time, cf. utility function (3):

$$ U\left( x \right)\, = \,\left( {1 - \alpha } \right)u\left( {x_{0} } \right)\, + \,\beta_{1} \left( {1 - \alpha } \right)u\left( {x_{1} \, + \,u^{ - 1} \left[ {\alpha u\left( {x_{0} } \right)} \right]} \right)\, + \,\beta_{2} \left( {1 - \alpha } \right)u\left( {x_{2} \, + \,u^{ - 1} \left[ {\alpha u\left( {x_{1} \, + \,u^{ - 1} \left[ {\alpha u\left( {x_{0} } \right)} \right]} \right)} \right]} \right) + \ldots $$
(3)

An important advantage of utility function (3) is that it converges to the utility of total consumption \(u\left( {\sum\nolimits_{{t\, = \,0}}^{T} {x_{t} } } \right)\) when all moments of time converge to each other, and we assume that share α converges to one in such case. In other words, splitting total consumption in two parts, one of which is only slightly delayed in time cannot increase utility (3). Such discontinuity problem arises, however, in classic discounted utility (cf. Baucells and Sarin, 2007, p. 173; Blavatskyy 2016, p. 786; Blavatskyy 2021, Sect. 3) and many of its generalizations such as quasi-hyperbolic discounting. On the other hand, behavioral characterization of model (3) is not straightforward since consumption in any moment of time affects the utility of consumption in all subsequent moments of time, i.e., utility function (3) is not separable.