1 Introduction

Existing models of portfolio investment (Konno & Yamazaki, 1991; Markowitz, 1952; Yitzhaki, 1982) use statistical measures of dispersion (variance or standard deviation, mean absolute deviation, Gini (1912) mean difference) to capture the undesirable attribute (risk) of financial portfolios. Such statistical measures, however, count both unexpectedly low and unexpectedly high returns. Arguably, only the former are undesirable (cf. Roy, 1952) and investors may even be attracted to the latter. A model of portfolio investment based on the tradeoff between expected return and expected loss considers only returns below the reference point as undesirable. Such an approach can be backed by a large literature in behavioral economics finding evidence of loss aversion (Kahneman & Tversky, 1979). For example, willingness to accept often exceeds willingness to pay, a behavioral regularity known as the endowment effect (Kahneman et al., 1990; Thaler, 1980); decision-makers often prefer to retain the status quo (Knetsch, 1989; Samuelson & Zeckhauser, 1988); investors demand higher returns from stocks with downside risk (Ang et al., 2006); traders who experience losses in the morning take extra risks in the afternoon to recover (Coval & Shumway, 2005); and investors are prone to the disposition effect (Odean, 1998; Shefrin & Statman, 1985).

Markowitz (1952) mean–variance approach always violates the first-order stochastic dominance (Borch, 1969). Such violations are normatively unappealing and rarely observed in the data (Carbone & Hey, 1995; Loomes & Sugden, 1998, Table 2, p. 591; Hey, 2001, Table 2, p.14; see, however, Tversky & Kahneman, 1986, p. 264; Birnbaum & Navarrete, 1998, p. 61). A model of portfolio investment based on the tradeoff between expected return and expected loss has an important normative advantage over other models—a first-order stochastically dominant portfolio always has a higher expected return and a lower expected loss (Proposition 1 below).

Behavioral finance literature typically models aversion to negative returns by using elements of Kahneman and Tversky's (1979) prospect theory, which aggregates positive and negative returns together, with “losses looming larger than gains”. For example, Benartzi and Thaler (1995) combine loss aversion with mental accounting to rationalize the equity premium puzzle (Mehra & Prescott, 1985). Barberis et al. (2001) combine loss aversion with a sensitivity to prior outcomes to explain anomalies in asset prices. This paper models aversion to negative returns in the spirit of Markowitz's (1952) mean–variance approach: investors trade off expected returns against expected losses.

The remainder is organized as follows. Section 2 presents the expected return–expected loss model of portfolio investment. Section 3 compares this model with other models. Section 4 shows that this model can rationalize the equity premium puzzle. Section 5 concludes.

2 Expected return—expected loss model of portfolio investment

2.1 Set of feasible and efficient portfolios

There is a finite number of the states of the world n ∊ N. Only one state of the world is true ex post but an investor does not know ex ante which one. States of the world are numbered by subscripts i ∊ {1,…,n}. Notation p(si) denotes the probability of the state of the world si, i ∊ {1,…,n}. There is a finite number of securities m ∊ N. Securities are numbered by subscripts j ∊ {1,…,m}. In particular, the return of security j in the state of the world si is denoted by Rj(si). Expected return of security j is given by Eq. (1).

$$ ER_{j} = \mathop \sum \limits_{i = 1}^{n} R_{j} \left( {s_{i} } \right)p\left( {s_{i} } \right) $$
(1)

Losses (gains) are returns below (above) the reference point. For expositional clarity, we consider the simplest case when the reference point is the status quo (zero return) so that losses are simply negative returns. A more general model with a non-zero reference point is qualitatively similar. Expected loss of security j is given by Eq. (2).

$$ EL_{j} = - \sum\limits_{{\mathop {i = 1}\limits_{{R_{j} \left( {s_{i} } \right){\text{ < }}0}} }}^{n} {R_{j} \left( {s_{i} } \right)p\left( {s_{i} } \right)} $$
(2)
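As a minimal sketch of Eqs. (1) and (2), the two attributes of a single security can be computed as below. The function names and the four-state return vector are hypothetical illustrations, not data from the paper; the reference point is zero, as assumed in the text.

```python
from typing import Sequence


def expected_return(returns: Sequence[float], probs: Sequence[float]) -> float:
    """Eq. (1): probability-weighted sum of state-contingent returns."""
    return sum(r * p for r, p in zip(returns, probs))


def expected_loss(returns: Sequence[float], probs: Sequence[float]) -> float:
    """Eq. (2): minus the probability-weighted sum of negative returns."""
    return -sum(r * p for r, p in zip(returns, probs) if r < 0)


# Hypothetical security with four equally likely states of the world.
returns = [0.12, -0.04, 0.02, -0.02]
probs = [0.25, 0.25, 0.25, 0.25]

er = expected_return(returns, probs)  # (0.12 - 0.04 + 0.02 - 0.02) / 4
el = expected_loss(returns, probs)    # (0.04 + 0.02) / 4
print(er, el)
```

Only the two states with negative returns enter the expected loss, whereas all four states enter the expected return.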

Portfolio \({\varvec{\alpha}} = \left\{ {\alpha_{1} ,\alpha_{2} , \ldots ,\alpha_{m} } \right\}\) yields return \( \sum\nolimits_{{j = 1}}^{m} {\alpha _{j} R_{j} \left( {s_{i} } \right)} \) in the state of the world si, αj ∊ [0,1] for all j ∊ {1,…,m} and \( \sum\nolimits_{{j = 1}}^{m} {\alpha _{j} = 1} \). Expected return of any portfolio Eq. (3) is simply the weighted average of expected returns of individual securities in this portfolio.

$$ ER\left( {\varvec{\alpha}} \right) = \mathop \sum \limits_{j = 1}^{m} \alpha_{j} ER_{j} $$
(3)

Two securities are comoving if they yield losses in the same states of the world (which implies that they yield gains in the same states of the world as well). Note that this concept of comoving securities is different from the concept of (sign) comonotonic securities (cf. Wakker et al., 1994) that is used in rank-dependent utility (Quiggin, 1981, 1982) or cumulative prospect theory (Tversky & Kahneman, 1992). For example, two securities TSLA and NVDA (footnote 1) presented in Table 1 are comoving (both securities are in the red in the same states of the world) but not comonotonic (NVDA yields a higher return in state s1 but TSLA yields a higher return in state s2). If all securities are comoving then the expected loss of a portfolio is simply the weighted average of expected losses of individual comoving securities, as given by Eq. (4).

$$ EL\left( {\varvec{\alpha}} \right) = \mathop \sum \limits_{j = 1}^{m} \alpha_{j} EL_{j} $$
(4)
Table 1 An example of two comoving securities

It is conventional to represent securities/portfolios in a two-dimensional diagram with their desirable attribute (expected return) plotted on the vertical axis and their undesirable attribute (expected loss)—on the horizontal axis. A set of portfolios that can be constructed with two comoving securities is then represented by a straight line connecting these two securities. For example, Fig. 1 shows the set of all feasible portfolios that can be constructed with securities TSLA and NVDA presented in Table 1 (on the assumption that all states of the world are equally likely).
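The linearity behind Eq. (4) can be checked numerically. The sketch below uses two hypothetical comoving securities (not the TSLA and NVDA data of Table 1) and assumes that a zero return is compatible with either sign when testing for comovement.

```python
from typing import List, Sequence


def expected_loss(returns: Sequence[float], probs: Sequence[float]) -> float:
    """Eq. (2) with a zero reference point."""
    return -sum(r * p for r, p in zip(returns, probs) if r < 0)


def is_comoving(r1: Sequence[float], r2: Sequence[float]) -> bool:
    """True if no state has returns of opposite signs (zero counts as either sign)."""
    return all(a * b >= 0 for a, b in zip(r1, r2))


def portfolio_returns(weights: Sequence[float],
                      securities: Sequence[Sequence[float]]) -> List[float]:
    """State-by-state return of a portfolio with the given weights."""
    n_states = len(securities[0])
    return [sum(w * sec[i] for w, sec in zip(weights, securities))
            for i in range(n_states)]


# Hypothetical data: three equally likely states, losses only in state 2.
probs = [1 / 3, 1 / 3, 1 / 3]
x = [0.20, -0.10, 0.05]
y = [0.10, -0.30, 0.15]
weights = [0.4, 0.6]

mix = portfolio_returns(weights, [x, y])
lhs = expected_loss(mix, probs)                      # EL of the portfolio
rhs = (weights[0] * expected_loss(x, probs)
       + weights[1] * expected_loss(y, probs))       # weighted average, Eq. (4)
print(is_comoving(x, y), lhs, rhs)
```

Because both securities lose in the same state, the portfolio also loses exactly in that state, so its expected loss is linear in the weights and the feasible set is a straight line.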

Fig. 1 Feasible portfolios that can be constructed with two securities in Table 1

Let us now consider two securities that are not comoving. In other words, there is at least one state of the world si in which there is no comovement, i.e. R1(si)R2(si) < 0. For any such state, we calculate a share α1(si) of the first security such that the return of the binary portfolio in state si is equal to the reference point (zero return), as given by Eq. (5) (footnote 2).

$$ \alpha_{1} \left( {s_{i} } \right) = \frac{{R_{2} \left( {s_{i} } \right)}}{{R_{2} \left( {s_{i} } \right) - R_{1} \left( {s_{i} } \right)}} $$
(5)

We plot binary portfolios corresponding to all such “break even” shares as well as α1 = 0 and α1 = 1 on the expected return–expected loss plane. Finally, we connect consecutive portfolios (in order of α1) by straight lines. The resulting piece-wise linear curve represents the set of all feasible portfolios that can be constructed with two securities that are not comoving (footnote 3).
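The construction can be sketched as follows for two hypothetical securities (the return vectors are illustrative and are not taken from Table 2). Between two consecutive “break even” shares the set of loss states does not change, so the expected loss is linear in α1 and only the vertices need to be computed.

```python
from typing import List, Sequence, Tuple


def break_even_shares(r1: Sequence[float], r2: Sequence[float]) -> List[float]:
    """Eq. (5) for every state in which the two returns have opposite signs."""
    return sorted(b / (b - a) for a, b in zip(r1, r2) if a * b < 0)


def binary_frontier(r1: Sequence[float], r2: Sequence[float],
                    probs: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Vertices (share of security 1, expected loss, expected return)."""
    shares = sorted({0.0, 1.0, *break_even_shares(r1, r2)})
    vertices = []
    for a1 in shares:
        port = [a1 * a + (1 - a1) * b for a, b in zip(r1, r2)]
        er = sum(r * p for r, p in zip(port, probs))
        el = -sum(r * p for r, p in zip(port, probs) if r < 0)
        vertices.append((a1, el, er))
    return vertices


# Hypothetical securities: not comoving in states 1 and 2.
probs = [1 / 3, 1 / 3, 1 / 3]
r1 = [0.06, -0.02, 0.04]
r2 = [-0.03, 0.05, 0.10]

shares = break_even_shares(r1, r2)
vertices = binary_frontier(r1, r2, probs)
for a1, el, er in vertices:
    print(f"share={a1:.3f}  EL={el:.4f}  ER={er:.4f}")
```

In this illustration every portfolio with a share of the first security between the two break-even shares has no negative return in any state, so its expected loss is zero.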

For example, two securities WMT and AAPL (footnote 4) presented in Table 2 are not comoving in states s3 and s10 (AAPL is in the red but not WMT) and s5, s6, s8 and s11 (WMT is in the red but not AAPL). The last column of Table 2 shows the implied “break even” shares, Eq. (5), of WMT for these six states of the world. Figure 2 presents six binary portfolios constructed with these “break even” shares on the expected return–expected loss plane (on the assumption that all states of the world are equally likely). A piece-wise linear curve that connects these six portfolios (as well as two degenerate portfolios with 100% of WMT and AAPL) represents all feasible portfolios that can be constructed with shares of WMT and AAPL.

Table 2 An example of two securities that are not comoving

Fig. 2 Feasible portfolios that can be constructed with two securities in Table 2

We assume that an investor prefers portfolios with a higher expected return and a lower expected loss. Thus, the set of efficient portfolios is the subset of feasible portfolios for which no other feasible portfolio offers an expected return at least as high and an expected loss at least as low, with at least one of the two strictly better. For example, in Fig. 1 the set of efficient portfolios coincides with the set of feasible portfolios. On the other hand, in Fig. 2, the set of efficient portfolios is the subset of feasible portfolios with a WMT share of at most 62.6%.
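Efficiency is a dominance check in the expected loss and expected return coordinates. The sketch below applies it to a finite, hypothetical sample of feasible portfolios; it is only an illustration of the definition and does not trace the segments between sampled points.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (expected loss, expected return)


def dominates(q: Point, p: Point) -> bool:
    """q dominates p: no higher loss, no lower return, and not the same point."""
    return q[0] <= p[0] and q[1] >= p[1] and q != p


def efficient(points: List[Point]) -> List[Point]:
    """Keep the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical feasible portfolios as (EL, ER) pairs.
feasible = [(0.00, 0.01), (0.01, 0.03), (0.02, 0.02), (0.03, 0.05)]
frontier = efficient(feasible)
print(frontier)
```

The portfolio (0.02, 0.02) is dropped because (0.01, 0.03) has both a lower expected loss and a higher expected return.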

The algorithm for constructing the set of efficient portfolios over two securities is then extended, by iteration, to efficient portfolios over any finite number of securities (comoving or not). For example, Fig. 3 plots the set of feasible portfolios that can be constructed with three securities: TSLA from Table 1, WMT and AAPL from Table 2. This set is bounded by three piece-wise linear curves: (1) a solid curve showing all binary portfolios feasible with only two securities WMT and AAPL (the same as presented in Fig. 2); (2) a dashed curve showing all binary portfolios feasible with WMT and TSLA; and (3) a dashed-dotted curve showing all binary portfolios feasible with AAPL and TSLA (footnote 5). Figure 3 shows that the set of efficient portfolios that can be constructed with three securities TSLA, WMT and AAPL is the same as the set of efficient portfolios that can be constructed with only two securities TSLA and WMT (a section of the dashed curve with WMT share being at most 79.6%).

Fig. 3 Feasible portfolios that can be constructed with three securities WMT, AAPL and TSLA from Tables 1 and 2 (portfolio A is 94.5% WMT and 5.5% TSLA; portfolio B is 96.3% WMT and 3.7% TSLA; portfolio C is 98.8% WMT and 1.2% TSLA; portfolio D is 99.2% WMT and 0.8% TSLA; portfolio E is 62.6% WMT and 37.4% AAPL; portfolio F is 90.3% WMT and 9.7% AAPL; portfolio G is 98% WMT and 2% AAPL; and portfolio H is 98.7% WMT and 1.3% AAPL)

In the example presented in Fig. 3, three piece-wise linear curves corresponding to feasible binary portfolios do not intersect (they only meet at points representing degenerate portfolios with 100% share of one security). In this case, the set of efficient portfolios contains only binary portfolios (i.e. one security is never used for diversification). Let us now consider a more complex example when the set of efficient portfolios contains fully diversified portfolios (with positive shares of all available securities). We consider securities WMT and AAPL from Table 2 (that are also presented in the second and fifth columns of Table 3) and security REGN (footnote 6) presented in the third/sixth column of Table 3. The fourth (seventh) column of Table 3 presents “break even” shares, Eq. (5), of WMT (AAPL) in a binary portfolio with REGN.

Table 3 “Break even” probabilities for binary portfolios WMT-REGN and AAPL-REGN
Table 4 Yields on a 13 week US treasury bill and monthly changes in NASDAQ index

A solid piece-wise linear curve in Fig. 4 shows all binary portfolios feasible with WMT and AAPL (the same as presented in Fig. 2). A dashed (dashed-dotted) piece-wise linear curve in Fig. 4 shows all binary portfolios feasible with WMT (AAPL) and REGN that is constructed using five “break even” shares from the fourth (seventh) column of Table 3.

Fig. 4 Feasible portfolios that can be constructed with three securities WMT, AAPL and REGN (portfolio A is 76.3% WMT, 6.8% AAPL and 16.9% REGN; portfolio B is 47% WMT, 32.2% AAPL and 20.8% REGN; portfolio C is 22.6% WMT, 53.4% AAPL and 24% REGN)

The dashed and dashed-dotted curves in Fig. 4 intersect (unlike those in Fig. 3), so the set of feasible portfolios is not bounded only by the solid, dashed and dashed-dotted curves (corresponding to feasible binary portfolios). To construct the set of feasible portfolios we proceed as follows. Table 3 shows that there are eight states of the world where the returns of three available securities are not comoving (s3–s6 and s8–s11). We calculate a share α1(si,sj) of the first security and a share α2(si,sj) of the second security such that the return of a diversified portfolio over three securities is equal to the reference point (zero return) in two distinct states si and sj where the returns of three available securities are not comoving (i.e. i,j ∊ {3–6,8–11}, i ≠ j). This amounts to solving the system of linear equations (6).

$$ \left\{ {\begin{array}{*{20}c} {\alpha_{1} \left( {s_{i} ,s_{j} } \right)R_{1} \left( {s_{i} } \right) + \alpha_{2} \left( {s_{i} ,s_{j} } \right)R_{2} \left( {s_{i} } \right) + \left[ {1 - \alpha_{1} \left( {s_{i} ,s_{j} } \right) - \alpha_{2} \left( {s_{i} ,s_{j} } \right)} \right]R_{3} \left( {s_{i} } \right) = 0} \\ {\alpha_{1} \left( {s_{i} ,s_{j} } \right)R_{1} \left( {s_{j} } \right) + \alpha_{2} \left( {s_{i} ,s_{j} } \right)R_{2} \left( {s_{j} } \right) + \left[ {1 - \alpha_{1} \left( {s_{i} ,s_{j} } \right) - \alpha_{2} \left( {s_{i} ,s_{j} } \right)} \right]R_{3} \left( {s_{j} } \right) = 0} \\ \end{array} } \right. $$
(6)

such that \(\alpha_{1} \left( {s_{i} ,s_{j} } \right),\alpha_{2} \left( {s_{i} ,s_{j} } \right) \in \left[ {0,1} \right]\), \(\alpha_{1} \left( {s_{i} ,s_{j} } \right) + \alpha_{2} \left( {s_{i} ,s_{j} } \right) \le 1\), i ≠ j, \(\min \left\{ {R_{1} \left( {s_{i} } \right),R_{2} \left( {s_{i} } \right),R_{3} \left( {s_{i} } \right)} \right\} < 0 < \max \left\{ {R_{1} \left( {s_{i} } \right),R_{2} \left( {s_{i} } \right),R_{3} \left( {s_{i} } \right)} \right\}\), \(\min \left\{ {R_{1} \left( {s_{j} } \right),R_{2} \left( {s_{j} } \right),R_{3} \left( {s_{j} } \right)} \right\} < 0 < \max \left\{ {R_{1} \left( {s_{j} } \right),R_{2} \left( {s_{j} } \right),R_{3} \left( {s_{j} } \right)} \right\}\).

The solution α1(s4,s11) = 0.763 and α2(s4,s11) = 0.068 to system (6) is represented as point A in Fig. 4; the solution α1(s4,s10) = 0.47 and α2(s4,s10) = 0.322 to system (6) is represented as point B in Fig. 4; and the solution α1(s3,s4) = 0.226 and α2(s3,s4) = 0.534 to system (6) is represented as point C in Fig. 4 (footnote 7). Points A, B and C as well as the two nearest binary portfolios (84.1% WMT-15.9% REGN and 59.3% AAPL-40.7% REGN) that form a convex hull with these points are connected by a piece-wise linear dotted curve in Fig. 4. The set of all feasible portfolios is then bounded by this piece-wise linear dotted curve as well as solid, dashed and dashed-dotted curves. The set of efficient portfolios is then the set of feasible binary portfolios AAPL-REGN with AAPL share being at most 59.3% plus the set of (compound/conglomerate) portfolios that can be constructed by mixing a binary portfolio 59.3% AAPL-40.7% REGN with portfolio C (22.6% WMT, 53.4% AAPL and 24% REGN).
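System (6) is a pair of linear equations in two unknowns, so it can be solved directly. The sketch below uses Cramer's rule on hypothetical returns (not the Table 3 data) and returns no solution when the shares fall outside the admissible range or the system is singular.

```python
from typing import Optional, Sequence, Tuple


def break_even_pair(ri: Sequence[float],
                    rj: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Solve system (6); ri and rj hold (R1, R2, R3) in states i and j."""
    # Rearranged: a1*(R1 - R3) + a2*(R2 - R3) = -R3 in each of the two states.
    a, b, e = ri[0] - ri[2], ri[1] - ri[2], -ri[2]
    c, d, f = rj[0] - rj[2], rj[1] - rj[2], -rj[2]
    det = a * d - b * c
    if abs(det) < 1e-12:
        return None  # the two equations are (numerically) linearly dependent
    a1 = (e * d - b * f) / det
    a2 = (a * f - e * c) / det
    if a1 < 0 or a2 < 0 or a1 + a2 > 1:
        return None  # shares must be non-negative and sum to at most one
    return a1, a2


# Hypothetical returns of three securities in two non-comoving states.
state_i = (0.04, -0.02, -0.06)
state_j = (-0.03, 0.05, 0.01)

solution = break_even_pair(state_i, state_j)
a1, a2 = solution
a3 = 1 - a1 - a2
check_i = a1 * state_i[0] + a2 * state_i[1] + a3 * state_i[2]
check_j = a1 * state_j[0] + a2 * state_j[1] + a3 * state_j[2]
print(solution, check_i, check_j)
```

Repeating this for every admissible pair of states yields the candidate vertices, such as points A, B and C, that are then plotted on the expected return and expected loss plane.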

2.2 Investor’s indifference curves

As the next step, we specify investor’s preferences to determine the most preferred efficient portfolio. In general, investor’s preferences are represented by the utility function U(ER(α),EL(α)) that is increasing in the first argument and decreasing in the second argument. A major theoretical advantage of the expected return–expected loss model of portfolio investment over other models proposed in the literature is that utility function U(ER(α),EL(α)) does not violate the first-order stochastic dominance (cf. Proposition 1 below). In contrast, the influential Markowitz (1952) mean–variance approach always violates the first-order stochastic dominance (Borch, 1969). The mean-absolute deviation approach (Blavatskyy, 2010; Konno & Yamazaki, 1991) respects the first-order stochastic dominance only when indifference curves are not too steep (with a slope less than 0.5) in the expected return–absolute deviation plane. A similar restriction also applies to the mean-Gini approach (Shalit & Yitzhaki, 1984; Yitzhaki, 1982) that employs Gini (1912) mean absolute difference statistic for measuring statistical dispersion of assets’ returns (footnote 8).

Proposition 1

If portfolio α first-order stochastically dominates portfolio β then ER(α) ≥ ER(β) and EL(α) ≤ EL(β).

The proof is presented in the appendix.
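One way to see why the proposition holds, offered here as a sketch that need not coincide with the argument in the appendix, is to write both attributes as expectations of monotone functions of the portfolio return. The symbol \(R_{{\varvec{\alpha}}}\) for the state-contingent return of portfolio α is notation introduced only for this sketch, and the reference point is zero as in Section 2.1.

$$ ER\left( {\varvec{\alpha}} \right) = E\left[ {R_{{\varvec{\alpha}}} } \right],\quad EL\left( {\varvec{\alpha}} \right) = E\left[ {\max \left\{ { - R_{{\varvec{\alpha}}} ,0} \right\}} \right] $$

The function x ↦ x is non-decreasing and the function x ↦ max{−x, 0} is non-increasing. First-order stochastic dominance of α over β implies that the expectation of any non-decreasing function of the return is at least as high under α as under β, which gives ER(α) ≥ ER(β), and that the expectation of any non-increasing function is at most as high, which gives EL(α) ≤ EL(β).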

For practical applications a useful parametric form is a quasi-linear utility function U(ER(α),EL(α)) = ER(α) − a·[EL(α)]^b, where a,b ≥ 0 are constants. When a = 0, an investor does not care about expected losses (does not diversify) and picks a portfolio with the highest expected return as manifested by horizontal indifference curves. When a is infinitely large, an investor does not care about expected returns and picks a portfolio with the lowest expected loss (extreme loss aversion) as manifested by vertical indifference curves. Parameter b captures investor’s sensitivity to expected losses: if b > 1 (< 1) the investor becomes more (less) averse to larger expected losses. Given the set of efficient portfolios and indifference curves representing investor’s preferences, the optimal (most preferred) portfolio is an efficient portfolio that is located on the highest indifference curve. Figure 5 illustrates an optimal portfolio (59.3% AAPL and 40.7% REGN) for an example with three securities WMT, AAPL and REGN (presented in Table 3) when investor’s preferences are such that expected losses are twice as undesirable as expected gains. Since the set of efficient portfolios is a piece-wise linear curve, “sharp” kink points on this curve are likely to be optimal portfolios for a wide range of preferences.
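The role of kink points can be illustrated with a short sketch. The three candidate portfolios below are hypothetical vertices of an efficient frontier (labels and numbers are not from the paper); with b = 1 the middle vertex is optimal for every a between the slopes of the two adjacent segments, here 0.75 and 1.75.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (label, expected return, expected loss)


def utility(er: float, el: float, a: float, b: float = 1.0) -> float:
    """Quasi-linear utility ER - a * EL**b with a, b >= 0."""
    return er - a * el ** b


def best_portfolio(candidates: List[Candidate], a: float, b: float = 1.0) -> str:
    """Label of the candidate with the highest utility."""
    return max(candidates, key=lambda c: utility(c[1], c[2], a, b))[0]


# Hypothetical frontier vertices; Y is a kink between segments X-Y and Y-Z.
candidates = [("X", 0.030, 0.020), ("Y", 0.024, 0.012), ("Z", 0.010, 0.004)]

for a in (0.0, 1.0, 1.5, 2.0):
    print(a, best_portfolio(candidates, a))
```

With a = 0 the investor picks the highest expected return (X), while a sufficiently large a moves the choice to the portfolio with the lowest expected loss (Z).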

Fig. 5 Optimal portfolio with three securities WMT, AAPL and REGN for utility function U(ER(α),EL(α)) = ER(α) − 2EL(α) represented by dashed indifference curves

3 Relationship to other models of portfolio investment

The expected return–expected loss model of optimal portfolio investment is closest to the mean-absolute deviation approach (Blavatskyy, 2010; Konno & Yamazaki, 1991). In the expected return–expected loss model the reference point is constant (the same for all securities). In the mean-absolute deviation approach, the reference point is context-dependent (different for different securities). Specifically, the reference point of security j is the expected return ERj of this security (footnote 9). The expected loss of a security j is then equal to the mean absolute semideviation of this security, Eq. (7).

$$ EL_{j} = \sum\limits_{{\mathop {i = 1}\limits_{{R_{j} \left( {s_{i} } \right){\text{ < }}ER_{j} }} }}^{n} {\left[ {ER_{j} - R_{j} \left( {s_{i} } \right)} \right]p\left( {s_{i} } \right) = \frac{1}{2}\sum\limits_{{i = 1}}^{n} {\left| {R_{j} \left( {s_{i} } \right) - ER_{j} } \right|} p\left( {s_{i} } \right)} $$
(7)
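The second equality in Eq. (7) holds because deviations above and below the mean cancel in expectation, so each side carries exactly half of the mean absolute deviation. A minimal numerical check on a hypothetical return vector (not data from the paper):

```python
from typing import Sequence


def mean(returns: Sequence[float], probs: Sequence[float]) -> float:
    """Expected return, Eq. (1)."""
    return sum(r * p for r, p in zip(returns, probs))


def semideviation(returns: Sequence[float], probs: Sequence[float]) -> float:
    """Left-hand side of Eq. (7): expected shortfall below the mean."""
    m = mean(returns, probs)
    return sum((m - r) * p for r, p in zip(returns, probs) if r < m)


def mean_abs_deviation(returns: Sequence[float], probs: Sequence[float]) -> float:
    """Mean absolute deviation from the expected return."""
    m = mean(returns, probs)
    return sum(abs(r - m) * p for r, p in zip(returns, probs))


# Hypothetical security with four equally likely states.
returns = [0.12, -0.04, 0.02, -0.02]
probs = [0.25, 0.25, 0.25, 0.25]

semi = semideviation(returns, probs)
mad = mean_abs_deviation(returns, probs)
print(semi, 0.5 * mad)
```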

To understand the relationship between the mean-Gini approach (Shalit & Yitzhaki, 1984; Yitzhaki, 1982) and the expected return—expected loss model we need to assume that the reference point is both context-dependent and state-dependent (i.e. the reference point is stochastic). Specifically, the reference point RPij of the j-th security in the i-th state of the world is the expected return of this security in all states that bring a higher return than j-th return in state i: \(G_{ij} \equiv \left\{ {k \in \left\{ {1, \ldots ,n} \right\}{|}R_{j} \left( {s_{k} } \right) > R_{j} \left( {s_{i} } \right)} \right\}\).

$$ RP_{ij} = \mathop \sum \limits_{{k \in G_{ij} }} R_{j} \left( {s_{k} } \right)\frac{{p\left( {s_{k} } \right)}}{{\mathop \sum \nolimits_{{h \in G_{ij} }} p\left( {s_{h} } \right)}} $$
(8)

The expected loss of the j-th security is then equal to one-half of Gini (1912) mean absolute difference statistic Eq. (9) with mean differences weighted by the inverse of the j-th decumulative distribution function in the state with a lower return.

$$ \begin{aligned} EL_{j} & = \mathop \sum \limits_{i = 1}^{n} \left[ {RP_{ij} - R_{j} \left( {s_{i} } \right)} \right]p\left( {s_{i} } \right) \\ & = \mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{{k \in G_{ij} }} \frac{{\left[ {R_{j} \left( {s_{k} } \right) - R_{j} \left( {s_{i} } \right)} \right]}}{{\mathop \sum \nolimits_{{h \in G_{ij} }} p\left( {s_{h} } \right)}}p\left( {s_{i} } \right)p\left( {s_{k} } \right) \\ & = \frac{1}{2}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{k = 1}^{n} \frac{{\left| {R_{j} \left( {s_{k} } \right) - R_{j} \left( {s_{i} } \right)} \right|}}{{\mathop \sum \nolimits_{{h \in G_{{\min \left\{ {i,k} \right\}j}} }} p\left( {s_{h} } \right)}}p\left( {s_{i} } \right)p\left( {s_{k} } \right) \\ \end{aligned} $$
(9)

The expected return–expected loss model overlaps with Kahneman and Tversky's (1979) original prospect theory as well as Tversky and Kahneman's (1992) cumulative prospect theory. The special case when an investor trades off the expected return against the expected loss in a linear manner, i.e. U(ER(α),EL(α)) = ER(α) − a·EL(α), corresponds to the special case of both versions of prospect theory with a piece-wise linear value function with a loss aversion coefficient λ = 1 + a and without non-linear probability weighting.
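The value λ = 1 + a follows from splitting the expected return into gains and losses. In the sketch below, EG(α) denotes the expected gain (the probability-weighted sum of positive portfolio returns); this symbol is introduced only for the illustration and does not appear elsewhere in the paper.

$$ ER\left( {\varvec{\alpha}} \right) = EG\left( {\varvec{\alpha}} \right) - EL\left( {\varvec{\alpha}} \right)\quad \Rightarrow \quad ER\left( {\varvec{\alpha}} \right) - a \cdot EL\left( {\varvec{\alpha}} \right) = EG\left( {\varvec{\alpha}} \right) - \left( {1 + a} \right)EL\left( {\varvec{\alpha}} \right) $$

Gains therefore enter utility with weight one and losses with weight 1 + a, which is the piece-wise linear value function of prospect theory with loss aversion coefficient λ = 1 + a.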

There is no obvious mathematical relationship between the expected return—expected loss model and Markowitz (1952) mean–variance approach. The relationship between these two models can be illustrated with the example of three securities WMT, AAPL and REGN. The solid line in Fig. 6 shows the set of efficient portfolios that can be constructed with these three securities according to the expected return—expected loss model (the same as in Figs. 4 and 5). The set of efficient portfolios according to Markowitz (1952) mean–variance approach is the set of binary portfolios AAPL-REGN with AAPL share being at most 62.4% plus the set of (compound/conglomerate) portfolios that can be constructed by mixing a binary portfolio 62.4%AAPL-37.6%REGN with a fully diversified portfolio 65.2%WMT, 14.9%AAPL and 19.9%REGN. This set is represented by a dashed line in Fig. 6.

Fig. 6 Efficient portfolios with three securities WMT, AAPL and REGN according to the expected return–expected loss model and mean–variance approach

The set of efficient portfolios according to the expected return—expected loss model (a solid line in Fig. 6) overlaps with that according to Markowitz (1952) mean–variance approach (a dashed line in Fig. 6) when efficient portfolios have a relatively high expected return and a relatively high expected loss. In this case, according to both models, efficient portfolios are binary portfolios AAPL-REGN with AAPL share being at most 59.3%. Thus, if investors prefer stocks with high returns and are relatively less averse to expected losses (or variance of returns), the optimal portfolio is likely to be the same according to both models.

If investors are sensitive to expected losses, the optimal portfolio differs between the two models. The optimal portfolio according to the mean–variance approach generally has a lower expected return and higher expected losses compared to the optimal portfolio according to our model. This happens because a portfolio with low expected losses may have abnormally high returns, which makes it too “volatile” according to the mean–variance approach. The latter favors portfolios with returns smoothed over all states of the world (at the cost of lower returns).

4 The equity premium puzzle

The equity premium puzzle (Mehra & Prescott, 1985) refers to an empirical finding that an unreasonably high degree of risk aversion (under expected utility theory) is required to rationalize a diversified portfolio that includes a stock market index and relatively low-interest governmental bonds. To illustrate the equity premium puzzle, the second column of Table 4 shows yields on a 13 week US treasury bill between June 1st, 2019 and May 1st, 2020 and the third column of Table 4 shows monthly changes in adjusted close prices of NASDAQ Composite Index for the same time period (footnote 10). Let us assume that each of the twelve months listed in Table 4 is an equally likely state of the world. We ignore the possibility of different taxation rates for treasury bills and traded stocks. Under these assumptions, an investor maximizing expected utility with a constant relative risk aversion utility function

$$ u\left( x \right) = \frac{{x^{1 - r} }}{1 - r} $$
(10)

holds both a 13 week US treasury bill and NASDAQ Composite Index in her portfolio if her coefficient of relative risk aversion is r > 2.41. For example, an investor with a coefficient of relative risk aversion r = 2.5 holds a diversified portfolio 4% T-bills and 96% NASDAQ index. A decision-maker with r = 2.5 sells a 50–50% chance to receive either $100 or $1 for only $1.58, which might be an unreasonably low certainty equivalent. Indeed, many empirical studies find lower coefficients of relative risk aversion. For instance, Blavatskyy and Pogrebna (2010, Table III, p.973) estimate coefficients of relative risk aversion to be between − 0.12 and 0.45 (depending on the econometric model of probabilistic choice) in a natural experiment with Italian contestants standing to win up to €500000. de Roos & Sarafidis (2010, Table XIV) estimate coefficients of relative risk aversion to be between 0.46 and 0.65 (depending on econometric model) for Australian contestants standing to win up to AUD 200000.
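The certainty equivalent quoted above can be reproduced from Eq. (10). The sketch below is a minimal illustration with hypothetical function names; it excludes r = 1, where Eq. (10) is not defined.

```python
from typing import Sequence


def crra_utility(x: float, r: float) -> float:
    """Eq. (10): u(x) = x**(1 - r) / (1 - r), defined here for r != 1."""
    if r == 1:
        raise ValueError("Eq. (10) is undefined at r = 1")
    return x ** (1 - r) / (1 - r)


def certainty_equivalent(outcomes: Sequence[float],
                         probs: Sequence[float], r: float) -> float:
    """Sure amount whose utility equals the expected utility of the lottery."""
    eu = sum(p * crra_utility(x, r) for x, p in zip(outcomes, probs))
    return (eu * (1 - r)) ** (1 / (1 - r))  # inverse of Eq. (10)


# The 50-50 chance of $100 or $1 from the text, evaluated at r = 2.5.
ce = certainty_equivalent([100.0, 1.0], [0.5, 0.5], 2.5)
print(ce)

# For comparison, a risk-neutral decision-maker (r = 0) values it at $50.50.
ce_neutral = certainty_equivalent([100.0, 1.0], [0.5, 0.5], 0.0)
print(ce_neutral)
```

The gap between the two values shows how strongly r = 2.5 discounts the lottery relative to its expected value.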

To rationalize even a small share of treasury bonds in a diversified portfolio that includes a stock market index, an expected utility maximizer must exhibit the levels of risk aversion that are descriptively unjustifiable. In contrast, the expected return—expected loss model of portfolio investment can rationalize the equity premium puzzle without unrealistic assumptions about investor’s preferences. The last column of Table 4 lists “break even” shares of treasury bills needed for constructing the set of efficient binary portfolios over a 13 week US Treasury bill and NASDAQ Composite Index. This set of efficient binary portfolios is shown as a solid piece-wise linear curve in Fig. 7. Note that binary portfolios with 100% or 99.7% share of treasury bills have no negative returns, i.e. their expected loss is zero.

Fig. 7 Set of efficient binary portfolios of a 13 week US treasury bill and NASDAQ Composite Index (solid line) and indifference curves (dashed line) rationalizing the optimal portfolio with 57.4% share of treasury bills

Figure 7 shows that the optimal binary portfolio includes 57.4% of a 13 week US treasury bill when the slope of the investor’s indifference curve touching the solid line in the expected return—expected loss plane is between 0.52 and 0.64. As discussed in the previous section, this corresponds to a coefficient of loss aversion between 1.52 and 1.64 in a version of prospect theory without non-linear probability weighting. Coefficients of loss aversion greater than two are not uncommon in the empirical literature (e.g., Tversky & Kahneman, 1992). Hence, we can conclude that a relatively modest aversion to expected losses is sufficient for rationalizing a relatively large share of treasury bonds in a diversified portfolio together with a stock market index.

5 Conclusion

The problem of optimal portfolio investment amounts to finding the most desirable convex combination of several random variables. Often regarded as a normative benchmark, expected utility theory (von Neumann & Morgenstern, 1947) employs a non-linear Bernoulli utility function. Thus, in general, under expected utility theory there is no closed-form solution for an optimal convex combination of several random variables. This limitation carries over to many well-known generalizations of expected utility theory such as rank-dependent utility (Quiggin, 1981, 1982) or cumulative prospect theory (Tversky & Kahneman, 1992). The possibility of characterizing a closed-form solution for an optimal convex combination is valued in practical applications in finance. Hence, several models of optimal portfolio investment were proposed with a linear utility function, e.g. Markowitz's (1952) mean–variance approach. Such models produce a closed-form solution for an optimal convex combination but lack a solid preference foundation. For example, Markowitz's (1952) mean–variance approach may lead to violations of the first-order stochastic dominance. This state of the literature calls for a model of optimal portfolio investment that, on the one hand, is based on intuitive micro-economic preferences (e.g. respects the first-order stochastic dominance) and, on the other hand, produces a closed-form solution for an optimal convex combination (which essentially boils down to avoiding any non-linear transformations of state-contingent payoffs). The main contribution of this paper is to advance one such model.

The idea that investors weight expected returns vis-à-vis expected losses is intuitively appealing. Such a model also has normatively attractive properties. For example, it does not lead to violations of the first-order stochastic dominance. On the descriptive side, the model can be supported by a large behavioral literature finding evidence of loss aversion. To construct a set of efficient portfolios in the expected return—expected loss plane we only need to solve (a system of) linear equations. Thus, a closed-form solution for an optimal convex combination under this model is even simpler compared to the classic mean–variance approach (which requires solving a quadratic optimization problem).

An important concept in the expected return—expected loss model is that of comoving random variables. Two random variables are comoving if there is no state of the world in which one random variable yields a positive return and the other—a negative return. Comoving random variables do not provide any additional benefit from diversification. The expected return (loss) of a convex combination of comoving random variables is equal to the weighted average of the expected returns (losses) of individual comoving random variables. In contrast, random variables that are not comoving provide additional benefits from diversification. The expected loss of a convex combination of such variables is less than the corresponding weighted average of the expected losses of individual random variables. In other words, securities that are not comoving provide hedging benefits in portfolio investment.

The efficiency frontier under the expected return–expected loss model is qualitatively similar to that under the mean–variance approach. The mean-absolute deviation approach (Konno & Yamazaki, 1991) can be viewed as the expected return–expected loss model with an endogenous reference point that is equal to the expected return of a security. The mean-Gini approach (Shalit & Yitzhaki, 1984; Yitzhaki, 1982) can be viewed as the expected return–expected loss model with an endogenous state-dependent reference point. Finally, the expected return–expected loss model overlaps with prospect theory (Kahneman & Tversky, 1979; Tversky & Kahneman, 1992). A special case of the former, when an investor trades off expected returns and expected losses linearly, is also a special case of the latter, when a decision-maker has a piece-wise linear value function without any probability weighting.

One possible criticism of the expected return–expected loss model is that the model predicts no diversification in a bear market when all securities yield negative returns (since the security with the smallest loss is then also the security with the highest expected return). Yet, in such a relatively rare stock market where all securities yield negative returns, the assumption that investors maintain zero return as their reference point is not realistic. In such a bear market investors would lose money anyway and they might be concerned with avoiding larger losses so that their reference point is a negative return. With a negative reference point, the expected return–expected loss model predicts diversification in a bear market when all securities yield negative returns.