Introduction

In mean–variance (MV) or \((\mu , \sigma )\)-analysis, preferences over random distributions of, say, consumption or wealth are represented by functions that depend only on the mean and the variance (or standard deviation) of consumption or wealth. In addition to being an intuitive tool in the analysis of decision making under uncertainty, MV preferences are a perfect substitute for the classical expected utility (EU) approach if all attainable distributions belong to a location–scale family (Meyer 1987). Then, risk attitudes (such as risk aversion, prudence) originally formulated in the EU approach have convenient analogues in terms of MV preferences (see, e.g., Meyer 1987; Lajeri-Chaherli 2002, 2005; Eichner and Wagener 2003b). Moreover, as argued by Meyer (1987) and others, the location–scale property is satisfied in a wide range of univariate economic decision problems. Such problems, encompassing portfolio selection (Fishburn and Porter 1976), competitive firm behavior (Sandmo 1971), co-insurance (Meyer 1992), export production (Broll et al. 2006), banking (Broll et al. 2015), and others, can then be studied equivalently in terms of either the EU or the MV approach.

In their simplest form, preferences and choices under risk are analyzed under the assumption that there is only a single source of uncertainty, a “direct” risk. The probably more relevant case of multiple risks has only recently received more attention in MV analysis. Inspired by studies on the effects of (additive) background risks on risk-taking under the EU hypothesis (see, e.g., Eeckhoudt et al. 1996; Caballé and Pomansky 1997), Wong and Ma (2008) or Eichner and Wagener (2003a, 2009) analyze quasi-linear decision problems where the MV decision maker faces both a direct, controllable risk and an exogenous background risk. Eichner and Wagener (2011a) study linear portfolio choices with several risky assets. In these studies, the different risks are additive, i.e., final wealth or consumption emerges as a linear combination of multiple random variables.Footnote 1

In this paper we survey previous studies on MV preferences in the presence of several additive risks (capturing, but not being confined to, the case of a direct risk plus a background risk). Such a linear setting is particularly suited to draw parallels between the EU and the MV approach since the location–scale property often prevails and MV and EU approach can be considered as perfect substitutes. Compared to the EU approach, where the analysis of background uncertainty is quite complex, MV analysis with its simple two-parameter utility functions has the advantage that all risk attitudes or comparative statics can be couched in terms of marginal rates of substitution between risk and return, represented, respectively, by the variance/standard deviation and the mean.

A key feature of the two-parameter structure (in combination with additively connected risks) is that it ensures that risk attitudes that have been studied for univariate sources of risks are bequeathed also to the multiple-risks scenario. We demonstrate this in the following way: we start from formal parallels between EU and MV approach, relating, e.g., to absolute and relative risk aversion, prudence, temperance, and their monotonicity properties in univariate settings (“MV preferences and EU approach” section) and then show that the attending MV concepts (in terms of marginal rates of substitution between risk and return) are preserved with several additive risks (“Additive risks and risk attitudes” section). In “Optimal decisions with additive risks” section, we apply these results to study the comparative statics of optimal risk-taking in the presence of (dependent) background risks.

Most studies on additive (background) risks both in the EU and the MV framework suppose that all risks are independently distributed (for exceptions, see Tsetlin and Winkler 2005, or Eichner and Wagener 2012). A particular advantage of MV preferences is, however, that risk attitudes and comparative statics with dependent random variables can be dealt with relatively easily, due to the fact that the variance (or standard deviation) as a measure of riskiness reduces—and limits—all dependence structures to just linear ones. As we show, background risks do not pose significant analytical problems for the MV approach within its linear confines, neither for risk attitudes nor for comparative statics of changes in the distribution and even in the dependence structure of direct and background risks. In “Additive risks and risk attitudes” and “Optimal decisions with additive risks” sections we fully characterize these features. Moreover, with the help of the analogies between MV and EU approach reported in “MV preferences and EU approach” section all MV features can be related to results for EU preferences. Although the results in “Optimal decisions with additive risks” section are actually Propositions 1 and 2 in Eichner and Wagener (2009), our contribution here is to simplify the related proofs and embed them into a comprehensive framework to make them easier to understand. As a new illustration, we apply the results obtained in “Optimal decisions with additive risks” section to study the risk-taking behavior of a banking firm with background risk in the MV model.

A frequent source of concern with respect to MV analysis is the role of higher-order derivatives of the \((\mu , \sigma )\)-function, the attending indifference maps or, once compatibility with the EU approach is assumed, of the underlying von Neumann–Morgenstern (vNM) utility index; such derivatives of general order n will also appear in “MV preferences and EU approach” section. For the EU approach, studies on higher-order moments and on higher-order risk measures indeed reveal close relations between high-order risk changes or dominance relations and higher-order derivatives of vNM utility (see Chan et al. 2016 or Niu et al. 2017 for surveys). The MV framework is, by construction, confined to changes in the first two moments. Still, concepts of (vNM) risk preferences that involve higher-order derivatives can, in many ways, be translated into two-parameter parlance; this is simply due to the fact that the signs of higher-order derivatives (and their combinations) convey more and different things than just preferences towards high-order changes in risk. For example, already Lajeri and Nielsen (2000) show that for MV analysis with normal distributions, the corresponding utility function is concave if and only if the agent has decreasing prudence. Lajeri-Chaherli (2002) presents an economic interpretation for the quasi-concavity of a MV utility function and finds that quasi-concavity plus decreasing risk aversion is equivalent to proper risk aversion, as coined by Pratt and Zeckhauser (1987) in the expected utility framework. Wagener (2002) demonstrates how prudence, risk vulnerability, temperance, and some related concepts can be meaningfully formulated in terms of two-moment, mean–standard deviation preferences. Eichner and Wagener (2003a) show the equivalence of decreasing absolute prudence in the expected utility framework and the concavity of utility as a function of mean and variance. 
Wagener (2003) finds that in the two-parameter approach, a number of plausible comparative statics already emerges under the assumption of decreasing absolute risk aversion. Moreover, risk vulnerability, temperance, and standardness imply, appropriately transferred to the MV framework, the plausible effect that risk-taking will be reduced if background risks increase. Lajeri-Chaherli (2004) assumes that the agent expects two independent, risky incomes in the future and focusses on his precautionary saving motive or equivalently consumption behavior at time zero. She finds that this framework allows for the definition of new concepts, called proper prudence, standard prudence, and precautionary vulnerability. Eichner and Wagener (2004) show that relative risk aversion being smaller than one and relative prudence being smaller than two emerge as preference restrictions that fully determine the optimal responses of decisions under uncertainty to certain shifts in probability distributions. They characterize the magnitudes of relative risk aversion and relative prudence in terms of the two-parameter approach. They also demonstrate that this characterization is instrumental in obtaining comparative static results in the two-parameter setting. Eichner (2008) transfers the concept of risk vulnerability to mean variance preferences, showing that it is equivalent to the slope of the MV indifference curve being decreasing in mean and increasing in variance. He also shows that MV vulnerability links the concepts of decreasing absolute risk aversion, risk vulnerability, properness, and standardness. These concepts are characterized in terms of MV indifference curve properties and in terms of absolute risk measures. The general equivalences presented in “MV preferences and EU approach” section are instrumental in deriving these and potentially other relations between EU and MV preferences (without leaving the linear domain).

The remainder of the paper is organized as follows: “MV preferences and EU approach” section sets up the formal framework of MV preferences and their relations to the EU approach. In that framework, “Additive risks and risk attitudes” section then studies the impact of additive risks on the shapes of indifference curves and measures for risk attitudes. “Optimal decisions with additive risks” section analyzes the comparative statics of changes in risk parameters in a generic linear decision problem with additive background uncertainty. An application to the banking firms’ risk-taking behavior is also given in this section. “Concluding remarks” section concludes.

MV preferences and EU approach

General

Suppose that \(Y, Z \ldots\) are random variables that denote final wealth, consumption, or any other valued, cardinal outcome. Denote by \(F_Y, F_Z, \ldots\) the distribution functions of, respectively, \(Y, Z, \ldots .\) A decision maker who behaves in accordance with the von Neumann–Morgenstern consistency properties then assesses lotteries (= risk distributions) by their expected utility. Specifically, lottery Z is weakly preferred to lottery Y if \(E_{F_Z} u(s) \ge E_{F_Y} u(s),\) where

$$E_{F} u(s) = \int _{-\infty }^{+\infty } u(s) \text {d}F(s)$$

and \(u: {\mathbb{R}}\rightarrow {\mathbb{R}}\) is a strictly increasing utility index. Without much loss in generality we shall assume that u is a smooth function such that \(u^{\prime}>0\) everywhere.

Let \(Y_0\) be a “seed” random variable with zero mean, unit variance, and distribution function \(F_0.\) The location–scale family \({\mathcal{D}}_{Y_0}\) generated by \(Y_0\) is then given byFootnote 2

$${\mathcal{D}}_{Y_0} = \{ Y | Y = \mu _Y + \sigma _Y Y_0, \; \sigma _Y > 0, \; -\infty< \mu _Y < \infty \}.$$
(1)

The distribution \(F_Y\) of \(Y \in {\mathcal{D}}_{Y_0}\) is \(F_Y(y)= F_0((y-\mu _Y)/\sigma _Y)\); the mean and standard deviation of Y are \(\mu _Y\) and \(\sigma _Y,\) respectively.

Following Meyer (1987), the expected utility of any lottery \(Y \in {\mathcal{D}}_{Y_0}\) can then be written as a function merely of the mean and the standard deviation of Y:

$$E_{F_Y} u(y) = \int _{-\infty }^{+\infty } u(\mu _Y + \sigma _Y s ) \;\text {d}F_{0}(s) =: U(\sigma _Y, \mu _Y).$$
(2)

If, in a decision problem, all attainable lotteries come from a location–scale family \({\mathcal{D}} \subset {\mathcal{D}}_{Y_0},\) the expected utility framework and two-parameter, MV functions are, thus, equivalent representations of preferences under risk.
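As a numerical illustration of (2), the following sketch evaluates \(U(\sigma , \mu )\) by quadrature. It assumes a standard normal seed \(Y_0\) and the CARA index \(u(y)=-e^{-ay}\); neither is imposed in the text, and the function names and parameter values are purely illustrative. For this special case the integral has the closed form \(-\exp (-a\mu + a^2\sigma^2/2),\) which serves as a check.

```python
import math

def normal_pdf(s):
    """Density of the assumed standard normal seed Y_0."""
    return math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)

def two_param_utility(sigma, mu, u, lo=-10.0, hi=10.0, steps=4000):
    """Approximate U(sigma, mu) = int u(mu + sigma*s) dF_0(s), Eq. (2),
    with composite Simpson's rule (steps must be even)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = lo + i * h
        if i == 0 or i == steps:
            weight = 1.0
        elif i % 2 == 1:
            weight = 4.0
        else:
            weight = 2.0
        total += weight * u(mu + sigma * s) * normal_pdf(s)
    return total * h / 3.0

a = 2.0  # hypothetical coefficient of absolute risk aversion

def cara(y):
    """Illustrative vNM index u(y) = -exp(-a*y); an assumption, not from the text."""
    return -math.exp(-a * y)

sigma, mu = 0.5, 1.0
numeric = two_param_utility(sigma, mu, cara)
closed_form = -math.exp(-a * mu + 0.5 * a**2 * sigma**2)
print(numeric, closed_form)
```

The two printed values should agree up to quadrature error, which illustrates that, within a location–scale family, expected utility depends on the lottery only through \((\sigma , \mu ).\)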

For a location–scale family \({\mathcal{D}} \subset {\mathcal{D}}_{Y_0},\) denote by \(M \subseteq {\mathbb{R}}_{++} \times {\mathbb{R}}\) with \(M=\{(\sigma , \mu ) | \mu + \sigma Y_0 \in {\mathcal{D}} \}\) the set of attending distribution parameters.

Parallels

It is evident from (2) that u(y) is increasing for all y if and only if \(U(\sigma , \mu )\) is increasing in μ for all \((\sigma , \mu ) \in M.\) Furthermore, the following relationships hold for all \(n \in {\mathbb{N}}\) Footnote 3:

$$\begin{aligned}& {u^{(n+1)}(y) \, \frac{<}{>} \, 0 \quad \forall y} \\ & \iff \frac{\partial^{n+1} U(\sigma , \mu )}{\partial \mu^{n+1}} \, \frac{<}{>} \, 0 \quad \forall (\sigma , \mu ) \in M \end{aligned}$$
(3)
$$\begin{aligned}&\iff \frac{\partial^n U(\sigma , \mu )}{\partial \sigma \partial \mu^{n-1}} \, \frac{<}{>} \, 0 \quad \forall (\sigma , \mu ) \in M \end{aligned}$$
(4)
$$\begin{aligned}&\iff \frac{\partial^{n+1} U(\sigma , \mu )}{\partial \mu^{n+1}} \cdot \frac{\partial^{n+1} U(\sigma , \mu )}{\partial \sigma^2 \partial \mu^{n-1}} - \left( \frac{\partial^{n+1} U(\sigma , \mu )}{\partial \sigma \partial \mu^{n}} \right)^2\;\frac{>}{<} \, 0 \quad \forall (\sigma , \mu ) \in M. \end{aligned}$$
(5)

From (3), the monotonicity properties of U with respect to μ are reflected by the monotonicity properties of u with respect to y. Analogous equivalences exist for \(U_{\mu }\) and \(u^{\prime},\) and so forth. Equation (4) shows that \(u^{(n+1)}(y)\) is equal in sign to the \((n-1)\)st derivative of \(U_\sigma\) with respect to μ. Finally, Eq. (5) identifies the curvature properties of \(\partial^{n-1} U/\partial \mu^{n-1}\) as being determined by the curvature of \(u^{(n-1)}(y)\) (i.e., the sign of \(u^{(n+1)}\)). For \(n=1,\) (4) and (5) already appear in Meyer (1987) who shows that \(U(\sigma , \mu )\) is strictly decreasing in σ and concave in \((\sigma , \mu )\) if and only if \(u^{\prime \prime}(y)<0\) everywhere.Footnote 4

For \(n \ge 1\) define by

$$C_n(u) =\left\{ (\sigma ,\mu ) \left| \frac{\partial^{n-1} U(\sigma ,\mu )}{\partial \mu^{n-1}} \equiv u \right. \right\}$$
(6)

the u-level set for \(\frac{\partial^{n-1} U(\sigma ,\mu )}{\partial \mu^{n-1}}.\) Here, \(C_1(u)\) is the familiar \((\sigma , \mu )\)-indifference curve at utility level u. Similarly, \(C_2(u)\) comprises all \((\sigma , \mu )\)-combinations where a marginal increase in μ gives the same additional utility u, etc.

Elements in \(C_n(u)\) can be characterized in terms of marginal rates of substitution: For \(n\ge 1\) define

$$S_n(\sigma , \mu ):= -\frac{\partial^n U(\sigma , \mu )}{\partial \sigma \partial \mu^{n-1}} \Bigg/\frac{\partial^n U(\sigma , \mu )}{\partial \mu^n},$$
(7)

where \(S_1\) is the marginal rate of substitution between μ and σ for utility function U; likewise \(S_n\) is the marginal rate of substitution between μ and σ for \(\frac{\partial^{n-1} U(\sigma ,\mu )}{\partial \mu^{n-1}}.\) Then the level sets \(C_n(u)\) can be represented as curves with slopes

$$\left. \frac{\text {d} \mu }{\text {d} \sigma } \right| _{(\sigma , \mu ) \in C_n(u)}= S_n(\sigma , \mu ).$$
(8)
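The definitions (7) and (8) can be made concrete with a small numerical sketch. It assumes, purely for illustration, the closed-form two-parameter utility \(U(\sigma , \mu ) = -\exp (-a\mu + a^2\sigma^2/2)\) that arises from a CARA index with a normal seed; the coefficient a, the function names, and the starting point are hypothetical. The code computes \(S_1\) by central finite differences and then follows an indifference curve in \(C_1(u)\) by integrating \(\text {d}\mu = S_1 \, \text {d}\sigma\) with small Euler steps.

```python
import math

A = 2.0  # hypothetical CARA coefficient

def U(sigma, mu):
    """Two-parameter utility for u(y) = -exp(-A*y) with a normal seed (assumed)."""
    return -math.exp(-A * mu + 0.5 * A**2 * sigma**2)

def mrs_1(sigma, mu, h=1e-5):
    """S_1 = -U_sigma / U_mu from Eq. (7), via central finite differences."""
    u_sigma = (U(sigma + h, mu) - U(sigma - h, mu)) / (2.0 * h)
    u_mu = (U(sigma, mu + h) - U(sigma, mu - h)) / (2.0 * h)
    return -u_sigma / u_mu

# Follow the indifference curve from sigma = 0.2 to sigma = 1.0 using
# d mu = S_1 d sigma, Eq. (8), with small Euler steps.
sigma, mu = 0.2, 1.0
level_start = U(sigma, mu)
steps = 1000
d_sigma = (1.0 - sigma) / steps
for _ in range(steps):
    mu += mrs_1(sigma, mu) * d_sigma
    sigma += d_sigma
level_end = U(sigma, mu)

print(mrs_1(0.5, 1.0), A * 0.5)   # in this special case S_1 equals A * sigma
print(level_start, level_end)     # utility stays (approximately) constant
```

In this special case \(S_1 = a\sigma\) does not depend on μ, so the indifference curves are vertical translates of one another; other utility indexes will generally not share this feature.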

For vNM utility indexes u, the class of absolute measures of risk attitudes in the EU approach is defined by

$$A_n(y):= -\frac{u^{(n+1)}(y)}{u^{(n)}(y)}$$
(9)

(\(y \in {\mathbb{R}},\) \(n \in {\mathbb{N}}\)). \(A_1\) is the Arrow–Pratt measure of absolute risk aversion (Lajeri and Nielsen 2000; Ormiston and Schlee 2001), while \(A_2,\) \(A_3,\) \(A_4\) are, respectively, the measures of absolute prudence (Kimball 1990), absolute temperance (Eeckhoudt et al. 1996), and edginess (Lajeri-Chaherli 2004). Analogously, relative measures of risk attitude can be constructed: for \(y, z \in {\mathbb{R}}\) and \(n \in {\mathbb{N}}\) set

$$R_n(y,z):= - z \cdot \frac{u^{(n+1)}(y+ z)}{u^{(n)}(y+ z)}.$$
(10)

For \(n=1\) this yields the index of partial relative risk aversion as introduced by Menezes and Hanson (1970). \(R_2\) and \(R_3\) are, respectively, the indices of partial relative prudence (Choi et al. 2011) and partial relative temperance (Honda 1985).

Meyer (1987, Property 5) shows that the MRS \(S_1\) is the two-parameter equivalent of the Arrow–Pratt measure \(A_1\) of absolute risk aversion. For higher values of n, similar analogies were derived in Eichner and Wagener (2005). In particular, as can be inferred from (3) and (4), if expected utility approach and two-parameter approach are compatible, then for all \(n \in {\mathbb{N}}\)

$$A_n(y) \ge 0 \;\forall y \quad \iff \quad S_n(\sigma , \mu ) \ge 0 \; \forall (\sigma , \mu ) \in M.$$
(11)

For \(n=1,\) the relationship between (11) and (5) has already been made in or Meyer (1987). As these authors note, they cover the following cases:

  • if \(u^{\prime \prime}(y)< 0 < u^{\prime}(y)\) for all y, then \((\sigma , \mu )\)-indifference curves are strictly convex upward in \((\sigma , \mu )\)-space: the compensation in terms of μ needed for an increase in uncertainty is always positive and increases in the level of uncertainty (risk aversion);

  • if \(u^{\prime}(y), u^{\prime \prime}(y) >0\) for all y, then \((\sigma , \mu )\)-indifference curves are concave downward: μ needs to be reduced to compensate for an increase in uncertainty, and this reduction decreases in the level of uncertainty (risk-seeking attitude);

  • if \(u^{\prime}(y) >0=u^{\prime \prime}(y)\) for all y, then \((\sigma , \mu )\)-indifference curves are parallel to the σ-axis (risk neutrality).

Similar interpretations arise for \(n>1.\) E.g., for \(n=2,\) a prudent and risk-averse decision maker (\(u'''>0>u^{\prime \prime}\)) faced with an increase in uncertainty will require an increase in μ to keep his marginal utility from μ constant.

Interestingly, analogies extend to monotonicity properties as well:

Result 2.1

(Eichner and Wagener 2005, Proposition 1) For all \(n \in {\mathbb{N}},\)

$$A_n'(y)\, \frac{<}{>}\, 0 \quad \forall y \iff \quad \frac{\partial S_n(\sigma ,\mu )}{\partial \mu} \, \frac{<}{>} \; 0 \quad \forall (\sigma , \mu ) \in M;$$
(12)
$$\frac{\partial R_n (y, z)}{\partial z} \ge 0 \quad \forall y \iff \quad \frac{\partial S_n(\sigma , \mu )}{\partial \sigma } \ge 0 \quad \forall (\sigma , \mu ) \in M.$$
(13)

In the case \(n=1,\) the equivalences in (12) and (13) mean that risk aversion for \((\sigma , \mu )\)-utility functions (as measured by \(S_1\)) (i) decreases [increases] in μ if the underlying vNM-index exhibits decreasing [increasing] absolute risk aversion and (ii) increases [decreases] in σ if the vNM-index exhibits increasing [decreasing] partial relative risk aversion. As Menezes and Hanson (1970) argue, if one wants \(R_1\) to be monotone in z everywhere, then this is only compatible with \(A_1>0\) if \(R_1\) strictly increases. Hence, \(\partial S_1/\partial \sigma <0\) can then at most be a local property. Moreover, decreasing absolute risk aversion (\(A_1(y)> 0 > A_1^{\prime}(y)\)) implies that \(S_1(\sigma , \mu )\) is increasing in σ (for details, see Eichner and Wagener 2005).
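The equivalence (12) for \(n=1\) can be checked numerically in a simple special case. The sketch below assumes the hypothetical index \(u(y) = y - e^{-y},\) for which \(A_1(y) = 1/(1+e^{y})\) is positive and decreasing, together with a standard normal seed, so that \(U(\sigma , \mu ) = \mu - \exp (-\mu + \sigma^2/2)\) and \(S_1(\sigma , \mu ) = \sigma /(1+\exp (\mu - \sigma^2/2)).\) Neither the index nor the grid is taken from the text; both are chosen only for illustration.

```python
import math

def abs_risk_aversion(y):
    """A_1(y) = -u''(y)/u'(y) for the hypothetical index u(y) = y - exp(-y)."""
    return 1.0 / (1.0 + math.exp(y))

def mrs_1(sigma, mu):
    """S_1 for U(sigma, mu) = mu - exp(-mu + sigma**2 / 2), i.e. the index
    above combined with a standard normal seed (an assumption)."""
    return sigma / (1.0 + math.exp(mu - 0.5 * sigma**2))

def d_dmu(f, sigma, mu, h=1e-6):
    """Central finite difference of f(sigma, mu) with respect to mu."""
    return (f(sigma, mu + h) - f(sigma, mu - h)) / (2.0 * h)

# Illustrative grid of (sigma, mu) points.
grid_sigma = [0.1 * i for i in range(1, 21)]
grid_mu = [-2.0 + 0.25 * j for j in range(17)]

# Left-hand side of (12): absolute risk aversion falls in y.
dara = all(abs_risk_aversion(y + 0.01) < abs_risk_aversion(y) for y in grid_mu)

# Right-hand side of (12): the MRS S_1 falls in mu at every grid point.
s1_falls_in_mu = all(d_dmu(mrs_1, s, m) < 0.0 for s in grid_sigma for m in grid_mu)

print(dara, s1_falls_in_mu)
```

A grid check of this kind is of course no proof; it merely illustrates the direction of the equivalence for one admissible utility index.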

The cases \(n>1\) are analogous to \(n=1,\) lifting the relationships between absolute and partial relative measures of risk attitude for vNM-functions and the marginal rates of substitution \(S_n\) to higher orders.

Additive risks and risk attitudes

General

How does the addition of risks (e.g., via background uncertainty in one’s investment) affect risk attitudes? Specifically, if an additive uncertainty B, also measured in terms of final wealth, changes returns on a risky activity from X to \(Y=X+B,\) how are risk preferences affected? To ensure transferability to the EU approach, we require that the location–scale framework still applies and make the following

Assumption 3.1

   Let \(X_0\) and \(B_0\) be two seed variables with attending location–scale families \({\mathcal{D}}_{X_0}\) and \({\mathcal{D}}_{B_0}.\) Then the set of all \(Y = X+B\) with \(X \in {\mathcal{D}}_{X_0}\) and \(B \in {\mathcal{D}}_{B_0}\) forms a location–scale family \({\mathcal{D}}_{Y_0}\) with seed \(Y_0.\)

We note that in Assumption 3.1 \(Y_0\) need not be equal to \(X_0+B_0.\) Under Assumption 3.1 we have \(Y= X+B = \mu _X + \mu _B + \sigma _X X_0 + \sigma _B B_0,\) implying that \(\mu _Y = \mu _X + \mu _B\) and \(\sigma _Y = \sqrt{\sigma _X^2 + \sigma _B^2 + 2 \sigma _{XB}},\) where the covariance between X and B, \(\sigma _{XB} = \rho \sigma _X \sigma _B,\) measures the linear dependence of X and B; \(\rho \in (-1,1)\) denotes (Pearson’s) correlation coefficient. Denote by \(F_{XB}(x,b)\) the joint distribution of \((X, B).\)

Assumption 3.1 will, e.g., be satisfied if \(X_0\) is equal in distribution to \(B_0,\) both are independent, and \(X_0\) adheres to a stable distribution; \(X_0+B_0\) then even inherits the type of distribution. Moreover, if both \(X_0\) and \(B_0\) are elliptically distributed (but not necessarily identically or independently), then so is their sum (Fang et al. 1990, Theorem 2.16).Footnote 5 This encompasses, e.g., that \(X_0, B_0 \sim N(0,1)\) such that \(X+B \sim N \big (\mu _X+\mu _B, \sqrt{Var(X+B)} \big )\); the same holds if \(X_0\) and \(B_0\) are gamma-distributed with equal scale parameter.

Assumption 3.1 allows for dependence between the two random variables. While independence is routinely assumed in the EU literature on background risks, the MV approach can quite easily cater for dependent background risks. In fact, if we were assuming independence, then for elliptical distributions Assumption 3.1 essentially confines the analysis to X and B both being Gaussian (Fang et al. 1990, Theorem 4.11).

Under Assumption 3.1, the expected utility from random variable \(Y = X+ B\) can be represented by

$$E_{F_Y} u(y)= E[ u(X+B)] = \int _{-\infty }^\infty u(\mu _Y+ \sigma _Ys) \;{\text {d}}F_0(s) =: U(\sigma _Y, \mu _Y),$$
(14)

where \(F_0(s)\) is the distribution function of the seed variable \(Y_0.\) If X and B are independent, the density of \(X+B\) can be obtained as the convolution of the densities of X and B; otherwise it cannot. Here, \(U(\sigma _Y, \mu _Y )\) in (14) represents expected utility in two-parameter, mean–standard deviation form.

The impact of (greater) additive uncertainty on risk attitudes can now be studied by help of our previous observations. Assumption 3.1 essentially implies that all risk attitudes (and their monotonicity properties) in the absence of background uncertainty remain unchanged if background risks are added.

Changes in location parameters

Taking the partial derivative with respect to \(\mu _B\) or \(\mu _X\) captures the effects of a shift in risks. They are identical to the standard wealth or income effects that arise when some exogenous, non-risky wealth changes. In particular, as a straightforward implication from Result 2.1, Eq. (12), we obtain

Corollary 3.1

For \(k=X,B,\)

$$\frac{\partial S_n(\sigma ,\mu _X+\mu _B)}{\partial \mu _k} \, \frac{>}{<} \, 0 \quad \forall (\sigma , \mu ) \in M \quad \iff \quad A_n'(y) \, \frac{>}{<} \, 0 \quad \forall y.$$
(15)

Hence, a higher expected return on any risk makes decision makers more [less] risk-averse if absolute risk aversion is increasing [decreasing] in income (\(n=1\)). It makes them more [less] prudent if absolute prudence rises [diminishes] with income (\(n=2\)); and similar for higher degrees.

Changes in scale parameters

The variance of final wealth is given by

$$\sigma _Y^2 = \sigma _X^2 + \sigma _B^2 + 2 \rho \sigma _X \sigma _B.$$
(16)

In this decomposition, we can separate changes in the marginal distributions of X and B from changes in their dependence structure, represented by ρ. In fact, in the realm of elliptical distributions, where MV analysis is most appropriate, the Pearson correlation coefficient adequately captures how and how strongly X and B hang together (Landsman and Tsanakas 2006).

The partial derivatives with respect to \(\sigma _B\) or \(\sigma _X\) capture the effects of changes in the riskiness of the single risks B or X. Observe that

$$\frac{\partial S_n\left( \sqrt{\sigma _X^2+\sigma _B^2 + 2 \rho \sigma _X \sigma _B},\mu \right) }{\partial \sigma _k} = \frac{\partial \sigma _Y}{\partial \sigma _k} \cdot \frac{\partial S_n\left( \sigma _Y, \mu \right) }{\partial \sigma _Y} = \underbrace{\frac{\sigma^2_k + \sigma _{XB}}{\sigma _Y \sigma _k}}_{(i)} \cdot \underbrace{\frac{\partial S_n\left( \sigma _Y,\mu \right) }{\partial \sigma _Y}}_{(ii)}.$$
(17)

Hence, the effect of an increase in \(\sigma _k\) (for \(k=X,B\)) on risk attitudes depends on (i) how that change affects overall riskiness \(\sigma _Y\) and (ii) the risk attitude proper.

As for (i), an increase in either \(\sigma _X\) or \(\sigma _B\) does not necessarily increase \(Var(X+B)\); increases in marginal risks may well be beneficial in the MV framework. This reflects that, with the variance (or standard deviation) as the risk measure, increases in the riskiness of the marginals are not preserved under linear combinations of dependent random variables. Increases in \(\sigma _k\) will only raise \(\sigma _Y\) if \(\sigma _{XB} >-\sigma^2_k.\) This is the case if (but not only if) X and B are independent or positively correlated.Footnote 6
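Term (i) in (17) is easy to explore numerically. The following sketch evaluates \(\sigma _Y\) from (16) and \(\partial \sigma _Y/\partial \sigma _X\) for a few correlation coefficients; the parameter values and function names are hypothetical and chosen so that a strongly negative correlation makes an increase in \(\sigma _X\) reduce overall riskiness.

```python
import math

def sigma_y(sigma_x, sigma_b, rho):
    """Standard deviation of Y = X + B, Eq. (16)."""
    return math.sqrt(sigma_x**2 + sigma_b**2 + 2.0 * rho * sigma_x * sigma_b)

def d_sigma_y_d_sigma_x(sigma_x, sigma_b, rho):
    """Term (i) in Eq. (17) for k = X: (sigma_X^2 + sigma_XB) / (sigma_Y * sigma_X)."""
    cov = rho * sigma_x * sigma_b
    return (sigma_x**2 + cov) / (sigma_y(sigma_x, sigma_b, rho) * sigma_x)

# Hypothetical parameter values: B is riskier than X.
sigma_x, sigma_b = 1.0, 2.0
for rho in (0.3, 0.0, -0.9):
    slope = d_sigma_y_d_sigma_x(sigma_x, sigma_b, rho)
    # Condition from the text: sigma_Y rises in sigma_X iff sigma_XB > -sigma_X^2.
    raises_risk = rho * sigma_x * sigma_b > -sigma_x**2
    print(f"rho={rho:+.1f}  d sigma_Y/d sigma_X = {slope:+.4f}  raises sigma_Y: {raises_risk}")
```

With these values the derivative is positive for the two non-negative correlations and negative for \(\rho = -0.9,\) in line with the condition \(\sigma _{XB} >-\sigma^2_k.\)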

As for effect (ii) in (17), condition (13) in Result 2.1 applies:

Corollary 3.2

For all n, if \(\partial \sigma _Y/\partial \sigma _k>0,\) then for \(k=X,B,\)

$$\frac{\partial S_n\left( \sqrt{\sigma _X^2+\sigma _B^2 + 2 \sigma _{XB}},\mu \right) }{\partial \sigma _k} \, \frac{>}{<} \, 0 \quad \forall (\sigma , \mu ) \in M \quad \iff \quad R_n'(y,z) \, \frac{>}{<} \, 0 \quad \forall y.$$
(18)

This observation conveys that, if a greater marginal riskiness makes total wealth riskier, this renders decision makers more [less] risk-averse if relative risk aversion is increasing [decreasing] in income (\(n=1\)). It makes them more [less] prudent if relative prudence rises [diminishes] with income (\(n=2\)); and similar for higher degrees of n. In case a greater marginal riskiness makes total wealth safer, the results are reversed.

Changes in the dependence structure

With the decomposition (16), an increase in ρ represents that X and B move more closely together (with invariant marginals). An increase in ρ is detrimental to utility as it also increases \(\sigma _Y.\) Hence, we can directly apply (13) from Result 2.1 again:

Corollary 3.3

For all n,

$$\frac{\partial S_n\left( \sqrt{\sigma _X^2+\sigma _B^2 + 2 \rho \sigma _{X} \sigma _B},\mu \right) }{\partial \rho } \, \frac{>}{<} \, 0 \quad \forall (\sigma , \mu ) \in M \quad \iff \quad R_n'(y,z) \, \frac{>}{<} \, 0 \quad \forall y.$$
(19)

For the interpretation of (19), we can refer to the discussion of Corollary 3.2 above.

Compensatory changes in risks

Suppose that the riskiness changes: one of \(s=\sigma _X, \sigma _B, \rho\) varies by, say, \(d s>0.\) Then, by (8), the compensatory change \(d \mu _Y\) that keeps the investor in \(C_n(u)\) is given by

$$\text {d} \mu _Y = \frac{\partial \sigma _Y}{\partial s} \cdot S_n(\sigma _Y, \mu _Y) \, \text {d} s.$$

For \(n=1\) such compensatory changes have been studied, e.g., in Wong and Ma (2008). With \(n=1,\) (8) simply restates that risk-averse [risk-loving] individuals, who feel worse [better] when overall riskiness \(\sigma _Y\) rises, can be compensated for such a rise by an increase [decrease] in \(\mu _Y= \mu _B + \mu _X\) in the magnitude of the MRS; whether an increase in marginal riskiness or correlation raises \(\sigma _Y\) depends on the sign of \(\partial \sigma _Y/\partial s.\)

For \(n=2,\) (8) captures that if a decision maker wishes to keep the marginal utility from wealth unchanged in the wake of a more pronounced riskiness, this requires an increase in μ if he is either risk-averse and prudent (\(u^{\prime \prime}(y)<0<u'''(y)\) for all y) or risk-loving and imprudent (\(u^{\prime \prime}(y)>0>u'''(y)\) for all y), and a decrease in μ otherwise.

Optimal decisions with additive risks

Set-up

It is well known that changes in risk attitudes do not necessarily lead to the intuitively expected changes in decision maker’s behavior. For example, somebody who becomes more risk-averse upon a change in risk does not necessarily engage in less risky activities upon that change in risk. Against that backdrop, it is informative to see how additive risks affect risky choices in a generic decision problem.

Suppose a risk-averse decision maker (with \(U_\sigma<0< U_\mu\)) faces some exogenous risk B with non-negative mean \(\mu _B\ge 0\) and standard deviation \(\sigma _B \ge 0\) (where \(\sigma _B=0\) captures the case of some non-random, exogenous wealth). In the presence of this “background risk” he sets a variable \(\alpha \in {\mathbb{R}}\) that linearly increases his exposure to some risk X (with \(\mu _X, \sigma _X>0\)). Letting the covariance of X and B be \(\sigma _{XB},\) we put all location and scale parameters in the vector \(\theta =(\mu _X, \sigma _X, \mu _B, \sigma _B,\sigma _{XB}) \in {\mathbb{R}}_{++}^2 \times {\mathbb{R}}_+^2\times {\mathbb{R}}\) for notational convenience. Given \(\theta ,\) the decision maker maximizes his utility \(U(\sigma _Y, \mu _Y)\) with

$$\mu _Y = \alpha \mu _X +\mu _B \quad \text{ and } \quad \sigma _Y^2= \alpha^2 \sigma^2_X + \sigma _B^2+2\alpha \sigma _{XB}.$$
(20)

This could represent, for example, a stylized portfolio choice or (mutatis mutandis) an insurance problem with background uncertainty. From the first-order condition, we obtain

$$\mu _X- S_1(\sigma _Y, \mu _Y)\frac{\partial \sigma _Y}{\partial \alpha }=0$$
(21)

which implicitly defines the optimal choice \(\alpha^{\ast}=\alpha^{\ast}(\theta).\) Here, \(\frac{\partial \sigma _Y}{\partial \alpha }=\frac{\alpha \sigma^2_X+\sigma _{XB}}{\sigma _Y}.\) We will henceforth assume that α denotes a risky activity in the sense that it marginally increases the standard deviation of final wealth at its optimal level \(\alpha^{\ast}.\) That is, we assume that

$$\frac{\partial \sigma _Y}{\partial \alpha^{\ast}}>0.$$
(22)

The condition in (22) will automatically hold whenever X and B are non-negatively correlated but not for negative values of \(\sigma _{XB}.\) Thus, we need to assume (22).
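To see how (20)–(22) fit together, the sketch below solves the first-order condition (21) by bisection. It assumes, for illustration only, the two-parameter utility \(U(\sigma , \mu ) = \mu - \exp (-\mu + \sigma^2/2)\) (which corresponds to the hypothetical index \(u(y)=y-e^{-y}\) with a normal seed) and made-up parameter values for \(\theta .\) The bisection further assumes that the left-hand side of (21) is positive at the lower bracket, negative at the upper bracket, and decreasing in α in between, as the second-order condition suggests.

```python
import math

def mrs_1(sigma, mu):
    """S_1 for the hypothetical U(sigma, mu) = mu - exp(-mu + sigma**2 / 2)."""
    return sigma / (1.0 + math.exp(mu - 0.5 * sigma**2))

def moments(alpha, mu_x, sigma_x, mu_b, sigma_b, cov_xb):
    """Mean and standard deviation of final wealth, Eq. (20)."""
    mu_y = alpha * mu_x + mu_b
    var_y = alpha**2 * sigma_x**2 + sigma_b**2 + 2.0 * alpha * cov_xb
    return mu_y, math.sqrt(var_y)

def d_sigma_y_d_alpha(alpha, sigma_x, cov_xb, sigma_y):
    """Marginal effect of alpha on sigma_Y, as stated below Eq. (21)."""
    return (alpha * sigma_x**2 + cov_xb) / sigma_y

def foc(alpha, theta):
    """Left-hand side of the first-order condition (21)."""
    mu_x, sigma_x, mu_b, sigma_b, cov_xb = theta
    mu_y, sigma_y = moments(alpha, *theta)
    return mu_x - mrs_1(sigma_y, mu_y) * d_sigma_y_d_alpha(alpha, sigma_x, cov_xb, sigma_y)

def solve_alpha(theta, lo=0.0, hi=10.0, tol=1e-12):
    """Bisection on (21); assumes foc(lo) > 0 > foc(hi) and a decreasing FOC."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid, theta) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# theta = (mu_X, sigma_X, mu_B, sigma_B, sigma_XB); all values hypothetical.
theta = (0.1, 0.5, 0.0, 0.3, 0.02)
alpha_star = solve_alpha(theta)
mu_y, sigma_y = moments(alpha_star, *theta)
# Condition (22): the activity is risky at the optimum.
risky = d_sigma_y_d_alpha(alpha_star, theta[1], theta[4], sigma_y) > 0.0
print(alpha_star, foc(alpha_star, theta), risky)
```

Bisection is used here only because it is transparent; any one-dimensional root finder would serve equally well under the same bracketing assumption.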

The signs of the comparative statics with respect to the distribution parameters in \(\theta\) are obtained by applying the implicit function theorem to (21), taking into account that the SOC for \(\alpha^{\ast}(\theta )\) requires that the derivative of the left-hand side of (21) is negative. A common intuition for the comparative statics to come can be gained from interpreting (21) geometrically: it defines the optimal choice, \(\alpha^{\ast},\) as a situation where the slope, \(S_1,\) of a decision maker’s \((\sigma _Y, \mu _Y)\)-indifference curve is equal to the slope, given by \(\mu _X/(\partial \sigma _Y/\partial \alpha),\) of the “opportunity locus,” which defines the marginal trade-off between the increases in return and in risk to which the choice problem (20) exposes the decision maker. Whether and in which direction the optimal choice drifts when a parameter of the choice problem varies then depends on whether the marginal rate of substitution between risk and return varies relatively more strongly than the slope of the opportunity locus. This gives rise to the elasticity considerations in Eqs. (23)–(27) below. It also explains why the comparative statics with respect to parameters related to the “endogenous” direct risk differ qualitatively from those for the exogenous background risk: the exposure to the former is a chosen one (via α), whereas the exposure to the latter cannot be avoided (but at best be indirectly reduced, via a covariance effect). In essence, this makes the comparative statics with respect to the background risk simpler, which is in marked contrast to the EU framework.

In full detail, the elasticity intuition for comparative statics in the MV framework is developed in Eichner and Wagener (2009, pp. 1145ff), which also includes a discussion of the differences between studying background risk in the MV model and in the conventional expected utility model.

Changes in the background risk

Starting with the background risk B, we get

Result 4.1

The comparative statics of the optimal choice \(\alpha^{\ast}\) with respect to the background risk are characterized by

$$\begin{aligned} \frac{\partial \alpha^{\ast}(\theta )}{\partial \mu _B} \, \ge 0 \quad \forall \theta \iff \quad \frac{\partial S_1}{\partial \mu _Y} \, \le \, 0 \quad \forall (\sigma _Y, \mu _Y) \in M ; \end{aligned}$$
(23)
$$\frac{\partial \alpha^{\ast}(\theta )}{\partial \sigma _B} \le 0 \quad \forall \theta\iff \quad \frac{\partial S_1}{\partial \sigma _Y} \cdot \frac{\sigma _Y}{S_1} \ge 1 \quad \forall (\sigma _Y, \mu _Y) \in M.$$
(24)

Proof

Equation (23) is an immediate implication of (15). To arrive at (24) observe that

$$\begin{aligned} \frac{\partial \alpha^{\ast}(\theta )}{\partial \sigma _B} \le 0 \quad \forall \theta & \iff \frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial S_1}{\partial \sigma _Y}\frac{\partial \sigma _Y}{\partial \sigma _B}+S_1\frac{\partial (\partial \sigma _Y/\partial \alpha )}{\partial \sigma _Y}\frac{\partial \sigma _Y}{\partial \sigma _B}\ge 0 \\ & \iff \frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial S_1}{\partial \sigma _Y} -S_1\frac{\partial \sigma _Y}{\partial \alpha }\frac{1}{\sigma _Y}\ge 0 \\ & \iff \frac{\partial S_1}{\partial \sigma _Y} \cdot \frac{\sigma _Y}{S_1} \ge 1 \quad \forall (\sigma _Y, \mu _Y) \in M. \end{aligned}$$

\(\square\)

From (23), a risk-averse decision maker increases risk-taking upon an upward shift in the location of a dependent background risk if and only if his preferences exhibit decreasing absolute risk aversion (also cf. (15)).

He will reduce risk-taking in response to an increase in the scale of the background risk if the elasticity of his risk aversion with respect to the riskiness of final wealth is larger than one. Comparing (24) with (18) (for \(n=1\) and \(k=B\)) we observe: in order that a greater background risk reduces risk-taking (\(\partial \alpha^{\ast}/\partial \sigma _B <0\)), it does not suffice that the decision maker gets more risk-averse; \(\partial S_1/\partial \sigma _B\) being positive is necessary, but not sufficient for \(\partial \alpha^{\ast}/\partial \sigma _B\) to be negative.
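The unit-elasticity threshold in (24) can be illustrated numerically. The following is a minimal sketch, not taken from the source: it assumes \(Y = \alpha X + B\) with \(\sigma_Y^2 = \alpha^2\sigma_X^2 + 2\alpha\sigma_{XB} + \sigma_B^2\) and a hypothetical power-form marginal rate of substitution \(S_1 = c\,\sigma_Y^{\gamma}\), whose elasticity with respect to \(\sigma_Y\) is the constant \(\gamma\). All function and parameter names are illustrative.

```python
import math

def sigma_y(alpha, s_x, s_b, s_xb):
    """Standard deviation of Y = alpha*X + B."""
    return math.sqrt(alpha**2 * s_x**2 + 2.0 * alpha * s_xb + s_b**2)

def foc(alpha, mu_x, s_x, s_b, s_xb, c, gamma):
    """First-order condition mu_X - S_1 * d(sigma_Y)/d(alpha).

    Assumption (not from the source): S_1 = c * sigma_Y**gamma.
    """
    s_y = sigma_y(alpha, s_x, s_b, s_xb)
    d_sigma_d_alpha = (alpha * s_x**2 + s_xb) / s_y
    return mu_x - c * s_y**gamma * d_sigma_d_alpha

def alpha_star(mu_x, s_x, s_b, s_xb=0.0, c=1.0, gamma=1.0, lo=0.0, hi=50.0):
    """Solve the first-order condition by bisection.

    Assumes foc(lo) > 0 > foc(hi), which holds for the parameters used below.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid, mu_x, s_x, s_b, s_xb, c, gamma) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Compare the optimal choice before and after an increase in sigma_B.
for gamma in (0.5, 1.0, 2.0):
    before = alpha_star(0.5, 1.0, 1.0, gamma=gamma)
    after = alpha_star(0.5, 1.0, 1.5, gamma=gamma)
    print(f"gamma={gamma}: alpha* = {before:.4f} -> {after:.4f}")
```

Under these assumed parameters, the optimal choice falls when \(\gamma = 2,\) is unchanged when \(\gamma = 1,\) and rises when \(\gamma = 0.5,\) in line with the threshold of one in (24).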

Changes in the direct risk

The comparative statics with respect to the direct risk X are slightly more difficult to characterize. They can be framed, however, in terms of the concepts of risk attitudes introduced in the “MV preferences and EU approach” section:

Result 4.2

The comparative statics of the optimal choice \(\alpha^{\ast}\) with respect to the direct risk are characterized by

$$\frac{\partial \alpha^{\ast}(\theta )}{\partial \mu _X} \ge 0 \quad \forall \theta \iff \quad \frac{\partial S_1}{\partial \mu _Y} \cdot \frac{\mu _Y}{S_1} \le 1 \quad \forall (\sigma _Y, \mu _Y) \in M;$$
(25)
$$\frac{\partial \alpha^{\ast}(\theta )}{\partial \sigma _X} \le 0 \quad \forall \theta \iff \quad \frac{\partial S_1}{\partial \sigma _Y} \cdot \frac{\sigma _Y}{S_1} \ge -1 \quad \forall (\sigma _Y, \mu _Y) \in M.$$
(26)

Proof

Condition (25) is obtained from differentiating the LHS of (21) with respect to \(\mu _X\) and then using (21) again:

$$\frac{\partial }{\partial \mu _X} \left( \mu _X- S_1\frac{\partial \sigma _Y}{\partial \alpha }\right) = 1-\frac{\partial S_1}{\partial \mu _Y}\frac{\partial \sigma _Y}{\partial \alpha }\alpha = 1-\frac{\partial S_1}{\partial \mu _Y}\frac{\alpha \mu _X}{S_1}.$$

Now observe that \(\mu _Y \ge \alpha \mu _X.\) For (26), differentiate the negative of the LHS of (21) with respect to \(\sigma _X\):

$$\begin{aligned} \frac{\partial }{\partial \sigma _X} \left( S_1\frac{\partial \sigma _Y}{\partial \alpha }-\mu _X\right) &= \frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial S_1}{\partial \sigma _Y}\frac{\partial \sigma _Y}{\partial \sigma _X}+ S_1\left[ \frac{\partial \sigma _Y}{\partial \alpha }(-\frac{1}{\sigma _Y})\frac{\partial \sigma _Y}{\partial \sigma _X} +\frac{2\alpha \sigma _X}{\sigma _Y}\right] \\ &= \frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial \sigma _Y}{\partial \sigma _X}\left[ \frac{\partial S_1}{\partial \sigma _Y}-\frac{S_1}{\sigma _Y} \left( 1-2 \frac{\alpha \sigma _X}{\frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial \sigma _Y}{\partial \sigma _X}} \right) \right] . \end{aligned}$$

Here we get

$$\frac{\alpha \sigma _X}{\frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial \sigma _Y}{\partial \sigma _X}} =\frac{1}{\frac{\partial \sigma _Y}{\partial \alpha }\frac{\alpha }{\sigma _Y}}=\frac{\sigma^2_Y}{\alpha^2\sigma^2_X+\alpha \sigma _{XB}}.$$

Further assume that \(\alpha \sigma _{XB}+\sigma^2_B\ge 0,\) which holds when X and B are non-negatively correlated. Since \(\sigma _Y^2 = (\alpha^2 \sigma _X^2+\alpha \sigma _{XB}) + (\alpha \sigma _{XB}+\sigma^2_B),\) we then have \(\frac{\sigma _Y^2}{\alpha^2 \sigma _X^2+\alpha \sigma _{XB}} \ge 1,\) where the lower bound is reached at \(\alpha \sigma _{XB}+\sigma^2_B=0.\) Hence, the expression in round brackets above never exceeds \(-1\); a lower bound can exist when X and B are negatively correlated. \(\square\)

From (25), the decision maker will increase risk-taking in response to an increase in the expected return of his activity if the elasticity of his risk aversion with respect to expected wealth is smaller than one. This condition has an expected utility analogue, too. As shown in Eichner and Wagener (2014), if the EU and MV approaches are compatible, then the wealth elasticity of MV risk aversion being smaller than one is equivalent to the index of partial relative risk aversion, \(R_1(a,y-a)\) (cf. (10)), being smaller than one for all \(a>0.\) Hadar and Seo (1990) and Dionne and Gollier (1992) have shown that this condition characterizes the comparative static effects for first-order stochastic dominance shifts in the returns to a risky activity, of which an increase in \(\mu _X\) is the MV analogue.

Condition (26) says that the decision maker will decrease risk-taking in response to an increase in the variance of his activity if the elasticity of his risk aversion with respect to wealth risk is larger than \(-1.\) Again this condition, which was originally derived in Battermann et al. (2002) and Broll et al. (2006), has an EU analogue, viz. that the index of partial relative risk prudence, \(R_2(a,y-a)= -(y-a) \frac{u'''(y)}{u^{\prime \prime}(y)}\) (again cf. (10)), is smaller than 2 for all \(a>0\) (Eichner and Wagener 2005). Ormiston and Schlee (2001) identify this as the condition under which a mean-preserving spread in the returns to a risky activity tempers risk-taking, of which an increase in \(\sigma _X\) is the MV analogue here.

Changes in the dependence between the direct risk and the background risk

We now turn to the comparative statics with respect to the dependence between the direct risk and the background risk. They can again be framed in terms of the concepts of risk attitudes introduced in the “MV preferences and EU approach” section:

Result 4.3

The comparative statics of the optimal choice \(\alpha^{\ast}\) with respect to the covariance between direct and background risk are characterized by

$$\frac{\partial \alpha^{\ast}(\theta )}{\partial \sigma _{XB}} \le 0 \quad \forall \theta \iff \quad \frac{\partial S_1}{\partial \sigma _Y} \cdot \frac{\sigma _Y}{S_1} \ge 0 \quad \forall (\sigma _Y, \mu _Y) \in M.$$
(27)

Proof

For (27), differentiate the negative of the LHS in (21) with respect to \(\sigma _{XB}\):

$$\begin{aligned} \frac{\partial }{\partial \sigma _{XB}} \left( S_1\frac{\partial \sigma _Y}{\partial \alpha }-\mu _X\right) &= \frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial S_1}{\partial \sigma _Y}\frac{\partial \sigma _Y}{\partial \sigma _{XB}}+ S_1\left[ \frac{\partial \sigma _Y}{\partial \alpha }(-\frac{1}{\sigma _Y})\frac{\partial \sigma _Y}{\partial \sigma _{XB}} +\frac{1}{\sigma _Y}\right] \\ &= \frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial \sigma _Y}{\partial \sigma _{XB}}\left[ \frac{\partial S_1}{\partial \sigma _Y}-\frac{S_1}{\sigma _Y} \left( 1- \frac{1}{\frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial \sigma _Y}{\partial \sigma _{XB}}} \right) \right] . \end{aligned}$$

Here we get

$$\frac{1}{\frac{\partial \sigma _Y}{\partial \alpha }\frac{\partial \sigma _Y}{\partial \sigma _{XB}}} =\frac{1}{\frac{\partial \sigma _Y}{\partial \alpha }\frac{\alpha }{\sigma _Y}}=\frac{\sigma^2_Y}{\alpha^2\sigma^2_X+\alpha \sigma _{XB}}.$$

Further assume that \(\alpha \sigma _{XB}+\sigma^2_B\ge 0,\) which holds when X and B are non-negatively correlated. Since \(\sigma _Y^2 = (\alpha^2 \sigma _X^2+\alpha \sigma _{XB}) + (\alpha \sigma _{XB}+\sigma^2_B),\) we then have \(\frac{\sigma _Y^2}{\alpha^2 \sigma _X^2+\alpha \sigma _{XB}} \ge 1,\) where the lower bound is reached at \(\alpha \sigma _{XB}+\sigma^2_B=0.\) Hence, the expression in round brackets is never larger than 0; a lower bound can exist when X and B are negatively correlated. \(\square\)

Condition (27) says that the decision maker will reduce risk-taking in response to an increase in the covariance of the two risks if the elasticity of his risk aversion with respect to wealth risk is larger than 0. Again, this condition has an EU analogue, viz. that the index of partial relative risk prudence, \(R_2(a,y-a)= -(y-a) \frac{u'''(y)}{u^{\prime \prime}(y)}\) (again cf. (10)), is smaller than 1 for all \(a>0.\)

Results 4.1–4.3 can be found as Propositions 1 and 2 in Eichner and Wagener (2009); our contribution is to simplify the proofs and make them more accessible.
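As a worked illustration of conditions (23)–(27), which is not part of the original analysis, consider the hypothetical quadratic MV utility \(V(\sigma _Y, \mu _Y) = \mu _Y - \frac{c}{2}\sigma _Y^2\) with \(c>0,\) assuming \(Y=\alpha X + B\) so that \(\sigma _Y^2 = \alpha^2\sigma _X^2 + 2\alpha \sigma _{XB} + \sigma _B^2.\) Indifference curves then have slope \(S_1 = c\,\sigma _Y,\) so the elasticity of \(S_1\) with respect to \(\sigma _Y\) equals one and \(\partial S_1/\partial \mu _Y = 0.\) The first-order condition (21) can be solved in closed form:

$$\mu _X = S_1\frac{\partial \sigma _Y}{\partial \alpha } = c\,\sigma _Y \cdot \frac{\alpha \sigma _X^2+\sigma _{XB}}{\sigma _Y} \quad \Longrightarrow \quad \alpha^{\ast} = \frac{\mu _X - c\,\sigma _{XB}}{c\,\sigma _X^2}.$$

Differentiating this expression gives

$$\frac{\partial \alpha^{\ast}}{\partial \mu _B} = 0, \qquad \frac{\partial \alpha^{\ast}}{\partial \sigma _B} = 0, \qquad \frac{\partial \alpha^{\ast}}{\partial \mu _X} = \frac{1}{c\,\sigma _X^2} > 0, \qquad \frac{\partial \alpha^{\ast}}{\partial \sigma _X} = -\frac{2\alpha^{\ast}}{\sigma _X}, \qquad \frac{\partial \alpha^{\ast}}{\partial \sigma _{XB}} = -\frac{1}{\sigma _X^2} < 0.$$

For \(\alpha^{\ast} \ge 0,\) these signs agree with the results above: (23) and (24) hold with equality, since \(\partial S_1/\partial \mu _Y = 0\) and the \(\sigma _Y\)-elasticity is exactly one, while the elasticity conditions in (25), (26), and (27) are satisfied with slack.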

Application: a risk-taking bank with background risk

Recently, Broll et al. (2015) have investigated the banking firm and risk-taking in a two-moment decision model. In this section, we add a background risk to this problem and apply the results presented above to its comparative statics.

Consider a bank that decides on how many and which financial assets to hold. The bank has the following balance sheet: \(\alpha = K + D,\) where α is the amount of financial assets, D is the quantity of deposits, and K is the stock of equity capital. We assume that short sales of the asset are forbidden, i.e., \(\alpha \ge 0.\) Moreover, there is a capital requirement, imposing that \(K \ge k \cdot \alpha\) for some \(k \in (0,1).\) The risky return on financial assets is given by the random variable \(\tilde{r}.\)

The bank’s shareholders contribute equity capital with a required rate of return, \(r_K,\) on their investment. The supply of deposits is perfectly elastic at an exogenous deposit rate, \(r_D.\) We suppose that \(r_K > r_D,\) implying that the capital requirement will bite: \(k \alpha = K.\) Moreover, the bank’s weighted average cost of capital (WACC) is then given by \(r_c := (1 - k) r_D + k r_K.\) There are no fixed costs; the bank’s operating cost, \(C(\alpha ),\) is increasing and convex, that is, \(C(0)=0,\) \(C'>0,\) and \(C''\ge 0\) for all α. There is some additive background risk B (e.g., from operations off the balance sheet).

Substituting the bank’s balance sheet constraint and the binding capital requirement, the bank’s shareholders obtain final wealth at date 1 of

$$Y = \alpha X - C(\alpha )+ B,$$

where we set \(X := \tilde{r}-r_c.\) The bank chooses α so as to maximize the MV utility from Y. Clearly, with respect to risks, this is a problem within a linear distribution class as in (20). We can, thus, directly use the results presented earlier to arrive at its comparative statics.

For changes in the background risk, conditions (23), (24), and (27) apply: the bank will take in more risky assets in response to a higher expected background income if its preferences exhibit decreasing absolute risk aversion; its response to an increase in the risk of background income or in the correlation between the risks on financial and other incomes depends on the magnitude of the elasticity of its risk aversion with respect to \(\sigma _Y.\)

For changes in the direct financial risk, conditions (25) and (26) apply (see Footnote 7): the magnitude of the elasticity of the bank’s risk aversion with respect to \(\mu _Y\) and \(\sigma _Y\) determines whether the bank holds more financial assets when, respectively, their expected return or their riskiness increases.

The interpretations of the above conditions are similar to the general cases and thus are omitted here. By adopting the MV approach, the effects of dependent background risk on the banking firm’s risk-taking can be easily structured and clearly studied.
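To make the bank problem concrete, the following sketch computes the optimal asset holding under two simplifying assumptions that are not in the source: a quadratic MV utility \(V = \mu _Y - \frac{c}{2}\sigma _Y^2\) and a quadratic operating cost \(C(\alpha ) = \frac{\kappa }{2}\alpha^2.\) The first-order condition \(\mu _X - \kappa \alpha - c(\alpha \sigma _X^2 + \sigma _{XB}) = 0\) then yields \(\alpha^{\ast} = (\mu _X - c\,\sigma _{XB})/(\kappa + c\,\sigma _X^2),\) truncated at zero because short sales are forbidden. All function names, parameter names, and numerical values are hypothetical.

```python
def wacc(k, r_d, r_k):
    """Weighted average cost of capital r_c = (1 - k) * r_D + k * r_K."""
    return (1.0 - k) * r_d + k * r_k

def bank_alpha_star(mean_r, sd_r, cov_rb, k, r_d, r_k, c=2.0, kappa=0.01):
    """Optimal asset holding under the assumed quadratic utility and cost.

    mean_r, sd_r: mean and standard deviation of the asset return r~.
    cov_rb: covariance between r~ and the background risk B.
    Since r_c is a constant, X = r~ - r_c has the same standard deviation
    and the same covariance with B as r~ itself.
    """
    mu_x = mean_r - wacc(k, r_d, r_k)
    alpha = (mu_x - c * cov_rb) / (kappa + c * sd_r**2)
    return max(0.0, alpha)  # short sales are forbidden: alpha >= 0

base = bank_alpha_star(0.08, 0.10, 0.0, k=0.1, r_d=0.02, r_k=0.10)
more_cov = bank_alpha_star(0.08, 0.10, 0.01, k=0.1, r_d=0.02, r_k=0.10)
tighter = bank_alpha_star(0.08, 0.10, 0.0, k=0.2, r_d=0.02, r_k=0.10)
print(base, more_cov, tighter)
```

In this parametrization, a higher covariance with the background risk lowers asset holdings, as does a tighter capital requirement, which raises \(r_c\) and thereby lowers \(\mu _X.\)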

Concluding remarks

With multiple additive risks, the MV approach and the expected utility approach of risk preferences are compatible if all attainable distributions belong to the same location–scale family. For such scenarios, this paper presents parallels of the two approaches with respect to risk attitudes, the changes thereof, and the comparative statics for simple, linear choice problems under risks.

Given that the preference functional in the MV approach only depends on mean and variance, all effects depend on the monotonicity properties either of the utility function itself or of the attending marginal rate of substitution between the two parameters. This once again highlights the simplicity and convenience of the MV approach: all effects can be framed in terms of risk-return trade-offs.

The MV approach provides a genuine and surprisingly rich framework for the economic modeling of preferences and choice under risk. Still, many extensions can be envisioned, both within and beyond the location–scale framework where equivalence with the EU approach prevails. Starting from the discussion offered in this paper, non-additive background risks or S-shaped vNM utilities appear to be promising topics. Last, we note that after establishing a theoretical model, the next step is to develop estimation and/or hypothesis-testing procedures (see, for example, Leung and Wong 2008) for the model. We leave the estimation and testing of the model developed in this paper to future study.

There are many applications of the theory developed in this paper and other papers. For example, recently, Broll and Mukherjee (2017) examine the optimal production and trade decisions of a domestic firm facing uncertainties owing to exchange rate volatility under MV preferences. Extending their analysis to situations with background risk is an interesting and important problem.