1 Introduction

When choosing between risks, a decision maker is often selecting an overall risk level at the same time. Choosing between two risky mutual funds is combined with deciding how much wealth to invest in the selected fund and how much to hold in an insured savings account instead. Buying a car or house is combined with choosing how much auto or homeowner’s insurance to purchase. When selecting a television or washing machine, extending the manufacturer’s warranty is often an option. These secondary decisions are made simultaneously with the primary decision and give decision makers a way to reduce the riskiness of their initial selection. The point of these examples is that when a decision maker chooses between two risky alternatives (the primary decision), the decision of how much of the chosen risk to bear and how much to transfer to others is often made at the same time. This secondary decision is relevant and important precisely because it is made concurrently with the primary decision. The decision maker’s attitude toward risk affects both decisions, and this interaction can alter the choice between risky alternatives.Footnote 1

This paper explicitly models a simultaneous secondary decision in the stochastic dominance context. Sets of decision makers are asked to rank or choose between two random variables \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) knowing that a portion of the risk they choose can be transferred to others. The analysis extends the literature on stochastic dominance that was started more than 50 years ago by Hadar and Russell (1969) and Hanoch and Levy (1969). The basic stochastic dominance question they pose is: when is random variable \(\stackrel{{\sim }}{\text{x}}\) preferred or indifferent to random variable \(\stackrel{{\sim }}{\text{y}}\) by all decision makers with a utility function in some set U? This same question is addressed here with the additional feature that both \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) can be altered by choosing an optimal risk reducing change or transformation. This additional decision is made simultaneously with choosing between \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), and can lead to a different selection. Because of this extra decision, individual decision makers may rank \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) differently than without it, and therefore the answer to the basic stochastic dominance question may also be different.Footnote 2

Because the answer to the basic stochastic dominance question can be different when \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) can be transformed, a new term is required to discuss the partial order that is generated. The term used is “stochastically superior” or “stochastic superiority.” Random variable \(\stackrel{{\sim }}{\text{x}}\) is said to be stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) for a set of decision makers U if every decision maker in U weakly prefers the optimally transformed \(\stackrel{{\sim }}{\text{x}}\) to the optimally transformed \(\stackrel{{\sim }}{\text{y}}\). The transformation applied to \(\stackrel{{\sim }}{\text{y}}\) can be different from that applied to \(\stackrel{{\sim }}{\text{x}}\), and these transformations can and likely do differ across decision makers. Each decision maker in U optimally selects a risk reducing transformation for \(\stackrel{{\sim }}{\text{x}}\) and for \(\stackrel{{\sim }}{\text{y}}\) at the same time he or she is choosing between \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\). All decisions are assumed to be made to maximize the decision maker’s expected utility.

When decision makers can each optimally choose how to transform both \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), the question of which of the two random variables is unanimously preferred to the other by sets of decision makers could well have fewer rather than more clear-cut answers; that is, there could be fewer rather than more unanimous rankings of \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\). The additional complication of a simultaneous risk level decision could make it more difficult for sets of decision makers to agree as to whether \(\stackrel{{\sim }}{\text{x}}\) or \(\stackrel{{\sim }}{\text{y}}\) is best. A main finding here, however, is that for the simultaneous risk level decision considered in this analysis, \(\stackrel{{\sim }}{\text{x}}\) stochastically dominating \(\stackrel{{\sim }}{\text{y}}\) implies, but is not implied by, \(\stackrel{{\sim }}{\text{x}}\) being stochastically superior to \(\stackrel{{\sim }}{\text{y}}\). That is, for sets of decision makers, the inclusion of this simultaneous risk level decision makes unanimity for the primary decision of ranking \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) more rather than less likely. That stochastic superiority is implied by stochastic dominance is presented as Corollary 1 to Theorem 1 in the paper.

Stochastic superiority allows random variables to be ranked when stochastic dominance does not, and therefore leads to smaller efficient sets. As the title of the Hanoch and Levy (1969) paper indicates, reducing sets of random variables to efficient sets, ones containing only elements that at least one decision maker would choose, is one of the main goals of stochastic dominance analysis. This is true not only for first- and second-degree stochastic dominance as defined by Hadar and Russell (1969) and Hanoch and Levy (1969), but also for general nth-degree stochastic dominance and many other forms of stochastic dominance presented in the literature since then. Each new definition refines a previously established form of stochastic dominance in a way that leads to additional rankings and smaller efficient sets.

Leshno and Levy (2002), when defining almost stochastic dominance, are also concerned with obtaining additional rankings of random variables. They define a procedure that can be applied to any of the established forms of stochastic dominance such as first-, second- or nth-degree stochastic dominance. Almost stochastic dominance modifies an existing form of stochastic dominance by removing from the associated set of utility functions those utility functions that represent preferences that Leshno and Levy consider to be “extreme, pathological or simply unrealistic.”Footnote 3 By removing these extreme preferences and utility functions, unanimous rankings are obtained that these extreme preferences would otherwise prevent.Footnote 4

The almost stochastic dominance procedure follows a long-established pattern. All stochastic dominance definitions, including those generated using almost stochastic dominance, propose a new form of stochastic dominance that allows more random variables to be ranked and accomplishes this by removing from consideration some of the utility functions previously allowed. This pattern was established at the very beginning by Hadar and Russell and Hanoch and Levy when they defined second-degree stochastic dominance by removing from consideration all increasing utility functions that are not concave. By removing these utility functions from consideration, second-degree stochastic dominance can rank more random variables and generate smaller efficient sets than first-degree stochastic dominance.

Like almost stochastic dominance, stochastic superiority is a modification procedure that can be applied to any established form of stochastic dominance and the goal is additional rankings and smaller efficient sets. Stochastic superiority obtains these additional rankings, however, not by removing utility functions, but by allowing decision makers to modify the random variables that are being compared. Stochastic superiority facilitates seemingly reasonable rankings not by discarding “extreme, pathological or simply unrealistic” preferences, but instead by allowing all decision makers, including those whose preferences seem to be extreme, to optimally reduce the risk associated with any random variable. Allowing this additional simultaneous risk reduction decision is often enough to cause the ranking of \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) by those with extreme preferences to be aligned with the ranking by decision makers whose preferences are not so extreme. Stochastic superiority and almost stochastic dominance have the same goal, to rank more random variables and generate smaller efficient sets, but the way in which this goal is accomplished is very different.

Consider the following example provided by Leshno and Levy (2002). Assume that two random alternatives are available, \(\stackrel{{\sim }}{\text{x}}\) which yields either $0 or $1,000,000 with probabilities 0.01 and 0.99 respectively, and \(\stackrel{{\sim }}{\text{y}}\) which is $1 with certainty. Leshno and Levy point out that although neither \(\stackrel{{\sim }}{\text{x}}\) nor \(\stackrel{{\sim }}{\text{y}}\) dominates the other in the first-degree,Footnote 5 except for very extreme preferences, \(\stackrel{{\sim }}{\text{x}}\) would be chosen over \(\stackrel{{\sim }}{\text{y}}\) by those who prefer more to less. They provide an example of such extreme risk preferences represented by the utility function u(x) = x for x ≤ 1 and u(x) = 1 for x > 1. Leshno and Levy argue that to ensure that \(\stackrel{{\sim }}{\text{x}}\) is preferred to \(\stackrel{{\sim }}{\text{y}}\) by all decision makers, utility functions as extreme as this u(x) must be excluded. The almost stochastic dominance modification procedure excludes these utility functions and thereby ensures that all decision makers unanimously prefer \(\stackrel{{\sim }}{\text{x}}\) to \(\stackrel{{\sim }}{\text{y}}\).
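The expected utility comparison in this example can be checked directly. The following is a minimal sketch in Python and is not part of the original analysis; the function names and the use of exact fractions are illustrative choices, and the risk-neutral utility is included only as an assumed non-extreme comparison.

```python
from fractions import Fraction

# Lotteries as lists of (outcome, probability) pairs, taken from the example.
x = [(Fraction(0), Fraction(1, 100)), (Fraction(1_000_000), Fraction(99, 100))]
y = [(Fraction(1), Fraction(1))]

def expected_utility(lottery, u):
    """Expected utility of a discrete lottery under utility function u."""
    return sum(p * u(w) for w, p in lottery)

def u_extreme(w):
    """The extreme utility from the example: u(w) = w for w <= 1, else 1."""
    return w if w <= 1 else Fraction(1)

def u_linear(w):
    """Risk-neutral utility (an assumed, non-extreme comparison case)."""
    return w

eu_x_extreme = expected_utility(x, u_extreme)   # 0.01 * 0 + 0.99 * 1
eu_y_extreme = expected_utility(y, u_extreme)   # utility of the sure $1
eu_x_linear = expected_utility(x, u_linear)
eu_y_linear = expected_utility(y, u_linear)
```

Under the extreme u(x), the sure $1 has expected utility 1 while \(\stackrel{{\sim }}{\text{x}}\) has expected utility 0.99, so \(\stackrel{{\sim }}{\text{y}}\) is chosen; the risk-neutral decision maker compares $990,000 with $1 and chooses \(\stackrel{{\sim }}{\text{x}}\).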

To see how stochastic superiority deals with this same example, notice that random variable \(\stackrel{{\sim }}{\text{x}}\) can be interpreted as an asset whose value is $1,000,000 but which is subject to total loss with probability 0.01. Expected loss is then $10,000. Suppose coinsurance is available at a very high price of $100,000 for full coverage, which equals a loading factor of 9 (the premium is ten times the expected loss). When the chosen share of loss that is reimbursed is θ in [0, 1], the required insurance premium is $100,000θ, and random wealth \(\stackrel{{\sim }}{\text{x}}\) becomes:

  • $1,000,000θ − $100,000θ = $900,000θ, with probability 0.01

  • $1,000,000 − $100,000θ, with probability 0.99

This transformed \(\stackrel{{\sim }}{\text{x}}\) stochastically dominates $1 in the first-degree whenever ($1,000,000 − $100,000)θ ≥ $1, that is, whenever θ ≥ 1/900,000.

This example shows that every individual who prefers more to less, including the one whose utility function is u(x) given above, would weakly prefer \(\stackrel{{\sim }}{\text{x}}\) to \(\stackrel{{\sim }}{\text{y}}\) when a coinsurance arrangement with a loading factor even as high as 9 is available. After \(\stackrel{{\sim }}{\text{x}}\) has been chosen, individual decision makers will choose a coinsurance rate θ in [0, 1] to maximize their own expected utility. Many would choose not to insure at all, but those with extreme preferences, including the example u(x), would choose to coinsure at a rate of at least 1/900,000.
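The coinsurance arithmetic in this example can be reproduced with a short sketch. The code is illustrative only: the names `insured_wealth` and `fsd_over_constant` are hypothetical, and the dominance test relies on the fact that a lottery first-degree dominates a sure amount c exactly when every outcome with positive probability is at least c.

```python
from fractions import Fraction

VALUE = Fraction(1_000_000)        # asset value if no loss occurs
P_LOSS = Fraction(1, 100)          # probability of total loss
FULL_PREMIUM = Fraction(100_000)   # premium for full coverage (loading factor 9)

def insured_wealth(theta):
    """Wealth lottery [(outcome, probability)] when a share theta of the loss is reimbursed."""
    premium = FULL_PREMIUM * theta
    return [
        (VALUE * theta - premium, P_LOSS),   # loss state: indemnity minus premium
        (VALUE - premium, 1 - P_LOSS),       # no-loss state: value minus premium
    ]

def fsd_over_constant(lottery, c):
    """True if the lottery first-degree dominates the sure amount c."""
    return all(outcome >= c for outcome, prob in lottery if prob > 0)

def u_extreme(w):
    """Extreme utility from the example: u(w) = w for w <= 1, and 1 otherwise."""
    return w if w <= 1 else Fraction(1)

def expected_utility(lottery, u):
    """Expected utility of a discrete lottery under utility function u."""
    return sum(prob * u(outcome) for outcome, prob in lottery)

threshold = Fraction(1, 900_000)
dominates_at_threshold = fsd_over_constant(insured_wealth(threshold), 1)
dominates_below = fsd_over_constant(insured_wealth(threshold / 2), 1)
eu_uninsured = expected_utility(insured_wealth(0), u_extreme)
eu_at_threshold = expected_utility(insured_wealth(threshold), u_extreme)
```

At θ = 1/900,000 the loss-state wealth is exactly $1, so the insured lottery first-degree dominates the sure $1, while at half that rate it does not. For the extreme u(x), coinsuring at the threshold raises expected utility from 0.99 to 1, which equals the utility of the sure $1.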

The paper is organized as follows. In the next section, the question of how to compare one set of random variables with another set for groups of decision makers, the set dominance issue, is discussed, and the notion of stochastic superiority is formally defined. Set dominance is relevant for the definition of stochastic superiority because allowing a risk reducing secondary decision associates a set of random variables with each original one, and as a result ranking a pair of random variables is transformed into ranking two sets of random variables. In Sect. 3, a simple and seemingly overly strong sufficient condition for stochastic superiority is presented. One important implication of this sufficient condition is that stochastic dominance implies stochastic superiority. In the same section, this sufficient condition is demonstrated to be necessary for convex sets of concave utility functions. Corollaries are provided that state these general results for the specific cases of first-, second- and nth-degree stochastic superiority. The set of utility functions corresponding to each of these is a convex set of utility functions, and concavity of utility is assumed for all but first-degree stochastic superiority. An important extension to situations where the cost of risk reduction differs for different random variables is discussed in Sect. 4. This material is particularly relevant when insurance is used to reduce risk since the cost of insurance is typically dependent on the expected loss associated with the indemnification contract. Section 5 presents three example applications. The first two show precisely how stochastic superiority can be used to obtain a ranking of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\) without removing extreme preferences from the set of utility functions under consideration.
The final application is an extensive analysis of the decision to self-protect using stochastic superiority as an analysis tool. A result is presented indicating when all risk averse decision makers choose a higher level of self-protection over a lower level. As those who have examined the self-protection literature know, such a result is impossible to obtain in the standard model without a secondary risk reduction decision. This last example adds significantly to the self-protection literature. Finally, the paper concludes and discusses extensions of this work in Sect. 6.

2 Set dominance and stochastic superiority

How decision makers might rank sets of objects has been extensively studied.Footnote 6 The discussion of set dominance presented here differs from this general literature in two important ways. First, the objects in the sets to be ranked are random variables that are ranked using expected utility. Second, in the tradition of stochastic dominance, the relative desirability of one set of random variables over another is based on the unanimous preference for the former over the latter by groups rather than individual decision makers.

Let {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j} denote two sets of random variables. The supports of these random variables are assumed to lie in an interval [a, b]. The question under consideration is: when would all decision makers with utility functions u(x) in a set U rather be faced with the set {\(\stackrel{{\sim }}{\text{x}}\)i} than the set {\(\stackrel{{\sim }}{\text{y}}\)j}? Throughout the paper, the utility functions in U are assumed to be continuously differentiable as many times as necessary. Most often the set U will be one of the many sets of utility functions associated with the various degrees and forms of stochastic dominance, but sets containing only a small finite number of elements are also possible. A general specification of U is used in the majority of the analysis. The set dominance definition used in this analysis is given below.

Definition 1

The set {\(\stackrel{{\sim }}{\text{x}}\)i} is said to dominate or be preferred or indifferent to the set {\(\stackrel{{\sim }}{\text{y}}\)j} on U if for every u(x) in U the best element from {\(\stackrel{{\sim }}{\text{x}}\)i} is preferred or indifferent to the best element from {\(\stackrel{{\sim }}{\text{y}}\)j}.

In Definition 1, both the best element from a set and the relative desirability of the respective best elements are determined using expected utility. There is no requirement in this definition that the decision makers agree as to which element of either set is best. Using this definition to rank sets of random variables is consistent with assuming that after a set is selected, the decision maker then chooses and obtains the most preferred element from that set. There is no requirement on U imposed by this definition. U can contain a single element, a small finite number of elements, or U can be any one of the various sets of utility functions employed in stochastic dominance analysis.

Definition 1 is not the only set dominance definition that is possible or reasonable. For game models it may be more accurate to assume that after a set is selected, someone else chooses the particular element from either {\(\stackrel{{\sim }}{\text{x}}\)i} or {\(\stackrel{{\sim }}{\text{y}}\)j}, possibly in a malevolent way, or more neutrally, to assume that nature chooses from these sets in a random way. For our purpose of including a simultaneous risk reduction decision with a primary ranking decision, Definition 1 provides the desired concept of set dominance.

Since the work here is similar in other ways to that presented by Levy and Kroll (1976, 1978) and Levy (2006),Footnote 7 it is important to point out that they do not use Definition 1 as their definition of set dominance.Footnote 8 Levy and Kroll indicate that the set {\(\stackrel{{\sim }}{\text{x}}\)i} LK-dominates the set {\(\stackrel{{\sim }}{\text{y}}\)j} on U if for every element of {\(\stackrel{{\sim }}{\text{y}}\)j} there exists an element of {\(\stackrel{{\sim }}{\text{x}}\)i} such that the latter element is preferred or indifferent to the former by all u(x) in U. When set {\(\stackrel{{\sim }}{\text{x}}\)i} dominates set {\(\stackrel{{\sim }}{\text{y}}\)j} by the LK definition it also does so by Definition 1, but the reverse is not true. LK-dominance is more difficult to satisfy than is Definition 1.

A restaurant analogy can be used to illustrate the differences between these two concepts of set dominance. Consider two restaurants, each offering a menu of food items. The definition of set dominance used here indicates that one restaurant is preferred or indifferent to the other by a group of diners if every diner in the group (weakly) prefers her or his favorite item on the one restaurant’s menu to her or his favorite item at the other restaurant. This definition of set dominance is consistent with assuming that after choosing which restaurant to visit, each person in the group can order their own most preferred item on that restaurant’s menu. In contrast, one restaurant LK-dominates the other if for every item on the latter restaurant’s menu, there exists an item on the former restaurant’s menu that is (weakly) preferred by all diners in the group. For situations where all diners are required to order the same menu item, the LK definition can be useful. For the purposes here, the LK-dominance relation is stronger than needed or desired. It is because of this different set dominance definition that the work presented here differs from that of Levy and Kroll.
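The difference between Definition 1 and LK-dominance can be made concrete with a small numerical sketch. Everything in it is a hypothetical illustration rather than an example from Levy and Kroll: the three lotteries, the two utility functions and the function names are all assumptions, chosen so that the two definitions disagree.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Hypothetical lotteries, each a list of (outcome, probability) pairs.
x1 = [(Fraction(0), half), (Fraction(12), half)]   # risky, mean 6
x2 = [(Fraction(4), Fraction(1))]                  # sure 4
y1 = [(Fraction(3), half), (Fraction(6), half)]    # mean 9/2

def u_neutral(w):
    return w                      # risk-neutral

def u_capped(w):
    return min(w, Fraction(4))    # concave: no value placed on wealth above 4

def expected_utility(lottery, u):
    return sum(p * u(w) for w, p in lottery)

def dominates_def1(xs, ys, utilities):
    """Definition 1: each u weakly prefers its own best x-element to its own best y-element."""
    return all(
        max(expected_utility(x, u) for x in xs)
        >= max(expected_utility(y, u) for y in ys)
        for u in utilities
    )

def dominates_lk(xs, ys, utilities):
    """LK-dominance: for each y-element, a single x-element is weakly preferred by every u."""
    return all(
        any(
            all(expected_utility(x, u) >= expected_utility(y, u) for u in utilities)
            for x in xs
        )
        for y in ys
    )

U = [u_neutral, u_capped]
def1_holds = dominates_def1([x1, x2], [y1], U)
lk_holds = dominates_lk([x1, x2], [y1], U)
```

In the language of the analogy, each of the two diners finds a favorite dish at the first restaurant that is at least as good as the single dish at the second, so Definition 1 is satisfied, but no one dish at the first restaurant satisfies both diners, so LK-dominance fails.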

As indicated in the introduction, the fundamental stochastic dominance question was answered by Hadar and Russell (1969) and Hanoch and Levy (1969) when the set U consists of all nondecreasing utility functions, denoted U1, or when the set U consists of all nondecreasing and weakly concave utility functions, denoted U2. These are, of course, the well-known first-degree stochastic dominance (FSD) and second-degree stochastic dominance (SSD) definitions, respectively. In most of the literature since 1969, extending FSD and SSD involves specifying a different set of utility functions U which is a subset of U2. For example, general nth-degree stochastic dominance considers sets of utility functions Un whose kth derivatives (k = 1, 2, …, n) alternate in sign, the first derivative being nonnegative. The almost stochastic dominance (ASD) literature also follows this pattern of altering the set of utility functions and does so by excluding from U1, U2 or Un, decision makers whose preferences are considered to be extreme.

The stochastic superiority definition presented below extends the FSD and SSD findings but does so in a different manner. Rather than considering subsets of U2, instead assumptions are made that allow each decision maker to choose to transform the pair of alternatives, \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), that are being compared. The possible transformations lead to a pair of sets {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j}, that are derived from \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), respectively, which the decision makers must then rank or choose between. The ranking of \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) is determined by the ranking of sets {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j}, and Definition 1 is used when determining this ranking. This way of extending stochastic dominance is not the usual one and requires a new term. The term used is stochastic superiority, and a general definition of stochastic superiority is provided next.

Definition 2

Random variable \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to random variable \(\stackrel{{\sim }}{\text{y}}\) on U if for every u(x) in U the most preferred element in set {\(\stackrel{{\sim }}{\text{x}}\)i} is preferred or indifferent to the most preferred element in set {\(\stackrel{{\sim }}{\text{y}}\)j}, where {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j} are derived from \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), respectively, by transforming \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) in a simultaneous secondary decision.

Whether \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) depends critically on two things: the set of utility functions under consideration, and the nature of the simultaneous secondary decision that gives rise to the sets {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j}. The sets of utility functions U considered in this analysis are all well known and have been extensively discussed in the literature. Included in the analysis are U1, U2 and Un, the sets of utility functions that are associated with FSD, SSD and nth-degree stochastic dominance, respectively. The focus here is not on new sets of utility functions.

In contrast, the sets {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j}, and the simultaneous secondary decision that generates these sets, are less well known in the stochastic dominance literature. While many simultaneous secondary decisions can be considered, the focus here is on a particularly simple one which is often available to the decision maker. The secondary decision is that of choosing an overall risk level. That is, in addition to choosing between \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), the decision maker also decides how much of the risk that results from choosing \(\stackrel{{\sim }}{\text{x}}\) or \(\stackrel{{\sim }}{\text{y}}\) to assume and how much of that risk to avoid or transfer to others. In this analysis, when choosing between risky alternatives, decision makers are simultaneously deciding how much to reduce the risk they have selected using a risk reduction tool such as holding a riskless asset or purchasing insurance. These risk reducing secondary decisions are represented by the set of linear transformations which are described next.

When \(\stackrel{{\sim }}{\text{x}}\) is transformed linearly, and the α in the linear transformation \(\alpha \stackrel{{\sim }}{\text{x}}+\left(1-\alpha \right)\rho\) is restricted to be in [0, 1], the new random variable has smaller variance and less risk using a variety of measures of riskiness.Footnote 9 The transformed \(\stackrel{{\sim }}{\text{x}}\) is also in the same location-scale family as \(\stackrel{{\sim }}{\text{x}}\), and thus maintains many of its other general shape properties such as skewness or kurtosis.Footnote 10 Assuming changes to \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) are accomplished using such linear transformations is a simple assumption and one that maintains a focus on ranking \(\stackrel{{\sim }}{\text{x}}\) relative to \(\stackrel{{\sim }}{\text{y}}\). The assumption does not intentionally bias the choice between \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) in one direction or the other since each of these random variables can be transformed in the same way.

When ρ in the linear transformation is chosen to be equal to the mean value of \(\stackrel{{\sim }}{\text{x}}\), the transformation is a Rothschild and Stiglitz (1970) mean preserving reduction in risk. While the analysis considers a range of values for ρ, the focus is not on mean preserving risk reductions. Instead, ρ is always assumed to be less than the mean of \(\stackrel{{\sim }}{\text{x}}\). This is done to incorporate costly risk reduction into the simultaneous secondary risk reduction decision. The decision maker has the opportunity to transform \(\stackrel{{\sim }}{\text{x}}\) in a risk reducing manner but must give up a higher expected return in order to obtain this risk reduction. When diversifying using a riskless asset, this means the riskless return is assumed to be smaller than the expected returns to risky assets. When insuring, this implies that the price of insurance is assumed to be actuarially unfair; that is, the loading factor is positive. It is important to point out that the smaller the value for ρ, the more costly is risk reduction.
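The cost of risk reduction can be made explicit with a short calculation, added here as an illustration that uses only the standard rules for the mean and variance of a linear transformation of \(\tilde{x}\):

\[
E\left[\alpha \tilde{x} + \left(1-\alpha\right)\rho\right] = E(\tilde{x}) - \left(1-\alpha\right)\left[E(\tilde{x}) - \rho\right], \qquad \mathrm{Var}\left[\alpha \tilde{x} + \left(1-\alpha\right)\rho\right] = \alpha^{2}\,\mathrm{Var}(\tilde{x}).
\]

Lowering α below 1 scales the variance by α² and lowers the mean by (1 − α)[E(\(\tilde{x}\)) − ρ], so for any given α the sacrifice in expected value grows as ρ falls further below E(\(\tilde{x}\)).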

In summary, the sets {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j} are formed from \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) by selecting a value for αi or βj in [0, 1] so that \({\stackrel{{\sim }}{\text{x}}}_{\text{i}}={\alpha }_{\text{i}}\stackrel{{\sim }}{\text{x}}+\left(1-{\alpha }_{\text{i}}\right)\rho\) and \({\tilde{y }}_{\text{j}}={\beta }_{\text{j}}\stackrel{{\sim }}{\text{y}}+\left(1-{\beta }_{\text{j}}\right)\rho\), respectively. These chosen values for α and β represent how much of the risk of \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) is retained, and can range from no risk at all, α = 0 or β = 0, to accepting all of the risk associated with \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), α = 1 or β = 1. The general stochastic superiority definition can now be restated in a more specific way using these linear risk reducing transformations to generate the sets {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j}.

Definition 3

Random variable \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to random variable \(\stackrel{{\sim }}{\text{y}}\) on U for a given ρ if for every u(x) in U, the most preferred element in set \({\left\{\alpha \stackrel{{\sim }}{\text{x}}+\left(1-\alpha \right)\rho \right\}}_{\alpha \in \left[{0,1}\right]}\) is preferred or indifferent to the most preferred element in set \({\left\{\beta \stackrel{{\sim }}{\text{y}}+\left(1-\beta \right)\rho \right\}}_{\beta \in \left[{0,1}\right]}\).

To avoid trivial comparisons and repeated qualifications, the value of ρ is assumed to be such that at least one of \(\tilde{x}\) and \(\tilde{y}\) does not stochastically dominate ρ on U and ρ < max {E(\(\stackrel{{\sim }}{\text{x}}\)), E(\(\stackrel{{\sim }}{\text{y}}\))}. This is because when both \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) dominate ρ, risk reduction in either \(\stackrel{{\sim }}{\text{x}}\) or \(\stackrel{{\sim }}{\text{y}}\) is never expected utility increasing, and if ρ is greater than or equal to the mean of both \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), risk reduction is free and maximum risk reduction is chosen by all risk averse decision makers.
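For discrete random variables and a finite list of utility functions, Definition 3 can be checked numerically by searching over a grid of values for α and β. The sketch below is a hypothetical illustration, not the paper's procedure: the two lotteries, the value ρ = 3, the three utility functions, the grid size and all function names are assumed. A finite list of utilities and a finite grid can only probe superiority on a full set such as U2; they cannot establish it.

```python
from fractions import Fraction

half = Fraction(1, 2)
x = [(Fraction(0), half), (Fraction(10), half)]   # hypothetical lottery, mean 5
y = [(Fraction(1), half), (Fraction(7), half)]    # hypothetical lottery, mean 4
RHO = Fraction(3)                                 # assumed rho, below max{E(x), E(y)}

def transform(lottery, share, rho):
    """Return share * lottery + (1 - share) * rho as a new lottery."""
    return [(share * w + (1 - share) * rho, p) for w, p in lottery]

def expected_utility(lottery, u):
    return sum(p * u(w) for w, p in lottery)

def best_value(lottery, rho, u, n_grid):
    """Highest expected utility over retained shares 0, 1/n_grid, ..., 1."""
    return max(
        expected_utility(transform(lottery, Fraction(k, n_grid), rho), u)
        for k in range(n_grid + 1)
    )

def superior_on(x, y, rho, utilities, n_grid=100):
    """Grid version of Definition 3 for a finite list of utility functions."""
    return all(
        best_value(x, rho, u, n_grid) >= best_value(y, rho, u, n_grid)
        for u in utilities
    )

# Nondecreasing, concave, piecewise-linear utilities (chosen for exact arithmetic;
# they are not differentiable, unlike the utility functions assumed in the paper).
utilities = [
    lambda w: w,
    lambda w: min(w, Fraction(2)),
    lambda w: min(w, Fraction(4)),
]

x_over_y = superior_on(x, y, RHO, utilities)
y_over_x = superior_on(y, x, RHO, utilities)
```

For these assumed inputs the check passes for \(\tilde{x}\) over \(\tilde{y}\) and fails in the other direction, where the risk-neutral utility alone rules it out because the best attainable mean is 5 for \(\tilde{x}\) and 4 for \(\tilde{y}\).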

3 A sufficient and a necessary condition for stochastic superiority

At first glance, finding a necessary and sufficient condition for stochastic superiority appears to be a difficult task. The most preferred element in {\(\stackrel{{\sim }}{\text{x}}\)i} must be compared with the most preferred element in {\(\stackrel{{\sim }}{\text{y}}\)j} for each utility function in set U. A sufficient and a necessary condition for stochastic superiority are provided in this section. An important first step in proving sufficiency is the following lemma. The lemma holds for an arbitrary set of utility functions U, and for more general linear transformations than the risk reducing ones employed in Definition 3.

Lemma 1

If \(\stackrel{{\sim }}{\text{x}}\) stochastically dominates \(\stackrel{{\sim }}{\text{y}}\) for all u(x) in U, then λ\(\stackrel{{\sim }}{\text{x}}\) + γ stochastically dominates λ\(\stackrel{{\sim }}{\text{y}}\) + γ for all u(x) in U, where λ and γ are any constants with λ ≥ 0.

Lemma 1 is quite obvious, easy to demonstrate, and is stated here because it is used when proving Theorem 1.Footnote 11 The set U in Lemma 1 can contain a single element, a small finite number of elements, or any one of the various sets of utility functions employed in standard stochastic dominance analysis.

Theorem 1

If for some α in [0, 1] and a fixed ρ, α \(\stackrel{{\sim }}{\text{x}}\) + (1 − α) ρ stochastically dominates \(\stackrel{{\sim }}{\text{y}}\) for all u(x) in U, then \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) for all u(x) in U for the given ρ.

Proof

See the Appendix.

Theorem 1 provides a sufficient test condition for stochastic superiority. One need only find some element of the set {α\(\stackrel{{\sim }}{\text{x}}\) + (1 − α)ρ} that stochastically dominates the untransformed \(\stackrel{{\sim }}{\text{y}}\), and stochastic superiority of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\) follows. It is not necessary to find the most preferred element in \(\left\{ {\alpha \tilde{x} + \left( {1 - \alpha } \right)\rho } \right\}\) nor to compare elements from this set with any element other than the untransformed element in the set \(\left\{ {\beta \tilde{y} + \left( {1 - \beta } \right)\rho } \right\}\).
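Theorem 1 suggests a simple search: scan values of α and test whether the transformed \(\tilde{x}\) dominates the untransformed \(\tilde{y}\). The sketch below does this for second-degree dominance between discrete lotteries. It is an illustration under stated assumptions: the lotteries and ρ = 3 are hypothetical, the function names are invented here, and the second-degree test uses the standard integral condition E[(t − X)+] ≤ E[(t − Y)+], which for discrete lotteries only needs to be checked at the support points.

```python
from fractions import Fraction

half = Fraction(1, 2)
x = [(Fraction(0), half), (Fraction(10), half)]   # hypothetical lottery, mean 5
y = [(Fraction(1), half), (Fraction(7), half)]    # hypothetical lottery, mean 4
RHO = Fraction(3)                                 # assumed value of rho

def transform(lottery, alpha, rho):
    """Return alpha * lottery + (1 - alpha) * rho as a new lottery."""
    return [(alpha * w + (1 - alpha) * rho, p) for w, p in lottery]

def shortfall(lottery, t):
    """E[(t - X)+], which equals the integral of the CDF of X up to t."""
    return sum(p * max(t - w, 0) for w, p in lottery)

def ssd_dominates(a, b):
    """True if lottery a second-degree stochastically dominates lottery b."""
    # The difference of the two shortfall functions is piecewise linear with
    # kinks only at support points and is constant beyond the largest one.
    points = {w for w, _ in a} | {w for w, _ in b}
    return all(shortfall(a, t) <= shortfall(b, t) for t in points)

def dominating_alphas(x, y, rho, n_grid=12):
    """Grid values of alpha for which alpha*x + (1-alpha)*rho SSD-dominates y."""
    grid = (Fraction(k, n_grid) for k in range(n_grid + 1))
    return [alpha for alpha in grid if ssd_dominates(transform(x, alpha, rho), y)]

alphas = dominating_alphas(x, y, RHO)
x_dominates_y = ssd_dominates(x, y)
```

For these assumed inputs \(\tilde{x}\) itself does not second-degree dominate \(\tilde{y}\), but the grid points α = 1/2, 7/12 and 2/3 each produce a transformed \(\tilde{x}\) that does, so the condition of Theorem 1 is met for this ρ.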

To relate stochastic superiority to stochastic dominance, observe that the special case where α = 1 indicates that stochastic dominance of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\) implies stochastic superiority of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\). This is formally stated as the following corollary.

Corollary 1

Stochastic dominance of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\) on U implies stochastic superiority of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\) on U for all ρ.

The sufficient condition in Theorem 1 appears to be overly strong. As with Lemma 1, Theorem 1 holds for an arbitrary set U which can contain only one or a small number of elements as well as any one of the sets of utility functions that are associated with the various conventional forms of stochastic dominance. In the search for weaker sufficient conditions, to our surprise, it was determined that the condition in Theorem 1 is not only sufficient, but also necessary for virtually all forms of stochastic dominance. The necessity result, Theorem 2, requires more of the set U and as a result it is stated and demonstrated as a separate theorem.

Theorem 2

Suppose \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) for a given ρ on some convex set of concave utility functions U. Then there exists an α0 in [0, 1] such that \(\alpha_{0} \tilde{x} + \left( {1 - \alpha_{0} } \right)\rho\) stochastically dominates \(\stackrel{{\sim }}{\text{y}}\) on U.

Proof

See the Appendix.

For the sets {\(\stackrel{{\sim }}{\text{x}}\)i} and {\(\stackrel{{\sim }}{\text{y}}\)j} generated using linear transformations, it is straightforward to see that the condition in Theorem 1 is both sufficient and necessary for set {\(\stackrel{{\sim }}{\text{x}}\)i} to LK-dominate set {\(\stackrel{{\sim }}{\text{y}}\)j}. It is therefore quite natural that this condition is also sufficient for stochastic superiority, which is based on the weaker notion of set dominance given in Definition 1. It is important to mention again that Theorem 1 holds for arbitrary sets of utility functions.
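One way to see the LK-dominance claim, offered here as a sketch rather than as the argument given in the Appendix, is to apply Lemma 1 with λ = β and γ = (1 − β)ρ. If \(\alpha \tilde{x} + \left(1-\alpha\right)\rho\) stochastically dominates \(\tilde{y}\) on U, then for every β in [0, 1]

\[
\beta\left[\alpha \tilde{x} + \left(1-\alpha\right)\rho\right] + \left(1-\beta\right)\rho = \alpha\beta\,\tilde{x} + \left(1-\alpha\beta\right)\rho \quad \text{stochastically dominates} \quad \beta \tilde{y} + \left(1-\beta\right)\rho \ \text{on } U.
\]

Because αβ lies in [0, 1], the left-hand side is an element of {\(\tilde{x}\)i}, so every element of {\(\tilde{y}\)j} is weakly dominated by some element of {\(\tilde{x}\)i}. Necessity follows by taking β = 1, since the untransformed \(\tilde{y}\) is itself an element of {\(\tilde{y}\)j}.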

The sufficient condition in Theorem 1 is demonstrated to be necessary in Theorem 2 when the set of utility functions is a convex set of concave utility functions. Recall that the sets of utility functions for all conventional forms of stochastic dominance other than first-degree satisfy this requirement. The proof of Theorem 2 relies on the convexity of the set U and also on the concavity of the individual u(x). It is these restrictions on U that imply that the weak set dominance condition provided in Definition 1 is equivalent to the seemingly much stronger set dominance notion provided by Levy and Kroll (1976, 1978).Footnote 12 In summary, from the very beginning stochastic dominance has most often required unanimous preference by utility functions that come from a convex set of concave utility functions. Theorem 2 demonstrates the powerful nature of this assumption and uses it to provide a relatively simple and useful test condition for stochastic superiority.

To relate these sufficient and necessary conditions, which are stated quite generally, to more familiar sets of utility functions Un, where n ≥ 1, the following two corollaries are provided. Note that Un is a convex set of concave utility functions for all n ≥ 2.

Corollary 2

If for some α in [0, 1] and a fixed ρ, \(\alpha \stackrel{{\sim }}{\text{x}}+\left(1-\alpha \right)\rho\) stochastically dominates \(\stackrel{{\sim }}{\text{y}}\) on Un (or in the nth-degree), where n ≥ 1, then \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) on Un (or in the nth-degree) for the given ρ.

Corollary 3

Suppose that n ≥ 2. If \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) in the nth-degree for a given ρ, then there exists an α0 in [0, 1] such that \({\alpha }_{0}\stackrel{{\sim }}{\text{x}}+\left(1-{\alpha }_{0}\right)\rho\) stochastically dominates \(\stackrel{{\sim }}{\text{y}}\) in the nth-degree.

Two additional properties of stochastic superiority on Un are worth mentioning and follow directly from Definition 3. First, if \(\stackrel{{\sim }}{\text{x}}\) is nth-degree stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) for a given ρ, then for any n′ > n, \(\stackrel{{\sim }}{\text{x}}\) is also n′th-degree stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) for that ρ. Second, if \(\stackrel{{\sim }}{\text{x}}\) is nth-degree stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) for a given ρ, then E(\(\stackrel{{\sim }}{\text{x}}\)) ≥ E(\(\stackrel{{\sim }}{\text{y}}\)). This is because the risk neutral linear utility function is in Un for all n ≥ 1.

The parameter ρ in the linear risk reducing transformation is inversely related to the cost of risk reduction to the decision maker. Lower values for ρ imply a higher cost of reducing risk. While it is not obvious from Definition 3 whether a larger ρ would make stochastic superiority more likely, the following theorem gives an affirmative answer.

Theorem 3

If \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) on some convex set of concave utility functions U for a given ρ, then \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) on U for all ρ′ such that ρ′ ≥ ρ.

Proof

See the Appendix.

Theorem 3 indicates that when stochastic superiority of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\) holds for a given cost of risk reduction, then it also holds for all lower costs of risk reduction.

4 An extension: Stochastic superiority of (\(\stackrel{{\sim }}{\text{x}}\), ρx) over (\(\stackrel{{\sim }}{\text{y}}\), ρy)

All analysis to this point assumes that the same value for ρ is used when transforming both \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\). Recall that ρ indicates the cost of risk reduction, and that smaller values for ρ indicate a larger cost of risk reduction. This section generalizes the definition of stochastic superiority to the situation where the cost of risk reduction is different for \(\stackrel{{\sim }}{\text{x}}\) than for \(\stackrel{{\sim }}{\text{y}}\). First, Definition 3 is simply restated for different costs of risk reduction for \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\).

Definition 4

(\(\stackrel{{\sim }}{\text{x}}\), ρx) is stochastically superior to (\(\stackrel{{\sim }}{\text{y}}\), ρy) on U if for every u(x) in U, the most preferred element in set \({\left\{\alpha \stackrel{{\sim }}{\text{x}}+\left(1-\alpha \right){\rho }_{x}\right\}}_{\alpha \in \left[{0,1}\right]}\) is preferred or indifferent to the most preferred element in set \({\left\{\beta \stackrel{{\sim }}{\text{y}}+\left(1-\beta \right){\rho }_{y} \right\}}_{\beta \in \left[{0,1}\right]}\).

By definition, if (\(\stackrel{{\sim }}{\text{x}}\), ρx) is stochastically superior to (\(\stackrel{{\sim }}{\text{y}}\), ρy) on U, then for all ρx′ ≥ ρx and ρy′ ≤ ρy, (\(\stackrel{{\sim }}{\text{x}}\), ρx′) is stochastically superior to (\(\stackrel{{\sim }}{\text{y}}\), ρy′) on U. The following property is less obvious and is derived from the sufficient and necessary conditions established in Theorems 1 and 2.

Theorem 4

Suppose that U is a convex set of concave utility functions. (\(\stackrel{{\sim }}{\text{x}}\), ρx) is stochastically superior to (\(\stackrel{{\sim }}{\text{y}}\), ρy) on U for any ρy such that ρx ≥ ρy if and only if (\(\stackrel{{\sim }}{\text{x}}\), ρx) is stochastically superior to (\(\stackrel{{\sim }}{\text{y}}\), ρx) on U.

Proof

See the Appendix.

To illustrate situations where the cost of risk reduction is likely to differ across risky alternatives and at the same time to demonstrate that insurance is a method of risk reduction that is included in this analysis, consider the following general example. Suppose there are two different risky assets: \(\stackrel{{\sim }}{\text{x}}={\text{w}}_{1}-{\stackrel{{\sim }}{\text{L}}}_{1}\) and \(\stackrel{{\sim }}{\text{y}}={\text{w}}_{2}-{\stackrel{{\sim }}{\text{L}}}_{2}\), where wi, i = 1 or 2, is the nonrandom initial wealth and \({\stackrel{{\sim }}{\text{L}}}_{1}\) ≥ 0 and \({\stackrel{{\sim }}{\text{L}}}_{2}\) ≥ 0 are two different random losses.Footnote 13 The random variables \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) could represent the uninsured values of two houses. In this example, each house has a different value, and each house is subject to a different random loss. These differences can occur because of different locations and the likelihood of earthquakes or hurricanes at each, or various other house characteristics. Traditional stochastic dominance analysis compares the two random variables representing the uninsured houses. Stochastic superiority adds to this comparison the assumption that when choosing between the two houses, the decision maker knows that he or she will purchase insurance to partially indemnify losses and may choose to insure each house differently.

With coinsurance, \(\stackrel{{\sim }}{\text{x}}\) can be transformed into any random final wealth in the set \(\left\{ {w_{1} - \tilde{L}_{1} + \theta \left( {\tilde{L}_{1} - m_{1} } \right)} \right\}_{{\theta \in \left[ {0,1} \right]}} = \left\{ {\left( {1 - \theta } \right)\tilde{x} + \theta \left( {w_{1} - m_{1} } \right)} \right\}_{{\theta \in \left[ {0,1} \right]}}\), where θ in [0, 1] is the share of loss that is reimbursed, and \(m_{1} = \left( {1 + \pi } \right)E\left( {\tilde{L}_{1} } \right)\) is the insurance premium for full coverage when the loading factor is π > 0. Similarly, \(\stackrel{{\sim }}{\text{y}}\) can be transformed into any random final wealth in the set \(\left\{ {w_{2} - \tilde{L}_{2} + \theta \left( {\tilde{L}_{2} - m_{2} } \right)} \right\}_{{\theta \in \left[ {0,1} \right]}} = \left\{ {\left( {1 - \theta } \right)\tilde{y} + \theta \left( {w_{2} - m_{2} } \right)} \right\}_{{\theta \in \left[ {0,1} \right]}}\), where \(m_{2} = \left( {1 + \pi } \right)E\left( {\tilde{L}_{2} } \right)\). Each of these sets is exactly in the form of the general linear transformation used when stochastic superiority is defined (Definitions 3 or 4).
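The equality of the two descriptions of each set can be checked by expanding the right-hand side. For \(\stackrel{{\sim }}{\text{x}}\), using \(\tilde{x} = w_{1} - \tilde{L}_{1}\):

$$\left( {1 - \theta } \right)\tilde{x} + \theta \left( {w_{1} - m_{1} } \right) = w_{1} - \tilde{L}_{1} + \theta \tilde{L}_{1} - \theta m_{1} = w_{1} - \tilde{L}_{1} + \theta \left( {\tilde{L}_{1} - m_{1} } \right).$$

Setting α = 1 − θ and ρ = w1 − m1 gives the form \(\alpha \tilde{x} + \left( {1 - \alpha } \right)\rho\) used in the definitions. The same expansion applies to \(\stackrel{{\sim }}{\text{y}}\) with subscript 2.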

It is important to recognize, however, that the price of risk reduction is likely to differ across these two assets due to differences in the loss variables. For these \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\), the cost of risk reduction for \(\stackrel{{\sim }}{\text{x}}\) is ρx = w1 − m1 and for \(\stackrel{{\sim }}{\text{y}}\) is ρy = w2 − m2, and these values may be different. In the specific situation of coinsurance, the cost of risk reduction is partially determined by the loading factor. Theorems 3 and 4 can be used together to establish the following corollary indicating how the likelihood of stochastic superiority changes with changes in the loading factor.

Corollary 4

Suppose that U is a convex set of concave utility functions. If \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) on U under a common loading factor π1 > 0, then \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) on U under any common loading factor π2 such that π1 > π2 ≥ 0.

5 Applications

Three examples are used to illustrate how stochastic superiority provides additional rankings of random variables. All three examples involve the so-called “left tail problem” where \(\stackrel{{\sim }}{\text{x}}\) is better than \(\stackrel{{\sim }}{\text{y}}\) for many or most decision makers, but stochastic dominance (of any degree) of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\) does not hold because \(\stackrel{{\sim }}{\text{x}}\) has more probability mass than \(\stackrel{{\sim }}{\text{y}}\) in the extreme left tail.

The first example is provided in Levy (2006).Footnote 14 Two alternatives are available, \(\stackrel{{\sim }}{\text{x}}\) which yields either $1 or $1,000,000 with probabilities 0.1 and 0.9 respectively, and \(\stackrel{{\sim }}{\text{y}}\) which is either $2 or $3 with those same probabilities. As Levy points out, neither \(\stackrel{{\sim }}{\text{x}}\) nor \(\stackrel{{\sim }}{\text{y}}\) dominates the other in the first-degree, yet \(\stackrel{{\sim }}{\text{x}}\) appears to be clearly better than \(\stackrel{{\sim }}{\text{y}}\). Almost stochastic dominance remedies this and allows \(\stackrel{{\sim }}{\text{x}}\) to be unanimously ranked higher than \(\stackrel{{\sim }}{\text{y}}\) by eliminating those decision makers exhibiting extreme risk aversion from consideration. Stochastic superiority, on the other hand, indicates that \(\stackrel{{\sim }}{\text{x}}\) is unanimously chosen over \(\stackrel{{\sim }}{\text{y}}\) by all decision makers, including those with extreme risk aversion, when all decision makers are permitted to transfer some of the risk in \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) to others.

Specifically, suppose ρ = 2.5, a value that is larger than the smallest values of both \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) (so neither \(\stackrel{{\sim }}{\text{x}}\) nor \(\stackrel{{\sim }}{\text{y}}\) stochastically dominates ρ), and is also smaller than the means of both \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) (implying that risk reduction is costly for both \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\)). It can be readily seen that the transformation (0.1)\(\stackrel{{\sim }}{\text{x}}\) + (0.9)(2.5) yields a random variable, equal to $2.35 with probability 0.1 and $100,002.25 with probability 0.9, which stochastically dominates \(\stackrel{{\sim }}{\text{y}}\) in the first-degree. Thus, by Theorem 1, \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) in the first-degree for ρ = 2.5. This risk reducing transformation of \(\stackrel{{\sim }}{\text{x}}\) eliminates the left tail issue that would lead some with extreme preferences to choose \(\stackrel{{\sim }}{\text{y}}\) over \(\stackrel{{\sim }}{\text{x}}\). This is an extreme transformation since it gives up 90% of the gain associated with the outcome $1,000,000, but those with extreme enough preferences would choose to do this. Those with extreme preferences are willing to transfer risk to others even when the cost of such a transfer is very high. Of course, many decision makers would forgo this risk reducing action because their preferences are less extreme. When this risk transfer is available, the ranking of \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) by those with extreme preferences is aligned with that of the majority.
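The first-degree comparison in this example can be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper names `cdf`, `fsd` and `transform` are hypothetical, distributions are represented as lists of (outcome, probability) pairs, and exact fractions are used to avoid rounding. For discrete distributions it is enough to compare the two cumulative distribution functions at every outcome of either variable.

```python
from fractions import Fraction as Fr

def cdf(dist, t):
    """P(X <= t) for a discrete distribution [(outcome, prob), ...]."""
    return sum(p for v, p in dist if v <= t)

def fsd(a, b):
    """True if a dominates b in the first degree: F_a(t) <= F_b(t) for all t.
    Both CDFs are step functions, so checking every jump point suffices."""
    points = sorted({v for v, _ in a} | {v for v, _ in b})
    return all(cdf(a, t) <= cdf(b, t) for t in points)

def transform(dist, alpha, rho):
    """Linear risk reducing transformation alpha*x + (1 - alpha)*rho."""
    return [(alpha * v + (1 - alpha) * rho, p) for v, p in dist]

# The two alternatives from the example.
x = [(Fr(1), Fr(1, 10)), (Fr(1_000_000), Fr(9, 10))]
y = [(Fr(2), Fr(1, 10)), (Fr(3), Fr(9, 10))]

x_t = transform(x, Fr(1, 10), Fr(5, 2))  # alpha = 0.1, rho = 2.5

print(fsd(x, y))    # False: x has the lower worst outcome
print(fsd(y, x))    # False
print(fsd(x_t, y))  # True: the transformed x dominates y
```

The transformed variable takes the values 47/20 = 2.35 and 100,002.25, so its lowest outcome lies above the lowest outcome of \(\stackrel{{\sim }}{\text{y}}\), which is what removes the left tail issue.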

The next example does not come from the almost stochastic dominance literature. In fact, the almost stochastic dominance modification procedure is unlikely to lead to a ranking of the two alternatives. Suppose one must choose between two goods that can either fail immediately or not fail at all. The first good fails with probability 0.3, and the second good fails with a higher probability 0.5. Assume that the value provided by each good is $1000 in the case of no failure and $0 in the case of failure. For risk averse buyers, these goods must sell for less than expected value. Assume the selling price of the first good is $500, and that of the second is $400. Which good does a risk averse decision maker choose?

When there is no refund and no warranty against failure, the first good generates a random outcome, denoted \(\stackrel{{\sim }}{\text{x}}\), that yields a net loss of $500 with probability 0.3 and a net gain of $500 with probability 0.7. Similarly, the second good generates a random outcome, denoted \(\stackrel{{\sim }}{\text{y}}\), that yields a net loss of $400 or a net gain of $600 but with equal probability. Many risk averse decision makers would choose the first good over the second both because of the smaller probability of failure and the larger expected value, but many others would choose the second over the first because of the smaller net loss when failure occurs. It does not require extreme risk preferences to choose the second good over the first, making it difficult to know which preferences to eliminate in the almost stochastic dominance procedure.
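The disagreement among risk averse decision makers can be illustrated with a specific utility family. The sketch below assumes constant absolute risk aversion (CARA) utility u(w) = −exp(−a·w); this functional form and the two coefficients a are illustrative assumptions that do not appear in the original analysis, and the function name `expected_utility` is hypothetical.

```python
import math

def expected_utility(dist, a):
    """Expected CARA utility E[-exp(-a*w)] for dist = [(outcome, prob), ...]."""
    return sum(p * -math.exp(-a * w) for w, p in dist)

# Net outcomes of the two goods without refund or warranty.
x = [(-500, 0.3), (500, 0.7)]
y = [(-400, 0.5), (600, 0.5)]

for a in (0.001, 0.01):
    eu_x = expected_utility(x, a)
    eu_y = expected_utility(y, a)
    print(a, "prefers x" if eu_x > eu_y else "prefers y")
# 0.001 prefers x
# 0.01 prefers y
```

Under these assumed preferences the less risk averse decision maker chooses the first good and the more risk averse one chooses the second, so neither alternative is unanimously preferred when no risk transfer is available.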

It can be demonstrated that the first good is stochastically superior to the second in the second-degree even though it does not stochastically dominate in the second-degree. Specifically, suppose that the cost of risk transfer is high, ρ = 0. Even with this high cost of transferring risk to others, it can be readily seen that for ρ = 0 and α = 0.8, the transformed \(\stackrel{{\sim }}{\text{x}}\) (i.e., 0.8 \(\stackrel{{\sim }}{\text{x}}\), a net loss of $400 with probability 0.3 and a net gain of $400 with probability 0.7) stochastically dominates \(\stackrel{{\sim }}{\text{y}}\) in the second-degree and therefore by Theorem 1, \(\stackrel{{\sim }}{\text{x}}\) is stochastically superior to \(\stackrel{{\sim }}{\text{y}}\) in the second-degree for ρ = 0. This risk reducing transfer occurs by finding someone to take a 20% share in both outcomes associated with the purchase of the good. Many risk averse decision makers prefer the original \(\stackrel{{\sim }}{\text{x}}\) to this transformed one, but this particular transformation of \(\stackrel{{\sim }}{\text{x}}\) would be preferred to both the original \(\stackrel{{\sim }}{\text{x}}\) and to \(\stackrel{{\sim }}{\text{y}}\) by those with preferences putting significant weight on outcomes in the left tail. Even though \(\stackrel{{\sim }}{\text{x}}\) does not stochastically dominate \(\stackrel{{\sim }}{\text{y}}\) in the second-degree, \(\stackrel{{\sim }}{\text{x}}\) is preferred or indifferent to \(\stackrel{{\sim }}{\text{y}}\) by all risk averse decision makers when the decision makers are allowed to transform \(\stackrel{{\sim }}{\text{x}}\) (and \(\stackrel{{\sim }}{\text{y}}\)) in this risk reducing way.
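The second-degree comparison can also be checked numerically. The sketch below is an illustration under stated assumptions rather than the procedure used in the paper: the helper names are hypothetical, and second-degree dominance is tested through the standard integral condition, using the fact that for a discrete distribution the integral of the CDF up to t equals E[max(t − X, 0)]. Because this quantity is piecewise linear in t with kinks only at outcomes, checking every outcome of either variable is sufficient.

```python
from fractions import Fraction as Fr

def integrated_cdf(dist, t):
    """Integral of the CDF up to t, computed as E[max(t - X, 0)]."""
    return sum(p * max(t - v, 0) for v, p in dist)

def ssd(a, b):
    """True if a dominates b in the second degree: the integrated CDF of a
    is no larger than that of b at every outcome of either distribution."""
    points = sorted({v for v, _ in a} | {v for v, _ in b})
    return all(integrated_cdf(a, t) <= integrated_cdf(b, t) for t in points)

def transform(dist, alpha, rho):
    """Linear risk reducing transformation alpha*x + (1 - alpha)*rho."""
    return [(alpha * v + (1 - alpha) * rho, p) for v, p in dist]

# Net outcomes of the two goods.
x = [(-500, Fr(3, 10)), (500, Fr(7, 10))]
y = [(-400, Fr(1, 2)), (600, Fr(1, 2))]

x_t = transform(x, Fr(4, 5), 0)  # alpha = 0.8, rho = 0

print(ssd(x, y))    # False: x has the larger loss in the left tail
print(ssd(y, x))    # False
print(ssd(x_t, y))  # True: 0.8*x dominates y in the second degree
```

At the three relevant points −400, 400 and 600, the integrated CDF of the transformed \(\stackrel{{\sim }}{\text{x}}\) equals 0, 240 and 440, against 0, 400 and 500 for \(\stackrel{{\sim }}{\text{y}}\).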

These two examples illustrate the way in which stochastic superiority can help remedy the left tail problem and allow the ranking of more pairs of random variables. In general, when random variable \(\stackrel{{\sim }}{\text{x}}\) has a lowest outcome that is smaller than the lowest outcome for \(\stackrel{{\sim }}{\text{y}}\), denoted ax < ay, one can choose an α in [0, 1] so that α·ax + (1 − α)·ρ ≥ ay as long as ρ > ay. This latter condition is often met since the riskless return is above the lowest outcome for all risky choices.
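The range of α that lifts the lowest outcome in this way can be written explicitly. Assuming ax < ay < ρ, so that ρ − ax > 0, rearranging the inequality gives:

$$\alpha a_{x} + \left( {1 - \alpha } \right)\rho \ge a_{y} \;\Longleftrightarrow\; \alpha \left( {\rho - a_{x} } \right) \le \rho - a_{y} \;\Longleftrightarrow\; \alpha \le \frac{{\rho - a_{y} }}{{\rho - a_{x} }}.$$

In the first example (ax = 1, ay = 2, ρ = 2.5) the bound is 0.5/1.5 = 1/3, which the chosen α = 0.1 satisfies. In the second example (ax = −500, ay = −400, ρ = 0) the bound is 400/500 = 0.8, so the chosen α = 0.8 sits exactly at the bound. Note that this bound concerns only the lowest outcome; it is necessary for the transformed \(\stackrel{{\sim }}{\text{x}}\) to dominate \(\stackrel{{\sim }}{\text{y}}\), but dominance itself must still be verified.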

The above two examples have been constructed to illustrate a particular weakness in traditional stochastic dominance analysis referred to as the left tail problem. The final application is not a constructed example but instead an application to a model that is prominently discussed in the literature. In this application, stochastic superiority is shown to add a significant finding to the very large literature on self-protection.

The left tail problem arises in the model of self-protection introduced by Ehrlich and Becker (1972) and is the most prominent characteristic of that decision model. As Ehrlich and Becker point out, it is the case that choosing more self-protection always leads to a smaller lowest outcome than when less self-protection is selected. This fact implies that those decision makers who are extremely risk averse would always prefer lower levels of self-protection to higher levels no matter how effective self-protection is in reducing the probability of loss and increasing the expected value of the random outcome. As a result, without modifying the basic model of Ehrlich and Becker, it is impossible to establish any theorems which indicate that increased self-protection would make all risk averse decision makers better off.

This seemingly perverse feature of the self-protection model can be eliminated if the decision maker can both self-protect and also transfer some of the risk that is assumed to others. Including such a risk transfer possibility is enough to present a positive result in which conditions are provided indicating when all risk averse decision makers would choose a higher level of self-protection over a lower level. These conditions depend on the parameter values in the self-protection model but are reasonable and are satisfied in many instances. The model of the decision to self-protect used here is that presented by Ehrlich and Becker (1972) and the notation used is that found in Eeckhoudt and Gollier (2005).

Eeckhoudt and Gollier assume that a decision maker begins with certain wealth w and that this wealth is subject to loss L > 0. This loss occurs with probability p(e), where e represents the level of self-protection investment/cost chosen by the decision maker and p(e) is decreasing in e. Let W denote final wealth, which can take on one of two values, either (w − L − e) or (w − e), with probabilities p(e) and 1 − p(e), respectively.

Assume that there are two different levels of self-protection e1 ≥ 0 and e2 > e1. Also, to simplify notation, let p(e1) = p1 and p(e2) = p2 < p1. Let \(\stackrel{{\sim }}{\text{x}}\) denote the random wealth with the larger amount of self-protection, and \(\stackrel{{\sim }}{\text{y}}\) the random wealth with the smaller amount of self-protection. Therefore, \(\stackrel{{\sim }}{\text{x}}\) yields either (w − L − e2) or (w − e2), with probabilities p2 and (1 − p2) respectively, and \(\stackrel{{\sim }}{\text{y}}\) yields either (w − L − e1) or (w − e1), with probabilities p1 and (1 − p1) respectively. The mean of \(\stackrel{{\sim }}{\text{x}}\) is larger than the mean of \(\stackrel{{\sim }}{\text{y}}\) if and only if

$$\left( {p_{1} - p_{2} } \right)L > e_{2} - e_{1} .$$
(1)
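The condition in (Eq. 1) follows from comparing the two means directly. Since \(E\left( {\tilde{x}} \right) = p_{2} \left( {w - L - e_{2} } \right) + \left( {1 - p_{2} } \right)\left( {w - e_{2} } \right) = w - e_{2} - p_{2} L\), and similarly for \(\stackrel{{\sim }}{\text{y}}\):

$$E\left( {\tilde{x}} \right) - E\left( {\tilde{y}} \right) = \left( {w - e_{2} - p_{2} L} \right) - \left( {w - e_{1} - p_{1} L} \right) = \left( {p_{1} - p_{2} } \right)L - \left( {e_{2} - e_{1} } \right),$$

which is positive exactly when the reduction in expected loss exceeds the added cost of self-protection.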

The term “effective” is used to describe the added level of self-protection when (Eq. 1) holds. Later on a cost–benefit ratio proves useful, so the following definition is suggested.

Definition 5

Self-protection level e2 is said to be an effective increase over e1 whenever (Eq. 1) holds. In addition, \(\left({e}_{2}-{e}_{1}\right)/\left[L\left({p}_{1}-{p}_{2}\right)\right]\) is said to be the cost–benefit ratio for the added level of self-protection.

The cost–benefit ratio indicates how much additional effort must be expended per unit reduction in the mean loss. Obviously, an increase in self-protection effort from e1 to e2 is effective if and only if the associated cost–benefit ratio is less than one. When this ratio is less than one, neither \(\stackrel{{\sim }}{\text{x}}\) nor \(\stackrel{{\sim }}{\text{y}}\) stochastically dominates the other in the second (or any) degree.Footnote 15

Suppose that coinsurance is available to insure the self-protected asset at a loading factor π ≥ 0, and the individual can decide on a share of loss, denoted (1 − α), that is reimbursed.Footnote 16 Then \(\tilde{x} = {\text{w}} - {\text{e}}_{2} - \tilde{L}_{x}\), where \(\stackrel{{\sim }}{\text{L}}\)x yields either L or 0 with probabilities p2 and (1 − p2) respectively. With coinsurance, \(\stackrel{{\sim }}{\text{x}}\) can be transformed into any element of the form:

$$\left\{ {\alpha \tilde{x} + \left( {1 - \alpha } \right)\rho_{x} } \right\}_{{\alpha \in \left[ {0,1} \right]}}$$

where \(\rho_{x} = w - e_{2} - \left( {1 + \pi } \right)E\left( {\tilde{L}_{x} } \right) = w - e_{2} - \left( {1 + \pi } \right)p_{2} L.\)

Similarly, \(\tilde{y} = w - e_{1} - \tilde{L}_{y}\), where \(\stackrel{{\sim }}{\text{L}}\)y yields either L or 0 with probabilities p1 and (1 − p1) respectively, can be transformed into any element of the form:

$$\left\{ {\alpha \tilde{y} + \left( {1 - \alpha } \right)\rho_{y} } \right\}_{{\alpha \in \left[ {0,1} \right]}}$$

where \(\rho_{y} = w - e_{1} - \left( {1 + \pi } \right)E\left( {\tilde{L}_{y} } \right) = w - e_{1} - \left( {1 + \pi } \right)p_{1} L.\)
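It may help to compare these two values directly. Subtracting the expressions for ρy and ρx given above, and using π ≥ 0 together with p1 > p2 for the first inequality and (Eq. 1) for the second:

$$\rho_{x} - \rho_{y} = \left( {1 + \pi } \right)\left( {p_{1} - p_{2} } \right)L - \left( {e_{2} - e_{1} } \right) \ge \left( {p_{1} - p_{2} } \right)L - \left( {e_{2} - e_{1} } \right) > 0.$$

So when the increase in self-protection is effective, ρx > ρy at any common loading factor; that is, risk reduction is less costly for \(\stackrel{{\sim }}{\text{x}}\) than for \(\stackrel{{\sim }}{\text{y}}\), which is the ordering ρx ≥ ρy considered in Theorem 4.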

The following theorem states that an effective increase in self-protection is stochastically superior to the status quo in the second-degree, as long as the loading factor is below a certain critical level that is determined by the parameters of the self-protection model.

Theorem 5

Suppose that an increase in the level of self-protection from e1 to e2 is effective, and that coinsurance is available to insure the self-protected asset at a loading factor π ≥ 0. Then all risk averse decision makers choose the larger level of self-protection over the smaller level if and only if the loading factor π satisfies

$$\pi \le \frac{{\left( {1 - {\text{p}}_{2} } \right)}}{{{\text{p}}_{2} }}\left[ {1 - \frac{{\left( {{\text{e}}_{2} - {\text{e}}_{1} } \right)}}{{{\text{L}}\left( {{\text{p}}_{1} - {\text{p}}_{2} } \right)}}} \right]$$
(2)

Proof

See the Appendix.

Note that the ratio in the brackets is the cost–benefit ratio for the increase in self-protection. It can be readily seen that (Eq. 2) can be satisfied with π > 0 whenever the additional self-protection is effective (i.e., the cost–benefit ratio is less than one).

Theorem 5 indicates that when self-protection is effective, and the loading factor is small enough, all risk averse persons choose more self-protection. To see how this condition can be used, consider the following simple but reasonable parameter values. Let p1 = 0.2, p2 = 0.1, L = 100, e1 = 5 and e2 = M. If M < 15, self-protection is effective since the probability change decreases mean loss by 10. When M = 10, the cost–benefit ratio is 0.5, and loading factors π ≤ 4.5 are small enough that all risk averse decision makers choose the larger level of self-protection. When the added self-protection costs more, say e2 = 14, the cost–benefit ratio increases to 0.9 and the right-hand side of (Eq. 2) becomes 0.9. Even so, loading factors no greater than 0.9 satisfy the condition in Theorem 5. Finally, as the cost of the added self-protection further increases so that e2 = 15, the cost–benefit ratio increases to 1, and no positive loading factor would make every risk averse individual better off at the higher level of self-protection.
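The arithmetic in this illustration can be reproduced with a few lines of code. The sketch below simply evaluates the cost–benefit ratio of Definition 5 and the right-hand side of (Eq. 2); the function names `cost_benefit_ratio` and `critical_loading` are hypothetical, and exact fractions are used so that the printed values match the text.

```python
from fractions import Fraction as Fr

def cost_benefit_ratio(e1, e2, p1, p2, L):
    """(e2 - e1) / [L (p1 - p2)] from Definition 5."""
    return (e2 - e1) / (L * (p1 - p2))

def critical_loading(e1, e2, p1, p2, L):
    """Right-hand side of (Eq. 2): the largest loading factor for which
    the condition in Theorem 5 holds."""
    return (1 - p2) / p2 * (1 - cost_benefit_ratio(e1, e2, p1, p2, L))

# Parameter values from the illustration.
p1, p2, L, e1 = Fr(1, 5), Fr(1, 10), 100, 5

for e2 in (10, 14, 15):
    ratio = cost_benefit_ratio(e1, e2, p1, p2, L)
    bound = critical_loading(e1, e2, p1, p2, L)
    print(e2, ratio, bound)
# 10 1/2 9/2
# 14 9/10 9/10
# 15 1 0
```

The factor (1 − p2)/p2 equals 9 for these parameters, so the critical loading factor is nine times the gap between one and the cost–benefit ratio.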

6 Conclusion

Including the opportunity to simultaneously reduce risk when making the primary decision of choosing between a pair of random alternatives can have an impact on that primary decision. This paper extends the notion of stochastic dominance to the notion of stochastic superiority by introducing such a risk reducing secondary decision. Stochastic superiority allows more pairs of random alternatives to be ranked than stochastic dominance. Both a sufficient and a necessary condition for stochastic superiority are established. These two conditions are identical when the relevant set of utility functions is a convex set of concave utility functions. One very important application of stochastic superiority is to provide an alternative to almost stochastic dominance when seeking to resolve the “left tail problem.”

The analysis, theorems and proofs in this paper all assume decisions are made on the basis of expected utility (EU). Examples where violations of the EU axioms occur abound, including the well-known Allais paradox. As a result, many non-EU models of decision making under risk have been suggested.Footnote 17 This fact raises the question of the usefulness of stochastic superiority when the axioms of EU do not hold. Much like first- or second-degree stochastic dominance, first- or second-degree stochastic superiority can play a role in non-EU decision models. For instance, it is now common to define first-degree stochastic dominance of F(x) over G(x) using the necessary and sufficient condition derived in an EU decision model; that is, when F(x) ≤ G(x) for all x. Even though this condition on random variables was interpreted within an EU decision model, it has since been extensively used as the definition of first-degree stochastic dominance in non-EU decision models. In fact, proposed non-EU decision models are often judged in part by how the decision model performs when such a first-degree stochastically dominant change occurs.

Similar uses for stochastic superiority in non-EU decision models are possible. First-degree stochastic superiority of \(\stackrel{{\sim }}{\text{x}}\) over \(\stackrel{{\sim }}{\text{y}}\) for fixed ρ can be defined as occurring when for some α in [0, 1] and a fixed ρ, α·\(\stackrel{{\sim }}{\text{x}}\) + (1 − α)·ρ stochastically dominates \(\stackrel{{\sim }}{\text{y}}\) in the first-degree. Such a definition does not assume EU even though it was formulated and interpreted assuming EU. Non-EU decision models can then be judged by how the model performs when such a first-degree stochastically superior change occurs. Thus, even though all results in the paper are derived assuming an EU decision model, applications to non-EU decision models are possible. Exploring stochastic superiority in the various non-EU decision models would be too large an addition to this paper, but can be a valuable avenue for future research.

There are also at least two other major directions in which this work can be extended or used. First, the assumption of linear risk reducing transformations adopted in this paper could be changed. A lengthier title for this paper could involve the words “stochastic superiority with respect to linear transformations” to make this obvious. Of course, a random prospect’s risk can also be reduced through nonlinear transformations reflecting such things as deductible insurance or options rather than coinsurance or diversification using a riskless asset. The framework here allows such an extension, and using a concave or convex transformation would result in a different form of stochastic superiority.

The second direction of extension involves an application to a different literature. The work here is presented as part of decision making under risk. If the distributions describing random variables are instead assumed to represent distributions of income or wealth, then the findings in this paper can be applied to the literature that ranks income or wealth distributions and specifically the inequality literature. In that literature, the transformations of \(\stackrel{{\sim }}{\text{x}}\) and \(\stackrel{{\sim }}{\text{y}}\) have the very natural interpretation of using taxes, subsidies or other policies to redistribute income or wealth. Since linear or non-linear transformations alter both tails of a distribution, the effect on both the right and left tail can be discussed. For decision making under risk, the effect on the left tail is the primary concern, but for income inequality the effect on each tail is important.