1 Introduction

The problem of optimal portfolio selection is the subject of major theoretical and computational studies in finance. A fundamental issue in dealing with uncertain outcomes is a theoretically sound approach to their comparison.

The theory of stochastic orders plays a fundamental role in economics (see Mosler and Scarsini 1991; Whitmore and Findlay 1978). Stochastic orders are relations that induce a partial order in the space of real random variables in the following way. A random variable R dominates the random variable Y if E[u(R)] ≥ E[u(Y)] for all functions u(⋅) from a certain set of functions, called the generator of the order. The concept of stochastic dominance is very popular and widely used in economics and finance because of its relation to models of risk-averse preferences (Fishburn 1964). It originated from the theory of majorization (Hardy et al. 1934) for the discrete case, and was later extended to general distributions (Quirk and Saposnik 1962; Hadar and Russell 1969; Rotschild and Stiglitz 1969). Stochastic dominance of second order is defined by the set of nondecreasing concave functions: a random variable R dominates another random variable Y in the second order if E[u(R)] ≥ E[u(Y)] for all nondecreasing concave functions u(⋅) for which these expected values are finite. Thus, no risk-averse decision maker will prefer a portfolio with return rate Y over a portfolio with return rate R.

A popular approach is the utility optimization approach. Von Neumann and Morgenstern (1944) developed the expected utility theory: for every rational decision maker there exists a utility function u(⋅) such that the decision maker prefers outcome R over outcome Y if and only if E[u(R)] > E[u(Y)]. This approach can also be implemented very efficiently; however, it is almost impossible to elicit the utility function of a decision maker explicitly. More difficulties arise when a group of decision makers with different utility functions have to reach a consensus. Recently, the dual utility theory (or rank dependent expected utility theory) has attracted much attention in economics. This approach was first presented by Quiggin (1982) and later rediscovered in a special case by Yaari (1987). From a system of axioms different from that of von Neumann and Morgenstern, one derives that every decision maker has a certain rank dependent utility function w : [0, 1] → R. Then a nonnegative outcome R is preferred over a nonnegative outcome Y if and only if

$$-{\int \nolimits \nolimits}_{0}^{1}w(p){\mathit{dF}}_{(-1)}(R;p) \geq -{\int \nolimits \nolimits}_{0}^{1}w(p){\mathit{dF}}_{(-1)}(Y ;p),$$
(15.1)

where \(F_{(-1)}(R;\cdot)\) is the inverse distribution function of R. For a comprehensive treatment of the rank dependent utility theory, we refer to Quiggin (1993), and for its application in actuarial mathematics, see Wang et al. (1997) and Wang and Yong (1998).

Another classical approach, pioneered by Markowitz (1952, 1959, 1987), is the mean-risk approach, which compares the portfolios with respect to two characteristics. One is the expected return rate (the mean) and another one is the risk, which is given by some scalar measure of the uncertainty of the portfolio return rate. The mean-risk approach recommends the selection of Pareto-efficient portfolios with respect to these two criteria. In a mean-risk portfolio model we combine these criteria by specifying some parameter as a tradeoff between them. As a parametric optimization problem the mean-risk model can be solved numerically very efficiently, which makes this approach very attractive (Konno and Yamazaki 1991; Ruszczyński and Vanderbei 2003).

In this paper we formulate a model for risk-averse portfolio optimization and demonstrate its relation to the expected utility approach and to the rank dependent utility approach. We optimize the portfolio performance under an additional constraint that the portfolio return rate stochastically dominates a benchmark return rate, for example, the return rate of an index. The model is based on the publications of Dentcheva and Ruszczyński (2003a, b, c; 2004a, b) where a new model of risk-averse optimization has been introduced. This approach has a fundamental advantage over mean-risk models and utility function models: all data for our model are readily available. In mean-risk models the choice of the risk measure has an arbitrary character, and it is difficult to argue for one measure against another. Similarly, optimization of expected utility requires the form of the utility function to be specified. Our analysis, departing from the benchmark outcome, generates an implied utility function of the decision maker. It is implicitly defined by the benchmark used, and by the problem under consideration. We provide two problem formulations, in which the stochastic dominance relation has either a primal form or an inverse (Lorenz curve) form. The primal form has a dual problem in terms of expected utility functions, and the inverse form has a dual problem in terms of rank dependent utility functions. In this way our model provides a link between these two competing economic approaches. Duality relations with coherent measures of risk are explored in Dentcheva and Ruszczyński (2008).

2 The Portfolio Problem

Let \(R_1, R_2, \ldots, R_n\) be random return rates of assets \(1, 2, \ldots, n\). We assume that \(\mathit{E}[\vert {R}_{j}\vert ]\,<\,\infty \) for all \(j = 1, \ldots, n\).

Our aim is to invest our capital in these assets in order to obtain some desirable characteristics of the total return rate on the investment. Denoting by \(x_1, x_2, \ldots, x_n\) the fractions of the initial capital invested in assets \(1, 2, \ldots, n\), we can easily derive the formula for the total return rate:

$$R(x) = {R}_{1}{x}_{1} + {R}_{2}{x}_{2} + \ldots + {R}_{n}{x}_{n}.$$
(15.2)

Clearly, the set of possible asset allocations can be defined as follows:

$$X =\{x \in {\mathit{R}}^{n} : {x}_{1} + {x}_{2} + \ldots + {x}_{n} = 1,\;{x}_{j} \geq 0,j = 1,2,\ldots ,n\}.$$

In some applications one may introduce the possibility of short positions (i.e., allow some of the \(x_j\)'s to become negative). Other restrictions may limit the exposure to particular assets or their groups, by imposing upper bounds on the \(x_j\)'s or on their partial sums. One can also limit the absolute differences between the \(x_j\)'s and some reference investments \(\bar{{x}}_{j}\), which may represent the existing portfolio, and so on. Our analysis does not depend on the detailed way this set is defined; we only use the fact that it is a convex polyhedron. All modifications discussed above define some convex polyhedral feasible sets and are, therefore, covered by our approach.
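As a small illustration of Equation (15.2) and of the set X, the following sketch evaluates the realizations of R(x) over a finite list of scenarios and tests membership in the simplex. The scenario data, the function names, and the tolerance are assumptions made for this example only; they do not come from the text.

```python
def portfolio_returns(x, scenarios):
    """Realizations of R(x) = sum_j R_j x_j, one value per scenario.

    `scenarios` is a list of rows; row t holds the return rates of all
    assets in scenario t (hypothetical layout chosen for this sketch).
    """
    return [sum(r_j * x_j for r_j, x_j in zip(row, x)) for row in scenarios]


def in_simplex(x, tol=1e-9):
    """Membership test for X = {x : x_1 + ... + x_n = 1, x_j >= 0}."""
    return abs(sum(x) - 1.0) <= tol and all(x_j >= -tol for x_j in x)


# Hypothetical data: 3 scenarios (rows) for 2 assets (columns).
scenarios = [[0.10, 0.02], [-0.05, 0.03], [0.20, 0.01]]
x = [0.5, 0.5]
returns = portfolio_returns(x, scenarios)
```

A portfolio with a short position, such as `[1.2, -0.2]`, fails `in_simplex`; admitting it would require one of the modified polyhedral sets discussed above.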

The main difficulty in formulating a meaningful portfolio optimization problem is the definition of the preference structure among feasible portfolios. If we use only the mean return rate E[R(x)], then the resulting optimization problem has a trivial and meaningless solution: invest everything in assets that have the maximum expected return rate. For these reasons, the practice of portfolio optimization usually resorts to two approaches.

In the first approach we associate with portfolio x some dispersion measure ρ(R(x)) representing the variability of the return rate R(x). In the classical Markowitz model the function ρ(R(x)) is the variance of the return rate,

$$\rho (R(x)) = \mathit{V}[R(x)],$$

but many other measures are possible here as well.

The mean-risk portfolio optimization problem is formulated as follows:

$${\max}_{x\in X}\mathit{E}[R(x)] - \lambda \rho (R(x)).$$
(15.3)

Here, λ is a nonnegative parameter representing our desirable exchange rate of mean for risk. If λ = 0, the risk has no value and the problem reduces to the problem of maximizing the mean. If λ > 0 we look for a compromise between the mean and the risk. Alternatively, one can minimize the risk function ρ(R(x)), while fixing the expected return rate E[R(x)] at some value m, and consider a family of problems parametrized by m. The reader is referred to the book by Elton et al. (2006) for the modern perspective on mean-risk analysis in portfolio theory.
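The effect of the parameter λ in problem (15.3) can be seen on a toy example. The sketch below assumes the Markowitz choice of variance as the risk measure, a finite set of equally likely scenarios, and a crude grid search over two-asset portfolios; the data and the grid search are illustrative assumptions, not a recommended solution method.

```python
def mean(values, probs):
    return sum(p * v for v, p in zip(values, probs))


def variance(values, probs):
    m = mean(values, probs)
    return sum(p * (v - m) ** 2 for v, p in zip(values, probs))


def mean_risk_objective(x, scenarios, probs, lam):
    """E[R(x)] - lam * V[R(x)] on scenario data (rows = scenarios)."""
    r = [sum(a * b for a, b in zip(row, x)) for row in scenarios]
    return mean(r, probs) - lam * variance(r, probs)


def best_on_grid(scenarios, probs, lam, steps=100):
    """Grid search over two-asset portfolios x = (a, 1 - a), a in [0, 1]."""
    grid = [(k / steps, 1 - k / steps) for k in range(steps + 1)]
    return max(grid, key=lambda x: mean_risk_objective(x, scenarios, probs, lam))


# Hypothetical data: asset 1 is volatile with mean 0.10, asset 2 is nearly riskless.
scenarios = [[0.30, 0.02], [-0.20, 0.03], [0.20, 0.01]]
probs = [1 / 3, 1 / 3, 1 / 3]
x_no_risk = best_on_grid(scenarios, probs, lam=0.0)
x_averse = best_on_grid(scenarios, probs, lam=5.0)
```

With λ = 0 the search puts all capital into the asset with the highest mean, which is the trivial solution mentioned earlier; with λ = 5 part of the capital moves to the low-variance asset.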

The general question of constructing mean-risk models that are in harmony with the stochastic dominance relations has been the subject of the analysis of the recent papers by Ogryczak and Ruszczyński (1999; 2001; 2002). We have identified there several primal risk measures, most notably central semideviations, and dual risk measures, based on the Lorenz curve, which are consistent with the stochastic dominance relations.

The second approach is to select a certain utility function \(u : \mathit{R} \rightarrow \mathit{R}\) and to formulate the following optimization problem

$${\max}_{x\in X}\mathit{E}[u(R(x))].$$
(15.4)

It is usually required that the function u (⋅) is concave and nondecreasing, thus representing preferences of a risk-averse decision maker (Fishburn 1964; 1970).

Recently, a dual (rank dependent) utility model has attracted much attention. It is based on distorting the cumulative probability distribution of the random variable R(x) rather than applying a nonlinear function u(⋅) to the realizations of R(x). The corresponding problem has the following form

$${\max}_{x\in X}{\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(R(x),p)\mathit{dw}(p).$$
(15.5)

Here \(F_{(-1)}(R(x),p)\) is the p-quantile of the random variable R(x), and w(⋅) is the rank dependent utility function, which distorts the probability distribution. We discuss this in Sect. 15.3.2.

The challenge in both utility approaches is to select the appropriate utility function or rank dependent utility function that represents our preferences and whose application leads to nontrivial and meaningful solutions of problem (15.4) or (15.5).

In this paper we propose an alternative approach: introducing a comparison to a benchmark return rate into our optimization problem. The comparison is based on the stochastic dominance relation. More specifically, we consider only portfolios whose return rates stochastically dominate a certain benchmark return rate.

3 Stochastic Dominance

3.1 Direct Forms

In the stochastic dominance approach, random return rates are compared by a point-wise comparison of some performance functions constructed from their distribution functions. For a real random variable V , its first performance function is defined as the right-continuous cumulative distribution function of V :

$${F}_{1}(V ;\eta ) = \mathit{P}\{V \leq \eta \}\quad \mathit{for}\;\eta \in \mathit{R}.$$

A random return V is said (Lehmann 1955; Quirk and Saposnik 1962) to stochastically dominate another random return S in the first order, denoted \(V {\succcurlyeq}_{(1)} S\), if

$${F}_{1}(V ;\eta ) \leq {F}_{1}(S;\eta )\quad \mathit{for}\;\mathit{all}\;\eta \in \mathit{R}.$$

We can say that V is “stochastically larger” than S, because it takes values lower than η with smaller (or equal) probabilities than S, no matter what the target η is.

The second performance function \(F_2\) is given by areas below the distribution function \(F_1\),

$${F}_{2}(V ;\eta ) ={\int \nolimits \nolimits}_{-\infty}^{\eta}{F}_{1}(V ;\xi )\,d\xi \quad \mathit{for}\ \eta \in \mathit{R},$$

and defines the weak relation of the second order stochastic dominance (SSD). That is, random return V stochastically dominates S in the second order, denoted \(V {\succcurlyeq}_{(2)} S\), if

$${F}_{2}(V ;\eta ) \leq {F}_{2}(S;\eta )\quad \mathit{for}\;\mathit{all}\;\eta \in \mathit{R}.$$

(see Hadar and Russell 1969; Rotschild and Stiglitz 1969).

We can express the function F 2(V ; ⋅) as the expected shortfall (see, for example, Levy 2006; Ogryczak and Ruszczyński 1999): for each target value η we have

$${F}_{2}(V ;\eta ) = \mathit{E}[{(\eta - V )}_{+}],$$
(15.6)

where \({(\eta - V )}_{+} =\max (\eta - V,0)\). The function \(F_2(V;\cdot)\) is continuous, convex, nonnegative and nondecreasing. It is well defined for all random variables V with finite expected value. Due to this representation, the second order stochastic dominance relation \(V {\succcurlyeq}_{(2)} S\) can be equivalently characterized by the system of inequalities on the expected shortfall below any target η:

$$\mathit{E}[{(\eta - V )}_{+}] \leq \mathit{E}[{(\eta - S)}_{+}]\quad \mathit{for}\;\mathit{all}\;\eta \in \mathit{R}.$$
(15.7)
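For a discrete random variable the identity (15.6) can be checked numerically. The sketch below computes \(F_2(V;\eta)\) in two ways: by integrating the step function \(F_1(V;\cdot)\) exactly, and as the expected shortfall \(\mathit{E}[(\eta - V)_+]\). The data and the function names are hypothetical and serve only to illustrate the identity.

```python
def cdf(values, probs, eta):
    """F_1(V; eta) = P{V <= eta} for a discrete V."""
    return sum(p for v, p in zip(values, probs) if v <= eta)


def expected_shortfall(values, probs, eta):
    """E[(eta - V)_+], the right-hand side of (15.6)."""
    return sum(p * max(eta - v, 0.0) for v, p in zip(values, probs))


def f2_by_integration(values, probs, eta):
    """Integral of the step function F_1(V; .) from -infinity to eta.

    F_1 vanishes below the smallest realization and is constant between
    consecutive realizations, so the integral is a finite sum.
    """
    pts = sorted(v for v in set(values) if v < eta) + [eta]
    total = 0.0
    for left, right in zip(pts, pts[1:]):
        total += cdf(values, probs, left) * (right - left)
    return total


# Hypothetical discrete distribution and target.
values, probs, eta = [-0.1, 0.0, 0.2], [0.25, 0.5, 0.25], 0.1
by_integral = f2_by_integration(values, probs, eta)
by_shortfall = expected_shortfall(values, probs, eta)
```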

Also, we obtain an equivalent characterization in terms of the expected utility theory of von Neumann and Morgenstern (see, for example, Hanoch and Levy 1969; Levy 2006; Müller and Stoyan 2002):

  • For any two random variables V, S the relation \(V {\succcurlyeq}_{(1)} S\) holds true if and only if for all nondecreasing functions u(⋅) defined on R we have

    $$\mathit{E}[u(V )] \geq \mathit{E}[u(S)].$$
    (15.8)
  • For any two random variables V, S with finite expectations, the relation \(V {\succcurlyeq}_{(2)} S\) holds true if and only if Equation (15.8) is satisfied for all nondecreasing concave functions u(⋅).

In the context of portfolio optimization, we consider stochastic dominance relations between random return rates defined by Equation (15.2). Thus, we say that portfolio x dominates portfolio y in the first order, if

$${F}_{1}(R(x);\eta ) \leq {F}_{1}(R(y);\eta )\quad \mathit{for}\;\mathit{all}\;\eta \,\in \,\mathit{R}.$$

This is illustrated in Fig. 15.1.

Fig. 15.1 First order stochastic dominance R(x) ≽(1) R(y)

Similarly, we say that x dominates y in the second order (R(x) ≽(2) R(y)), if

$${F}_{2}(R(x);\eta ) \leq {F}_{2}(R(y);\eta )\quad \mathit{for}\;\mathit{all}\;\eta \in \mathit{R}.$$

The second order relation is illustrated in Fig. 15.2.

Fig. 15.2 Second order dominance R(x) ≽(2) R(y)

Recall that the individual return rates \(R_j\) have finite expected values and thus the function \(F_2(R(x);\cdot)\) is well defined.

3.2 Inverse Forms

Let us consider the inverse model of stochastic dominance, frequently referred to as Lorenz dominance. For a real random variable V (for example, a random return rate) we define the left-continuous inverse of the cumulative distribution function F 1(V ; ⋅) as follows:

$${F}_{(-1)}(V ;p) =\inf \;\{\eta : {F}_{1}(V ;\eta ) \geq p\}\quad \mathit{for}\quad 0 < p < 1.$$

Given p ∈ (0, 1), the number q = q(V ; p) is called a p-quantile of the random variable V if

$$\mathit{P}\{V < q\} \leq p \leq \mathit{P}\{V \leq q\}.$$

For p ∈ (0, 1) the set of p-quantiles is a closed interval and \(F_{(-1)}(V;p)\) represents its left end. Directly from the definition of the first order dominance we see that

$$V {\succcurlyeq}_{(1)}S \Leftrightarrow {F}_{(-1)}(V ;p) \geq {F}_{(-1)}(S;p)\ \ \mathit{for}\;\mathit{all}\ \ 0 < p < 1.$$
(15.9)
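For discrete distributions the quantile characterization (15.9) can be verified with finitely many comparisons. The following sketch assumes finitely many realizations; it uses the fact that both quantile functions are left-continuous step functions, so it is enough to compare them at the midpoints between consecutive cumulative-probability levels. Function names and data are hypothetical.

```python
def left_quantile(values, probs, p):
    """F_(-1)(V; p) = inf{eta : F_1(V; eta) >= p}, for 0 < p < 1."""
    cum = 0.0
    for v, q in sorted(zip(values, probs)):
        cum += q
        if cum >= p - 1e-12:  # small tolerance for floating-point sums
            return v
    return max(values)


def breakpoints(values, probs):
    """Cumulative probabilities at which the quantile function can jump."""
    levels, cum = [], 0.0
    for _, q in sorted(zip(values, probs)):
        cum += q
        levels.append(min(cum, 1.0))
    return levels


def dominates_first_order(v, pv, s, ps):
    """Check (15.9): quantiles of V are at least those of S on (0, 1)."""
    grid = sorted(set([0.0, 1.0] + breakpoints(v, pv) + breakpoints(s, ps)))
    midpoints = [(a + b) / 2 for a, b in zip(grid, grid[1:])]
    return all(left_quantile(v, pv, p) >= left_quantile(s, ps, p)
               for p in midpoints)


# Hypothetical data: V improves the worst outcome of S.
V, pV = [0.0, 0.1], [0.5, 0.5]
S, pS = [-0.1, 0.1], [0.5, 0.5]
```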

The first order dominance constraint can be interpreted as a continuum of probabilistic (chance) constraints, studied in stochastic optimization (see, Dentcheva 2005; Prékopa 2003).

Our analysis uses the absolute Lorenz function \(F_{(-2)}(V;\cdot) : [0,1] \rightarrow \mathit{R}\), introduced in (Lorenz 1905). It is defined as the cumulative quantile:

$$\begin{array}{rcl} & {F}_{(-2)}(V ;p) ={\int \nolimits \nolimits}_{0}^{p}{F}_{(-1)}(V ;t)\mathit{dt}\quad \mathit{for}\quad 0 < p \leq 1,&\end{array}$$
(15.10)
$$\begin{array}{rcl}& {F}_{(-2)}(V ;0) = 0. & \\ \end{array}$$

Similarly to \(F_2(V;\cdot)\), the function \(F_{(-2)}(V;\cdot)\) is well defined for any random variable V that has a finite expected value. We notice that

$${F}_{(-2)}(V ;1) ={\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(V ;t)\mathit{dt} = \mathit{E}[V ].$$

By construction, the Lorenz function is convex. Lorenz functions are commonly used for inequality ordering of positive random variables, relative to their (positive) expectations (see Gastwirth 1971; Muliere and Scarsini 1989). Such a Lorenz function, \(p\mapsto {F}_{(-2)}(V ;p)/\mathit{E}[V ]\), is convex and nondecreasing. The absolute Lorenz function, however, is not monotone, when negative outcomes occur.

It is well known (see, for example, Ogryczak and Ruszczyński 2002) that we may fully characterize the second order dominance relation by using the function \(F_{(-2)}(V;\cdot)\):

$$V {\succcurlyeq}_{(2)}S \Leftrightarrow {F}_{(-2)}(V ;p) \geq {F}_{(-2)}(S;p)\ \ \mathit{for}\;\mathit{all}\ \ 0 \leq p \leq 1.$$
(15.11)

This characterization of stochastic dominance by Lorenz functions is widely used in economics and statistics.
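A minimal sketch of the absolute Lorenz function (15.10) and of the test (15.11) for discrete distributions follows. It assumes finitely many realizations, so that both Lorenz functions are piecewise linear and the comparison only needs to be made at the union of their break points. The names and the data are hypothetical.

```python
def lorenz(values, probs, p):
    """Absolute Lorenz function F_(-2)(V; p): integral of the quantile on [0, p]."""
    total, cum = 0.0, 0.0
    for v, q in sorted(zip(values, probs)):
        take = min(q, p - cum)  # probability mass of this atom lying below level p
        if take <= 0:
            break
        total += v * take
        cum += take
    return total


def dominates_second_order_lorenz(v, pv, s, ps, tol=1e-12):
    """Check (15.11) at the break points of both piecewise linear functions."""
    pts = {1.0}
    for vals, pr in ((v, pv), (s, ps)):
        cum = 0.0
        for _, q in sorted(zip(vals, pr)):
            cum += q
            pts.add(min(cum, 1.0))
    return all(lorenz(v, pv, p) >= lorenz(s, ps, p) - tol for p in pts)


# Hypothetical data: equal means, but S is more spread out than V.
V, pV = [0.0, 0.1], [0.5, 0.5]
S, pS = [-0.1, 0.2], [0.5, 0.5]
```

In this example the two variables have the same expected value, so neither dominates the other in the first order, yet V dominates S in the second order.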

We now provide an equivalent characterization by rank dependent utility functions. It is analogous to the characterization by expected utility functions.

Dentcheva and Ruszczyński (2006b) provide the following characterization.

  • For any two random variables V, S the relation \(V {\succcurlyeq}_{(1)} S\) holds true if and only if for all nondecreasing functions w(⋅) defined on [0,1] we have

    $${\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(V ;p)\mathit{dw}(p) \geq {\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(S;p)\mathit{dw}(p).$$
    (15.12)
  • For any two random variables V, S with finite expectations, the relation \(V {\succcurlyeq}_{(2)} S\) holds true if and only if Equation (15.12) is satisfied for all nondecreasing concave functions w(⋅).

The functions w (⋅) appearing in this characterization are rank dependent (dual) utility functions.
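For a discrete random variable and a continuous function w(⋅), the integrals in (15.12) reduce to finite sums: each ordered realization is weighted by the increment of w over its interval of cumulative probability. The sketch below relies on these two assumptions; the particular choice w(p) = √p is only one example of a nondecreasing concave function on [0, 1], and all names are hypothetical.

```python
import math


def dual_utility(values, probs, w):
    """Integral of F_(-1)(V; p) dw(p) over [0, 1] for a discrete V.

    Assumes w is continuous, so the Stieltjes integral of the step
    function F_(-1)(V; .) is a sum of realizations times increments of w.
    """
    total, cum = 0.0, 0.0
    for v, q in sorted(zip(values, probs)):
        upper = min(cum + q, 1.0)
        total += v * (w(upper) - w(cum))
        cum = upper
    return total


# Hypothetical data: V dominates S in the second order (equal means, less spread).
V, pV = [0.0, 0.1], [0.5, 0.5]
S, pS = [-0.1, 0.2], [0.5, 0.5]

identity_value = dual_utility(V, pV, lambda p: p)   # reduces to E[V]
concave_V = dual_utility(V, pV, math.sqrt)
concave_S = dual_utility(S, pS, math.sqrt)
```

A concave w has larger increments near p = 0, so it puts more weight on the smallest realizations; this is why the ranking of V and S differs from the ranking by the mean alone.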

In the context of portfolio optimization, we consider stochastic dominance relations between random return rates defined by Equation (15.2). Thus, we say that portfolio x dominates portfolio y in the first order, if

$${F}_{(-1)}(R(x);p) \geq {F}_{(-1)}(R(y);p)\quad \mathit{for}\;\mathit{all}\;p \in (0,1).$$

This is illustrated in Fig. 15.3.

Fig. 15.3 First order stochastic dominance R(x) ≽(1) R(y) in the inverse form

Similarly, we say that x dominates y in the second order (R(x) ≽(2) R(y)), if

$${F}_{(-2)}(R(x);p) \geq {F}_{(-2)}(R(y);p)\quad for\;all\;p \in [0,1].$$
(15.13)

Recall that the individual return rates \(R_j\) have finite expected values and thus the function \(F_{(-2)}(R(x);\cdot)\) is well defined. The second order relation is illustrated in Fig. 15.4.

Fig. 15.4 Second order dominance R(x) ≽(2) R(y) in the inverse form

3.3 Relations to Value at Risk and Conditional Value at Risk

There are fundamental relations between the concepts of Value at Risk (VaR) and Conditional Value at Risk (CVaR) and the stochastic dominance constraints. The VaR constraint in the portfolio context is formulated as follows. We define the loss rate \(L(x) = -R(x)\). We specify the maximum fraction \(\omega_p\) of the initial capital allowed for risk exposure at risk level p ∈ (0, 1), and we require that

$$\mathit{P}[L(x) \leq {\omega}_{p}] \geq 1 - p.$$

Denoting by \({\mathit{VaR}}_{p}(L(x))\) the left (1 − p)-quantile of the random variable L(x), we can equivalently formulate the VaR constraint as

$${\mathit{VaR}}_{p}(L(x)) \leq {\omega}_{p}.$$

The first order stochastic dominance relation between two portfolios is equivalent to the continuum of VaR constraints. Portfolio x dominates portfolio y in the first order, if

$${\mathit{VaR}}_{p}(L(x)) \leq {\mathit{VaR}}_{p}(L(y))\;\mathit{for}\;\mathit{all}\;p \in (0,1).$$

The CVaR at level p, roughly speaking, has the following form

$${\mathit{CVaR}}_{p}(L(x)) = \mathit{E}[L(x)\vert L(x) \geq {\mathit{VaR}}_{p}(L(x))].$$

This formula is precise if \({\mathit{VaR}}_{p}(L(x))\) is not an atom of the distribution of L(x). More precisely we express it as follows:

$${\mathit{CVaR}}_{p}(L(x)) = \frac{1} {p}{\int \nolimits \nolimits}_{0}^{p}{\mathit{VaR}}_{t}(L(x))\mathit{dt}.$$

We note that

$${\mathit{CVaR}}_{p}(L(x)) = -\frac{1} {p}{F}_{(-2)}(R(x),p).$$
(15.14)

Another description uses extremal properties of quantiles and equivalently represents CVaR as follows (Rockafellar and Uryasev 2000):

$${\mathit{CVaR}}_{p}(L(x)) ={\inf}_{\eta}\left \{\frac{1} {p}\mathit{E}[{(\eta - R(x))}_{+}] - \eta \right \}.$$
(15.15)
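The two descriptions (15.14) and (15.15) can be compared numerically for a discrete return distribution. In the sketch below the infimum in (15.15) is searched only over the realizations of R(x); this is sufficient here because the minimized function of η is convex and piecewise linear with kinks at the realizations. The data and the names are hypothetical.

```python
def lorenz(values, probs, p):
    """Absolute Lorenz function F_(-2)(V; p) of a discrete V."""
    total, cum = 0.0, 0.0
    for v, q in sorted(zip(values, probs)):
        take = min(q, p - cum)
        if take <= 0:
            break
        total += v * take
        cum += take
    return total


def cvar_from_lorenz(returns, probs, p):
    """CVaR_p of the loss L = -R via (15.14)."""
    return -lorenz(returns, probs, p) / p


def cvar_by_minimization(returns, probs, p):
    """CVaR_p of the loss L = -R via the extremal formula (15.15)."""
    def objective(eta):
        shortfall = sum(q * max(eta - r, 0.0) for r, q in zip(returns, probs))
        return shortfall / p - eta
    return min(objective(eta) for eta in returns)


# Hypothetical return scenarios with equal probabilities.
returns, probs, p = [-0.2, 0.0, 0.1, 0.3], [0.25] * 4, 0.5
via_lorenz = cvar_from_lorenz(returns, probs, p)
via_minimum = cvar_by_minimization(returns, probs, p)
```

Here the two worst outcomes, −0.2 and 0.0, carry the lower half of the probability mass, so the average loss in that tail is 0.1 under both formulas.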

A CVaR constraint on the portfolio x can be formulated as follows:

$${\mathit{CVaR}}_{p}(L(x)) \leq {\omega}_{p}.$$
(15.16)

Using Equations (15.14) and (15.13) we conclude that the second order stochastic dominance relation for two portfolios x and y is equivalent to the continuum of CVaR constraints:

$$\begin{array}{rcl} R(x) {\succcurlyeq}_{(2)}R(y)& \Leftrightarrow &{\mathit{CVaR}}_{p}(L(x)) \\ & \leq &{\mathit{CVaR}}_{p}(L(y))\;\mathit{for}\;\mathit{all}\;p \in (0,1].\end{array}$$
(15.17)

Assume that we compare the performance of a portfolio x with a random benchmark Y (for example, an index return rate or another portfolio return rate) requiring R(x) ≽(2) Y. Then the fraction \(\omega_p\) of the initial capital allowed for risk exposure at level p is given by the benchmark Y:

$${\omega}_{p} ={\mathit{CVaR}}_{p}(-Y ),\quad p \in (0,1].$$

Assume that Y has a discrete distribution with realizations \(y_i\), \(i = 1, \ldots, m\). Then relation (15.7), applied to V = R(x) and S = Y, is equivalent to

$$\mathit{E}[{({y}_{i} - R(x))}_{+}] \leq \mathit{E}[{({y}_{i} - Y )}_{+}],\quad i = 1,\ldots ,m.$$
(15.18)

This result does not imply that the continuum of CVaR constraints Equation (15.17) can be replaced by finitely many constraints of the form

$${\mathit{CVaR}}_{{p}_{i}}(L(x)) \leq {\mathit{CVaR}}_{{p}_{i}}(-Y ),\quad i = 1,\ldots ,m,$$

with some fixed probabilities \(p_i\), \(i = 1, \ldots, m\). The reason is that we do not know in advance at which probability levels the CVaR constraints have to be imposed.

4 The Dominance-Constrained Portfolio Problem

4.1 Direct Formulation

The starting point for our model is the assumption that a benchmark random return rate Y having a finite expected value is available. It may have the form of \(Y = R(\overline{z})\), for some benchmark portfolio \(\overline{z}\). It may be an index or our current portfolio. Our intention is to have the return rate of the new portfolio, R(x), preferable over Y. Therefore, we introduce the following extension of the optimization problem Equation (15.3):

$$\begin{array}{rcl} & & \max \;\mathit{E}[R(x)] - \lambda \rho (R(x))\end{array}$$
(15.19)
$$\begin{array}{rcl} & & \mathrm{subject\ to} \\ \end{array}$$
$$\begin{array}{rcl} & & R(x) {\succcurlyeq}_{(2)}Y,\end{array}$$
(15.20)
$$\begin{array}{rcl} & & x \in X.\end{array}$$
(15.21)

Similarly to Equation (15.3), we optimize a mean-risk objective function, but we introduce a constraint that the portfolio return dominates a benchmark. Even when λ = 0 and we maximize just the expected value of the return rate, our model will still lead to nontrivial solutions, due to the presence of the dominance constraint Equation (15.20).

To increase flexibility of model Equations (15.19)–(15.21), we may also allow a uniform shift of R(x) by a constant c, as in the following model:

$$\begin{array}{rcl} & \max \;\mathit{E}[R(x)] - \lambda \rho (R(x)) - \delta c& \\ & \mathrm{subject\ to} & \\ & R(x) + c {\succcurlyeq}_{(2)}Y, & \\ & x \in X. & \\ \end{array}$$

Here δ > 0 can be interpreted as the cost of the shift c. Observe that the shift c may also become negative, in which case we are rewarded for uniformly dominating Y. The shift c may be interpreted as additional cash added to the return, and δ as the interest to be paid when the loan is paid back.

To simplify the derivations, from now on we focus on the simplest formulation of the dominance-constrained problem:

$$\begin{array}{rcl} & & \max \;\mathit{E}[R(x)]\end{array}$$
(15.22)
$$\begin{array}{rcl} & & \mathrm{subject\ to} \\ & & R(x) {\succcurlyeq}_{(2)}Y,\end{array}$$
(15.23)
$$\begin{array}{rcl} & & x \in X.\end{array}$$
(15.24)

We can observe the first advantage of our problem formulation: all data in it are readily available. Moreover, the set defined by Equation (15.23) is convex (Dentcheva and Ruszczyński 2003c; 2004a, c).

Let us assume now that Y has a discrete distribution with realizations \(y_i\) attained with probabilities \(\pi_i\), \(i = 1, \ldots, m\). We also assume that the return rates have a discrete joint distribution with realizations \({r}_{\mathit{jt}},\ t = 1,\ldots ,T,\ j = 1,\ldots ,n\), attained with probabilities \(p_t\), \(t = 1, 2, \ldots, T\). Then the stochastic dominance relation Equation (15.23), in its equivalent form Equation (15.18), simplifies even further. Introducing variables \(s_{\mathit{it}}\) representing the shortfall of R(x) below \(y_i\) in realization \(t,\ i = 1,\ldots ,m,\ t = 1,\ldots ,T\), we can formulate problem Equations (15.22)–(15.24) as follows:

$$\begin{array}{rcl} & & \max \;\sum _{t=1}^{T}{p}_{t}\sum _{j=1}^{n}{x}_{j}{r}_{\mathit{jt}}\end{array}$$
(15.25)
$$\begin{array}{rcl} & & \mathrm{subject}\;\mathrm{to} \\ & & \sum _{j=1}^{n}{x}_{j}{r}_{\mathit{jt}} + {s}_{\mathit{it}} \geq {y}_{i},\quad i = 1,\ldots ,m,\quad t = 1,\ldots ,T,\end{array}$$
(15.26)
$$\begin{array}{rcl} & & \sum _{t=1}^{T}{p}_{t}{s}_{\mathit{it}} \leq \sum _{k=1}^{m}{\pi}_{k}{({y}_{i} - {y}_{k})}_{+},\quad i = 1,\ldots ,m,\end{array}$$
(15.27)
$$\begin{array}{rcl} & & {s}_{\mathit{it}} \geq 0,\quad i = 1,\ldots ,m,\quad t = 1,\ldots ,T,\end{array}$$
(15.28)
$$\begin{array}{rcl} & & x \in X.\end{array}$$
(15.29)

Indeed, for every feasible point x of Equations (15.22)–(15.24), setting

$$\begin{array}{rcl}{s}_{\mathit{it}}& =\max \left (0,{y}_{i} -\sum _{j=1}^{n}{x}_{j}{r}_{\mathit{jt}}\right ),\quad i = 1,\ldots ,m,& \\ & \qquad t = 1,\ldots ,T, & \\ \end{array}$$

we obtain a feasible pair (x, s) for Equations (15.26)–(15.29). Conversely, for any feasible pair (x, s) for Equations (15.26)–(15.29), inequalities Equations (15.26) and (15.28) imply that

$${s}_{\mathit{it}} \geq \max (0,{y}_{i} -\sum _{j=1}^{n}{x}_{j}{r}_{\mathit{jt}}),\quad i = 1,\ldots ,m,\quad t = 1,\ldots ,T.$$

Taking the expected value of both sides and using Equation (15.27) we obtain

$${F}_{2}(R(x);{y}_{i}) \leq {F}_{2}(Y ;{y}_{i}),\quad i = 1,\ldots ,m.$$

Therefore, problem Equations (15.22)–(15.24) is equivalent to problem Equations (15.25)–(15.29).
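The equivalence argument suggests a direct feasibility test for a given portfolio: compute the smallest shortfall variables allowed by (15.26) and (15.28), and then check the aggregated inequalities (15.27). The sketch below does exactly that; it does not solve the linear program. The scenario data, the benchmark (the equally weighted portfolio of the two hypothetical assets), and the function names are assumptions made for illustration.

```python
def shortfall_variables(x, r, y):
    """s[i][t] = max(0, y_i - sum_j x_j r_jt): the smallest values
    satisfying (15.26) and (15.28). r[j][t] is the return of asset j
    in scenario t."""
    T = len(r[0])
    port = [sum(x[j] * r[j][t] for j in range(len(x))) for t in range(T)]
    return [[max(0.0, y_i - port[t]) for t in range(T)] for y_i in y]


def satisfies_dominance(x, r, p, y, pi, tol=1e-12):
    """Check the inequalities (15.27) for the shortfalls defined above."""
    s = shortfall_variables(x, r, y)
    for i, y_i in enumerate(y):
        lhs = sum(p[t] * s[i][t] for t in range(len(p)))
        rhs = sum(pi[k] * max(y_i - y[k], 0.0) for k in range(len(y)))
        if lhs > rhs + tol:
            return False
    return True


# Hypothetical data: a risky asset and a riskless asset, two scenarios.
r = [[0.10, -0.10], [0.02, 0.02]]
p = [0.5, 0.5]
# Benchmark: realizations of the equally weighted portfolio.
y, pi = [-0.04, 0.06], [0.5, 0.5]
```

In this toy example the all-risky portfolio `[1.0, 0.0]` violates the constraint at the lower benchmark realization, while the riskless portfolio `[0.0, 1.0]` is feasible and has a higher expected return than the benchmark.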

If the set X is a convex polyhedron, problem Equations (15.25)–(15.29) becomes a large scale linear programming problem. It may be solved by general-purpose linear programming solvers. However, the size of the problem increases dramatically with the number of assets n, their return realizations T, and benchmark realizations m, which makes it impractical for even moderate dimensions (in thousands). For the purpose of solving these problems, we developed a specialized decomposition method presented in Dentcheva and Ruszczyński (2006a).

4.2 Inverse Formulation

Assume that the return rates have a joint discrete distribution with realizations \(r_{\mathit{jt}}\), \(t = 1, \ldots, T\) and \(j = 1, \ldots, n\), attained with probabilities \(p_t\), \(t = 1, 2, \ldots, T\). Moreover, we assume that all probabilities \(p_t\) are equal, that is, \({p}_{t} = 1/T,\ t = 1,\ldots ,T\). This is the case of empirical distributions. Correspondingly, we assume that Y has a discrete distribution with m = T equally probable realizations \(y_t\), \(t = 1, \ldots, T\).

We use the symbol \(R_{[t]}(x)\) to denote the ordered realizations of R(x); that is,

$${R}_{[1]}(x) \leq {R}_{[2]}(x) \leq \ldots \leq {R}_{[T]}(x).$$

Since R(x) has a discrete distribution, the functions \(F_2(R(x);\cdot)\) and \(F_{(-2)}(R(x);\cdot)\) are piecewise linear. Owing to the fact that all probabilities \(p_t\) are equal, the break points of \(F_{(-2)}(R(x);\cdot)\) occur at \(t/T\), for \(t = 0, 1, \ldots, T\). The same applies to \(F_{(-2)}(Y;\cdot)\). It follows from Equation (15.13) that the stochastic dominance constraint Equation (15.23) can be equivalently expressed as

$${F}_{(-2)}\left (R(x); \frac{t} {T}\right ) \geq {F}_{(-2)}\left (Y ; \frac{t} {T}\right ),\quad t = 1,\ldots ,T.$$

Note that \({F}_{(-2)}(R(x);0) = {F}_{(-2)}(Y ;0) = 0\). We have

$${F}_{(-2)}\left (R(x); \frac{t} {T}\right ) = \frac{1} {T}\sum _{k=1}^{t}{R}_{[k]}(x),\quad t = 1,\ldots ,T.$$

Therefore problem Equations (15.22)–(15.24) can be written with an equivalent inverse form of the dominance constraint:

$$\begin{array}{rcl} & & \max \;\mathit{E}[R(x)]\,\mathrm{subject}\,\mathrm{to}\end{array}$$
(15.30)
$$\begin{array}{rcl} & & \sum _{k=1}^{t}{R}_{[k]}(x) \geq \sum _{k=1}^{t}{y}_{[k]},\quad t = 1,\ldots ,T,\end{array}$$
(15.31)
$$\begin{array}{rcl} & & x \in X.\end{array}$$
(15.32)

It was shown in (Ogryczak and Ruszczyński 2002) that the function \(x\mapsto \sum _{k=1}^{t}{R}_{[k]}(x)\) is concave and positively homogeneous. It is also polyhedral. Therefore, Equation (15.31) are convex polyhedral constraints. If the set X is a convex polyhedron, problem Equations (15.30)–(15.32) has an equivalent linear programming formulation.
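For a given portfolio, the inverse constraints (15.31) amount to comparing partial sums of the sorted portfolio realizations with partial sums of the sorted benchmark realizations. The sketch below performs this check under the equal-probability assumption of this section; the data reuse a hypothetical two-asset, two-scenario example, and the function name is not from the text.

```python
from itertools import accumulate


def satisfies_inverse_dominance(x, r, y, tol=1e-12):
    """Check (15.31): partial sums of ordered realizations of R(x)
    are at least those of the benchmark, for t = 1, ..., T.

    r[j][t] is the return of asset j in scenario t; all scenarios and
    all benchmark realizations are assumed equally probable.
    """
    T = len(y)
    port = sorted(sum(x[j] * r[j][t] for j in range(len(x))) for t in range(T))
    bench = sorted(y)
    return all(a >= b - tol
               for a, b in zip(accumulate(port), accumulate(bench)))


# Hypothetical data: a risky asset and a riskless asset, two scenarios.
r = [[0.10, -0.10], [0.02, 0.02]]
# Benchmark: realizations of the equally weighted portfolio.
y = [0.06, -0.04]
```

Sorting the realizations is what makes the left-hand side of (15.31) a nonsmooth function of x, in line with the remark that it is concave and polyhedral rather than linear.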

All these transformations are possible due to the crucial assumption that the probabilities of all elementary events are equal. If they are not equal, the break points of the function F (− 2)(R(x); ⋅) depend on x, and therefore inequality Equation (15.13) cannot be reduced to finitely many convex inequalities. This is in contrast to the primal formulation, where the discreteness of Y alone was sufficient to reduce the stochastic dominance constraint to finitely many convex inequalities.

We have to observe that the quantile formulation Equation (15.31) of stochastic dominance constraints is more involved than the primal formulation, and requires more sophisticated computational methods. Using Equation (15.31) directly would require employing nonsmooth optimization methods to solve problem Equations (15.30)–(15.32). An equivalent formulation with linear constraints has very many constraints, because of the large number of pieces of the function \(x\mapsto \sum _{k=1}^{t}{R}_{[k]}(x)\). Still, Dentcheva and Ruszczyński (2010) developed a highly efficient cutting plane method, which significantly outperforms direct approaches.

5 Optimality and Duality

5.1 Primal Form

From now on we assume that the probability distributions of the return rates are discrete with finitely many realizations \({r}_{\mathit{jt}},\ t = 1,\ldots ,T,\ j = 1,\ldots ,n\), attained with probabilities \(p_t\), \(t = 1, 2, \ldots, T\). We also assume that there are finitely many ordered realizations of the benchmark outcome Y: \(y_1 < y_2 < \cdots < y_m\). The probabilities of these realizations are denoted by \(\pi_i\), \(i = 1, \ldots, m\). We also assume that the set X is compact.

We define the set U of functions \(u : \mathit{R} \rightarrow \mathit{R}\) satisfying the following conditions:

  • u(⋅) is concave and nondecreasing

  • u(⋅) is piecewise linear with break points \(y_i\), \(i = 1, \ldots, m\)

  • \(u(t) = 0\) for all \(t \geq y_m\)

It is evident that U is a convex cone.

Let us define the function \(L : {\mathit{R}}^{n} \times U \rightarrow \mathit{R}\) as follows

$$L(x,u) = \mathit{E}[R(x) + u(R(x)) - u(Y )].$$
(15.33)

It will play for problem Equations (15.22)–(15.24) a role similar to that of a Lagrangian. It is well defined, because for every \(u \in U\) and every \(x \in {\mathit{R}}^{n}\) the expected value E[u(R(x))] exists and is finite.

The following theorem has been proved in a more general version in (Dentcheva and Ruszczyński 2003c).

Theorem 15.1

If \(\hat{x}\) is an optimal solution of Equations (15.22)–(15.24) then there exists a function \(\hat{u} \in U\) such that

$$\begin{array}{rcl} & L(\hat{x},\hat{u}) ={\max}_{x\in X}L(x,\hat{u})&\end{array}$$
(15.34)
$$\begin{array}{rcl} & \mathit{E}[\hat{u}(R(\hat{x}))] = \mathit{E}[\hat{u}(Y )].&\end{array}$$
(15.35)

Conversely, if for some function \(\hat{u} \in U\) an optimal solution \(\hat{x}\) of Equation (15.34) satisfies Equations (15.23) and (15.35), then \(\hat{x}\) is an optimal solution of Equations (15.22)–(15.24).

We can also develop duality relations for our problem. With the function Equation (15.33) we can associate the dual function

$$D(u) ={\max}_{x\in X}L(x,u).$$

We are allowed to write the maximization operation here, because the set X is compact and L (⋅, u) is continuous.

The dual problem has the following form

$${\min}_{u\in U}D(u).$$
(15.36)

The set U is a closed convex cone and D (⋅) is a convex function, so Equation (15.36) is a convex optimization problem.

Theorem 15.2

Assume that Equations (15.22)–(15.24) has an optimal solution. Then problem Equation (15.36) has an optimal solution and the optimal values of both problems coincide. Furthermore, the set of optimal solutions of Equation (15.36) is the set of functions \(\hat{u} \in U\) satisfying Equations (15.34)–(15.35) for an optimal solution \(\hat{x}\) of Equations (15.22)–(15.24).

Note that all constraints of our problem are linear or convex polyhedral, and therefore we do not need any constraint qualification conditions here.

The “Lagrange multiplier” u is directly related to the expected utility theory of von Neumann and Morgenstern. We have established earlier that the second order stochastic dominance relation is equivalent to Equation (15.8) for all utility functions in U. Our result shows that one of them, u (⋅), assumes the role of a Lagrange multiplier associated with Equation (15.23). A point \(\hat{x}\) is a solution to Equations (15.22)–(15.24) if there exists a utility function u (⋅) such that \(\hat{x}\) maximizes over X the objective function E[R(x)] augmented with this dual utility. We see that the optimization problem in Equation (15.34) is equivalent to

$${\max}_{x\in X}\mathit{E}[v(R(x))],$$
(15.37)

where \(v(\eta ) = \eta + u(\eta )\). At the optimal solution the function \(\hat{v}(\eta ) = \eta +\hat{u}(\eta )\) is the implied utility function. It attaches higher penalty to smaller realizations of R(x) (bigger realizations of L(x)). By maximizing L(x, u) over X we look for x such that the left tail of the distribution of R(x) is thin.

It is important to stress that the optimal function u (⋅) is piecewise linear, with break points at the realizations \(y_{1},\ldots ,y_{m}\) of the benchmark Y. Therefore, the dual problem also has an equivalent linear programming formulation.
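The shape of such an implied utility function can be sketched in a few lines. The example below assumes the representation u(η) = −Σ_k μ_k max(0, y_k − η) with multipliers μ_k ≥ 0, which is one way to write a concave nondecreasing piecewise linear function with break points at the benchmark realizations. The function names and the numerical values are hypothetical and are not taken from the experiments reported later.

```python
def implied_utility(eta, breakpoints, multipliers):
    """v(eta) = eta + u(eta), with u(eta) = -sum_k mu_k * max(0, y_k - eta)."""
    penalty = sum(mu * max(0.0, y - eta) for y, mu in zip(breakpoints, multipliers))
    return eta - penalty


def slope(eta, breakpoints, multipliers):
    """Slope of v at a point eta that is not a break point: 1 + sum of mu_k with y_k > eta."""
    return 1.0 + sum(mu for y, mu in zip(breakpoints, multipliers) if y > eta)


# Hypothetical benchmark realizations and multipliers.
y = [-0.04, -0.01, 0.02]
mu = [2.0, 1.0, 0.5]

# Slopes on the four linear pieces, from the far left tail to the right of all break points.
slopes = [slope(eta, y, mu) for eta in (-0.05, -0.02, 0.0, 0.03)]
```

The slopes decrease from left to right and equal 1 beyond the largest break point, so v is concave and penalizes only return rates below the benchmark realizations.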

5.2 Inverse Form

In addition to the assumption that all involved distributions are discrete, we also assume that all probabilities \(p_{t}\) are equal, and that m = T.

We introduce the set W of concave nondecreasing functions w : [0, 1] → R. It is evident that W is a convex cone.

Recall the identity

$$\mathit{E}[R(x)] ={\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(R(x);p)dp.$$

Let us define the function Φ : X × W → R as follows:

$$\begin{array}{rcl} \Phi (x,w)& ={\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(R(x);p)dp +{\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(R(x);p)dw(p)& \\ & \quad \, -{\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(Y ;p)dw(p). &\end{array}$$
(15.38)

It plays a role similar to that of a Lagrangian of Equations (15.30)–(15.32).
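For intuition, Equation (15.38) can be written out under the standing assumptions of this section. The following is a sketch, assuming that \(F_{(-1)}(\cdot ;p)\) is the quantile function and that w is continuous on [0, 1]. The notation \(r_{(1)}(x) \le \cdots \le r_{(T)}(x)\) and \(y_{(1)} \le \cdots \le y_{(T)}\) for the ordered realizations of R(x) and Y is introduced here only for illustration. Both quantile functions are then constant on each interval \(((t-1)/T,\ t/T]\), and the integrals reduce to finite sums:

$$\Phi (x,w) = \frac{1}{T}\sum_{t=1}^{T} r_{(t)}(x) + \sum_{t=1}^{T}\left[w\left(\frac{t}{T}\right) - w\left(\frac{t-1}{T}\right)\right]\left(r_{(t)}(x) - y_{(t)}\right).$$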

Theorem 15.3

If \(\hat{x}\) is an optimal solution of Equations (15.30)–(15.32) then there exists a function \(\hat{w}\) ∈ W such that

$$\begin{array}{rcl} \Phi (\hat{x},\hat{w})& =& {\max}_{x\in X}\Phi (x,\hat{w})\end{array}$$
(15.39)
$$\begin{array}{rcl} {\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(R(\hat{x});p)d\hat{w}(p)& =& {\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(Y ;p)d\hat{w}(p).\end{array}$$
(15.40)

Conversely, if for some function \(\hat{w}\) ∈ W we find an optimal solution \(\hat{x}\) of Equation (15.39) that satisfies Equations (15.31) and (15.40), then \(\hat{x}\) is an optimal solution of Equations (15.30)–(15.32).

We can also develop a duality theory based on the Lagrangian in Equation (15.38). For every function w ∈ W the problem

$${\max}_{x\in X}\Phi (x,w)$$
(15.41)

is a Lagrangian relaxation of problem Equations (15.30)–(15.32). Its optimal value, Ψ(w), is always greater than or equal to the optimal value of Equations (15.30)–(15.32).

We define the dual problem as

$${\min}_{w\in \mathit{W}}\Psi (w).$$
(15.42)

The set W is a closed convex cone and Ψ (⋅) is a convex function, so problem Equation (15.42) is a convex optimization problem. Duality relations in convex programming yield the following result.

Theorem 15.4

Assume that problem Equations (15.30)–(15.32) has an optimal solution. Then problem Equation (15.42) has an optimal solution and the optimal values of both problems coincide. Furthermore, the set of optimal solutions of Equation (15.42) is the set of functions w ∈ W satisfying Equations (15.39)–(15.40) for an optimal solution \(\hat{x}\) of Equations (15.30)–(15.32).

The “Lagrange multiplier” w in this case is related to rank dependent expected utility theory. We have established earlier that the second order stochastic dominance relation is equivalent to Equation (15.12) for all dual utility functions in W. Our result shows that one of them, w (⋅), assumes the role of a Lagrange multiplier associated with Equation (15.31). A point \(\hat{x}\) is a solution to Equations (15.30)–(15.32) if there exists a dual utility function w (⋅) such that \(\hat{x}\) maximizes over X the objective function E[R(x)] augmented with this dual utility. We can transform the Lagrangian Equation (15.38) in the following way:

$$\begin{array}{rcl} \Phi (x,w)& =& {\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(R(x);p)dp +{\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(R(x);p)dw(p) -{\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(Y ;p)dw(p) \\ & =& {\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(R(x);p)dv(p) -{\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(Y ;p)dw(p), \\ \end{array}$$

where \(v(p) = p + w(p)\). At the optimal solution the function \(\hat{v}(p) = p +\hat{w}(p)\) is the quantile utility function implied by the benchmark Y. Since \({\int \nolimits \nolimits}_{0}^{1}{F}_{(-1)}(Y ;p)dw(p)\) is fixed, the problem on the right-hand side of Equation (15.39) becomes a problem of maximizing the implied rank dependent expected utility over X. It attaches higher weights to quantiles corresponding to smaller probabilities p. By maximizing Φ(x, w) over X we look for x such that the left tail of the distribution of R(x) is thin.
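The weighting of quantiles can be made concrete with a small sketch. It assumes equally likely outcomes, so that the integral of the quantile function with respect to v reduces to a sum of ordered outcomes weighted by the increments of v over cells of length 1/T. The square-root function used for w is a hypothetical concave nondecreasing choice for illustration only; it is not the optimal multiplier of any problem in this chapter.

```python
import math


def rank_dependent_utility(outcomes, v):
    """Ordered outcomes weighted by increments of v over equal-probability cells."""
    T = len(outcomes)
    ordered = sorted(outcomes)
    weights = [v(t / T) - v((t - 1) / T) for t in range(1, T + 1)]
    value = sum(wt * r for wt, r in zip(weights, ordered))
    return value, weights


def w(p):
    """Hypothetical concave nondecreasing dual utility on [0, 1]."""
    return math.sqrt(p)


def v(p):
    """Implied quantile utility v(p) = p + w(p)."""
    return p + w(p)


outcomes = [0.03, -0.02, 0.01, 0.00]   # toy return rates, equally likely
value, weights = rank_dependent_utility(outcomes, v)
```

Since v is concave, its increments decrease with t, so the smallest outcome receives the largest weight. The weights sum to v(1) − v(0), which is 2 in this example.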

Fig. 15.5 Implied utility functions corresponding to dominance constraints for four benchmark portfolios

As with the von Neumann–Morgenstern utility function, it is very difficult to elicit the dual utility function in advance. Our model derives it from a random benchmark.

The optimal function w (⋅) is piecewise linear, with break points at \(\frac{t} {T},\ t = 1,\ldots ,T\). Therefore, the dual problem also has an equivalent linear programming formulation. This property, however, depends on the assumption of equal probabilities.

6 Numerical Illustration

We have tested our approach on a basket of 719 real-world assets, using 616 possible realizations of their joint return rates (Ruszczyński and Vanderbei 2003). Historical data on weekly return rates in the 12 years from spring 1990 to spring 2002 were used as equally likely realizations.

We have used four benchmark return rates Y. Each of them was constructed as a return rate of a certain index composed of our assets. As we actually know the past return rates, for the purpose of comparison we have selected equally weighted indexes composed of the N assets having the highest average return rates in this period. Benchmark 1 corresponds to N = 26, Benchmark 2 corresponds to N = 54, Benchmark 3 corresponds to N = 82, and Benchmark 4 corresponds to N = 200. Our problem was to maximize the expected return rate, under the condition that the return rate of the benchmark portfolio is dominated. Since the benchmark was a return rate of a portfolio composed of assets from the same basket, we have \(m = T = 616\) in this case.
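The benchmark construction and the dominance requirement can be sketched on toy data. The example below assumes equally likely scenarios and tests second order dominance through the shortfall inequalities E[max(0, y_i − R)] ≤ E[max(0, y_i − Y)] at every benchmark realization y_i, which is a standard characterization for a discrete benchmark. The function names and the 4 × 3 scenario matrix are hypothetical; the experiments reported here used 616 scenarios of 719 assets.

```python
def mean(values):
    return sum(values) / len(values)


def top_n_benchmark(returns, n):
    """Equally weighted index of the n assets with the highest average return.

    returns[t][j] is the return rate of asset j in scenario t.
    """
    num_assets = len(returns[0])
    averages = [mean([row[j] for row in returns]) for j in range(num_assets)]
    chosen = sorted(range(num_assets), key=lambda j: averages[j], reverse=True)[:n]
    return [mean([row[j] for j in chosen]) for row in returns]


def mean_shortfall(outcomes, target):
    """E[max(0, target - X)] for equally likely outcomes."""
    return mean([max(0.0, target - v) for v in outcomes])


def dominates_second_order(r, y, tol=1e-12):
    """Check the shortfall inequality at every benchmark realization y_i."""
    return all(mean_shortfall(r, yi) <= mean_shortfall(y, yi) + tol for yi in y)


# Toy scenario matrix (hypothetical): 4 equally likely scenarios, 3 assets.
returns = [
    [0.05, 0.01, -0.01],
    [-0.06, 0.00, 0.00],
    [0.08, 0.02, 0.01],
    [0.01, 0.01, 0.00],
]
benchmark = top_n_benchmark(returns, n=2)
single_asset = [row[0] for row in returns]      # asset with the highest average return
shifted = [v + 0.001 for v in benchmark]        # benchmark plus a constant premium

single_asset_ok = dominates_second_order(single_asset, benchmark)
shifted_ok = dominates_second_order(shifted, benchmark)
```

In this toy example the single asset has a higher expected return than the benchmark but a heavier left tail, so it fails the test, while the shifted benchmark passes it.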

We have solved the problem by our method for solving the dual problem, presented in Dentcheva and Ruszczyński (2006a).

The implied utility functions from Equation (15.37) obtained by solving the optimization problem Equation (15.34) in the optimality conditions are illustrated in Fig. 15.5. We see that for Benchmark Portfolio 1, which contains only a small number of fast-growing assets, the utility function is linear on almost the entire range of return rates. Only very negative return rates are penalized.

A different situation occurs when the benchmark portfolio contains more assets and is therefore more diversified and less risky. In order to dominate such a benchmark, we have to use a utility function which introduces penalty for a broader range of return rates and is steeper. For the broadly based index in Benchmark Portfolio 4, the optimal utility function is smoother and is nonlinear even for positive return rates. It is worth mentioning that all these utility functions, although nondecreasing and concave, have rather complicated shapes. It would be a very hard task to determine in advance the utility function that should be used to obtain a solution dominating our benchmark portfolio.

Obviously, the shape of the utility function is determined by the benchmark within the context of the optimization problem considered. If we change the optimization problem, the utility function will change.

Finally, we may remark that our model Equations (15.22)–(15.24) can be used for testing the statistical hypothesis that the return rate Y of the benchmark portfolio is nondominated.

7 Conclusions

We presented a new approach to portfolio selection based on stochastic dominance. The portfolio return rate in the new model is required to stochastically dominate a random benchmark, for example, the return rate of an index. We formulated optimality conditions and duality relations for these models and constructed equivalent optimization models with utility functions. Two different formulations of the stochastic dominance constraint, primal and inverse, lead to two dual problems that involve von Neumann–Morgenstern utility functions for the primal formulation and rank dependent (or dual) utility functions for the inverse formulation. The utility functions play the roles of Lagrange multipliers associated with the dominance constraints. In this way our model provides a link between the expected utility theory and the rank dependent utility theory. A numerical example illustrates the new approach and demonstrates the efficacy of the method.

Future challenges are extensions of the approach to multivariate and multistage outcomes and benchmarks.