1 Introduction

At the core of stochastic optimization is the problem of minimizing \({\mathbb {E}}_{\mathbb {P}}[f(x,{\tilde{z}})],\) where \(x\in {\mathbb {R}}^n\) is the decision vector, \({\tilde{z}}\) is a random vector, \(f\,{:}\,{\mathbb {R}}^n\times {\mathbb {R}}^m\rightarrow (-\infty ,+\infty ],\) \({\mathbb {E}}\) stands for expectation, and \({\mathbb {P}}\) is the joint probability distribution of \({\tilde{z}}\). In classical numerical stochastic optimization it is assumed that the distribution \({\mathbb {P}}\) is given. This is restrictive, since in practice only partial information on \({\mathbb {P}}\) is available: say, one only knows \({\mathbb {P}}\in \mathcal{A}\), where \(\mathcal{A}\) is defined by certain known statistics of \({\tilde{z}}\). Therefore we are naturally led to a “distributionally robust” formulation as follows:

$$\begin{aligned}&\mathrm{(DRSO)}\quad \min ~\sup _{{\mathbb {P}}\in \mathcal{A}}{\mathbb {E}}_{\mathbb {P}}(f(x,{\tilde{z}})):={\mathcal {R}}(f(x,{\tilde{z}})). \end{aligned}$$

Observe that for a fixed \(x\), \(X:=f(x,{\tilde{z}})\) is a random variable, and the properties of the mapping \({\mathcal {R}}(X)=\sup _{{\mathbb {P}}\in \mathcal{A}}{\mathbb {E}}_{\mathbb {P}}(X)\) deserve careful study. In fact, as pointed out by Rockafellar (2007), it is natural to consider the functional \( {\mathcal {R}}(f(x,{\tilde{z}}))\) as a “risk measure” or “surrogate” of the random cost function \(f(x,{\tilde{z}})\). This paper studies a dual representation of the functional \({\mathcal {R}}\) and its applications in optimization.

Given a probability space \((\Omega , \Sigma , {\mathbb {P}}_0)\), it is well known that \(X\,{:}\,\Omega \rightarrow {\mathbb {R}}\) is a random variable if it is \(\Sigma \)-measurable, that is, \(\{\omega \,{:}\,X(\omega )\le a\}\in \Sigma \) for any \(a\in {\mathbb {R}}\). We call \({\mathbb {P}}_0\) the base probability measure, which is fixed in our analysis. To simplify our notation, when the expectation with respect to \({\mathbb {P}}_0\) is concerned, we omit \({\mathbb {P}}_0\) and write \({\mathbb {E}}_{{\mathbb {P}}_0}(X)\) as \({\mathbb {E}}(X).\) As usual, for \(1\le p<\infty \), we use \({{\mathscr {L}}}\,^p(\Omega ,\Sigma ,{\mathbb {P}}_0)\) (\({{\mathscr {L}}}\,^p\) for short) to denote the set of all random variables X satisfying \({\mathbb {E}}(|X|^p)<+\infty \). For convenience in engineering applications, we restrict ourselves to the space \({\mathscr {L}}^2\), although the main results of this paper could be extended to a larger space such as \({\mathscr {L}}^1\). Therefore, in this paper a risk measure \({\mathcal {R}}\) is a functional from \({{\mathscr {L}}}^2\) to \((-\infty ,+\infty ]\). It may represent “the risk of loss” where X may represent “the real amount of loss”. Furthermore, if \({\mathcal {R}}(X)\) is finite for any \(X\in {{\mathscr {L}}}^2\), then we call \({\mathcal {R}}\) a finite risk measure. A risk measure \({\mathcal {R}}\) is coherent in the basic sense (“coherent” for short) if it satisfies the following five axioms (Artzner et al. 1997, 1999; Rockafellar 2007).

(A1):

\({\mathcal {R}}(C)=C\) for all constant C,

(A2):

\({\mathcal {R}}((1-\lambda )X+\lambda X')\le (1-\lambda ){\mathcal {R}}(X)+\lambda {\mathcal {R}}(X')\) for \(\lambda \in [0,1]\) (“convexity”),

(A3):

\({\mathcal {R}}(X)\le {\mathcal {R}}(X')\) if \(X\le X'\) almost surely (“monotonicity”),

(A4):

\({\mathcal {R}}(X)\le 0\) when \(\Vert X^k-X\Vert _2\rightarrow 0\) with \({\mathcal {R}}(X^k)\le 0\) (“closedness”),

(A5):

\({\mathcal {R}}(\lambda X)=\lambda {\mathcal {R}}(X) \) for \(\lambda >0\) (“positive homogeneity”).

In the early literature on coherency (Artzner et al. 1997, 1999), it was also required that \({\mathcal {R}}(X+C)={\mathcal {R}}(X)+C\). It can be shown that this follows automatically from (A1) and (A2) (Rockafellar et al. 2006).

Consider another probability measure \({\mathbb {P}}\) on \((\Omega ,\Sigma )\). We say that \({\mathbb {P}}\) is absolutely continuous with respect to \({\mathbb {P}}_0\) (denoted by \({\mathbb {P}}\ll {\mathbb {P}}_0\)) if \({\mathbb {P}}_0(A)=0\) implies \({\mathbb {P}}(A)=0\) for any measurable set \(A\in \Sigma \). If \({\mathbb {P}}\ll {\mathbb {P}}_0\), then by the Radon–Nikodym theorem there is a well-defined Radon–Nikodym derivative \(Q=\frac{d{\mathbb {P}}}{d{\mathbb {P}}_0}\). Such derivatives, when they are square integrable, make up the set

$$\begin{aligned} {\mathcal {P}}:=\left\{ Q\in {{\mathscr {L}}}^2:~Q\ge 0,~{\mathbb {E}}(Q)=1\right\} . \end{aligned}$$
(1.1)

Q is called the “density” of \({\mathbb {P}}\) because the expectation of a random variable X with respect to \({\mathbb {P}}\) is equal to \({\mathbb {E}}(XQ)\), namely

$$\begin{aligned} {\mathbb {E}}_{{\mathbb {P}}}(X)=\int _\Omega X(\omega )d{\mathbb {P}}(\omega )=\int _\Omega X(\omega )Q(\omega )d{\mathbb {P}}_0(\omega )={\mathbb {E}}(XQ). \end{aligned}$$
(1.2)
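To make (1.2) concrete, the following minimal sketch checks the change-of-measure identity on a finite sample space. The three-point space and all numbers are hypothetical choices made only for illustration; they do not come from the paper.

```python
# Finite sample space with three outcomes; p0 plays the role of the base measure P_0.
p0 = [0.5, 0.3, 0.2]
# Another measure P (hypothetical numbers), absolutely continuous w.r.t. P_0.
p = [0.2, 0.3, 0.5]
# Density Q = dP/dP_0, computed outcome by outcome.
Q = [pi / p0i for pi, p0i in zip(p, p0)]
# A random variable X, given by its value on each outcome.
X = [1.0, -2.0, 4.0]


def expect(values, weights):
    """Expectation of `values` under the probability weights `weights`."""
    return sum(v * w for v, w in zip(values, weights))


lhs = expect(X, p)                                # E_P(X)
rhs = expect([x * q for x, q in zip(X, Q)], p0)   # E(XQ), taken under P_0
mass = expect(Q, p0)                              # E(Q), which should equal 1
```

Here `Q` is nonnegative with `mass` equal to one, so it belongs to the set in (1.1), and `lhs` and `rhs` agree up to rounding.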

Any nonempty closed convex subset \({\mathcal {Q}}\) of \({\mathcal {P}}\) is called a “risk envelope”. According to the theory of conjugacy in convex analysis, there is a dual representation for coherent risk measures (Theorem 4(a), Rockafellar 2007), which says that

\({\mathcal {R}}\) is a coherent measure of risk in the basic sense if and only if there is a risk envelope \({\mathcal {Q}}\) (which will be uniquely determined) such that

$$\begin{aligned} {\mathcal {R}}(X)=\sup \limits _{Q\in {\mathcal {Q}}}{\mathbb {E}}(XQ). \end{aligned}$$
(1.3)

Here and below, we refer to this result as “the dual representation theorem” for short.

It follows from (1.3) that the risk envelope \({\mathcal {Q}}\) can be written explicitly as

$$\begin{aligned} {\mathcal {Q}}=\{Q\in {\mathcal {P}}:~{\mathbb {E}}(XQ)\le {\mathcal {R}}(X)~\text{ for } \text{ all }~X\in {{\mathscr {L}}}^2\}. \end{aligned}$$
(1.4)

Note that the requirement \(Q\ge 0\) in (1.1) is equivalent to Axiom (A3) and the requirement \({\mathbb {E}}(Q)=1\) is equivalent to (A1), as shown in Rockafellar et al. (2006). Furthermore, the setting of \(X\in {{\mathscr {L}}}^2\) implies \(Q\in {\mathscr {L}}^2\). Hence all requirements for Q in (1.1) are natural. It should be noted that a primitive form of the above representation theorem, for a finite set \(\Omega \), existed long before the notion of coherent risk measure was introduced; see, e.g., Huber (1981).

Many applications of risk measures are concerned with “averse risk measures”. A risk measure is averse if it satisfies axioms (A1), (A2), (A4), (A5) and

(A6):

\({\mathcal {R}}(X)>{\mathbb {E}}(X)\) for all non-constant X.

It would be interesting both in theory and practice to describe aversity in the context of dual representation of risk measures. We shall discuss this topic in Sect. 4.

The contributions of this paper can be outlined as follows:

  1. We derive formulae of risk measures when the corresponding risk envelopes involve set operations such as positive combination, union, and intersection (see Proposition 2.1 and Theorems 2.1 and 2.2, respectively).

  2. We present independent proofs in Sects. 3.1–3.5 for the correspondence between several popular risk measures and their risk envelopes.

  3. We study sufficient and necessary conditions on the risk envelope that guarantee the aversity of the corresponding risk measure (see Propositions 4.2–4.5).

  4. We indicate a connection between the so-called uncertainty sets in robust optimization and the dual representation of risk measures (see Propositions 5.1–5.2 and Theorem 5.1 for details).

The paper is organized as follows. In Sect. 2, we consider the set operations of risk envelopes. In Sects. 3 and 4, we discuss risk envelopes for several popular risk measures and risk aversity, respectively. Section 5 addresses the relationship between the risk measures defined through uncertainty sets and the ones defined through risk envelopes. Section 6 concludes this paper.

2 Set operations of risk envelopes

Suppose \({\mathcal {R}}_1,{\mathcal {R}}_2,\ldots ,{\mathcal {R}}_n\) is a collection of coherent risk measures on \({{\mathscr {L}}}^2\) with risk envelopes \({\mathcal {Q}}_1,{\mathcal {Q}}_2,\ldots ,{\mathcal {Q}}_n\), respectively. Since \({{\mathscr {L}}}^2\) is a Banach lattice (that is, it is a Banach space and \(X,Y\in {{\mathscr {L}}}^2\) with \(|X|\le |Y|\) implies \(\Vert X\Vert _2\le \Vert Y\Vert _2\)), if \({\mathcal {R}}_i\) is finite, then it is continuous, subdifferentiable on \({{\mathscr {L}}}^2\), and bounded above in some neighborhood of the origin by Proposition 3.1 of Ruszczynski and Shapiro (2006). It then follows that, by Theorem 10 of Rockafellar (1974), the corresponding \({\mathcal {Q}}_i\) is compact in the weak topology of \({{\mathscr {L}}}^2\), that is, \({\mathcal {Q}}_i\) is weakly compact.

The following result deals with positive combination of the sets \({\mathcal {Q}}_1,{\mathcal {Q}}_2,\ldots ,{\mathcal {Q}}_n\). A similar result can be found in Rockafellar and Uryasev (2013).

Proposition 2.1

Let \(\lambda _1,\ldots ,\lambda _n\) be positive numbers with \(\lambda _1+\cdots +\lambda _n=1\) (so that Axiom (A1) is preserved). Then the positive combination

$$\begin{aligned} {\mathcal {R}}:=\lambda _1{\mathcal {R}}_1+\cdots +\lambda _n{\mathcal {R}}_n \end{aligned}$$

is a coherent risk measure with risk envelope

$$\begin{aligned} \bar{\mathcal {Q}}=\mathrm{cl }\,(\lambda _1{\mathcal {Q}}_1+\cdots +\lambda _n{\mathcal {Q}}_n), \end{aligned}$$

where \(\mathrm{cl }\,\) means the closure of the set. Moreover, if all but perhaps one of the \({\mathcal {R}}_i\)’s are finite, then the risk envelope is simply

$$\begin{aligned} {\mathcal {Q}}=\lambda _1{\mathcal {Q}}_1+\cdots +\lambda _n{\mathcal {Q}}_n. \end{aligned}$$

Proof

Since

$$\begin{aligned} \sup _{Q\in \bar{\mathcal {Q}}}{\mathbb {E}}(XQ)=\sup _{Q_i\in {\mathcal {Q}}_i,i=1,\ldots ,n}{\mathbb {E}}\left[ X(\lambda _1Q_1+\cdots +\lambda _nQ_n)\right] =\sum _{i=1}^n\lambda _i{\mathcal {R}}_i(X)={\mathcal {R}}(X), \end{aligned}$$

the first part of the proposition follows. For the second part, as discussed above, we know that if \({\mathcal {R}}_i\) is finite, then the corresponding \({\mathcal {Q}}_i\) is weakly compact. It is easy to see that \({\mathcal {Q}}\) is a nonempty and convex subset of \({\mathcal {P}}\) [as defined in (1.1)]. Furthermore, \({\mathcal {Q}}\) is weakly closed since all but perhaps one of the \({\mathcal {Q}}_i\)’s are weakly compact, and the sum of finitely many weakly closed sets, all but perhaps one of which are weakly compact, is weakly closed. Then \({\mathcal {Q}}\) is closed because closedness coincides with weak closedness for convex sets. Therefore, \(\bar{{\mathcal {Q}}}={\mathcal {Q}}\) in this case. \(\square \)
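The identity used in the proof of Proposition 2.1 can be checked numerically in a simple setting. The sketch below assumes a finite sample space, weights that sum to one, and envelopes given as convex hulls of finitely many densities, so that each supremum is attained at a generating point; the particular envelopes (expectation and worst case) and numbers are illustrative assumptions.

```python
from itertools import product

p0 = [0.25, 0.25, 0.5]  # base measure P_0 on three outcomes (hypothetical)


def expect(values):
    """Expectation under the base measure."""
    return sum(v * w for v, w in zip(values, p0))


def support(X, vertices):
    """sup of E(XQ) over the convex hull of finitely many densities."""
    return max(expect([x * q for x, q in zip(X, Q)]) for Q in vertices)


# Envelope 1: the singleton {1}, which generates the expectation.
V1 = [[1.0, 1.0, 1.0]]
# Envelope 2: all densities; on a finite space its extreme points are e_i / p0[i].
V2 = [[1 / p0[i] if j == i else 0.0 for j in range(3)] for i in range(3)]

lam1, lam2 = 0.3, 0.7  # positive weights summing to one
X = [2.0, -1.0, 0.5]

# Generators of the combined envelope lam1*Q1 + lam2*Q2.
V_sum = [[lam1 * a + lam2 * b for a, b in zip(Qa, Qb)] for Qa, Qb in product(V1, V2)]
lhs = support(X, V_sum)                                  # sup over the combined envelope
rhs = lam1 * support(X, V1) + lam2 * support(X, V2)      # lam1*R1(X) + lam2*R2(X)
```

Both `lhs` and `rhs` evaluate to the same number up to rounding, and every element of `V_sum` has expectation one, so the combined envelope stays inside the set in (1.1).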

Next, define

$$\begin{aligned} {\widetilde{{\mathcal {R}}}}_1(X):= & {} \max \limits _{1\le i\le n}{\mathcal {R}}_i(X),\quad {\widetilde{{\mathcal {R}}}}_2(X) :=\min \limits _{1\le i\le n}{\mathcal {R}}_i(X), \text{ and }\\ {\widetilde{{\mathcal {R}}}}_3(X):= & {} \mathrm{cl }\,({\mathcal {R}}_1{\Box \,}{\mathcal {R}}_2{\Box \,}\cdots \Box {\mathcal {R}}_n)(X), \end{aligned}$$

where \(\mathrm{cl }\,\) means the closure of the function (Rockafellar and Wets 1997) and

$$\begin{aligned} ({\mathcal {R}}_1{\Box \,}{\mathcal {R}}_2{\Box \,}\cdots {\Box \,}{\mathcal {R}}_n)(X):= & {} \inf \{{\mathcal {R}}_1(X_1)+{\mathcal {R}}_2(X_2)\\&+\cdots +\,{\mathcal {R}}_n(X_n):~X_1+X_2+\cdots +X_n=X\} \end{aligned}$$

is the so-called inf-convolution of the functionals \({\mathcal {R}}_i, i=1,\ldots ,n.\) Let us call \({\widetilde{{\mathcal {R}}}}_1 \) and \({\widetilde{{\mathcal {R}}}}_2\) the “max” and the “min” of the risk measures \({\mathcal {R}}_1,{\mathcal {R}}_2,\ldots ,{\mathcal {R}}_n\), respectively. In general, \({\widetilde{{\mathcal {R}}}}_2\) is not coherent because it may fail to be convex. We next show that \({{\widetilde{{\mathcal {R}}}}}_1\) and the lower-convexification of \({{\widetilde{{\mathcal {R}}}}}_2\), namely \({{\widetilde{{\mathcal {R}}}}}_3\), are coherent risk measures generated by the risk envelopes \(\mathrm {conv}\left( \bigcup \nolimits _{i=1}^n{\mathcal {Q}}_i\right) \) and \(\bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\), respectively, where \(\mathrm {conv}(\cdot )\) stands for the convex hull. We begin with the following lemma about \({\widetilde{{\mathcal {R}}}}_2\) and \({\widetilde{{\mathcal {R}}}}_3\).

Lemma 2.1

\({\widetilde{{\mathcal {R}}}}_3\) is the “lower-convexification” of \({\widetilde{{\mathcal {R}}}}_2\) in the sense that

  1. (1)

    \({\widetilde{{\mathcal {R}}}}_3(X)\le {\widetilde{{\mathcal {R}}}}_2(X)\) for all X.

  2. (2)

    Let \({\mathcal {R}}(X)\) be any coherent risk measure satisfying \({\mathcal {R}}(X)\le {\widetilde{{\mathcal {R}}}}_2(X)\) for all X. Then \({\mathcal {R}}(X)\le {\widetilde{{\mathcal {R}}}}_3(X)\) for all X.

Proof

  1. (1)

    By the definition of \({\widetilde{{\mathcal {R}}}}_3\), we have for any \(1\le i\le n\) and for all X,

    $$\begin{aligned} {\widetilde{{\mathcal {R}}}}_3(X)\le \mathrm{cl }\,\big [{\mathcal {R}}_1(0)+\cdots +{\mathcal {R}}_{i-1}(0)+{\mathcal {R}}_i(X)+{\mathcal {R}}_{i+1}(0)+\cdots +{\mathcal {R}}_n(0)\big ]={\mathcal {R}}_i(X). \end{aligned}$$

    Then \({\widetilde{{\mathcal {R}}}}_3(X)\le \min \limits _{1\le i\le n}{\mathcal {R}}_i(X)={\widetilde{{\mathcal {R}}}}_2(X)\) as desired.

  2. (2)

    Since \({\mathcal {R}}(X)\le {\widetilde{{\mathcal {R}}}}_2(X)\) for all X, we have \({\mathcal {R}}(X)\le {\mathcal {R}}_i(X)\) for any \(1\le i\le n\) and for all X. Furthermore, by the convexity and positive homogeneity of \({\mathcal {R}}\), which together give subadditivity, we have for any \(X_1,X_2,\ldots ,X_n\) such that \(X_1+X_2+\cdots +X_n=X\),

    $$\begin{aligned} {\mathcal {R}}(X)\le {\mathcal {R}}(X_1)+{\mathcal {R}}(X_2)+\cdots +{\mathcal {R}}(X_n)\le {\mathcal {R}}_1(X_1)+{\mathcal {R}}_2(X_2)+\cdots +{\mathcal {R}}_n(X_n). \end{aligned}$$

    Taking closure of infimum on the right hand side, by the definition of \({\widetilde{{\mathcal {R}}}}_3\) together with the continuity of \({\mathcal {R}}_1,\ldots ,{\mathcal {R}}_n\), we get \({\mathcal {R}}(X)\le {\widetilde{{\mathcal {R}}}}_3(X)\) for all X, as desired. \(\square \)
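The remark preceding Lemma 2.1, that the “min” of coherent risk measures may fail to be convex, can be seen in a two-outcome example. The sketch below is an illustration under assumed data: two equally likely outcomes and two singleton envelopes, each concentrating all mass on one outcome.

```python
# Two equally likely outcomes; the base measure P_0 is uniform.
p0 = [0.5, 0.5]


def risk(X, Q):
    """E(XQ): the coherent risk measure whose envelope is the singleton {Q}."""
    return sum(x * q * w for x, q, w in zip(X, Q, p0))


# Hypothetical densities: each one puts all probability mass on a single outcome.
Qa, Qb = [2.0, 0.0], [0.0, 2.0]


def r_min(X):
    """The 'min' of the two risk measures."""
    return min(risk(X, Qa), risk(X, Qb))


X, Y = [1.0, 0.0], [0.0, 1.0]
mid = [0.5 * a + 0.5 * b for a, b in zip(X, Y)]

lhs = r_min(mid)                         # value at the midpoint of X and Y
rhs = 0.5 * r_min(X) + 0.5 * r_min(Y)    # average of the two endpoint values
```

With these numbers `lhs` is 0.5 while `rhs` is 0, so the convexity inequality in (A2) is violated by the “min”.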

The main results of this section are the following two theorems. A finite-dimensional version of them appeared in Theorem 3.3.3 of Hiriart-Urruty and Lemaréchal (1993). Here, we present a proof for the \({\mathscr {L}}^2\) version.

Theorem 2.1

If \({\mathcal {R}}_1,\ldots ,{\mathcal {R}}_n\) are finite, then \(\widetilde{\mathcal {R}}_1(\cdot )\) is a coherent risk measure with risk envelope \({\widetilde{{\mathcal {Q}}}}_1=\mathrm {conv}\left( \bigcup \nolimits _{i=1}^n{\mathcal {Q}}_i\right) \).

Proof

We first claim that \(\mathrm {conv}\left( \bigcup \nolimits _{i=1}^n{\mathcal {Q}}_i\right) \) is closed and convex. The convexity is trivial. For closedness, since \({\mathcal {Q}}_1,\ldots ,{\mathcal {Q}}_n\) are all weakly compact, we have that \(\mathrm {conv}\left( \bigcup \nolimits _{i=1}^n{\mathcal {Q}}_i\right) \) is weakly compact because the union of any finite collection of weakly compact sets is again weakly compact, and its convex hull is therefore weakly compact. Furthermore, \(\mathrm {conv}\left( \bigcup \nolimits _{i=1}^n{\mathcal {Q}}_i\right) \) is closed because weak compactness implies weak closedness, and weak closedness coincides with closedness for convex sets. Next, for any \(X\in {{\mathscr {L}}}^2\), we have

$$\begin{aligned} {\widetilde{{\mathcal {R}}}}_1(X)= & {} \max \limits _{1\le i\le n}{\mathcal {R}}_i(X)=\max \limits _{1\le i\le n}\left( \sup \limits _{Q\in {\mathcal {Q}}_i}{\mathbb {E}}(XQ)\right) \\= & {} \sup \limits _{Q\in \bigcup \limits _{i=1}^n{\mathcal {Q}}_i}{\mathbb {E}}(XQ)= \sup \limits _{Q\in \mathrm {conv}\left( \bigcup \limits _{i=1}^n{\mathcal {Q}}_i\right) }{\mathbb {E}}(XQ). \end{aligned}$$

Hence by the dual representation theorem, \({\widetilde{{\mathcal {R}}}}_1\) is a coherent risk measure and its risk envelope is \({\widetilde{{\mathcal {Q}}}}_1=\mathrm {conv}\left( \bigcup \nolimits _{i=1}^n{\mathcal {Q}}_i\right) \), as desired.\(\square \)
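The chain of equalities in the proof of Theorem 2.1 rests on the fact that a linear functional has the same supremum over a set and over its convex hull. The following sketch illustrates this on a finite sample space with two envelopes generated by finitely many densities; the densities, the random variable, and the random sampling of convex combinations are all assumptions made for illustration.

```python
import random

p0 = [0.2, 0.3, 0.5]  # base measure P_0 on three outcomes (hypothetical)


def inner(X, Q):
    """E(XQ) under the base measure."""
    return sum(x * q * w for x, q, w in zip(X, Q, p0))


def support(X, vertices):
    """sup of E(XQ) over the convex hull of finitely many densities."""
    return max(inner(X, Q) for Q in vertices)


# Generators of two envelopes; each density is nonnegative with expectation one.
V1 = [[1.0, 1.0, 1.0], [2.0, 1.0, 0.6]]
V2 = [[0.5, 1.0, 1.2], [0.0, 0.0, 2.0]]
X = [3.0, -1.0, 1.0]

max_risk = max(support(X, V1), support(X, V2))   # max of the two risk measures
union = V1 + V2
sup_union = support(X, union)                    # sup over the union of generators

# Sample random convex combinations of the generators: none should exceed max_risk.
random.seed(0)
best_sampled = float("-inf")
for _ in range(1000):
    w = [random.random() for _ in union]
    total = sum(w)
    Q = [sum(wi / total * V[j] for wi, V in zip(w, union)) for j in range(len(p0))]
    best_sampled = max(best_sampled, inner(X, Q))
```

The sampled combinations stay below `max_risk`, while `sup_union` equals it exactly, in line with the displayed equalities.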

Theorem 2.2

\({\widetilde{{\mathcal {R}}}}_3(\cdot )\) is a coherent risk measure with risk envelope \(\bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\) if and only if \(\bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\ne \emptyset \).

Proof

For the “if” part, we first verify that \({\widetilde{{\mathcal {R}}}}_3(\cdot )\) is a coherent risk measure. By the closure of inf-convolution formula of \({\widetilde{{\mathcal {R}}}}_3\), the convexity (A2) and closedness (A4) hold. For positive homogeneity (A5), one has

$$\begin{aligned} \widetilde{\mathcal {R}}_3(\lambda X)= & {} \mathrm{cl }\,\inf _{X_2,\ldots ,X_n}\left\{ {\mathcal {R}}_1(\lambda X-X_2-\cdots -X_n)+{\mathcal {R}}_2(X_2)+\cdots +{\mathcal {R}}_n(X_n)\right\} \\= & {} \mathrm{cl }\,\inf _{Y_2,\ldots ,Y_n}\left\{ {\mathcal {R}}_1(\lambda X-\lambda Y_2-\cdots -\lambda Y_n)+{\mathcal {R}}_2(\lambda Y_2)+\cdots +{\mathcal {R}}_n(\lambda Y_n)\right\} \\= & {} \lambda \widetilde{\mathcal {R}}_3(X). \end{aligned}$$

Axiom (A1) is true because

$$\begin{aligned} \widetilde{\mathcal {R}}_3(C)\le {\mathcal {R}}_1(C)+{\mathcal {R}}_2(0)+\cdots +{\mathcal {R}}_n(0)= C \text{ and } \text{ similarly, } \widetilde{\mathcal {R}}_3(-C)\le -C. \end{aligned}$$
(2.1)

Then by convexity and positive homogeneity

$$\begin{aligned} 0 =\widetilde{\mathcal {R}}_3(0)\le \widetilde{\mathcal {R}}_3(C) +\widetilde{\mathcal {R}}_3(-C)\le \widetilde{\mathcal {R}}_3(C)- C\ \ \Longleftrightarrow \ \ \widetilde{\mathcal {R}}_3(C)\ge C. \end{aligned}$$
(2.2)

Thus, (A1) follows. Finally, let \(X\le Y\) almost surely. Then

$$\begin{aligned} \widetilde{\mathcal {R}}_3(X)= & {} \mathrm{cl }\,\inf _{X_2,\ldots ,X_n}\left\{ {\mathcal {R}}_1(X-X_2-\cdots -X_n)+{\mathcal {R}}_2(X_2)+\cdots +{\mathcal {R}}_n(X_n)\right\} \\\le & {} \mathrm{cl }\,\inf _{X_2,\ldots ,X_n}\left\{ {\mathcal {R}}_1(Y-X_2-\cdots -X_n)+{\mathcal {R}}_2(X_2)+\cdots +{\mathcal {R}}_n(X_n)\right\} \\= & {} \widetilde{\mathcal {R}}_3(Y), \end{aligned}$$

hence monotonicity (A3) holds. Therefore, \(\widetilde{\mathcal {R}}_3(X)\) is a coherent risk measure. Let \(\widetilde{\mathcal {Q}}_3\) be its risk envelope. Since \(\widetilde{\mathcal {R}}_3(X)\le {\mathcal {R}}_i(X),\) by (1.4), \(\widetilde{\mathcal {Q}}_3\subseteq {\mathcal {Q}}_i\) for \(1\le i\le n.\) Thus, \(\widetilde{\mathcal {Q}}_3\subseteq \bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\). Conversely, suppose \({\widetilde{{\mathcal {R}}}}\) is the risk measure with envelope \(\bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\). Since \({\widetilde{{\mathcal {R}}}}\) is convex, positive homogeneous, and \({\widetilde{{\mathcal {R}}}}(X)\le {\widetilde{{\mathcal {R}}}}_2(X)\) for all X, by Lemma 2.1 we get \({\widetilde{{\mathcal {R}}}}(X)\le {\widetilde{{\mathcal {R}}}}_3(X)\) for all X. Using (1.4) again, we can get \(\bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\subseteq {\widetilde{{\mathcal {Q}}}}_3\). Thus, we have \({\widetilde{{\mathcal {Q}}}}_3=\bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\).

We next prove the “only if” part. If \({\widetilde{{\mathcal {R}}}}_3(\cdot )\) is a coherent risk measure, then it has a nonempty risk envelope \({\widetilde{{\mathcal {Q}}}}_3\), which is an implication of Axiom (A1) and the dual representation theorem. Using the same argument from the last paragraph, we can get \({\widetilde{{\mathcal {Q}}}}_3\subseteq \bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\). Therefore, \(\bigcap \nolimits _{i=1}^n{\mathcal {Q}}_i\ne \emptyset \). \(\square \)

Note that Theorem 2.2 does not require the \({\mathcal {R}}_i\)’s to be finite.

Set operations of risk envelopes may be used to create new risk measures that are more conservative (say, by union) or more aggressive (say, by intersection) in applications. Chen et al. (2010) used intersections of five uncertainty sets to create new uncertainty sets in robust optimization and here we have shown the same principle applies to risk envelopes.

3 Popular risk measures and their risk envelopes

Besides set operations, one can create various coherent risk measures by adding functional constraints to the risk envelope \({\mathcal {P}}\) in (1.1). In this section we study (1) risk measure from expectation, (2) risk measure from worst case analysis, (3) risk measure from subdividing the future, (4) risk measures from the conditional value at risk and optimized certainty equivalence, and (5) risk measure from mean-deviation. Most of the results in this section have been stated in Rockafellar (2007) without proofs. In fact their proofs are scattered in the literature via different approaches. Here we provide independent proofs based on the unified view of dual representation of risk measures. Our approach is to directly specify the risk envelope \({\mathcal {Q}}\) for each of the above cases and to verify the relationship \({\mathcal {R}}(X)=\sup \nolimits _{Q\in {\mathcal {Q}}}{\mathbb {E}}(XQ).\) The coherency of \({\mathcal {R}}\) then follows from the dual representation theorem.

3.1 Risk envelope for expectation

Here \({\mathcal {Q}}=\{Q\in {\mathscr {L}}^2:Q\equiv 1\}.\) Then \({\mathbb {E}}(X)=\sup _{Q\in {\mathcal {Q}}}{\mathbb {E}}(XQ).\)

3.2 Risk envelope for the worst case

Here the risk envelope is \({\mathcal {Q}}={\mathcal {P}}\) and by “the worst case” we mean the “essential supremum” function of X, that is,

$$\begin{aligned} \text {ess-sup}(X):=\inf \{a:~{\mathbb {P}}_0(X>a)=0\}. \end{aligned}$$
(3.1)

Note that \(\sup \nolimits _{Q\in {\mathcal {P}}}{\mathbb {E}}(XQ)\le \text {ess-sup}(X)\) for any \(X\in {{\mathscr {L}}}^2\), and therefore \({\mathcal {P}}\subseteq {\mathcal {Q}}\) by (1.4). Since \({\mathcal {Q}}\subseteq {\mathcal {P}}\) by definition, we conclude that \({\mathcal {Q}}={\mathcal {P}}\).

It is possible that \(\text {ess-sup}(X)=\infty \) for some \(X\in {{\mathscr {L}}}^2\), namely whenever X is not essentially bounded above. Thus, \(\text {ess-sup}(\cdot )\) is not a finite risk measure.
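On a finite sample space the worst-case risk measure can be computed directly from (3.1), and the supremum over \({\mathcal {P}}\) is attained at a density concentrated on a single outcome. The sketch below uses hypothetical data, including one outcome of probability zero to show why the supremum is “essential”.

```python
# Base measure P_0 on three outcomes; the third outcome has probability zero.
p0 = [0.5, 0.5, 0.0]
X = [1.0, 3.0, 100.0]


def ess_sup(X, p0):
    """Essential supremum on a finite space: largest value on an outcome of positive probability."""
    return max(x for x, w in zip(X, p0) if w > 0)


def point_density(i):
    """Density that concentrates all probability mass on outcome i (requires p0[i] > 0)."""
    return [1.0 / p0[i] if j == i else 0.0 for j in range(len(p0))]


def inner(X, Q):
    """E(XQ) under the base measure."""
    return sum(x * q * w for x, q, w in zip(X, Q, p0))


# The linear functional Q -> E(XQ) is maximized over all densities at a point density.
sup_over_P = max(inner(X, point_density(i)) for i in range(len(p0)) if p0[i] > 0)
worst_case = ess_sup(X, p0)
```

The value 100 on the null outcome is ignored by both computations, which agree on the value 3.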

3.3 The risk measure from subdividing the future

In Rockafellar (2007) the following risk measure is discussed. Let \(\Omega \) be partitioned into subsets \(\Omega _1,\ldots ,\Omega _r, r\ge 2,\) having positive probability \({\mathbb {P}}_0(\Omega _k)=\lambda _k\) with \( \lambda _1+\cdots +\lambda _r=1.\) For \(k=1,\ldots ,r\), let

$$\begin{aligned} {\mathcal {R}}_k(X):= & {} \mathop {\text {ess-sup}}\limits _{\omega \in \Omega _k}X(\omega )\\:= & {} \inf \{a:~{\mathbb {P}}_0(\{X>a\}\cap \Omega _k)=0\}. \end{aligned}$$

Then

$$\begin{aligned} {\mathcal {R}}:=\lambda _1{\mathcal {R}}_1+\cdots +\lambda _r{\mathcal {R}}_r \end{aligned}$$
(3.2)

is a coherent risk measure, called the risk measure from subdividing the future, whose risk envelope is

$$\begin{aligned} {\mathcal {Q}}:=\lambda _1{\mathcal {Q}}_1+\cdots +\lambda _r{\mathcal {Q}}_r\quad \text{ with } {\mathcal {Q}}_k:=\{Q\in {\mathcal {P}}:~{\mathbb {E}}(Q\mathbf{1 }_{\Omega _k})=1\}. \end{aligned}$$
(3.3)

To prove this by Proposition 2.1, we only need to prove that \({\mathcal {Q}}\) is closed. Suppose \(Q_n\in \lambda _1{\mathcal {Q}}_1+\cdots +\lambda _r{\mathcal {Q}}_r\) for \(n=1,2,\ldots \) and \(\Vert Q_n-Q\Vert _2\rightarrow 0\) as \(n\rightarrow \infty \). Then by (3.3), for \(n=1,2,\ldots \) we have \({\mathbb {E}}(Q_n\mathbf{1 }_{\Omega _k})=\lambda _k\) for \(k=1,2,\ldots ,r\). Note that for \(k=1,2,\ldots ,r\),

$$\begin{aligned} |{\mathbb {E}}(Q_n\mathbf{1 }_{\Omega _k})-{\mathbb {E}}(Q\mathbf{1 }_{\Omega _k})| \le \Vert Q_n-Q\Vert _2\cdot [{\mathbb {P}}_0(\Omega _k)]^{\frac{1}{2}}\rightarrow 0 \end{aligned}$$

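as \(n\rightarrow \infty \). Thus, \({\mathbb {E}}(Q\mathbf{1 }_{\Omega _k})=\lambda _k\) for \(k=1,2,\ldots ,r\). Moreover, \(Q\ge 0\) almost surely as the \({{\mathscr {L}}}^2\)-limit of nonnegative functions, so \(Q\mathbf{1 }_{\Omega _k}/\lambda _k\in {\mathcal {Q}}_k\) for each k, and therefore \(Q=\sum _{k=1}^r\lambda _k\,(Q\mathbf{1 }_{\Omega _k}/\lambda _k)\in \lambda _1{\mathcal {Q}}_1+\cdots +\lambda _r{\mathcal {Q}}_r\). This implies \(\lambda _1{\mathcal {Q}}_1+\cdots +\lambda _r{\mathcal {Q}}_r\) is closed in \({{\mathscr {L}}}^2\). \(\square \)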
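As an illustration of (3.2) and (3.3), the sketch below evaluates the risk measure from subdividing the future on a finite sample space and builds a density in the envelope that attains it. The four-point space, the partition into two blocks, and the values of X are hypothetical.

```python
p0 = [0.1, 0.2, 0.3, 0.4]       # base measure P_0 on four outcomes
blocks = [[0, 1], [2, 3]]       # partition of the sample space into Omega_1, Omega_2
X = [5.0, 1.0, -2.0, 0.0]

# lambda_k = P_0(Omega_k)
lam = [sum(p0[i] for i in block) for block in blocks]

# R(X) = sum_k lambda_k * (worst case of X on Omega_k), as in (3.2)
risk = sum(l * max(X[i] for i in block) for l, block in zip(lam, blocks))

# A maximizing density: within each block, put mass lambda_k on the worst outcome.
Q = [0.0] * len(p0)
for l, block in zip(lam, blocks):
    worst = max(block, key=lambda j: X[j])
    Q[worst] = l / p0[worst]

dual_value = sum(x * q * w for x, q, w in zip(X, Q, p0))              # E(XQ)
block_mass = [sum(Q[i] * p0[i] for i in block) for block in blocks]   # E(Q 1_{Omega_k})
```

Here `block_mass` reproduces the weights `lam`, which is the property used in the closedness argument, and `dual_value` coincides with `risk`.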

3.4 The conditional value at risk (CVaR) and the optimized certainty equivalence (OCE)

An important coherent risk measure is the conditional value at risk, popularized by Rockafellar and Uryasev (2000), with the formula

$$\begin{aligned} \mathrm{CVaR}_\alpha (X)=\min _{\beta \in {\mathbb {R}}}\left\{ \beta +{1\over 1-\alpha }{\mathbb {E}}(X-\beta )_+\right\} , \end{aligned}$$
(3.4)

where \((t)_+=\max (t,0)\). We next prove that the risk envelope of CVaR is

$$\begin{aligned} {\mathcal {Q}}_\alpha :=\left\{ Q\in {\mathscr {L}}^2:~{\mathbb {E}}(Q)=1,0\le Q\le {1\over 1-\alpha }\right\} . \end{aligned}$$

For any \(Q\in {\mathcal {Q}}_\alpha \) and \(\beta \in {\mathbb {R}},\) we have

$$\begin{aligned} {\mathbb {E}}(XQ)= & {} {\mathbb {E}}\left[ (X-\beta )Q\right] +\beta {\mathbb {E}}(Q)\\\le & {} \beta +{\mathbb {E}}[Q(X-\beta )_+]\le \beta +{1\over 1-\alpha }{\mathbb {E}}(X-\beta )_+. \end{aligned}$$

Taking supremum on the left hand side over \(Q\in {\mathcal {Q}}_\alpha \) and infimum on the right hand side over all \(\beta \in {\mathbb {R}},\) we get

$$\begin{aligned} \sup _{Q\in {\mathcal {Q}}_\alpha }{\mathbb {E}}(XQ)\le \min _\beta \left\{ \beta +{1\over 1-\alpha }{\mathbb {E}}(X-\beta )_+\right\} . \end{aligned}$$
(3.5)

On the other hand, noting that the “value-at-risk” (VaR) is defined as

$$\begin{aligned} \mathrm {VaR}_\alpha (X):=\inf \left\{ \nu \in {\mathbb {R}}:{\mathbb {P}}_0(X>\nu )<1-\alpha \right\} , \end{aligned}$$

we have

$$\begin{aligned} {\mathbb {P}}_0(X>\hbox {VaR}_\alpha (X))\le 1-\alpha \le {\mathbb {P}}_0(X\ge \hbox {VaR}_\alpha (X)). \end{aligned}$$

Thus, there exists \(\lambda \in [0,1]\) such that

$$\begin{aligned} 1-\alpha =\lambda \cdot {\mathbb {P}}_0(X>\mathrm {VaR}_\alpha (X))+(1-\lambda )\cdot {\mathbb {P}}_0(X\ge \mathrm {VaR}_\alpha (X)). \end{aligned}$$

Set

$$\begin{aligned} Q_0=\frac{1}{1-\alpha }\cdot [\lambda \cdot \mathbf{1 }_{\{X>\mathrm {VaR}_\alpha (X)\}}+(1-\lambda )\cdot \mathbf{1 }_{\{X\ge \mathrm {VaR}_\alpha (X)\}}]. \end{aligned}$$

Note that \(0\le Q_0\le \frac{1}{1-\alpha }\) and \({\mathbb {E}}(Q_0)=1\). Thus \(Q_0\in {\mathcal {Q}}_\alpha \) and

$$\begin{aligned} \sup _{Q\in {\mathcal {Q}}_\alpha }{\mathbb {E}}(XQ)\ge & {} {\mathbb {E}}(XQ_0)\\= & {} {\mathbb {E}}[(X-\mathrm {VaR}_\alpha (X))\cdot Q_0]+\mathrm {VaR}_\alpha (X)\cdot {\mathbb {E}}(Q_0)\\= & {} \mathrm{VaR}_\alpha (X)+{1\over 1-\alpha }\cdot {\mathbb {E}}(X-\mathrm{VaR}_\alpha (X))_+\\\ge & {} \min _{\beta \in {\mathbb {R}}}\left\{ \beta +{1\over 1-\alpha }{\mathbb {E}}(X-\beta )_+\right\} . \end{aligned}$$

Combining (3.5) with the above, we obtain

$$\begin{aligned} \mathrm {CVaR}_\alpha (X)=\sup _{Q\in {\mathcal {Q}}_\alpha }{\mathbb {E}}(XQ). \end{aligned}$$

As a by-product of the proof, we see that the minimum in (3.4) is attained at \(\beta =\mathrm {VaR}_\alpha (X)\), that is,

$$\begin{aligned} \mathrm {CVaR}_\alpha (X)=\mathrm {VaR}_\alpha (X)+\frac{1}{1-\alpha }\cdot {\mathbb {E}}\left( X-\mathrm {VaR}_\alpha (X)\right) _+. \end{aligned}$$
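The three descriptions of CVaR appearing in this proof, namely the minimization formula (3.4), the formula through VaR, and the dual value \({\mathbb {E}}(XQ_0)\), can be compared numerically. The sketch below assumes a finite sample space with hypothetical probabilities and losses; it uses the fact that the objective in (3.4) is convex and piecewise linear in \(\beta \), so on a finite space a minimizer can be found among the values taken by X.

```python
p0 = [0.1, 0.2, 0.3, 0.4]      # base measure P_0 on four outcomes
X = [10.0, 4.0, 1.0, 0.0]      # losses on each outcome
alpha = 0.75


def expect(values):
    """Expectation under the base measure."""
    return sum(v * w for v, w in zip(values, p0))


def objective(beta):
    """beta + E(X - beta)_+ / (1 - alpha), the function minimized in (3.4)."""
    return beta + expect([max(x - beta, 0.0) for x in X]) / (1 - alpha)


# (3.4): search the minimizer among the values of X.
cvar_min = min(objective(b) for b in X)


def tail(nu, strict):
    """P_0(X > nu) if strict, else P_0(X >= nu)."""
    return sum(w for x, w in zip(X, p0) if (x > nu if strict else x >= nu))


# VaR: smallest value nu of X with P_0(X > nu) < 1 - alpha.
var = min(x for x in X if tail(x, True) < 1 - alpha)
cvar_var = objective(var)

# Density Q_0 from the proof, with lam chosen so that E(Q_0) = 1.
p_gt, p_ge = tail(var, True), tail(var, False)
lam = (p_ge - (1 - alpha)) / (p_ge - p_gt)
Q0 = [(lam * (x > var) + (1 - lam) * (x >= var)) / (1 - alpha) for x in X]
cvar_dual = expect([x * q for x, q in zip(X, Q0)])
```

With these numbers the value-at-risk is 4 and all three quantities `cvar_min`, `cvar_var`, and `cvar_dual` equal 6.4 up to rounding.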

Ben-Tal and Teboulle (2007) proved that the negative of their OCE function

$$\begin{aligned} \mathrm{OCE}_u(X)=\sup _\eta \{\eta +{\mathbb {E}}[u(X-\eta )]\}, \end{aligned}$$

where u is a piecewise linear utility function, is a coherent risk measure that includes CVaR as a special case. Since X is a risk rather than an income in our context and we are considering risk rather than utility, we define

$$\begin{aligned} S_r(X):= & {} -\mathrm{OCE}_u(-X)\nonumber \\= & {} \inf _\eta \{-\eta +{\mathbb {E}}[-u(-X-\eta )]\}\nonumber \\= & {} \inf _\beta \{\beta +{\mathbb {E}}[r(X-\beta )]\}, \end{aligned}$$
(3.6)

where \(r(X)=-u(-X)\) and we can similarly show that if

$$\begin{aligned} r(X)=\gamma _1[X]_+-\gamma _2[-X]_+ \quad \text{ with } \quad 0\le \gamma _2<1<\gamma _1, \end{aligned}$$

then \(S_r(X)\) is a coherent risk measure whose risk envelope is described by the bounds \(\gamma _2\le Q\le \gamma _1\), i.e.,

$$\begin{aligned} S_r(X)= & {} \sup _{Q\in {\mathcal {Q}}_{\gamma _1,\gamma _2}}{\mathbb {E}}(XQ), \hbox { where}\; \quad \nonumber \\ {\mathcal {Q}}_{\gamma _1,\gamma _2}:= & {} \left\{ Q\in {\mathcal {P}}:~\gamma _2\le Q\le \gamma _1\right\} . \end{aligned}$$
(3.7)
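The representation (3.7) can be checked on a finite sample space by computing both sides independently. The sketch below is an illustration under assumed data: the primal side minimizes the convex piecewise linear objective in (3.6) over the values of X, and the dual side maximizes \({\mathbb {E}}(XQ)\) over the box-constrained densities greedily, which is the usual solution of a fractional knapsack problem. The numbers and the helper names are hypothetical.

```python
p0 = [0.1, 0.2, 0.3, 0.4]      # base measure P_0 on four outcomes
X = [10.0, 4.0, 1.0, 0.0]
g1, g2 = 2.0, 0.5              # gamma_1 > 1 > gamma_2 >= 0


def r(t):
    """r(t) = gamma_1 * t_+ - gamma_2 * (-t)_+"""
    return g1 * max(t, 0.0) - g2 * max(-t, 0.0)


def primal(beta):
    """beta + E[r(X - beta)], the objective in (3.6)."""
    return beta + sum(w * r(x - beta) for x, w in zip(X, p0))


# The objective is convex and piecewise linear, so a minimizer lies among the values of X.
s_primal = min(primal(b) for b in X)

# Dual side: start from the lower bound gamma_2 everywhere, then spend the remaining
# probability mass on the largest outcomes first, never exceeding the upper bound gamma_1.
Q = [g2] * len(X)
budget = 1.0 - g2
for i in sorted(range(len(X)), key=lambda j: -X[j]):
    add = min(g1 - g2, budget / p0[i])   # admissible increase of the density at outcome i
    Q[i] += add
    budget -= add * p0[i]

s_dual = sum(x * q * w for x, q, w in zip(X, Q, p0))   # E(XQ)
total_mass = sum(q * w for q, w in zip(Q, p0))         # E(Q)
```

For these data both `s_primal` and `s_dual` equal 3.8 up to rounding, and the constructed `Q` satisfies the bounds and has expectation one.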

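It is interesting to observe that OCE can be represented by CVaR, namely

$$\begin{aligned} S_r(X)=\gamma _2{\mathbb {E}}(X)+(1-\gamma _2)\,\mathrm {CVaR}_\alpha (X), \text{ where } \alpha =1-\frac{1-\gamma _2}{\gamma _1-\gamma _2}. \end{aligned}$$

This formula can be obtained by using Proposition 2.1 and the fact that every \(Q\in {\mathcal {Q}}_{\gamma _1,\gamma _2}\) can be written as \(Q=\gamma _2+(1-\gamma _2)Q'\) with \(Q'\in {\mathcal {Q}}_\alpha \) (note that the weights sum to one, so that \(S_r(C)=C\) for constants), that is,

$$\begin{aligned} {\mathcal {Q}}_{\gamma _1,\gamma _2}=\gamma _2\{1\}+(1-\gamma _2){\mathcal {Q}}_\alpha . \end{aligned}$$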

3.5 The mean-deviation

Fix \(0\le \lambda \le 1\). Define the mean-deviation risk measure as

$$\begin{aligned} {\mathcal {R}}(X)={\mathbb {E}}X+\lambda \cdot \Vert (X-{\mathbb {E}}X)_+\Vert _2 \end{aligned}$$

for all \(X\in {\mathscr {L}}^2\), where \(\Vert \cdot \Vert _2\) denotes the \({\mathscr {L}}^2\)-norm, that is, \(\Vert X\Vert _2:=\left[ {\mathbb {E}}(X^2)\right] ^{\frac{1}{2}}.\)

Similar to (3.1), we define

$$\begin{aligned} \text {ess-inf}(X):=\sup \{a:~{\mathbb {P}}_0(X<a)=0\}. \end{aligned}$$
(3.8)

We claim that the risk envelope of \({\mathcal {R}}\) is

$$\begin{aligned} {\mathcal {Q}}=\left\{ 0\le Q\in {{\mathscr {L}}}^2:~{\mathbb {E}}(Q)=1,~\Vert Q-\text {ess-inf}Q\Vert _2\le \lambda \right\} . \end{aligned}$$

In fact, on one hand, for any \(X\in {{\mathscr {L}}}^2\) and \(Q\in {\mathcal {Q}}\), we have

$$\begin{aligned} {\mathbb {E}}(XQ)= & {} {\mathbb {E}}[(X-{\mathbb {E}}X)(Q-\text {ess-inf}Q)]+{\mathbb {E}}X\le {\mathbb {E}}X\\&+\,{\mathbb {E}}[(X-{\mathbb {E}}X)_+(Q-\text {ess-inf}Q)]\\\le & {} {\mathbb {E}}X+\Vert (X-{\mathbb {E}}X)_+\Vert _2\cdot \Vert Q-\text {ess-inf}Q\Vert _2\\\le & {} {\mathbb {E}}X+\lambda \cdot \Vert (X-{\mathbb {E}}X)_+\Vert _2 \end{aligned}$$

by the Cauchy–Schwarz inequality. Hence we get

$$\begin{aligned} \sup \limits _{Q\in {\mathcal {Q}}}{\mathbb {E}}(XQ)\le {\mathbb {E}}X+\lambda \cdot \Vert (X-{\mathbb {E}}X)_+\Vert _2 \end{aligned}$$
(3.9)

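for any \(X\in {{\mathscr {L}}}^2\). On the other hand, if X is constant almost surely, then both sides of (3.9) equal \({\mathbb {E}}X\). Otherwise \(\Vert (X-{\mathbb {E}}X)_+\Vert _2>0\) and we may set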

$$\begin{aligned} Q_0:=1+\frac{\lambda \cdot \left[ (X-{\mathbb {E}}X)_+-{\mathbb {E}}(X-{\mathbb {E}}X)_+\right] }{\Vert (X-{\mathbb {E}}X)_+\Vert _2}. \end{aligned}$$

Since \(0\le \lambda \le 1\), we have

$$\begin{aligned} \text {ess-inf}Q_0=1-\frac{\lambda \cdot {\mathbb {E}}(X-{\mathbb {E}}X)_+}{\Vert (X-{\mathbb {E}}X)_+\Vert _2}\ge 1-\frac{{\mathbb {E}}(X-{\mathbb {E}}X)_+}{\Vert (X-{\mathbb {E}}X)_+\Vert _2}\ge 0. \end{aligned}$$

Thus, \(0\le Q_0\in {{\mathscr {L}}}^2, {\mathbb {E}}Q_0=1\) and

$$\begin{aligned} \Vert Q_0-\text {ess-inf}Q_0\Vert _2=\frac{\left\| \lambda \cdot (X-{\mathbb {E}}X)_+\right\| _2}{\Vert (X-{\mathbb {E}}X)_+\Vert _2}=\lambda , \end{aligned}$$

that is, \(Q_0\in {\mathcal {Q}}\). Then for any \(X\in {{\mathscr {L}}}^2\),

$$\begin{aligned} \sup \limits _{Q\in {\mathcal {Q}}}{\mathbb {E}}(XQ)\ge & {} {\mathbb {E}}(XQ_0)\nonumber \\= & {} {\mathbb {E}}X+\frac{\lambda \cdot {\mathbb {E}}\left[ (X-{\mathbb {E}}X)_+\cdot (X-{\mathbb {E}}X)\right] }{\Vert (X-{\mathbb {E}}X)_+\Vert _2}\nonumber \\= & {} {\mathbb {E}}X+\frac{\lambda \cdot \Vert (X-{\mathbb {E}}X)_+\Vert _2^2}{\Vert (X-{\mathbb {E}}X)_+\Vert _2}\nonumber \\= & {} {\mathbb {E}}X+\lambda \cdot \Vert (X-{\mathbb {E}}X)_+\Vert _2. \end{aligned}$$
(3.10)

(3.9) and (3.10) together imply

$$\begin{aligned} \sup \limits _{Q\in {\mathcal {Q}}}{\mathbb {E}}(XQ)={\mathbb {E}}X+\lambda \cdot \Vert (X-{\mathbb {E}}X)_+\Vert _2. \end{aligned}$$

We can check that \({\mathcal {Q}}\) is nonempty, convex and closed in \({{\mathscr {L}}}^2\). Therefore, it is the risk envelope for the mean-deviation risk measure.
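The construction of \(Q_0\) in this proof can be traced step by step on a finite sample space. The following sketch uses hypothetical data with a non-constant X and checks that \(Q_0\) lies in the envelope and attains the mean-deviation risk measure.

```python
import math

p0 = [0.25, 0.25, 0.5]     # base measure P_0 on three outcomes
X = [4.0, 0.0, -2.0]       # a non-constant random variable
lam = 0.8                  # 0 <= lambda <= 1


def expect(values):
    """Expectation under the base measure."""
    return sum(v * w for v, w in zip(values, p0))


def norm2(values):
    """L2 norm with respect to the base measure."""
    return math.sqrt(expect([v * v for v in values]))


EX = expect(X)
dev = [max(x - EX, 0.0) for x in X]        # (X - EX)_+
risk = EX + lam * norm2(dev)               # mean-deviation risk measure

# Q_0 as defined in the proof.
Q0 = [1 + lam * (d - expect(dev)) / norm2(dev) for d in dev]
ess_inf = min(q for q, w in zip(Q0, p0) if w > 0)
spread = norm2([q - ess_inf for q in Q0])  # ||Q_0 - ess-inf Q_0||_2
dual_value = expect([x * q for x, q in zip(X, Q0)])
```

With these numbers `Q0` is nonnegative with expectation one, `spread` equals `lam`, and `dual_value` equals `risk` (both are 1.6 up to rounding).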

It should be noted that \(\lambda \le 1\) is necessary for coherency as shown by the following example. Consider

$$\begin{aligned} {\mathcal {R}}(X)={\mathbb {E}}X+\lambda \cdot \Vert (X-{\mathbb {E}}X)_+\Vert _2, \end{aligned}$$

where X is a discrete random variable with distribution

$$\begin{aligned} {\mathbb {P}}(X=-1)=p,\quad {\mathbb {P}}(X=0)=1-p, \end{aligned}$$

where \(0<p<1\). Then \({\mathbb {E}}X=-p\), so

$$\begin{aligned} {\mathbb {P}}((X-{\mathbb {E}}X)_+=0)=p,\quad {\mathbb {P}}((X-{\mathbb {E}}X)_+=p)=1-p, \end{aligned}$$

and therefore, \({\mathcal {R}}(X)=-p+\lambda p\sqrt{1-p}=p(\lambda \sqrt{1-p}-1)\). If \(\lambda >1\), we can take \(p>0\) sufficiently small (any \(p<1-\lambda ^{-2}\) works) to get \({\mathcal {R}}(X)>0\). However, since \(X\le 0\) almost surely, monotonicity together with \({\mathcal {R}}(0)=0\) would require \({\mathcal {R}}(X)\le 0\), a contradiction.
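The two-point counterexample can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function names `mean_deviation_risk` and `two_point_risk` are illustrative, and the values \(p=0.01\), \(\lambda \in \{1,1.5\}\) are chosen only for the demonstration.

```python
from math import sqrt

def mean_deviation_risk(values, probs, lam):
    """E[X] + lam * ||(X - E[X])_+||_2 for a finite discrete X."""
    mean = sum(v * p for v, p in zip(values, probs))
    # Second moment of the upper deviation (X - E[X])_+ under the given law.
    upper_sq = sum(p * max(v - mean, 0.0) ** 2 for v, p in zip(values, probs))
    return mean + lam * sqrt(upper_sq)

def two_point_risk(p, lam):
    """Risk of X with P(X = -1) = p and P(X = 0) = 1 - p."""
    return mean_deviation_risk([-1.0, 0.0], [p, 1.0 - p], lam)

if __name__ == "__main__":
    # Illustrative parameters: lam = 1 respects monotonicity, lam = 1.5 violates it.
    for lam in (1.0, 1.5):
        print(lam, two_point_risk(0.01, lam))
```

With \(\lambda =1\) the computed risk is nonpositive, while with \(\lambda =1.5\) and small p it becomes positive even though \(X\le 0\).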

4 Discussion on aversity

In this section, we study the effect of aversity on risk measures. Suppose \({\mathcal {R}}\) is a functional from \({{\mathscr {L}}}^2\) to \((-\infty ,+\infty ]\). Recall that an averse risk measure is defined by axioms (A1), (A2), (A4), (A5) and

(A6):

\({\mathcal {R}}(X)>{\mathbb {E}}(X)\) for all non-constant X.

We are interested in the risk measures which are both coherent and averse. Next we develop conditions on risk envelopes under which a coherent risk measure is averse. We use the notation “\(A\subset B\)” to denote that A is a proper subset of B, that is, \(A\subseteq B\) but \(A\ne B\). The following necessary condition is trivial.

Proposition 4.1

Suppose \({\mathcal {R}}\) is a coherent risk measure on \({{\mathscr {L}}}^2\) with risk envelope \({\mathcal {Q}}\). If \({\mathcal {R}}\) is averse, then \(\{\mathbf{1}\}\subset {\mathcal {Q}}\).

On the other hand, a sufficient condition is stated in the following proposition.

Proposition 4.2

Suppose \({\mathcal {R}}\) is a coherent risk measure with risk envelope \({\mathcal {Q}}\). If \(\mathbf{1}\) is a relative interior point of \({\mathcal {Q}}\) (relative to \({\mathcal {P}}\)), then \({\mathcal {R}}\) is averse.

Proof

Since \(\mathbf{1}\) is a relative interior point of \({\mathcal {Q}}\) (relative to \({\mathcal {P}}\)), there exists \(\delta \in (0,1)\) such that

$$\begin{aligned} \{Q\in {\mathcal {P}}:~\Vert Q-\mathbf{1}\Vert _2<\delta \}\subseteq {\mathcal {Q}}. \end{aligned}$$
(4.1)

If X is not almost surely constant, then there exists \(b\in {\mathbb {R}}\) such that

$$\begin{aligned} {\mathbb {P}}_0(X\ge b)=p\in (0,1),\quad {\mathbb {P}}_0(X<b)=1-p\in (0,1). \end{aligned}$$

Set

$$\begin{aligned} Q_0:=\left\{ \begin{array}{ll}1+(1-p)\delta &{}\quad \text{ if }~X\ge b,\\ 1-p\delta &{}\quad \text{ if }~X<b. \end{array}\right. \end{aligned}$$

Then we have

$$\begin{aligned} Q_0\ge 0,\quad {\mathbb {E}}(Q_0)=1,\quad \Vert Q_0-\mathbf{1}\Vert _2<\delta . \end{aligned}$$

By (4.1), we can get that \(Q_0\in {\mathcal {Q}}\). Thus,

$$\begin{aligned} {\mathbb {E}}(XQ_0)\le \sup \limits _{Q\in {\mathcal {Q}}}{\mathbb {E}}(XQ)={\mathcal {R}}(X). \end{aligned}$$
(4.2)

Furthermore, we have

$$\begin{aligned} {\mathbb {E}}(XQ_0)-{\mathbb {E}}(X)= & {} (1-p)\delta \cdot {\mathbb {E}}(X\mathbf{1 }_{\{X\ge b\}})-p\delta \cdot {\mathbb {E}}(X\mathbf{1 }_{\{X<b\}})\nonumber \\&>(1-p)\delta b\cdot {\mathbb {P}}_0(X\ge b)-p\delta b\cdot {\mathbb {P}}_0(X<b)=0. \end{aligned}$$
(4.3)

(4.2) and (4.3) together imply that \({\mathcal {R}}(X)>{\mathbb {E}}(X)\) for all non-constant X. Therefore, \({\mathcal {R}}\) is averse. \(\square \)
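The construction of \(Q_0\) in the proof can be replayed on a finite probability space with exact rational arithmetic. This is only a sketch under assumptions: the sample values of X, the uniform base measure, the threshold b and the radius \(\delta \) are all hypothetical choices, and `tilted_density` is an illustrative name.

```python
from fractions import Fraction

def tilted_density(values, probs, b, delta):
    """Q_0 from the proof: 1 + (1-p)*delta on {X >= b}, 1 - p*delta on {X < b}."""
    p = sum(pr for v, pr in zip(values, probs) if v >= b)
    return [1 + (1 - p) * delta if v >= b else 1 - p * delta for v in values]

def expectation(values, probs):
    """Expectation of a finite random variable under the base measure."""
    return sum(v * pr for v, pr in zip(values, probs))

# Hypothetical data: a non-constant X on four equally likely outcomes.
values = [Fraction(-2), Fraction(0), Fraction(1), Fraction(5)]
probs = [Fraction(1, 4)] * 4
b, delta = Fraction(1), Fraction(1, 2)

q0 = tilted_density(values, probs, b, delta)
mass = expectation(q0, probs)                                  # E(Q_0)
dist_sq = expectation([(q - 1) ** 2 for q in q0], probs)       # ||Q_0 - 1||_2^2
gain = expectation([v * q for v, q in zip(values, q0)], probs) - expectation(values, probs)

if __name__ == "__main__":
    print(q0, mass, dist_sq, gain)
```

In this instance \(\Vert Q_0-\mathbf{1}\Vert _2^2=p(1-p)\delta ^2\), which is strictly below \(\delta ^2\), and the gain \({\mathbb {E}}(XQ_0)-{\mathbb {E}}(X)\) is strictly positive, as (4.3) asserts.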

From Propositions 4.1 and 4.2, we can get the following:

$$\begin{aligned} \mathbf{1}~\text{ is } \text{ a } \text{ relative } \text{ interior } \text{ point } \text{ of }~{\mathcal {Q}}~(\text{ relative } \text{ to }~{\mathcal {P}})\Longrightarrow {\mathcal {R}}~\text{ is } \text{ averse }\Longrightarrow \{\mathbf{1}\}\subset {\mathcal {Q}}. \end{aligned}$$
(4.4)

Generally, the converse of (4.4) may not be true, which can be seen from the following two examples.

Example 4.1

Suppose \(\Omega =[0,1], \Sigma \) is the Borel sigma algebra on [0, 1], and \({\mathbb {P}}_0\) is the Lebesgue measure. In this case

$$\begin{aligned} \{\mathbf{1}\}:=\{{\widetilde{Q}}_1(\omega )\equiv 1\}. \end{aligned}$$

Consider \({\mathcal {R}}=\mathrm {CVaR}_{0.5}\). By Rockafellar (2007), \({\mathcal {R}}\) is a coherent and averse risk measure with risk envelope \({\mathcal {Q}}=\{Q\in {{\mathscr {L}}}^2:~0\le Q\le 2,~{\mathbb {E}}(Q)=1\}\). However, \(\mathbf{1}\) is not an interior point of \({\mathcal {Q}}\) relative to \({\mathcal {P}}\). In fact, for any \(\delta \in (0,1)\), the random variable \({\widetilde{Q}}_{\delta }\) defined as

$$\begin{aligned} {\widetilde{Q}}_{\delta }(\omega )=\left\{ \begin{array}{ll} 3&{}\,\,\omega \in \left[ 0,\frac{\delta ^2}{16+\delta ^2}\right] ,\\ 1-\frac{\delta ^2}{8}&{}\,\,\omega \in (\frac{\delta ^2}{16+\delta ^2},1] \end{array}\right. \end{aligned}$$

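belongs to \({\mathcal {P}}\) and satisfies \(\Vert {\widetilde{Q}}_{\delta }-{\widetilde{Q}}_1\Vert _2=\delta /2\), so it is arbitrarily close to \({\widetilde{Q}}_1\); but \({\widetilde{Q}}_{\delta }\not \in {\mathcal {Q}}\), since it takes the value \(3>2\) on a set of positive measure. Therefore, \(\mathbf{1}\) is not a relative interior point of \({\mathcal {Q}}\). Hence the converse of the first “\(\Longrightarrow \)” in (4.4) may not be true.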
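The properties of \({\widetilde{Q}}_{\delta }\) used in Example 4.1 can be verified exactly. The sketch below is an illustration only: because \({\widetilde{Q}}_{\delta }\) is a two-valued step function on [0, 1] with Lebesgue measure, its moments reduce to weighted sums, and the helper names are hypothetical.

```python
from fractions import Fraction

def perturbed_density(delta):
    """Return (length of the first interval, value there, value elsewhere)."""
    a = delta ** 2 / (16 + delta ** 2)
    return a, Fraction(3), 1 - delta ** 2 / 8

def check(delta):
    """Return E(Q_delta), ||Q_delta - 1||_2^2, and the two values of Q_delta."""
    a, hi, lo = perturbed_density(delta)
    mean = a * hi + (1 - a) * lo
    dist_sq = a * (hi - 1) ** 2 + (1 - a) * (lo - 1) ** 2
    return mean, dist_sq, hi, lo

if __name__ == "__main__":
    # Illustrative radii; any delta in (0, 1) can be used.
    for delta in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100)):
        mean, dist_sq, hi, lo = check(delta)
        print(delta, mean, dist_sq, hi > 2, lo >= 0)
```

For each tested \(\delta \) the density has mean one, squared distance \(\delta ^2/4\) from \(\mathbf{1}\), and an upper value exceeding the bound 2 that defines the envelope of \(\mathrm {CVaR}_{0.5}\).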

Example 4.2

Suppose \(\Omega =\{\omega _1,\omega _2,\omega _3\}\) and \({\mathbb {P}}_0(\{\omega _1\})={\mathbb {P}}_0(\{\omega _2\})={\mathbb {P}}_0(\{\omega _3\})=1/3\). Let

$$\begin{aligned} Q_0:~~Q_0(\omega _1)=\frac{3}{4},\quad Q_0(\omega _2)=\frac{3}{2},\quad Q_0(\omega _3)=\frac{3}{4}. \end{aligned}$$

Then \(Q_0\in {\mathcal {P}}\) and in this case

$$\begin{aligned} \mathbf{1}:=Q_1\,{:}\,Q_1(\omega _1)=1,\quad Q_1(\omega _2)=1,\quad Q_1(\omega _3)=1. \end{aligned}$$

Take \({\mathcal {Q}}:=\mathrm {conv}\{Q_1,Q_0\}\), then \(\{\mathbf{1}\}\subset {\mathcal {Q}}\). However, for the non-constant random variable

$$\begin{aligned} X\,{:}\,X(\omega _1)=-1,\quad X(\omega _2)=0,\quad X(\omega _3)=1, \end{aligned}$$

one has

$$\begin{aligned} {\mathcal {R}}(X)=\sup _{Q\in {\mathcal {Q}}}{\mathbb {E}}(XQ)=\max \{{\mathbb {E}}(XQ_1),{\mathbb {E}}(XQ_0)\}=0={\mathbb {E}}(X). \end{aligned}$$

Therefore, \({\mathcal {R}}\) is not averse.
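Example 4.2 is small enough to be recomputed directly. The sketch below uses exact fractions and relies on the fact that a linear functional over \(\mathrm {conv}\{Q_1,Q_0\}\) attains its supremum at one of the two vertices; the helper names and the second test variable Y are illustrative additions.

```python
from fractions import Fraction

P0 = [Fraction(1, 3)] * 3                               # base measure
Q1 = [Fraction(1)] * 3                                  # the constant density 1
Q0 = [Fraction(3, 4), Fraction(3, 2), Fraction(3, 4)]   # second vertex of the envelope
X = [Fraction(-1), Fraction(0), Fraction(1)]            # the non-constant X of the example
Y = [Fraction(0), Fraction(1), Fraction(0)]             # hypothetical extra test variable

def expect(x, q=Q1):
    """E(XQ) under the base measure P0; q defaults to the constant density."""
    return sum(xi * qi * pi for xi, qi, pi in zip(x, q, P0))

def risk(x):
    """sup of E(XQ) over conv{Q1, Q0}, attained at a vertex."""
    return max(expect(x, Q1), expect(x, Q0))

if __name__ == "__main__":
    print(expect(Q0), risk(X), expect(X), risk(Y), expect(Y))
```

The computation confirms \({\mathcal {R}}(X)={\mathbb {E}}(X)=0\) for the X above, while for other non-constant variables such as Y the risk can still strictly exceed the mean.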

From Example 4.2 we can see that the converse of the second “\(\Longrightarrow \)” in (4.4) may not hold even when \(\Omega \) is finite. However, the converse of the first “\(\Longrightarrow \)” always holds when \(\Omega \) is finite, see the following proposition.

Proposition 4.3

If \(\Omega \) is finite and \({\mathcal {R}}\) is a coherent risk measure with risk envelope \({\mathcal {Q}}\), then \({\mathcal {R}}\) is averse if and only if \(\mathbf{1}\) is a relative interior point of \({\mathcal {Q}}\).

Proof

By Proposition 4.2, we only need to prove one direction, that is, aversity implies that \(\mathbf{1}\) is a relative interior point. Suppose \(\Omega =\{\omega _1,\ldots ,\omega _n\}\) and \({\mathbb {P}}_0(\{\omega _i\})=p_i>0\) for \(i=1,2,\ldots ,n\). In this case,

$$\begin{aligned} {\mathcal {P}}=\left\{ (q_1,\ldots ,q_n):~q_1,\ldots ,q_n\ge 0,~\sum \limits _{i=1}^nq_ip_i=1\right\} , \end{aligned}$$

and the risk envelope of \({\mathcal {R}}\) is a certain nonempty closed convex set \({\mathcal {Q}}\subseteq {\mathcal {P}}\), that is,

$$\begin{aligned} {\mathcal {R}}(X)=\max \limits _{(q_1,\ldots ,q_n)\in {\mathcal {Q}}}\{x_1q_1p_1+\cdots +x_nq_np_n\} \end{aligned}$$

for \(X=(x_1,\ldots ,x_n)\in {\mathbb {R}}^n\). Here, \(x_i=X(\omega _i)\) for \(i=1,2,\ldots ,n\). Moreover, since \({\mathcal {R}}\) is averse, we have

$$\begin{aligned} \max \limits _{(q_1,\ldots ,q_n)\in {\mathcal {Q}}} \{x_1q_1p_1+\cdots +x_nq_np_n\}>x_1p_1+\cdots +x_np_n \end{aligned}$$
(4.5)

whenever X is not constant, that is, whenever \(x_1,\ldots ,x_n\) are not all equal. Note that the affine hull of \({\mathcal {P}}\) is a hyperplane of dimension \(n-1\) with a normal vector \((p_1,\ldots ,p_n)\). Let the apostrophe of a vector denote its transpose. To prove that \((1,\ldots ,1)\) is an interior point of \({\mathcal {Q}}\) relative to \({\mathcal {P}}\), we only need to prove that

$$\begin{aligned} \max \limits _{(q_1,\ldots ,q_n)\in {\mathcal {Q}}}(y_1,\ldots ,y_n)[(q_1,\ldots ,q_n)-(1,\ldots ,1)]'>0 \end{aligned}$$
(4.6)

for any \((y_1,\ldots ,y_n)\) that is not a normal vector of the affine hull of \({\mathcal {P}}\). In other words, we show that (4.6) holds for any \((y_1,\ldots ,y_n)\) that is not a multiple of \((p_1,\ldots ,p_n)\).

To prove (4.6), note that if \(\frac{y_1}{p_1},\ldots ,\frac{y_n}{p_n}\) are not all equal, then setting \(x_i={y_i\over p_i}\) in (4.5) gives

$$\begin{aligned} \max \limits _{(q_1,\ldots ,q_n)\in {\mathcal {Q}}}\{y_1q_1+\cdots +y_nq_n\}= & {} \max \limits _{(q_1,\ldots ,q_n)\in {\mathcal {Q}}}\left\{ \frac{y_1}{p_1}\cdot q_1p_1+\cdots +\frac{y_n}{p_n}\cdot q_np_n\right\} \\= & {} \max \limits _{(q_1,\ldots ,q_n)\in {\mathcal {Q}}}\{x_1q_1p_1+\cdots +x_nq_np_n\}\\> & {} x_1p_1+\cdots +x_np_n\\= & {} y_1+\cdots +y_n. \end{aligned}$$

Therefore (4.6) is true, implying that \((1,1,\ldots ,1)\) is an interior point of \({\mathcal {Q}}\) relative to \({\mathcal {P}}.\) \(\square \)

We next analyze the examples in Sect. 3. Obviously, the expectation measure \({\mathbb {E}}(\cdot )\) in Sect. 3.1 is not averse. We call a risk measure \({\mathcal {R}}\) “law-invariant” if \({\mathcal {R}}(X)={\mathcal {R}}(Y)\) whenever X and Y have the same distribution under \({\mathbb {P}}_0\). Föllmer and Schied (2002) proved that if \({\mathcal {R}}\) is a coherent, law-invariant risk measure in \({{\mathscr {L}}}^\infty \) (not \({{\mathscr {L}}}^2\)) other than \({\mathbb {E}}(\cdot )\), then \({\mathcal {R}}\) is averse. This suggests that the examples in Sects. 3.2, 3.4 and 3.5 are all averse. However, since we are considering the \({{\mathscr {L}}}^2\) case, we cannot use the result in Föllmer and Schied (2002) directly. We also note that the result in the \({{\mathscr {L}}}^2\) space has appeared in Rockafellar and Uryasev (2013) without proof. For completeness, we give a direct proof in the next proposition.

Proposition 4.4

The worst-case, CVaR, OCE and mean-deviation, as risk measures, are all averse.

Proof

The proof is trivial for \(\text {ess-sup}(\cdot )\), since the expectation of any random variable is no larger than its essential supremum, and they are equal if and only if the random variable is a constant almost surely.

For the mean deviation measure, obviously, we have \({\mathbb {E}}X+\lambda \cdot \Vert (X-{\mathbb {E}}X)_+\Vert _2\ge {\mathbb {E}}X\) for any \(X\in {{\mathscr {L}}}^2\), in which the equality holds if and only if \(X\le {\mathbb {E}}X\) almost surely, which implies \(X={\mathbb {E}}X\) (i.e., X is a constant) almost surely. Therefore, the mean deviation measure is averse.

For the OCE measure, since \(1\in {\mathcal {Q}}_{\gamma _1,\gamma _2}\), the dual representation \(S_r(X)=\sup _{Q\in {\mathcal {Q}}_{\gamma _1,\gamma _2}}{\mathbb {E}}(XQ)\) gives \(S_r(X)\ge {\mathbb {E}}(X)\). Next, if

$$\begin{aligned} {\mathbb {E}}(X)=S_r(X)=\min _{\beta \in {\mathbb {R}}}\big \{\beta +{\mathbb {E}}[\gamma _1(X-\beta )_+-\gamma _2(\beta -X)_+]\big \}, \end{aligned}$$

then there exists a constant \(\beta _0\in {\mathbb {R}}\) such that

$$\begin{aligned} \beta _0+{\mathbb {E}}\big [\gamma _1(X-\beta _0)_+-\gamma _2(\beta _0-X)_+\big ]={\mathbb {E}}(X)=\beta _0+{\mathbb {E}}\big [(X-\beta _0)_+-(\beta _0-X)_+\big ], \end{aligned}$$

that is,

$$\begin{aligned} (\gamma _1-1){\mathbb {E}}[(X-\beta _0)_+]+(1-\gamma _2){\mathbb {E}}[(\beta _0-X)_+]=0. \end{aligned}$$

Since \(0\le \gamma _2<1<\gamma _1\), we can get \({\mathbb {E}}[(X-\beta _0)_+]={\mathbb {E}}[(\beta _0-X)_+]=0\), and therefore, \(X=\beta _0\) almost surely. Hence the OCE measure is averse.

Finally, setting \(\gamma _1=(1-\alpha )^{-1}\) and \(\gamma _2=0\) in (3.6), we obtain CVaR. Thus, CVaR is averse.\(\square \)
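The minimization formula for the OCE measure, and its reduction to CVaR, can be illustrated on a finite distribution. The following is a sketch under stated assumptions: X is uniform on \(\{1,\ldots ,10\}\) (a hypothetical choice), and the minimum over \(\beta \) is found by scanning the atoms of X, which suffices because for \(0\le \gamma _2<1<\gamma _1\) the objective is convex and piecewise linear with kinks only at the atoms.

```python
from fractions import Fraction

def oce(values, probs, gamma1, gamma2):
    """min over beta of beta + E[gamma1*(X-beta)_+ - gamma2*(beta-X)_+]."""
    def objective(beta):
        return beta + sum(
            p * (gamma1 * max(v - beta, 0) - gamma2 * max(beta - v, 0))
            for v, p in zip(values, probs)
        )
    # Convex piecewise-linear objective: a minimizer exists among the atoms of X.
    return min(objective(beta) for beta in values)

def cvar(values, probs, alpha):
    """CVaR as the OCE special case gamma1 = 1/(1-alpha), gamma2 = 0."""
    return oce(values, probs, 1 / (1 - alpha), 0)

# Hypothetical data: X uniform on {1, ..., 10}.
values = [Fraction(k) for k in range(1, 11)]
probs = [Fraction(1, 10)] * 10
alpha = Fraction(4, 5)
mean = sum(v * p for v, p in zip(values, probs))

if __name__ == "__main__":
    print(cvar(values, probs, alpha), oce(values, probs, 2, Fraction(1, 2)), mean)
```

For this distribution the CVaR at level 0.8 equals the average of the two largest outcomes, and both risk values strictly exceed the mean, consistent with aversity.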

In contrast, we next show that the risk measure from subdividing the future is not averse.

Proposition 4.5

The risk measure defined in (3.2) is not averse if \(r\ge 2\).

Proof

If \({\mathbb {P}}_0(\Omega _k)\ne \lambda _k\) for some \(k=1,2,\ldots ,r\), then by (3.3), \(1\not \in {\mathcal {Q}}\). Thus, by Proposition 4.1, \({\mathcal {R}}\) is not averse.

If \({\mathbb {P}}_0(\Omega _k)=\lambda _k\) for all \(k=1,2,\ldots ,r\), then set \(X=\sum \nolimits _{k=1}^rk\mathbf{1 }_{\Omega _k}\). Obviously X is nonconstant. We have

$$\begin{aligned} {\mathcal {R}}(X)=\sum \limits _{k=1}^r\lambda _k\cdot k=\sum \limits _{k=1}^rk{\mathbb {P}}_0(\Omega _k)={\mathbb {E}}(X), \end{aligned}$$

which implies that \({\mathcal {R}}\) is not averse.\(\square \)
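The second case of the proof can be illustrated on a finite space. The sketch below assumes that (3.2) has the form \({\mathcal {R}}(X)=\sum _k\lambda _k\sup _{\omega \in \Omega _k}X(\omega )\), which is consistent with the computation in the proof; the four-point space, the partition and the names used are hypothetical.

```python
from fractions import Fraction

P0 = [Fraction(1, 4)] * 4      # hypothetical uniform base measure on four outcomes
blocks = [[0, 1], [2, 3]]      # partition Omega_1, Omega_2, given by outcome indices
lam = [sum(P0[i] for i in blk) for blk in blocks]   # lambda_k = P0(Omega_k)

def subdivision_risk(x):
    """Assumed form of (3.2): sum over k of lambda_k * sup of X on Omega_k."""
    return sum(l * max(x[i] for i in blk) for l, blk in zip(lam, blocks))

def mean(x):
    """Expectation under P0."""
    return sum(xi * pi for xi, pi in zip(x, P0))

# X = sum_k k * 1_{Omega_k}: constant on each block but not globally constant.
X = [k + 1 for k, blk in enumerate(blocks) for _ in blk]
# A variable that varies inside each block, for comparison.
Z = [0, 1, 0, 1]

if __name__ == "__main__":
    print(subdivision_risk(X), mean(X), subdivision_risk(Z), mean(Z))
```

The block-wise constant X has risk equal to its mean, whereas a variable that varies within the blocks is penalized; this is why aversity fails only on part of the space.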

Although the risk measure from subdividing the future is not averse, this risk measure can be used in composition with other averse measures (say, CVaR) to create new risk measures that make practical sense. We leave this topic for future research.

5 Coherent risk measures on subspaces: risk envelopes and uncertainty sets

Recently, coherent risk measures have been studied in the literature of robust optimization. For instance, several coherent risk measures were constructed by using the so-called uncertainty sets in Natarajan et al. (2009), while Bertsimas and Brown (2009) examined the question from a different perspective: If risk preferences are specified by a coherent risk measure, how would the uncertainty set be constructed? In general, from the viewpoint of robust optimization, a risk measure is applied to a random variable of a special structure (say, a linear combination of basic random variables) and is defined by uncertainty sets without involving the exact details of the probability structure of the random variables. In particular, the mean-standard deviation measure, the discrete CVaR, and the distortion risk measure are defined through cone-representable uncertainty sets. If the same risk measure can be constructed by both a risk envelope and an uncertainty set, then there must be a certain relation between the two. It is therefore of interest to explore the connection between risk envelopes and uncertainty sets. This would help to gain a deeper understanding of robust optimization.

Let us consider a rather general case in robust optimization, where all uncertain data are linear functions of a finite number of random variables, \(X_1,\ldots ,X_n,\) where \(X_i\in {{\mathscr {L}}}^2(\Omega ,\Sigma ,{\mathbb {P}}_0)\) for \(1\le i\le n\). Denote

$$\begin{aligned} {\mathcal {V}}:=\left\{ X= \sum \limits _{i=1}^na_iX_i:~ a_1,\ldots ,a_n\in {\mathbb {R}}\right\} . \end{aligned}$$

Then \({\mathcal {V}}\) is the subspace generated by \(X_1,\ldots ,X_n\). Let \({\mathcal {R}}\) be a coherent risk measure on \({{\mathscr {L}}}^2(\Omega ,\Sigma ,{\mathbb {P}}_0).\) We define a risk envelope by

$$\begin{aligned} {\mathcal {Q}}_{\mathcal {V}}:=\left\{ Q\in {\mathcal {P}}:~{\mathbb {E}}(XQ)\le {\mathcal {R}}(X)~\text{ for } \text{ all }~X\in {\mathcal {V}}\right\} . \end{aligned}$$
(5.1)

It is easy to check that \({\mathcal {Q}}_{\mathcal {V}}\subseteq {\mathcal {P}}\) and is nonempty, convex and closed, so it is a risk envelope with an induced risk measure

$$\begin{aligned} {\mathcal {R}}_{\mathcal {V}}\left( X\right) =\sup \limits _{Q\in {\mathcal {Q}}_{\mathcal {V}}}{\mathbb {E}}(XQ). \end{aligned}$$
(5.2)

Note that the risk envelope \({\mathcal {Q}}_{\mathcal {V}}\), together with \({\mathcal {R}}_{\mathcal {V}}\), relies on the choice of the subspace \({\mathcal {V}}\) as well as the original risk measure \({\mathcal {R}}\). Since \({\mathcal {V}}\) and \({\mathcal {R}}\) are fixed in the analysis below, for notational convenience, we henceforth use \({\bar{\mathcal{Q}}}\) and \({\bar{\mathcal{R}}}\) for \({\mathcal {Q}}_{\mathcal {V}}\) and \({\mathcal {R}}_{\mathcal {V}}\), respectively. We will also call \({\bar{\mathcal{R}}}\) the risk measure on \({\mathcal {V}}\) to specify its dependence on \({\mathcal {V}}\) and \({\mathcal {R}}\).

We next show that the uncertainty set used in robust optimization for constructing a coherent risk measure on \({\mathcal {V}}\) is the (weak) closure of the “expected image” of the risk envelope. We need to introduce some notation. For any risk envelope \({\mathcal {Q}}\), we denote

$$\begin{aligned} {\mathcal {U}}_{\mathcal {Q}}:=\mathrm{cl }\,\left\{ \begin{pmatrix}{\mathbb {E}}(X_1Q)\\ \vdots \\ {\mathbb {E}}(X_nQ) \end{pmatrix}:~Q\in {\mathcal {Q}}\right\} . \end{aligned}$$
(5.3)

In particular, we denote

$$\begin{aligned} {\mathcal {U}}_{\mathcal {P}}:=\mathrm{cl }\,\left\{ \begin{pmatrix}{\mathbb {E}}(X_1Q)\\ \vdots \\ {\mathbb {E}}(X_nQ) \end{pmatrix}:~Q\in {\mathcal {P}}\right\} . \end{aligned}$$

Then \({\mathcal {U}}_{\mathcal {Q}}\) is a nonempty and convex subset of \({\mathcal {U}}_{\mathcal {P}}\). Given a nonempty, convex and closed uncertainty set \({\mathcal {U}}\subseteq {\mathcal {U}}_{\mathcal {P}}\), let

$$\begin{aligned} {\mathcal {Q}}_{\mathcal {U}}:=\mathrm{cl }\,\left\{ Q\in {\mathcal {P}}:~\begin{pmatrix}{\mathbb {E}}(X_1Q)\\ \vdots \\ {\mathbb {E}}(X_nQ) \end{pmatrix}\in {\mathcal {U}}\right\} . \end{aligned}$$
(5.4)

Then \({\mathcal {Q}}_{\mathcal {U}}\) is a nonempty, closed and convex subset of \({\mathcal {P}}\). The following lemma is basic.

Lemma 5.1

The following relations hold:

  (1) \({\mathcal {Q}}_{{\mathcal {U}}_{\mathcal {P}}}={\mathcal {P}}\);

  (2) \({\mathcal {U}}_{{\mathcal {Q}}_{\mathcal {U}}}={\mathcal {U}}\);

  (3) \({\mathcal {Q}}\subseteq {\mathcal {Q}}_{{\mathcal {U}}_{\mathcal {Q}}}\);

  (4) If \({\mathcal {Q}}_1\subseteq {\mathcal {Q}}_2\), then \({\mathcal {U}}_{{\mathcal {Q}}_1}\subseteq {\mathcal {U}}_{{\mathcal {Q}}_2}\);

  (5) \({\mathcal {U}}_1\subseteq {\mathcal {U}}_2\) if and only if \({\mathcal {Q}}_{{\mathcal {U}}_1}\subseteq {\mathcal {Q}}_{{\mathcal {U}}_2}\).

Proof

  (1) Trivial.

  (2) On one hand, we have

    $$\begin{aligned} {\mathcal {U}}_{{\mathcal {Q}}_{\mathcal {U}}}=\mathrm{cl }\,\left\{ [{\mathbb {E}}(X_1Q),\ldots , {\mathbb {E}}(X_nQ)]':~Q\in {\mathcal {Q}}_{\mathcal {U}}\right\} \subseteq {\mathcal {U}}, \end{aligned}$$

    where the apostrophe stands for the transpose. On the other hand, for any \((z_1,\ldots , z_n)'\in {\mathcal {U}}\subseteq {\mathcal {U}}_{\mathcal {P}}\), there exists \(Q\in {\mathcal {P}}\) such that \(z_i={\mathbb {E}}(X_iQ)\) for any \(1\le i\le n\). Since \([{\mathbb {E}}(X_1Q),\ldots , {\mathbb {E}}(X_nQ)]'\in {\mathcal {U}}\), by definition we have \(Q\in {\mathcal {Q}}_{\mathcal {U}}\). Therefore,

    $$\begin{aligned} (z_1,\ldots , z_n)'=[{\mathbb {E}}(X_1Q),\ldots , {\mathbb {E}}(X_nQ)]'\in {\mathcal {U}}_{{\mathcal {Q}}_{\mathcal {U}}}. \end{aligned}$$

    Hence \({\mathcal {U}}\subseteq {\mathcal {U}}_{{\mathcal {Q}}_{\mathcal {U}}}\), and then \({\mathcal {U}}_{{\mathcal {Q}}_{\mathcal {U}}}={\mathcal {U}}\).

  (3) For any \(Q\in {\mathcal {Q}}\), we have \([{\mathbb {E}}(X_1Q),\ldots , {\mathbb {E}}(X_nQ)]' \in {\mathcal {U}}_{\mathcal {Q}}\). Then by definition, \(Q\in {\mathcal {Q}}_{{\mathcal {U}}_{\mathcal {Q}}}\). Therefore, \({\mathcal {Q}}\subseteq {\mathcal {Q}}_{{\mathcal {U}}_{\mathcal {Q}}}\).

  (4) Trivial.

  (5) The “only if” part is trivial. For the “if” part, by (4) and (2), \({\mathcal {Q}}_{{\mathcal {U}}_1}\subseteq {\mathcal {Q}}_{{\mathcal {U}}_2}\) implies \({\mathcal {U}}_{{\mathcal {Q}}_{{\mathcal {U}}_1}}\subseteq {\mathcal {U}}_{{\mathcal {Q}}_{{\mathcal {U}}_2}}\), that is, \({\mathcal {U}}_1\subseteq {\mathcal {U}}_2\).

\(\square \)

Remark

The reverse inclusion in (3) may fail. For example, if \({\mathcal {Q}}\) is the singleton \(\{1\}\), then \({\mathcal {U}}_{\mathcal {Q}}=\{[{\mathbb {E}}(X_1),\ldots , {\mathbb {E}}(X_n)]'\}\). Here \({\mathcal {Q}}_{{\mathcal {U}}_{\mathcal {Q}}}\) contains all \(Q\in {\mathcal {P}}\) such that \([{\mathbb {E}}(X_1Q),\ldots , {\mathbb {E}}(X_nQ) ]'=[{\mathbb {E}}(X_1),\ldots , {\mathbb {E}}(X_n) ]'\), and such a Q need not be the constant 1.
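A concrete instance of this remark can be built from the data of Example 4.2. The sketch below is illustrative: it takes \(n=1\) with the single generator \(X_1=(-1,0,1)\) on a uniform three-point space (an assumed setting) and exhibits a density different from 1 with the same expected image.

```python
from fractions import Fraction

P0 = [Fraction(1, 3)] * 3
X1 = [Fraction(-1), Fraction(0), Fraction(1)]            # single generator, n = 1
ONE = [Fraction(1)] * 3                                  # the constant density 1
Q0 = [Fraction(3, 4), Fraction(3, 2), Fraction(3, 4)]    # density from Example 4.2

def expected_image(q, generators=(X1,)):
    """The vector (E(X_1 Q), ..., E(X_n Q)) under P0."""
    return tuple(sum(x * qi * p for x, qi, p in zip(g, q, P0)) for g in generators)

def is_density(q):
    """Membership in P: nonnegative with unit expectation."""
    return min(q) >= 0 and sum(qi * p for qi, p in zip(q, P0)) == 1

if __name__ == "__main__":
    print(is_density(Q0), expected_image(Q0), expected_image(ONE))
```

Here \(Q_0\ne 1\) lies in \({\mathcal {P}}\) and has the same expected image as the constant density, so it belongs to \({\mathcal {Q}}_{{\mathcal {U}}_{\mathcal {Q}}}\) but not to \({\mathcal {Q}}=\{1\}\).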

We can use uncertainty sets to define coherent risk measures. For an uncertainty set \({\mathcal {U}}\), the mapping

$$\begin{aligned} \sum \limits _{i=1}^na_iX_i\longmapsto \sup \limits _{(z_1,\ldots ,z_n)^\prime \in {\mathcal {U}}}\left( \sum \limits _{i=1}^na_iz_i\right) \end{aligned}$$

defines a risk measure on the subspace \({\mathcal {V}}\), which is called the risk measure on \({\mathcal {V}}\) with uncertainty set \({\mathcal {U}}\).

The next two propositions describe some relationships between risk envelopes and uncertainty sets. A common criticism of robust optimization is the arbitrariness of the uncertainty set and its lack of theoretical foundation. Our result here may shed some light on the rationale of the uncertainty set and help build a proper theoretical foundation for it. Theorem 5.1 below serves the same purpose.

Proposition 5.1

\({\bar{\mathcal{R}}}\) is a coherent risk measure on \({\mathcal {V}}\) with risk envelope \({\bar{\mathcal{Q}}}\) if and only if it is a coherent risk measure on \({\mathcal {V}}\) with uncertainty set \({\mathcal {U}}_{{\bar{\mathcal{Q}}}}\).

Proof

By direct calculation, we can get

$$\begin{aligned} \sup \limits _{Q\in {\bar{\mathcal{Q}}}}{\mathbb {E}}\left[ \left( \sum \limits _{i=1}^na_iX_i\right) Q\right]&=\sup \limits _{Q\in {\bar{\mathcal{Q}}}}\left( \sum \limits _{i=1}^na_i{\mathbb {E}}(X_iQ)\right) \\&=\sup \limits _{(z_1,\ldots ,z_n)^\prime \, \in {\mathcal {U}}_{{\bar{\mathcal{Q}}}}}\left( \sum \limits _{i=1}^na_iz_i\right) \end{aligned}$$

for any \( \sum \nolimits _{i=1}^na_iX_i\in {\mathcal {V}}\).\(\square \)
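The identity in this proof can be checked on a finite example in which the envelope is the convex hull of finitely many densities, so that both suprema are maxima over vertices. Everything in the sketch below (the three-point space, the two generators, the three vertex densities, and the function names) is a hypothetical choice made for illustration.

```python
from fractions import Fraction
from itertools import product

P0 = [Fraction(1, 3)] * 3
generators = [                       # X_1, X_2 spanning the subspace V
    [Fraction(-1), Fraction(0), Fraction(1)],
    [Fraction(2), Fraction(0), Fraction(1)],
]
vertices = [                         # hypothetical densities; envelope = their convex hull
    [Fraction(1)] * 3,
    [Fraction(3, 4), Fraction(3, 2), Fraction(3, 4)],
    [Fraction(3, 2), Fraction(1), Fraction(1, 2)],
]

def expect(x, q):
    """E(XQ) under the base measure P0."""
    return sum(xi * qi * pi for xi, qi, pi in zip(x, q, P0))

def risk_via_envelope(a):
    """sup over the envelope of E[(sum_i a_i X_i) Q]."""
    combo = [sum(ai * g[w] for ai, g in zip(a, generators)) for w in range(len(P0))]
    return max(expect(combo, q) for q in vertices)

def risk_via_uncertainty_set(a):
    """sup over the expected image of the envelope of sum_i a_i z_i."""
    images = [[expect(g, q) for g in generators] for q in vertices]
    return max(sum(ai * zi for ai, zi in zip(a, z)) for z in images)

if __name__ == "__main__":
    for a in product(range(-2, 3), repeat=2):
        print(a, risk_via_envelope(a), risk_via_uncertainty_set(a))
```

The two computations agree for every coefficient vector tried, reflecting the linearity of \(Q\mapsto {\mathbb {E}}(X_iQ)\) that underlies Proposition 5.1.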

Proposition 5.2

For any uncertainty set \({\mathcal {U}}\subseteq {\mathcal {U}}_{\mathcal {P}}, {\bar{\mathcal{R}}}\) is a coherent risk measure on \({\mathcal {V}}\) with uncertainty set \({\mathcal {U}}\) if and only if it is a coherent risk measure on \({\mathcal {V}}\) with risk envelope \({\mathcal {Q}}_{\mathcal {U}}\).

Proof

By Proposition 5.1, \({\bar{\mathcal{R}}}\) is a coherent risk measure on \({\mathcal {V}}\) with risk envelope \({\mathcal {Q}}_{\mathcal {U}}\) if and only if it is a coherent risk measure on \({\mathcal {V}}\) with uncertainty set \({\mathcal {U}}_{{\mathcal {Q}}_{\mathcal {U}}}\). Then by Lemma 5.1 (2), \({\mathcal {U}}_{{\mathcal {Q}}_{\mathcal {U}}}={\mathcal {U}}\), so the proposition is proved.\(\square \)

The following is a main theorem in Natarajan et al. (2009), where the authors discussed how to construct coherent risk measures in general. However, since uncertainty sets are constructed independently of probability distributions, it is not completely clear how the uncertainty sets are related to the random variables appearing in the problem. We now present a new proof of the theorem, which discloses the connection between the uncertainty set and the risk measure on \({\mathcal {V}}.\)

Theorem 5.1

\({\bar{\mathcal{R}}}\) is a coherent risk measure on \({\mathcal {V}}\) if and only if there exists a nonempty and convex subset \({\mathcal {U}}\subseteq {\mathcal {U}}_{\mathcal {P}}\) such that

$$\begin{aligned} {\bar{\mathcal{R}}}\left( \sum \limits _{i=1}^na_iX_i\right) =\sup \limits _{{z}=(z_1,\ldots ,z_n)'\in {\mathcal {U}}}\left( \sum \limits _{i=1}^na_iz_i\right) \end{aligned}$$
(5.5)

for any \(a_1,\ldots ,a_n\in {\mathbb {R}}\). We call \({\mathcal {U}}\) the “uncertainty set” of the risk measure \({\bar{\mathcal{R}}}\) on \({\mathcal {V}}.\) It can be written explicitly as

$$\begin{aligned} {\mathcal {U}}= & {} \left\{ {z}\in {\mathcal {U}}_{\mathcal {P}}:~\max \limits _{a_1,\ldots ,a_n\in {\mathbb {R}}} \left\{ \sum \limits _{i=1}^na_iz_i:~{\mathcal {R}}\left( \sum \limits _{i=1}^na_iX_i\right) \le 1\right\} \le 1\right\} , \end{aligned}$$

where \({\mathcal {R}}\) is the original risk measure that induces \({\bar{\mathcal{R}}}.\)

Proof

Formula (5.5) follows from Propositions 5.1 and 5.2. Next, by Proposition 5.1, \({\bar{\mathcal{R}}}\) is a coherent risk measure on \({\mathcal {V}}\) with risk envelope

$$\begin{aligned} {\bar{\mathcal{Q}}}=\left\{ Q\in {\mathcal {P}}:~{\mathbb {E}}\left[ \left( \sum \limits _{i=1}^na_iX_i\right) Q\right] \le {\mathcal {R}}\left( \sum \limits _{i=1}^na_iX_i\right) ~\text{ for } \text{ all }~a_1,\ldots ,a_n\in {\mathbb {R}}\right\} \end{aligned}$$

if and only if it is a coherent risk measure on \({\mathcal {V}}\) with uncertainty set

$$\begin{aligned} {\mathcal {U}}_{{\bar{\mathcal{Q}}}}= & {} \left\{ \begin{pmatrix}{\mathbb {E}}(X_1Q)\\ \vdots \\ {\mathbb {E}}(X_nQ) \end{pmatrix}:~Q\in {\mathcal {P}},~{\mathbb {E}}\left[ \left( \sum \limits _{i=1}^na_iX_i\right) Q\right] \right. \\\le & {} \left. {\mathcal {R}}\left( \sum \limits _{i=1}^na_iX_i\right) ~\text{ for } \text{ all }~a_1,\ldots ,a_n\in {\mathbb {R}}\right\} . \end{aligned}$$

Therefore, to complete the proof of Theorem 5.1, we only need to prove

$$\begin{aligned}&\left\{ \begin{pmatrix}{\mathbb {E}}(X_1Q)\\ \vdots \\ {\mathbb {E}}(X_nQ) \end{pmatrix}:~Q\in {\mathcal {P}},~{\mathbb {E}}\left[ \left( \sum \limits _{i=1}^na_iX_i\right) Q\right] \le {\mathcal {R}}\left( \sum \limits _{i=1}^na_iX_i\right) ~\text{ for } \text{ all }~ a_1,\ldots ,a_n\in {\mathbb {R}}\right\} \nonumber \\&\quad =\left\{ \begin{pmatrix}z_1\\ \vdots \\ z_n \end{pmatrix}\in {\mathcal {U}}_{\mathcal {P}}:~\max \limits _{a_1,\ldots ,a_n\in {\mathbb {R}}}\left\{ \sum \limits _{i=1}^na_iz_i:~{\mathcal {R}}\left( \sum \limits _{i=1}^na_iX_i\right) \le 1\right\} \le 1\right\} . \end{aligned}$$
(5.6)

In fact, since \(Q\in {\mathcal {P}}\Longleftrightarrow [{\mathbb {E}}(X_1Q),\ldots ,{\mathbb {E}}(X_nQ)]'\in {\mathcal {U}}_{\mathcal {P}}\), and for any \(Q\in {\mathcal {P}}\),

$$\begin{aligned}&{\mathbb {E}}\left[ \left( \sum \limits _{i=1}^na_iX_i\right) Q\right] \le {\mathcal {R}}\left( \sum \limits _{i=1}^na_iX_i\right) ~\text{ for } \text{ all }~ a_1,\ldots ,a_n\in {\mathbb {R}}\\&\quad \Longleftrightarrow \sum \limits _{i=1}^na_i{\mathbb {E}}(X_iQ)\le {\mathcal {R}}\left( \sum \limits _{i=1}^na_iX_i\right) ~\text{ for } \text{ all }~a_1,\ldots ,a_n\in {\mathbb {R}}\\&\quad \Longleftrightarrow \max \left\{ \sum \limits _{i=1}^na_i{\mathbb {E}}(X_iQ):~a_1,\ldots ,a_n\in {\mathbb {R}},~{\mathcal {R}}\left( \sum \limits _{i=1}^na_iX_i\right) \le 1\right\} \le 1, \end{aligned}$$

it follows that (5.6) holds. The proof of Theorem 5.1 is completed.\(\square \)

6 Concluding remarks

Artzner et al. (1997, 1999) introduced the fundamental notion of coherent risk measures. Rockafellar et al. (2006) considered a dual representation theorem in \({{\mathscr {L}}}^2\) space. In this paper, we considered risk measures in \({{\mathscr {L}}}^2\) under set operations and discussed the dual representations and aversity for various popular risk measures. We also studied the relationship between the risk measure defined by risk envelopes and that defined by uncertainty sets in the case of risk measures on subspaces. These results may provide certain tools for stochastic optimization with risk measures as well as improve our understanding of robust optimization.