1 Introduction

In his book [114, p. 378], J. Jahn states that the set relation approach ‘opens a new and wide field of research’ and the so-called set relations ‘turn out to be very promising in set optimization.’ We share this opinion, and this note aims at a (partial) fulfillment of this promise.

What is “set optimization”? The answer given in this note concerns minimization problems with set-valued objective functions and is based on a twofold solution concept: look for a set of arguments each of which has a function value that is minimal in some sense, and whose values together generate the infimum of the function. Thus, infimum attainment and minimality are the two, no longer equivalent, requirements for a solution of a set optimization problem. It turns out that the set relation infimum is a useful concept, in contrast to the vector order infimum, which may not exist and, even if it does, is of no practical use.

What is a motivation to consider set optimization problems? The heart of the problem is the question of how to deal with a non-total order relation, i.e. one admitting non-comparable alternatives. The “complete lattice approach” based on set relations regains meaningful and applicable notions of infimum and supremum even if the original preordered vector space does not have the least upper bound property, is not even a lattice, or has a positivity cone that is not pointed, not normal, or has an empty interior. The theory presented in this survey suggests that even vector-valued optimization problems should be treated as set-valued ones. This point of view has already been emphasized in [151] for problems with a solid ordering cone.

According to an old theorem by Szpilrajn [215], which is well-known in mathematical economics, but less so in vector and set optimization, a partial order (preorder) is the intersection of all linear orders (total preorders) including it. In the same spirit, dual descriptions of objects related to a preorder such as convex functions can be given in terms of half spaces generating total orders, hence dual objects are naturally halfspace- or hyperplane-valued. Since the simplest dual object is a linear functional, set-valued replacements for them should be halfspace- or hyperplane-valued as well and “as linear as possible.” This basic idea leads to a new type of duality which is not only strong enough to provide set-valued analogs of the Fenchel-Moreau and the Lagrange duality theorem, but also implies known duality results in vector optimization which are usually stated under much stronger assumptions.

It turns out that convex analysis, in particular duality, does not rely on linearity of functionals or image spaces, but rather on “conlinearity.” The structure of a conlinear space as introduced in [77] is precisely the part of the structure of a linear space which remains invariant under passing to the power set (with Minkowski addition) or order completion (add a least and greatest element to an ordered vector space). Thus, \(\mathrm {I\negthinspace R}\cup \left\{ -\infty , +\infty \right\} \) is the prototype of a conlinear space. A particular feature is the resulting bifurcation: The extended reals can be provided with inf-addition or sup-addition (see [197, p. 15], but already introduced by Moreau [175]), which produces two different conlinear spaces. A preorder on a linear space can be extended in two different ways to the power set of this space, thus producing two different ordered conlinear spaces. Below, it should become clear why this happens and how to deal with this ambiguity.

Finally, “set optimization” is motivated by problems in Economics and Mathematical Finance. The classic books [209] (first edition from 1953) and [125] contain many examples of set-valued functions which naturally occur in economic models, among them ‘production technologies’ [209, p. 13] which are basically monotone lattice-valued functions in the sense of this survey. In finance, market models with transaction costs provide plenty of examples for the theory discussed in this survey; for example the superhedging theorems in [121, 205] can be identified as particular cases of the set-valued Fenchel-Moreau theorem stated below, and the theory of set-valued risk measures, initiated in [120], was particularly motivating for the development of the set-valued duality in Sect. 5 below.

This survey aims at developing ideas and structures and providing a framework for principal results. Full proofs are only given for new or unpublished results, or if they illustrate an important idea particularly nicely. Each part concludes with a section of bibliographical remarks which puts the presented material into perspective with respect to variational analysis and vector optimization theory.

Several results are new, mostly complementing those obtained by the authors in several recent publications. For example, Proposition 2.17 discusses the totalness of set relations; Sect. 4.2 relies on an improved general scheme for scalarizations and includes several new observations, such as the supermodularity of the scalarizations given in Corollary 4.15; Theorem 5.2 characterizes set-valued dual variables parallel to results for continuous linear functions and convex processes; Sect. 5.5 contains a new general framework for directionally translative functions; and Proposition 6.7 provides a new sufficient optimality condition including a complementary slackness condition for set optimization.

2 Set Relations and Lattices of Sets

2.1 Extending Preorders from Linear Spaces to their Power Sets

Let Z be a non-trivial real linear space and \(C \subseteq Z\) a convex cone with \(0 \in C \ne Z\). In particular, \(C = \left\{ 0 \right\} \) is allowed. Here, C is said to be a cone if \(z \in C\), \(t>0\) imply \(tz \in C\). By

$$ z_1 \le _C z_2 \quad \Leftrightarrow \quad z_2 - z_1 \in C $$

a reflexive and transitive relation \(\le _C\) is defined on Z; such a relation is usually called a preorder. It is compatible with the linear structure of Z in the usual sense, i.e. \(z_1, z_2, z \in Z\), \(t \ge 0\) and \(z_1 \le _C z_2\) imply \(z_1 + z \le _C z_2 + z\) as well as \(tz_1 \le _C tz_2\). Obviously,

$$ z_1 \le _C z_2 \; \Leftrightarrow \; z_2 - z_1 \in C \; \Leftrightarrow \; z_2 \in z_1 + C \; \Leftrightarrow \; z_1 \in z_2 - C. $$

The last two relationships can be used to extend \(\le _C\) from Z to \(\mathcal P\left( Z \right) \), the set of all subsets of Z including the empty set \(\emptyset \). Take \(A, B \in \mathcal P\left( Z \right) \) and define

$$\begin{aligned} A \preccurlyeq _C B \quad \Leftrightarrow \quad B \subseteq A + C, \\ A \curlyeqprec _C B \quad \Leftrightarrow \quad A \subseteq B - C. \end{aligned}$$

Here and in the following, we use \(+\) to denote the usual Minkowski (element-wise) addition for sets with the conventions \(A + \emptyset = \emptyset + A = \emptyset \) for all \(A \in \mathcal P\left( Z \right) \) and \(A - C = A + \left( -C \right) \), \(-C = \left\{ -c \mid c \in C \right\} \). The following facts are immediate.

Proposition 2.1

  (a)

    Both \(\preccurlyeq _C\) and \(\curlyeqprec _C\) are reflexive and transitive relations on \(\mathcal P\left( Z \right) \). Moreover, they are not antisymmetric in general, and they do not coincide.

  (b)

    \(A \preccurlyeq _C B\) \(\Leftrightarrow \) \(-B \curlyeqprec _C -A\) \(\Leftrightarrow \) \(B \preccurlyeq _{-C} A\).

  (c)

    \(A \preccurlyeq _C B\) \(\Leftrightarrow \) \(A + C \supseteq B + C\); \(A \curlyeqprec _C B\) \(\Leftrightarrow \) \(A - C \subseteq B -C\).

Proof

Left as exercise.   \(\square \)
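For finite sets, both relations reduce to finitely many componentwise comparisons, which makes the claims of Proposition 2.1 easy to test computationally. The following minimal Python sketch (our illustration, not taken from the literature; all function names are ad hoc) implements \(\preccurlyeq _C\) and \(\curlyeqprec _C\) for \(C = \mathrm {I\negthinspace R}^2_+\) and exhibits a pair of sets on which the two relations differ, in accordance with part (a):

```python
# A minimal sketch (illustration only): the set relations for finite
# subsets of R^2 with C = R^2_+. For finite A, b lies in A + C iff some
# a in A satisfies a <= b componentwise.

def leq_C(a, b):
    """a <=_C b for C = R^2_+, i.e. b - a in C."""
    return all(ai <= bi for ai, bi in zip(a, b))

def preccurlyeq_C(A, B):
    """A ≼_C B  <=>  B ⊆ A + C: every b in B dominates some a in A."""
    return all(any(leq_C(a, b) for a in A) for b in B)

def curlyeqprec_C(A, B):
    """A ≺_C B  <=>  A ⊆ B - C: every a in A is dominated by some b in B."""
    return all(any(leq_C(a, b) for b in B) for a in A)

A = [(0, 0), (2, 2)]
B = [(1, 1), (3, 0)]
print(preccurlyeq_C(A, B))   # True:  both points of B dominate (0, 0)
print(curlyeqprec_C(A, B))   # False: (2, 2) is dominated by no point of B
```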

Property (c) above suggests defining the set

$$ \mathcal P\left( Z, C \right) = \left\{ A \in \mathcal P\left( Z \right) \mid A = A + C \right\} $$

and to observe that it can be identified with the set of equivalence classes with respect to the equivalence relation on \(\mathcal P\left( Z \right) \) defined by

$$\begin{aligned} A \sim _C B \quad \Leftrightarrow \quad \left( A \preccurlyeq _C B \; \wedge \; B \preccurlyeq _C A \right) \quad \Leftrightarrow \quad A + C = B + C, \end{aligned}$$
(2.1)

i.e. \(\sim _C\) is the symmetric part of \(\preccurlyeq _C\). Likewise,

$$ \mathcal P\left( Z, -C \right) = \left\{ A \in \mathcal P\left( Z \right) \mid A = A - C \right\} $$

can be identified with the set of equivalence classes with respect to the equivalence relation

$$ A \sim _{\left( -C \right) } B \quad \Leftrightarrow \quad \left( A \curlyeqprec _C B \; \wedge \; B \curlyeqprec _C A \right) \quad \Leftrightarrow \quad A - C = B - C. $$

Below, we will mainly discuss the relation \(\preccurlyeq _C\) which is the appropriate one when it comes to minimization; however, the theory becomes completely symmetric since every statement for the \(\preccurlyeq _C\) relation (and minimization) has a counterpart for \(\curlyeqprec _C\) (and maximization).

The following proposition relies on (c) of Proposition 2.1. We recall that the infimum of a subset \(V \subseteq W\) of a partially ordered set \(\left( W, \preceq \right) \) is an element \(\bar{w} \in W\) (unique if it exists) satisfying \(\bar{w} \preceq v\) for all \(v \in V\) and \(w \preceq \bar{w}\) whenever \(w \preceq v\) for all \(v \in V\). This means that the infimum is the greatest lower bound of V in W. The infimum of V is denoted by \(\inf V\). Likewise, the supremum \(\sup V\) is defined as the least upper bound of V. A partially ordered set \(\left( W, \preceq \right) \) is called a lattice if \(\inf \left\{ w_1, w_2 \right\} \) and \(\sup \left\{ w_1, w_2 \right\} \) exist in W for any two elements \(w_1, w_2 \in W\). A lattice \(\left( W, \preceq \right) \) is called (order) complete if each subset has an infimum and a supremum in W.

Proposition 2.2

The pair \(\left( \mathcal P\left( Z, C \right) , \supseteq \right) \) is a complete lattice. Moreover, for a subset \(\mathcal {A}\subseteq \mathcal P\left( Z, C \right) \), the infimum and the supremum of \(\mathcal {A}\) are given by

$$\begin{aligned} \inf \mathcal {A}= \bigcup \limits _{A \in \mathcal {A}} A,\qquad \sup \mathcal {A}= \bigcap \limits _{A \in \mathcal {A}} A \end{aligned}$$
(2.2)

where it is understood that \(\inf \mathcal {A}= \emptyset \) and \(\sup \mathcal {A}= Z\) whenever \(\mathcal {A}= \emptyset \). The greatest (top) element of \(\mathcal P\left( Z, C \right) \) with respect to \(\supseteq \) is \(\emptyset \), the least (bottom) element is Z.

In particular, \(\supseteq \) is a partial order on \(\mathcal P\left( Z, C \right) \). This property does not depend on the cone C: It can be a trivial cone, i.e. \(C = \left\{ 0 \right\} \), or a half space, i.e. \(C = \left\{ z \in Z \mid \xi \left( z \right) \ge 0 \right\} \) where \(\xi \) is a (non-zero) linear function on Z (an element of the algebraic dual of Z), i.e. \(\le _C\) is not antisymmetric in the latter case in general. Of course, a parallel result holds for \(\left( \mathcal P\left( Z, -C \right) , \subseteq \right) \).

Note that the convention \(\inf \emptyset = \emptyset \) and \(\sup \emptyset = Z\) is in accordance with the following monotonicity property: If \(\mathcal A_1 \subseteq \mathcal A_2\) then \(\inf \mathcal A_1 \subseteq \inf \mathcal A_2\) and \(\sup \mathcal A_1 \supseteq \sup \mathcal A_2\) in \(\mathcal {P}\left( Z, C \right) \).

Proof

To show the first formula in (2.2) one has to prove two facts: First,

$$ \forall A' \in \mathcal A :\bigcup \limits _{A \in \mathcal {A}} A \supseteq A', $$

and secondly, for \(B \in \mathcal P\left( Z, C \right) \)

$$ \left( \forall A \in \mathcal A :B \supseteq A \right) \; \Rightarrow \; B \supseteq \bigcup \limits _{A \in \mathcal {A}} A. $$

Both claims are obvious. The second formula of (2.2) also follows from the definition of the supremum with respect to \(\supseteq \). The lattice property is a consequence of (2.2).    \(\square \)
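For \(Z = \mathrm {I\negthinspace R}\), \(C = \mathrm {I\negthinspace R}_+\), the closed nonempty elements of \(\mathcal P\left( Z, C \right) \) are the intervals \([r, +\infty )\), and (2.2) becomes a computation with lower endpoints; this anticipates Example 2.15 below. A minimal Python sketch of this special case (our illustration), with \(\emptyset \) encoded by \(+\infty \) and Z by \(-\infty \):

```python
# A sketch of (2.2) for Z = R, C = R_+: a closed set A = A + R_+ is either
# [r, inf), the empty set (encoded by +inf), or R (encoded by -inf).
# Then inf = union = min of endpoints, sup = intersection = max.

import math

def inf_family(rs):
    """Infimum (union); the empty family gives the empty set, i.e. +inf."""
    return min(rs, default=math.inf)

def sup_family(rs):
    """Supremum (intersection); the empty family gives R, i.e. -inf."""
    return max(rs, default=-math.inf)

family = [1.0, -2.0, 3.5]                # [1,inf), [-2,inf), [3.5,inf)
print(inf_family(family))                # -2.0: the union [-2, inf)
print(sup_family(family))                # 3.5:  the intersection [3.5, inf)
print(inf_family([]), sup_family([]))    # inf -inf: the conventions above
```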

Remark 2.3

One could also use other representatives of the equivalence classes defined by (2.1)

$$ \left\{ A' \in \mathcal P\left( Z \right) \mid A \preccurlyeq _C A' \; \wedge \; A' \preccurlyeq _C A \right\} . $$

As a rule, one has to impose additional assumptions, for example a non-empty interior of the cone C. An example is the infimal set approach of Nieuwenhuis [179] which has been extended in [149, 155] (compare also [151, 216]). This approach is summarized in Example 2.12 below.

2.2 Comments on Set Relations

In the (vector and set) optimization community, D. Kuroiwa is credited with the introduction of the two “set relations” \(\preccurlyeq _C\) and \(\curlyeqprec _C\) above and, indeed, he was the first to use them for defining optimality notions for optimization problems with a set-valued objective function, compare [128, 138] and several reports [129–134] published in RIMS Kokyuroku 1996–1999. However, it should be noted that these “set relations” were in use much earlier in different contexts.

In the 1993 paper [22], Brink describes an algebraically motivated approach to power structures where the two relations \(\curlyeqprec \) and \(\preccurlyeq \) (analog extensions of a preorder on a general set, not necessarily a vector space) are denoted by \(R_0^+\) and \(R_1^+\), respectively. These and similar structures mostly defined on finite or countable sets are widely used in theoretical computer science as becomes clear from the reference list of [22]. For example, in [188, Definition 1] the following definition is used: A set A ‘can be reduced to’ another set B if for all \(a \in A\) there is \(b \in B\) such that \(a \le b\) for some partial order \(\le \), thus \(A \curlyeqprec B\), which is parallel to the definition of \(\curlyeqprec _C\) above.

Nishianidze [180] also used the relations \(\preccurlyeq \) and \(\curlyeqprec \) in the context of fixed point theory. This reference was brought to our attention by J. Jahn. Constructions mainly motivated by applications in economics and social choice theory can be found e.g. in [18, 177]. Compare also the references therein, especially [122]. In [56], set relations (on finite sets) and corresponding best choice problems are motivated by committee selection, governing coalition formation, product line formation and similar problems.

As pointed out in [77, 83], the earliest reference known to us is the paper [228] by Young. It already contains the definitions of \(\preccurlyeq _C\) and \(\curlyeqprec _C\) implicitly and presents applications to the analysis of upper and lower limits of sequences of real numbers.

Another field of application for set relations is interval mathematics. In the survey [178, Sect. 2.2] from 1975, an order relation is investigated which is defined on the set of order intervals of a partially ordered set M. This relation coincides with \(\preccurlyeq _C \cap \curlyeqprec _C\) if \(M = Z\) and \(\le _C\) is a partial order on Z. It has also been discussed, for example, in [115, 117]. Jahn [118] applies it in fixed point theory for interval functions, and Schmidt [206] relates it to general ordered convex cones. Later, the “set-less-or-equal relation” became a standard part of the FORTRAN 95 specification for interval arithmetic, see [30]. We point out that the “set less” relation in [115] actually is the “set-less-or-equal” relation in [30, Sect. 12.8] and also coincides with \(\preccurlyeq _C \cap \curlyeqprec _C\).

In [138], one can find a systematic investigation of six extensions of a preorder \(\le _C\) on a topological linear space generated by a convex ordering cone C with nonempty interior to its power set; the relations \(\preccurlyeq _C\) and \(\curlyeqprec _C\) are proven to be the only such relations which are reflexive and transitive; definitions for the convexity of set-valued functions with respect to several order relations are given. Subsequent papers of the three authors of [138] contain applications to optimization problems with a set-valued objective, see for example [73, 135, 136]. For this topic, compare also the book [114], especially Chap. 5. The recent paper [117] contains even more set relations.

After 2005, many authors adopted the concepts related to \(\preccurlyeq _C\) and \(\curlyeqprec _C\), see, among an increasing number of others, [1, 85, 94–96, 140, 163, 164]. Quite recently, robustness for vector optimization problems has been linked to the two (and other) set relations, see [107–109].

In [77, 149, 155], it has been realized that the set relations unfold their full potential in the framework of complete lattices; Proposition 2.2 above and infimal set versions of it such as [151, Proposition 1.52] may serve as a blueprint for this idea. Because of Proposition 2.2 (which can be found, even in a more general set-up, in [77, Theorem 6] and, for a different image space, in [149, Proposition 1.2.3]) we call this approach the “complete lattice approach” to set optimization.

2.3 Inf-Residuated Conlinear Spaces of Sets

We start with a definition which provides the algebraic framework for the image space analysis. It is taken from [77] where references and more material about structural properties of conlinear spaces can be found.

Definition 2.4

A nonempty set W together with two algebraic operations \(+ :W \times W \rightarrow W\) and \(\cdot :\mathrm {I\negthinspace R}_+ \times W \rightarrow W\) is called a conlinear space provided that

  (C1)

    \(\left( W, + \right) \) is a commutative semigroup with neutral element \(\theta \),

  (C2)

    (i) \(\forall w_1, w_2 \in W\), \(\forall r \in \mathrm {I\negthinspace R}_+\): \(r \cdot \left( w_1 + w_2 \right) = r \cdot w_1 + r \cdot w_2\), (ii) \(\forall w \in W\), \(\forall r, s \in \mathrm {I\negthinspace R}_+\): \(s \cdot \left( r \cdot w \right) = \left( sr \right) \cdot w\), (iii) \(\forall w \in W\): \(1 \cdot w = w\), (iv) \(0 \cdot \theta = \theta \).

An element \(w \in W\) is called a convex element of the conlinear space W if

$$ \forall s, t \ge 0 :\left( s+t \right) \cdot w = s \cdot w + t \cdot w. $$

A conlinear space \(\left( W, +, \cdot \right) \) together with a partial order \(\preceq \) on W (a reflexive, antisymmetric, transitive relation) is called ordered conlinear space provided that (v) \(w, w_1, w_2 \in W\), \(w_1 \preceq w_2\) imply \(w_1 + w \preceq w_2 + w\), (vi) \(w_1, w_2 \in W\), \(w_1 \preceq w_2\), \(r \in \mathrm {I\negthinspace R}_+\) imply \(r \cdot w_1 \preceq r\cdot w_2\).

A non-empty subset \(V \subseteq W\) of the conlinear space \(\left( W, +, \cdot \right) \) is called a conlinear subspace of W if (vii) \(v_1, v_2 \in V\) implies \(v_1 + v_2 \in V\) and (viii) \(v \in V\) and \(t \ge 0\) imply \(t \cdot v \in V\).

It can easily be checked that a conlinear subspace of a conlinear space is again a conlinear space. Note that an important feature of the above definition is the missing second distributivity law, which is used to define convex elements.

Example 2.5

(a) The Minkowski addition \(+\) has already been extended to \(\mathcal P\left( Z, C \right) \) and \(\mathcal P\left( Z, -C \right) \) (see the paragraph before Proposition 2.1). The multiplication with non-negative numbers is extended to \(\mathcal P\left( Z, C \right) \) by defining \(t \cdot A = \left\{ ta \mid a \in A \right\} \) for \(A \in \mathcal P\left( Z, C \right) \backslash \left\{ \emptyset \right\} \), \(t > 0\) and

$$ 0 \cdot A = C, \quad t \cdot \emptyset = \emptyset $$

for all \(A \in \mathcal P\left( Z, C \right) \) and \(t > 0\). In particular, \(0 \cdot \emptyset = C\) by definition, and we will drop the \(\cdot \) in most cases. Since the same can be done for \(\mathcal P\left( Z, -C \right) \), the triples \(\left( \mathcal P\left( Z, C \right) , +, \cdot \right) \) and \(\left( \mathcal P\left( Z, -C \right) , +, \cdot \right) \) are conlinear spaces in the sense of Definition 2.4.

Note that the following does not hold in general:

$$ \forall s, t \ge 0, \; \forall A \in \mathcal P\left( Z, C \right) :\left( s + t \right) \cdot A = s\cdot A + t \cdot A. $$

Counterexamples are provided by non-convex sets \(A \subseteq Z\). Therefore, \(\left( \mathcal P\left( Z, C \right) , +, \cdot , \supseteq \right) \) is neither an ordered semilinear space [202] nor a semi-module over the semi-ring \(\mathrm {I\negthinspace R}_+\) [230].

(b) The extended real numbers \(\overline{\mathrm {I\negthinspace R}}= \mathrm {I\negthinspace R}\cup \left\{ -\infty , +\infty \right\} \) provide two more examples. Supplied with the inf-addition \({+^{\negmedspace \centerdot \,}}\) and the sup-addition \({+_{\negmedspace \centerdot \,}}\), respectively, one obtains two (different!) conlinear spaces. For terminology and references, see [89, 197].
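Concerning the failure of the second distributivity law in part (a), a two-point set already provides a counterexample; the following tiny sketch (ours, taking the admissible cone \(C = \left\{ 0 \right\} \)) verifies it:

```python
# Failure of the second distributivity law for the non-convex set
# A = {0, 1} in R with C = {0}: (1+1)·A differs from 1·A + 1·A.

A = {0, 1}
two_A = {2 * a for a in A}                  # (1+1)·A = {0, 2}
A_plus_A = {a + b for a in A for b in A}    # 1·A + 1·A = {0, 1, 2}
print(two_A == A_plus_A)                    # False
```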

The next result connects the conlinear structure on \(\left( \mathcal P\left( Z, C \right) , +, \cdot \right) \) with the order structure of \(\left( \mathcal P\left( Z, C \right) , \supseteq \right) \).

Proposition 2.6

  (a)

    \(A, B, D, E \in \mathcal P\left( Z, C \right) \), \(A \supseteq B\), \(D \supseteq E\) \(\Rightarrow \) \(A + D \supseteq B + E\),

  (b)

    \(A, B \in \mathcal P\left( Z, C \right) \), \(A \supseteq B\), \(s \ge 0\) \(\Rightarrow \) \(sA \supseteq sB\),

  (c)

    \(\mathcal {A}\subseteq \mathcal P\left( Z, C \right) \), \(B \in \mathcal P\left( Z, C \right) \) \(\Rightarrow \)

$$\begin{aligned} \inf \left( \mathcal A + B \right)&= \left( \inf \mathcal {A} \right) + B \end{aligned}$$
(2.3)
$$\begin{aligned} \sup \left( \mathcal A + B \right)&\supseteq \left( \sup \mathcal {A} \right) + B. \end{aligned}$$
(2.4)

where \(\mathcal A + B = \left\{ A + B \mid A \in \mathcal {A} \right\} \).

Proof

Exercise.   \(\square \)

The following example shows that (2.4) does not hold with equality in general.

Example 2.7

Let \(Z = \mathrm {I\negthinspace R}\), \(C = \mathrm {I\negthinspace R}_+\), \(\mathcal {A}= \left\{ [t, \infty ) \mid t \ge 0 \right\} \), \(B = \mathrm {I\negthinspace R}\). Then,

$$ \forall t \ge 0 :[t, \infty ) + \mathrm {I\negthinspace R}= \mathrm {I\negthinspace R}\quad \text {and} \quad \sup \mathcal {A}= \bigcap _{t \ge 0}[t, \infty ) = \emptyset , $$

so \(\sup \left( \mathcal A + B \right) = \mathrm {I\negthinspace R}\ne \emptyset = \left( \sup \mathcal {A} \right) + B\).

Items (a) and (b) of the previous proposition show that \(\left( \mathcal P\left( Z, C \right) , +, \cdot , \supseteq \right) \) (as well as \(\left( \mathcal P\left( Z, -C \right) , +, \cdot , \subseteq \right) \)) carries the structure of an ordered conlinear space. Moreover, Proposition 2.2 shows that they are also complete lattices. The innocent-looking Eq. (2.3) has far-reaching consequences. In lattice-theoretical terms, it means that \(\left( \mathcal P\left( Z, C \right) ,+, \cdot , \supseteq \right) \) is inf-residuated (but not sup-residuated in general). The opposite is true for \(\left( \mathcal P\left( Z, -C \right) , +, \cdot , \subseteq \right) \): it is sup-, but not inf-residuated. The following proposition is an explanation for the word “inf-residuated.”

Proposition 2.8

The relationship (2.3) given in (c) of Proposition 2.6 is equivalent to: For each \(A, B \in \mathcal P\left( Z, C \right) \), the set

$$ \left\{ D \in \mathcal P\left( Z, C \right) \mid A \supseteq B + D \right\} $$

has a least element (with respect to \(\supseteq \)).

Proof

Assume (2.3) and fix \(A, B \in \mathcal P\left( Z, C \right) \). Define

$$ \hat{D} = \inf \left\{ D \in \mathcal P\left( Z, C \right) \mid A \supseteq B + D \right\} . $$

By (2.3) and (2.2),

$$\begin{aligned} B + \hat{D}&= B + \inf \left\{ D \in \mathcal P\left( Z, C \right) \mid A \supseteq B + D \right\} \\&= \inf \left\{ B + D \in \mathcal P\left( Z, C \right) \mid A \supseteq B + D \right\} \\&= \bigcup \left\{ B + D \in \mathcal P\left( Z, C \right) \mid A \supseteq B + D \right\} \subseteq A \end{aligned}$$

which means \(\hat{D} \in \left\{ D \in \mathcal P\left( Z, C \right) \mid A \supseteq B + D \right\} \), so \(\hat{D}\) is the desired least element. The converse direction is left as an exercise.   \(\square \)

The inf-residuation of \(A, B \in \mathcal P\left( Z, C \right) \) is denoted

$$\begin{aligned} A {-^{\negmedspace \centerdot \,}}B = \inf \left\{ D \in \mathcal P\left( Z, C \right) \mid A \supseteq B + D \right\} . \end{aligned}$$
(2.5)

This operation will serve as a substitute for the difference in linear spaces. Indeed, for \(Z = \mathrm {I\negthinspace R}\), \(C = \mathrm {I\negthinspace R}_+\), \(A = a + \mathrm {I\negthinspace R}_+\), \(B = b + \mathrm {I\negthinspace R}_+\), \(a, b \in \mathrm {I\negthinspace R}\) one obtains

$$ A {-^{\negmedspace \centerdot \,}}B = \left\{ r \in \mathrm {I\negthinspace R}\mid b + r + \mathrm {I\negthinspace R}_+ \subseteq a + \mathrm {I\negthinspace R}_+ \right\} = \left\{ r \in \mathrm {I\negthinspace R}\mid b - a + r + \mathrm {I\negthinspace R}_+ \subseteq \mathrm {I\negthinspace R}_+ \right\} = a-b + \mathrm {I\negthinspace R}_+. $$

Compare Example 2.15 below for more about the extended reals. The following proposition states two elementary properties of \({-^{\negmedspace \centerdot \,}}\). A full calculus exists for \({-^{\negmedspace \centerdot \,}}\) which to a large extent can be derived from known results in lattice/residuation theory. One example is Proposition 4.17 below which can be understood as a residuation version of “the negative of the infimum is the supremum of the negative.”

Proposition 2.9

Let \(A, B \in \mathcal P\left( Z, C \right) \). Then

$$\begin{aligned} A {-^{\negmedspace \centerdot \,}}B = \left\{ z \in Z \mid B + z \subseteq A \right\} , \end{aligned}$$
(2.6)

and if A is closed (convex) then \(A {-^{\negmedspace \centerdot \,}}B\) is closed (convex) where Z is required to be a topological linear space if closedness is involved.

Proof

The proof of the equation is immediate from (2.2) and (2.5), and the second claim follows from the first and

$$ \left\{ z \in Z \mid B + z \subseteq A \right\} = \bigcap _{b \in B} \left\{ z \in Z \mid z \in A + \left\{ -b \right\} \right\} . $$

Of course, \(A + \left\{ -b \right\} \) is closed (convex) if A is closed (convex), and these properties are stable under intersection.   \(\square \)
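Formula (2.6) turns membership in \(A {-^{\negmedspace \centerdot \,}}B\) into finitely many elementary tests whenever A and B are finite unions of translated cones. A minimal Python sketch (our illustration; the generator encoding is an assumption made for the example) with \(C = \mathrm {I\negthinspace R}^2_+\):

```python
# A sketch of the inf-residuation via (2.6), A -. B = {z | B + z ⊆ A},
# for upper sets of the form "union of (g + C) over finitely many
# generators g" with C = R^2_+.

def in_upper_set(gens, z):
    """z lies in the union of g + C iff some generator g satisfies g <= z."""
    return any(all(gi <= zi for gi, zi in zip(g, z)) for g in gens)

def in_residuation(A_gens, B_gens, z):
    """z ∈ A -. B iff b + z ∈ A for every generator b of B (since A = A + C)."""
    return all(in_upper_set(A_gens, tuple(bi + zi for bi, zi in zip(b, z)))
               for b in B_gens)

A = [(0, 0)]              # A = R^2_+
B = [(1, 0), (0, 1)]      # B = ((1,0) + C) ∪ ((0,1) + C)
print(in_residuation(A, B, (0, 0)))    # True:  B ⊆ A
print(in_residuation(A, B, (-1, 0)))   # False: (0,1) + (-1,0) ∉ A
```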

Remark 2.10

We would like to draw the reader’s attention to the fact that the structure of an ordered conlinear space which also is an inf-residuated complete lattice is “rich enough” to serve as an image space in convex analysis. In fact, this structure is shared by \(\overline{\mathrm {I\negthinspace R}}\) with inf-addition and \((\mathcal P\left( Z, C \right) , +, \cdot , \supseteq )\) (as well as others, see below). Completely symmetric counterparts are provided by \(\overline{\mathrm {I\negthinspace R}}\) with sup-addition and \((\mathcal P\left( Z, -C \right) , +, \cdot , \subseteq )\) which are sup-residuated complete lattices. The transition from one to the other, provided by multiplication with \(-1\), is a ‘duality’ in the sense of [210]. Although elements of this structure were well-known and have been used before (see the comments section below), the development of this framework for a “set-valued” convex/variational analysis and optimization theory is one contribution of the authors of this survey. A nice feature is that this structure makes it possible to establish many results for vector/set-valued functions in the same way as for extended real-valued functions: not surprising after one realizes the similarities between the extended reals and conlinear spaces of sets.

We conclude this section by providing more examples of inf-residuated complete lattices of sets which will be used later on.

Example 2.11

Let Z be a topological linear space and \(C \subseteq Z\) a convex cone with \(0 \in C\). The set

$$ \mathcal {F}\left( Z, C \right) = \left\{ A \subseteq Z \mid A = \mathrm{{cl \,}}\left( A + C \right) \right\} $$

clearly is a subset of \(\mathcal P\left( Z, C \right) \), but not closed under (Minkowski) addition. Therefore, we define an associative and commutative binary operation \(\oplus :\mathcal {F}\left( Z, C \right) \times \mathcal {F}\left( Z, C \right) \rightarrow \mathcal {F}\left( Z, C \right) \) by

$$\begin{aligned} A \oplus B = \mathrm{{cl \,}}\left( A+B \right) \end{aligned}$$
(2.7)

for \(A, B \in \mathcal {F}\left( Z, C \right) \). The element-wise multiplication with non-negative real numbers is extended by

$$ 0 \odot A = \mathrm{{cl \,}}C, \quad t \odot \emptyset = \emptyset $$

for all \(A \in \mathcal {F}\left( Z, C \right) \) and \(t > 0\). In particular, \(0 \odot \emptyset = \mathrm{{cl \,}}C\) by definition, and we will drop \(\odot \) in most cases. The triple \(\left( \mathcal {F}\left( Z, C \right) , \oplus , \odot \right) \) is a conlinear space with neutral element \(\mathrm{{cl \,}}C\).

On \(\mathcal {F}\left( Z, C \right) \), \(\supseteq \) is a partial order which is compatible with the algebraic operations just introduced. Thus, \(\left( \mathcal {F}\left( Z, C \right) , \oplus , \odot , \supseteq \right) \) is a partially ordered, conlinear space.

Moreover, the pair \(\left( \mathcal {F}\left( Z, C \right) , \supseteq \right) \) is a complete lattice, and if \(\mathcal A \subseteq \mathcal {F}\left( Z, C \right) \) then

$$ \inf _{\left( \mathcal {F}\left( Z, C \right) , \supseteq \right) } \mathcal A = \mathrm{{cl \,}}\bigcup \limits _{A \in \mathcal A} A, \quad \sup _{\left( \mathcal {F}\left( Z, C \right) , \supseteq \right) } \mathcal A = \bigcap \limits _{A \in \mathcal A} A $$

where again \(\inf _{\left( \mathcal {F}\left( Z, C \right) , \supseteq \right) } \mathcal A = \emptyset \) and \(\sup _{\left( \mathcal {F}\left( Z, C \right) , \supseteq \right) } \mathcal A = Z\) whenever \(\mathcal A = \emptyset \).

The inf-residuation in \(\mathcal {F}\left( Z, C \right) \) is defined as follows: For \(A, B \in \mathcal {F}\left( Z, C \right) \), set

$$\begin{aligned} A {-^{\negmedspace \centerdot \,}}B = \inf _{\left( \mathcal {F}\left( Z, C \right) , \supseteq \right) } \left\{ D \in \mathcal {F}\left( Z, C \right) \mid B + D \subseteq A \right\} = \left\{ z\in Z \mid B + z\subseteq A \right\} . \end{aligned}$$
(2.8)

Note that, for \(A \in \mathcal {F}\left( Z, C \right) \), the set on the right hand side of (2.8) is indeed closed by Proposition 2.9.

Example 2.12

Let Z be a topological linear space and \(C \subsetneq Z\) be a closed convex cone with \(\emptyset \ne \mathrm{{int\,}}C \ne Z\). The set of weakly minimal points of a subset \(A \subseteq Z\) (with respect to C) is defined by

$$ \mathrm{{wMin}}A = \left\{ y \in A \mid ( \left\{ y \right\} -\mathrm{{int\,}}C) \cap A = \emptyset \right\} . $$

For \(A \in \mathcal {F}(Z, C)\), it can be shown ([151, Proposition 1.40 and Corollary 1.44]) that \(\mathrm{{wMin}}A \ne \emptyset \) if and only if \(A \not \in \left\{ Z, \emptyset \right\} \). This justifies the following construction. For \(A \in \mathcal {F}(Z, C)\), set

$$ \mathrm{{Inf}}A = \left\{ \begin{array}{c@{\quad }c@{\quad }c} \mathrm{{wMin}}A &{} :&{} A \not \in \left\{ Z, \emptyset \right\} \\ \left\{ -\infty \right\} &{} :&{} A = Z \\ \left\{ +\infty \right\} &{} :&{} A = \emptyset . \end{array} \right. $$

Then \(\mathrm{{Inf}}A \subseteq Z \cup \left\{ \pm \infty \right\} \), and \(\mathrm{{Inf}}A\) is never empty. The set A can be reconstructed from \(\mathrm{{Inf}}~A\) by

$$ A = \left\{ \begin{array}{c@{\quad }c@{\quad }c} \mathrm{{Inf}}~A \oplus C &{} :&{} \mathrm{{Inf}}~A \not \in \left\{ \left\{ -\infty \right\} , \left\{ +\infty \right\} \right\} \\ Z &{} :&{} \mathrm{{Inf}}~A = \left\{ -\infty \right\} \\ \emptyset &{} :&{} \mathrm{{Inf}}~A = \left\{ +\infty \right\} . \end{array} \right. $$

Defining the set \(\mathcal I(Z, C) = \left\{ \mathrm{{Inf}}~A~\mid ~A~\in ~\mathcal {F}(Z,~C) \right\} \) (the collection of ‘self-infimal’ subsets of \(Z \cup \left\{ \pm \infty \right\} \), [151, Definition 1.50]) and appropriate algebraic operations as well as an order one obtains \(\mathcal {F}(Z, C) = \left\{ B \oplus C \mid B \in \mathcal I(Z, C) \right\} \). Moreover, \(\mathcal I(Z, C)\) and \(\mathcal {F}(Z, C)\) are algebraically and order isomorphic ordered conlinear spaces (compare Proposition 1.52 of [151]). The reader is referred to [151, 155, 179, 216] for more details concerning infimal (and supremal) sets.
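For a finite set \(A \subseteq \mathrm {I\negthinspace R}^2\) with \(C = \mathrm {I\negthinspace R}^2_+\) (so that \(\mathrm{{int\,}}C\) is the open positive orthant), the defining condition of \(\mathrm{{wMin}}\) reduces to strict componentwise domination. A minimal Python sketch (our illustration; Example 2.12 itself concerns sets in \(\mathcal {F}(Z, C)\), the finite case merely illustrates the definition of weakly minimal points):

```python
# wMin for a finite A ⊆ R^2 with C = R^2_+: y is weakly minimal iff no
# point of A is strictly smaller than y in every component.

def strictly_below(a, y):
    """a ∈ {y} - int C, i.e. a_i < y_i for all i."""
    return all(ai < yi for ai, yi in zip(a, y))

def wmin(A):
    return [y for y in A if not any(strictly_below(a, y) for a in A)]

A = [(0, 3), (1, 1), (3, 0), (2, 2)]
print(wmin(A))   # [(0, 3), (1, 1), (3, 0)]; (2, 2) is dominated by (1, 1)
```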

Example 2.13

Let Z, C be as in Example 2.11. The set

$$ \mathcal {G}\left( Z, C \right) = \left\{ A \subseteq Z \mid A = \mathrm{{cl \,}}\mathrm{{co \,}}\left( A + C \right) \right\} \subseteq \mathcal {F}\left( Z, C \right) $$

together with the operations \(\oplus \) and \(\odot \) introduced in Example 2.11 is a conlinear subspace of \(\left( \mathcal {F}\left( Z, C \right) , \oplus , \odot \right) \). In fact, \(\mathcal {G}\left( Z, C \right) \) precisely contains the convex elements of \(\mathcal {F}\left( Z, C \right) \). Moreover, the pair \(\left( \mathcal {G}\left( Z, C \right) , \supseteq \right) \) is a complete lattice, and for \(\emptyset \ne \mathcal A \subseteq \mathcal {G}\left( Z, C \right) \)

$$ \inf _{\left( \mathcal {G}\left( Z, C \right) , \supseteq \right) } \mathcal A = \mathrm{{cl \,}}\mathrm{{co \,}}\bigcup \limits _{A \in \mathcal A} A $$

whereas the formula for the supremum is the same as in \(\mathcal {F}\left( Z,C \right) \). The inf-residuation in \(\left( \mathcal {G}\left( Z, C \right) , \oplus , \odot , \supseteq \right) \) is the same as in \(\left( \mathcal {F}\left( Z, C \right) , \oplus , \odot , \supseteq \right) \) which is a consequence of (2.8) and Proposition 2.9.

Example 2.14

If in Example 2.12 and under the assumptions therein, \(\mathcal {F}(Z,C)\) is replaced by \(\mathcal {G}(Z,C)\), we obtain a conlinear space \(\mathcal {I}_{co}(Z, C)\), which is a subspace of \(\mathcal {I}(Z, C)\) that is algebraically and order isomorphic to \(\mathcal {G}(Z,C)\). For further details, the reader is referred to [151, Sect. 1.6].

Note that parallel results are obtained for \(\mathcal {F}\left( Z,-C \right) \), \(\mathcal {G}\left( Z,-C \right) \) with the same algebraic operations as in \(\mathcal {F}\left( Z, C \right) \), \(\mathcal {G}\left( Z, C \right) \) and the order relation \(\subseteq \).

Example 2.15

Let us consider \(Z = \mathrm {I\negthinspace R}\), \(C = \mathrm {I\negthinspace R}_+\). Then

$$ \mathcal {F}\left( \mathrm {I\negthinspace R}, \mathrm {I\negthinspace R}_+ \right) = \mathcal {G}\left( \mathrm {I\negthinspace R}, \mathrm {I\negthinspace R}_+ \right) = \left\{ [r, +\infty ) \mid r \in \mathrm {I\negthinspace R} \right\} \cup \left\{ \mathrm {I\negthinspace R} \right\} \cup \left\{ \emptyset \right\} . $$

Moreover, by

$$ a = \inf _{\left( \mathrm {I\negthinspace R}, \le \right) }A \quad \text {for} \quad A \in \mathcal {G}\left( \mathrm {I\negthinspace R}, \mathrm {I\negthinspace R}_+ \right) $$

and

$$ A = \left\{ \begin{array}{c@{\quad }c@{\quad }c} \mathrm {I\negthinspace R}&{} : &{} a = -\infty \\ \left[ a, +\infty \right) &{} : &{} a \in \mathrm {I\negthinspace R}\\ \emptyset &{} : &{} a = +\infty \end{array} \right. $$

we obtain an algebraic and order isomorphism between \(\left( \mathcal {G}\left( \mathrm {I\negthinspace R}, \mathrm {I\negthinspace R}_+ \right) , \oplus , \odot , \supseteq \right) \) and \(\left( \overline{\mathrm {I\negthinspace R}}, {+^{\negmedspace \centerdot \,}}, \cdot , \le \right) \) where \({+^{\negmedspace \centerdot \,}}\) is the inf-addition (see [197]) on \(\overline{\mathrm {I\negthinspace R}}= \mathrm {I\negthinspace R}\cup \left\{ \pm \infty \right\} \) with \(\left( +\infty \right) {+^{\negmedspace \centerdot \,}}r = r {+^{\negmedspace \centerdot \,}}\left( +\infty \right) = +\infty \) for all \(r \in \overline{\mathrm {I\negthinspace R}}\), and \(\cdot \) is an extension of the multiplication of non-negative real numbers by elements of \(\overline{\mathrm {I\negthinspace R}}\). Note that \(0 \cdot (-\infty ) = 0 \cdot (+\infty ) = 0\) since otherwise \(\left( \overline{\mathrm {I\negthinspace R}}, {+^{\negmedspace \centerdot \,}}, \cdot \right) \) is not a conlinear space. Of course, \(A \supseteq B\) if, and only if, \(\inf _{\left( \mathrm {I\negthinspace R}, \le \right) }A \le \inf _{\left( \mathrm {I\negthinspace R}, \le \right) }B\). Thus, \(\left( \overline{\mathrm {I\negthinspace R}}, {+^{\negmedspace \centerdot \,}}, \cdot , \le \right) \) is an ordered conlinear space which is a complete lattice with respect to \(\le \). Moreover,

$$ \forall M \subseteq \overline{\mathrm {I\negthinspace R}}, \; \forall r\in \overline{\mathrm {I\negthinspace R}}:\inf _{(\overline{\mathrm {I\negthinspace R}}, \le )}\left( M {+^{\negmedspace \centerdot \,}} \left\{ r \right\} \right) = r {+^{\negmedspace \centerdot \,}}\inf _{(\overline{\mathrm {I\negthinspace R}}, \le )}M, $$

which admits the introduction of the inf-residuation in \(\overline{\mathrm {I\negthinspace R}}\) [89]. Here, \(M {+^{\negmedspace \centerdot \,}} \left\{ r \right\} = \left\{ m {+^{\negmedspace \centerdot \,}}r \mid m \in M \right\} \). We have

$$ r {-^{\negmedspace \centerdot \,}}s = \inf \left\{ t \in \mathrm {I\negthinspace R}\mid r \le s {+^{\negmedspace \centerdot \,}}t \right\} $$

for all \(r, s\in \overline{\mathrm {I\negthinspace R}}\) with some strange properties. For example, expressions like \(\left( +\infty \right) {-^{\negmedspace \centerdot \,}}\left( -\infty \right) \) are well-defined and even useful as shown in [89, 90].
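The case analysis behind \({+^{\negmedspace \centerdot \,}}\) and \({-^{\negmedspace \centerdot \,}}\) on \(\overline{\mathrm {I\negthinspace R}}\) can be made explicit in a few lines. The following Python sketch (ours, derived from the definitions above rather than taken from [89, 90]) implements both operations and confirms that \(\left( +\infty \right) {-^{\negmedspace \centerdot \,}}\left( -\infty \right) = +\infty \):

```python
import math

INF, NINF = math.inf, -math.inf

def add_inf(r, s):
    """Inf-addition: (+inf) dominates, so (+inf) +. (-inf) = +inf."""
    if r == INF or s == INF:
        return INF
    if r == NINF or s == NINF:
        return NINF
    return r + s

def residuate(r, s):
    """r -. s = inf{t | r <= s +. t}, by case analysis."""
    if s == INF or r == NINF:   # every t is feasible
        return NINF
    if s == NINF or r == INF:   # only t = +inf is feasible
        return INF
    return r - s                # both finite

print(add_inf(INF, NINF))      # inf
print(residuate(INF, NINF))    # inf : (+inf) -. (-inf) = +inf
print(residuate(5.0, 2.0))     # 3.0
```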

Remark 2.16

As a simple but instructive exercise the reader should try to establish the isomorphism between \(\left( \mathcal {G}\left( \mathrm {I\negthinspace R}, -\mathrm {I\negthinspace R}_+ \right) , \oplus , \cdot , \subseteq \right) \) and \(\left( \overline{\mathrm {I\negthinspace R}}, {+_{\negmedspace \centerdot \,}}, \cdot , \le \right) \) where \({+_{\negmedspace \centerdot \,}}\) denotes the “sup-addition” [197]. This shows that the reason why ‘there’s no single symmetric way of handling \(\left( +\infty \right) +\left( -\infty \right) \)’ is basically the same as the one for having two “canonical” extensions of a vector pre-order to the power set of the vector space.

The image space \(\mathcal {G}\left( Z, C \right) \) will feature prominently in duality theories for set-valued functions/optimization problems. The last example shows that it shares almost all properties with the extended reals provided with the inf-addition. The notable exception is that the order \(\supseteq \) is not total in general. The following result clarifies this question.

Proposition 2.17

Let Z be a locally convex space and \(C \subseteq Z\) a convex cone with \(0 \in C\) and \(Z \ne \mathrm{{cl \,}}C\). Then \(\supseteq \) is total on \(\mathcal {F}(Z, C)\) if, and only if, \(\mathrm{{cl \,}}C\) coincides with a half-space \(H^+(z^*) := \left\{ z \in Z \mid z^*(z) \ge 0 \right\} \) for some \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \).

Proof

The “if” part is immediate. For the “only if” part, assume \(\supseteq \) is total on \(\mathcal {F}(Z, C)\) and \(\mathrm{{cl \,}}C\) is not a half space. Then, there are \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) and \(\bar{z} \in Z\) such that

$$ \mathrm{{cl \,}}C \subseteq H^+(z^*) \quad \text {and} \quad \bar{z} \in H^+(z^*)\backslash \mathrm{{cl \,}}C. $$

Indeed, the existence of \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) and the first inclusion follow from a separation argument, the second from the assumption. We claim that

$$ \forall s \in \mathrm {I\negthinspace R}:H(z^*, s) := \left\{ z \in Z \mid z^*(z) \ge s \right\} \not \subseteq \mathrm{{cl \,}}C. $$

In order to verify the claim, assume \(H(z^*, s) \subseteq \mathrm{{cl \,}}C\) for some \(s \in \mathrm {I\negthinspace R}\). Then, there is \(z_s \in Z\) such that \(H(z^*, s) = z_s + H^+(z^*)\) and \(z^*(z_s) = s\). This implies

$$ \forall t > 0 :z_s + t\bar{z} \in H(z^*, s) $$

since \(z^*(z_s + t\bar{z}) = s + tz^*(\bar{z}) \ge s\). By assumption, \( z_s + t\bar{z} \in \mathrm{{cl \,}}C\) for all \(t>0\), hence

$$ \forall t > 0 :\frac{1}{t}z_s + \bar{z} \in \mathrm{{cl \,}}C $$

which in turn gives \(\bar{z} \in \mathrm{{cl \,}}C\), a contradiction. This proves the claim, i.e. \(\mathrm{{cl \,}}C \not \supseteq H(z^*, s)\) for all \(s \in \mathrm {I\negthinspace R}\). Since \(\supseteq \) is total,

$$ \mathrm{{cl \,}}C \subseteq \bigcap _{s \in \mathrm {I\negthinspace R}}H(z^*, s) = \emptyset , $$

a contradiction.   \(\square \)

2.4 Comments on Conlinear Spaces and Residuation

The term ‘conlinear space’ was coined in [77] because of the analogies to ‘convex cones’ on the one hand and to linear spaces on the other hand.

A related concept is the one of quasilinear spaces or almost linear spaces as defined, for example, in [66, 168], respectively. A quasilinear (or almost linear) space satisfies all the axioms of a linear space except the second distributivity law, which is required only for non-negative reals. Hence \(\left( \mathcal P(Z), +, \cdot \right) \), \(\left( \mathcal P(Z, C), +, \cdot \right) \) and \(\left( \mathcal {F}(Z, C), \oplus , \odot \right) \) are not quasilinear spaces in general. With respect to interval mathematics, Schmidt [206, Sect. 4] observed ‘\([\ldots ]\) it seems to be convenient to generalize one step further and to restrict the multiplication by scalars to positive scalars alone.’ Keeping all the other requirements for a quasilinear space we obtain an abstract convex cone in the sense of B. Fuchssteiner, W. Lusky [60]. In [124], the same concept is the basic one, sometimes a convex cone even without a zero element. Abstract convex cones also coincide with semilinear spaces as probably introduced by A. M. Rubinov [202] (he refers to a 1975 joint paper with S.S. Kutateladze) and recalled, for example, in [65, Definition 2.6].

We remark that convex cones in the sense of [60] and semilinear spaces (with a “zero”) in the sense of [65, Definition 2.6] are also semi-modules over \(\mathrm {I\negthinspace R}_+\) (and even semivector spaces) as defined by U. Zimmermann in [230, Sect. 5]. Finally, another close relative of a conlinear space is a semivector space in the sense of [192]. Prakash and Sertel (see also [193]) defined this structure in the early Seventies and observed that the collections of non-empty and non-empty convex sets of a vector space form a semivector space. In a semivector space, the existence of a neutral element with respect to the addition is not required. Therefore, it might be considered as the “weakest” algebraic concept discussed here.

Dedekind [43] already introduced the residuation concept and used it in order to construct the real numbers as ‘Dedekind sections’ of rational numbers. Among others, R. P. Dilworth and M. Ward turned it into a standard tool in abstract lattice theory, see [46, 223–225] and many followers.

Sometimes, the right hand side of (2.8) is called the geometric difference [189], star-difference (for example in [220]), or Minkowski difference [76] of the two sets A and B, and H. Hadwiger should probably be credited with its introduction. The relationship to residuation theory (see, for instance, [13, 61]) has been established in [89]; at least, we are not aware of an earlier reference. In the context of abstract duality, residuation has been used, for example, in [64, 167] and also in idempotent analysis (see [62, Sect. 3.3], for example). Note that in [64], the set \(\overline{\mathrm {I\negthinspace R}}\) is supplied both with \({+^{\negmedspace \centerdot \,}}\) and \({+_{\negmedspace \centerdot \,}}\) at the same time, and this idea is extended to ‘the canonical enlargement’ of a ‘boundedly complete lattice ordered group’ (see [64, Sect. 3]) which is different from the point of view of this survey. On the other hand, \(\mathcal {F}(Z, C)\) and \(\mathcal {G}(Z, C)\) are special cases of \((A, \preceq )\) in [64, Sect. 2], but therein the conlinear structure is not used.

3 Minimality Notions

3.1 Basic Concepts

This section is concerned with the question of how to define “infimum attainment” and “minimality.” We shall focus on the relation \(\supseteq \) on \(\mathcal {F}\left( Z, C \right) \) and \(\mathcal G\left( Z, C \right) \) noting that there are parallel concepts and results for \(\subseteq \) on \(\mathcal F\left( Z, -C \right) \), \(\mathcal G\left( Z, -C \right) \). In the remainder of the paper, the infimum or supremum is always taken in the corresponding space of elements, for example

$$ \inf ~\mathcal A = \mathrm{{cl \,}}\bigcup _{A \in \mathcal A} A $$

whenever \(\mathcal A \subseteq \mathcal {F}(Z, C)\) whereas for \(\mathcal A \subseteq \mathcal {G}(Z, C)\)

$$ \inf ~\mathcal A = \mathrm{{cl \,}}\mathrm{{co \,}}\bigcup _{A \in \mathcal A} A. $$

With the constructions from the previous section in view, we have at least two possibilities for a minimality notion. Given a set \(\mathcal A \subseteq \mathcal F\left( Z, C \right) \) or \(\mathcal A \subseteq \mathcal G\left( Z, C \right) \), look for

(I):

\(\inf ~\mathcal A = \mathrm{{cl \,}}\bigcup _{A \in \mathcal A}A\) or \(\inf ~\mathcal A = \mathrm{{cl \,}}\mathrm{{co \,}}\bigcup _{A \in \mathcal A}A\), respectively, or

(II):

minimal elements with respect to \(\supseteq \), i.e. \(B \in \mathcal A\) satisfying

$$ A \in \mathcal A, \; A \supseteq B \quad \Rightarrow \quad A = B. $$

Note that the second possibility corresponds to the so-called set criterion which became popular due to the work of D. Kuroiwa and collaborators: One looks for minimal elements of \(\mathcal A \subseteq \mathcal P\left( Z \right) \) with respect to \(\preccurlyeq _C\). Since \(\preccurlyeq _C\) is not antisymmetric in general, one has to look for \(B \in \mathcal A\) satisfying

(IIa):
$$ A \in \mathcal A, \; A \preccurlyeq _C B \quad \Rightarrow \quad B \preccurlyeq _C A. $$

Historically, neither of the two possibilities above came first. Rather, the following problem has been studied since the 1980s by Corley [32], Dinh The Luc [156] and others, and it is still popular.

(III):

Find minimal elements of \(\bigcup _{A \in \mathcal A}A\) with respect to \(\le _C\), i.e. find \(b \in \bigcup _{A \in \mathcal A}A\) satisfying

$$ a \in \bigcup _{A \in \mathcal A}A, \; a \le _C b \quad \Rightarrow \quad b \le _C a. $$

In this way, a set optimization problem is reduced to a vector optimization problem, and sometimes this problem is referred to as the vector criterion in set optimization. Note that, in some way, it involves the infimum of \(\mathcal A\) in \(\mathcal P(Z, C)\).

Example 3.1

Let \(Z = \mathrm {I\negthinspace R}^2\), \(C = \left\{ 0 \right\} \times \mathrm {I\negthinspace R}_+\) and \(\mathcal A = \left\{ A_t \mid t \in \left[ 0,1 \right] \right\} \) where

$$ A_t = \left[ -1+t, t \right] \times \mathrm {I\negthinspace R}_+. $$

Then each \(A_t\) is minimal with respect to \(\supseteq \) and

$$ \inf ~\mathcal A = A_0 \cup A_1 = \left[ -1, 1 \right] \times \mathrm {I\negthinspace R}_+. $$

This shows that not all minimal elements are required to generate the infimum which prepares the following definition.
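Before turning to that definition, note that the example can be checked mechanically since each \(A_t\) is determined by the interval \(\left[ -1+t, t \right] \); a small Python sketch (ours) over a finite sample of the parameter set:

```python
# Example 3.1: A_t = [-1+t, t] x R_+ is encoded by its interval; then
# A_s ⊇ A_t iff [-1+s, s] ⊇ [-1+t, t], which forces s = t (equal lengths).

def contains(s, t):
    return -1 + s <= -1 + t and t <= s

ts = [i / 10 for i in range(11)]   # a finite sample of [0, 1]

# every A_t is minimal: no A_s strictly contains it
print(all(not (contains(s, t) and s != t) for s in ts for t in ts))  # True

# the union over the family is [-1, 1] x R_+, generated already by A_0, A_1
print(min(-1 + t for t in ts), max(ts))   # -1.0 1.0
```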

Definition 3.2

Let \(\mathcal A \subseteq \mathcal {F}(Z, C)\) or \(\mathcal A \subseteq \mathcal {G}(Z, C)\).

  (a)

    A set \(\mathcal B \subseteq \mathcal A\) is said to generate the infimum of \(\mathcal A\) if

    $$ \inf ~\mathcal B = \inf ~\mathcal A. $$
  (b)

    An element \(\bar{A} \in \mathcal A\) is called minimal for \(\mathcal A\) if it satisfies

    $$ A \in \mathcal A, \; A \supseteq \bar{A} \quad \Rightarrow \quad A = \bar{A}. $$

    The set of all minimal elements of \(\mathcal A\) is denoted by \(\mathrm{{Min\,}}\mathcal A\).

Parallel definitions apply to generators of the supremum and maximal elements. Of course, \(\mathcal A\) always generates the infimum of \(\mathcal A\). On the other hand, a set of minimal elements of \(\mathcal A\) does not necessarily generate the infimum of \(\mathcal A\). In Example 3.1, every subset of \(\mathcal A\) including \(A_0\) and \(A_1\) consists of minimal elements and generates the infimum, i.e., in general, sets of minimal elements generating the infimum are not singletons. If \(\mathcal A \subseteq \mathcal P\left( \mathrm {I\negthinspace R}, \mathrm {I\negthinspace R}_+ \right) \), then a single element \(A \in \mathcal A\) generates the infimum of \(\mathcal A\) if, and only if, it is a minimal one. Definition 3.2 leads to the following “complete lattice approach.” Given a set \(\mathcal A \subseteq \mathcal {F}(Z, C)\) or \(\mathcal A \subseteq \mathcal {G}(Z, C)\) look for

(IV):

a set \(\mathcal B \subseteq \mathcal A\) such that

$$ \inf ~\mathcal B = \inf ~\mathcal A \quad \text {and} \quad \mathcal B \subseteq \mathrm{{Min\,}}\mathcal A. $$

Hence, the minimality notion of the “complete lattice approach” consists of looking for sets of minimal elements which generate the infimum. We turn these notions into a solution concept for set optimization problems. The following definition is a special case of the general one given in [102].

Definition 3.3

Let X be a non-empty set, \(f :X \rightarrow \mathcal {F}\left( Z, C \right) \) (or \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \)) a function and \(f[X] = \left\{ f(x) \mid x \in X \right\} \).

  (a)

    A set \(M \subseteq X\) is called an infimizer for f if

    $$ \inf ~f[M] = \inf ~f[X]. $$
  (b)

    An element \(\bar{x} \in X\) is called a minimizer of f if \(f(\bar{x})\) is minimal for f[X].

  (c)

    A set \(M \subseteq X\) is called a solution of the problem

    $$\begin{aligned} {\text {minimize} \quad f(x) \quad \text {subject to} \quad x \in X} \end{aligned}$$
    (P)

if M is an infimizer for f, and each \(\bar{x} \in M\) is a minimizer of f. It is called a full solution if the set f[M] includes all minimal elements of f[X].

Thus, solutions of set minimization problems in the “complete lattice” sense are infimizers consisting only of minimizers. Again, parallel definitions apply to solutions of maximization problems which will later appear in duality results. One more concept is needed for a Weierstraß type theorem.

Definition 3.4

A set \(\mathcal A \subseteq \mathcal {F}(Z, C)\) (or \(\mathcal A \subseteq \mathcal {G}(Z, C)\)) is said to satisfy the domination property if

$$ \forall A \in \mathcal A, \; \exists \bar{A} \in \mathrm{{Min\,}}\mathcal A :\bar{A} \supseteq A. $$

Proposition 3.5

Let \(f :X \rightarrow \mathcal {F}\left( Z, C \right) \) (or \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \)) be a function and f[X] satisfy the domination property. Then

$$ M = \left\{ x \in X \mid f(x) \in \mathrm{{Min\,}}f[X] \right\} $$

is a full solution of \(\left( P \right) \).

Proof

The domination property yields the first of the following inclusions while the second one follows from \(M \subseteq X\):

$$ \inf _{x \in M}f(x) \supseteq \inf _{x \in X}f(x) \supseteq \inf _{x \in M}f(x). $$

This already completes the proof since M comprises all minimizers of f[X].   \(\square \)
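For a finite problem the construction of Proposition 3.5 can be carried out directly. A minimal Python sketch (our illustration with hypothetical data): the values \(f(x) = a_x + C\), \(C = \mathrm {I\negthinspace R}^2_+\), are encoded by their generators, so that \(f(y) \supseteq f(x)\) if, and only if, \(a_y \le a_x\) componentwise:

```python
# Proposition 3.5 for a finite problem: X = {0, 1, 2, 3} and
# f(x) = a_x + C with C = R^2_+ (hypothetical data).

gens = {0: (0, 3), 1: (1, 1), 2: (2, 2), 3: (3, 0)}

def leq(a, b):
    return all(ai <= bi for ai, bi in zip(a, b))

def is_minimizer(x):
    """f(x) ∈ Min f[X]: no value of f strictly contains f(x)."""
    return not any(leq(gens[y], gens[x]) and gens[y] != gens[x] for y in gens)

M = [x for x in gens if is_minimizer(x)]
print(M)   # [0, 1, 3]; x = 2 drops out since a_1 = (1, 1) <= (2, 2) = a_2
```

Since f[X] is finite, the domination property holds, and M is a full solution of (P).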

3.2 Comments on Solution Concepts in Set Optimization

The appearance of set-valued functions in optimization theory was mainly motivated by unifying different forms of constraints, see [14] and also [184, 185]. Problem (P) in [16, p. 196] seems to be the first explicit set-valued optimization problem. J. M. Borwein defines its optimal value as the infimum with respect to the underlying vector order and assumes that the image space is conditionally order complete, i.e. every subset which is bounded from below (above) has an infimum (supremum) in the space. Clearly, a necessary condition for this is that the image space is a vector lattice. This restricts the applicability of such results considerably; besides, the vector infimum/supremum does not produce solution concepts which are useful in applications.

In [190, 191], Postolica formulates an optimization problem with a set-valued objective and uses the minimality concept (III) above.

Corley [32, 33] defined ‘the maximization of a set-valued function with respect to a cone in possibly infinite dimensions’ mainly motivated by the fact that, in duality theories for multiobjective problems as established by Tanino and Sawaragi [218], ‘dual problems took this form’ (quotes from [32, p. 489]). The same motivation can be found in Dinh The Luc’s book [156] in which vector optimization problems with a set-valued objective are investigated using the approach (III).

Both authors considered optimality in the sense of (III) above: Take the union of all objective values and then look for (weakly, properly etc.) minimal points in this union with respect to the vector order. This approach has been the leading idea ever since, among the many followers are [29, 54, 55, 57, 141, 145–147] (just to mention a few), the book [28], and even the more recent [34, 35, 74, 105, 173, 174, 203], [19, Sects. 7.1.3., 7.4.2.], [58, 199] and many more. We call this approach the vector approach to set optimization.

The picture changed when the set relations were popularized by Kuroiwa and his co-authors [130, 132, 133, 136, 138]. Still, it took several years until the idea to use (II) above as a solution concept for set-valued optimization problems became more popular, see [1, 75, 83, 85, 93–96, 229] and also Chap. 5 of Jahn’s book [114]. The basic idea is, of course, to “lift” the concept of minimal (\(=\)non-dominated) image points from elements of a vector space to elements of the power set of the vector space. Therefore, we call this approach the set relation approach to set optimization. A comparison of the vector and the set relation approach can be found in [97].

Roughly another ten years later, it was realized that the so-called set relations can be utilized in a more subtle manner which is described in the previous section: Via equivalence classes with respect to the two pre-orders and hull operations one defines (conlinear) spaces of sets which enjoy rich algebraic and order structures. The set relations somehow disappear from the final picture since they serve as a tool to construct the image spaces in which the subset or superset inclusion appears as a partial order. This approach, which we call the “complete lattice approach” to set optimization, has been developed in the two theses [77, 149] and led to the solution concept in [102, 151] which is the basis for Definitions 3.2 and 3.3 above (see already [91] for a precursor). Note that the complete lattice approach (IV) absorbs (I) and (II) as well as (IIa).

It might be interesting to note that Ekeland’s variational principle became one of the first major results in (nonlinear, nonconvex) functional analysis that was generalized to set-valued functions via the set relation approach. While [26], [27, Theorem 4.1], [84, Theorem 5.1], [106, Theorems 2.3 and 2.4] still follow the vector approach, first set relation versions were independently established in [75, 85] (with precursor [83] already from 2002). Note that the results in [85] are more general (weaker assumptions like boundedness from below, more general pre-image spaces) and more complete (both set relations are involved, minimal “set” theorems, not only Ekeland’s principle). In particular, the main result [75, Theorem 4.1] is a special case of [85, Theorem 6.1].

4 Set-Valued Functions

4.1 Basic Concepts

Let X be another linear space and \(f :X \rightarrow \mathcal P\left( Z, C \right) \) a function. The goal is to develop a convex analysis for such functions f. We start by recalling a popular definition. A function \(\hat{f} :X \rightarrow \mathcal P\left( Z \right) \) is called C-convex (see e.g. [14, Definition 1.1]) if

$$\begin{aligned} t \in \left( 0,1 \right) , \; x_1, x_2 \in X \; \Rightarrow \; \hat{f}\left( tx_1 + \left( 1-t \right) x_2 \right) + C \supseteq t\hat{f}\left( x_1 \right) + \left( 1-t \right) \hat{f}\left( x_2 \right) , \end{aligned}$$
(4.1)

and it is called C-concave ([156, p. 117]) if

$$\begin{aligned} t \in \left( 0,1 \right) , \; x_1, x_2 \in X \; \Rightarrow \; t\hat{f}\left( x_1 \right) + \left( 1-t \right) \hat{f}\left( x_2 \right) \subseteq \hat{f}\left( tx_1 + \left( 1-t \right) x_2 \right) - C. \end{aligned}$$
(4.2)

Of course, the C-convexity inequality is just

$$ \hat{f}\left( t x_1 + \left( 1-t \right) x_2 \right) \preccurlyeq _C t\hat{f}\left( x_1 \right) + \left( 1-t \right) \hat{f}\left( x_2 \right) , $$

and the C-concavity inequality

$$ t\hat{f}\left( x_1 \right) + \left( 1-t \right) \hat{f}\left( x_2 \right) \curlyeqprec _C \hat{f}\left( tx_1 + \left( 1-t \right) x_2 \right) . $$

Here is another interesting feature of the set-valued framework. If f maps into \(\mathcal P\left( Z, C \right) \), then the cone C can be dropped from (4.1) whereas (4.2) becomes meaningless for many interesting cones C (for example, for generating cones, i.e. \(C - C = Z\)). The opposite is true for \(\mathcal P\left( Z, -C \right) \)-valued functions. This gives a hint as to why convexity (and minimization) is related to \(\mathcal P\left( Z, C \right) \)-valued functions and concavity (and maximization) to \(\mathcal P\left( Z, -C \right) \)-valued ones.

The graph of a function \(\hat{f} :X \rightarrow \mathcal P\left( Z \right) \) is the set

$$ \mathrm{{graph \,}}~\hat{f} = \left\{ \left( x ,z \right) \in X \times Z \mid z \in \hat{f}\left( x \right) \right\} , $$

and the domain is the set

$$ \mathrm{{dom \,}}~\hat{f} = \left\{ x \in X \mid \hat{f}\left( x \right) \ne \emptyset \right\} . $$

Definition 4.1

A function \(f :X \rightarrow \mathcal P\left( Z, C \right) \) is called

  1. (a)

    convex if \(\mathrm{{graph \,}}f\) is a convex subset of \(X \times Z\),

  2. (b)

    positively homogeneous if \(\mathrm{{graph \,}}f\) is a cone in \(X \times Z\),

  3. (c)

    sublinear if \(\mathrm{{graph \,}}f\) is a convex cone in \(X \times Z\),

  4. (d)

    proper if \(\mathrm{{dom \,}}f \ne \emptyset \) and \(f\left( x \right) \ne Z\) for all \(x \in X\).

Proposition 4.2

A function \(f :X \rightarrow \mathcal P\left( Z, C \right) \) is convex if, and only if,

$$\begin{aligned} t \in \left( 0,1 \right) , \; x_1, x_2 \in X \; \Rightarrow \; f\left( tx_1 + \left( 1-t \right) x_2 \right) \supseteq t f\left( x_1 \right) + \left( 1-t \right) f\left( x_2 \right) . \end{aligned}$$
(4.3)

It is positively homogeneous if, and only if,

$$\begin{aligned} t>0, \; x \in X \; \Rightarrow \; f\left( tx \right) \supseteq tf\left( x \right) , \end{aligned}$$
(4.4)

and it is sublinear if, and only if,

$$\begin{aligned} s, t > 0, \; x_1, x_2 \in X \; \Rightarrow \; f\left( sx_1 + tx_2 \right) \supseteq s f\left( x_1 \right) + t f\left( x_2 \right) . \end{aligned}$$
(4.5)

Proof

Exercise.   \(\square \)

A parallel result for concave \(\mathcal P\left( Z, -C \right) \)-valued functions can be established. As a straightforward consequence of Proposition 4.2 we obtain the following facts.

Proposition 4.3

Let \(f :X \rightarrow \mathcal P\left( Z, C \right) \) be a convex function. Then

  1. (a)

    \(f\left( x \right) \) is convex for all \(x \in X\), i.e. f is convex-valued,

  2. (b)

    \( \left\{ x \in X \mid z \in f\left( x \right) \right\} \) is convex for all \(z \in Z\),

  3. (c)

    \(\mathrm{{dom \,}}f\) is convex.

Proof

Another exercise.   \(\square \)

In the remainder of this subsection, let X and Z be topological linear spaces. We shall denote by \(\mathcal N_X\) and \(\mathcal N_Z\) a neighborhood base of \(0 \in X\) and \(0 \in Z\), respectively.

Definition 4.4

A function \(f :X \rightarrow \mathcal P\left( Z, C \right) \) is called

  1. (a)

    closed-valued if f(x) is a closed set for all \(x \in X\),

  2. (b)

    level-closed if \( \left\{ x \in X \mid z \in f(x) \right\} \) is closed for all \(z\in Z\),

  3. (c)

    closed if \(\mathrm{{graph \,}}f\) is a closed subset of \(X \times Z\) with respect to the product topology.

Remark 4.5

A function \(f :X \rightarrow \mathcal {F}(Z, C)\) is level-closed if, and only if, \( \left\{ x \in X \mid f(x) \supseteq A \right\} \) is closed for all \(A \in \mathcal F\left( Z, C \right) \) which may justify the term “level-closed.” Indeed, this follows from \( \left\{ z \right\} \oplus C \in \mathcal {F}(Z, C)\) and

$$ \forall A \in \mathcal {F}(Z, C) : \left\{ x \in X \mid f(x) \supseteq A \right\} = \bigcap _{a \in A} \left\{ x \in X \mid a \in f(x) \right\} . $$

Level-closedness is even equivalent to closedness if \(\mathrm{{int\,}}C \ne \emptyset \), see [151, Proposition 2.38]; this remains true for functions mapping into a completely distributive lattice as in [148], but it fails in general.

Example 4.6

This example is taken from [176, Example 3.1]. Let \(X = \mathrm {I\negthinspace R}\), \(Z = \mathrm {I\negthinspace R}^2\), \(C = \left\{ \left( 0, t \right) ^T \mid t \ge 0 \right\} \) and consider the function

$$ f(x) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} \left( \begin{array}{c} x \\ x+1 \end{array} \right) + C &{} : &{} 0 \le x < 1 \\ \left( \begin{array}{c} 1 \\ 4 \end{array} \right) + C &{} : &{} x = 1 \\ \emptyset &{} : &{} \text {otherwise.} \end{array} \right. $$

Defining sequences by

$$ x^k = 1 - \frac{1}{k} \quad \text {and} \quad z^k = \left( \begin{array}{c} 1 - \frac{1}{k} \\ 2 - \frac{1}{k} \end{array} \right) $$

we obtain \(z^k \in f(x^k)\) for all \(k = 1, 2, \ldots \), \(x^k \rightarrow 1\), \(z^k \rightarrow \left( 1,2 \right) ^T\) and \(\left( 1,2 \right) ^T \not \in f(1)\), thus \(\mathrm{{graph \,}}f\) is not closed. On the other hand,

$$ \left\{ x \in X \mid z \in f(x) \right\} = \left\{ \begin{array}{c@{\quad }c@{\quad }c} \left\{ z_1 \right\} &{} : &{} 0 \le z_1 < 1 \; \text {and} \; z_2 \ge z_1 + 1 \\ \left\{ 1 \right\} &{} : &{} z_1 = 1 \; \text {and} \; z_2 \ge 4 \\ \emptyset &{} : &{} \text {otherwise} \end{array} \right. $$

thus f is level-closed.
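
The computations of this example are easily machine-checked. The following Python sketch reproduces them; the corner-point encoding of the sets \(\left\{ v \right\} + C\) and all helper names are ours, purely for illustration.

```python
# A numerical companion to Example 4.6. The sets {v} + C with
# C = {(0,t)^T : t >= 0} are encoded by their corner point v.

def f_corner(x):
    """Corner point v of f(x) = {v} + C, or None if f(x) is empty."""
    if 0 <= x < 1:
        return (x, x + 1.0)
    if x == 1:
        return (1.0, 4.0)
    return None

def in_f(x, z):
    """Membership test z in f(x)."""
    v = f_corner(x)
    return v is not None and z[0] == v[0] and z[1] >= v[1]

# the sequence (x^k, z^k) lies in the graph of f ...
for k in (1, 10, 100, 1000):
    assert in_f(1 - 1 / k, (1 - 1 / k, 2 - 1 / k))

# ... but its limit (1, (1,2)^T) does not belong to it:
print(in_f(1.0, (1.0, 2.0)))  # False -> graph f is not closed
# every level set {x : z in f(x)} is empty or a singleton, hence closed
```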

The following result is immediate.

Proposition 4.7

Let \(f :X \rightarrow \mathcal {P}(Z,C)\) be a closed function. Then f is closed-valued and level-closed.

Proof

Yet another exercise.    \(\square \)

Proposition 4.7 shows that a closed \(\mathcal {P}(Z,C)\)-valued function actually maps into \(\mathcal {F}(Z,C)\). Therefore, we can restrict the discussion of lower semicontinuity and closedness to \(\mathcal F(Z,C)\)-valued functions. The following definition introduces two more related notions.

Definition 4.8

A function \(f :X \rightarrow \mathcal {F}(Z, C)\) is called lattice-lower semicontinuous (lattice-l.s.c.) at \(\bar{x} \in X\) iff

$$\begin{aligned} f(x) \supseteq \liminf \limits _{x \rightarrow \bar{x}} f(x) = \sup \limits _{U \in \mathcal N_X}\inf \limits _{x \in \bar{x} + U}f(x) = \bigcap \limits _{U \in \mathcal N_X}\mathrm{{cl \,}}\bigcup \limits _{x \in \bar{x} + U}f(x). \end{aligned}$$
(4.6)

It is called lattice-lower semicontinuous iff it is lattice-l.s.c. at every \(\bar{x} \in X\).

Parallel definitions apply for \(\mathcal {G}(Z, C)\)-valued functions. The next result shows the equivalence of lattice-lower semicontinuity and closedness for \(\mathcal {F}(Z, C)\)-valued functions.

Proposition 4.9

A function \(f :X \rightarrow \mathcal {F}(Z, C)\) is lattice-l.s.c. if, and only if, it is closed.

Proof

The proof of Proposition 2.34 in [151] also applies to this case as already discussed in [151, p. 59].    \(\square \)

The following result contains the heart of the argument for the Weierstraß type theorem.

Proposition 4.10

Let \(f :X \rightarrow \mathcal {F}\left( Z, C \right) \) be a level-closed function such that \(\mathrm{{dom \,}}f\) is compact. Then f[X] satisfies the domination property.

Proof

This is a special case of Proposition 2.38 in [151].   \(\square \)

Theorem 4.11

Let \(f :X \rightarrow \mathcal {F}\left( Z, C \right) \) be a level-closed function such that \(\mathrm{{dom \,}}f\) is compact. Then \(\left( P \right) \) has a full solution.

Proof

This directly follows from Propositions 3.5 and 4.10.   \(\square \)

Because of Propositions 4.7 and 4.9, both lattice-lower semicontinuity and closedness are sufficient conditions for level-closedness.

We turn to upper semi-continuity type properties which will mainly be used to establish sufficient conditions for convex duality results.

Definition 4.12

A function \(f :X \rightarrow \mathcal {F}(Z, C)\) is called lattice-upper semicontinuous (lattice-u.s.c.) at \(\bar{x} \in X\) if

$$ \limsup \limits _{x \rightarrow \bar{x}} f(x) = \inf \limits _{U \in \mathcal N_X}\sup \limits _{x \in \bar{x} + U}f(x) = \mathrm{{cl \,}}\bigcup \limits _{U \in \mathcal N_X}\bigcap \limits _{x \in \bar{x} + U}f(x) \supseteq f(\bar{x}). $$

It is called lattice-upper semicontinuous (lattice-u.s.c.) if it is lattice-u.s.c. at every \(x \in X\).

Because of Proposition 4.3, we only need to consider \(\mathcal {G}(Z, C)\)-valued functions in the following result.

Proposition 4.13

Let X be a locally convex topological linear space and \(\mathcal N_X\) a neighborhood base of \(0 \in X\) consisting of convex sets. Let \(f :X \rightarrow \left( \mathcal {F}(Z,C),\supseteq \right) \) be convex. Then, f is lattice-l.s.c. (lattice-u.s.c.) at \(\bar{x} \in X\) if, and only if, it is lattice-l.s.c. (lattice-u.s.c.) as a function into \(\left( \mathcal {G}(Z,C),\supseteq \right) \) at \(\bar{x}\).

Proof

It is easy to prove that if f is convex, then for all \(\bar{x}\in X\) and all \(U\in \mathcal N_X\) the set \(\bigcup \limits _{x \in \bar{x} + U} f(x)\) is convex, hence

$$ \bigcap \limits _{U\in \mathcal N_X}\mathrm{{cl \,}}\bigcup \limits _{x \in \bar{x} + U}f(x) = \bigcap \limits _{U \in \mathcal N_X}\mathrm{{cl \,}}\mathrm{{co \,}}\bigcup \limits _{x \in \bar{x} + U}f(x). $$

With the definition of \(\liminf \) in view, the case of lattice-lower semi-continuity follows.

Concerning lattice upper semi-continuity, take

$$ z_1, \; z_2 \in \bigcup \limits _{U\in \mathcal N_X}\bigcap \limits _{x \in \bar{x} + U}f(x). $$

Then, there are \(U_1, U_2 \in \mathcal N_X\) such that \(z_i \in \bigcap \limits _{x \in \bar{x} + U_i}f(x)\) for \(i = 1,2\). Since \(\mathcal N_X\) is a neighborhood base of \(0 \in X\) there is \(V \in \mathcal N_X\) such that \(V \subseteq U_1 \cap U_2\). Hence

$$ \forall x \in \bar{x} + V :z_1, z_2 \in f\left( x \right) . $$

Since \(f\left( x \right) \) is a convex set, this implies

$$ \forall t \in \left( 0,1 \right) , \forall x\in \bar{x} + V :tz_1 + \left( 1-t \right) z_2 \in f\left( x \right) , $$

hence \(tz_1 + \left( 1-t \right) z_2 \in \bigcup \limits _{U\in \mathcal N_X}\bigcap \limits _{x \in \bar{x} + U}f(x)\). This shows that the latter is a convex set. Consequently,

$$ \mathrm{{cl \,}}\mathrm{{co \,}}\bigcup \limits _{U\in \mathcal N_X}\bigcap \limits _{x \in \bar{x} + U}f(x) = \mathrm{{cl \,}}\bigcup \limits _{U\in \mathcal N_X}\bigcap \limits _{x \in \bar{x} + U}f(x). $$

The claim for lattice-upper semi-continuity follows from the definition of \(\limsup \).   \(\square \)

4.2 Scalarization of \(\mathcal G\left( Z, C \right) \)-Valued Functions

In the following, we assume that Z is a non-trivial locally convex linear space with topological dual \(Z^*\). For \(A \subseteq Z\), define the extended real-valued functions \(\sigma ^{\vartriangle }_A :Z^* \rightarrow \overline{\mathrm {I\negthinspace R}}\) and \(\sigma ^{\triangledown }_A :Z^* \rightarrow \overline{\mathrm {I\negthinspace R}}\) by

$$ \sigma ^{\vartriangle }_A\left( z^* \right) = \inf _{a \in A}z^*\left( a \right) \quad \text {and} \quad \sigma ^{\triangledown }_A\left( z^* \right) = \sup _{a \in A} z^*\left( a \right) , $$

respectively. Of course, \(\sigma ^{\triangledown }_A\) is the classical support function of A and \(\sigma ^{\vartriangle }_A\left( z^* \right) = -\sigma ^{\triangledown }_A\left( -z^* \right) \) a version of it. It is well-known (and a consequence of a separation argument) that \(A \in \mathcal G\left( Z, C \right) \) if, and only if,

$$\begin{aligned} A = \bigcap \limits _{z^*\in C^+\backslash \{0\}} \left\{ z \in Z \mid \sigma ^{\vartriangle }_A\left( z^* \right) \le z^*\left( z \right) \right\} . \end{aligned}$$
(4.7)

Moreover, one easily checks for \(A, B \in \mathcal G\left( Z, C \right) \),

$$\begin{aligned} \forall z^*\in C^+\backslash \{0\} :\sigma ^{\vartriangle }_{A \oplus B}\left( z^* \right) = \sigma ^{\vartriangle }_A\left( z^* \right) {+^{\negmedspace \centerdot \,}}\sigma ^{\vartriangle }_B\left( z^* \right) . \end{aligned}$$
(4.8)

Lemma 4.14

If \(\mathcal A \subseteq \mathcal G\left( Z, C \right) \) then

$$\begin{aligned} \forall z^*\in C^+\backslash \{0\} :&\sigma ^{\vartriangle }_{\inf ~\mathcal A}\left( z^* \right) = \inf \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} ,\end{aligned}$$
(4.9)
$$\begin{aligned} \forall z^*\in C^+\backslash \{0\} :&\sigma ^{\vartriangle }_{\sup \mathcal A}\left( z^* \right) \ge \sup \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} . \end{aligned}$$
(4.10)

Moreover,

$$\begin{aligned} \inf ~\mathcal A = \bigcap _{z^*\in C^+\backslash \{0\}} \left\{ z \in Z \mid \inf \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} \le z^*\left( z \right) \right\} , \end{aligned}$$
(4.11)
$$\begin{aligned} \sup \mathcal A = \bigcap _{z^*\in C^+\backslash \{0\}} \left\{ z \in Z \mid \sup \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} \le z^*\left( z \right) \right\} \end{aligned}$$
(4.12)

Proof

If \(\mathcal A \subseteq \left\{ \emptyset \right\} \) then there is nothing to prove. Otherwise,

$$ \forall A \in \mathcal A :\sigma ^{\vartriangle }_{\inf ~\mathcal A}\left( z^* \right) = \inf _{z \in \inf \mathcal A} z^*\left( z \right) \le \sigma ^{\vartriangle }_A\left( z^* \right) , $$

hence \(\sigma ^{\vartriangle }_{\inf ~\mathcal A}\left( z^* \right) \le \inf \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} \). Conversely,

$$ \forall z \in \bigcup _{A \in \mathcal A}A :z^*\left( z \right) \ge \inf \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} , $$

hence \(\sigma ^{\vartriangle }_{\inf ~\mathcal A}\left( z^* \right) = \inf \left\{ z^*\left( z \right) \mid z \in \bigcup _{A \in \mathcal A}A \right\} \ge \inf \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} \) since the support function of a set coincides with the support function of its closed convex hull. This proves (4.9) which in turn immediately implies (4.11).

Moreover, if \(z \in \sup \mathcal A = \bigcap _{A \in \mathcal A}A\) then

$$ \forall A \in \mathcal A :z^*\left( z \right) \ge \inf _{a \in A}z^*\left( a \right) = \sigma ^{\vartriangle }_A\left( z^* \right) $$

which already proves (4.10). Finally, for all \(z^*\in C^+\backslash \{0\}\)

$$ \left\{ z \in Z \mid z^*\left( z \right) \ge \sup \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} \right\} = \bigcap _{A \in \mathcal A} \left\{ z \in Z \mid z^*\left( z \right) \ge \sigma ^{\vartriangle }_A\left( z^* \right) \right\} , $$

hence

$$\begin{aligned}&\bigcap _{z^*\in C^+\backslash \{0\}} \left\{ z \in Z \mid z^*\left( z \right) \ge \sup \left\{ \sigma ^{\vartriangle }_A\left( z^* \right) \mid A \in \mathcal A \right\} \right\} = \bigcap _{z^*\in C^+\backslash \{0\}}\bigcap _{A \in \mathcal A} \left\{ z \in Z \mid z^*\left( z \right) \ge \sigma ^{\vartriangle }_A\left( z^* \right) \right\} \\&= \bigcap _{A \in \mathcal A}\bigcap _{z^*\in C^+\backslash \{0\}} \left\{ z \in Z \mid z^*\left( z \right) \ge \sigma ^{\vartriangle }_A\left( z^* \right) \right\} = \bigcap _{A \in \mathcal A} A = \sup \mathcal A \end{aligned}$$

according to (4.7), and this is just (4.12).   \(\square \)

The following example shows that the inequality in (4.10) can be strict. Consider \(\mathcal A = \left\{ \left\{ a \right\} + \mathrm {I\negthinspace R}^2_+ \mid a =(a_1, a_2)^T \in \mathrm {I\negthinspace R}^2, \; a_1 \ge 0, \; a_2 \ge 0, \; a_1+a_2 = 1 \right\} \subseteq \mathcal G\, (\mathrm {I\negthinspace R}^2, \mathrm {I\negthinspace R}^2_+)\) and \(z^* = \left( 1, 1 \right) ^T\). Then \(\sigma ^{\vartriangle }_A\left( z^* \right) = 1\) for all \(A \in \mathcal A\) and \(\sigma ^{\vartriangle }_{\sup \mathcal A}\left( z^* \right) = 2\).
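The two values can be checked numerically. In the following Python sketch, the encoding of the sets \(\left\{ a \right\} + \mathrm {I\negthinspace R}^2_+\) by their corner points is ours and purely illustrative.

```python
# A numerical check that the inequality in (4.10) can be strict.
import numpy as np

z_star = np.array([1.0, 1.0])
corners = [np.array([t, 1 - t]) for t in np.linspace(0.0, 1.0, 101)]

# since z* >= 0, the infimum of z* over {a} + R^2_+ is attained at a:
sigmas = [z_star @ a for a in corners]
print(np.isclose(max(sigmas), 1.0))    # True: sup of the sigma's is 1

# sup(A) is the intersection of all these sets, i.e. {(1,1)^T} + R^2_+:
print(z_star @ np.array([1.0, 1.0]))   # 2.0 -> strictly larger than 1
```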

As an immediate consequence, the sub-/supermodularity of the scalarization functions \(\sigma ^{\triangledown }_A, \sigma ^{\vartriangle }_A\) as functions of \(A \in \mathcal {G}(Z,C)\) can be established. This property is fundamental in the theory of Choquet integrals [44]. For \(z^* \in Z^*\) define

$$ \psi ^{\vartriangle }_{z^*}(A) = \inf _{a \in A} z^*(a) \quad \text {and} \quad \psi ^{\triangledown }_{z^*}(A) = \sup _{a \in A} z^*(a) $$

which are functions \(\psi ^{\vartriangle }_{z^*}, \psi ^{\triangledown }_{z^*} :\mathcal {G}(Z, C) \rightarrow \mathrm {I\negthinspace R}\cup \left\{ \pm \infty \right\} \).

Corollary 4.15

If \(z^* \in C^+\backslash \{0\}\), then \(\psi ^{\vartriangle }_{z^*}\) is a supermodular function on the complete lattice \((\mathcal {G}(Z, C), \supseteq )\), i.e.

$$ \psi ^{\vartriangle }_{z^*}(A) + \psi ^{\vartriangle }_{z^*}(B) \le \psi ^{\vartriangle }_{z^*}(A \cap B) + \psi ^{\vartriangle }_{z^*}(A \cup B). $$

Likewise, \(\psi ^{\triangledown }_{z^*}\) is a submodular function on \((\mathcal {G}(Z, C), \supseteq )\), i.e.

$$ \psi ^{\triangledown }_{z^*}(A) + \psi ^{\triangledown }_{z^*}(B) \ge \psi ^{\triangledown }_{z^*}(A \cap B) + \psi ^{\triangledown }_{z^*}(A \cup B). $$

Proof

This follows from the definition of the functions \(\psi ^{\vartriangle }_{z^*}, \psi ^{\triangledown }_{z^*}\) and Lemma 4.14.   \(\square \)
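
Corollary 4.15 can also be tested numerically. The following Python sketch (an illustrative encoding of ours) does so for shifted cones \(A = \left\{ a \right\} + \mathrm {I\negthinspace R}^2_+\), for which \(A \cap B\) is the shifted cone with corner \(\max (a,b)\) (componentwise) and the infimum of \(z^*\) over \(A \cup B\) is the minimum of the two support values.

```python
# A randomized sanity check of the supermodularity in Corollary 4.15.
import numpy as np

rng = np.random.default_rng(seed=0)
z_star = rng.random(2)  # an element of C+ = R^2_+

for _ in range(1000):
    a, b = rng.normal(size=2), rng.normal(size=2)
    psi_a, psi_b = z_star @ a, z_star @ b
    psi_cap = z_star @ np.maximum(a, b)  # psi^triangle of A cap B
    psi_cup = min(psi_a, psi_b)          # psi^triangle of A cup B
    assert psi_a + psi_b <= psi_cap + psi_cup + 1e-12
print("supermodularity of psi^triangle verified on random instances")
```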

The inf-residuation in \(\mathcal G\left( Z, C \right) \) can also be represented via scalarization.

Proposition 4.16

For all \(A, B\in \mathcal G\left( Z, C \right) \),

$$ A {-^{\negmedspace \centerdot \,}}B = \bigcap \limits _{z^*\in C^+\backslash \{0\}} \left\{ z\in Z \mid \sigma ^{\vartriangle }_A\left( z^* \right) {-^{\negmedspace \centerdot \,}}\sigma ^{\vartriangle }_B\left( z^* \right) \le z^*\left( z \right) \right\} . $$

In particular, if \(A = \left\{ z \in Z \mid \sigma ^{\vartriangle }_A\left( z^* \right) \le z^*(z) \right\} \left( = A \oplus H^+(z^*) \right) \) for \(z^*\in C^+\backslash \{0\}\), then

$$\begin{aligned} A {-^{\negmedspace \centerdot \,}}B = \left\{ z \in Z \mid \sigma ^{\vartriangle }_A\left( z^* \right) {-^{\negmedspace \centerdot \,}}\sigma ^{\vartriangle }_B\left( z^* \right) \le z^*\left( z \right) \right\} . \end{aligned}$$

Moreover,

$$ \forall z^* \in C^+\backslash \{0\} :\sigma ^{\vartriangle }_{A {-^{\negmedspace \centerdot \,}}B}\left( z^* \right) \ge \sigma ^{\vartriangle }_A\left( z^* \right) {-^{\negmedspace \centerdot \,}}\sigma ^{\vartriangle }_B\left( z^* \right) $$

with equality if \(A = \left\{ z \in Z \mid \sigma ^{\vartriangle }_A\left( z^* \right) \le z^*(z) \right\} \left( = A \oplus H^+(z^*) \right) \).

Proof

See [89, Proposition 5.20] while recalling \(H^+(z^*) = \left\{ z \in Z \mid z^*\left( z \right) \ge 0 \right\} \) for \(z^* \in Z^*\).    \(\square \)

The following result can be seen as a “\({-}\inf = \sup {-}\)” rule for the inf-residuation in \(\mathcal {G}(Z, C)\). It turns out to be useful later on.

Proposition 4.17

Let \(\mathcal A \subseteq \mathcal {G}\left( Z, C \right) \), \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) and \(H^+(z^*) = \left\{ z \in Z \mid z^*(z) \ge 0 \right\} \). Then

$$\begin{aligned} H^+(z^*) {-^{\negmedspace \centerdot \,}}\inf ~\mathcal A&= \sup _{A \in \mathcal A} \left[ H^+(z^*) {-^{\negmedspace \centerdot \,}}A \right] , \end{aligned}$$
(4.13)
$$\begin{aligned} H^+(z^*) {-^{\negmedspace \centerdot \,}}\sup \mathcal A&\supseteq \inf _{A \in \mathcal A} \left[ H^+(z^*) {-^{\negmedspace \centerdot \,}}A \right] . \end{aligned}$$
(4.14)

If \(A \oplus H^+(z^*) = A\) for all \(A \in \mathcal A\) then (4.14) is satisfied as an equation.

Proof

Formula (4.13) directly follows from

$$\begin{aligned}&H^+(z^*) {-^{\negmedspace \centerdot \,}}\inf ~\mathcal A = \left\{ z \in Z \mid \mathrm{{cl \,}}\mathrm{{co \,}}\bigcup _{A \in \mathcal A}A + z \subseteq H^+(z^*) \right\} \\&\qquad \qquad \qquad = \left\{ z \in Z \mid \forall A \in \mathcal A :A + z \subseteq H^+(z^*) \right\} = \bigcap _{A \in \mathcal A} \left\{ z \in Z \mid A + z \subseteq H^+(z^*) \right\} . \end{aligned}$$

The proof of (4.14) makes use of the fact \(B_1 \subseteq B_2\) \(\Leftrightarrow \) \(H^+(z^*) {-^{\negmedspace \centerdot \,}}B_2 \subseteq H^+(z^*) {-^{\negmedspace \centerdot \,}}B_1\). Applying it to \(B_1 = \bigcap _{A \in \mathcal A}A\) and \(B_2 = A\) we obtain (4.14). The equality case can be proven with the help of Lemma 4.14 and Proposition 4.16.   \(\square \)

A simple counterexample for equality in (4.14) is as follows: \(Z = \mathrm {I\negthinspace R}^2\), \(C = \mathrm {I\negthinspace R}^2_+\), \(\mathcal A = \left\{ A_1, A_2 \right\} \) with \(A_1 = \left( 1, 0 \right) ^T + \mathrm {I\negthinspace R}^2_+\), \(A_2 = \left( 0, 1 \right) ^T + \mathrm {I\negthinspace R}^2_+\) and \(z^* = \left( 1, 1 \right) ^T\). Both (4.13) and (4.14) are valid for more general sets than \(H^+(z^*)\), but this is not needed in the following.
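This counterexample is quickly computed. In the Python sketch below (an encoding of ours), the residuation \(H^+(z^*) {-^{\negmedspace \centerdot \,}}\left( \left\{ a \right\} + \mathrm {I\negthinspace R}^2_+ \right) \) is the half space \(\left\{ z \mid z^*(z) \ge -z^*(a) \right\} \), so it can be stored as the single offset \(-z^*(a)\).

```python
# The counterexample after Proposition 4.17, computed.
import numpy as np

z_star = np.array([1.0, 1.0])
offset = lambda a: -(z_star @ a)  # H^+(z*) -. ({a} + R^2_+)

a1, a2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
# inf of the two residuations: half space with the larger offset kept
print(min(offset(a1), offset(a2)))    # -1.0
# sup A = A1 cap A2 = {(1,1)^T} + R^2_+, hence
print(offset(np.array([1.0, 1.0])))   # -2.0: a strictly larger half space
```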

The previous results establish a one-to-one relationship between \(\mathcal {G}\left( Z, C \right) \) and the set

$$ \Gamma \left( Z^*, C^+ \right) = \left\{ \sigma :C^+ \rightarrow \overline{\mathrm {I\negthinspace R}}\mid \; \sigma \; \text {is superlinear and has a closed hypograph} \right\} . $$

On \(\Gamma \left( Z^*, C^+ \right) \), we consider the pointwise addition \({+^{\negmedspace \centerdot \,}}\) and the pointwise multiplication with non-negative numbers. Finally, two elements of \(\Gamma \left( Z^*, C^+ \right) \) are compared pointwise, and we write \(\sigma \le \gamma \) whenever

$$ \forall z^* \in C^+ :\sigma \left( z^* \right) \le \gamma \left( z^* \right) . $$

The one-to-one relationship includes the algebraic structure as well as the order structure.

Proposition 4.18

The quadruple \(\left( \Gamma \left( Z^*, C^+ \right) , \le , {+^{\negmedspace \centerdot \,}}, \cdot \right) \) is an inf-residuated conlinear space which is algebraically and order isomorphic to \(\left( \mathcal {G}\left( Z, C \right) , \supseteq , \oplus , \odot \right) \).

Proof

The formulas

$$ \sigma ^{\vartriangle }_A\left( z^* \right) = \inf _{a \in A}z^*\left( a \right) , \quad A^{\vartriangle }_\sigma = \bigcap _{z^* \in C^+\backslash \{0\}} \left\{ z \in Z \mid \sigma \left( z^* \right) \le z^*\left( z \right) \right\} $$

and

$$\begin{aligned} \sigma ^{\vartriangle }_{A^{\vartriangle }_\sigma } = \sigma , \quad A^{\vartriangle }_{\sigma ^{\vartriangle }_A} = A \end{aligned}$$
(4.15)

provide the relationship; the algebraic isomorphism is provided by

$$ \sigma ^{\vartriangle }_{A \oplus B} = \sigma ^{\vartriangle }_A {+^{\negmedspace \centerdot \,}}\sigma ^{\vartriangle }_B, \quad A^{\vartriangle }_\sigma \oplus A^{\vartriangle }_\gamma = A^{\vartriangle }_{\sigma {+^{\negmedspace \centerdot \,}}\gamma } $$

and for \(t \ge 0\)

$$ \sigma ^{\vartriangle }_{tA} = t \sigma ^{\vartriangle }_{A}, \quad tA^{\vartriangle }_\sigma = A^{\vartriangle }_{t\sigma }; $$

the order isomorphism is provided by

$$ A \supseteq B \quad \Leftrightarrow \quad \sigma ^{\vartriangle }_A \le \sigma ^{\vartriangle }_B $$

and (4.15).   \(\square \)

Corollary 4.19

Let \(\mathcal A \subseteq \mathcal {G}(Z, C)\). Then:

  1. (a)

    A set \(\mathcal B \subseteq \mathcal A\) generates the infimum of \(\mathcal A\) if, and only if,

    $$ \sigma ^{\vartriangle }_{\inf ~\mathcal B} = \sigma ^{\vartriangle }_{\inf ~\mathcal A}. $$
  2. (b)

    \(\bar{A} \in \mathcal A\) is minimal for \(\mathcal A\) if, and only if, \(\sigma ^{\vartriangle }_{\bar{A}}\) is a minimal element of

    $$ \left\{ \sigma ^{\vartriangle }_{A} \mid A \in \mathcal A \right\} $$

    with respect to the point-wise order in \(\Gamma \left( Z^*, C^+ \right) \).

Proof

This is an obvious consequence of the previous results.   \(\square \)

One may think that this straightforward result reduces \(\mathcal {G}(Z, C)\)-valued (= set-valued) problems to vector optimization problems since the functions \(\sigma ^{\vartriangle }_{A}\) could be considered as elements of some function space with point-wise order. Such an approach can be found in [115]. The problem with this point of view is that the functions \(\sigma ^{\vartriangle }_{A}\) may (and frequently do) attain the values \(-\infty \) and/or \(+\infty \). Therefore, the difficulty is preserved in passing from \(\mathcal {G}(Z, C)\) to \(\Gamma \left( Z^*, C^+ \right) \) since the latter is an ordered conlinear space which, in general, cannot be embedded into a linear space of functions.

We turn the above ideas into a scalarization concept for set-valued functions. Let X be a topological linear space and \(f :X \rightarrow \mathcal P\left( Z \right) \), \(z^* \in C^+\) be given. Define an extended real-valued function \(\varphi _{f, z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}= \mathrm {I\negthinspace R}\cup \left\{ \pm \infty \right\} \) by

$$\begin{aligned} \varphi _{f, z^*}\left( x \right) = \sigma ^{\vartriangle }_{f\left( x \right) }\left( z^* \right) = \inf _{z \in f\left( x \right) }z^*\left( z \right) . \end{aligned}$$
(4.16)

The new symbol \(\varphi _{f, z^*}\) is justified by the fact that we want to emphasize the dependence on x rather than on \(z^*\). From (4.7) we obtain the following “setification” formula: If \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) then

$$\begin{aligned} \forall x \in X :f\left( x \right) = \bigcap _{z^* \in C^+\backslash \{0\}} \left\{ z \in Z \mid \varphi _{f, z^*}\left( x \right) \le z^*\left( z \right) \right\} . \end{aligned}$$
(4.17)

Several important properties of \(\mathcal {G}\left( Z, C \right) \)-valued functions can equivalently be expressed using the family of its scalarizations \( \left\{ \varphi _{f, z^*} \right\} _{z^* \in C^+\backslash \{0\}}\). One may say that, according to formulas (4.16) and (4.17), a \(\mathcal {G}\left( Z, C \right) \)-valued function is, as a mathematical object, equivalent to this family of extended real-valued functions.
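The “setification” (4.17) can be illustrated numerically. In the following Python sketch the function f and all names are our own toy choices, and the intersection over \(C^+\backslash \{0\}\) is only approximated by finitely many sampled directions.

```python
# An illustration of (4.17) for f(x) = {(x, x^2)^T} + R^2_+.
import numpy as np

a = lambda x: np.array([x, x * x])  # corner of f(x)
dirs = [np.array([np.cos(t), np.sin(t)])
        for t in np.linspace(0, np.pi / 2, 91)]  # sample of C+ \ {0}

def in_f(x, z):
    """z in f(x) iff z*(a(x)) <= z*(z) for all sampled z* in C+."""
    return all(zs @ a(x) <= zs @ z + 1e-12 for zs in dirs)

print(in_f(2.0, np.array([2.0, 4.0])))  # True  (the corner itself)
print(in_f(2.0, np.array([3.0, 5.0])))  # True  (corner + R^2_+)
print(in_f(2.0, np.array([1.9, 9.0])))  # False (detected by z* near (1,0)^T)
```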

Topological properties like closedness pose difficulties in this context since scalarizations of a closed \(\mathcal {G}\left( Z, C \right) \)-valued function are not necessarily closed. A simple example is the following: The function \(f :\mathrm {I\negthinspace R}\rightarrow \mathcal {G}(\mathrm {I\negthinspace R}^2, \mathrm {I\negthinspace R}^2_+)\) defined by \(f(x) = \left\{ \left( \frac{1}{x}, 0 \right) ^T \right\} + \mathrm {I\negthinspace R}^2_+\) for \(x>0\) and \(f\left( x \right) = \emptyset \) for \(x \le 0\) is closed and convex, but for \(z^*=\left( 0,1 \right) ^T\) the scalarization \(\varphi _{f, z^*}\) is convex, yet not closed. Below, we will deal with this issue.
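Numerically, the failure of lower semicontinuity is visible as a jump of the scalarization at \(x = 0\); the Python sketch below (encoding ours) shows it.

```python
# The example above: the scalarization of a closed convex function
# f(x) = {(1/x,0)^T} + R^2_+ (x > 0), f(x) = {} (x <= 0), is not closed.
import numpy as np

def phi(x, z_star):
    """phi_{f,z*}(x) = inf_{z in f(x)} z*(z)."""
    if x <= 0:
        return np.inf  # f(x) is empty, the infimum over it is +infinity
    return z_star @ np.array([1.0 / x, 0.0])  # attained at the corner

z_star = np.array([0.0, 1.0])
print([phi(x, z_star) for x in (0.01, 1.0, 100.0)])  # [0.0, 0.0, 0.0]
print(phi(0.0, z_star))  # inf: liminf phi = 0 < phi(0), so phi is not l.s.c.
```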

Lemma 4.20

Let \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) be a function. Then:

  1. (a)

    f is convex if, and only if, \(\varphi _{f, z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}\) is convex for all \(z^* \in C^+\backslash \{0\}\).

  2. (b)

    f is positively homogeneous if, and only if, \(\varphi _{f, z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}\) is positively homogeneous for all \(z^* \in C^+\backslash \{0\}\).

  3. (c)

    f is (sub)additive if, and only if, \(\varphi _{f, z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}\) is (sub)additive for all \(z^* \in C^+\backslash \{0\}\).

  4. (d)

    f is proper if, and only if, there is \(z^* \in C^+\backslash \{0\}\) such that \(\varphi _{f, z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}\) is proper.

  5. (e)

    \(\mathrm{{dom \,}}f = \mathrm{{dom \,}}\varphi _{f, z^*}\) for all \(z^* \in C^+\backslash \{0\}\).

Proof

(a) “\(\Rightarrow \)” Take \(t \in (0,1)\), \(x, y \in X\) and \(z^* \in C^+\backslash \{0\}\). Then

$$\begin{aligned} \varphi _{f, z^*}(tx + \left( 1-t \right) y)&= \inf _{z \in f(tx + \left( 1-t \right) y)}z^*\left( z \right) \le \inf _{z \in tf(x) + \left( 1-t \right) f(y)}z^*\left( z \right) \\&= \inf _{u \in tf(x)}z^*\left( u \right) + \inf _{v \in \left( 1-t \right) f(y)}z^*\left( v \right) \\&= t\inf _{\frac{u}{t} \in f(x)}\frac{z^*\left( u \right) }{t} + \left( 1-t \right) \inf _{\frac{v}{\left( 1-t \right) } \in f(y)}\frac{z^*\left( v \right) }{\left( 1-t \right) }\\&= t\varphi _{f, z^*}(x) + \left( 1-t \right) \varphi _{f, z^*}(y) \end{aligned}$$

where the inequality is a consequence of the convexity of f.

“\(\Leftarrow \)” By way of contradiction, assume that f is not convex. Then there are \(t \in (0,1)\), \(x, y \in X\), \(z \in Z\) satisfying

$$ z \in tf(x) + \left( 1-t \right) f(y), \quad z \not \in f(tx + \left( 1-t \right) y). $$

Since the values of f are closed convex sets we can apply a separation theorem and obtain \(z^* \in Z^*\backslash \{0\}\) such that

$$ z^*\left( z \right) < \varphi _{f, z^*}(tx + \left( 1-t \right) y) \le t\varphi _{f, z^*}(x) + \left( 1-t \right) \varphi _{f, z^*}(y) $$

where the second inequality is a consequence of the convexity of the scalarizations; it applies since f maps into \(\mathcal {G}\left( Z, C \right) \), which forces \(z^* \in C^+\backslash \left\{ 0 \right\} \). Since \(z \in tf(x) + \left( 1-t \right) f(y)\) there are \(u \in f\left( x \right) \) and \(v \in f(y)\) such that \(z = tu + (1-t)v\). Hence

$$ z^*\left( z \right) = tz^*\left( u \right) + (1-t)z^*\left( v \right) \ge t\varphi _{f, z^*}(x) + (1-t)\varphi _{f, z^*}(y) $$

by definition of the scalarization. This contradicts the strict inequality above.

(c) If f is (sub)additive, then (sub)additivity of the scalarizations \(\varphi _{f, z^*}\) follows from (4.8). The converse can be proven using the same separation idea as in the proof of (a).

The remaining claims are straightforward.    \(\square \)

Finally, we link closedness and semicontinuity of \(\mathcal G(Z,C)\)-valued functions to corresponding properties of their scalarizations. The main result is Theorem 4.22 below which shows that a proper closed and convex set-valued function and the family of its proper closed and convex scalarizations are equivalent as mathematical objects. We start with a characterization of the lattice-limit inferior in terms of scalarizations.

Corollary 4.21

Let \(f :X \rightarrow \mathcal {G}(Z,C)\) and \(\bar{x} \in \mathrm{{dom \,}}f\) such that f is lattice-l.s.c. at \(\bar{x}\). Then

$$ \liminf _{x \rightarrow \bar{x}}f(x) = \left\{ z \in Z \mid \forall z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} :\liminf _{x \rightarrow \bar{x}}\varphi _{f, z^*}(x) \le z^*(z) \right\} . $$

Proof

Observing \(\varphi _{f,z^*}(x) = \sigma ^{\vartriangle }_{f(x)}(z^*)\) for each \(x \in X\) and applying Lemma 4.14 we obtain

$$\begin{aligned} \sup \limits _{U \in \mathcal N_X}\inf \limits _{x \in \bar{x} + U}f(x)&= \bigcap \limits _{z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} } \left\{ z \in Z \mid \sup _{U \in \mathcal N_X} \sigma ^{\vartriangle }_{\inf \limits _{x \in \bar{x} + U}f(x)}(z^*) \le z^*(z) \right\} \\&= \bigcap \limits _{z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} } \left\{ z \in Z \mid \sup _{U \in \mathcal N_X} \inf _{x \in \bar{x} + U}\sigma ^{\vartriangle }_{f(x)}(z^*) \le z^*(z) \right\} \\&= \bigcap \limits _{z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} } \left\{ z \in Z \mid \liminf \limits _{x \rightarrow \bar{x}}\varphi _{f,z^*}(x) \le z^*(z) \right\} . \end{aligned}$$

Indeed, the first equality follows from the last equation in Lemma 4.14 applied to \(\mathcal A = \left\{ \inf _{x \in \bar{x} + U}f(x) \mid U \in \mathcal N_X \right\} \) whereas the second follows from the first equation in Lemma 4.14 applied to \(\mathcal A = \left\{ f(x) \mid x \in \bar{x} + U \right\} \) for \(U \in \mathcal N_X\).    \(\square \)

Theorem 4.22

Let \(f :X \rightarrow \mathcal {F}(Z,C)\) be a function with \(\mathrm{{dom \,}}f \ne \emptyset \). Then f is closed, convex and either identically Z or proper if, and only if,

$$\begin{aligned} \forall x \in X :f(x) =\bigcap \limits _{\begin{array}{c} z^*\in C^+\backslash \{0\}\\ \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}\; \text {is proper} \end{array}} \left\{ z\in Z \mid \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}(x)\le z^*(z) \right\} , \end{aligned}$$
(4.18)

where \(\mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}\) denotes the lower semi-continuous convex hull of \(\varphi _{f,z^*}\) defined by

$$ \mathrm{{epi \,}}\left( \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*} \right) = \mathrm{{cl \,}}\mathrm{{co \,}}\left( \mathrm{{epi \,}}\varphi _{f,z^*} \right) . $$

Proof

If the set \( \left\{ z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} \mid \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}\; \text {is proper} \right\} \) is empty, then (4.18) produces \(f(x)=Z\) for all \(x \in X\) since \(\mathrm{{dom \,}}f \ne \emptyset \). On the other hand, \(f(x)=Z\) for all \(x \in X\) implies the emptiness of the same set, hence (4.18) is satisfied in this case.

The graphs of \(x \mapsto \left\{ z\in Z \mid \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}(x)\le z^*(z) \right\} \) are closed convex sets in \(X \times Z\), and \( \left\{ z \in Z \mid \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}(x)\le z^*(z) \right\} \ne Z\) for all \(x \in X\) is true whenever \(\mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}\) is proper. Thus, (4.18) implies f is closed, convex and either proper or constantly equal to Z.

On the other hand, assume f is closed, convex and proper. Then

$$ \forall x \in X :f(x) = \liminf \limits _{y \rightarrow x}f(y) \ne Z, $$

and all scalarizations are convex. Corollary 4.21 yields

$$ f(x) = \liminf \limits _{y \rightarrow x}f(y) = \bigcap \limits _{z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} } \left\{ z \in Z \mid \liminf \limits _{y \rightarrow x}\varphi _{f,z^*}(y) \le z^*(z) \right\} = \bigcap \limits _{z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} } \left\{ z \in Z \mid \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}(x) \le z^*(z) \right\} , $$

where the last equality holds since the scalarizations \(\varphi _{f,z^*}\) are convex, so that \(\liminf _{y \rightarrow x}\varphi _{f,z^*}(y) = \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}(x)\).

If \(\mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}\) is improper, then \( \left\{ z \in Z \mid \mathrm{{cl \,}}\mathrm{{co \,}}\varphi _{f,z^*}(x) \le z^*(z) \right\} = Z\) for all \(x \in \mathrm{{dom \,}}f = \mathrm{{dom \,}}\varphi _{f,z^*}\) (see [235, Proposition 2.2.5]), hence these scalarizations can be omitted from the intersection. This completes the proof.    \(\square \)

We state a few more facts about relationships between semicontinuity properties of set-valued functions and their scalarizations.

Proposition 4.23

  1. (a)

    If \(f :X \rightarrow \mathcal {F}(Z,C)\) is lattice-u.s.c. at \(\bar{x} \in X\), then \(\varphi _{f,z^*} : X \rightarrow \overline{\mathrm {I\negthinspace R}}\) is u.s.c. at \(\bar{x}\) for all \(z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} \).

  2. (b)

    If \(f :X \rightarrow \mathcal {G}(Z,C)\) is such that \(\varphi _{f,z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}\) is l.s.c. at \(\bar{x} \in X\) for all \(z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} \), then f is lattice-l.s.c. at \(\bar{x}\).

Proof

  1. (a)

    Define \(A(\bar{x}) = \limsup \limits _{x \rightarrow \bar{x}}f(x) = \mathrm{{cl \,}}\bigcup \limits _{U\in \mathcal N_X}\bigcap \limits _{x \in \bar{x} + U}f(x)\) and take \(z^* \in C^+\backslash \{0\}\). By assumption,

    $$ \varphi _{f,z^*}(\bar{x}) \ge \sigma ^{\vartriangle }_{A(\bar{x})}\left( z^* \right) . $$

    By a successive application of the first and the second relation of Lemma 4.14,

    $$ \sigma ^{\vartriangle }_{A(\bar{x})}\left( z^* \right) \ge \inf \limits _{U\in \mathcal N_X}\sup \limits _{x \in \bar{x} + U} \varphi _{f, z^*}(x). $$

    This verifies the upper semicontinuity of the scalarizations.

  2. (b)

    From Corollary 4.21, the lower semicontinuity of the \(\varphi _{f,z^*}\)’s and (4.17) we obtain

    $$\begin{aligned} \liminf \limits _{x \rightarrow \bar{x}}f(x)&= \bigcap \limits _{z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} } \left\{ z \in Z \mid \liminf \limits _{x \rightarrow \bar{x}}\varphi _{f,z^*}(x) \le z^*(z) \right\} \\&\subseteq \bigcap \limits _{z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} } \left\{ z \in Z \mid \varphi _{f,z^*}(\bar{x}) \le z^*(z) \right\} = f(\bar{x}) \end{aligned}$$

    which means that f is lattice-l.s.c. at \(\bar{x}\).

   \(\square \)

Corollary 4.24

If \(f :X \rightarrow \mathcal {F}(Z,C)\) is convex and lattice-u.s.c. at \(\bar{x} \in \mathrm{{dom \,}}f\), then each scalarization \(\varphi _{f,z^*}\) is continuous at \(\bar{x}\) and f is lattice-l.s.c. at \(\bar{x}\). Moreover, in this case f is also lattice-u.s.c. and lattice-l.s.c. at \(\bar{x}\) as a function into \(\mathcal {G}(Z,C)\).

Proof

By Proposition 4.23 (a), \(\varphi _{f,z^*}\) is u.s.c. at \(\bar{x}\) for each \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) (and also convex by Lemma 4.20 (a)). Well-known results about extended real-valued convex functions [235, Theorem 2.2.9] imply that each \(\varphi _{f,z^*}\), \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \), is continuous, which in turn yields that f is lattice-l.s.c. at \(\bar{x}\) by Proposition 4.23 (b). The last claim follows from Proposition 4.13.    \(\square \)

Corollary 4.25

Let \(f :X \rightarrow \mathcal {G}(Z,C)\) be a convex function and \(\bar{x} \in X\) such that there exists a \(\bar{z} \in Z\) with \((\bar{x}, \bar{z}) \in \mathrm{{int\,}}(\mathrm{{graph \,}}f)\). Then \(\varphi _{f,z^*} :X \rightarrow \overline{\mathrm {I\negthinspace R}}\) is continuous on \(\emptyset \ne \mathrm{{int\,}}(\mathrm{{dom \,}}f)\) for all \(z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} \).

Proof

If \((\bar{x}, \bar{z}) \in \mathrm{{int\,}}(\mathrm{{graph \,}}f)\) then \(\varphi _{f,z^*}\) is bounded from above by \(z^*(\bar{z})\) on a neighborhood of \(\bar{x}\), thus continuous on \(\emptyset \ne \mathrm{{int\,}}(\mathrm{{dom \,}}f)\) for all \(z^*\in C^+\backslash \negthinspace \left\{ 0 \right\} \) again by [235, Theorem 2.2.9].    \(\square \)

4.3 Comments on Convexity, Semicontinuity and Scalarization

The properties which are called lattice-lower and lattice-upper semicontinuity can already be found in the 1978 paper [142]. Note that in this survey, for obvious reasons, ‘upper’ and ‘lower’ are swapped compared to [142]. Therein, the result of Proposition 4.9 is even traced back to a 1947 paper by Choquet.

Level-closedness features in [54, 55] as ‘D-lower semi-continuity’ and ‘C-lower semi-continuity’, respectively: Proposition 2.3 in [54] states the equivalence of (epi)closedness and level-closedness whenever the cone has a non-empty interior. The assumption of “pointedness of the cone” and a compactness assumption used in [54] are not necessary; the latter was already removed in [55, Proposition 3.1]. Compare also [176].

Of course, the lattice semicontinuity concepts of this survey differ from the definitions of lower and upper semicontinuity as used, for example, in [4, Definitions 1.4.1 and 1.4.2]. This is one reason why lower and upper continuity replace lower and upper semicontinuity, respectively, in [67]. We refer to Sect. 2.5 of [67] for a survey about continuity concepts of set-valued functions and also a few bibliographical remarks at the end of the section.

For a more detailed discussion of (semi)continuity concepts for set-valued functions, compare [102, 104, 151]: Whereas Corollary 4.21 seems to be new in this form, Proposition 4.23 appears in [104] with a (slightly) different proof.

The scalarization approach via (4.16) (and Lemma 4.20) has many contributors. Motivated by economic applications, Shephard used it in [209], compare, for example, the definition of the ‘factor minimal cost function’ [209, p. 226, (97)] and Proposition 72 on the following page where essentially Lemma 4.20 (a) is stated. Moreover, the first part of Proposition 4.2 corresponds to [209, Appendix 2, Proposition 3]. Rockafellar baptized these functions Kuhn-Tucker functions in his 1967 monograph [195, Definition 2 on p. 17] where they were used as an auxiliary tool for representing the then new “convex processes.” Compare also the relationship to the orders \(\supseteq \) for sets of ‘convex type’ and \(\subseteq \) for sets of ‘concave type,’ [195, p. 16].

Pshenichnyi [194, Lemma 1] also used the functions \(\varphi _{f, z^*}\) and proved Lemma 4.20 (a), see also [12]. Another reference is [45, Proposition 1.6]. In [169] as well as in [170] continuity concepts for set-valued functions are discussed using the \(\varphi _{f,z^*}\)-functions as essential tool. See also [9, Proposition 2.1] and the more recent [68, p. 188] (see also the references therein).

Theorem 4.22 has been established in [207, 208] and is the basis for the scalarization approach to convex duality results for set-valued functions. Together with the “setification” formula (4.17) it basically tells us that one can either deal with the \(\mathcal {G}(Z,C)\)-valued function or a whole family of scalar functions, and both approaches are equivalent in the sense that major (convex duality) results can be expressed and proven either way: Using the “set calculus” or “scalarizations.” The reader may compare the two different proofs for the Lagrange duality theorem in [86].

Finally, we mention that an alternative scalarization approach to (convex as well as non-convex) problems is based on directionally translative extended real-valued functions which are used in many areas of mathematics and prominently in vector optimization, see [67, Sect. 2.3]. To the best of our knowledge, [83] (eventually published as [85]) was the first generalization to set-valued problems. Subsequent applications of this construction include [2, 72, 96, 164, 181–183, 229].

5 Set-Valued Convex Analysis

What is convex analysis? A core content of this theory could be described as follows: Define affine minorants, directional derivatives, (Fenchel) conjugates and subdifferentials for convex functions and relate them by means of a Fenchel-Moreau type theorem, a max-formula and Young-Fenchel’s inequality as an equation. How can one establish such a theory for set-valued convex functions? In this section, we will define appropriate “dual variables” for the set-valued framework, define “affine minorants” of set-valued functions and introduce corresponding Fenchel conjugates, directional derivatives and subdifferentials. The difference of extended reals appearing in these constructions for scalar functions will be replaced by a residuation.

In the following, we assume that X and Z are non-trivial, locally convex, topological linear spaces with topological duals \(X^*\) and \(Z^*\), respectively. As before, \(C \subseteq Z\) is a convex cone with \(0 \in C\), and \(C^+ = \left\{ z^* \in Z^* \mid \forall z \in C :z^*\left( z \right) \ge 0 \right\} \) is its positive (topological) dual.

5.1 Conlinear Functions

What is an appropriate replacement for the dual variables \(x^* :X \rightarrow \mathrm {I\negthinspace R}\) in scalar convex analysis? A good guess might be to use linear operators \(T :X \rightarrow Z\) instead of linear functionals in expressions like

$$ f^*\left( x^* \right) = \sup _{x \in X} \left\{ x^*\left( x \right) - f\left( x \right) \right\} . $$

This has been done in most references about duality for vector/set optimization problems. A notable exception is the definition of the coderivative of set-valued functions due to B. S. Mordukhovich, which goes back to [171] and can be found in [172, Sect. 2]. Coderivatives at points of the graph are defined as sets of \(x^*\)’s depending on an element \(z^* \in Z^*\). Another exception is the use of “rank one” operators of the form \(\hat{z}x^*\) whose existence can be proven using classical separation results, compare [32, Proof of Theorem 4.1] and [98, Theorem 4.1] for an older and a more recent example. The constructions in [231] are also based on this idea.

Another attempt to find set-valued analogues of linear functions is the theory of convex processes. See [4, p. 55] in which the authors state that ‘it is quite natural to regard set-valued maps, with closed convex cones as their graphs, as these set-valued analogues.’

In our approach, we utilize a class of set-valued functions whose members behave almost like linear functions. In some sense (see Proposition 8 in [78]), they are more general than linear operators and also than linear processes as defined in [4, p. 55]; on the other hand, they form a particular class of convex processes. In fact, these functions are characterized by the fact that their graphs are homogeneous closed half spaces in \(X \times Z\).

Let \(x^* \in X^*\) and \(z^* \in Z^*\) be given. Define a function \(S_{\left( x^*, z^* \right) } :X \rightarrow \mathcal P\left( Z \right) \) by

$$ S_{\left( x^*, z^* \right) }\left( x \right) = \left\{ z \in Z \mid x^*\left( x \right) \le z^*\left( z \right) \right\} . $$

The next result shows that these functions are indeed as “linear” as one can hope for.

Proposition 5.1

Let \(\left( x^*, z^* \right) \in X^* \times Z^*\backslash \{0\}\). Then

  1. (a)

    for all \(x \in X\) and for all \(t > 0\)

    $$ S_{\left( x^*, z^* \right) }\left( tx \right) = tS_{\left( x^*, z^* \right) }\left( x \right) ; $$
  2. (b)

    for all \(x_1, x_2 \in X\)

    $$ S_{\left( x^*, z^* \right) }\left( x_1 + x_2 \right) = S_{\left( x^*, z^* \right) }\left( x_1 \right) + S_{\left( x^*, z^* \right) }\left( x_2 \right) , $$

    in particular

    $$ S_{\left( x^*, z^* \right) }\left( x \right) + S_{\left( x^*, z^* \right) }\left( -x \right) = S_{\left( x^*, z^* \right) }\left( 0 \right) = H^+(z^*); $$
  3. (c)

    \(S_{\left( x^*, z^* \right) }\) maps into \(\mathcal {G}\left( Z, C \right) \), hence in particular into \(\mathcal {P}\left( Z,C \right) \), if, and only if, \(z^* \in C^+\);

  4. (d)

    \(S_{\left( x^*, z^* \right) }\left( x \right) \) is a closed half space with normal \(z^*\) if, and only if, \(z^* \ne 0\); and \(S_{\left( x^*, 0 \right) }\left( x \right) \in \left\{ Z, \emptyset \right\} \);

  5. (e)

    if \(\widehat{z} \in Z\) such that \(z^*\left( \widehat{z} \right) = 1\) then

    $$\begin{aligned} \forall x \in X :S_{\left( x^*, z^* \right) }\left( x \right) = x^*\left( x \right) \widehat{z} + S_{\left( x^*, z^* \right) }\left( 0 \right) = x^*\left( x \right) \widehat{z} + H^+(z^*). \end{aligned}$$
    (5.1)

Proof

Elementary, see, for instance, [78].   \(\square \)
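
The “conlinearity” of \(S_{\left( x^*, z^* \right) }\) is easy to observe computationally. The Python sketch below uses an encoding of ours: since \(S_{\left( x^*, z^* \right) }\left( x \right) \) is a half space with fixed normal \(z^*\), it can be stored as the single offset \(x^*(x)\), and Minkowski addition of half spaces with a common normal adds the offsets.

```python
# A computational view of Proposition 5.1 (offset encoding).
import numpy as np

x_star, z_star = np.array([2.0, -1.0]), np.array([1.0, 1.0])
S = lambda x: x_star @ x  # offset of S_{(x*,z*)}(x) = {z : z*(z) >= x*(x)}

x1, x2 = np.array([0.3, 0.7]), np.array([-1.0, 4.0])
print(np.isclose(S(x1 + x2), S(x1) + S(x2)))  # True: additivity
print(np.isclose(S(2.5 * x1), 2.5 * S(x1)))   # True: positive homogeneity
# S(x) + S(-x) has offset 0, i.e. it equals H^+(z*), not {0}: this is
# where S_{(x*,z*)} genuinely differs from a linear operator.
```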

A function of the type \(S_{\left( x^*, z^* \right) }\) is called conlinear. It will turn out that convex analysis is a “conlinear” theory: not because convex functions are not linear, but because the image space of a convex function is a conlinear space, and all properties of linear functions necessary for the theory are only the “conlinear” ones from the previous proposition. The following result gives a characterization of the class of conlinear functions in the class of all positively homogeneous and additive set-valued functions.

Theorem 5.2

Let \(f :X \rightarrow \mathcal {G}(Z, C)\) be a function. Then, the following are equivalent:

  1. (a)

    \(\exists \left( x^*, z^* \right) \in X^* \times C^+\backslash \{0\}\), \(\forall x \in X\): \(f(x) = S_{(x^*, z^*)}(x)\).

  2. (b)

    \(\mathrm{{graph \,}}f\) is a closed homogeneous half-space of \(X \times Z\) and \(f(0) \ne Z\).

  3. (c)

    f is positively homogeneous, additive, lattice-l.s.c. at \(0 \in X\) and \(f(0) \subseteq Z\) is a non-trivial, closed homogeneous half-space.

Proof

(a) \(\Rightarrow \) (b), (c): Straightforward.

(b) \(\Rightarrow \) (a): \(\mathrm{{graph \,}}f\) is a closed homogeneous half-space if, and only if,

$$ \exists \left( x^*, z^* \right) \in X^* \times Z^*\backslash \{(0,0)\} :\mathrm{{graph \,}}f = \left\{ \left( x,z \right) \in X \times Z \mid x^*(x) - z^*(z) \le 0 \right\} . $$

This implies

$$ \forall x \in X :f(x) = \left\{ z \in Z \mid x^*(x) \le z^*(z) \right\} = S_{(x^*, z^*)}(x). $$

Since \(f(0) \ne Z\) and f maps into \(\mathcal {G}(Z, C)\), \(z^* \in C^+\backslash \{0\}\). By Proposition 5.1 (b), f is additive.

(c) \(\Rightarrow \) (a): By assumption, \(f\left( 0 \right) = H^+(z^*_0) = \left\{ z \in Z \mid z^*_0\left( z \right) \ge 0 \right\} \) for some \(z^*_0 \in C^+\backslash \{0\}\). By additivity, \(f\left( 0 \right) = H^+(z^*_0) = f\left( x \right) \oplus f\left( -x \right) \) for all \(x \in X\), hence \(f\left( x \right) \) is neither \(\emptyset \) nor Z. Moreover, additivity implies \(f\left( x \right) = f\left( x + 0 \right) = f\left( x \right) \oplus f\left( 0 \right) = f\left( x \right) \oplus H^+(z^*_0)\) for each \(x \in X\). This means that every value \(f\left( x \right) \) is a closed half space with normal \(z^*_0\).

Next, we use (4.17) which reads

$$ \forall x \in X :f\left( x \right) = \bigcap _{z^* \in C^+\backslash \{0\}} \left\{ z \in Z \mid \varphi _{f, z^*}\left( x \right) \le z^*(z) \right\} . $$

Since every value \(f\left( x \right) \) is a half space with normal \(z^*_0\) the intersection in the above formula can be replaced just by \( \left\{ z \in Z \mid \varphi _{f, z^*_0}\left( x \right) \le z^*_0(z) \right\} \).

We shall show that \(\varphi _{f, z^*_0}\) is linear. By Lemma 4.20 (b) and (c) it is additive because f is additive, and \(\varphi _{f, z^*_0}(tx)=t\varphi _{f, z^*_0}(x)\) for \(t\ge 0\), so it remains to show this for \(t<0\) in order to prove homogeneity. Indeed,

$$ 0 = \varphi _{f, z^*_0}\left( 0 \right) = \inf _{z \in f\left( x \right) \oplus f\left( -x \right) }z^*_0(z) = \inf _{z_1 \in f\left( x \right) } z^*_0(z_1) + \inf _{z_2 \in f\left( -x \right) }z^*_0(z_2) = \varphi _{f, z^*_0}\left( x \right) + \varphi _{f, z^*_0}\left( -x \right) , $$

which gives us

$$ \forall t<0 :\varphi _{f, z^*_0}(tx)=\varphi _{f, z^*_0}(-|t|x)=\left| t \right| \varphi _{f, z^*_0}(-x)=-\left| t \right| \varphi _{f, z^*_0}(x)=t\varphi _{f, z^*_0}(x). $$

Therefore, \(\varphi _{f, z^*_0}\) is a linear function and can be identified with some \(x' \in X'\), the algebraic dual of X. Since f is lattice-l.s.c. at \(0 \in X\) and every value of f is a half space with normal \(z^*_0\), Corollary 4.21 with \(\bar{x} = 0\) yields

$$ \liminf _{x \rightarrow 0}f(x) = \left\{ z \in Z \mid \liminf _{x \rightarrow 0}x'(x) \le z^*_0(z) \right\} . $$

If \(x'\) is not continuous, then it is unbounded from below on every neighborhood \(U \in \mathcal N_X\). Thus,

$$ \forall U \in \mathcal N_X :\inf _{x \in U} x'(x) = -\infty , $$

hence

$$ \liminf _{x \rightarrow 0}x'(x) = \sup _{U \in \mathcal N_X} \inf _{x \in U}x'(x) = -\infty $$

and consequently \(Z = \liminf _{x \rightarrow 0}f(x)\) which contradicts \(f(0) = H^+(z^*_0) \supseteq \liminf _{x \rightarrow 0}f(x)\). Hence, there is \(x^* \in X^*\) such that \(x^*(x) = \varphi _{f, z^*_0}\left( x \right) \) for all \(x \in X\).    \(\square \)

The basic idea for the development of a set-valued convex analysis simply is as follows: Replace the extended reals by \(\mathcal {G}(Z, C)\), \(\le \) by \(\supseteq \), use the inf/sup-formulas from Proposition 2.2, replace continuous linear functionals by conlinear functions and the difference by inf-residuation. We start the program with conjugates.

5.2 Fenchel Conjugates of Set-Valued Functions

A crucial observation concerning Fenchel conjugates for extended real-valued functions \(\varphi :X \rightarrow \mathrm {I\negthinspace R}\cup \left\{ \pm \infty \right\} \) is as follows:

$$ r \ge \varphi ^*\left( x^* \right) \quad \Leftrightarrow \quad \forall x \in X :x^*\left( x \right) - r \le \varphi \left( x \right) . $$

This means that \(x^*\) belongs to the domain of \(\varphi ^*\) precisely if there is an affine minorant of \(\varphi \) with “slope” \(x^*\). Replacing \(x^*\) by \(S_{\left( x^*, z^* \right) }\), \(\le \) by \(\supseteq \) and recalling (2.5) we obtain

$$\begin{aligned} \forall x \in X :S_{\left( x^*, z^* \right) }\left( x \right) - z \supseteq f\left( x \right) \quad&\Leftrightarrow \quad \forall x \in X :f\left( x \right) + z \subseteq S_{\left( x^*, z^* \right) }\left( x \right) \\&\Leftrightarrow \quad \forall x \in X :z \in S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) \\&\Leftrightarrow \quad z \in \bigcap _{x \in X} \left\{ S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) \right\} . \end{aligned}$$

The function \(x \mapsto S_{\left( x^*, z^* \right) }\left( x \right) - z\) is called an affine minorant of f precisely if the above (equivalent) conditions are satisfied. This discussion may justify the following definition.

Definition 5.3

The Fenchel conjugate of the function \(f :X \rightarrow \mathcal P\left( Z, C \right) \) is \(f^* : X^* \times C^+\backslash \{0\} \rightarrow \mathcal P\left( Z, C \right) \) defined by

$$ f^*\left( x^*, z^* \right) = \sup _{x \in X} \left\{ S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) \right\} = \bigcap _{x \in X} \left\{ S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) \right\} . $$

The biconjugate of f is \(f^{**} :X \rightarrow \mathcal P\left( Z, C \right) \) defined by

$$\begin{aligned} f^{**}\left( x \right)&= \sup _{x^* \in X^*, \, z^* \in C^+\backslash \{0\}} \left\{ S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f^*\left( x^*, z^* \right) \right\} \\&= \bigcap _{x^* \in X^*, \, z^* \in C^+\backslash \{0\}}\left( S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f^*\left( x^*, z^* \right) \right) . \end{aligned}$$

The Fenchel conjugate defined above shares most properties with her scalar little sister.

Proposition 5.4

Let \(f, g :X \rightarrow \mathcal P\left( Z, C \right) \) be two functions. Then

  1. (a)

    \(f \supseteq g\) \(\Rightarrow \) \(g^* \supseteq f^*\).

  2. (b)

    \(f^*\) maps into \(\mathcal {G}\left( Z, C \right) \), and each value of \(f^*\) is a closed half space with normal \(z^*\), or \(\emptyset \), or Z.

  3. (c)

    \(f^{**} \supseteq f\) and \(f^{**}\) is a proper closed convex function into \(\mathcal {G}\left( Z, C \right) \), or \(\equiv Z\), or \(\equiv \emptyset \).

  4. (d)

    \(\left( f^{**} \right) ^* = f^*\).

  5. (e)

    For all \(x \in X\), \(x^* \in X^*\), \(z^* \in C^+\backslash \{0\}\),

    $$ f^*\left( x^*, z^* \right) \subseteq S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) \; \Leftrightarrow \; f^*\left( x^*, z^* \right) + f\left( x \right) \subseteq S_{\left( x^*, z^* \right) }\left( x \right) . $$

Proof

The equivalence in (e) follows from the definition of \({-^{\negmedspace \centerdot \,}}\). The other relationships can be found in [78, 207, 208].   \(\square \)

Remark 5.5

In [78], the “negative conjugate”

$$ (-f^*)\left( x^*, z^* \right) = \inf _{x \in X} \left\{ f\left( x \right) \oplus S_{\left( x^*, z^* \right) }\left( -x \right) \right\} = \mathrm{{cl \,}}\bigcup _{x \in X} \left\{ f\left( x \right) \oplus S_{\left( x^*, z^* \right) }\left( -x \right) \right\} $$

has been introduced which avoids the residuation. The transition from \(f^*\) to \(-f^*\) and vice versa can be done via

$$ (-f^*)\left( x^*, z^* \right) = H^+\left( z^* \right) {-^{\negmedspace \centerdot \,}}f^*\left( x^*, z^* \right) , \quad f^*\left( x^*, z^* \right) = H^+\left( z^* \right) {-^{\negmedspace \centerdot \,}}(-f^*)\left( x^*, z^* \right) $$

using Proposition 4.17. Sometimes, it even seems to be more natural to work with \(-f^*\), for example, when it comes to Fenchel-Rockafellar duality results as presented in [79].

Set-valued conjugates can be expressed using the (scalar) conjugates of the scalarizing functions.

Lemma 5.6

If \(f :X \rightarrow \mathcal P\left( Z, C \right) \), then

$$\begin{aligned} \forall x^* \in X^*, \; \forall z^* \in C^+\backslash \negthinspace \left\{ 0 \right\}&:f^*\left( x^*, z^* \right) = \left\{ z \in Z \mid \left( \varphi _{f, z^*} \right) ^*\left( x^* \right) \le z^*(z) \right\} , \end{aligned}$$
(5.2)
$$\begin{aligned} \forall x \in X&:f^{**}\left( x \right) = \bigcap _{z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} }\negthinspace \left\{ z \in Z \mid \left( \varphi _{f, z^*} \right) ^{**}\left( x \right) \le z^*(z) \right\} . \end{aligned}$$
(5.3)

Proof

The first formula is a consequence of the definitions, the second follows from \(\left( \varphi _{f, z^*} \right) ^{**} = \left( \varphi _{f^{**}, z^*} \right) ^{**}\) and Theorem 4.22.   \(\square \)
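
A toy instance of (5.2) can be computed directly. In the Python sketch below, the function f and all names are our own choices: for \(f(x) = \left\{ \left( x^2, 0 \right) ^T \right\} + \mathrm {I\negthinspace R}^2_+\) one gets \(\varphi _{f,z^*}(x) = z^*_1 x^2\), hence \(\left( \varphi _{f,z^*} \right) ^*\left( x^* \right) = (x^*)^2/(4z^*_1)\), and \(f^*\left( x^*, z^* \right) \) is the corresponding half space.

```python
# A numerical instance of (5.2): scalar conjugate vs. closed form.
import numpy as np

z_star, x_star = np.array([2.0, 1.0]), 3.0
phi = lambda x: z_star[0] * x ** 2  # phi_{f,z*}(x) = z*_1 x^2

xs = np.linspace(-50.0, 50.0, 200_001)
conj_num = np.max(x_star * xs - phi(xs))   # sup_x { x*(x) - phi(x) }
conj_cf = x_star ** 2 / (4.0 * z_star[0])  # closed-form conjugate
print(conj_num, conj_cf)                   # both 1.125
# hence f*(3, (2,1)^T) = {z in R^2 : 1.125 <= 2 z_1 + z_2},
# a half space with normal z* as predicted by Proposition 5.4 (b).
```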

Remark 5.7

Conversely, \(\varphi _{f^*\left( \cdot , z^* \right) , z^*} = \left( \varphi _{f, z^*} \right) ^*\) is true (see [208, Proposition 4.2] and [86, Lemma 5.1]). On the other hand, \(\varphi _{f^{**}, z^*}\) does not always coincide with \(\left( \varphi _{f, z^*} \right) ^{**}\) since the latter is a closed function, which is not true for the former even if f is proper closed convex (see the example before Lemma 4.20).

The following result is a set-valued version of the famous Fenchel-Moreau theorem. Note that the additional dual variable \(z^*\) disappears via the definition of the biconjugate.

Theorem 5.8

Let \(f :X \rightarrow \mathcal P\left( Z, C \right) \) be a function. Then \(f = f^{**}\) if, and only if, f is proper closed and convex, or identically Z, or identically \(\emptyset \).

Proof

This follows from Theorem 4.22, Lemma 5.6 and the classical Fenchel-Moreau theorem for scalar functions, see, for example, [235, Theorem 2.3.3].   \(\square \)

Remark 5.9

Another, more direct way to prove Theorem 5.8 consists in applying the basic convex duality relationship ‘every closed convex set is the intersection of closed half spaces containing it’ to the graph of f (such half spaces are generated by pairs \((x^*, z^*) \in X^* \times C^+\)), making sure that one can do without \(z^* = 0\) and converting the result into formulas involving the \(S_{\left( x^*, z^* \right) }\)-functions. In this way, the scalar Fenchel-Moreau theorem is obtained as a special case. See [78] for details.

To conclude this section, we point out that the Fenchel conjugate does not distinguish between a function \(f :X \rightarrow \mathcal P\left( Z, C \right) \) and the function

$$ \tilde{f}\left( x \right) = \mathrm{{cl \,}}\mathrm{{co \,}}f\left( x \right) ; $$

we have \(\tilde{f}^* = f^*\) since (compare Proposition 2.9)

$$\begin{aligned}&\forall x \in X :S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) = \left\{ z \in Z \mid f\left( x \right) + z \subseteq S_{\left( x^*, z^* \right) }\left( x \right) \right\} \\&\qquad \qquad \qquad \qquad \, = \left\{ z \in Z \mid \mathrm{{cl \,}}\mathrm{{co \,}}f\left( x \right) + z \subseteq S_{\left( x^*, z^* \right) }\left( x \right) \right\} = S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}\tilde{f}\left( x \right) . \end{aligned}$$

The function \(\tilde{f}\) maps into \(\mathcal {G}\left( Z, C \right) \). The above relationship means that, when it comes to Fenchel conjugates, it makes no difference whether one starts with a \(\mathcal {G}\left( Z, C \right) \)-valued function.

Under additional assumptions, the formulas for (bi)conjugates can be simplified. One such assumption is as follows: There is an element \(\hat{z} \in C\backslash \left\{ 0 \right\} \) such that

$$ \forall z^* \in C^+\backslash \{0\} :z^*\left( \hat{z} \right) > 0. $$

In this case, the set \(B^+(\hat{z}) = \left\{ z^* \in C^+ \mid z^*\left( \hat{z} \right) = 1 \right\} \) is a base of \(C^+\) with \(0 \not \in \mathrm{{cl \,}}B^+(\hat{z})\). That is, for each \(z^* \in C^+\backslash \{0\}\) there is a unique representation \(z^* = tz^*_0\) with \(t>0\) and \(z^*_0 \in B^+(\hat{z}) \). Compare [67], Definition 2.1.14, Theorems 2.1.15 and 2.2.12 applied to \(C^+\) instead of C. Clearly, a pointed closed convex cone with non-empty interior has a base; the cone \(L^2_+\), on the other hand, has an empty interior, yet a base is generated by the constant 1 function.
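
For example (with illustrative data), if \(Z = \mathrm {I\negthinspace R}^2\) and \(C = \mathrm {I\negthinspace R}^2_+\), then \(C^+ = \mathrm {I\negthinspace R}^2_+\), and the choice \(\hat{z} = (1,1)^T\) yields

$$ B^+(\hat{z}) = \left\{ z^* \in \mathrm {I\negthinspace R}^2_+ \mid z^*_1 + z^*_2 = 1 \right\} , $$

the standard simplex; every \(z^* \in C^+\backslash \{0\}\) is a unique positive multiple of an element of \(B^+(\hat{z})\).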

The very definition of the functions \(S_{\left( x^*, z^* \right) }\) gives

$$ \left\{ S_{\left( x^*, z^* \right) } \mid x^* \in X^*, \; z^* \in C^+\backslash \{0\} \right\} = \left\{ S_{\left( x^*, z^* \right) } \mid x^* \in X^*, \; z^* \in B^+(\hat{z}) \right\} . $$

Therefore, it is sufficient to run an intersection like in the definition of \(f^{**}\) over \(x^* \in X^*\) and \(z^* \in B^+(\hat{z})\). Moreover, one easily checks (see also Proposition 5.1 (e)) for \(z^* \in B^+(\hat{z})\)

$$ \forall x \in X :S_{\left( x^*, z^* \right) }\left( x \right) = \left\{ x^*\left( x \right) \hat{z} \right\} + H^+(z^*). $$

Thus, the negative conjugate of a function \(f :X \rightarrow \mathcal {P}\left( Z, C \right) \) can be written as

$$ (-f^*)\left( x^*, z^* \right) = \mathrm{{cl \,}}\bigcup _{x \in X} \left[ f\left( x \right) - x^*\left( x \right) \hat{z} + H^+(z^*) \right] = \mathrm{{cl \,}}\bigcup _{x \in X} \left[ f\left( x \right) - x^*\left( x \right) \hat{z} \right] \oplus H^+(z^*). $$

The part which does not depend on \(z^*\) (remember \(\hat{z}\) defines a base of \(C^+\) and is the same for all \(z^* \in C^+\backslash \{0\}\)) has been used in [150, 155] for the definition of another set-valued conjugate, namely

$$ (-f^*_{\hat{z}})\left( x^* \right) = \mathrm{{cl \,}}\bigcup _{x \in X} \left[ f\left( x \right) - x^*\left( x \right) \hat{z} \right] . $$

In particular, if \(Z = \mathrm {I\negthinspace R}\), \(C = \mathrm {I\negthinspace R}_+\), then \(C^+ = \mathrm {I\negthinspace R}_+\), and \( \left\{ 1 \right\} \) is a base of \(C^+\), thus the intersection over the \(z^*\)’s disappears from the definition of \(f^{**}\) and formulas like (5.3).
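
In this case, the set-valued conjugate collapses to the classical one: writing \(f\left( x \right) = [\varphi \left( x \right) , +\infty )\) for a proper function \(\varphi :X \rightarrow \mathrm {I\negthinspace R}\cup \left\{ +\infty \right\} \) and taking \(z^* = 1\), one has \(S_{\left( x^*, 1 \right) }\left( x \right) = [x^*\left( x \right) , +\infty )\), hence for \(x \in \mathrm{{dom \,}}\varphi \)

$$ S_{\left( x^*, 1 \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) = [x^*\left( x \right) - \varphi \left( x \right) , +\infty ) \quad \text {and} \quad f^*\left( x^*, 1 \right) = [\varphi ^*\left( x^* \right) , +\infty ), $$

so the classical Fenchel conjugate \(\varphi ^*\) reappears as the lower endpoint of an upper closed interval (points \(x \not \in \mathrm{{dom \,}}\varphi \) contribute the value \(\mathrm {I\negthinspace R}\) and do not affect the intersection).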

5.3 Directional Derivatives

Usually, derivatives for set-valued functions are defined at points of their graphs as for example in [4, Chap. 5] and [114, Chap. 5]. Here, we use the inf-residuation in order to define a “difference quotient” (which could be called “residuation quotient”) and take “lattice limits.” This leads to the concept of a lower Dini directional derivative for \(\mathcal {G}(Z, C)\)-valued functions as introduced in [36].

Definition 5.10

The lower Dini directional derivative of a function \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) with respect to \(z^*\in C^+\backslash \{0\}\) at \(\bar{x} \in X\) in direction \(x \in X\) is defined to be

$$\begin{aligned} f_{z^*}'\left( \bar{x}, x \right)&= \liminf _{t \downarrow 0} \frac{1}{t} \left[ \left( f\left( \bar{x} + tx \right) \oplus H^+(z^*) \right) {-^{\negmedspace \centerdot \,}}f\left( \bar{x} \right) \right] \\&= \bigcap _{s>0} \mathrm{{cl \,}}\bigcup _{0 < t < s} \frac{1}{t} \left[ \left( f\left( \bar{x} + tx \right) \oplus H^+(z^*) \right) {-^{\negmedspace \centerdot \,}}f\left( \bar{x} \right) \right] . \end{aligned}$$

Obviously, \(f_{z^*}' = f_{tz^*}'\) for \(t>0\). Hence, if \(C^+\) has a base, one only gets “as many” directional derivatives as there are elements in the base.

One may ask why the set \(H^+(z^*)\) appears in the definition of the difference quotient. The reason is that frequently the sets \(f(\bar{x} +tx) {-^{\negmedspace \centerdot \,}}f(\bar{x})\) and also corresponding “lattice limits” are empty.

Example 5.11

Let \(X = \mathrm {I\negthinspace R}\), \(Z = \mathrm {I\negthinspace R}^2\), \(C = \left\{ \left( 0, 1 \right) ^Ts \mid s \ge 0 \right\} \) and the function \(f :X \rightarrow \mathcal {G}(Z, C)\) be defined by

$$ f(x) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} [-x, x] \times \mathrm {I\negthinspace R}_+ &{} : &{} x \in [0, 1] \\ \emptyset &{} : &{} \text {otherwise} \end{array} \right. . $$

Then, f is convex and \(f(1) = \inf _{x \in X} f(x) \ne Z\). However, \(f(1 + tx) {-^{\negmedspace \centerdot \,}}f(1) = \emptyset \) whenever \(x<0\) and \(t < -\frac{1}{x}\), or \(x>0\) and \(t>0\). This means that the directional derivative of f at \(\bar{x} = 1\) (defined without \(H^+(z^*)\)) would be identically \(\emptyset \). On the other hand, \(f_{z^*}'\left( 1, x \right) \) is never empty for \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) and provides much better information about the local behavior of f at \(\bar{x} = 1\).

For scalar functions, the standard definition of the lower Dini directional derivative can be adapted.

Definition 5.12

The lower Dini directional derivative of a function \(\varphi :X \rightarrow \overline{\mathrm {I\negthinspace R}}\) at \(\bar{x}\) in direction x is

$$\begin{aligned} \varphi ^\downarrow (\bar{x}, x)&=\liminf \limits _{t\downarrow 0}\frac{1}{t} \left[ \varphi (\bar{x} + t x) {-^{\negmedspace \centerdot \,}}\varphi (\bar{x}) \right] . \end{aligned}$$
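
For instance, for \(X = \mathrm {I\negthinspace R}\) and \(\varphi \left( x \right) = \left| x \right| \) (an illustrative choice), all values involved are finite, the inf-residuation reduces to the ordinary difference, and

$$ \varphi ^\downarrow \left( 0, x \right) = \liminf _{t \downarrow 0} \frac{1}{t}\left[ \left| tx \right| - \left| 0 \right| \right] = \left| x \right| . $$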

In Definition 5.12, it is neither assumed that \(\bar{x} \in \mathrm{{dom \,}}\varphi \) nor that \(\varphi \) is a proper function. This is possible since the difference operation is replaced by the inf-residuation. For \(\mathcal {G}\left( Z, C \right) \)-valued functions, the lower Dini directional derivative can be expressed by corresponding derivatives of scalarizations.

Proposition 5.13

For all \(\bar{x} \in X\) and all \(x \in X\),

$$\begin{aligned} f^\downarrow _{z^*}(\bar{x}, x)&= \left\{ z \in Z \mid \varphi _{f, z^*}^\downarrow (\bar{x}, x) \le -z^*\left( z \right) \right\} \end{aligned}$$
(5.4)
$$\begin{aligned} \varphi _{f,z^*}^\downarrow (\bar{x}, x)&= \varphi _{f^\downarrow _{z^*}\left( \bar{x}, \cdot \right) , z^*}\left( x \right) . \end{aligned}$$
(5.5)

Proof

See [36, Proposition 3.4].   \(\square \)

The next result is familiar in the scalar case for proper functions, see [235, Theorem 2.1.14].

Lemma 5.14

Let \(f :X \rightarrow \mathcal {G}\left( Z,C \right) \) be convex, \(\bar{x} \in X\) and \(z^* \in C^+\backslash \{0\}\). Then

$$\begin{aligned} \forall x \in X :f'_{z^*}\left( \bar{x}, x \right) = \inf _{t>0}\frac{1}{t} \left[ \left( f\left( \bar{x} + tx \right) \oplus H^+(z^*) \right) {-^{\negmedspace \centerdot \,}}f\left( \bar{x} \right) \right] , \end{aligned}$$
(5.6)

and the function

$$ x \mapsto f'_{z^*}\left( \bar{x}, x \right) $$

is sublinear as a function from X into \(\mathcal {G}\left( Z,C \right) \). If \(\bar{x} \in \mathrm{{dom \,}}f\), then \(\mathrm{{dom \,}}f'_{z^*}\left( \bar{x}, \cdot \right) = \mathrm{{cone\,}}\left( \mathrm{{dom \,}}f - \bar{x} \right) \). Moreover,

$$ f'_{z^*}\left( \bar{x}, 0 \right) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} H^+(z^*) &{} : &{} f\left( \bar{x} \right) \oplus H^+(z^*) \not \in \left\{ Z, \emptyset \right\} \\ Z &{} : &{} f\left( \bar{x} \right) \oplus H^+(z^*) \in \left\{ Z, \emptyset \right\} \end{array} \right. . $$

Proof

It relies on the monotonicity of the “residuation quotient”

$$ \frac{1}{t} \left[ \left( f\left( \bar{x} + tx \right) \oplus H^+(z^*) \right) {-^{\negmedspace \centerdot \,}}f\left( \bar{x} \right) \right] $$

which in turn is proven using a calculus for the inf-residuation and the convexity of f. For details, compare [90].   \(\square \)
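
In the scalar case, formula (5.6) mirrors the well-known monotonicity of the difference quotient of a convex function: for a proper convex \(\varphi :X \rightarrow \overline{\mathrm {I\negthinspace R}}\) and \(\bar{x} \in \mathrm{{dom \,}}\varphi \) one has

$$ \varphi ^\downarrow \left( \bar{x}, x \right) = \inf _{t>0}\frac{1}{t} \left[ \varphi \left( \bar{x} + tx \right) {-^{\negmedspace \centerdot \,}}\varphi \left( \bar{x} \right) \right] , $$

so the liminf in Definition 5.12 is actually an infimum; compare [235, Theorem 2.1.14].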

The following result tells us when the directional derivative has only “finite” values. As usual, we denote by \(\mathrm{{core \,}}M\) the algebraic interior of a set \(M \subseteq X\).

Theorem 5.15

Let \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) be convex and \(\bar{x} \in \mathrm{{core \,}}\left( \mathrm{{dom \,}}f \right) \). If f is proper, then there exists \(z^* \in C^+\backslash \{0\}\) such that \(f'_{z^*}\left( \bar{x}, x \right) \not \in \left\{ Z, \emptyset \right\} \) for all \(x \in X\).

Proof

See [90].   \(\square \)

5.4 The Subdifferential

For convex functions, we define elements of the subdifferential using conlinear minorants of the sublinear directional derivative.

Definition 5.16

Let \(f :X \rightarrow \mathcal {G}\left( Z,C \right) \) be convex, \(\bar{x} \in X\) and \(z^* \in C^+\backslash \{0\}\). The set

$$ \partial f_{z^*}\left( \bar{x} \right) = \left\{ x^* \in X^* \mid \forall x \in X :S_{\left( x^*, z^* \right) }\left( x \right) \supseteq f_{z^*}'\left( \bar{x}, x \right) \right\} $$

is called the \(z^*\)-subdifferential of f at \(\bar{x}\).

Again, the basic idea is to replace a continuous linear functional \(x^*\) by \(S_{\left( x^*, z^* \right) }\). An alternative characterization of the subdifferential is provided in the following result.

Proposition 5.17

Let \(f :X \rightarrow \mathcal {G}\left( Z,C \right) \) be convex and \(\bar{x} \in X\). The following statements are equivalent for \(x^* \in X^*\), \(z^* \in C^+\backslash \{0\}\):

  1. (a)

    \(\forall x \in X\): \(S_{\left( x^*, z^* \right) }\left( x \right) \supseteq f'_{z^*}\left( \bar{x}, x \right) \),

  2. (b)

    \(\forall x \in X\): \(S_{\left( x^*, z^* \right) }\left( x - \bar{x} \right) \supseteq \left( f\left( x \right) \oplus H^+(z^*) \right) {-^{\negmedspace \centerdot \,}}f\left( \bar{x} \right) \).

  3. (c)

    \(x^* \in \partial \varphi _{f, z^*}(\bar{x})\).

Proof

See [90].   \(\square \)
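
As a consistency check (with illustrative data): let \(Z = \mathrm {I\negthinspace R}\), \(C = \mathrm {I\negthinspace R}_+\), \(z^* = 1\) and \(f\left( x \right) = [\varphi \left( x \right) , +\infty )\) for a proper convex function \(\varphi \) with \(\varphi \left( \bar{x} \right) \in \mathrm {I\negthinspace R}\). For \(x \in \mathrm{{dom \,}}\varphi \), condition (b) becomes

$$ [x^*\left( x - \bar{x} \right) , +\infty ) \supseteq [\varphi \left( x \right) - \varphi \left( \bar{x} \right) , +\infty ) \quad \Leftrightarrow \quad x^*\left( x - \bar{x} \right) \le \varphi \left( x \right) - \varphi \left( \bar{x} \right) , $$

the classical subgradient inequality; hence \(\partial f_{1}\left( \bar{x} \right) = \partial \varphi \left( \bar{x} \right) \) in this case.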

Some extra care is necessary for defining the subdifferential \(\partial \varphi _{f, z^*}\) of the extended real-valued function \(\varphi _{f, z^*}\) in the previous proposition since it is not necessarily proper. The reader may compare [89, 90]. Condition (c) opens the path to a subdifferential calculus: With some effort, one can transform the subdifferential rules from the scalar to the set-valued case, obtaining corresponding “\(z^*\)-wise” rules, see [207].

Under some “regularity”, the directional derivative can be reconstructed from the subdifferential. This result is known as the max-formula. Here is a set-valued version.

Theorem 5.18

Let \(f :X \rightarrow \mathcal {G}\left( Z,C \right) \) be a convex function, \(\bar{x} \in \mathrm{{dom \,}}f\) and \(z^* \in C^+\backslash \{0\}\) such that the function \(x \mapsto f\left( x \right) \oplus H^+(z^*)\) is proper and the function \(\varphi _{f, z^*} :X \rightarrow \mathrm {I\negthinspace R}\cup \left\{ +\infty \right\} \) is upper semi-continuous at \(\bar{x}\). Then \(\partial f_{z^*}\left( \bar{x} \right) \ne \emptyset \) and it holds

$$\begin{aligned} \forall x \in X :f'_{z^*}\left( \bar{x}, x \right) = \bigcap _{x^* \in \partial f_{z^*}\left( \bar{x} \right) } S_{\left( x^*, z^* \right) }\left( x \right) . \end{aligned}$$
(5.7)

Moreover, for each \(x \in X\) there exists \(\bar{x}^* \in \partial f_{z^*}\left( \bar{x} \right) \) such that

$$\begin{aligned} f'_{z^*}\left( \bar{x}, x \right) = S_{\left( \bar{x}^*, z^* \right) }\left( x \right) . \end{aligned}$$
(5.8)

Proof

See [90].   \(\square \)

Next, we link the subdifferential and the Fenchel conjugate.

Proposition 5.19

Let \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) be convex, \(\bar{x} \in X\), \(\mathrm{{dom \,}}f \ne \emptyset \) and \(f\left( \bar{x} \right) \oplus H^+(z^*) \ne Z\). Then, the following statements are equivalent for \(x^* \in X^*\), \(z^* \in C^+\backslash \{0\}\):

  1. (a)

    \(x^* \in \partial f_{z^*}\left( \bar{x} \right) \),

  2. (b)

    \(\forall x \in X\): \(S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) \supseteq S_{\left( x^*, z^* \right) }\left( \bar{x} \right) {-^{\negmedspace \centerdot \,}}f\left( \bar{x} \right) \).

Proof

See [90].   \(\square \)

This result basically says that \(x^* \in \partial f_{z^*}\left( \bar{x} \right) \) if the supremum in the definition of the conjugate is attained at \(\bar{x}\), since from the Young-Fenchel inequality we have

$$ S_{\left( x^*, z^* \right) }\left( \bar{x} \right) {-^{\negmedspace \centerdot \,}}f\left( \bar{x} \right) \supseteq f^*\left( x^*, z^* \right) $$

whereas (b) above produces

$$ f^*\left( x^*, z^* \right) = \bigcap _{x \in X} \left\{ S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) \right\} \supseteq S_{\left( x^*, z^* \right) }\left( \bar{x} \right) {-^{\negmedspace \centerdot \,}}f\left( \bar{x} \right) . $$

This means: In the sense of Definition 3.3 adapted to maximization, the set \( \left\{ \bar{x} \right\} \) is a solution of the problem

$$ \text {maximize} \quad S_{\left( x^*, z^* \right) }\left( x \right) {-^{\negmedspace \centerdot \,}}f\left( x \right) \quad \text {over} \quad x \in X. $$

Finally, we want to describe the set of points satisfying the condition \(0 \in \partial f_{z^*}\left( \bar{x} \right) \).

Proposition 5.20

Let \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) be convex, \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) and \(\bar{x} \in \mathrm{{dom \,}}f\) such that \(f(\bar{x}) \oplus H^+(z^*) \ne Z\). Then, the following statements are equivalent:

  1. (a)

    \(H^+(z^*) \supseteq f'_{z^*}\left( \bar{x}, x \right) \) for all \(x \in X\),

  2. (b)

    \(0 \in \partial f_{z^*}\left( \bar{x} \right) \),

  3. (c)

    \(f(\bar{x}) \oplus H^+(z^*) = \left[ \inf _{x \in X}f(x) \right] \oplus H^+(z^*)\),

  4. (d)

    \(\varphi _{f,z^*}(\bar{x}) \le \varphi _{f,z^*}(x)\) for all \(x \in X\).

Proof

This is immediate from the previous results.   \(\square \)

We will call an \(\bar{x} \in X\) for which there is \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) satisfying (c) in Proposition 5.20 a \(C^+\)-minimizer of problem (P) in Definition 3.3. The question arises whether there is a (full) solution of (P) consisting of \(C^+\)-minimizers and how such a solution can be characterized.

We conclude this section by noting that a calculus for the \(z^*\)-subdifferential can be derived from corresponding calculus rules for extended real-valued convex functions. The additional feature in the set-valued case is the dependence of \(\partial f_{z^*}\left( \bar{x} \right) \) on \(z^*\), i.e. properties of the mapping \(z^* \mapsto \partial f_{z^*}\left( \bar{x} \right) \). It turns out that this is an adjoint process type relationship as pointed out in [90].

5.5 A Case Study: Set-Valued Translative Functions

Let X, Z be two topological linear spaces and \(T :Z \rightarrow X\) an injective continuous linear operator. A function \(f :X \rightarrow \mathcal P\left( Z, C \right) \) is called translative with respect to T (or just T -translative) if

$$ \forall x \in X, \; \forall z \in Z :f\left( x + Tz \right) = f\left( x \right) + \left\{ z \right\} . $$

A special case of interest will be \(Z = \mathrm {I\negthinspace R}^m\), \( \left\{ h^1, \ldots , h^m \right\} \subseteq X\) a set of m linearly independent elements and \(T :\mathrm {I\negthinspace R}^m \rightarrow X\) defined by \(Tz = \sum _{k =1}^m z_kh^k\). This construction is very close to (and motivated by) set-valued risk measures as shown below.

It is an easy exercise to show that a T-translative function f can be represented as follows:

$$\begin{aligned} \forall x \in X :f\left( x \right) = \left\{ z \in Z \mid x - Tz \in A_f \right\} \end{aligned}$$
(5.9)

where \(A_f = \left\{ x \in X \mid 0 \in f\left( x \right) \right\} \) is the zero sublevel set of f. This set satisfies

$$ \forall z \in C :A_f - Tz \subseteq A_f $$

since f maps into \(\mathcal P\left( Z, C \right) \). The latter property is called \(\left( T, C \right) \)-translativity of \(A_f\).
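
A toy example may illustrate (5.9); the data are chosen only for illustration and do not come from the references: let \(X = \mathrm {I\negthinspace R}^2\), \(Z = \mathrm {I\negthinspace R}\), \(C = \mathrm {I\negthinspace R}_+\), \(m = 1\), \(h^1 = (1,1)^T\), hence \(Tz = (z, z)^T\), and \(A_f = -\mathrm {I\negthinspace R}^2_+\). Then (5.9) yields

$$ f\left( x \right) = \left\{ z \in \mathrm {I\negthinspace R}\mid x - Tz \in -\mathrm {I\negthinspace R}^2_+ \right\} = \left[ \max \left\{ x_1, x_2 \right\} , +\infty \right) , $$

a worst-case evaluation over two scenarios. T-translativity \(f\left( x + Tz \right) = f\left( x \right) + \left\{ z \right\} \) and the \(\left( T, C \right) \)-translativity \(A_f - Tz \subseteq A_f\) for \(z \in C\) are immediate.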

The representation (5.9) can be written as

$$ \forall x \in X :f\left( x \right) = \left( I_{A_f} \Box \alpha _T \right) \left( x \right) = \inf \left\{ I_{A_f}\left( x_1 \right) + \alpha _T\left( x_2 \right) \mid x_1 + x_2 = x \right\} $$

where \(\alpha _T :X \rightarrow \mathcal P\left( Z, C \right) \) is given by

$$ \alpha _T\left( x \right) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} \left\{ z \right\} + C &{} : &{} x = Tz \\ \emptyset &{} : &{} \text {otherwise} \end{array} \right. $$

and \(I_A\) is the set-valued indicator function of A: \(I_A\left( x \right) = C\) if \(x \in A\) and \(I_A\left( x \right) = \emptyset \) if \(x \not \in A\). Note that the function \(\alpha _T\) is well-defined since T is assumed to be injective.

We start the investigation of set-valued translative functions with their conjugates and make use of the fact that the conjugate of the infimal convolution of two functions is the sum of the two conjugates. For set-valued functions, this has been established in [78, Lemma 2]. The conjugate of the indicator function is indeed the set-valued support function as shown in [78]:

$$\begin{aligned} I_{A_f}^*(x^*,z^*) = \bigcap _{x\in A_f}S_{(x^*,z^*)}(x). \end{aligned}$$

Moreover,

$$\begin{aligned} \alpha _T^*\left( x^*,z^* \right)&=\bigcap _{x\in X}\left( S_{(x^*,z^*)}(x){-^{\negmedspace \centerdot \,}}\alpha _T(x) \right) =\bigcap _{u\in Z}\left( S_{(x^*,z^*)}(Tu){-^{\negmedspace \centerdot \,}}\left( \left\{ u \right\} +C \right) \right) \\&= \bigcap _{u \in Z} \left\{ z \in Z \mid z+u+C \subseteq S_{(x^*,z^*)}(Tu) \right\} \\&= \left\{ z \in Z \mid \forall u \in Z :z^*(z+u) \ge x^*(Tu) \right\} \\&= \left\{ z \in Z \mid z^*(z) \ge \sup _{u\in Z}(T^*x^*-z^*)(u) \right\} = \left\{ \begin{array}{c@{\quad }c@{\quad }c} H^+(z^*) &{} : &{} z^* = T^*x^* \\ \emptyset &{} : &{} z^* \ne T^*x^* \end{array} \right. \end{aligned}$$

Hence, for a T-translative function f we get

$$\begin{aligned} f^*(x^*,z^*)=I_{A_f}^*(x^*,z^*)+\alpha _T^*\left( x^*,z^* \right) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} \bigcap \limits _{x\in A_f}S_{(x^*,z^*)}(x) &{} : &{} z^* = T^*x^* \\ \emptyset &{} : &{} z^* \ne T^*x^* \end{array} \right. \end{aligned}$$
(5.10)

and (see Remark 5.5)

$$\begin{aligned} (-f^*)(x^*,z^*)=H^+(z^*){-^{\negmedspace \centerdot \,}}f^*(x^*,z^*)= \left\{ \begin{array}{c@{\quad }c@{\quad }c} \mathrm{{cl \,}}\bigcup \limits _{x\in A_f}S_{(x^*,z^*)}(-x) &{} : &{} z^* = T^*x^* \\ Z &{} : &{} z^* \ne T^*x^* \end{array} \right. \end{aligned}$$

since \(H^+(z^*){-^{\negmedspace \centerdot \,}}\emptyset = Z\) and \(H^+(z^*){-^{\negmedspace \centerdot \,}}\bigcap _{x \in A_f}S_{(x^*,z^*)}(x) = \mathrm{{cl \,}}\bigcup _{x \in A_f} \left[ H^+(z^*){-^{\negmedspace \centerdot \,}}S_{(x^*,z^*)}(x) \right] = \mathrm{{cl \,}}\bigcup _{x \in A_f} S_{(x^*,z^*)}(-x)\) according to Proposition 4.17.

If the function f additionally maps into \(\mathcal {G}(Z, C)\) and is proper, closed and convex, then the biconjugation theorem applies, and the following dual representation is obtained:

$$\begin{aligned} \forall x \in X :f\left( x \right) = \bigcap _{\begin{array}{c} x^* \in X^* \\ T^*x^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \end{array}} \left[ S_{(x^*,T^*x^*)}(x) {-^{\negmedspace \centerdot \,}}I_{A_f}^*(x^*, T^*x^*) \right] . \end{aligned}$$
(5.11)

If f is additionally sublinear, then \(A_f\) is a closed convex cone and (5.11) simplifies to

$$\begin{aligned} \forall x \in X :f\left( x \right) = \bigcap _{\begin{array}{c} x^* \in A^-_f \\ T^*x^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \end{array}}S_{(x^*, T^*x^*)}(x) \end{aligned}$$
(5.12)

since in this case

$$ I_{A_f}^*(x^*,z^*) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} H^+(z^*) &{} : &{} x^* \in A^-_f \\ \emptyset &{} : &{} \text {otherwise} \end{array} \right. . $$

Of course, \(A^-_f = -\left( A_f \right) ^+\).

The value of these formulas depends on how the dual data \(x^*\), \(T^*\) and \(I_{A_f}^*\) can be interpreted in terms of the application at hand. We will show in Sect. 7.4 below that this can be done very nicely.

Example 5.21

Let \(Z = \mathrm {I\negthinspace R}^m\) and \(T :\mathrm {I\negthinspace R}^m \rightarrow X\) be defined by \(Tz = \sum _{k =1}^m z_kh^k\). Then

$$ \forall z \in \mathrm {I\negthinspace R}^m :\left( T^*x^* \right) \left( z \right) = \sum _{k =1}^m x^*(h^k)z_k, $$

thus \(T^*x^*\) can be identified with \(\left( x^*\left( h^1 \right) , \ldots , x^*\left( h^m \right) \right) ^T \in \mathrm {I\negthinspace R}^m\).
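
For the toy example following (5.9) above (with \(h^1 = (1,1)^T\)), one obtains \(T^*x^* = x^*_1 + x^*_2\) and \(A^-_f = \mathrm {I\negthinspace R}^2_+\); since f is sublinear, (5.12) gives

$$ f\left( x \right) = \bigcap _{\begin{array}{c} x^* \in \mathrm {I\negthinspace R}^2_+ \\ x^*_1 + x^*_2 > 0 \end{array}} \left\{ z \in \mathrm {I\negthinspace R}\mid x^*_1x_1 + x^*_2x_2 \le \left( x^*_1 + x^*_2 \right) z \right\} = \left[ \max \left\{ x_1, x_2 \right\} , +\infty \right) , $$

an intersection running over (positive multiples of) probability vectors which indeed recovers the worst case \(\max \left\{ x_1, x_2 \right\} \).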

We turn to the subdifferential of T-translative functions. The result reads as follows.

Corollary 5.22

Let \(f :X \rightarrow \mathcal G\left( Z, C \right) \) be convex, T-translative and \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \). If \(\partial f_{z^*}\left( \bar{x} \right) \ne \emptyset \) then

$$\begin{aligned} \partial f_{z^*}\left( \bar{x} \right) = \left\{ x^* \in X^* \mid z^* = T^*x^* \; \text {and} \; \forall x \in A_f :S_{(x^*, T^*x^*)}(x) \supseteq S_{(x^*, T^*x^*)}(\bar{x}) {-^{\negmedspace \centerdot \,}}f(\bar{x}) \right\} . \end{aligned}$$
(5.13)

Proof

First, we show “\(\subseteq \)”. Take \(x^* \in \partial f_{z^*}\left( \bar{x} \right) \). The assumption \(\partial f_{z^*}\left( \bar{x} \right) \ne \emptyset \) in conjunction with Proposition 5.17 implies \(f(\bar{x}) \oplus H^+(z^*) \not \in \left\{ Z, \emptyset \right\} \). Hence \(S_{(x^*,z^*)}(\bar{x}) {-^{\negmedspace \centerdot \,}}f(\bar{x}) \not \in \left\{ Z, \emptyset \right\} \), and Proposition 5.19 produces \(f^*(x^*, z^*) \not \in \left\{ Z, \emptyset \right\} \). From (5.10) we now obtain

$$ z^* = T^*x^* \quad \text {and} \quad f^*(x^*, z^*) = I_{A_f}^*(x^*,z^*). $$

The definition of the set-valued support function yields that \(x^*\) belongs to the right hand side of (5.13).

Conversely, assume that \(x^* \in X^*\) satisfies \(z^* = T^*x^*\) as well as

$$ \forall x \in A_f :S_{(x^*,z^*)}(x) \supseteq S_{(x^*,z^*)}(\bar{x}) {-^{\negmedspace \centerdot \,}}f(\bar{x}). $$

Take \(x \in \mathrm{{dom \,}}f\). Then

$$ \forall z \in f(x) :x - Tz \in A_f $$

by T-translativity and hence

$$ \forall z \in f(x) :S_{(x^*,z^*)}(x - Tz) \supseteq S_{(x^*,z^*)}(\bar{x}) {-^{\negmedspace \centerdot \,}}f(\bar{x}) . $$

Since \(z^* = T^*x^*\) we have

$$ S_{(x^*,z^*)}(x - Tz) = S_{(x^*,z^*)}(x) + \left\{ -z \right\} $$

and therefore

$$ \forall z \in f(x) :S_{(x^*,z^*)}(x) + \left\{ -z \right\} \supseteq S_{(x^*,z^*)}(\bar{x}) {-^{\negmedspace \centerdot \,}}f(\bar{x}). $$

This means that any \(\eta \in S_{(x^*,z^*)}(\bar{x}) {-^{\negmedspace \centerdot \,}}f(\bar{x})\) satisfies

$$ \forall z \in f(x) :z + \eta \in S_{(x^*,z^*)}(x), $$

thus \(\eta \in S_{(x^*,z^*)}(x) {-^{\negmedspace \centerdot \,}}f(x)\). Hence

$$ \forall x \in \mathrm{{dom \,}}f :S_{(x^*,z^*)}(x) {-^{\negmedspace \centerdot \,}}f(x) \supseteq S_{(x^*,z^*)}(\bar{x}) {-^{\negmedspace \centerdot \,}}f(\bar{x}) $$

which is, according to Proposition 5.19, equivalent to \(x^* \in \partial f_{z^*}\left( \bar{x} \right) \).    \(\square \)

The above corollary tells us that \(\partial f_{z^*}\) can be obtained from knowledge of \(A_f\) and \(T^*\). This becomes even clearer in the sublinear case.

Corollary 5.23

Let \(f :X \rightarrow \mathcal G\left( Z, C \right) \) be sublinear, T-translative and \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \). If \(\partial f_{z^*}\left( \bar{x} \right) \ne \emptyset \) then

$$\begin{aligned} \partial f_{z^*}\left( \bar{x} \right) = \left\{ x^* \in X^* \mid z^* = T^*x^*, \; x^* \in A^-_f, \; S_{(x^*,z^*)}(\bar{x}) = f(\bar{x}) \oplus H^+(z^*) \right\} . \end{aligned}$$
(5.14)

Proof

As observed above, in this case \(A_f\) is a closed convex cone and \(I_{A_f}^*\) can only attain the two values \(H^+(z^*)\) for \(x^* \in A^-_f\) and \(\emptyset \) otherwise. Finally,

$$ S_{(x^*,z^*)}(\bar{x}) {-^{\negmedspace \centerdot \,}}f(\bar{x}) = H^+(z^*) \quad \Leftrightarrow \quad S_{(x^*,z^*)}(\bar{x}) = f(\bar{x}) \oplus H^+(z^*). $$

The result now follows from Corollary 5.22.    \(\square \)

5.6 Comments on Vector- and Set-Valued Convex Analysis

The history of convex analysis for scalar functions is a continuing success story, and this area of mathematics is the theoretical basis for linear and nonlinear, in particular non-smooth, optimization and optimal control theory: compare [196, p. 3] or the preface of [8, p. xii].

Surprisingly, the gap between theory and applications (in optimization and multi-criteria decision making) is much wider for vector- or even set-valued functions. For example, there is no canonical (Fenchel) conjugate of a vector-valued function, but rather a whole bunch of different definitions which work under different assumptions (see below for references).

If one ignores for a moment scalarization approaches, then there are basically two different paths to a “vector-valued” convex analysis.

The first one simply consists in an extended interpretation of the infimum and the supremum in formulas like the definition of the Fenchel conjugate: Under the assumption that the function maps into a conditionally complete vector lattice (this means that every set which is bounded from below with respect to the vector order has an infimum in the space) one considers infima/suprema with respect to the vector order. This approach has been followed by Zowe [232, 233], Elster and Nehse [23, 50], Borwein [16], Zalinescu [234], Kutateladze [139] and others. One may compare [17] for the state of the art in the mid 1980s and more references. This approach has the advantage that a corresponding version of the Hahn-Banach theorem is available which is due to L.V. Kantorovich, see for example Day’s book [42]. Disadvantages are, of course, the strong assumptions on the image space and, even worse for applications, the fact that a vector infimum/supremum is hardly an appropriate concept when it comes to vector optimization and multi-criteria decision making.

In the second approach, infima and suprema are therefore replaced by sets of minimal and maximal points, respectively, with respect to the vector order. This worked for (and was motivated by) applications of vector optimization, but made the task of developing a corresponding vector-valued convex analysis considerably harder: It turns out that “dual constructions” like conjugates or dual optimization problems become set-valued: ‘for a vector problem, its dual constructed by several means, is a problem whose objective function is set-valued, whatever the objective of the primal problem be’ ([156, p. 57]). Set-valued Legendre–Fenchel conjugates with maximal points replacing the supremum appear in [156, 204, 219], with weakly maximal points in [166, 204], with (weakly) supremal points in [123, 190, 191, 212, 214, 217] and an even more general construction involving “non-submitted” points is used in [47], for example.

A major difficulty for this approach is the lack of an appropriate Hahn-Banach theorem which is at the heart of convex analysis: One has to turn to scalarizations in order to apply the “usual” Hahn-Banach argument. Zowe’s paper [231] shows how difficult it is to get back to vector-valued concepts after a scalarization.

In both approaches, continuous linear operators were used as dual variables. One way to avoid this is, again, a scalarization approach: An early attempt is Jahn’s work [113] (compare also [114, Chap. 8]). This approach leads to peculiar difficulties even if the problem at hand is linear: Compare [113, Conclusions] and the discussion at the ends of [114, Sects. 8.2 and 8.3]. A modern account is given in [19] which leads to dual problems with a feasibility set that is, in general, non-convex even if the original problem is convex (or linear).

Let us mention that there are at least two quite different attempts to answer the duality question for vector problems: In [5, 6] as well as in [21] Fenchel conjugates of vector- or set-valued functions are defined in terms of scalar functions depending on an additional dual variable. Although in both attempts quite strong assumptions are imposed, they seem to be only a few steps short of the constructions in this section.

The approach summarized in [151] is also based on scalarization via support functions, but it involves a set infimum/supremum which makes it possible to obtain stronger results.

The concepts presented in this survey go without the usual assumptions on the ordering cone C (non-empty interior, pointedness, generating a lattice order etc.), and they basically produce set-valued versions of all the known (duality) formulas for scalar convex functions, and this includes the case of vector-valued functions. A crucial observation is the theoretical equivalence of a convex \(\mathcal {G}(Z, C)\)-valued function f and the family \( \left\{ \varphi _{f, z^*} \right\} _{z^* \in C^+\backslash \{0\}}\). Formula (5.11) is an example of how the set-valued theory tells us what kind of scalarizations should be taken into consideration. New insights can be obtained by investigating relationships between the two components of the dual variable \((x^*, z^*)\); this relationship is essentially of adjoint process duality type (see [90, Sect. 4]). Set-valued functions satisfying (a) and (b) of Proposition 5.1 are sometimes called linear, e.g. in [186]. On the other hand, Proposition 5.1 and Theorem 5.2 show that the “conlinear” functions \(S_{(x^*, z^*)}\) are in some sense one-sided versions of linear multivalued operators (linear relations) as surveyed e.g. in [37]. The reader may check the relationships in the case of a linear subspace C, in which case \(C^+ = C^\perp \) is its orthogonal complement.

Directional derivatives for set-valued functions are usually defined at points of their graphs, thus fixing (only) one element in the image set along with a point in the pre-image set. The standard reference is [4], and Mordukhovich’s coderivative [172] is of the same type. Compare also [227]. Quite a different path is the attempt to embed certain subsets of \(\mathcal {P}(Z)\) into a linear space and then use the usual “linear” constructions, see [137] for an example. This, of course, only works under strong assumptions since, in general, \(\mathcal {G}(Z, C)\) cannot be embedded into a linear space even if one drops \(\emptyset \) and Z.

Concerning subgradients for set-valued functions, the paper [98] presents an overview of the existing concepts, each of which is afflicted with a peculiar difficulty: for example, the ‘weak subgradient’ of [29] (see also [28, Definition 2.53]) leaves the realm of convexity, while the ‘strong subgradient’ introduced in [98] needs an artificial exclusion condition in its definition. Both require rather strong assumptions for their existence: compare [236, Corollary 9] with respect to weak subgradients while observing that the space Z therein has the least upper bound property, and with respect to strong subgradients compare [98, Definition 3.2 and Theorem 4.1].

One should note that the application of Yang’s Hahn-Banach type theorem [226, 236] also suffers from the “non-convexity issue” since it relies on the “not strictly greater than” relation: inequalities cannot be added whenever the relation is non-complete. This means that the weak subdifferentials of convex set-valued functions obtained, for example, via [236, Corollary 9] are not convex in general.

Another way of defining subgradients is to do it at points of the graph of a set-valued mapping rather than at points of its domain, see [16, 204], [7, Definition 2.1] and also the ‘positive subgradients’ defined in [143, Definition 2.5], [92, Definition 3.1] and the ‘k-subgradients’ of [19, Definition 7.1.9] among many others.

Most of those concepts use linear operators as dual variables, but when it comes to existence very often operators of rank 1 show up, see, for example, [28, Theorem 2.55], [98, Theorem 4.1]. The (straightforward) relationships are discussed in [19, p. 331] and [92, Sect. 4].

We interpret this as evidence that, unless the image space is a (conditionally) complete vector lattice and the Hahn-Banach-Kantorovich theorem is available, the dual variables should involve linear functionals rather than linear operators. Using “conlinear functions” generated by pairs of linear functionals, the constructions in Sects. 5.3 and 5.4 offer a way to obtain results which are very close in shape to the scalar case and avoid strong assumptions on the ordering cone in Z. Moreover, in contrast to most of the “vectorial” constructions in the literature (for example, see the discussion in [19, p. 313]), our set-valued results reproduce the ones for scalar extended real-valued functions as special cases; this includes e.g. existence of subgradients and strong duality with attainment of the supremum for the dual problem.

The subdifferential as given in Definition 5.16 is exactly the same set which is called the ‘conjugate to’ f in [194, Definition 2 and the remark thereafter] provided one assumes that every expression in B. N. Pshenichnyi’s definition is finite. Section 5.4 should make it clear why we call it a subdifferential; the relationship to convex process duality can be found in [90]. It should be pointed out that the complete lattice approach of this survey also adds new insights to scalar convex analysis: the improper case, in particular the function value \(-\infty \), can be dealt with using the residuation. We refer to [89].

Scalar translative functions appear in many areas of applied mathematics, for example probability (quantile functions and lower previsions [221]), insurance and finance (constant additive insurance premiums [222] and cash additive risk measures, introduced in [3] and reviewed in [59]), mathematical economics (benefit and shortage functions [161, 162]), vector optimization (nonlinear scalarization functions, compare [63] also for earlier references and [67] for an overview) and idempotent analysis (compare the survey [126]) as well as in max-plus algebra (see e.g. [31]). A relationship between vector optimization and risk measures in finance is pointed out in [99].

Following an idea of [120], in [80, 82] cash additive risk measures have been generalized to set-valued risk measures for multivariate positions which turned out to be T-translative for some special T. Thus, such functions are important in applications, and they provide examples for the set optimization theory of this survey.

6 Set-valued Optimization

6.1 Unconstrained Problems

Within the set-up of the previous section, the basic problem again is

$$\begin{aligned} {\text {minimize} \quad f(x) \quad \text {subject to} \quad x \in X.} \end{aligned}$$
(P)

The difficulty with the solution concept given in Definition 3.3 is that solutions are, in general, sets rather than single points. Thus, optimality conditions such as “zero belongs to the subdifferential of some function” should actually be taken “at sets” rather than “at points.” Of course, this does not sound very attractive. The following construction provides a remedy.

Definition 6.1

Let \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) be a function and \(M \subseteq X\) a non-empty set. The function \(\hat{f}\left( \cdot ; M \right) :X \rightarrow \mathcal {G}\left( Z, C \right) \) defined by

$$\begin{aligned} \hat{f}\left( x; M \right) = \inf _{u \in M}f\left( x+u \right) = \mathrm{{cl \,}}\mathrm{{co \,}}\bigcup _{u \in M}f\left( x+u \right) \end{aligned}$$
(6.1)

is called the inf-translation of f by M.
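
As an illustration with simple data (chosen only for this purpose): for \(X = Z = \mathrm {I\negthinspace R}\), \(C = \mathrm {I\negthinspace R}_+\), \(f\left( x \right) = [x^2, +\infty )\) and \(M = [-1, 1]\), formula (6.1) gives

$$ \hat{f}\left( x; M \right) = \mathrm{{cl \,}}\mathrm{{co \,}}\bigcup _{u \in [-1,1]} [(x+u)^2, +\infty ) = \left[ \left( \max \left\{ \left| x \right| - 1, 0 \right\} \right) ^2, +\infty \right) , $$

in particular \(\hat{f}\left( 0; M \right) = [0, +\infty ) = \inf _{u \in M}f\left( u \right) \).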

The function \(\hat{f}\left( \cdot ; M \right) \) coincides with the canonical extension of f at \(M + \left\{ x \right\} \) as defined in [102]. A few elementary properties of the inf-translation are collected in the following lemma.

Lemma 6.2

Let \(M \subseteq X\) be non-empty and \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) a function.

  1. (a)

    If \(M \subseteq N \subseteq X\) then \(\hat{f}\left( x; M \right) \subseteq \hat{f}\left( x; N \right) \) for all \(x \in X\).

  2. (b)

    \(\inf _{x \in X} f\left( x \right) = \inf _{x \in X} \hat{f}\left( x; M \right) \).

  3. (c)

    If f and M are convex, so is \(\hat{f}\left( \cdot ; M \right) :X \rightarrow \mathcal {G}\left( Z, C \right) \), and in this case \(\hat{f}\left( x; M \right) = \mathrm{{cl \,}}\bigcup _{u \in M}f\left( u+x \right) \).

Proof

The proof can be found in [90].   \(\square \)

Proposition 6.3

Let \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) be a convex function and \(\emptyset \ne M \subseteq \mathrm{{dom \,}}f\). The following statements are equivalent:

  1. (a)

    M is an infimizer for f;

  2. (b)

    \( \left\{ 0 \right\} \subseteq X\) is an infimizer for \(\hat{f}\left( \cdot ; M \right) \);

  3. (c)

    \( \left\{ 0 \right\} \) is an infimizer for \(\hat{f}\left( \cdot ; \mathrm{{co \,}}M \right) \) and \(\hat{f}\left( 0; M \right) = \hat{f}\left( 0; \mathrm{{co \,}}M \right) \).

Proof

The equivalence of (a) and (b) is immediate from \(\hat{f}\left( 0; M \right) = \inf _{u \in M}f\left( u \right) \) and Lemma 6.2, (b). The equivalence of (a) and (c) follows from \(\hat{f}\left( 0; \mathrm{{co \,}}M \right) = \inf _{u \in \mathrm{{co \,}}M}f\left( u \right) \) and Lemma 6.2, (b).   \(\square \)

The previous proposition makes clear that by an inf-translation an infimizer (set) can be reduced to a single point, namely just \(0 \in X\). Moreover, it should be apparent that we need to consider \(\hat{f}\left( \cdot ; \mathrm{{co \,}}M \right) \): Since we want to characterize infimizers via directional derivatives and subdifferentials, a convex function is needed, and \(\hat{f}\left( \cdot ; M \right) \) is not convex in general even if f is convex (find a counterexample!). Obviously, an infimizer is not necessarily a convex set; on the contrary, sometimes one prefers a nonconvex one, for example a collection of vertices of a polyhedral set instead of higher dimensional faces.

Theorem 6.4

Let \(f :X \rightarrow \mathcal {G}(Z, C)\) be a convex function satisfying

$$ I\left( f \right) = \inf _{x \in X}f\left( x \right) \not \in \left\{ Z, \emptyset \right\} . $$

Then f is proper, and the set \(\Gamma ^+\left( f \right) = \left\{ z^* \in C^+\backslash \{0\} \mid I\left( f \right) \oplus H^+(z^*) \ne Z \right\} \) is non-empty. Moreover, a set \(M \subseteq X\) is an infimizer for f if, and only if, \(\hat{f}\left( 0; M \right) = \hat{f}\left( 0; \mathrm{{co \,}}M \right) \) and

$$ 0 \in \bigcap _{z^* \in \Gamma ^+\left( f \right) }\partial \hat{f}_{z^*}\left( \cdot ; \mathrm{{co \,}}M \right) \left( 0 \right) . $$

Proof

Since \( \left\{ 0 \right\} \) is a singleton infimizer of the function \(x \mapsto \hat{f}\left( x; M \right) \), \(\bar{x} = 0 \in X\) satisfies (c) of Proposition 5.20 with f replaced by \(\hat{f}\left( \cdot ; M \right) \) for each \(z^* \in \Gamma ^+\left( f \right) \). Now, the result follows from Proposition 5.20 and Proposition 6.3.    \(\square \)

Theorem 6.4 highlights the use of the “\(z^*\)-wise” defined directional derivatives and subdifferentials. One needs to take into consideration all reasonable (\(=\) proper) scalarizations at the same time in order to characterize infimizers.

6.2 Constrained Problems and Lagrange Duality

Let Y be another locally convex space with topological dual \(Y^*\), and \(D \subseteq Y\) a convex cone. The set \(\mathcal {G}\left( Y, D \right) \) is defined in the same way as \(\mathcal {G}\left( Z, C \right) \). Finally, let \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) and \(g :X \rightarrow \mathcal {G}(Y ,D)\) be two functions. We are interested in the problem

$$\begin{aligned} {\text{ minimize } \quad f(x) \quad \text {subject to} \quad 0 \in g\left( x \right) .} \end{aligned}$$
(PC)

The set

$$ \mathcal X = \left\{ x \in X \mid 0 \in g\left( x \right) \right\} $$

is called the feasible set for (PC) and \( I(f, g) = \inf \left\{ f\left( x \right) \mid x \in \mathcal X \right\} \) is the optimal value of the problem. With Definition 3.2 in view we define a solution of (PC) as follows.

Definition 6.5

A set \(M \subseteq \mathcal X\) is called a solution of (PC) if

  1. (a)

    \(\inf \left\{ f\left( x \right) \mid x \in M \right\} = I(f, g)\),

  2. (b)

    \(\bar{x} \in M\), \(x \in \mathcal X\), \(f\left( x \right) \supseteq f\left( \bar{x} \right) \) imply \(f\left( x \right) = f\left( \bar{x} \right) \).

Clearly, \(M \subseteq X\) is a solution of (PC) if, and only if, \(f \left[ M \right] \) generates the infimum of \(f \left[ \mathcal X \right] = \left\{ f(x) \mid x \in \mathcal X \right\} \) and each \(f\left( \bar{x} \right) \) for \(\bar{x} \in M\) is minimal in \(f \left[ \mathcal X \right] \) with respect to \(\supseteq \).

We define the Lagrangian \(L :X \times Y^* \times C^+\backslash \{0\} \rightarrow \mathcal {G}\left( Z, C \right) \) of problem (PC) by

$$\begin{aligned} L\left( x, y^*, z^* \right) = f\left( x \right) \oplus \bigcup _{y \in g\left( x \right) }S_{\left( y^*, z^* \right) }\left( y \right) = f\left( x \right) \oplus \inf \left\{ S_{\left( y^*, z^* \right) }\left( y \right) \mid y \in g\left( x \right) \right\} . \end{aligned}$$
(6.2)

Under a mild condition, the primal problem can be reconstructed from the Lagrangian.

Proposition 6.6

If \(f\left( x \right) \ne Z\) for each \(x \in \mathcal X\), then

$$ \sup _{\left( y^*, z^* \right) \in Y^* \times C^+\backslash \{0\}} \negthickspace \negthickspace L\left( x, y^*, z^* \right) = \negthickspace \negthickspace \bigcap _{\left( y^*, z^* \right) \in D^+ \times C^+\backslash \{0\}} \negthickspace \negthickspace L\left( x, y^*, z^* \right) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} f\left( x \right) &{} : &{} 0 \in g\left( x \right) \\ \emptyset &{} : &{} 0 \not \in g\left( x \right) . \end{array} \right. $$

Proof

The proof is based on the assumption that the values of f and g are closed convex sets. See [86] for details.   \(\square \)

The next proposition provides a Lagrange sufficient condition, a simple but important result with an algorithmic character since it makes it possible to test whether a given set is an infimizer of (PC).

Proposition 6.7

Let \(M \subseteq \mathcal X\) be a non-empty set of feasible points for (PC). Assume that for each \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) there is \(y^* \in D^+\) satisfying

$$\begin{aligned} \hat{f}\left( 0; M \right) \oplus \inf _{y \in \hat{g}\left( 0; M \right) }S_{\left( y^*, z^* \right) }\left( y \right) = \inf _{x \in X} L\left( x, y^*, z^* \right) \end{aligned}$$
(6.3)

and

$$\begin{aligned} \inf _{y \in \hat{g}\left( 0; M \right) } S_{\left( y^*, z^* \right) }\left( y \right) = H^+(z^*). \end{aligned}$$
(6.4)

Then, M is an infimizer for (PC).

Proof

Using (6.4) and (6.3) we obtain

$$\begin{aligned} \hat{f}\left( 0; M \right) \oplus H^+(z^*)&= \hat{f}\left( 0; M \right) \oplus \inf _{y \in \hat{g}\left( 0; M \right) } S_{\left( y^*, z^* \right) }\left( y \right) = \inf _{x' \in X} L\left( x', y^*, z^* \right) \\&\supseteq f(x) \oplus \inf _{y \in g\left( x \right) }S_{\left( y^*, z^* \right) }\left( y \right) \supseteq f(x) \oplus H^+(z^*) \end{aligned}$$

for all \(x \in \mathcal X\) since \(S_{\left( y^*, z^* \right) }\left( 0 \right) = H^+(z^*)\). Taking the infimum over the feasible x on the right hand side and then the intersection over \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) on both sides while observing \(\hat{f}\left( 0; M \right) = \inf _{u \in M}f\left( u \right) \) we obtain that M indeed is an infimizer for (PC).    \(\square \)

Condition (6.4) serves as a set-valued complementary slackness condition. If one considers the Lagrange function \((x, y^*, z^*) \mapsto \hat{L}\left( x, y^*, z^*; M \right) \) for the “inf-translated” problem

$$ \text {minimize} \quad \hat{f}(x; M) \quad \text {subject to} \quad 0 \in \hat{g}(x; M) $$

then condition (6.3) means that the infimum of the Lagrange function for the original problem coincides with \(\hat{L}\left( 0, y^*, z^*; M \right) \). Finally, if \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) and \(y^* \in D^+\) satisfy (6.4) and (6.3), then \(y^*\) is nothing but a Lagrange multiplier for the problem scalarized by \(z^*\). One may therefore expect that strong duality is something like “strong duality for all reasonable scalarized problems.” This idea works, as shown in the following.

Define the function \(h :Y^* \times C^+\backslash \{0\} \rightarrow \mathcal {G}\left( Z, C \right) \) by

$$ h\left( y^*, z^* \right) = \inf _{x \in X}L\left( x, y^*, z^* \right) = \mathrm{{cl \,}}\bigcup _{x \in X} L\left( x, y^*, z^* \right) . $$

Since the values of \(L(\cdot , y^*, z^*)\) are closed half spaces with the same normal \(z^*\), the convex hull can be dropped in the infimum. The dual problem,

$$\begin{aligned} {\text{ maximize } \quad h\left( y^*, z^* \right) \quad \text{ subject } \text{ to } \quad y^* \in Y^*, \;z^* \in C^+\backslash \{0\},} \end{aligned}$$
(DC)

thus consists in finding

$$ d = \sup _{y^* \in Y^*, \, z^* \in C^+\backslash \{0\}} h\left( y^*, z^* \right) = \bigcap _{y^* \in Y^*, \, z^* \in C^+\backslash \{0\}} h\left( y^*, z^* \right) $$

and corresponding (full) solutions. The following weak duality result is immediate.

Proposition 6.8

Let \(f:X \rightarrow \mathcal {F}\left( Z, C \right) \) and \(g:X \rightarrow \mathcal {F}\left( Y, D \right) \). Then

$$ \sup \left\{ h\left( y^*, z^* \right) \mid y^* \in Y^*, \; z^* \in C^+\backslash \{0\} \right\} \supseteq \inf \left\{ f\left( x \right) \mid x \in X, \; 0 \in g\left( x \right) \right\} . $$

Proof

This is true since for \(\left( y^*, z^* \right) \in Y^* \times C^+\backslash \{0\}\) and \(x \in X\) satisfying \(0 \in g\left( x \right) \) we have

$$ h\left( y^*, z^* \right) \supseteq f\left( x \right) \oplus \mathrm{{cl \,}}\bigcup _{y \in g\left( x \right) }S_{\left( y^*, z^* \right) }\left( y \right) \supseteq f\left( x \right) \oplus S_{\left( y^*, z^* \right) }\left( 0 \right) = f\left( x \right) \oplus H^+\left( z^* \right) . $$

   \(\square \)

As usual, a constraint qualification condition is needed as part of sufficient conditions for strong duality. The following condition is called the Slater condition for problem (PC):

$$\begin{aligned} \exists \bar{x} \in \mathrm{{dom \,}}f :g\left( \bar{x} \right) \cap \mathrm{{int\,}}\left( -D \right) \ne \emptyset . \end{aligned}$$
(6.5)

The implicit assumption is \(\mathrm{{int\,}}D \ne \emptyset \).

Theorem 6.9

Assume \(p=\inf \left\{ f(x) \mid x \in \mathcal X \right\} \ne Z\). If \(f :X \rightarrow \mathcal {G}\left( Z, C \right) \) and \(g :X \rightarrow \mathcal {G}\left( Y, D \right) \) are convex and the Slater condition for problem (PC) is satisfied, then strong duality holds for (PC), that is

$$\begin{aligned}&\inf \left\{ f\left( x \right) \mid 0 \in g\left( x \right) \right\} = \sup \left\{ h\left( y^*, z^* \right) \mid y^* \in Y^*, \; z^* \in C^+\backslash \{0\} \right\} ,\end{aligned}$$
(6.6)
$$\begin{aligned}&z^* \in C^+\backslash \{0\},\; p \oplus H^+\left( z^* \right) \ne Z \quad \Rightarrow \quad \exists y^* \in Y^* :p \oplus H^+\left( z^* \right) = h\left( y^*, z^* \right) . \end{aligned}$$
(6.7)

Proof

Hamel and Löhne [86].   \(\square \)

Note that the assumption \(p \ne Z\) implies the existence of \(z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \) with \(p \oplus H^+\left( z^* \right) \ne Z\). Thus, (6.7) is attainment of the supremum for the dual problem “\(z^*\)-wise.”

Corollary 6.10

Under the assumptions of the strong duality theorem, the set

$$ \Delta = \left\{ \left( y^*, z^* \right) \in Y^* \times C^+\backslash \{0\} \mid Z \ne p \oplus H^+(z^*) = h\left( y^*, z^* \right) \right\} $$

is non-empty and a full solution of the dual problem (DC).

Proof

See [86].   \(\square \)

6.3 Comments on Set Optimization Duality

Among the first papers in which optimization problems with a set-valued constraint have been systematically studied are [15, 16, 185]. It is, for example, instructive to realize that the Lagrange function in (6.2) is nothing but a set-valued version of the one in [185, p. 197]. Compare also [151, Theorem 3.28].

Whereas in [16, Problem (P) in (3.1)] the vector infimum serves as the building block for optimality, in [15, Theorem 3] a Lagrange duality result is established for properly efficient points of vector optimization problems. The dual variables are rank one linear operators. Similarly, in [211, Theorem 3.3] and also [213, Theorem 3.3], rank one linear operators and a set-valued Lagrange function (see equation (6.8) below) are used under strong assumptions (cones with weakly compact base). A similar idea can be found in the proof of the Lagrangian duality theorem, [156, Theorem 1.6 on p. 113], under the assumption that the ordering cone in Z has non-empty interior. These examples may suffice as far as vector optimization problems are concerned, although the literature is huge.

In [131, 134] the same type of set-valued Lagrangian has been used (without giving proofs) in connection with set relations, i.e., basically the solution concept IIa of Sect. 3.1. The more recent [93, 95] proceed similarly: Theorem 3.3 in [95] (basically the same as Theorem 4.2 in [93]) is a Lagrange duality result for weakly \(\preccurlyeq _C\)-minimal solutions with Lagrange function

$$\begin{aligned} f(x) + (T \circ g)(x) = f(x) + \left\{ Ty \mid y \in g(x) \right\} \end{aligned}$$
(6.8)

where \(T \in \mathcal L(Y, Z)\), the set of continuous linear operators from Y to Z. It is again based on rank one operators, an idea which dates back at least to [32, Theorem 4.1]. The same set of dual variables is used in [81] for a Lagrangian approach to linear vector optimization. However, the Lagrange function, even for a vector-valued problem, already is a set-valued one.

A thorough discussion of optimality conditions of Fermat and Lagrange type for (non-convex) set-valued optimization problems based on the minimality concept III can be found in [48] (compare also the references therein). These conditions are formulated in terms of the Mordukhovich subdifferential. It might be worth noting that the use of \(\mathcal {F}(Z, C)\)-valued functions ‘gives better conclusions’ [48, Remark 3.10].

A complete lattice approach based on infimal and supremal sets was developed in [92, 151]. The Lagrange function for a vector-valued function f and a set-valued G has the form

$$ f(x) + \mathrm{{Inf}} \left\{ y^*(y)c \mid y \in G(x) \right\} $$

where \(\mathrm{{Inf}}\) stands for the infimal set and \(c \in \mathrm{{int\,}}C\) is a (fixed) element. Assumptions, of course, include \(\mathrm{{int\,}}C \ne \emptyset \). The same assumption also is crucial in [144]; Theorems 3.2 and 3.3 therein are probably as far as one can get in terms of conjugate duality based on “suprema” of a set, i.e. the elements which belong to the closure of the set, but are not dominated with respect to the relation which is generated by the interior of the ordering cone.

Other approaches rely on other set-valued derivatives, for example on contingent epiderivatives [69] or coderivatives [74, 173].

In virtually all approaches for set/vector optimization problems known to the authors, the strong duality assertion is based on the assumption of the existence of a (weakly, properly etc.) minimal element of the primal problem either with respect to the vector order (see [32], [156, Theorem 1.6 on p. 113, Theorem 2.7 on p. 119], [214, Theorems 3.4 and 3.5], [19, Theorems 5.2.4 and 5.2.6]) or with respect to a set relation (see [95, Theorem 3.3], [93, Theorem 4.2]). The two exceptions are the approaches in [86, 151] where the primal problems only have finite values in some sense and existence for the dual problems is still obtained, which is standard in the scalar case. In [151, p. 98] (see also Open Problem 3.6 therein with respect to Fenchel duality) and [92] it is discussed that the approach based on infimal/supremal sets indeed yields strong duality, but it is not clear whether the existence of the dual solution can be guaranteed without the \(z^*\)-component of the dual variable.

By means of the “complete lattice approach” surveyed here, the type of results which is known from the scalar case can be transferred to a “set level.” Strong duality then indeed means “inf equals sup” and includes the existence of dual solutions: compare [86, 151] for Lagrange duality and [79] for Fenchel-Rockafellar duality. The Lagrange function as defined in (6.2) basically is the composition of the two set-valued functions \(S_{(x^*, z^*)}\) and g, compare, for example, [125, Definition 6.3.2] and for scalar problems with a set-valued constraint already [185, p. 197].

The reduction of a “set solution” in the sense of Definition 6.5 to a “point solution” via an inf-translation (see Definition 6.1) is due to [90]. The exploitation of this construction seems to be very promising for obtaining optimality conditions and algorithms.

The complementary slackness condition given in Proposition 6.7 seems to be new although it clearly is in the spirit of [14, formulae (10), (12)].

7 Applications

7.1 Vector Optimization

In this section, let X and Z be as in Sect. 5 and \(C \subseteq Z\) a closed, convex, pointed (i.e. \(C \cap -C = \{0\}\)) and non-trivial cone. Then, \(\le _C\) is a partial order (i.e. also antisymmetric). Moreover, let a function \(F :X \rightarrow Z\cup \left\{ -\infty ,+\infty \right\} \) be given. Defining a function \(f :X \rightarrow \mathcal {G}(Z, C)\) by

$$ f(x) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} F(x) + C &{} : &{} F(x) \in Z \\ Z &{} : &{} F(x) = -\infty \\ \emptyset &{} : &{} F(x) = +\infty \end{array} \right. $$

we observe

$$ f(x_1) \supseteq f(x_2) \quad \Leftrightarrow \quad F(x_1) \le _C F(x_2), $$

where it is understood that \(-\infty \le _C z \le _C +\infty \) for all \(z \in Z\cup \left\{ -\infty ,+\infty \right\} \). Hence the two problems

$$\begin{aligned} {\text {find minimizers w.r.t.} \; \le _C \; \text {of} \; F(x) \; \text {subject to} \; 0 \in g(x),} \end{aligned}$$
(VOP)
$$\begin{aligned} {\text {find minimizers w.r.t.} \; \supseteq \; \text {of} \; f(x) \; \text {subject to} \; 0 \in g(x)} \end{aligned}$$
(SOP)

have the same feasible elements and the same minimizers. The minimizers of (VOP) are called ‘minimal solutions’ [114, Definition 7.1] or ‘efficient solutions’ [19, Definition 2.5.1]. In most cases, it does not make sense to look for the infimum in (VOP) with respect to \(\le _C\): It may not exist (not even for simple polyhedral cones C, see e.g. [151, Example 1.9]), and even if it does, it is not useful in practice as it refers to so-called utopia points which are typically not realizable by feasible points (i.e. “decisions”).

The (PC) version of (SOP) considered as an \(\mathcal {F}(Z, C)\)- or \(\mathcal {G}(Z, C)\)-valued problem is called the lattice extension of (VOP), and a solution of (VOP) is defined to be a solution of its lattice extension (see [102], compare Definition 6.5). In this way, the notion of an “infimum” makes a strong comeback, and the infimum attainment becomes a new feature in vector optimization, which is useful for theory and applications: It ensures that the decision maker possesses a sufficient amount of information about the problem if (s)he knows a solution. For a detailed discussion see [151, Chap. 2]. Note that one possibly obtains different solutions depending on the choice of \(\mathcal {F}(Z, C)\) or \(\mathcal {G}(Z, C)\) as image space. Since the infimum in \(\mathcal {G}(Z, C)\) involves the convex hull, solutions of \(\mathcal {G}(Z, C)\)-valued problems may include “fewer” elements, and this is in particular preferable for convex problems.

If f is the “lattice extension” of a vector-valued function F as given above, the Lagrange function for (PC) takes the form

$$\begin{aligned} L\left( x, y^*, z^* \right)&= f(x) \oplus \inf _{y \in g(x)}S_{\left( y^*, z^* \right) }\left( y \right) = F(x) + \inf _{y \in g(x)}S_{\left( y^*, z^* \right) }\left( y \right) \\&= \inf _{y \in g(x)} \left\{ F(x) + z \mid z \in Z, \; y^*(y) \le z^*(z) \right\} \\&= \left\{ z \in Z \mid \inf _{y \in g(x)}y^*(y) + z^*\left( F(x) \right) \le z^*(z) \right\} \end{aligned}$$

whenever \(F(x) \in Z\), \(L\left( x, y^*, z^* \right) = \emptyset \) whenever \(F(x) = +\infty \) or \(g(x) = \emptyset \), and \(L\left( x, y^*, z^* \right) = Z\) whenever \(F(x) = -\infty \) and \(g(x) \ne \emptyset \). The function \(\Lambda _{z^*}\left( x, y^* \right) := z^*\left( F(x) \right) + \inf _{y \in g(x)}y^*(y)\) (with the convention \(z^*(\pm \infty ) = \pm \infty \)) is the (classical) Lagrange function of the (scalar) problem

$$ \inf \left\{ z^*\left( F(x) \right) \mid 0 \in g(x) \right\} $$

(see, for example, already [185, p. 197]). Moreover, if g is generated by a vector-valued function \(G :X \rightarrow Y\cup \left\{ -\infty , +\infty \right\} \) in the same way as f by F, then

$$ \inf _{y \in g(x)}y^*(y) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} y^*\left( G(x) \right) &{} : &{} G(x) \in Y, \; y^* \in D^+ \\ -\infty &{} : &{} G(x) = -\infty , \; \text {or} \; G(x) \in Y \; \text {and} \; y^* \not \in D^+ \\ +\infty &{} : &{} G(x) = +\infty . \end{array} \right. $$

Thus, \(\Lambda _{z^*}\left( x, y^* \right) = z^*\left( F(x) \right) + y^*\left( G(x) \right) \) whenever \(F(x) \in Z\), \(G(x) \in Y\) and \(y^* \in D^+\). The dual objective becomes

$$ h(y^*, z^*) = \inf _{x \in X} L\left( x, y^*, z^* \right) = \left\{ z \in Z \mid \inf _{x \in X}\Lambda _{z^*}\left( x, y^* \right) \le z^*(z) \right\} . $$
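For illustration, the scalarized Lagrangian and the scalar bound defining the halfspace \(h(y^*, z^*)\) can be sketched as follows (Python; the data F, G, the choice \(D = \mathrm {I\negthinspace R}_+\) and the crude grid minimization are our illustrative assumptions, not part of the theory):

```python
import numpy as np

# Scalarized Lagrangian Λ_{z*}(x, y*) = z*(F(x)) + y*(G(x)); valid whenever
# F(x), G(x) are finite and y* ∈ D^+ (here D = R_+, so y* >= 0).
def Lambda(x, y_star, z_star, F, G):
    return z_star @ F(x) + y_star @ G(x)

# h(y*, z*) = {z | inf_x Λ_{z*}(x, y*) <= z*(z)} is a halfspace; we only
# compute its defining scalar bound, here by a crude grid minimization.
def h_bound(y_star, z_star, F, G, grid):
    return min(Lambda(x, y_star, z_star, F, G) for x in grid)

# toy data (assumptions): F(x) = (x, x^2), G(x) = 1 - x, i.e. the
# constraint 0 ∈ G(x) + R_+ means x >= 1
F = lambda x: np.array([x, x ** 2])
G = lambda x: np.array([1.0 - x])
print(h_bound(np.array([1.0]), np.array([1.0, 0.0]), F, G, np.linspace(-3, 3, 601)))
```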

Corollary 7.1

Let F be C-convex, f its lattice extension and \(g :X \rightarrow \mathcal {G}(Y, D)\) convex such that the Slater condition (6.5) is satisfied. If \(I(f, g) = \inf \left\{ f(x) \mid 0 \in g(x) \right\} \not \in \left\{ Z, \emptyset \right\} \), then \(\Gamma ^+(f,g) = \left\{ z^* \in C^+\backslash \negthinspace \left\{ 0 \right\} \mid I(f, g) \oplus H^+(z^*) \ne Z \right\} \) is non-empty and

$$\begin{aligned} I(f, g)&= \mathrm{{cl \,}}\bigcup \left\{ F\left( x \right) + C \mid 0 \in g\left( x \right) \right\} = \bigcap _{y^* \in Y^*, \, z^* \in \Gamma ^+(f,g)} \left\{ z \in Z \mid \inf _{x \in X}\Lambda _{z^*}\left( x, y^* \right) \le z^*(z) \right\} ,\end{aligned}$$
(7.1)
$$\begin{aligned}&\forall z^* \in \Gamma ^+(f,g) \; \exists y^* \in Y^* :I(f, g) \oplus H^+\left( z^* \right) = h\left( y^*, z^* \right) . \end{aligned}$$
(7.2)

Proof

Of course, f is convex if, and only if, F is C-convex (see [156, Definition 1.6 on p. 29] for a definition). Theorem 6.9 and the above discussion produce the result.   \(\square \)

It might be worthwhile to compare Corollary 7.1 with standard duality results in vector optimization. First, there is no assumption about the existence of (weakly, properly) minimal solutions: This is in contrast to most results in vector optimization such as [67, Theorems 3.7.4 and 3.7.7], [114, Theorem 8.7], [19, Theorems 4.1.2 and 4.1.4]. Secondly, there are no interior point assumptions on the cone C. Thirdly, with Corollary 6.10 in view, the existence of a dual solution in a set-valued sense is provided in the sense of the “maximization” version of Definition 3.3. Finally, classical duality results in vector optimization can be obtained from Corollary 7.1 as described in [151, Sect. 3.5].

7.2 A Case Study: Linear Vector Optimization

We proceed with an exemplary application of the set-valued theory to linear vector optimization problems and show that we obtain what we expect in view of scalar linear programming duality: a dual program of the same type. In this section, we will write \(\le \) and \(\ge \) for \(\le _{\mathrm {I\negthinspace R}^m_+}\) and \(\ge _{\mathrm {I\negthinspace R}^m_+}\), respectively, for any \(m \in \left\{ 1,2, \ldots \right\} \).

Consider the linear vector optimization problem

$$\begin{aligned} {\textstyle \min _C} \quad P x \quad \text {subject to} \quad Ax \ge b, \quad {(P_L)} \end{aligned}$$

where \(P \in \mathrm {I\negthinspace R}^{q \times n}\), \(A\in \mathrm {I\negthinspace R}^{m \times n}\), \(b \in \mathrm {I\negthinspace R}^m\), and the cone C is polyhedral convex with nonempty interior. A representation \(C = \left\{ z \in \mathrm {I\negthinspace R}^q \mid W^T z \ge 0 \right\} \) by a matrix \(W \in \mathrm {I\negthinspace R}^{q \times k}\) is given. The feasible set is denoted by \(S := \left\{ x \in \mathrm {I\negthinspace R}^n \mid Ax \ge b \right\} \).

With (PC) in view we define \(f(x) = Px + C\) and \(g(x) = b - Ax + \mathrm {I\negthinspace R}^m_+\). Then, the set \( \left\{ (x,z) \in \mathrm {I\negthinspace R}^n \times \mathrm {I\negthinspace R}^q \mid z \in f(x),\; 0 \in g(x) \right\} \) is polyhedral convex. We modify the solution concept in Definition 6.5 by adding the requirement that a solution is a finite set of vectors and directions, see [151] and also [87], the latter also including “\(\varepsilon \)-variants.” The reason is that every polyhedral set can be expressed as the generalized convex hull of finitely many vectors and directions. Such a solution is called a finitely generated solution, but we simply call it a solution if the context of polyhedral convex set-valued problems or the subclass of linear vector optimization problems is clear. To keep the notation simple, we only consider bounded problems here, that is, we assume

$$\begin{aligned} \exists \bar{z} \in \mathrm {I\negthinspace R}^q :\forall x \in S :\bar{z} \le _C Px. \end{aligned}$$
(7.3)

Under this assumption, a solution consists of finitely many vectors only. For the general case, see [151, Chap. 4]. A solution to (\(P_L\)) is a nonempty finite set \(\bar{S} \subseteq S\) of minimizers (‘efficient solutions’ in most textbooks) such that \(P[S] \subseteq P[\bar{S}]+C\), where the latter condition refers to infimum attainment in \(\bar{S}\) with respect to the lattice extension (compare Definition 3.3).

Considering the lattice extension of (\(P_L\)) we show that the Lagrange technique from Sect. 6.2 leads to a dual problem, which enjoys nice properties and is useful for applications and algorithms. Re-labeling the dual variables by \(u = y^*\), \(w = z^*\) we obtain the Lagrangian

$$\begin{aligned} L(x,u,w)&= Px + C + \mathrm{{cl \,}}\bigcup _{y \ge b - Ax} \left\{ z \in \mathrm {I\negthinspace R}^q \mid u^Ty \le w^T z \right\} \\&= Px + C + \mathrm{{cl \,}}\bigcup _{r \in \mathrm {I\negthinspace R}^m_+} \left\{ z \in \mathrm {I\negthinspace R}^q \mid u^T (r-Ax+b) \le w^T z \right\} . \end{aligned}$$

The dual objective is

$$\begin{aligned} h(u,w)&= \mathrm{{cl \,}}\bigcup _{x \in \mathrm {I\negthinspace R}^n} L(x,u,w) \\&= \mathrm{{cl \,}}\bigcup _{r \in \mathrm {I\negthinspace R}^m_+,\, x \in \mathrm {I\negthinspace R}^n,\, v \in C} \left\{ z \in \mathrm {I\negthinspace R}^q \mid u^T (r-Ax+b) \le w^T (z-Px-v) \right\} \\&= \mathrm{{cl \,}}\bigcup _{r \in \mathrm {I\negthinspace R}^m_+,\, x \in \mathrm {I\negthinspace R}^n,\, v \in C} \left\{ z \in \mathrm {I\negthinspace R}^q \mid (w^T P-u^T A) x \le w^T (z - v) -u^T (b+r) \right\} \\&= \left\{ \begin{array}{c@{\quad }c@{\quad }c} \left\{ z \in \mathrm {I\negthinspace R}^q \mid 0 \le w^T z - u^T b \right\} &{} : &{} A^T u = P^T w,\, u \ge 0,\, w \in C^+\backslash \left\{ 0 \right\} \\ \mathrm {I\negthinspace R}^q &{} : &{} \text {otherwise.} \end{array} \right. \end{aligned}$$
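The case distinction just derived translates into a small evaluation routine. The following sketch (Python; our illustrative simplification \(C = \mathrm {I\negthinspace R}^q_+\), so that the test \(w \in C^+\backslash \left\{ 0 \right\} \) becomes componentwise nonnegativity with \(w \ne 0\)) mirrors the two cases:

```python
import numpy as np

# Evaluate the closed-form dual objective h(u, w) derived above.
# Assumption: C = R^q_+, hence C^+ = R^q_+ as well.
def h(u, w, A, P, b):
    if (np.allclose(A.T @ u, P.T @ w) and np.all(u >= 0)
            and np.all(w >= 0) and np.any(w > 0)):
        return ("halfspace", w, float(u @ b))   # {z | u^T b <= w^T z}
    return ("all of R^q",)                      # the trivial value
```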

Let \(C^+= \left\{ w \in \mathrm {I\negthinspace R}^q \mid V^T w \ge 0 \right\} \) be a representation of \(C^+\) by a matrix \(V \in \mathrm {I\negthinspace R}^{q \times l}\). Note that a base of \(C^+\) is already sufficient to cover all values of the dual objective h (see the end of Sect. 5.2). If we fix some \(c \in \mathrm{{int\,}}C\), we obtain the (set-valued) dual problem

$$\begin{aligned} {\text { maximize } D(u,w) \quad \text {subject to} \quad (u,w) \in T} \quad {(D_L)} \end{aligned}$$

with objective function

$$\begin{aligned} D :\mathrm {I\negthinspace R}^m \times C^+ \rightarrow \mathcal {G}(\mathrm {I\negthinspace R}^q,C),\quad D(u,w):= \left\{ z \in \mathrm {I\negthinspace R}^q \mid u^T b \le w^T z \right\} \end{aligned}$$

and feasible set

$$\begin{aligned} T:= \left\{ (u,w) \in \mathrm {I\negthinspace R}^m \times \mathrm {I\negthinspace R}^q \mid A^T u = P^T w,\; u \ge 0,\; V^T w \ge 0,\; c^T w = 1 \right\} . \end{aligned}$$

This dual problem has a very simple structure: the constraints are linear, the objective function is halfspace-valued, and maximization means taking the intersection of these halfspaces. The objective function is conlinear in b and in u, i.e., \(D(u,w)=S_{(u,w)}(b)=S_{(b,w)}(u)\), and therefore a natural replacement of the dual objective “\(b^T u\)” in (scalar) linear programming. A (finitely generated) solution of (\(D_L\)) is a nonempty set \(\bar{T} \subseteq T\) of maximizers with respect to the ordering \(\supseteq \) satisfying \(\bigcap _{(u,w)\in \bar{T}} D(u,w) = \bigcap _{(u,w) \in T} D(u,w)\), where the latter condition means supremum attainment in \(\bar{T}\).

Remark 7.2

Using the construction of Example 2.12, we obtain an equivalent problem with a hyperplane-valued objective. This shows that we indeed have a very natural generalization of scalar linear programs to the vectorial case because in \(\mathrm {I\negthinspace R}\), a real number and a hyperplane are the same object. In more general linear spaces, vectors and half-spaces are dual in some sense. Compare the footnote on p. 2.

Weak duality (see Proposition 6.8) means that \(x \in S\) and \((u,w) \in T\) imply \(D(u,w) \supseteq Px + C\). As a consequence, for every subset \(\tilde{T} \subseteq T\) of feasible points, the set \(\bigcap _{(u,w) \in \tilde{T}} D(u,w)\) is a superset (“outer approximation”) of the set \(\mathcal {P}:= \left\{ Px \mid Ax \ge b \right\} +C\), which is just the optimal value (the infimum) of the lattice extension. Likewise, for every subset \(\tilde{S} \subseteq S\) of feasible points of (\(P_{L}\)), the set \(\mathrm{{cl \,}}\mathrm{{co \,}}\bigcup _{x \in \tilde{S}} \left( Px+C \right) \) is a subset (“inner approximation”) of \(\mathcal {P}\).

Strong duality means that \(\bigcap _{(u,w) \in T} D(u,w) = \mathcal {P}\). As in the case of linear constraints in (scalar) convex programming, no constraint qualification is needed. Note further that, if \(\emptyset \ne \bar{S} \subseteq S\) is such that \(P[\bar{S}]\) is the set of vertices of \(\mathcal {P}\), then \(\bar{S}\) is a solution to (\(P_{L}\)). Likewise, if \(\emptyset \ne \bar{T} \subseteq T\) is such that \( \left\{ D(u,w) \mid (u,w) \in \bar{T} \right\} \) is the family of half-spaces supporting \(\mathcal {P}\) in its facets, then \(\bar{T}\) is a solution of (\(D_L\)).
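These duality relations can be checked numerically on a deliberately tiny instance. The sketch below uses our illustrative data \(q = n = m = 2\), \(P = A = I\), \(b = c = (1,1)^T\), \(C = \mathrm {I\negthinspace R}^2_+\) (so \(W = V = I\)); weak duality \(D(u,w) \supseteq Px + C\) then reduces to the scalar inequality \(u^T b \le w^T P x\), since \(w \in C^+\) gives \(\inf _{v \in C} w^T v = 0\):

```python
import numpy as np

# Illustrative data: q = n = m = 2, P = A = I, b = c = (1,1), C = R^2_+,
# hence W = V = I and the constraint A^T u = P^T w forces u = w on T.
P = np.eye(2); A = np.eye(2)
b = np.array([1.0, 1.0]); c = np.array([1.0, 1.0])
rng = np.random.default_rng(0)
for _ in range(1000):
    x = b + rng.random(2)               # feasible: A x >= b by construction
    t = rng.random()
    w = np.array([t, 1.0 - t]); u = w   # u >= 0, V^T w >= 0, c^T w = 1
    assert u @ b <= w @ (P @ x) + 1e-12 # D(u,w) ⊇ Px + C in scalar form
print("weak duality verified on all samples")
```

For this instance \(\mathcal {P} = (1,1)^T + C\), and intersecting the halfspaces \(\left\{ z \mid t z_1 + (1-t) z_2 \ge 1 \right\} \) over \(t \in [0,1]\) returns exactly \(\mathcal {P}\), so strong duality can be read off by hand.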

Remark 7.3

In the vector optimization literature one can observe the longstanding paradigm that the dual of a vector optimization problem should be a vector optimization problem with the same ordering cone. To fulfill this requirement, problems of the type

$$\begin{aligned} {\textstyle \max _C} \quad z \quad \text {subject to} \quad z \in D(u,w),\; (u,w)\in T \end{aligned}$$
(7.4)

have been considered, see e.g. [19, Sect. 4.5.1] and in the linear case [20]. The price is high. In general, important properties like linearity of the constraints and convexity of the feasible set get lost by such a transformation.

To emphasize the “linear” character of problem (\(D_L\)), we transform it into an equivalent linear vector optimization problem:

$$\begin{aligned} {\textstyle {\max _K} \quad D^*(u,w) \quad \text {subject to} \quad (u,w) \in T,} \quad {(D^{*}_{L})} \end{aligned}$$

where the objective function \(D^* :\mathrm {I\negthinspace R}^m\times \mathrm {I\negthinspace R}^q \rightarrow \mathrm {I\negthinspace R}^q\), given by

$$\begin{aligned} D^*(u,w) := (w_1,\dots ,w_{q-1},b^T u)^T, \end{aligned}$$

is linear and vector-valued, and the ordering cone is \(K := \left\{ z \in \mathrm {I\negthinspace R}^q \mid z_1 = \dots = z_{q-1} = 0, \; z_q \ge 0 \right\} \). A (finitely generated) solution of (\(D^{*}_{L}\)) is a nonempty set \(\bar{T} \subseteq T\) of maximizers with respect to \(\le _K\) in \(\mathrm {I\negthinspace R}^q\) satisfying \(D^*[T] \subseteq \mathrm{{co \,}}D^*[\bar{T}] - K\), where the latter condition refers to supremum attainment in \(\bar{T}\) (with respect to the lattice extension with image space \(\mathcal {G}(\mathrm {I\negthinspace R}^q, K)\)).
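For concreteness, the somewhat unusual ordering cone K and the objective \(D^*\) admit one-line implementations; the following helpers are ours (illustrative, not from [101]):

```python
import numpy as np

# z1 <=_K z2 holds iff the first q-1 components agree and z1_q <= z2_q.
def leq_K(z1, z2):
    return bool(np.allclose(z1[:-1], z2[:-1]) and z1[-1] <= z2[-1])

# Geometric-dual objective D*(u, w) = (w_1, ..., w_{q-1}, b^T u)^T.
def D_star(u, w, b):
    return np.append(w[:-1], b @ u)

b = np.array([1.0, 1.0])
z = D_star(np.array([0.5, 0.5]), np.array([0.3, 0.7]), b)   # -> [0.3, 1.0]
assert leq_K(np.array([0.3, 0.9]), z)
```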

Proposition 7.4

The problems (\(D_L\)) and (\(D^{*}_{L}\)) have the same solutions.

Proof

See [151, Theorems 4.38 and 4.57].   \(\square \)

In the sense of the previous proposition, (\(D_L\)) and (\(D^{*}_{L}\)) are equivalent. This means that the set-valued dual problem (\(D_L\)) can be expressed as a linear vector optimization problem, however, with a different ordering cone K and an interpretation of the duality relation which differs from the one in standard references.

Of course, we can derive a set-valued dual problem to (\(D^{*}_{L}\)) by an analogous procedure. This leads to outer and inner approximations and different representations of \(\mathcal {D}:= \left\{ D^*(u,w) \mid (u,w)\in T \right\} - K\), i.e., the optimal value of the lattice extension of (\(D^{*}_{L}\)).

Problem (\(D^{*}_{L}\)) is called the geometric dual problem, and there is a further duality relation called geometric duality [101] between (\(P_{L}\)) and (\(D^{*}_{L}\)): There is an inclusion-reversing one-to-one map between the proper faces of \(\mathcal {P}\) and the proper K-maximal faces of \(\mathcal {D}\). This means, for instance, that a vertex of one set can be used to describe a facet of the other set and vice versa. For a detailed explanation of geometric duality see [101, 151]. Geometric duality has been extended to convex vector optimization problems, see [100]. The paper [157] is in the same spirit.

7.3 Approximate Solutions and Algorithms

In this section, we assume that C is a closed convex cone. Let \(f :X \rightarrow \mathcal {G}(Z, C)\) be a function. The starting point for constructing algorithms for solving the problem (P) (see Sect. 3.1), i.e.

$$\begin{aligned} {\text {minimize} \quad f(x) \quad \text {subject to} \quad x \in X} \end{aligned}$$
(P)

should be Definition 3.3: It involves minimal values of f as well as the infimum taken in \(\mathcal {G}(Z,C)\). In order to obtain implementable algorithms, both notions should be replaced by appropriate approximate versions.

Recall \(I(f) = \inf _{x \in X} f(x)\). Two sets \(A, B \in \mathcal {G}(Z, C)\) are called an outer approximation and an inner approximation of I(f), respectively, if \(A \supseteq I(f) \supseteq B\). Outer and inner approximations of I(f) could be generated by sets \(M \subseteq \mathrm{{dom \,}}f\) or by dual admissible elements.

Definition 7.5

Let \(D :\mathrm {I\negthinspace R}_+ \rightarrow \mathcal {G}(Z, C)\) be a function satisfying

  (i) \(D(\varepsilon _2) \supseteq D(\varepsilon _1)\) for all \(\varepsilon _1, \varepsilon _2 \in \mathrm {I\negthinspace R}_+\) with \(0 < \varepsilon _1 \le \varepsilon _2\), and

  (ii) \(C = D(0) = \bigcap _{\varepsilon > 0} D(\varepsilon )\).

A set \(M \subseteq \mathrm{{dom \,}}f\) is called a \((D, \varepsilon )\)-solution of (P) if

$$ \inf ~f[M] \oplus D(\varepsilon ) \supseteq I(f), $$

and each \(x \in M\) is a minimizer of f.

A similar concept applies to supremum problems which can be useful in connection with duality. If M is a \((D, \varepsilon )\)-solution of (P), then

$$ \inf ~f[M] \oplus D(\varepsilon ) \supseteq I(f) \supseteq \inf ~f[M], $$

i.e., \(\inf ~f[M]\) trivially is an inner approximation of I(f).

The condition that elements of M be minimizers for f might be relaxed to any type of approximate minimizers, thus producing sets of \((D, \varepsilon )\)-solutions consisting of approximate minimizers. Similarly, the intersection in (ii) might be replaced by any type of set convergence which is sometimes useful if \(C \subseteq D(\varepsilon )\) is not satisfied for some (or all) \(\varepsilon > 0\).

It turned out that effective algorithms for vector and set optimization problems generate \((D, \varepsilon )\)-solutions, for example with

$$ D(\varepsilon ) = C - \varepsilon c $$

with some \(c \in C\backslash (-C)\), even \(c \in \mathrm{{int\,}}C\) under the assumption that the latter set is non-empty. This idea has been exploited with Benson’s outer approximation algorithm as the building block, see [87, Remark 4.10] and [153, Proposition 4.8]. The obtained algorithms indeed produce approximations of the set-valued infimum for (linear, convex) vector optimization problems. In [154], it is shown that the same idea can be used for minimizing a polyhedral set-valued function (i.e., a \(\mathcal {G}({\mathrm {I\negthinspace R}}^q, C)\)-valued function whose graph is a polyhedral set): The corresponding algorithm produces solutions in the sense of Definition 3.3 and might be considered as the first “true set-valued” algorithm. Its extension to non-polyhedral problems is highly desirable and another challenge for the future.
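Since the sets arising here are of the form “points plus C,” the inclusion \(\inf f[M] \oplus D(\varepsilon ) \supseteq I(f)\) from Definition 7.5 can be tested via support functions over a base of \(C^+\). The following sketch (Python; sampled directions make the test approximate, and all data, including \(C = \mathrm {I\negthinspace R}^q_+\), are our illustrative assumptions) implements such a test for \(D(\varepsilon ) = C - \varepsilon c\):

```python
import numpy as np

# Support-function test of  inf f[M] ⊕ D(ε) ⊇ I(f)  with D(ε) = C - εc and
# C = R^q_+: for every w in a base of C^+ one needs
#     min_{x ∈ M} w^T F(x) - ε w^T c  <=  min_{x feasible} w^T F(x).
def is_D_eps_solution(F_M, F_feas, eps, c, n_samples=500, tol=1e-9):
    rng = np.random.default_rng(1)
    for _ in range(n_samples):
        w = rng.random(F_M.shape[1]); w /= w.sum()   # w in a base of C^+
        if np.min(F_M @ w) - eps * (w @ c) > np.min(F_feas @ w) + tol:
            return False
    return True

F_feas = np.array([[0.0, 2.0], [1.0, 0.5], [2.0, 0.0]])  # images of feasible points
F_M = F_feas[[0, 2]]                                     # drop the vertex (1, 0.5)
print(is_D_eps_solution(F_M, F_feas, eps=1.0, c=np.ones(2)))  # True
print(is_D_eps_solution(F_M, F_feas, eps=0.0, c=np.ones(2)))  # False (with high probability)
```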

We note that a different algorithmic approach for producing minimizers with respect to a set relation can be found in [116]. In particular, it provides a numerical test of whether two (compact) sets \(A, B \subseteq Z\) are in relation with respect to \(\preccurlyeq _C \cap \curlyeqprec _C\) (compare the closely related Sect. 4.2 of this survey and [115]). In the polyhedral case, this test can be implemented on a computer. An algorithm is given which produces minimizers of a set-valued function if the set of feasible points is finite, and a descent method [116, Algorithm 4.1] for problem (P) generates feasible points which are minimal with respect to some finite subset of the set of feasible points.

7.4 Set-valued Risk Measures

Set-valued risk measures shall serve as a prominent example of set-valued translative functions as discussed in Sect. 5.5. The framework is the following. By a slight abuse of notation, X in this section denotes a random variable rather than a linear space.

Let \(\left( \Omega , \mathcal {F}_T, P \right) \) be a probability space. A multivariate random variable is an \(\mathcal {F}_T\)-measurable function \(X :\Omega \rightarrow \mathrm {I\negthinspace R}^d\) for some positive integer \(d \ge 2\). If \(d=1\), the random variable is called univariate. Let us denote by \(L^0_d =L^0_d\left( \Omega , \mathcal {F}_T, P \right) \) the linear space of the equivalence classes of all \(\mathrm {I\negthinspace R}^d\)-valued random variables which coincide up to sets of P-measure zero (P-almost surely). As usual, we write

$$ \left( L^0_d \right) _+ = \left\{ X \in L^0_d \mid P\left( \left\{ \omega \in \Omega \mid X\left( \omega \right) \in \mathrm {I\negthinspace R}^d_+ \right\} \right) = 1 \right\} $$

for the closed convex cone of \(\mathrm {I\negthinspace R}^d\)-valued random vectors with P-almost surely nonnegative components. An element \(X \in L^0_d\) has components \(X_1, \ldots , X_d\) in \(L^0 = L^0_1\). In a similar way, we use \(L^p_d\) for the spaces of equivalence classes of d-dimensional random variables whose components are to the pth power integrable (if \(0 < p < \infty \)) and essentially bounded (if \(p = \infty \)). The symbol \(\mathrm {1\negthickspace I}\) denotes the random variable in \(L^0_1\) which has P-almost surely the value 1.

Let \(M \subseteq \mathrm {I\negthinspace R}^d\) be a linear subspace. We set \(M_+ = M \cap \mathrm {I\negthinspace R}^d_+\) and assume \(M_+ \ne \left\{ 0 \right\} \) in the following.

Definition 7.6

([82]) A function \(R :L^p_d \rightarrow \mathcal {P}\left( M, M_+ \right) \) is called a risk measure if it is

  (R0) finite at \(0 \in L^p_d\): \(R\left( 0 \right) \ne \emptyset \), \(R\left( 0 \right) \ne M\);

  (R1) M-translative:

  $$\begin{aligned} \forall X \in L^p_d, \; \forall u \in M :R\left( X + u\mathrm {1\negthickspace I} \right) = R\left( X \right) - u; \end{aligned}$$
  (7.5)

  (R2) \(\left( L^p_d \right) _+\)-monotone: \(X^2 - X^1 \in \left( L^p_d \right) _+\) \(\Rightarrow \) \(R\left( X^2 \right) \supseteq R\left( X^1 \right) \).

Set-valued risk measures are indeed recognized as T-translative if, within the notation of Sect. 5.5, \(X=L^p_d\), \(Z=M\), \(C = M_+ \subseteq M\) and the linear operator \(T :M \rightarrow L^p_d\) is defined by \(Tu = -u\mathrm {1\negthickspace I}\). This means that T assigns to each \(u\in M\) the random vector being constantly equal to \(-u\).

A financial interpretation is as follows. A multivariate random variable is understood as a model for an unknown future portfolio or payoff of d assets where each component indicates the number of units of the corresponding asset in the portfolio. The elements of R(X) are understood as deposits, to be given at initial time, which compensate for the risk of X. The collection of all such risk compensating initial portfolios is understood as a measure of the risk associated to X. Such deposits usually involve fewer assets than the original portfolio, for example cash in a few currencies. This motivates the introduction of the space M which is called the space of eligible portfolios. A typical example is \(M = \mathrm {I\negthinspace R}^m \times \left\{ 0 \right\} ^{d-m}\) for \(1 \le m \le d\) with \(m \ll d\).

The axiom (R1) roughly means that the risk of \(X + u\mathrm {1\negthickspace I}\) is the risk of X reduced by u whenever \(u \in M\). Axiom (R2) also has a clear interpretation: if a random vector \(Y\in L^p_d\) dominates another random vector \(X\in L^p_d\), then there should be more possibilities to compensate for the risk of Y (in particular cheaper ones) than for X. Finiteness at zero means that there is an eligible portfolio which covers the risk of the zero payoff, but not all eligible portfolios do. Convexity is an important property as it allows one to account for diversification effects.

From M-translativity and \(\left( L^p_d \right) _+\)-monotonicity it follows that R maps into \(\mathcal {P}\left( M,M_+ \right) \). Clearly, the image space of a closed convex risk measure is \(\mathcal {G}\left( M,M_+ \right) \).
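Before a market model enters, a toy example may be helpful. The following sketch is our own construction for illustration (finite probability space, \(M = \mathrm {I\negthinspace R}^d\), acceptability meaning componentwise nonnegativity); the resulting “regulator”-type risk measure is then available in closed form:

```python
import numpy as np

# Finite Ω, M = R^d, acceptability = componentwise nonnegativity: then
# R(X) = {u | X + u·1I >= 0 a.s.} = max_ω(-X(ω)) + R^d_+ (componentwise max).
X = np.array([[ 1.0, -2.0],    # scenario 1: payoffs of d = 2 assets
              [-3.0,  0.5],    # scenario 2
              [ 0.0, -1.0]])   # scenario 3
u_min = np.max(-X, axis=0)     # here: [3.0, 2.0]
print("R(X) =", u_min, "+ R^2_+")
# (R0): R(0) = R^d_+ is neither empty nor all of M; (R1) and (R2) are
# immediate from the closed-form expression.
```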

If trading is allowed, a market model has to be incorporated. Here, a one-period market with proportional transaction costs as in [121, 205] is considered. It is given by closed convex cones \(K_0\) and \(K_T = K_T\left( \omega \right) \) with \(\mathrm {I\negthinspace R}_+^d \subseteq K_t(\omega ) \subsetneq \mathrm {I\negthinspace R}^d\) for all \(\omega \in \Omega \) and \(t \in \left\{ 0, T \right\} \) such that \(\omega \mapsto K_T\left( \omega \right) \) is \(\mathcal F_T\)-measurable. These cones, called solvency cones, consist precisely of the positions which can be exchanged into a nonnegative portfolio at time 0 and T, respectively, by trading according to the prevailing exchange rates. We set \(K_0^M := M \cap K_0 \subseteq M\), which is the cone of “solvent” eligible portfolios. The set

$$\begin{aligned} L^p_d\left( K_T \right) = \left\{ X \in L^p_d \mid P\left( \left\{ \omega \in \Omega \mid X\left( \omega \right) \in K_T\left( \omega \right) \right\} \right) = 1 \right\} \end{aligned}$$

is a closed convex cone in \(L^p_d\).

Definition 7.7

([82]) A risk measure \(R :L^p_d \rightarrow \mathcal {P}\left( M,M_+ \right) \) is called market-compatible if it maps into \(\mathcal {P}\left( M,K_0^M \right) \) and is \(L^p_d\left( K_T \right) \)-monotone, that is \(X^2 - X^1 \in L^p_d\left( K_T \right) \) implies \(R\left( X^2 \right) \supseteq R\left( X^1 \right) \).

Let \(1 \le p \le \infty \). We consider the dual pairs \((L^p_d,L^q_d)\) with \(\frac{1}{p}+\frac{1}{q} = 1\) and endow \(L^p_d\) with the norm topology if \( p<\infty \) and with the \(\sigma \left( L^\infty _d, L^1_d \right) \)-topology if \(p = +\infty \). The duality pairing is given by \((X, Y) \mapsto E[Y^TX]\) for \(X \in L^p_d\), \(Y \in L^q_d\). The adjoint operator \(T^* :L^q_d \rightarrow M\) is given by \(T^*Y=\Pr _M E \left[ -Y \right] \) where \(\Pr _M\) denotes the projection operator onto the linear subspace M.
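On a finite probability space the adjoint is a finite-dimensional computation; a minimal sketch (illustrative data, with \(M = \mathrm {I\negthinspace R}^m \times \left\{ 0 \right\} ^{d-m}\)):

```python
import numpy as np

# Sketch of T*Y = Pr_M E[-Y] with M = R^m x {0}^(d-m) (assumed data).
p = np.array([0.4, 0.6])                 # P on two scenarios
Y = np.array([[1.0, -2.0, 0.5],
              [0.0,  1.0, 2.0]])         # Y(ω) ∈ R^3, one row per scenario
m = 2                                    # eligible assets: the first two
E_minus_Y = p @ (-Y)                     # E[-Y]
T_star_Y = np.concatenate([E_minus_Y[:m], np.zeros(Y.shape[1] - m)])
print(T_star_Y)                          # projection onto M keeps first m entries
```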

The biconjugation theorem, Theorem 5.8, can be used to obtain a dual description of a closed convex market-compatible set-valued risk measure of the form

$$\begin{aligned} R(X)= R^{**}(X) = \bigcap _{Y\in L^q_d, \, v \in \left( K_0^M \right) ^+\backslash \left\{ 0 \right\} } \left( S_{(Y,v)}(X) + (-R^*)(Y,v) \right) \end{aligned}$$
(7.6)

with

$$ S_{(Y,v)}(X)= \left\{ u\in M \mid v^Tu\ge E \left[ Y^T X \right] \right\} $$

and \(\left( K_0^M \right) ^+= \left\{ v\in M \mid \forall u\in K_0^M :v^Tu\ge 0 \right\} \).

Using the considerations of Sect. 5.5 and taking into account that \(L^p_d(K_T)\)-monotonicity implies \((-R^*)\left( Y, v \right) =M\) if \(-Y\not \in L^q_d\left( K^+_T \right) \) we get

$$\begin{aligned} (-R^*)\left( Y, v \right) = \left\{ \begin{array}{c@{\quad }c@{\quad }c} \displaystyle \mathrm{{cl \,}}\bigcup _{X \in A_R} S_{\left( -Y, v \right) }\left( X \right) &{} : &{} -Y \in L^q_d\left( K^+_T \right) , \; v=\Pr _M E \left[ -Y \right] \\ M &{} : &{} \text{ else. } \end{array}\right. \end{aligned}$$
(7.7)

Recall \(A_R = \left\{ X \in L^p_d \mid 0 \in R(X) \right\} \) from Sect. 5.5.

The next lemma provides a change of variables from vector densities Y to vector probability measures Q. This allows a formulation of the dual representation result in terms of probability measures, as is common in the scalar case.

In the following, \(\mathrm{{diag}}\left( w \right) \) with \(w \in \mathrm {I\negthinspace R}^d\) denotes the diagonal matrix with the components of w as entries in its main diagonal and zero elsewhere. Moreover, \(\mathcal {M}^P_d = \mathcal {M}^P_d\left( \Omega , \mathcal {F}_T \right) \) denotes the set of all vector probability measures with components being absolutely continuous with respect to P, i.e. \(Q_i :\mathcal {F}_T \rightarrow \left[ 0,1 \right] \) is a probability measure on \(\left( \Omega ,\mathcal {F}_T \right) \) such that \(\frac{dQ_i}{dP} \in L^1\) for \(i = 1, \ldots , d\).

Lemma 7.8

  (a) Let \(Y \in L^q_d\left( K^+_T \right) \), \(v =\Pr _M E \left[ Y \right] \in \left( K_0^M \right) ^+\backslash \{0\}\). Then there are \(Q \in \mathcal {M}^P_d\), \(w \in K_0^+\backslash M^\perp + M^\perp \) such that \(\mathrm{{diag}}\left( w \right) \frac{dQ}{dP} \in L^q_d\left( K^+_T \right) \) and \(S_{\left( Y, v \right) } = {F}^M_{\left( Q, w \right) }\) with

  $$\begin{aligned} {F}^M_{\left( Q, w \right) } \left[ X \right] = \left\{ z \in M \mid w^TE^Q \left[ X \right] \le w^T z \right\} = \left( E^Q \left[ X \right] + H^+(w) \right) \cap M. \end{aligned}$$
  (7.8)

  (b) Vice versa, if \(Q \in \mathcal {M}^P_d\), \(w \in K_0^+\backslash M^\perp + M^\perp \) such that \(\mathrm{{diag}}\left( w \right) \frac{dQ}{dP} \in L^q_d\left( K^+_T \right) \), then there is \(Y \in L^q_d\left( K^+_T \right) \) such that \(v :=\Pr _M E \left[ Y \right] \in \left( K_0^M \right) ^+\backslash \{0\}\) and \({F}^M_{\left( Q, w \right) } = S_{\left( Y, v \right) } \).

Proof

See [82].   \(\square \)
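The construction behind part (a) is explicit on a finite probability space. A minimal sketch (illustrative data; simplifying assumptions: \(M = \mathrm {I\negthinspace R}^d\), componentwise nonnegative Y with \(E[Y_i] > 0\)):

```python
import numpy as np

# Given a componentwise nonnegative Y with E[Y_i] > 0, put w_i = E[Y_i]
# and dQ_i/dP = Y_i / w_i; then each Q_i is a probability measure << P
# and diag(w) dQ/dP = Y, as required.
p = np.array([0.2, 0.5, 0.3])          # P on three scenarios (assumption)
Y = np.array([[1.0, 2.0],
              [0.5, 1.0],
              [2.0, 0.0]])             # Y(ω) ∈ R^2, one row per scenario
w = p @ Y                              # w_i = E[Y_i]
dQ_dP = Y / w                          # componentwise densities
Q = p[:, None] * dQ_dP                 # Q_i({ω}) = P({ω}) dQ_i/dP(ω)
assert np.allclose(Q.sum(axis=0), 1.0) # each Q_i is a probability measure
assert np.allclose(dQ_dP * w, Y)       # diag(w) dQ/dP = Y recovered
```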

Let us denote the set of dual variables by

$$ \mathcal {W}^q = \left\{ \left( Q, w \right) \in \mathcal {M}^P_d \times \mathrm {I\negthinspace R}^d \mid w \in K_0^+\backslash M^\perp + M^\perp , \; \mathrm{{diag}}\left( w \right) \frac{dQ}{dP} \in L^q_d\left( K^+_T \right) \right\} . $$

The preceding considerations lead to the following dual representation result.

Theorem 7.9

A function \(R :L^p_d \rightarrow \mathcal {G}\left( M,K_0^M \right) \) is a market-compatible closed (\(\sigma \left( L^\infty _d, L^1_d \right) \)-closed if \(p = \infty \)) convex risk measure if, and only if, there is a set \(\mathcal W^q_R \subseteq \mathcal W^q\) such that

$$\begin{aligned} \forall X \in L^p_d :R\left( X \right) = \bigcap _{\left( Q, w \right) \in \mathcal {W}^q} \left[ (-\alpha _R)\left( Q, w \right) + \left( E^Q \left[ -X \right] + H^+(w) \right) \cap M \right] , \end{aligned}$$
(7.9)

where the function \(-\alpha _R :\mathcal W^q \rightarrow \mathcal G(M, M_+)\) is defined by

$$ \forall \left( Q, w \right) \in \mathcal W^q_R :(-\alpha _R)\left( Q, w \right) = \mathrm{{cl \,}}\bigcup _{X' \in A_R} \left( E^Q \left[ X' \right] + H^+(w) \right) \cap M $$

and \((-\alpha _R)\left( Q, w \right) = M\) whenever \(\left( Q, w \right) \in \mathcal W^q \backslash \mathcal W^q_R\).

Proof

See [82].   \(\square \)

Lemma 7.8 shows that the set \(\mathcal {W}^\infty \) for \(M = \mathrm {I\negthinspace R}^d\) coincides with the set of so-called consistent price systems (or processes). Strictly consistent price systems are crucial for market models with proportional transaction costs: In finite discrete time, the existence of such a price system is equivalent to the fundamental robust-no-arbitrage condition (see [205] for conical and [187] for convex market models). Therefore, results like Theorem 7.9, derived with set-valued duality tools, fit nicely into the mathematical finance background: They produce the correct dual variables, and they yield formulas which look like the corresponding scalar ones.

7.5 Comments on Applications

Duality for vector optimization problems is already discussed in Sect. 6.3. We add a few remarks about the linear case. It is an astounding fact that there still is no consensus on what to consider as the “canonical” dual of a linear vector optimization problem. After early contributions by Kornbluth [127], Isermann [110, 111] and Rödder [198], Ivanov and Nehse [112] discuss five different duals for a given linear vector optimization problem, which illustrates the ambiguity even in the “simplest”, i.e. linear, case. The difficulty is further illustrated by means of the examples in [24] and [114, Discussion after Theorem 8.13]. A set-valued approach has been presented in [81] and later compared to several “vector-valued” duals in [20]. Compare also [103] and Dinh The Luc [157]. We believe that this ambiguity and the mathematical difficulties that come with it are rooted in the non-totalness of the order: A two-player matrix game with vector payoffs is hardly in equilibrium since the decisions of the players also depend on their “vertical preferences” (as well as on their guesses about the vertical preference of the opponent), i.e. the weight they put on the components of the payoff vectors. This topic, essentially the link between set-valued convex duality and games with vector payoffs (more generally, payoffs which are not totally ordered), seems to be one of the most interesting open questions that can be derived from the material presented in this survey.

One advantage of the complete lattice approach is that the set-valued calculus deals with all “vertical preferences,” i.e. all reasonable scalarizations, at the same time. This makes it possible to re-discover scalar duality results on a “set level.”

In 1998, Benson [10, 11] proposed an “outer approximation algorithm” to solve linear vector optimization problems “in the outcome space.” Benson motivated this by three practical reasons: First, the set of minimal elements in the outcome space \(\mathrm {I\negthinspace R}^q\) has a simpler structure than the set of minimizers in the decision space \(\mathrm {I\negthinspace R}^n\), because one usually has \(q \ll n\). The same argument motivated [40, 41], which already contain similar algorithms based on the analysis given in [39]. The second reason is that a decision maker prefers to base her decision on objectives rather than directly on a set of efficient decisions. The third argument is that many feasible points are mapped onto a single image point, which may lead to redundant information.

Later it turned out that Benson’s algorithm just computes solutions to (\(P_{L}\)) and (\(D_L\)) as defined above, see [87, 151]. Therefore, Benson’s arguments motivate the solution concepts introduced in Sect. 3.1 from an application-oriented viewpoint, compare also [154]. The geometric duality theory [100, 101] briefly discussed in Sect. 7.2 is a fundamental tool for developing dual algorithms for solving linear and convex vector optimization problems, see [49] (circulating as a preprint already in 2007) and [87, 153]. Compare also [158, 159] for the convex case and [70, 71] even for nonconvex variants.

Set-valued risk measures have been introduced in [120]. It contains a dual representation result for the sublinear case, basically a combination of the formulae (5.12) and (7.6). A more systematic development including the extension to the general convex case has been presented in [80] while market compatibility is due to [82]. A link to depth-trimmed regions, yet another set-valued object from statistics, can be found in [25]. Currently, the set-valued approach for evaluating multivariate risks is gaining more and more attention, see for example [53, 119, 165] and also [51, 52]. Applications of Benson’s algorithm and its variants to financial problems can be found in [87, 88, 152] and related approaches in [200, 201] as well as in [38].