
1 Introduction

Multi-objective optimization problems and bi-level optimization problems have both been studied thoroughly during the last decades. The relatively new class of optimization problems considered in this article can be understood as a combination of these two problem classes in the sense that both the higher and the lower level problem of such a bi-level optimization problem are given by multi-objective optimization problems. In other words, we are concerned with a multi-objective optimization problem (the higher level of the bi-level optimization problem) whose feasible set is restricted by the solution set of another (parametrized) multi-objective optimization problem (the lower level). We therefore call these problems bi-level multi-objective optimization problems (BLMOPs). To demonstrate the practical relevance of such problems, consider the following example. In the design of a passenger car, two important goals are the fuel consumption (to be minimized) and the power of the engine (to be maximized), leading to a bi-objective optimization problem on the higher level. However, for safety reasons, the mechanical guidance of the undercarriage has to be optimized in both the horizontal and the vertical direction in the first place, leading to another bi-objective problem on the lower level.

In this chapter, we will concentrate on problems with equality constraints for both the higher and the lower level problem. Moreover, we assume that the lower level problem is convex, that is, the lower level objectives are assumed to be convex and the lower level constraints are assumed to be affine-linear.

The outline of this article is as follows. In Sect. 2 we review the basic definitions and concepts of multi-objective optimization and bi-level optimization needed to understand the contents of the subsequent sections. The proposed algorithm for the solution of a BLMOP is presented in Sect. 3. In Sect. 4, we prove convergence of the algorithm. Then, in Sect. 5, we indicate the efficiency of the algorithm on two academic examples. Finally, we draw our conclusions in Sect. 6.

2 Background and Related Work

In the following we briefly review the relevant definitions and concepts of multi-objective optimization and bi-level optimization. Next, we describe in detail the bi-level multi-objective optimization problem (BLMOP) that we will consider in this article. We also state a Kuhn-Tucker based reformulation of the given BLMOP, which is used for the construction of the subproblem to be solved repeatedly in order to compute the individual points of the solution set as our new BL-Recovering-IS algorithm presented in Sect. 3 proceeds.

In a multi-objective optimization problem (MOP) one is faced with the problem that several objectives have to be optimized at the same time. Mathematically, a continuous MOP can be expressed as

$$\begin{aligned} \min _{x\in S}F(x). \end{aligned}$$
(MOP)

Hereby, the map F is defined by the individual objective functions \(F_i\), i.e.,

$$\begin{aligned} F:S\rightarrow \mathbb {R}^k,\qquad F(x) = (F_1(x),\ldots ,F_k(x))^T, \end{aligned}$$
(1)

where we assume all functions \(F_i:S\rightarrow \mathbb {R}\), \(i=1,\ldots , k\), to be continuous. Problems with \(k=2\) objectives are termed bi-objective optimization problems (BOPs).

The domain or feasible set \(S\subset \mathbb {R}^n\) of F can in general be expressed by equality and inequality constraints,

$$\begin{aligned} S = \{x\in \mathbb {R}^n\; | \; G_i(x)\le 0,\; i=1,\ldots ,l,\;\text {and}\; H_j(x) = 0,\; j=1,\ldots ,p\}. \end{aligned}$$
(2)

If \(S=\mathbb {R}^n\), we call the MOP unconstrained.

Optimality of a MOP is based on the concept of dominance.

Definition 1

  1. (a)

    Let \(v,w\in \mathbb {R}^k\). Then the vector v is less than w (in short: \(v<_p w\)) if \(v_i< w_i\) for all \(i\in \{1,\ldots ,k\}\). The relation \(\le _p\) is defined analogously.

  2. (b)

    A vector \(y\in S\) is called strictly dominated (or simply dominated) by a vector \(x\in S\) (\(x\prec y\)) with respect to (MOP) if

    $$\begin{aligned} F(x)\le _p F(y)\qquad and \qquad F(x) \ne F(y), \end{aligned}$$

    else y is called non-dominated by x.
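As a small illustration of Definition 1, the dominance test can be written in a few lines of code; this is only a sketch for finite objective vectors (using NumPy, names chosen here for illustration) and not part of the algorithm of Sect. 3.

```python
import numpy as np

def dominates(Fx, Fy):
    """True if the objective vector Fx dominates Fy in the sense of Definition 1(b):
    Fx <=_p Fy componentwise and Fx != Fy."""
    Fx, Fy = np.asarray(Fx, dtype=float), np.asarray(Fy, dtype=float)
    return bool(np.all(Fx <= Fy) and np.any(Fx < Fy))

print(dominates([1.0, 3.0], [2.0, 3.0]))   # True:  (1, 3) dominates (2, 3)
print(dominates([1.0, 3.0], [3.0, 1.0]))   # False: trade-off, neither vector dominates the other
```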

If a feasible point x dominates another feasible point y, then we can consider x to be ‘better’ than y with respect to the given MOP. The definition of optimality (i.e., the definition of the ‘best’ solutions) in multi-objective optimization is now straightforward.

Definition 2

  1. (a)

    A point \(x\in S\) is called (Pareto) optimal or a Pareto point of (MOP) if there exists no \(y\in S\) that dominates x.

  2. (b)

    The set of all Pareto optimal solutions is called the Pareto set, i.e.,

    $$\begin{aligned} P_S := \{x\in S\; : \; x\; is\,a\,Pareto\,point\,of \; (MOP)\}. \end{aligned}$$
    (3)
  3. (c)

    The image \(F(P_S)\) of \(P_S\) is called the Pareto front.
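On a finite sample of objective vectors, Definition 2 translates into a simple nondominance filter, which keeps exactly those sample points that are not dominated by any other sample point; the following quadratic-time sketch is only meant as an illustration of the definitions.

```python
import numpy as np

def nondominated(points):
    """Rows of 'points' (objective vectors) that are not dominated by any other row."""
    P = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(P):
        if not any(np.all(q <= p) and np.any(q < p)   # does some q dominate p?
                   for j, q in enumerate(P) if j != i):
            keep.append(i)
    return P[keep]

sample = np.array([[1.0, 4.0], [2.0, 2.0], [4.0, 1.0], [3.0, 3.0]])
print(nondominated(sample))   # (3, 3) is dominated by (2, 2) and is filtered out
```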

If all the objectives and constraint functions of the MOP are differentiable, one can state a necessary condition for optimality which is analogous to the one known from ‘classical’ scalar objective optimization problems (SOPs, i.e., MOPs with \(k=1\)).

Theorem 1

( [25]). Let \(x^*\) be a Pareto point of (MOP), where S is as in (2), and all objectives and constraint functions are differentiable in \(x^*\). Further, let the vectors \(\nabla H_i(x^*)\), \(i=1,\ldots , p\), be linearly independent. Then there exist vectors \({\alpha }^*\in \mathbb {R}^k\), \({\lambda }^*\in \mathbb {R}^l\), and \({\mu }^*\in \mathbb {R}^p\) such that the tuple \(({x}^*,{\alpha }^*,{\lambda }^*,{\mu }^*)\) satisfies

$$\begin{aligned} \begin{aligned} \sum _{i=1}^k \alpha _i^* \nabla F_i(x^*) + \sum _{i=1}^l \lambda _i^*\nabla G_i(x^*) + \sum _{i=1}^p \mu _i^*\nabla H_i(x^*) = 0\\ \alpha _i^* \ge 0,\quad i=1,\ldots ,k\\ \sum _{i=1}^k \alpha _i^* = 1\\ \lambda _i^*\ge 0,\quad i=1,\ldots ,l\\ \lambda _i^* G_i(x^*) = 0,\quad i=1,\ldots ,l. \end{aligned} \end{aligned}$$
(4)

Moreover, it is known that these conditions are already sufficient under the assumptions used in this article.

Theorem 2

( [25]). Assume that the objectives \(F_i, i=1,\ldots , k\), are convex. Further, let the problem contain no inequality constraints and let all equality constraints \(H_i, i=1,\ldots , p\), be affine-linear. Then the conditions stated in Theorem 1 are sufficient for a solution of (MOP).

A reference point \(t\in \mathbb {R}^k\) can be regarded as a vector that consists of desirable objective values called aspiration levels or targets, \(t_i, i=1,\ldots , k\).

In the following we will focus on distance function based approaches, which are relevant for our new algorithm presented in Sect. 3. As the name indicates, such approaches use a distance function, typically derived from a norm, to measure the distance between a reference point and a given point in image space. To state the auxiliary problem corresponding to a target vector \(t\in \mathbb {R}^k\), let \(\delta :\mathbb {R}^k\times \mathbb {R}^k\rightarrow \mathbb {R}_+\) be a distance function derived from a norm, i.e., \(\delta (a,b)=||a-b||\) for some norm \(||\cdot ||:\mathbb {R}^k\rightarrow \mathbb {R}_+\). Then the auxiliary problem to be solved is

$$\begin{aligned}&\displaystyle \min _{x\in S} \delta (F(x),t). \end{aligned}$$
(RPP)

If we have \(\delta (F(x^\star ),t)>0\), where \(x^\star \) is a solution to RPP, then we know that \(F(x^\star )\) is on the boundary of the image \(F(S)=\{F(x):x\in S\subset \mathbb {R}^n\}\). Moreover, if in addition \(t <_p F(x^\star )\) we can expect that \(x^\star \) is (at least a local) Pareto point. Thus, local Pareto points can be found by first choosing suitable targets and then solving RPP. Indeed, Theorem 3, which was taken from [13], guarantees that, under certain assumptions, \(x^\star \) is a Pareto point. For this, recall that a norm \(||\cdot ||:\mathbb {R}^k\rightarrow \mathbb {R}_+\) is called strictly monotonically increasing, if \(||y^1|| < ||y^2||\) for all \(y^1,y^2\in \mathbb {R}^k\) with \(|y^1_j| \le |y^2_j|, j=1,2,\ldots ,k\), and \(|y^1_j|\ne |y^2_j|\) for some j.
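To make the use of RPP concrete, the following sketch solves the distance minimization for a small unconstrained bi-objective toy problem with SciPy; the map F, the target t and the Nelder-Mead solver are illustrative assumptions and not taken from the article.

```python
import numpy as np
from scipy.optimize import minimize

def F(x):
    """A toy bi-objective map F: R^2 -> R^2 with Pareto set {(x1, 0): -1 <= x1 <= 1}."""
    return np.array([(x[0] - 1.0)**2 + x[1]**2,
                     (x[0] + 1.0)**2 + x[1]**2])

t = np.array([0.5, 0.5])                      # a target 'below' the Pareto front
res = minimize(lambda x: np.linalg.norm(F(x) - t), x0=np.zeros(2), method="Nelder-Mead")
x_star = res.x
print(x_star, F(x_star))   # if t <_p F(x_star), then x_star is (at least locally) Pareto optimal
```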

Theorem 3

( [13]). Let \(||\cdot ||\) be a strictly monotonically increasing norm and assume \(\displaystyle t_i = \min \{F_i(x):x\in S\}\) for \(i=1,2,\ldots ,k\). If \(x^\star \) is an optimal solution of RPP, then \(x^\star \) is a solution of MOP.

We stress that throughout this article we use \(||\cdot ||=||\cdot ||_2\) unless stated otherwise.

An analytic expression of the entire (exact) Pareto set/front is, except for some academic test problems, in general not available. In the literature, a huge variety of methods can be found for the effective numerical treatment of MOPs. There are, for instance, mathematical programming (MP) techniques, that is, point-wise iterative search techniques that generate a sequence of solutions converging toward one optimal solution (e.g., [16, 30] and references therein). The most widely used sub-class of the MP techniques is given by scalarization methods that transform the given MOP into a suitable auxiliary SOP (e.g., [19, 20, 32, 38, 39]). By solving a clever sequence of such SOPs, a suitable approximation of the entire Pareto set/front can be obtained in certain cases (e.g., [5, 14, 17, 18, 24, 30, 33]). Reference point methods use feasible or infeasible reference points for the construction of scalar valued auxiliary functions. For an overview of different types of reference point methods the reader is referred to [16].

Another class of methods is given by continuation-like methods that take advantage of the fact that the solution set forms, at least locally, a manifold. Such methods start from a given solution and perform a search along the solution manifold ([22, 27, 28, 29, 34, 36, 37, 43]). However, one potential drawback of all the above mentioned methods is that they are of local nature, i.e., they may get stuck in local Pareto optimal solutions of the given MOP, depending on the chosen starting point and the chosen method to solve the auxiliary SOP.

Next to these point-wise iterative methods there exist specialized set-oriented methods such as multi-objective evolutionary algorithms (MOEAs, e.g., [2, 3, 6]), subdivision techniques [10, 23, 41, 42], and cell mapping techniques [21, 31, 35, 45, 46, 48, 49]. These methods have in common that they operate on entire sets in an iterative manner and are thus able to deliver an approximation of the solution set in one run of the algorithm. Further, the set based approach allows a more global view on the problem, leading to a reduced probability of getting stuck in locally optimal solutions. Cell mapping techniques are particularly advantageous over other methods if a thorough investigation of the entire (low or moderate dimensional) system is of interest, as they deliver, next to Pareto set/front approximations, also approximations of the set of nearly optimal solutions as well as of the set of local solutions.

A bi-level optimization problem can be understood as an optimization problem (the higher level problem), where the feasible set is restricted by the solution set of another (parametrized) optimization problem (the lower level problem).

Many different approaches for solving (classical) bi-level optimization problems have been proposed in the past, for example descent algorithms, bundle algorithms, penalty methods, trust region methods, smoothing methods, and branch-and-bound methods. Many of these approaches are based on the conversion of the bi-level problem into an ordinary (or classical) optimization problem (a one-level problem). One possibility is to replace the lower level objective f by an additional non-differentiable equation \(f(x,y)=\varphi (y)\), where \(\varphi (y)=\min _x\{f(x,y):g(x,y)\le 0, h(x,y)=0\}\). Other approaches use the implicit function theorem to derive a local description of the function \(x(y):\mathbb {R}^m\rightarrow \mathbb {R}^n\), which is then inserted into the higher level problem. Another concept is to replace the lower level problem by its Kuhn-Tucker conditions. In general, the resulting one-level problem, which is a mathematical program with equilibrium constraints (MPEC), see [26], is not equivalent to the original problem, but the desired equivalence is ensured in the particular case where the lower level problem is convex. For an overview of bi-level optimization the reader is referred to [1, 4, 7, 11, 12, 15, 44, 47].

In this article, we are concerned with the case where both the higher and the lower level problem are given by multi-objective optimization problems. Such problems are called bi-level multi-objective optimization problems (BLMOPs), see [9, 14, 15].

The higher level problem of a BLMOP can be written as

$$\begin{aligned} \min _y \min _x \{F(x,y): x\in \psi (y)\}\\ \text {s.t. } H(x,y) = 0, \end{aligned}$$
(BLMOP-H)

where \(\psi (y)\) denotes, for every fixed \(y\in \mathbb {R}^m\), the solution set, that is, the Pareto set of the lower level problem given by

$$\begin{aligned} \min _x f(x,y)&\\ \text {s.t. }&h(x,y) = 0. \end{aligned}$$
(BLMOP-L)

It should be mentioned that, in the terminology of [11], BLMOP-H and BLMOP-L correspond to an optimistic formulation of the general bi-level optimization problem. Since in this article we concentrate on the case with a convex lower level problem, the lower level problem can be replaced by the corresponding Kuhn-Tucker conditions stated in Theorem 1 to obtain an expression which is equivalent to BLMOP-H. For this, we assume that the higher level problem includes k objective functions \(F_i:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}\), which are collected in the vector valued function \(F:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}^k\), \(F(x, y) = (F_1(x, y), \ldots , F_k(x, y))^t\), and r equality constraints \(H_i:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}\), which are collected in the vector valued function \(H:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}^r\), \(H(x, y) = (H_1(x, y), \ldots , H_r(x, y))^t\). Analogously, we assume that the lower level problem includes l objectives \(f_i:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}\), which are collected in the vector valued function \(f:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}^l\), \( f(x, y) = (f_1(x, y), \ldots , f_l(x, y))^t\), and p equality constraints \(h_i:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}\), which are collected in the vector valued function \(h:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}^p\), \(h(x, y) = (h_1(x, y), \ldots , h_p(x, y))^t\). We denote by

$$\mathcal {L}(x,y,\alpha ,\lambda ) :=\sum _{i=1}^l \alpha _i f_i(x,y)+\sum _{i=1}^p \lambda _i h_i(x,y)$$

the lower level Lagrangian and by \(\nabla _x \mathcal {L}\) the gradient of \(\mathcal {L}\) with respect to x. According to Theorems 1 and 2, \(x \in \psi (y)\) if and only if

$$\begin{aligned} \begin{array}{c} h(x,y)=0,\\ \nabla _x\mathcal {L}(x,y,\alpha ,\lambda )=0,\\ \displaystyle \sum _{i=1}^l \alpha _i =1,\\ \alpha _i \ge 0,\quad i=1,\ldots , l \end{array} \end{aligned}$$

for some \(\alpha \in \mathbb {R}^l\) and \(\lambda \in \mathbb {R}^p\). Let

$$\begin{aligned} \hat{F}(x, y, \alpha , \lambda , s) := \left( \begin{array}{c} h(x,y)\\ H(x,y)\\ \nabla _x\mathcal {L}(x,y,\alpha ,\lambda )\\ \displaystyle \sum _{i=1}^l \alpha _i -1\\ \alpha - (s \circ s) \end{array}\right) , \end{aligned}$$

where \(s\in \mathbb {R}^l\) is a vector of l slack variables and \(a\circ b\) denotes the component-wise product of two vectors a and b. Moreover, let \(z := (x,y,\alpha ,\lambda , s)\), \(\hat{S} := \{z:\hat{F}(z)=0\}\), and denote by \(\pi (z)\) the projection of z to the (x, y)-space \(\mathbb {R}^{n+m}\). Observe that \(\{\pi (z):z\in \hat{S}\}\) is the feasible set of BLMOP-H and therefore the desired reformulation of the given problem can be written as follows:

$$\begin{aligned} \min _{z\in \hat{S}}F(\pi (z)), \end{aligned}$$
(BLMOP-R)

where again minimization has to be understood in the sense of Definition 1. In order to handle BLMOP-R by the use of reference point methods, we define the following variant of RPP:

$$\begin{aligned} \min _{z\in \hat{S}} \delta (F(\pi (z)),t) \end{aligned}$$
(RPP-R)

Note that RPP-R is the subproblem that will be solved repeatedly for the computation of the individual Pareto points of the given BLMOP as our BL-Recovering-IS algorithm presented in Sect. 3 proceeds.
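For illustration, the residual map \(\hat{F}\) of the reformulation and the subproblem RPP-R can be set up as follows. This is only a sketch under several assumptions: the lower level objectives \(f_i\), the lower level constraints \(h_j\) and the higher level constraints H are given as Python callables of (x, y), \(\nabla _x\mathcal {L}\) is approximated by forward differences, and SciPy's SLSQP method stands in for whatever solver is actually used for the experiments in Sect. 5.

```python
import numpy as np
from scipy.optimize import minimize

def grad_x(fun, x, y, eps=1e-7):
    """Forward-difference approximation of the gradient of fun(x, y) with respect to x."""
    g = np.zeros(len(x))
    f0 = fun(x, y)
    for i in range(len(x)):
        xe = np.array(x, dtype=float)
        xe[i] += eps
        g[i] = (fun(xe, y) - f0) / eps
    return g

def F_hat(z, n, m, l, p, f_list, h_list, H):
    """Residual of the Kuhn-Tucker based reformulation for z = (x, y, alpha, lambda, s)."""
    x, y = z[:n], z[n:n + m]
    alpha, lam, s = z[n + m:n + m + l], z[n + m + l:n + m + l + p], z[n + m + l + p:]
    grad_L = sum(alpha[i] * grad_x(f_list[i], x, y) for i in range(l)) \
        + sum(lam[j] * grad_x(h_list[j], x, y) for j in range(p))
    return np.concatenate([
        np.array([h(x, y) for h in h_list]),   # h(x, y) = 0
        np.atleast_1d(H(x, y)),                # H(x, y) = 0
        grad_L,                                # nabla_x L(x, y, alpha, lambda) = 0
        [alpha.sum() - 1.0],                   # sum_i alpha_i = 1
        alpha - s * s,                         # alpha_i = s_i^2, i.e. alpha_i >= 0
    ])

def solve_rpp_r(t, z0, F_upper, n, m, l, p, f_list, h_list, H):
    """Sketch of RPP-R: minimize ||F(x, y) - t||_2 over z subject to F_hat(z) = 0."""
    constraints = [{"type": "eq",
                    "fun": lambda z: F_hat(z, n, m, l, p, f_list, h_list, H)}]
    objective = lambda z: np.linalg.norm(F_upper(z[:n], z[n:n + m]) - t)
    return minimize(objective, z0, method="SLSQP", constraints=constraints)
```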

3 Algorithm and Realization

We present the BL-Recovering-IS algorithm for the solution of equality constrained BLMOPs with a convex lower level problem. This algorithm can be understood as an extension of our algorithm for the solution of unconstrained MOPs described in [8]. In addition, we state some theoretical results which apply both to the algorithm presented here and to the algorithm presented in [8].

The aim of the BL-Recovering-IS algorithm is to generate both a box covering and a discrete representation of the entire Pareto front of the given BLMOP (see also Fig. 1).

We assume that this representation is required to be well-distributed in the higher level image space in the following sense: Denote by \(Q\subset \mathbb {R}^k\) the region of interest in image space. For formal reasons denote by \({\mathcal P}_d\) a complete partition of the set Q into boxes of subdivision size – or depth – d, which are generated by successive bisection of Q. These boxes are understood to be half-open, that is, they can be written as cartesian products \([a_1, b_1)\times \cdots \times [a_k, b_k)\) of half-open intervals \([a_i, b_i), i=1,\ldots , k\). Then there exists for every point \(\bar{F}\in Q\) exactly one box \(B(\bar{F},d)\in {\mathcal P}_d\) such that \(\bar{F}\in B(\bar{F},d)\). The algorithm computes both a covering

$$\begin{aligned} \mathcal {B}=\bigcup _{B \in {\mathcal P}_d,\; B\cap F(P)\ne \emptyset } B \end{aligned}$$

and a discrete representation of the Pareto set P. The discrete representation is well-distributed in the sense that for every \(B\in {\mathcal P}_d\) with \(B\cap F(P)\ne \emptyset \) there is at least one computed point \((x,y)\in \mathbb {R}^{n+m}\) such that \(F(x,y)\in B\).
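One way to realize the partition \({\mathcal P}_d\) in practice is to store only the integer indices of boxes that have been hit; the following sketch (with d interpreted as the number of subdivisions per coordinate, as in Sect. 5.2) maps a point of Q to the unique half-open box containing it.

```python
import numpy as np

def box_index(F_bar, lower, upper, depth):
    """Integer multi-index of the unique half-open box B(F_bar, d) of the partition
    P_d of Q = [lower_1, upper_1) x ... x [lower_k, upper_k) containing F_bar;
    'depth' is interpreted as the number of bisections per coordinate."""
    F_bar = np.asarray(F_bar, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    width = (upper - lower) / 2.0**depth
    return tuple(np.floor((F_bar - lower) / width).astype(int))

# Example with Q = [0, 10]^2 and depth 5, i.e. boxes of edge length 0.3125 as in Sect. 5.2:
print(box_index([3.0, 7.1], lower=[0, 0], upper=[10, 10], depth=5))   # -> (9, 22)
```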

Fig. 1. Idea of the BL-Recovering-IS algorithm: use boxes to obtain a uniform spread of solutions around the Pareto front.

In order to compute these points, our new image set-oriented algorithm presented in the following repeatedly solves a variant of RPP-R for varying targets. To state a corollary which guarantees that the corresponding solutions are at least locally Pareto optimal, we denote \(T=(T_1,\ldots ,T_k)\), where

$$\begin{aligned} \displaystyle T_i = \min \{F_i(\pi (z)):z\in \hat{S}\}\qquad \text {for}\; i=1,2,\ldots ,k, \end{aligned}$$

and define for a given target vector \(t\in \mathbb {R}^k\) the modified feasible set

$$\begin{aligned} \hat{S}_t=\{z\in \hat{S}:F_i(\pi (z))\ge t_i,\,i=1,\ldots , k\}. \end{aligned}$$

Furthermore, we define variants of BLMOP-R and RPP-R, respectively, by replacing \(\hat{S}\) by \(\hat{S}_t\):

$$\begin{aligned} \min _{z\in \hat{S}_t}F(\pi (z)), \end{aligned}$$
(BLMOP-R’)

$$\begin{aligned} \min _{z\in \hat{S}_t} \delta (F(\pi (z)),t). \end{aligned}$$
(RPP-R’)

Now, with these notations we can state the following result.

Corollary 1

Let F be continuous on the compact domain \(\hat{S}\). Moreover, let \(||\cdot ||\) be a strictly monotonically increasing norm and assume that \(T<_p t <_p F(\pi (z^\star ))\), where \(z^\star \) is an optimal solution of RPP-R’. Then \(\pi (z^\star )\) is a local solution of the given BLMOP.

Proof

Since \(F_i\) is continuous and since there are \(\bar{z}^i, z^\star \in \hat{S}\) with \(F_i(\pi (\bar{z}^i)) = T_i< t_i < F_i(\pi (z^\star ))\), there exist \(z^i\in \hat{S}\) such that \(F_i(\pi (z^i)) = t_i\) for all \(i=1,2,\ldots ,k\). From the construction of \(\hat{S}_t\) it is obvious that \(z^\star \in \hat{S}_t\) and \(\displaystyle t_i=\min \{F_i(\pi (z)):z\in \hat{S}_t\}\). Thus, Theorem 3 guarantees that \(z^\star \) solves BLMOP-R’. Since \(\hat{S}_t\) is constructed from \(\hat{S}\) just by constraining the image of F such that \(F(\pi (\hat{S}_t))\) contains a part of a locally Pareto optimal set in image space, \(z^\star \) is a local solution of BLMOP-R, that is, \(\pi (z^\star )\) is a local solution of the given BLMOP.    \(\square \)

In practice, a randomly chosen point \(t\in \mathbb {R}^k\) does not necessarily belong to the image \(F(\pi (\hat{S}))=\{F(\pi (z)):z\in \hat{S}\}\), that is, we do not know a priori whether there is any \(z\in \hat{S}\) such that \(F(\pi (z))=t\). Moreover, if \(F(\pi (z))=t\) for some \(z\in \hat{S}\), we do not know whether \(\pi (z)\) is Pareto optimal. To answer these questions, we solve the auxiliary problem RPP-R’. If \(t<_p F(\pi (z^\star ))\) for a solution \(z^\star \) of RPP-R’, then we know that – under suitable assumptions – \(\pi (z^\star )\) is at least locally Pareto optimal. Otherwise, if \(t=F(\pi (z^\star ))\), then we repeatedly have to vary t and solve RPP-R’ until \(t<_p F(\pi (z^\star ))\). A strategy for the choice and variation of the targets t can be found later in this section. In the algorithm described below and for the remainder of this article, the distance function \(\delta \) is based on the norm \(||\cdot ||_2\), that is, \(\delta (a,b)=||a-b||_2\) for all \(a,b\in \mathbb {R}^k\). Our algorithm belongs to the family of continuation methods ([8, 22, 40]), that is, the aim of every step is to compute Pareto points in the neighborhood of Pareto points already found in a previous step. Accordingly, we assume that at least one box B along with a point \(z^\star \) with \(F(\pi (z^\star ))\in F(P)\) has been computed previously, e.g., by the solution of RPP-R’ for the target \(t=(t_1,\ldots ,t_k)\), \(\displaystyle t_i = \min \{F_i(\pi (z)):z\in \hat{S}\}\) for \(i=1,2,\ldots ,k\).

Then, for a given box collection \(\mathcal {B}_j\subset \mathbb {R}^k\) (in image space) of subdivision depth d and denoting by \(z_B\) and \(F_B\) the previously generated solution (in parameter and image space, respectively) associated with a box \(B\in \mathcal {B}_j\), a step of the BL-Recovering-IS algorithm can be written as shown in Algorithm 1.

Algorithm 1. A step of the BL-Recovering-IS algorithm.
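Since the pseudocode of Algorithm 1 is not reproduced here, the following is only a schematic reconstruction of one step from the surrounding description; all helper callables (target proposal, RPP-R’ solver, box lookup, componentwise comparison \(<_p\)) are placeholders, not the authors' implementation.

```python
def recovering_step(boxes, choose_targets, solve_rpp_r_prime, box_of, F_upper_of, less_p):
    """One schematic step: 'boxes' maps a box index to a previously computed pair
    (z_B, F_B); solutions whose image passes the acceptance test t <_p F(pi(z)) and
    falls into a box not yet in the collection are added."""
    new_boxes = dict(boxes)
    for B, (z_B, F_B) in boxes.items():
        for t in choose_targets(B, F_B):        # targets near the current box
            z = solve_rpp_r_prime(t, z_B)       # warm start from the known solution
            F_z = F_upper_of(z)                 # F(pi(z)) in higher level image space
            if not less_p(t, F_z):              # reject if the acceptance test fails
                continue
            C = box_of(F_z)
            if C not in new_boxes:              # a new box on the Pareto front was hit
                new_boxes[C] = (z, F_z)
    return new_boxes
```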

It remains to answer the question of how to choose the target vectors \(t_i\), \(i=1,2,\ldots , n_t\), near a current box B in order to compute Pareto points which are well-distributed in the sense mentioned above. Efficient strategies for the choice of target vectors can be defined, particularly by using local information on the Pareto set, e.g. orientation or curvature, which can be calculated via objective derivatives (or numerical approximations of the derivatives). In the following we will focus on a particular strategy for the choice of the targets which was originally designed for problems with smooth objectives, but is also applicable and works satisfactorily in the case of more general objectives. Let us assume that the higher level image \(F(P)\subset \mathbb {R}^k\) of the Pareto set P is smooth and forms a \((k-1)\)-dimensional manifold in a neighborhood \(N_\varepsilon (F^\star )\) of a given Pareto optimal point \(F^\star \in F(P)\) in higher level image space. Since an approximation of F(P) at \(F^\star \) is given by the tangent space \(T_{F^\star }F(P)\), there are certainly further Pareto points near \(T_{F^\star }F(P)\cap N_\varepsilon (F^\star )\). Consequently, we can expect that there are \(\lambda \in \mathbb {R}\) and \(p\in T_{F^\star }F(P)\cap N_\varepsilon (F^\star )\), such that suitable targets needed for the computation of further Pareto points can be expressed by \(p+\lambda d\), where \(d \le _p 0\) denotes a basis vector of the 1-dimensional space \((T_{F^\star }F(P))^\perp \). Thus, to apply this idea in practice, we first have to construct d and a basis \(V:=\{b_1,b_2, \ldots , b_{k-1}\}\) of \(T_{F^\star }F(P)\) and then to specify targets

$$t_i=F_i^\star + \sum _{j=1}^{k-1}\alpha _{i,j}\, b_j + \lambda _i\, d,\quad i=1,2,\ldots ,n_t$$

by determining the coefficients \(\alpha _{i,j}\) and \(\lambda _i\). Fortunately, as stated in the following lemma, if F(P) forms a smooth manifold in a neighborhood of \(F^\star \) and if \(F^\star \) was found in a previous step by solving RPP-R’ for a given target \(t^\star , t^\star <_p F^\star \), then d can be obtained without any additional effort by \(d := t^\star - F^\star \).

Lemma 1

Let \(F_i\in C^1(\mathbb {R}^{n+m},\mathbb {R})\) for \(i = 1,\ldots , k,\) and consider the multi-objective optimization problem

$$\begin{aligned} \min _{z\in \hat{S}} F(\pi (z)). \end{aligned}$$

Denote by P the corresponding Pareto set and let \(F^\star := F(\pi (z^\star ))\), where \(z^\star \) is the unique solution of RPP-R’ associated with the target \(t^\star <_p F^\star .\) Moreover, assume that F(P) makes up a \((k-1)\)-dimensional smooth manifold in a neighborhood of \(F^\star \). Then

$$\begin{aligned} F^\star -t^\star \in T_{F^\star }F(P)^\bot . \end{aligned}$$

Proof

Let \( \partial F\) denote the boundary of \(\{F(\pi (z)):z\in \hat{S}\}\). Since F(P) forms a differentiable manifold in a neighborhood of \(F^\star \), there exists a differentiable curve \(\alpha :[-1,1]\rightarrow \partial F\) with \(\alpha (0)=F^\star \), \(\alpha '(0)\in T_{F^\star }F(P)\) and \(\alpha (\lambda )\in F(P)\) for all \(\lambda \in [0,1]\). Then, since \(z^\star \) is a solution of RPP-R’, \(\lambda =0\) is a solution of

$$\begin{aligned} \min _{\lambda \in [-1,1]}||\alpha (\lambda )-t^\star ||^2 \end{aligned}$$

and therefore

$$\begin{aligned} \frac{d}{d\lambda }||\alpha (\lambda )-t^\star ||^2 = \frac{d}{d\lambda }\left\langle \alpha (\lambda )-t^\star ,\alpha (\lambda )-t^\star \right\rangle =2\left\langle \alpha (\lambda )-t^\star , \alpha '(\lambda )\right\rangle =0 \end{aligned}$$

for \(\lambda =0\). With \(\alpha (0)=F^\star \) we obtain

$$\begin{aligned} \left\langle F^\star -t^\star , \alpha '(0)\right\rangle =0, \end{aligned}$$

that is, \(F^\star -t^\star \in T_{F^\star }F(P)^\bot \).    \(\square \)

Once d is available, any standard method for the construction of an orthogonal basis, e.g. the Gram-Schmidt method, can be used to obtain the required basis V. For all \(i=1,2,\ldots ,n_t\), the coefficients \(\alpha _{i,j}\) are chosen such that \(p_i:=\sum _{j=1}^{k-1}\alpha _{i,j}\, b_j\) is located inside a neighboring box of the current box. Moreover, the \(p_i\) should be well-distributed around \(F^\star \). With this heuristic, it is very likely that new boxes containing images of Pareto points are found. For the choice of \(\lambda _i\) an adaptive concept has to be applied, because a computed solution z of RPP-R’ can only be accepted if \(t_i <_p F(\pi (z))\) is satisfied. Such an adaptive concept should be guided by the fact that \(t_i <_p F(\pi (z))\) certainly holds if \(\lambda _i\) is sufficiently large, but it should also be considered that RPP-R’ is ill-conditioned if \(\lambda _i\) is too large.
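To make this construction concrete, a sketch with the QR factorization standing in for Gram-Schmidt, and with illustrative (not prescribed) coefficients \(\alpha _{i,j}\) and \(\lambda _i\), could look as follows.

```python
import numpy as np

def target_directions(F_star, t_star):
    """Normal direction d = t_star - F_star and an orthonormal basis of its orthogonal
    complement (the QR factorization stands in for Gram-Schmidt here)."""
    d = np.asarray(t_star, dtype=float) - np.asarray(F_star, dtype=float)
    M = np.column_stack([d, np.eye(d.size)])   # complete d to a spanning set of R^k
    Q, _ = np.linalg.qr(M)                     # first column ~ d/||d||, rest spans d^perp
    return d, Q[:, 1:]

def make_targets(F_star, t_star, offsets, lams):
    """Targets t_i = F_star + sum_j alpha_{i,j} b_j + lambda_i d as in the text;
    'offsets' holds the coefficients alpha_{i,j}, 'lams' the scalars lambda_i."""
    d, B = target_directions(F_star, t_star)
    F_star = np.asarray(F_star, dtype=float)
    return [F_star + B @ np.asarray(a, dtype=float) + lam * d
            for a, lam in zip(offsets, lams)]

# Example for k = 2: spread two targets to either side of F_star, slightly below the front.
print(make_targets([2.0, 3.0], [1.5, 2.5], offsets=[[0.4], [-0.4]], lams=[0.3, 0.3]))
```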

4 Convergence

Since the described BL-Recovering-IS algorithm is realized by minimizing a reformulation of the BLMOP, which can be understood as an equality constrained MOP, in the following we prove convergence for the more general class of image set-oriented recovering algorithms for the solution of MOPs as defined in Sect. 2. This includes in particular the Recovering-IS algorithm presented in [8].

The proof is carried out in two steps: first, Theorem 4 states that for every subset \(B\subset \mathbb {R}^k\) containing a part of the Pareto optimal set in image space, there is a minimal set of targets such that for at least one of these targets the corresponding distance minimization subproblem leads to a Pareto point \(x^\star \) with \(F(x^\star )\in B\). Then, this result is used in Corollary 2 to complete the proof from the global point of view. In the following, let

$$\text {dist}(y, \mathcal {X})=\min _{t\in \mathcal {X}}||y-t|| $$

be the distance between a point \(y\in \mathbb {R}^k\) and a subset \(\mathcal {X}\subset \mathbb {R}^k\).

Theorem 4

Let \(F:\mathbb {R}^n\rightarrow \mathbb {R}^k, S\subset \mathbb {R}^n\) and denote by P the Pareto set of the constrained MOP:

$$\begin{aligned} \displaystyle \min _{x\in S} F(x). \end{aligned}$$

Assume that the norm \(||\cdot ||\) is strictly monotonically increasing. Let \(B\subset \mathbb {R}^k\) be an open subset such that \( B\cap F(P) \ne \emptyset \). Then there is \(d>0\), such that for any set \(\mathcal {X}\subset B\) of targets with \(\mathrm {dist}(y, \mathcal {X})<d \) for all \(y\in B\cap F(P)\) there exists a target \(t\in \mathcal {X}\) with \(\displaystyle F(x^\star ) \in F(P)\cap B\), where \(\displaystyle x^\star := \arg \min _{x\in S_t}||F(x)-t||\) and \(S_t:=\{x\in S: F_i(x)\ge t_i,\; i=1,\ldots ,k\}\) in analogy to \(\hat{S}_t\).

Proof

There are \(\bar{y}\in F(P)\) and \(\varepsilon > 0\), such that \(U_\varepsilon (\bar{y})\subset B\). Let \(d:=\frac{\varepsilon }{8\sqrt{k}}\) and \(\displaystyle c := \bar{y}-2\,d\sum _{i=1}^k\,e_i\), where \(e_i\) denotes the i-th standard basis vector in \(\mathbb {R}^k\). Then, for every \(y\in U_d(c)\), we have

$$ ||y-\bar{y}||\le ||y-c|| + ||c-\bar{y}|| \le d + 2 d \sqrt{k} = \frac{\varepsilon }{8\sqrt{k}} + \frac{\varepsilon }{4}<\frac{\varepsilon }{2}, $$

that is, \(U_d(c)\subset U_\varepsilon (\bar{y})\). Consequently, there is a target \(t=c+v\in \mathcal {X}, ||v||\le d\), such that

$$\begin{aligned} \displaystyle \min _{x\in S_t}||F(x)-t||\le ||t-\bar{y}|| < \frac{\varepsilon }{2} \end{aligned}$$

and

$$\begin{aligned} t_i =c_i+v_i=\bar{y}_i-2d +v_i<\bar{y}_i \quad \text { for }\quad i=1,\ldots , k. \end{aligned}$$

With \(\displaystyle x^\star = \displaystyle \arg \min _{x\in S_t}||F(x)-t||\), it follows that

$$\displaystyle ||F(x^\star )-\bar{y}||\le ||F(x^\star )-t||+ ||t-\bar{y}|| < \frac{\varepsilon }{2}+\frac{\varepsilon }{2}=\varepsilon $$

and therefore

$$\begin{aligned} F(x^\star )\in U_\varepsilon (\bar{y})\subset B . \end{aligned}$$

Now we have to show that \(F(x^\star )\) is not dominated by any \(\hat{y}\in F(P)\cap F(S_t)\). For \(F(x^\star )=\hat{y}\) this nondominance is obvious. For the case \(F(x^\star )\ne \hat{y}\) we have to show that \(F_i(x^\star )<\hat{y}_i\) for at least one \(i=1,\ldots , k\). To see this, assume that the opposite is true. Then \(F_i(x^\star )\ge \hat{y}_i \ge t_i\) for all \(i=1,\ldots , k\), where, since \(F(x^\star )\ne \hat{y}\), strict inequality holds in the first relation for at least one \(i\in \{1,\ldots , k\}\). Consequently, since \(||\cdot ||\) is strictly monotonically increasing,

$$\begin{aligned} ||F(x^\star )-t||>||\hat{y} -t||, \end{aligned}$$

which is a contradiction to \(||F(x^\star )-t|| = \displaystyle \min _{x\in S_t} ||F(x)-t||\). Finally, since \(F(x^\star )\) is not dominated by any \(\hat{y} \in F(P)\cap F(S_t)\), we have \(F(x^\star )\in F(P)\), which completes the proof.    \(\square \)

To guarantee that an image set-oriented recovering method converges towards the union of those connected components of F(P) which correspond to the initial box collection \(\mathcal{B}_0\), in every step of the algorithm the set of targets \(t_i\) has to be chosen properly, such that all desired boxes are found, that is, boxes which are both neighbors of the boxes generated in the respective previous step and contain a part of the respective connected component of F(P). To this end, we denote by \(\bar{B}\) the closure of a box B and we state the following

Corollary 2

Using the notations of Theorem 4 and denoting by \(\mathcal{B}_0\) a box collection of (fixed) subdivision depth d covering a part of F(P), assume that every step of the Recovering-IS or BL-Recovering-IS algorithm, respectively, is realized in a way such that for every \(B\in \mathcal{B}_j\backslash \mathcal{B}_{j-1}\) targets are chosen according to Theorem 4 within all boxes \(C\in \{C: \bar{C} \cap \bar{B} \ne \emptyset , C \notin \mathcal{B}_j\}\). Moreover, assume that F(P) is bounded. Then, the algorithm terminates after a finite number of steps such that the final box collection covers those connected components of F(P), which correspond to at least one \(B\in \mathcal{B}_0\).
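In an implementation, the set of neighboring boxes \(\{C: \bar{C} \cap \bar{B} \ne \emptyset \}\) appearing in Corollary 2 can be enumerated directly from integer box indices; a minimal sketch, matching the box-index convention used in the sketch of Sect. 3, is given below.

```python
import itertools

def neighbor_boxes(B, n_boxes):
    """All boxes C != B (as integer multi-indices) whose closure intersects the closure
    of box B, i.e. axis or diagonal neighbors inside a partition with n_boxes boxes
    per coordinate."""
    neighbors = []
    for shift in itertools.product([-1, 0, 1], repeat=len(B)):
        C = tuple(b + s for b, s in zip(B, shift))
        if C != B and all(0 <= c < n_boxes for c in C):
            neighbors.append(C)
    return neighbors

print(len(neighbor_boxes((9, 22), n_boxes=32)))   # 8 neighbors for an interior 2-D box
```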

5 Numerical Results

In the following we demonstrate the working principle and strength of the proposed algorithm on two academic benchmark problems.

5.1 Example 1

In our first example we consider a classical (i.e., one-level) bi-objective optimization problem in order to demonstrate the working principle of the IS recovering techniques. For this, let the BOP be given by

$$\begin{aligned} \begin{aligned} F=(F_1,F_2)^t&:\mathbb {R}^3\rightarrow \mathbb {R}^2\\ F(x_1,x_2,x_3)&=\left( \begin{array}{c} (x_1-1)^2+(x_2-1)^2+(x_3-1)^4\\ (x_1+1)^4+(x_2+1)^2+(x_3+1)^2 \end{array}\right) \end{aligned} \end{aligned}$$

We assume that the decision maker is only interested in solutions for which both objective values are located within the interval \(I:=[0,20]\), and therefore define

$$\begin{aligned} S:=\{x\in \mathbb {R}^3:F_i(x)\in I , i=1,2\}. \end{aligned}$$

Figure 2 shows the solutions generated by the Recovering-IS algorithm using different box sizes (depths). Here, the reader can get an impression of how the density of the computed representation can be controlled by choosing the box size.
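As a hint for reproducing this experiment, the objectives of Example 1 and a single reference point subproblem can be set up in a few lines; the target, the starting point and the Nelder-Mead solver below are illustrative choices and not those used for Fig. 2, and the restriction to the region of interest \(I=[0,20]\) is only checked a posteriori here.

```python
import numpy as np
from scipy.optimize import minimize

def F(x):
    """Objectives of Example 1."""
    x1, x2, x3 = x
    return np.array([(x1 - 1)**2 + (x2 - 1)**2 + (x3 - 1)**4,
                     (x1 + 1)**4 + (x2 + 1)**2 + (x3 + 1)**2])

t = np.array([2.0, 2.0])                                     # an illustrative target
res = minimize(lambda x: np.linalg.norm(F(x) - t), x0=np.zeros(3), method="Nelder-Mead")
x_star = res.x
print(x_star, F(x_star), bool(np.all((F(x_star) >= 0) & (F(x_star) <= 20))))
```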

5.2 Example 2

Next, we consider the following equality constrained bi-level multi-objective optimization problem with a convex lower level problem:

$$\begin{aligned} \displaystyle \min _{x\;\in \mathbb {R}^3,\;y\;\in \mathbb {R}} F(x,y)=\left( \begin{array}{c} 4((x_1+1)^2 + (x_2-1-y)^4 + x_3^2)\\ (x_1-1)^2 + (x_2+1-y)^2 + (x_3-0.5)^4 \end{array}\right) , \end{aligned}$$
$$\begin{aligned} \begin{array}{ll} \text { such that } &  H(x,y) = x_1^2 + x_3 - y^2 = 0,\\ \text { and } x \text { solves:} & \\ & \displaystyle \min _{x\;\in \mathbb {R}^3} f(x,y)=\left( \begin{array}{c} (x_1-1)^2 + 0.5(x_2+y)^2 + (x_3-0.5)^4\\ (x_1+1)^2 + 0.5(x_2+y)^2 + (x_3+1)^2\\ x_1^2 + x_2^2 + (x_3+1)^2 \end{array}\right) \\ & \text { such that } h_1(x,y) = x_1-x_2y = 0. \end{array} \end{aligned}$$
Fig. 2. Numerical results on Example 1 computed by the image set-oriented recovering algorithm using different box sizes in image space.

Fig. 3. The Pareto set of the example problem computed by our algorithm in higher level image space (top) and in parameter space (projection to the x-space) (bottom).

The solution of this problem was computed by the presented BL-Recovering-IS algorithm. For this, we have chosen \(Q=[0,10]^2\) as the domain of interest in higher level image space. The partition \({\mathcal P}_d\) was chosen corresponding to 5 virtual subdivisions in each coordinate, such that all boxes \(B\in {\mathcal P}_d\) are of size \(0.3125\times 0.3125\). The computed solution in higher level image space along with the generated boxes is shown at the top of Fig. 3. As expected, the solution is well-distributed in the sense that there is at least one computed point in every box of the box collection covering the Pareto set. The projection of the corresponding Pareto set to the x-space is shown at the bottom of Fig. 3.
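For completeness, the data of Example 2 can be transcribed into Python callables of (x, y), which could then be plugged into the \(\hat{F}\)/RPP-R sketches of Sect. 2; this transcription (with y treated as a scalar) is given for illustration only and is not the setup used to produce Fig. 3.

```python
import numpy as np

# Higher level objectives and constraint of Example 2.
F_upper = lambda x, y: np.array([4*((x[0] + 1)**2 + (x[1] - 1 - y)**4 + x[2]**2),
                                 (x[0] - 1)**2 + (x[1] + 1 - y)**2 + (x[2] - 0.5)**4])
H = lambda x, y: x[0]**2 + x[2] - y**2

# Lower level objectives and constraint of Example 2.
f_list = [lambda x, y: (x[0] - 1)**2 + 0.5*(x[1] + y)**2 + (x[2] - 0.5)**4,
          lambda x, y: (x[0] + 1)**2 + 0.5*(x[1] + y)**2 + (x[2] + 1)**2,
          lambda x, y: x[0]**2 + x[1]**2 + (x[2] + 1)**2]
h_list = [lambda x, y: x[0] - x[1]*y]
```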

6 Conclusions

In this chapter, we have considered the class of bi-level multi-objective optimization problems (BLMOPs) with equality constraints for both the higher and the lower level problem. The lower level problem was assumed to be convex, that is, the lower level objectives are convex and the lower level equality constraints are affine-linear. By restricting ourselves to this particular subclass, we have been able to state an equivalent formulation based on the well-known Kuhn-Tucker optimality conditions for multi-objective optimization problems. The resulting reformulation has the form of a general equality constrained multi-objective optimization problem. We have presented an image set-oriented algorithm for the approximation of the Pareto set P of the given BLMOP. The representation of P computed by this algorithm turns out to be well-distributed in the sense that every box \(B\subset \mathbb {R}^k\) with \(B\cap F(P)\ne \emptyset \) of a given partition \({\mathcal P}_d\) of the higher level image space \(\mathbb {R}^k\) contains the image of at least one of the computed Pareto points. Convergence has been proved in the sense that, after a finite number of iterations, the box collection formed by those boxes containing the images of the computed Pareto points covers the images of the connected components of P that correspond to the given initial points. The efficiency of the algorithm was demonstrated on an academic example; a comparison to the state of the art is still missing, which we leave for future work. Finally, the development of algorithms for the solution of more general BLMOPs, which include non-convex lower level problems, shall also be investigated in the future.