Bilevel optimization based on iterative approximation of multiple mappings

Sinha, Ankur; Lu, Zhichao; Deb, Kalyanmoy; Malo, Pekka

doi:10.1007/s10732-019-09426-9

Bilevel optimization based on iterative approximation of multiple mappings

Published: 24 September 2019

Volume 26, pages 151–185, (2020)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Journal of Heuristics Aims and scope Submit manuscript

Bilevel optimization based on iterative approximation of multiple mappings

Download PDF

Ankur Sinha¹,
Zhichao Lu²,
Kalyanmoy Deb² &
…
Pekka Malo³

1508 Accesses
39 Citations
Explore all metrics

Abstract

A large number of application problems involve two levels of optimization, where one optimization task is nested inside the other. These problems are known as bilevel optimization problems and have been studied by both classical optimization community and evolutionary optimization community. Most of the solution procedures proposed until now are either computationally very expensive or applicable to only small classes of bilevel optimization problems adhering to mathematically simplifying assumptions. In this paper, we propose an evolutionary optimization method that tries to reduce the computational expense by iteratively approximating two important mappings in bilevel optimization; namely, the lower level rational reaction mapping and the lower level optimal value function mapping. The algorithm has been tested on a large number of test problems and comparisons have been performed with other algorithms. The results show the performance gain to be quite significant. To the best knowledge of the authors, a combined theory-based and population-based solution procedure utilizing mappings has not been suggested yet for bilevel problems.

Approximate Bilevel Optimization with Population-Based Evolutionary Algorithms

Evolutionary Bilevel Optimization: An Introduction and Recent Advances

Bilevel Multi-Objective Optimization and Decision Making

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Interest in bilevel optimization has been growing due to a number of new applications that are arising in different fields of science and engineering. Bilevel programming is quite common in the area of defense where these problems are studied as attacker-defender problems. The problem was introduced by Bracken and McGill (1973) in the area of mathematical programming, where an inner optimization problem acts as a constraint to an outer optimization problem. One of the follow-up papers by Bracken and McGill (1974) highlighted the applications of bilevel programming in defense. Since then a number of studies on homeland security (Brown et al. 2005; Wein 2009; An et al. 2013) have been performed, where it is common to have bilevel, trilevel and even multilevel optimization models. In the area of operations research, bilevel optimization is gaining importance in the context of interdiction and protection of hub-and-spoke networks (Lei 2013), as most of the critical infrastructures like transportation and communications are predominantly hub-and-spoke. In other game theoretic settings, bilevel optimization has been used in transportation (Migdalas 1995; Constantin and Florian 1995; Brotcorne et al. 2001), optimal tax policies (Labbé et al. 1998; Sinha et al. 2013, 2015), investigation of strategic behavior in deregulated markets (Hu and Ralph 2007), model production processes (Nicholls 1995) and optimization of retail channel structures (Williams et al. 2011). The applications extend to a variety of other domains, like, facility location (Jin and Feng 2007; Uno et al. 2008; Sun et al. 2008), chemical engineering (Smith and Missen 1982; Clark and Westerberg 1990), structural optimization (Bendsoe 1995; Christiansen et al. 2001), and optimal control (Mombaur et al. 2010; Albrecht et al. 2011) problems. While new applications that are inherently bilevel in nature are arising at a fast pace, the development of computationally efficient algorithms for such problems has not kept the pace with the applications.

A significant body of literature exists on bilevel optimization and its optimality conditions (Lignola and Morgan 2001; Dempe 2002; Dempe et al. 2007, 2014; Wiesemann et al. 2013) in the classical optimization literature. However, on the algorithm front most attention has been given to only simple instances of bilevel optimization where the objective functions and constraints are linear (Wen and Hsu 1991; Ben-Ayed 1993), quadratic (Bard and Moore 1990; Edmunds and Bard 1991; Al-Khayyal et al. 1992) or convex (Liu et al. 1998). This is not surprising given the fact that bilevel optimization is difficult to an extent that merely evaluating the bilevel optimality of a given solution is an NP-hard task (Vicente et al. 1994). Researchers have also attempted to solve these problems using computational techniques like evolutionary algorithms. Most of the bilevel algorithms relying on evolutionary framework have been nested in nature (Mathieu et al. 1994; Yin 2000; Li and Wang 2007; Zhu et al. 2006; Sinha et al. 2014; Islam et al. 2017b, a). One of the drawbacks of such an approach is that it might be able to solve small instances of bilevel problems, but as soon as the problem scales-up beyond a few variables, the computational requirements increase tremendously. However, the evolutionary algorithms still have a niche in solving these problems as it maintains a population at each iteration of the algorithm. A population of points may allow modeling various mappings in bilevel optimization to reduce the computational expense (Sinha et al. 2016a). Some studies in this direction are (Sinha et al. 2016b, 2017, 2013, 2014). We believe that exploiting some of the mathematical properties of bilevel problems through modeling of various mappings in bilevel is the way forward in solving such problems. For a detailed review on bilevel optimization the readers may refer to Sinha et al. (2018), Dempe (2002), and Bard (1998)

In this paper, we focus on two important mappings in bilevel optimization borrowed from the mathematical optimization literature. The first mapping is the lower level reaction set mapping (known as $\Psi $-mapping), which provides the lower level optimal solution(s) corresponding to any given upper level vector. Considering the upper level problem as the leader’s problem and the lower level problem as the follower’s problem, the reaction set mapping represents the rational decisions of the follower corresponding to any decision taken by the leader. The second mapping is the lower level value function mapping (known as $\varphi $-mapping) that provides the optimal objective function value to the follower’s problem for any given leader’s decision. While the first mapping can be a set-valued mapping, the second mapping is always single-valued. We work with meta-modeling techniques that try to approximate these two mappings and develop a computationally efficient evolutionary algorithm for solving bilevel problems. The algorithm has been tested on a number of test problems, and the computational gain when compared with other techniques is found to be significant. In this paper, we also extend an existing test-suite of bilevel test problems (Sinha et al. 2014) with a couple of additional problems to better evaluate our proposed solution procedure.

The paper is organized as follows. To begin with, we provide a brief literature survey of bilevel optimization using evolutionary algorithms. This is followed by various formulations of the bilevel optimization problem and discussion of the two mappings that we approximate in this paper. Thereafter, we provide the bilevel evolutionary optimization algorithm which is an extension of the algorithm proposed in the previous studies (Sinha et al. 2017, 2013, 2014). Following this, we provide the empirical results on a number of test problems. A comparative study with other approaches is also included. Finally, we end the paper with the conclusions section.

2 A survey on evolutionary bilevel optimization

Most of the evolutionary algorithms for bilevel optimization are nested in nature, where one optimization algorithm is used within the other. The outer algorithm handles the upper level problem and the inner algorithm handles the lower level problem. Such a structure necessitates that the inner algorithm is called for every upper level point generated by the outer algorithm. Therefore, nested approaches can be quite computationally demanding, and can only be applied to small scale problems. One can find studies with evolutionary algorithm being used for the upper level problem and classical approach being used for the lower level problem. If the lower level problem is complex, researchers have used evolutionary algorithms at both levels. Below we provide a review of evolutionary bilevel optimization algorithms from the past.

Mathieu et al. (1994) was one of the first to propose a bilevel algorithm using evolutionary algorithms. He used a genetic algorithm to handle the upper level problem and linear programming to solve the lower level problem for every upper level member generated using genetic operations. This study was followed by nesting the Frank-Wolfe algorithm (reduced gradient method) within a genetic algorithm in Yin (2000). Other authors utilized similar nested schemes in Li et al. (2006), Li and Wang (2007), and Zhu et al. (2006). Studies involving evolutionary algorithms at both levels include (Angelo et al. 2013; Angelo and Barbosa 2015), where authors have used differential evolution at both levels in the first study, and differential evolution within ant colony optimization in the second study.

Replacing the lower level problem in bilevel optimization with its KKT conditions is a common approach for solving the problem both in classical as well as evolutionary computation literature. However, a KKT based reduction can only be applied to problems where the lower level is convex and adheres to certain regularity conditions (Mirrlees 1999). Some of the past evolutionary studies that utilize this idea include (Hejazi et al. 2002; Wang et al. 2005). The approach has been popular and even recently researchers are relying on reducing the bilevel problem into single level problem using KKT and solving the reduced problem using evolutionary algorithm, for example, see Wang et al. (2011), Jiang et al. (2013), Li (2015), and Wan et al. (2013).

While KKT conditions can only be applied to problems where the lower level adheres to certain mathematically simplifying assumptions, the researchers are exploring techniques that can solve more general instances of bilevel optimization problems. Some of the approaches are based on meta-modeling the mappings within bilevel optimization, while others may be based on meta-modeling the entire bilevel problem itself. Studies in this direction include (Sinha et al. 2017, 2013, 2014). In this paper, we aim to develop an algorithm that tries to capture two important mappings in bilevel optimization; namely, the lower level reaction set mapping and the lower level value function mapping, in order to reduce the computational complexity of the problem.

3 Different bilevel formulations

We will start this section by providing a general formulation for bilevel optimization. This is followed by various proposals that researchers have made for reducing a bilevel problem into a single-level problem. The two levels in a bilevel problem are also known as the leader’s (upper) and follower’s (lower) problems in the domain of game theory. In general, the variables, objectives and constraints are different for the two levels. The upper level variables are treated as parameters while optimizing the lower level problem. A general bilevel formulation has been provided below (for brevity, we ignore equality constraints):

Definition 1

For the upper-level objective function $F:\mathbb {R}^n\times \mathbb {R}^m \rightarrow \mathbb {R}$, lower-level objective function $f:\mathbb {R}^n\times \mathbb {R}^m \rightarrow \mathbb {R}$, upper level variable $x_u\in \mathbb {R}^n$ and lower level variable $x_l\in \mathbb {R}^m$, the bilevel optimization problem is given by

where $G_k:\mathbb {R}^n\times \mathbb {R}^m \rightarrow \mathbb {R}$, $k=1,\ldots ,K$ denotes the upper level constraints, and $g_j:\mathbb {R}^n\times \mathbb {R}^m \rightarrow \mathbb {R}$ denotes the lower level constraints.

There are two common positions that a user assumes while solving a bilevel optimization problem; namely, optimistic and pessimistic positions. The bilevel formulation in Definition 1 is straightforward, whenever there is a single optimal solution for the lower level problem for any given upper level variable. However, for scenarios with more than one lower level optimal solutions for some upper level variables, one has to be clear that which of the many optimal solutions from the lower level be considered as the response of the follower. Optimizing bilevel problems from either optimistic or pessimistic position is useful to handle the ambiguity arising from multiple lower level optimal solutions. In an optimistic position, it is assumed that the lower level chooses that optimal solution which is favorable at the upper level. In a pessimistic position, the upper level optimizes its problem according to the worst case scenario. In other words, the lower level may choose a solution from the optimal set that is least favorable at the upper level. In this paper, we assume an optimistic position while solving bilevel optimization problems.

In case when certain mathematically simplifying assumptions like continuities and convexities are satisfied, often the lower level optimization task in Definition 1 is replaced with its KKT conditions. However, the reduced formulation is not simple to handle, as it induces non-convexities and discreteness into the problem through the complementary slackness conditions. We do not utilize any properties of the KKT based reduction in this paper, rather we focus on two different formulations in the development of the evolutionary algorithm in this paper.

3.1 Lower level reaction set mapping

The formulation provided in Definition 1 can also be stated as follows:

Definition 2

Let $\Psi :\mathbb {R}^n\rightrightarrows \mathbb {R}^m$ be the reaction set mapping,

$$\begin{aligned} \Psi (x_u)=&\mathop {\mathrm{argmin}}\limits _{x_l}\{f(x_u,x_l) : g_j(x_u,x_l)\le 0, j=1,\ldots ,J\}, \end{aligned}$$

which represents the constraint defined by the lower-level optimization problem, i.e. $\Psi (x_u)$ for every $x_u$. Then the following gives an alternative formulation for the bilevel optimization problem:

Using the above definition, a bilevel problem can be reduced to a single level constrained problem given that the $\Psi $-mapping can somehow be determined. Unfortunately this is rarely the case. Studies in the evolutionary computation literature that rely on iteratively approximation of this mapping to reduce the lower level optimization calls could be found in Sinha et al. (2017), Sinha et al. (2013), and Sinha et al. (2014). To illustrate the idea, let’s consider the Fig. 1. To acquire sufficient data for constructing the $\Psi $-mapping approximation, a few lower level problems need to be optimized completely for their corresponding upper level decision vectors in the beginning. For instance, the lower level decisions for the upper level decisions a, b, c, d, e and f are determined by optimizing the lower level problem, which are then used to locally approximate the $\Psi $-mapping. This has been shown in Fig. 1. Even though the actual $\Psi $-mapping is still unknown, the local approximation can then be substituted to identify the lower level optimal decision for every new upper level member to avoid the lower level optimization task. This procedure of approximating the mapping and utilizing it to predict the lower level optimum needs to be repeated iteratively until convergence to the bilevel optimum. The idea works well when the $\Psi $-mapping is single valued. If the lower level has multiple optimal solutions for some upper level members as shown in Fig. 2, then identifying as well as approximating the mapping is not a straightforward task.

3.2 Lower level optimal value function mapping

Another formulation for the bilevel optimization problem in Definition 1 can be written using the optimal lower level value function (Ye and Zhu 2010; Outrata 1988, 1990):

Definition 3

Let $\varphi : \mathbb {R}^n \rightarrow R$ be the lower level optimal value function mapping,

$$\begin{aligned} \varphi (x_u)=\mathop {\mathrm{min}}\limits _{x_l} \{f(x_u,x_l): g_j(x_u,x_l)\le 0, j=1,\ldots ,J \}, \end{aligned}$$

which represents the optimal function value at the lower level for any given upper level decision vector. Using this lower level optimal value function, the bilevel optimization problem can be expressed as:

Note that the constraint $f(x_u,x_l) \le \varphi (x_u)$ in the above definition says that the value of the lower level function $f(x_u,x_l)$ should always be less than or equal to the optimal lower level function value, given by $\varphi (x_u)$, corresponding to any $x_u$. This along with the lower level constraints ensure that the above definition incorporates the lower level optimality requirements.

As in the case of $\Psi $-mapping, if the $\varphi $-mapping can be somehow determined, a bilevel problem can be reduced to a single level problem as described in Definition 3. Along the process of an algorithm, the $\varphi $-mapping can be approximated and used to solve the reduced single level problem formulation in an iterative manner. Such an evolutionary algorithm has been recently discussed in Sinha et al. (2016b). The approximation of the optimal value function ($\varphi $) mapping is, in general, less complicated than the reaction set ($\Psi $) mapping, in the sense that, the $\varphi $-mapping is always scalar-valued regardless of the lower level variable dimension and whether or not there exist multiple lower level optimal solutions (Fig. 3). However, the $\varphi $-mapping based reduction is not necessarily always better than the $\Psi $-mapping based reduction. Definition 3 requires the problem to be solved with respect to upper as well as lower level variables, while in Definition 2 the lower level variables are readily available from the $\Psi $-mapping. The $\Psi $-mapping based reduction also contains fewer constraints. Therefore, clearly there is a trade-off.

It is noteworthy that the lower level optimization problem is a parametric optimization problem that is solved with respect to the lower level variables, while the upper level variables act as parameters. Therefore, for bilevel problems with mathematically well behaved objective functions and constraints, it is possible to utilize ideas from studies on sensitivity analysis and parametric optimization to identify the mappings in bilevel optimization. Whenever such a mapping can be directly obtained using the parametric optimization tools, the bilevel problem can be readily reduced to a single level problem and standard mathematical programming algorithms can be applied. For related work, the readers may refer to Jittorntrum (1984), Fiacco and McCormick (1990), and Ralph and Dempe (1995).

Table 1 Standard test problems TP1–TP5. (Note that $x = x_u$ and $y = x_l$)

Bilevel optimization based on iterative approximation of multiple mappings

Abstract

Similar content being viewed by others

Approximate Bilevel Optimization with Population-Based Evolutionary Algorithms

Evolutionary Bilevel Optimization: An Introduction and Recent Advances

Bilevel Multi-Objective Optimization and Decision Making

Explore related subjects

1 Introduction

2 A survey on evolutionary bilevel optimization

3 Different bilevel formulations

Definition 1

3.1 Lower level reaction set mapping

Definition 2

3.2 Lower level optimal value function mapping

Definition 3

4 Evaluating the performance of \(\Psi \) and \(\varphi \) mappings on test problems

4.1 Approximating the \(\Psi \)-mapping

4.2 Approximating the \(\varphi \)-mapping

5 Comparison results for \(\Psi \)- versus \(\varphi \)-approximations

6 Bilevel evolutionary algorithm based on \(\Psi \) and \(\varphi \)-mapping approximations

6.1 Initialization

6.2 Constraint handling and fitness assignment

6.3 Genetic operations

6.4 Approximation of mappings

6.5 Termination criteria

6.6 Lower level optimization

6.7 Offspring update

6.8 Local search

6.9 Parameters and platform

7 Results

7.1 Standard test problems

7.2 Scalable test problems

8 An application problem

9 Conclusions

Notes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendices

Appendix A: Standard test problems

Appendix B: Additional SMD test problems

Appendix C: Stackelberg duopoly formulation

Appendix D: Additional results

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation