1 Introduction

Bilevel programming problems are mathematical optimization problems in which a constraint is implicitly determined by another optimization problem. This latter problem, called the follower’s or lower level problem, is defined by

$$\begin{aligned} \underset{y}{\min } \left\{ f(x,y):\ g(x,y)\le 0\right\} , \end{aligned}$$

where \(f: \mathbb {R}^{n}\times \mathbb {R}^{m}\rightarrow \mathbb {R}\), \(g: \mathbb {R}^{n}\times \mathbb {R}^{m}\rightarrow \mathbb {R}^{s}\).

The bilevel programming problem can formally be described as follows

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0\\ &{} y \in {\varPsi }(x) \end{array} \end{aligned}$$
(1.1)

with

$$\begin{aligned} {\varPsi }(x):= \underset{y}{\text {Argmin }} \left\{ f(x,y):\ g(x,y)\le 0\right\} \end{aligned}$$

and functions \(F: \mathbb {R}^{n}\times \mathbb {R}^{m}\rightarrow \mathbb {R}\), \(G: \mathbb {R}^{n}\rightarrow \mathbb {R}^{k}\).

Problem (1.1) is called the upper level or leader’s problem. It is often referred to as the optimistic formulation of the bilevel programming problem [9]. The variables x and y are the upper and lower level variables or upper and lower level decisions, respectively.

It is well known that the bilevel problem is hard due to its inherent nonconvexity and nondifferentiability [2]. Even the simplest case, the linear bilevel problem, has been shown to be strongly NP-hard [3]. An overview of the existing literature and applications can be found in the annotated bibliographies and books [3, 9, 10, 32].

Bilevel optimization problems may involve decisions in both discrete and continuous variables. When considering discrete bilevel programs, four types can therefore be distinguished [21, 33]:

Type I: The upper level variables are integer and the lower level variables are continuous. In this case, one usual approach to solve problem (1.1) is to replace the lower level problem by its optimality conditions, i.e., by its Karush–Kuhn–Tucker (KKT) conditions. These conditions are necessary and sufficient optimality conditions as long as the functions f and g are continuously differentiable and convex with respect to y for fixed x and some constraint qualification (such as Slater’s condition or the Mangasarian–Fromovitz constraint qualification) is satisfied. It is worth mentioning that this reformulation is equivalent to the original bilevel programming problem with respect to global optimal solutions but not necessarily with respect to local optimal solutions [11]. In the literature, one is in general more interested in finding global optimal solutions of problem (1.1) with respect to all variables than in finding local solutions with respect to the continuous ones. In fact, a local solution of the follower’s problem is not meaningful: even if the leader assumes the follower is a local optimizer, he should hedge against all local solutions, which is presumably even harder than computing the follower’s global solution. Hence, the KKT approach seems reasonable when combined with techniques of (mixed-integer) nonlinear optimization such as branch and bound or branch and cut.

Type II: Both the upper and the lower level variables may take integer and continuous values at the same time. This is the mixed-integer upper and lower level problem; an overview of different approaches for solving this type is given in [15, 17, 31, 34] and in [35].

Type III: The upper and lower level variables are all integer. Here, Bard and Moore [4] were the first to suggest an approach for solving the problem. They developed a specialized algorithm for binary bilevel programs; later, DeNegre and Ralphs [16] improved on it by describing a branch and cut algorithm for solving integer bilevel linear programs in the general case. Domínguez and Pistikopoulos [17] employed a reformulation linearization technique to construct a parametric convex hull representation of the lower level constraint set. In this paper, we describe a method for solving this type by transforming our problem using the optimal value function reformulation.

Type IV: Here, the lower level variables are integer and the upper level variables are continuous. Vicente et al. [33] showed that more assumptions are needed to prove the existence of an optimal solution of problem (1.1) in this case than in the first and third cases, respectively. Nevertheless, some research has been carried out: In [8], problem (1.1) was solved using a cutting plane approach that approximates the feasible set of the discrete lower level problem. A theoretical approach to the treatment of bilevel problems of this type is given in [20]. In [24], the authors presented a polynomial time algorithm for the case when the follower’s dimension is fixed. Recently, Dempe et al. [13] solved this type using an approximation of the optimal value function of the lower level problem to solve the bilevel programming problem globally. The main idea in [13] is to improve the quality of the approximation of the optimal value function using feasible solutions of the lower level problem in each iteration. Note that although the bilevel problem considered in [13] and its underlying idea are quite similar to ours, the principal differences lie in the lower level objective function (in fact, neither formulation can be reduced to the other) and in the method used to approximate the optimal value function.

This optimal value function approach is one of the three main approaches dedicated to the study of optimistic bilevel programming problems initiated by Outrata [30] (the two other approaches are the Karush–Kuhn–Tucker reformulation and the replacement of the lower level problem by a generalized equation [14]).

In this paper, we are more interested in the aforementioned approach where (1.1) is replaced by the following one:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0\\ &{} g(x,y)\le 0 \\ &{} f(x,y)\le \varphi (x) \end{array} \end{aligned}$$
(1.2)

with \(\varphi (x):=\underset{y}{\inf }\{f(x,y): g(x,y)\le 0\}\) for all \(x\in \mathbb {R}^{n}\).

It is easy to see that the nonlinear programming problem (1.2) is equivalent to (1.1). We will use this approach to develop new algorithms for solving discrete bilevel problems of Type III, in which the upper level variables only affect the lower level objective function, and of Type IV, in which the upper level variables only appear in the right-hand side of the lower level constraints.

It should be noted that, using this approach, we can allow for upper level constraints depending on the lower level variable. This is, however, at odds with the ideas behind bilevel optimization: constraints \(G(x,y)\le 0\) mean that, in case of non-unique optimal solutions of the follower, the leader can check feasibility of his selection only after being informed of the follower’s selection. From the point of view of bilevel optimization, the leader then assumes that the follower lets him select, among the optimal solutions of the follower’s problem, one point which satisfies the upper level constraints and is best for the resulting problem. These aims are again contradictory. For further details, we refer the interested reader to [9].

Our research on discrete bilevel problems with these peculiarities is motivated by a number of important applications of bilevel programming with parametric graph problems in the lower level, such as the minimum spanning tree problem, the toll problem, the shortest path problem, the matching problem in a bipartite graph, or the minimum knapsack problem. Indeed, the parameter appearing in the objective function (resp. the constraints) of the lower level is controlled by the leader in Type III (resp. Type IV).

Besides the algorithms described for one of the particular types given above, there are algorithms conceived for solving nonlinear mixed-integer bilevel programs in a more general setting [19, 21, 23, 28]. While [28] uses the optimal value reformulation, [19, 21] use the KKT conditions after some reformulations in order to transform the bilevel problem into a one level problem. In [23], a method based on the parametric analysis of [6] is presented. Here, we use the same idea as in [28], but our peculiarity is that, in the first case, we use a branch and cut method, which has proven to be a very successful technique for solving integer programming problems, and in the second case, we propose a method for mixed bilevel problems in which the lower level constraint set need not be bounded, as required in [21].

The remainder of the paper is organized as follows: In Sect. 2, we describe an algorithm to solve Type III of problem (1.1). Here, the upper level variable appears only in the objective function of the lower level problem, the objective functions are linear, and the variables are discrete at both levels. An example is worked out in which some peculiarities of the solution are demonstrated. In Sect. 3, we present an algorithm to solve Type IV of problem (1.1). In contrast to the foregoing case, the upper level variable appears only in the right-hand side of the constraints of the lower level problem, the lower level variables are discrete, the upper level ones are continuous, and the objective function of the lower level is linear while the one of the upper level is not specified. We illustrate the algorithm via an example. Section 4 concludes the paper.

In the paper, we use \(\mathbb {R}^{n}_{+}\), \(\mathbb {Z}^{n}_{+}\), \(\mathbb {Q}^{n}\), \(\mathbb {B}^{n}\), \(\mathbb {R}^{n\times m}\), and \(\mathbb {Q}^{n\times m}\) to denote the cone of all n-dimensional real vectors with nonnegative components, the set of all n-dimensional integer vectors with nonnegative components, the set of all n-dimensional rational vectors, the set of all n-dimensional binary vectors, the set of all real matrices with n rows and m columns, and the set of all rational matrices with n rows and m columns, respectively. Furthermore, recall that \(\mathbb {R}^{n}_{+}\) induces the following partial order in \(\mathbb {R}^{n}\):

$$\begin{aligned} \forall x^{1}, x^{2}\in \mathbb {R}^{n}:\; x^{1}\le x^{2} \Leftrightarrow x^{2}-x^{1}\in \mathbb {R}^{n}_{+}. \end{aligned}$$

2 Approach for solving discrete bilevel optimization problems of Type III

In this section, we postulate that the functions F and \(f(x,\cdot )\) for \(x\in \mathbb {R}^{n}\) are linear, that the functions G and \(g(x,\cdot )\) for \(x\in \mathbb {R}^{n}\) are affine, and that the upper level variable x appears only in the objective function of the lower level problem. We propose an algorithm for the binary case, but the procedure can easily be modified to solve a discrete linear bilevel problem in the general case, i.e., when both x and y may take any bounded integer value.

2.1 Reformulation of the linear discrete bilevel optimization problem

The problem we consider is formulated as

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} d_{1}^{\top } x +d_{2}^{\top } y\\ \hbox { s.t. }&{} Dx \le d \\ &{} x \in \mathbb {B}^{n} \\ &{} y \in {\varPsi }(x):= \underset{y}{\text {Argmin }} \left\{ x^{\top }y:\ Ay\le b,\;y\in \mathbb {B}^{n} \right\} , \end{array} \end{aligned}$$
(2.1)

where \(m=n\), \(D\in \mathbb {Q}^{k\times n}\), \(A\in \mathbb {Q}^{s\times n}\) are matrices, while \(b\in \mathbb {Q}^{s}\), \(d\in \mathbb {Q}^{k}\), and \(d_{1}\text {,} \;d_{2}\in \mathbb {Q}^{n}\) are constant vectors.

We assume that

$$\begin{aligned} {\varOmega }:= \big \{x \in \mathbb {B}^{n}:Dx\le d\big \}\times \big \{y \in \mathbb {B}^{n}:Ay\le b\big \} \end{aligned}$$

is nonempty.

Note also that the assumption that the lower level objective function is \(f(x,y)=x^\top y\) is not very restrictive. What we need is that the objective function is linear in the lower level variable: if \(f(x,y)=H(x)^{\top } y\), we can substitute \(z=H(x)\) in the upper level and use \(z^\top y\) as the objective function in the lower level. In that sense, our results can be applied, e.g., to toll setting problems in which the lower level problem is modeled as a multicommodity flow problem, or to O-D adjustment problems [26, 27].

Using the optimal value reformulation of (2.1), we get the following equivalent problem:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} d_{1}^{\top } x +d_{2}^{\top } y\\ \hbox { s.t. }&{} Dx \le d \\ &{} Ay\le b \\ &{} x,y \in \mathbb {B}^{n} \\ &{} x^{\top }y\le \min \big \{x^{\top }y^{i}:\;i=1,\ldots ,p \big \}. \end{array} \end{aligned}$$
(2.2)

Therein, \(\big \{y^{1},\ldots ,y^{p}\big \}=\big \{y\in \mathbb {B}^{n}: Ay\le b\big \}\).
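For small n, the set \(\big \{y^{1},\ldots ,y^{p}\big \}\) can be obtained by explicit enumeration. The following minimal sketch (our own illustration, not the paper's implementation) does this for the lower level feasible set of Example 2.9 below:

```python
from itertools import product
import numpy as np

# Enumerate {y in B^n : Ay <= b}; data taken from Example 2.9 below,
# where the single lower level constraint is y1 + y2 <= 2.5.
A = np.array([[1.0, 1.0]])
b = np.array([2.5])

feasible = [y for y in product((0, 1), repeat=A.shape[1])
            if np.all(A @ np.array(y) <= b)]
print(feasible)   # [(0, 0), (0, 1), (1, 0), (1, 1)] = {y^1, ..., y^p}
```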

Remark 2.1

It is important to see that the set \( \big \{y \in \mathbb {B}^{n}:Ay\le b\big \} \) is finite because \(|\mathbb {B}^{n}|=2^{n}\) and \(\big \{y\in \mathbb {B}^{n}:Ay\le b\big \}\subseteq \mathbb {B}^{n}\).

Remark 2.2

When generalizing to \(y\in \mathbb {Z}^{n}_{+}\), one needs the set \(\{y\in \mathbb {Z}^{n}_{+}:Ay\le b\}\) to be bounded in order to ensure its finiteness.

Let I be a subset of \(\{1,\ldots ,p\}\) such that for all \(i \in I,\; y^{i}\in \big \{y\in \mathbb {B}^{n}:Ay\le b\big \}\) is known. We consider the relaxation

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} d_{1}^{\top } x +d_{2}^{\top } y\\ \hbox { s.t. }&{} Dx \le d \\ &{} Ay\le b \\ &{} x^{\top }y\le \min \left\{ x^{\top }y^{i}:i\in I \right\} \\ &{} x,y \in \mathbb {B}^{n} \end{array} \end{aligned}$$

of (2.2) which is equivalent to:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} d_{1}^{\top }x +d_{2}^{\top }y\\ \hbox { s.t. }&{} Dx \le d \\ &{} Ay\le b \\ &{} x^{\top }y\le x^{\top }y^{i}\quad \forall i\in I \\ &{} x,y \in \mathbb {B}^{n}. \end{array} \end{aligned}$$
(2.3)

The last problem is a nonlinear integer programming problem. In its constraints, we have some quadratic inequalities, namely \(x^{\top }y\le x^{\top }y^{i}\) for all \(i\in I\). We will relax these nonlinear constraints by linear constraints without introducing additional variables.

Consider now the polynomial constraint \(x^{\top }y-x^{\top }y^{i}\le 0\) with \(i\in I\).

For fixed \(i\in I\) we set \(h_{i}(x,y):=x^{\top }y-x^{\top }y^{i}\) (only \(y^{i}\) is known). There are |I| of those constraints \(h_{i}(x,y)\le 0\). This amounts to

$$\begin{aligned} h_{i}(x,y)&=x_{1}y_{1}+\cdots +x_{n}y_{n}-x_{1}y^{i}_{1}-\cdots -x_{n}y^{i}_{n}\\&=\sum _{j=1}^{n}x_{j}y_{j}-\sum _{j=1}^{n}x_{j}y^{i}_{j}. \end{aligned}$$

Let

$$\begin{aligned} \forall j\in \{1,\ldots ,2n\}:\quad X_{j}:= {\left\{ \begin{array}{ll} x_{j}, &{} \quad \text {if} \; j\in \{1,\ldots ,n\} \\ y_{j-n}, &{} \quad \text {if} \; j\in \{n+1,\ldots ,2n\}, \end{array}\right. } \end{aligned}$$

and \(L_{i}:=\{l\in \{n+1,\ldots ,2n\}: y^{i}_{l-n}=1\}\). Then we rewrite \(h_{i}(x,y)\) by changing its variables.

$$\begin{aligned} h_{i}(x,y)=h_{i}(X)&=h_{i}(X_{1},\ldots ,X_{n},X_{n+1},\ldots ,X_{2n})\\&= \sum _{j=1}^{n}X_{j}X_{n+j}-\sum _{j\in L_{i}}X_{j-n} \\&=\sum _{j=1}^{n}\prod _{l\in S_{j}} X_{l}-\sum _{j\in L_{i}}\prod _{l\in S_{j}} X_{l} \end{aligned}$$

where

$$\begin{aligned} \forall j\in \{1,\ldots ,n\}\cup L_{i}:\quad S_{j}:= {\left\{ \begin{array}{ll} \{j,n+j\}, &{} \quad \text {if} \; j\in \{1,\ldots ,n\} \\ \{j-n\}, &{} \quad \text {if} \;j\in L_{i}. \end{array}\right. } \end{aligned}$$

We set \(N^{+}:=\{1,\ldots ,n\}\), \(N_{i}:=N^{+}\cup L_{i}\), and define the mapping \(\gamma _{i} :L_{i} \rightarrow \{j-n:\; j\in L_{i}\}\) by \(\gamma _{i}(j):=j-n\) for all \(j\in L_{i}.\)

For \(M\subseteq N_{i}\), we set \(S^{M}:=\bigcup _{k\in M\cap N^{+}}S_{k}=\bigcup _{k\in M\cap N^{+}}\{k,n+k\}\) and \(S^{M}_{i}:=\gamma _{i}(M\cap L_{i})=\{\gamma _{i}(j):j\in M\cap L_{i} \}\).

For \(X_{j}\in \mathbb {B}\), denote by \(\overline{X}_{j}\) the complement of \(X_{j}\) with respect to the set \(\mathbb {B}\), i.e., \(\overline{X}_{j}=1-X_{j}\).

Let us define a notion which will help us to relax the nonlinear inequality into a linear one.

Definition 2.3

A set \(M\subseteq N_{i} \) is said to be a cover for the inequality \(h_{i}(X)\le 0\) if \(|M|>|L_{i}|\).

A cover M is said to be minimal if no strict subset of it is a cover.

Note that any cover M of \(h_{i}\) contains at least one element of \(N^{+}\) (since \(|M|>|L_{i}|\), M cannot be a subset of \(L_{i}\)). This implies that the set \(S^{M}\) is not empty.

With this definition and the sets defined above, we obtain the following lemma, which is a specialization of [18, Theorem 11.2] in the sense that all the coefficients of h here are equal to one. We include the proof so that this paper is self-contained.

Lemma 2.4

If \(h_{i}(X)\le 0\) is satisfied, then

$$\begin{aligned} \sum _{j\in S^{M}}\overline{X}_{j}+\sum _{j\in S^{M}_{i}}X_{j}\ge 1 \end{aligned}$$
(2.4)

for any cover \(M\subseteq N_{i}\) of \(h_{i}\).

Proof

See the “Appendix”.

We call (2.4) a generalized covering inequality.

Of course, a minimal cover M results in a generalized covering inequality with fewer variables than any other cover.

In [18], the reader can see how to derive linear inequalities that are more compact than the generalized covering inequality.
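As an illustration, the following sketch (our own helper, not part of the paper) constructs the index sets \(S^{M}\) and \(S^{M}_{i}\) of the generalized covering inequality (2.4) for a given cover M; the data reproduce the first cut of Example 2.9 below.

```python
def covering_inequality(n, y_i, M):
    """Return (S_M, S_M_i) such that sum_{j in S_M} (1 - X_j)
    + sum_{j in S_M_i} X_j >= 1 is the inequality (2.4); y_i is the
    known lower level point y^i and M a cover, i.e. |M| > |L_i|."""
    # L_i = {j in {n+1,...,2n} : y^i_{j-n} = 1}
    L_i = {j for j in range(n + 1, 2 * n + 1) if y_i[j - n - 1] == 1}
    assert len(M) > len(L_i), "M is not a cover"
    S_M = set()
    for k in M & set(range(1, n + 1)):   # S_j = {j, n+j} for j in N^+
        S_M |= {k, n + k}
    S_M_i = {j - n for j in M & L_i}     # gamma_i(j) = j - n on L_i
    return S_M, S_M_i

# Example 2.9 below: n = 2, y^1 = (1, 1), cover M = {1, 2, 3}
print(covering_inequality(2, (1, 1), {1, 2, 3}))
# ({1, 2, 3, 4}, {1}), i.e. the inequality x2 + y1 + y2 <= 3
```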

Lemma 2.4 gives us a relaxation of (2.3) which is in turn a relaxation of problem (2.1).

Namely, we consider

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} d_{1}^{\top }x +d_{2}^{\top }y\\ \hbox { s.t. }&{} Dx \le d \\ &{} Ay\le b \\ &{} \sum _{j\in S^{M^{i}}}\overline{X}_{j}+\sum _{j\in S^{M^{i}}_{i}}X_{j}\ge 1\; \;\forall i\in I \\ &{} x,y \in \mathbb {B}^{n}, \end{array} \end{aligned}$$
(2.5)

where \(M^{i}\) is a cover of \(h_{i}\) for all \(i\in I\).

We can see that the problem (2.5) does not contain quadratic constraints anymore: it is a linear programming problem with integer variables.

The inequalities \(\sum _{j\in S^{M^{i}}}\overline{X}_{j}+\sum _{j\in S^{M^{i}}_{i}}X_{j}\ge 1\) for all \(i\in I\) can be expressed as \(A'x+B'y\le d'\) where \(A',B'\in \mathbb {R}^{|I|\times n}\), and \(d'\in \mathbb {R}^{|I|}\) are the corresponding matrices. Then the problem (2.5) becomes

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} d_{1}^{\top }x +d_{2}^{\top }y\\ \hbox { s.t. }&{} Dx \le d \\ &{} Ay\le b \\ &{} A'x+B'y\le d' \\ &{} x,y \in \mathbb {B}^{n}. \end{array} \end{aligned}$$
(2.6)

This is equivalent to

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} d_{1}^{\top }x +d_{2}^{\top }y\\ &{} Ex+Fy\le B \\ &{} x,y \in \mathbb {B}^{n} \end{array} \end{aligned}$$
(2.7)

with \(E:=\big [ (D)^{\top }| O| (A')^{\top }\big ]^{\top }\), \(F:=\big [ O| (A)^{\top }| (B')^{\top }\big ]^{\top }\), \(B:=\big [ d^{\top }| b^{\top }| (d')^{\top }\big ]^{\top }\).

Now we consider the continuous relaxation of problem (2.7) denoted by (GCR):

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} d_{1}^{\top }x +d_{2}^{\top }y\\ &{} Ex+Fy\le B \\ &{} x,y \in [0,1]^{n}. \end{array} \end{aligned}$$
(GCR)

We can see that every integer optimal solution of (GCR) that is feasible for (2.1) is also optimal for this problem. This is shown in the next proposition.

Proposition 2.5

Let \((\overline{x},\overline{y})\) be a global optimal solution of (GCR) which is integer. If \(\overline{y}\in {\varPsi }(\overline{x})\), then it is also a global optimal solution of problem (2.1).

Proof

Let \((\overline{x},\overline{y})\) be an integer optimal solution of (GCR). This implies that \((\overline{x},\overline{y})\) is feasible for (2.7), i.e., \(\overline{y}\in \mathbb {B}^{n} \). Moreover, from \(\overline{y}\in {\varPsi }(\overline{x})\) we have that \((\overline{x},\overline{y})\) is a feasible solution for problem (2.1). Now let (xy) be a feasible solution of (2.1). From Lemma 2.4, (xy) is also a feasible solution of (GCR), and the optimality of \((\overline{x},\overline{y})\) in problem (GCR) entails \(d_{1}^{\top } \overline{x} +d_{2}^{\top } \overline{y}\le d_{1}^{\top } x +d_{2}^{\top } y\). Consequently, \((\overline{x},\overline{y})\) is a global optimal solution of (2.1).\(\square \)

2.2 An algorithm for solving the discrete linear bilevel problem

To solve problem (2.1), we first solve the relaxation (GCR) for some cover M (the example below shows how to find such a cover). If the solution obtained is not integer, we can simply cut it off from the feasible set of (GCR). To do this, we can use the following method, where we suppose that all components of \(P:=[E|F]\) are integer (this can be assumed since all the matrices are supposed to have rational elements).

The problem (2.7) is restated as \(\min \{q^{\top }z: Pz\le B, z\in \mathbb {B}^{2n}\}\) where \(q=(d_{1}^{\top },d_{2}^{\top })\), \(z=(x,y)\), \(P=[E|F]=(p_{ij})_{(i,j)\in \{1,\ldots ,k+s+|I|\}\times \{1,\ldots ,2n\}}.\)

Let

$$\begin{aligned} R_{i}:=\Big \{z\in \mathbb {B}^{2n}: \sum _{j\in \{1,\ldots ,2n\}} p_{ij}z_{j}\le B_{i}\Big \}, \end{aligned}$$
(2.8)

i.e., \(R_{i}\) is the set of all z which satisfy the constraint i (\(R_{i}\) represents the solution set of the 0-1 knapsack constraint). Obviously, \(\bigcap _{i=1}^{k+s+|I|}R_{i}\) is the feasible set of (2.7).

We set \(N:=\{1,\ldots ,2n\}\), \(Z^{i}_{<}:=\{k\in N: p_{ik}<0 \}\), and \(Z^{i}_{\ge }:=\{k\in N: p_{ik}\ge 0 \}\).

Adding \(-\sum _{j\in Z^{i}_{<}}p_{ij}\) to each side of the inequality in (2.8) leads to

$$\begin{aligned} \sum _{j\in Z^{i}_{<}} p_{ij}z_{j}+\sum _{j\in Z^{i}_{\ge }} p_{ij}z_{j}-\sum _{j\in Z^{i}_{<}} p_{ij}\le B_{i}-\sum _{j\in Z^{i}_{<}}p_{ij}, \end{aligned}$$

i.e.,

$$\begin{aligned} \sum _{j\in N}| p_{ij}|\widehat{z}_{j}\le B_{i}-\sum _{j\in Z^{i}_{<}}p_{ij}, \end{aligned}$$
(2.9)

where

$$\begin{aligned} \widehat{z}_{j}:= {\left\{ \begin{array}{ll} z_{j} &{}\quad \text {if} \;j\in Z^{i}_{\ge } \\ 1-z_{j}&{} \quad \text {if} \;j\in Z^{i}_{<}. \end{array}\right. } \end{aligned}$$

Since both sides of (2.9) are positive, for every dependent set C with respect to (2.9),

$$\begin{aligned} \sum _{j\in C}\widehat{z}_{j}\le |C| -1 \end{aligned}$$
(2.10)

is a valid inequality for \(R_{i}\) [29, Proposition 2.1, Section II.2.2]. Recall that a set C is dependent with respect to (2.9) if \(\sum _{j \in C}|p_{ij}|>B_{i}-\sum _{j\in Z^{i}_{<}}p_{ij}\).

Let \(z^{*}=(x^{*},y^{*})\in [0,1]^{2n}\) be a solution of (GCR) which is not integer.

In order to find an inequality which is satisfied for all feasible solutions of (2.7) and not for \(z^{*}\), it suffices to look for a dependent set \(C\subseteq N\) satisfying \(\sum _{j\in C}\widehat{z^{*}_{j}}> |C|-1\).

Such a set is found by looking for \(t\in \mathbb {B}^{2n}\) representing C, i.e.,

$$\begin{aligned} \forall j\in \{1,\ldots ,2n\}:\quad t_{j}= {\left\{ \begin{array}{ll} 1, &{} \quad \text {if } j\in C \\ 0, &{} \quad \text {otherwise} \end{array}\right. } \end{aligned}$$

such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \sum _{j\in N}(1-\widehat{z^{*}_{j}})t_{j}<1,\\ \sum _{j\in N} \left| p_{ij}\right| t_{j}>B_{i}-\sum _{j\in Z^{i}_{<}}p_{ij} . \end{array}\right. } \end{aligned}$$
(2.11)

The two previous inequalities allow us to get our desired valid inequality.
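For small instances, (2.11) can be solved by plain enumeration. The sketch below (a hypothetical helper under that brute-force assumption) returns a dependent set C whose inequality (2.10) cuts off the fractional point \(z^{*}\), or None if no such set exists:

```python
from itertools import combinations

def separate(p_row, B_i, z_star):
    """Solve (2.11) by enumeration for row i of P: find a dependent
    set C (w.r.t. (2.9)) with sum_{j in C} z_hat*_j > |C| - 1."""
    N = range(len(p_row))
    Zneg = {j for j in N if p_row[j] < 0}
    rhs = B_i - sum(p_row[j] for j in Zneg)               # rhs of (2.9)
    z_hat = [1 - z_star[j] if j in Zneg else z_star[j] for j in N]
    for size in range(1, len(p_row) + 1):
        for C in combinations(N, size):
            if (sum(abs(p_row[j]) for j in C) > rhs        # dependent
                    and sum(1 - z_hat[j] for j in C) < 1):  # violated
                return C  # then sum_{j in C} z_hat_j <= |C| - 1 cuts z*
    return None

print(separate([2, 3, -1], 3, [0.5, 0.9, 0.2]))  # (0, 1): z1 + z2 <= 1
```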

Let then \((\overline{x},\overline{y})\) be a global integer optimal solution of (GCR). If \(\overline{y}\in {\varPsi }(\overline{x})\), then we conclude that \((\overline{x},\overline{y})\) is an optimal solution of (2.1). Otherwise, we must generate an inequality separating \((\overline{x},\overline{y})\) from the feasible set of (GCR). Actually, the type of inequality which has to be added to problem (GCR) in order to ensure finite convergence is \(x^{\top }y\le x^{\top }\widetilde{y}\), where \(\widetilde{y}\) is a feasible solution of the lower level problem. Since this inequality, valid for the feasible set of problem (2.1), will not necessarily cut off the point \((\overline{x},\overline{y})\), an additional inequality is required which cuts off \((\overline{x},\overline{y})\) from the feasible set of (GCR) (see the example below for a better understanding). The following proposition from [16, Proposition 1] can be used to generate such an inequality.

Proposition 2.6

Let \((\overline{x},\overline{y})\in \mathbb {B}^{2n}\) be a basic feasible solution of problem (GCR). Let J be the set of indices of the constraints that are binding at \((\overline{x},\overline{y})\), i.e., \(J=\{i:\;e^{'}_{i}\overline{x}+f^{'}_{i}\overline{y}=B_{i}\}\), where \(e^{'}_{i}\) and \(f^{'}_{i}\) \(\big (i\in \{1,\ldots ,k+s+|I|\}\big )\) are the rows of the matrices E and F, and \(B_{i}\) are the components of the vector B. Then

$$\begin{aligned} \pi _{1}x+\pi _{2}y\le \pi _{0}-1, \end{aligned}$$

where \(\pi _{1}:=\sum _{j\in J}e^{'}_{j}\), \(\pi _{2}:=\sum _{j\in J}f^{'}_{j}\), and \(\pi _{0}:=\sum _{j\in J}B_{j}\), is valid for every feasible solution of (2.1).
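Computationally, the cut only requires summing the binding rows. A minimal sketch (assuming the binding set J is detected numerically, with a tolerance of our own choosing):

```python
import numpy as np

def cut(E, F, B, x_bar, y_bar, tol=1e-9):
    """Cut of Proposition 2.6: sum the rows binding at (x_bar, y_bar)
    and tighten the right-hand side by one."""
    J = np.flatnonzero(np.abs(E @ x_bar + F @ y_bar - B) <= tol)
    return E[J].sum(axis=0), F[J].sum(axis=0), B[J].sum() - 1

# Binding rows at (1,1,1,0) in Example 2.9: y1 <= 1, -x2 <= -1, -y2 <= 0
E = np.array([[0., 0.], [0., -1.], [0., 0.]])
F = np.array([[1., 0.], [0., 0.], [0., -1.]])
B = np.array([1., -1., 0.])
print(cut(E, F, B, np.array([1., 1.]), np.array([1., 0.])))
# (array([ 0., -1.]), array([ 1., -1.]), -1.0):  -x2 + y1 - y2 <= -1
```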

Remark 2.7

  1. 1.

    Every feasible solution of (2.7) is an extreme point of the feasible set of (GCR) [29, Proposition 2.1 Section II.6.2].

  2. 2.

    Every new extreme point generated by the cut from Proposition 2.6 cannot be feasible for (2.1).

The second assertion just follows from the first one.

In what follows, we present a depth-first search centered on the upper level variables. The basic idea of the algorithm is to solve the linear problem (GCR), which is a relaxation of (2.1). At each iteration, a check is made to see whether the solution of (GCR) is feasible for (2.1). If so, the corresponding point is a potential solution of (2.1); if not, a branch and cut scheme is used to examine all other combinations of the upper level variables. The algorithm terminates when all subproblems have either been solved, are known to have only suboptimal solutions, or are infeasible.

Before presenting the algorithm, we provide a structure for enumerating the upper level variables.

Let \(W:=\{1,\ldots ,n\}\). At the k-th level of the branch and cut tree, we define a subset of indices \(W_{k}\subseteq W\). We set \(S^{+}_{k}:= \{i\in W_{k}: x_{i}^{k}=1\}\), \(S^{-}_{k}:=\{ i\in W_{k}: x_{i}^{k}=0\}\), \(S^{o}_{k}:= W\setminus W_{k}\), \((GCR)_{k}\) is the problem (GCR) in the k-th iteration.

Step 0::

Set \(k=0\), \(S^{+}_{k}=\emptyset \), \(S^{-}_{k}=\emptyset \), \(S^{o}_{k}=\{1,\ldots ,n\}\), \(W_{k}=\emptyset \), \(\underline{F}=+\infty \), and \((GCR)_{0}\)=(GCR). I is a fixed given set of lower level feasible points (\(I=\emptyset \) is also possible).

Step 1::

Set \(x_{j}=1\) for \(j\in S^{+}_{k}\), \(x_{j}=0\) for \(j\in S^{-}_{k}\). Solve the linear problem \((GCR)_{k}\): If it is infeasible, go to Step 5. If its solution is not integer, cut this point off by means of (2.11) or by some of the valid inequalities described in [7], and re-solve \((GCR)_{k}\) until an integer solution is obtained. Let \((x^{k},y^{k})\) be its optimal solution.

Step 2::

Solve the lower level problem with \(x=x^{k}\). Let \(\widehat{y}^{k}\) be its optimal solution. Compute \(F(x^{k},\widehat{y}^{k})\) and put \(\underline{F}=\min \{\underline{F}, F(x^{k},\widehat{y}^{k})\}\).

Step 3::

If \((y^{k}-\widehat{y}^{k})^{\top } x^{k}=0\), then go to Step 5; otherwise, choose \(i_{o}\) for which \((y_{i_{o}}^{k}-\widehat{y}_{i_{o}}^{k}) x_{i_{o}}^{k}\ne 0\) and \(i_{o}\notin S^{+}_{k}\). Set \(S^{+}_{k+1}=S^{+}_{k}\cup \{i_{o}\}\), \(S^{-}_{k+1}=S^{-}_{k}\), \(S^{o}_{k+1}=S^{o}_{k}\backslash \{i_{o}\}\) and go to Step 4.

Step 4::

Generate a generalized covering inequality for \(x^{\top }y\le x^{\top }\widetilde{y}\), where \(\widetilde{y}\) is a feasible solution of the lower level problem, and, using Proposition 2.6, an inequality that is violated by \((x^{k},y^{k})\) but valid for the other feasible points of problem (2.1). Add the two generated inequalities to the constraints of \((GCR)_{k}\), set \(k=k+1\) and go to Step 1.

Step 5::

If no live node (i.e., a node associated with a subproblem that has not been fully explored yet) exists, go to Step 6. Otherwise, return to the most recently created live node, call it j, and let \(k'\) denote the value of the counter at the last branching (for \(k'=0\), set \(S^{-}_{k'}=\emptyset \)). Branch on its complement by setting \(x^{k}_{j}=0\). Set \(S^{+}_{k+1}=S^{+}_{k}\backslash \{j\}\), \(S^{-}_{k+1}=S^{-}_{k'}\cup \{j\}\), \(S^{o}_{k+1}=S^{o}_{k}\cup \{j\}\) and go to Step 1.

Step 6::

If \(\underline{F}=+\infty \), no optimal solution of problem (2.1) exists; otherwise, the feasible point associated with \(\underline{F}\) is an optimal solution.

Proposition 2.8

The algorithm will terminate at a global optimal solution of (2.1) in a finite number of steps.

Proof

It follows from Remark 2.1 that, in the worst case, we will add p generalized covering inequalities corresponding to the inequalities \(x^{\top } y\le x^{\top } y^{i}\) for \(i=1,\ldots ,p\). We know that every feasible solution of (2.7) is an extreme point of the feasible set of (GCR) (Remark 2.7, Statement 1). Moreover, Proposition 2.6 implies that any integer optimal solution of (GCR) which is not feasible for (2.1) will be cut off without cutting any other feasible point of (2.1). Since we have a finite number of extreme points and by Remark 2.7 (Statement 2), the algorithm must terminate at an optimal solution of (2.1) in a finite number of iterations. It is worth mentioning that the cuts added at Step 1 do not influence the finiteness of the algorithm because only a finite number of cuts is needed to obtain the desired integer point [7, 29].\(\square \)

2.3 Illustration

The following example illustrates the procedure of the algorithm.

Example 2.9

Consider the following discrete bilevel programming problem:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} 2x_{1}+x_{2}-2y_{1}+y_{2}\\ \hbox { s.t. }&{} x_{1}+x_{2}\ge 2 \\ &{} x \in \mathbb {B}^{2} \\ &{} y \in {\varPsi }(x):= \underset{y}{\text {Argmin }} \left\{ x^{\top }y:\ y_{1}+y_{2}\le 2.5,\; y\in \mathbb {B}^{2} \right\} . \end{array} \end{aligned}$$

We easily see that \(x_{1}=x_{2}=1\) is the only possible choice for the upper level variable. We have \({\varPsi }(1,1)= \{(0,0)\}\). Hence, the optimal solution of this bilevel programming problem is (1, 1, 0, 0).

Let us take \(I=\{(1,1)\}\) as the starting set. After initializing the data at Step 0, we solve problem \((GCR)_{0}\) with the set I. To this end, let us relax the nonlinear inequality \(x^{\top }y\le x^{\top }y^{1}\), \(y^{1} \in I\), into a linear inequality:

\(h_{1}(x,y):= x^{\top }y- x^{\top }y^{1} =x_{1}y_{1}+x_{2}y_{2}- x_{1}-x_{2}=h_{1}(X_{1},X_{2},X_{3},X_{4})=X_{1}X_{3}+X_{2}X_{4}-X_{1}-X_{2}\).

With respect to the setting in Sect. 2.1, we have \(N_{1}=\{1,2,3,4\},\;N^{+}=\{1,2\}, L_{1}=\{3,4\}\), \(S_{1}=\{1,3\},S_{2}=\{2,4\},S_{3}=\{1\},S_{4}=\{2\}\), and the mapping

$$\begin{aligned} \gamma _{1} :&\{3,4\} \rightarrow \{1,2\} \end{aligned}$$

is defined by \(\gamma _{1}(3)=1\) and \(\gamma _{1}(4)=2\).

\(M\subseteq N_{1}\) is a cover of \(h_{1}(X)\le 0\) if \(|M|>|L_{1}|\). This implies that \(M=\{1,2,3\}\) is a cover of \(h_{1}(X)\le 0\).

\(S^{M}=\bigcup _{k\in M\cap N^{+}}S_{k}=S_{1}\cup S_{2}=\{1,2,3,4\}\) and \(S^{M}_{1}=\gamma _{1}(M\cap L_{1})=\{1\}\).

This leads to the inequality \(\overline{X}_{1}+\overline{X}_{2}+\overline{X}_{3}+\overline{X}_{4}+X_{1}\ge 1,\) i.e., \(-X_{2}-X_{3}-X_{4}\ge -3\), or \(x_{2}+y_{1}+y_{2}\le 3\).

Now solve the following linear programming problem:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} 2x_{1}+x_{2}-2y_{1}+y_{2}\\ \hbox { s.t. }&{} x_{1}+x_{2}\ge 2\\ &{} x_{2}+y_{1}+y_{2}\le 3 \\ &{} y_{1}+y_{2}\le 2.5 \\ &{} x,y\in [0,1]^{2}. \\ \end{array} \end{aligned}$$

The optimal solution is (1, 1, 1, 0) but \((1,0)\notin {\varPsi }(1,1)=\{(0,0)\}\) and \(\underline{F}=3\).

At Step 3, \(S^{+}_{1}=\{1\}\), i.e., \(x_{1}=1\), \(S^{-}_{1}=\emptyset \), \(S^{o}_{1}=\{2\}\), and at Step 4 we generate the generalized covering inequality of \(x^{\top }y\le x^{\top }y^{2}\) with \(y^{2}=(1,0)\) and the cut. Beforehand, we have to relax this nonlinear inequality into a linear one:

\(h_{2}(x,y):= x^{\top }y- x^{\top }y^{2}=x_{1}y_{1}+x_{2}y_{2}- x_{1}=h_{2}(X_{1},X_{2},X_{3},X_{4})=X_{1}X_{3}+X_{2}X_{4}-X_{1}.\)

Once more, with respect to the setting in Sect. 2.1, we have \(N_{2}=\{1,2,3\},\)

\(N^{+}=\{1,2\}, L_{2}=\{3\},\) \(S_{1}=\{1,3\},S_{2}=\{2,4\},S_{3}=\{1\}\) and

$$\begin{aligned} \gamma _{2} :&\{3\} \rightarrow \{1\}. \end{aligned}$$

\(M=\{1,3\}\) is a cover of \(h_{2}(X)\le 0\).

\(S^{M}=\bigcup _{k\in M\cap N^{+}}S_{k}=\{1,3\}\) and \(S^{M}_{2}=\gamma _{2}(M\cap L_{2})=\{1\}\). That leads to the inequality

\(\overline{X}_{1}+\overline{X}_{3}+X_{1}\ge 1\) i.e., \(y_{1}\le 1\).

Then we have the following linear problem:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} 2+x_{2}-2y_{1}+y_{2}\\ \hbox { s.t. }&{} x_{2}\ge 1 \\ &{} x_{2}+y_{1}+y_{2}\le 3 \\ &{} y_{1}+y_{2}\le 2.5 \\ &{} y_{1}\le 1\\ &{} x,y\in [0,1]^{2}. \\ \end{array} \end{aligned}$$

With \(x_{1}=1\), the optimal solution is still (1, 1, 1, 0). That is why we have to generate an inequality which is violated by (1, 1, 1, 0) and valid for the other feasible points of the original problem.

\(y_{1}\le 1\), \(x_{2}\ge 1\), and \(-y_{2}\le 0\) are the inequalities which are binding at (1, 1, 1, 0). So we apply Proposition 2.6, add \(-y_{2}+y_{1}-x_{2}\le -1\) to the above problem, and at Step 1 of the second iteration solve the following problem:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} 2+x_{2}-2y_{1}+y_{2}\\ \hbox { s.t. }&{} x_{2}\ge 1 \\ &{} y_{1}+y_{2}\le 2.5 \\ &{} y_{1}\le 1\\ &{} -x_{2}-y_{2}+y_{1}\le -1\\ &{} x_{2}+y_{2}+y_{1}\le 3\\ &{} x,y\in [0,1]^{2}. \\ \end{array} \end{aligned}$$

We get (1, 1, 1, 1) as solution with \((1,1)\notin {\varPsi }(1,1)=\{(0,0)\}\).

At Step 3, we have \(S^{+}_{2}=\{1,2\}\), i.e., \(x_{1}=1,\; x_{2}=1\), \(S^{-}_{2}=\emptyset \), \(S^{o}_{2}=\emptyset \).

We can take \(y_{1}+y_{2}\le 2\) as the generalized covering inequality of \(x^{\top }y\le x^{\top }y^{3}\) with \(y^{3}=(1,1)\). Then, after summing up all the inequalities which are binding at (1, 1, 1, 1), we get the following cut: \(3y_{1}+y_{2}\le 3\) with \(x_{2}=1\). We solve the problem at the first step of the third iteration and get (1, 1, 0, 0) as the integer solution. Updating the sets at Step 5 gives \(S^{+}_{3}=\{1\}\), \(S^{-}_{3}=\{2\}\), i.e., \(x_{1}=1,\;x_{2}=0\), \(S^{o}_{3}=\{2\}\), but the corresponding problem is infeasible. The same holds true at the fifth iteration with \(S^{+}_{4}=\emptyset \), \(S^{-}_{4}=\{1\}\), i.e., \(x_{1}=0\), \(S^{o}_{4}=\{1,2\}\).

The optimal solution is then given by (1, 1, 0, 0). The branch and cut tree is shown in Fig. 1.

Fig. 1 Branch and cut tree

3 Approach for solving discrete bilevel optimization problems of Type IV

Here, for any \(x \in \mathbb {R}^{n}_{+} \), the functions \(f(x,\cdot )\) and \(g(x,\cdot )\) are linear and affine, respectively, the upper level variable x appears only in the right-hand side of the lower level constraints, and the functions F and G are continuous. The problem we consider is formulated as

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0 \\ &{} x \in \mathbb {R}^{n}_{+}\\ &{} y \in {\varPsi }(x):= \underset{y}{\text {Argmin }} \left\{ c^{\top }y:\ Ay\ge x,\;y\in \mathbb {Z}^{m}_{+} \right\} , \end{array} \end{aligned}$$
(3.1)

where \(A\in \mathbb {Q}^{n\times m}_{+}\) is a matrix and \(c= (c_{1},\ldots ,c_{m})^{\top }\) with \(c_{i}>0\), \(i=1,\ldots ,m\), is a fixed vector. Again, the assumption that the constraints in the lower level problem are of the form \(Ay\ge x\) is not a strong restriction. We can add the equation \(z=b-Bx\) to the upper level problem when the lower level constraints are of the form \(Ay+Bx\ge b\).

We make the following assumptions:

  • The set \(X:=\{x\in \mathbb {R}^{n}_{+}:G(x)\le 0\} \) is not empty and bounded.

  • For any \(x \in X\), the set \(\{y\in \mathbb {Z}^{m}_{+}:Ay\ge x\}\) is not empty and the optimal solution of the lower level problem denoted by \(y_{x}\) is unique.

  • The set \(\overline{Y}:=\{y\in \mathbb {Z}^{m}_{+}:\;\exists x\in X\;\text {with}\;y \in {\varPsi }(x)\}=\bigcup _{x\in X}\{y_{x}\}\) is finite.

Using the optimal value reformulation of the lower level of (3.1), we get the following problem:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0 \\ &{} Ay\ge x \\ &{} x \in \mathbb {R}^{n}_{+}, \; y\in \mathbb {Z}^{m}_{+}\\ &{} c^{\top }y\le \varphi (x):=\underset{y}{\min }\left\{ c^{\top }y:\; Ay\ge x,\;y\in \mathbb {Z}^{m}_{+} \right\} . \end{array} \end{aligned}$$
(3.2)

Proposition 3.1

The optimal value function \(\varphi \) possesses the following properties:

  1. 1.

    \(\varphi \) is nondecreasing on \(\mathbb {R}^{n}_{+}\).

  2. 2.

    \(\varphi \) is subadditive, i.e., for all \(x_{1}, x_{2}\in \mathbb {R}^{n}_{+}, \; \varphi (x_{1}+x_{2})\le \varphi (x_{1})+ \varphi (x_{2}) \) holds.

  3. 3.

    \(\varphi \) is lower semicontinuous on \(\mathbb {R}^{n}_{+}\) and piecewise constant on X.

  4. 4.

    If \(\varphi \) is discontinuous at \(\widetilde{x}\in \mathbb {R}^{n}_{+}\), then there exist \(\widetilde{y}_{1},\ldots , \widetilde{y}_{m}\in \mathbb {Z}_{+}\) such that \(\widetilde{x}_{i}=\sum _{j=1}^{m}a_{ij}\widetilde{y}_{j}\) for all \(i=1,\ldots ,n\) with \(A=(a_{ij})_{(i,j)\in {\{1,\ldots ,n\}\times \{1,\ldots ,m\} }}.\)

Proof

  1. 1.

    Let \(x_{1},x_{2}\in \mathbb {R}^{n}_{+}\) and \(\mathcal {M}(x_{i}):=\{y\in \mathbb {Z}^{m}_{+}: Ay\ge x_{i}\}\), \(i=1,2\). Then \(x_{1}\le x_{2}\) implies \(\mathcal {M}(x_{1})\supseteq \mathcal {M}(x_{2})\) which in turn entails that \(\varphi \) is nondecreasing on \(\mathbb {R}^{n}_{+}\).

  2. 2.

    Let \(y_{1},y_{2}\) be the optimal solution of the lower level problems of (3.2) for \(x_{1}, x_{2}\in \mathbb {R}^{n}_{+}, \) respectively. We have \(\varphi (x_{1})+\varphi (x_{2})=c^{\top }y_{1}+c^{\top }y_{2}=c^{\top }(y_{1}+y_{2})\ge \varphi (x_{1}+x_{2}) \) because \(\varphi (x_{1}+x_{2})=\underset{y}{\min }\{c^{\top }y: Ay\ge x_{1}+x_{2},\; y\in \mathbb {Z}^{m}_{+}\}\) and \(A(y_{1}+y_{2})\ge x_{1}+x_{2} \).

  3. 3.

    The lower semicontinuity of \(\varphi \) comes from the fact that the matrix A has rational elements and \(\varphi \) does not take the value \(-\infty \) [1, Theorem 4.5.2]. Moreover, we know that

    $$\begin{aligned} X=\bigcup _{y\in \overline{Y}} R(y) \end{aligned}$$

with \(R(y):=\big \{x\in X: y\in {\varPsi }(x)\big \}\); R(y) is called the region of stability of y. We easily see that \(\varphi \) is constant on each region of stability (a numerical illustration is given after the proof). Since \(\overline{Y}\) is finite, the assertion of the proposition follows.

  4. 4.

Let \(\widetilde{x}\) be a point of discontinuity of the function \(\varphi \) and \(\widetilde{y}\in \mathcal {M}(\widetilde{x})\) such that \(\varphi (\widetilde{x})=c^{\top }\widetilde{y}\). We want to show that \(\widetilde{x}=A\widetilde{y}\). Let \(\widetilde{I}=\left\{ j\in \{1,\ldots ,n\}:(A\widetilde{y})_{j}>\widetilde{x}_{j}\right\} \) (for \(v\in \mathbb {R}^{n}\), \(v_{j}\) denotes the j-th component of v). If \( \widetilde{I}\ne \emptyset \), we consider the vector \(x^{0}\) defined by

    $$\begin{aligned} x^{0}_{j}={\left\{ \begin{array}{ll} (A\widetilde{y})_{j} &{}\quad \text {if} \; j\in \widetilde{I} , \\ \widetilde{x}_{j} &{}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

    We have \(\widetilde{x}\le x^{0}\; \text {and} \;\widetilde{x}\ne x^{0} \) which imply \( \varphi (\widetilde{x})<\varphi (x^{0})\) because \(\widetilde{x}\) is a point of discontinuity of \(\varphi \). Furthermore since \(\widetilde{y}\in \mathcal {M}(x^{0})\) we have

    $$\begin{aligned} c^{\top }\widetilde{y}=\varphi (\widetilde{x})<\varphi (x^{0})\le c^{\top }\widetilde{y}, \end{aligned}$$

a contradiction. Consequently, \(\widetilde{I}=\emptyset \), i.e., \(\widetilde{x}=A\widetilde{y}\), so that \(\widetilde{x}\) is a combination of the columns of A with nonnegative integer coefficients.

This completes the proof.\(\square \)
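The piecewise constancy stated in assertion 3 is easy to observe numerically. The following sketch tabulates \(\varphi \) on a grid for a tiny instance whose data we invented purely for illustration:

```python
from itertools import product

# phi(x) = min{2*y1 + 3*y2 : y1 + 2*y2 >= x, y in Z^2_+}; the
# enumeration bound on y is large enough for the x values below.
def phi(x, ybound=8):
    return min(2 * y1 + 3 * y2
               for y1, y2 in product(range(ybound), repeat=2)
               if y1 + 2 * y2 >= x)

print([phi(k / 2) for k in range(9)])
# [0, 2, 2, 3, 3, 5, 5, 6, 6]: constant between consecutive jump points
```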

In the following, we will use the previous properties of \(\varphi \) in order to find an upper bound of the optimal value function.

3.1 Upper bound for a single constraint

Firstly, we suppose that the lower level problem has only one constraint, i.e., \(A\in \mathbb {Q}^{1 \times m}_{+}\), \(x\in \mathbb {R}_{+}\). We set \(A=(a_{1},\ldots , a_{m})\).

We have \(\varphi (0)=\underset{y}{\min }\left\{ c^{\top }y:\; Ay\ge 0,\;y\in \mathbb {Z}^{m}_{+}\right\} =0\) because \(c>0\). Therefore, we know that \(\varphi (x)>-\infty \) for all \(x\in \mathbb {R}^{n}_{+}\) for arbitrary n [29, Proposition 2.2 Section II.3.2].

Without loss of generality, we can suppose \(a_{1}\le a_{2} \le \cdots \le a_{m} \). Then, the first points where the function could be discontinuous are: \(0, a_{1},a_{2}, \min \{a_{3},a_{1}+a_{2}\},\ldots \)

Let \(a\ge a_{1}\) be the first nonzero point where \(\varphi \) is discontinuous. We observe that the optimal value function on [0, a] satisfies

$$\begin{aligned} \varphi (x) = {\left\{ \begin{array}{ll} \varphi (0) &{}\quad \text {if} \; x=0,\\ \varphi (a) &{}\quad \text {if} \; x\in ]0,a]. \end{array}\right. } \end{aligned}$$
(3.3)

Definition 3.2

Let \(e>0\). For every subadditive function \(f:[0,e]\rightarrow \mathbb {R}\) we can define:

A subadditive extension of f on \([0,+\infty [\): This is a function which extends f to \([0,+\infty [\) and is subadditive. Some examples of subadditive extension functions can be found in [25].

Its maximal subadditive extension, denoted \(S(f): [0,+\infty [\rightarrow \mathbb {R}\cup \{-\infty \}\) and given by \(S(f) (x):= \inf \left\{ \sum _{i=1}^l f(x_{i}): \{x_{1},\ldots , x_{l}\}\in \mathcal {C}(x)\right\} ,\) where \( \mathcal {C}(x) \) is the set of all finite collections \(\{x_{1},\ldots , x_{l}\}\) such that \(0\le x_{i}\le e\) for all \(i=1,\ldots ,l\) and \(\sum _{i=1}^l x_{i}=x\).

In the sequel, for ease of reference we denote S(f) by Sf.

Proposition 3.3

[25] If a function f is subadditive and nondecreasing, then Sf is subadditive and nondecreasing and for every other subadditive extension \(\mathcal {G}\) of f to \([0,+\infty [\), we have \(\mathcal {G}\le Sf\).

We can directly deduce from Proposition 3.3 that \(\varphi \le S\varphi \) since \(\varphi \) is already defined on \([0,+\infty [\) and subadditive, see Proposition 3.1.

It is worth mentioning that the subadditivity of \(\varphi \) is essential for the bound \(\varphi \le S\varphi \). If it fails, \(S\varphi \) may not be an upper bound of \(\varphi \), as we can see in the following example from [13]:

$$\begin{aligned} \varphi (x)=\underset{y}{\min }\left\{ y^{2}:\; 2y\ge x,\;y\in \{0,1,2,3\} \right\} . \end{aligned}$$

Here \(a=2\), \( \varphi (a)=1\) and

$$\begin{aligned} \varphi (x) ={\left\{ \begin{array}{ll} 0 &{}\quad \text {if} \; x=0,\\ 1 &{}\quad \text {if} \; x\in ]0,2],\\ 4 &{} \quad \text {if} \; x\in ]2,4],\\ 9 &{}\quad \text {if} \; x\in ]4,6],\\ +\infty &{}\quad \text {if} \; x>6. \end{array}\right. } \end{aligned}$$

\(\varphi \) is not subadditive on [0, 2] because \(\varphi (1+2)=4>\varphi (1)+\varphi (2)=2\) (the non-subadditivity of \(\varphi \) can also be explained by the fact that the function \(y\mapsto y^{2}\) is not linear), and we have \(S\varphi (3)\le 3\varphi (1)=3<\varphi (3)=4\).
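This failure is easy to check numerically; a minimal verification of the example above:

```python
# phi(x) = min{y^2 : 2y >= x, y in {0,1,2,3}} from the example above.
def phi(x):
    vals = [y * y for y in range(4) if 2 * y >= x]
    return min(vals) if vals else float("inf")

print(phi(1), phi(2), phi(3))   # 1 1 4: phi(1+2) > phi(1) + phi(2)
```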

On the other hand, we have the following theorem from [5]:

Theorem 3.4

Let f be subadditive on [0, e] for \(e>0\). Then \(Sf(ne+x)=nf(e)+f(x)\) for all \(x\in ]0,e]\) and all positive integers n if and only if

$$\begin{aligned} \forall y\in ]0,e]\; \forall u\in [y,e]:\; f(y)\le f(e+y-u)-f(e)+f(u). \end{aligned}$$
(3.4)

Since the function \( \varphi \) is subadditive and fulfills the condition (3.4) (this follows just from (3.3)), we can express \( S\varphi \) \(\text {for} \;x\in ]na, (n+1)a], \; n \in \mathbb {N},\) by

$$\begin{aligned} \begin{array}{rl} S\varphi (x)&{}=n\varphi (a)+\varphi (x-na)\\ &{} =(n+1)\varphi (a) \end{array} \end{aligned}$$

because \(0<x-na\le a\) implies \(\varphi (x-na)=\varphi (a)\) see (3.3).

In the following example, we visualize the functions \( S\varphi \) and \( \varphi \) in Fig. 2.

$$\begin{aligned} \varphi (x)=\min \{2y_{1}+3.5y_{2}+3y_{3}+4y_{4}: 2y_{1}+3.5y_{2}+5y_{3}+7.5y_{4}\ge x,\; y \in \mathbb {Z}^{4}_{+}\}. \end{aligned}$$

Here, \(a=2\) is the first nonzero point of discontinuity, and \(\varphi (2)=2\). The dotted line represents the function \(S\varphi \), the bold line represents the function \(\varphi \), and \(S\varphi (x)=\varphi (x)\) for \(x\in [0,2]\). Moreover, we can see from this figure that the functions \(\varphi \) and \(S\varphi \) do not have the same points of discontinuity.

Fig. 2 Optimal value function and maximal subadditive extension

We will now replace problem (3.2) for \(n=1\) by

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0\\ &{} Ay\ge x \\ &{} c^{\top }y\le S\varphi (x)\\ &{} x \in \mathbb {R}_{+},\;y\in \mathbb {Z}^{m}_{+}, \end{array} \end{aligned}$$
(3.5)

where

$$\begin{aligned} S\varphi (x) = {\left\{ \begin{array}{ll} \varphi (0) &{}\quad \text {if} \; x=0,\\ (n+1)\varphi (a) &{}\quad \text {if} \; na<x\le (n+1)a,\; n\in \mathbb {N}. \end{array}\right. } \end{aligned}$$
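Evaluating this closed form is straightforward; the sketch below assumes that the first nonzero discontinuity point a and the value \(\varphi (a)\) have already been computed:

```python
import math

def S_phi(x, a, phi_a):
    """S*phi in the single-constraint case: phi(0) = 0 (since c > 0)
    and (n+1)*phi(a) on ]na, (n+1)a]."""
    if x == 0:
        return 0.0
    n = math.ceil(x / a) - 1      # the n with n*a < x <= (n+1)*a
    return (n + 1) * phi_a

# The instance of Fig. 2: a = 2, phi(a) = 2
print([S_phi(x, 2.0, 2.0) for x in (0, 1.5, 2, 3, 6)])
# [0.0, 2.0, 2.0, 4.0, 6.0]
```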

Problem (3.5) is a relaxation of problem (3.2) (the feasible set of problem (3.5) is larger than that of problem (3.2)), and it is a mixed-integer programming problem for which the existence of an optimal solution is not guaranteed, because \(\{y\in \mathbb {Z}^{m}_{+}: \;Ay\ge x\}\) is not bounded and the function \(S\varphi \) is discontinuous. Therefore, for each \(n\in \mathbb {N}\), we consider the problem

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0\\ &{} Ay\ge x \\ &{} c^{\top }y\le S\varphi (x)\\ &{} na\le x\le (n+1)a\\ &{} y\in \mathbb {Z}^{m}_{+}. \end{array}\quad \quad \quad \quad \quad \quad (P_{n}) \end{aligned}$$

The family of problems \((P_{n})_{ n\in \mathbb {N}}\) is deduced from (3.5); each \((P_{n})\) is defined on the closure of an interval on which the function \(S\varphi \) is constant.

Remark 3.5

The family of problems \((P_{n})_{ n\in \mathbb {N}}\) depends on the function \(S\varphi \). Hence, if \(S\varphi \) changes, then for all \(n\in \mathbb {N}\), \(P_{n}\) changes as well.

Since the set \(\{(x,y)\in \mathbb {R}^{n}_{+}\times \mathbb {R}^{m}_{+}:\; c^{\top }y\le (n+1)\varphi (a),\;na\le x\le (n+1)a\} \) is bounded and closed for fixed \(n\in \mathbb {N}\), an optimal solution of \((P_{n})\) exists (provided \((P_{n})\) is feasible). Furthermore, only a finite number of the sub-problems \((P_{n})\), \(n\in \mathbb {N}\), are feasible due to the boundedness of the set X (the function \(S\varphi \) depends only on x). Consequently, the problem

$$\begin{aligned} \begin{array}{rl} \underset{(x^{n},y^{n})}{\min }&{} F(x^{n},y^{n})\\ \hbox { s.t. }&{} (x^{n},y^{n})\;\text {is an optimal solution of}\; (P_{n})\\ &{} n\in \mathbb {N}, \end{array} \end{aligned}$$
(3.6)

possesses a finite feasible set.

From the definition of \((P_{n})\), every feasible solution of (3.5) is in the feasible set of one of the sub-problems \((P_{n})\). Therefore, the following proposition holds:

Proposition 3.6

Let \((\overline{x},\overline{y})\) be a global optimal solution of (3.6) . If \(\overline{y}\in {\varPsi }(\overline{x})\), then it is also a global optimal solution of problem (3.1).

Proof

The point \((\overline{x},\overline{y})\) is a feasible solution of (3.1) because it is a feasible solution of \((P_{n})\) for a certain \(n\in \mathbb {N} \) and \(\overline{y}\in {\varPsi }(\overline{x})\). Furthermore, every feasible solution of (3.1) is a feasible solution of (3.5) and, hence, a feasible solution of one of the sub-problems \((P_{n})\). Accordingly, the global optimality of \((\overline{x},\overline{y})\) for (3.6) implies that \((\overline{x},\overline{y})\) is also a global optimal solution of problem (3.1).\(\square \)

3.2 An algorithm for solving the discrete linear bilevel programming problem with a single constraint

From Proposition 3.6, problem (3.6) is a relaxation of problem (3.1). Consequently, the procedure for solving problem (3.1) is to reduce the feasible set of (3.6) at each iteration so that, in the worst case, it eventually coincides with that of (3.1). That is, we solve problem (3.6) and update the function \(S\varphi \) so that the set \(\{x: S\varphi (x)=\varphi (x)\}\) becomes larger, by constructing a new function \(S\varphi (x)\) which approximates \(\varphi (x)\) better. This is our algorithm:

Step 0::

Solve problem (3.6): If it is infeasible, then problem (3.1) does not have a solution. Let \((x^{0},y^{0})\) be the optimal solution of (3.6). If \((x^{0},y^{0})\) is a feasible solution for (3.1), i.e., \({y}^{0}\in {\varPsi }(x^{0})\), then \((x^{0},y^{0})\) is the global optimal solution of (3.1); otherwise go to Step 1.

Step 1::

Compute \(\varphi (x^{0})\) and define \(S\varphi _{0}\) by

$$\begin{aligned} S\varphi _{0}(x) := {\left\{ \begin{array}{ll} S\varphi (x) \;\;\;\;&{}\text {if} \; x\in [0,a],\\ \varphi (x^{0}) \;\;\;\;&{}\text {if} \; x\in ]a,x^{0}],\\ S\varphi (x) \;\; &{}\text {if} \; x>x^{0}, \end{array}\right. } \end{aligned}$$

and then set \(k:=1\).

Step 2::

Solve problem (3.6) with respect to \(S\varphi _{k-1}\): If it is infeasible, then problem (3.1) does not have a solution. Let \((x^{k},y^{k})\) be the optimal solution of (3.6). If \((x^{k},y^{k})\) is a feasible solution of (3.1), then \((x^{k},y^{k})\) is an optimal solution of (3.1); otherwise go to Step 3.

Step 3::

Compute \(\varphi (x^{k})\) and define \(S\varphi _{k}\) by

$$\begin{aligned} S\varphi _{k}(x) := {\left\{ \begin{array}{ll} \min \{S\varphi _{k-1}(x), \varphi (x^{k})\} \;\;\;\;&{}\text {if} \; x\le x^{k},\\ S\varphi _{k-1}(x) \;\; &{}\text {if} \; x> x^{k}. \end{array}\right. } \end{aligned}$$
Step 4::

Set \(k=k+1\) and go to Step 2.

If it turns out at Step 3 of a certain iteration that the function \(S\varphi _{k}\) no longer changes and we do not yet have an optimal solution, then, in order to enlarge the set \(\big \{x: S\varphi (x)=\varphi (x)\big \}\), we can set \(\widetilde{x}= x^{k} +1\) and define \(S\varphi _{k+1}\) by:

$$\begin{aligned} S\varphi _{k+1}(x) := {\left\{ \begin{array}{ll} S\varphi _{k}(x) \;\;\;\;&{}\text {if} \; x\le x^{k},\\ \varphi (\widetilde{x}) \;\;\;\;&{}\text {if} \; x^{k}< x\le \widetilde{x},\\ S\varphi _{k}(x) \;\; &{}\text {if} \; \widetilde{x}<x. \end{array}\right. } \end{aligned}$$

One advantage of this algorithm is that we do not need to compute a global upper approximation of the optimal value function at each iteration, as was done in [13].
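The updates in Steps 1 and 3 modify \(S\varphi \) only below the current point \(x^{k}\). A sketch of this bookkeeping, with the bound stored as constant pieces (our own representation; \(\varphi (x^{k})\) is assumed to be computed beforehand):

```python
def update_Sphi(pieces, x_k, phi_xk):
    """Step 3: S*phi_k(x) = min(S*phi_{k-1}(x), phi(x^k)) for x <= x^k,
    unchanged above x^k.  `pieces` is a sorted list (lo, hi, value)
    representing the current bound on the half-open intervals ]lo, hi]."""
    new = []
    for lo, hi, val in pieces:
        if hi <= x_k:                      # piece entirely below x^k
            new.append((lo, hi, min(val, phi_xk)))
        elif lo < x_k < hi:                # split the straddling piece
            new.append((lo, x_k, min(val, phi_xk)))
            new.append((x_k, hi, val))
        else:                              # piece entirely above x^k
            new.append((lo, hi, val))
    return new

# Step 1 of Example 3.9 below: S*phi = 2, 4, 6 on ]0,2], ]2,4], ]4,6],
# updated at x^0 = 4.5 with phi(4.5) = 3:
print(update_Sphi([(0, 2, 2), (2, 4, 4), (4, 6, 6)], 4.5, 3))
# [(0, 2, 2), (2, 4, 3), (4, 4.5, 3), (4.5, 6, 6)], i.e. S*phi_0
```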

Proposition 3.7

Let R(y) be the region of stability for \(y\in \mathbb {Z}^{m}_{+}\). If all the regions of stability are closed, then the algorithm will terminate at a global optimal solution of (3.1) in a finite number of steps.

Proof

We know from the assumptions that the set X is bounded and is the union of a finite number of regions of stability. Since at each iteration the new problem remains a relaxation of problem (3.2) and the set \(\big \{x: S\varphi (x)=\varphi (x)\big \}\) is enlarged, we obtain an optimal solution of (3.1) after a finite number of steps. The closedness of the regions of stability ensures that an optimal solution of (3.1) exists [20].\(\square \)

Remark 3.8

In Proposition 3.7, we assumed that all the regions of stability are closed. In fact, our bilevel programming problem has continuous upper level and discrete lower level variables. Therefore, the optimal set mapping \({\varPsi }\) is in general not upper semicontinuous [20], and this property is an essential assumption for the existence of an optimal solution of the optimistic formulation of the bilevel programming problem [12].

If there exists \(y\in \mathbb {Z}^{m}_{+}\) such that R(y) is not closed, we can nevertheless find a nearly optimal solution of problem (3.1), called a weak solution. To obtain this solution, we need to solve the problem

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0\\ &{} y\in \widehat{\psi }_{o}(x), \end{array} \end{aligned}$$
(3.7)

where

$$\begin{aligned}&\widehat{\psi }_{o}(x):=\big \{y\in \mathbb {Z}^{m}_{+}: R(y)\ne \emptyset , \;x\in \text {cl}\;O(y)\big \}\\&\quad O(y):= \Big \{x \in R(y): F(x,y)=\underset{z}{\min }\big \{F(x,z):\;z\in {\varPsi }(x)\big \}\Big \}. \end{aligned}$$

Problem (3.7) is a bilevel programming problem for which an optimal solution is guaranteed to exist whenever the set X is compact, because the point-to-set mapping \(x\mapsto \widehat{\psi }_{o}(x) \) is upper semicontinuous [20]. An algorithm to solve (3.7) is given in [20] as well.

3.3 Illustration

The following example illustrates the procedure of our algorithm.

Example 3.9

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} (x-4.5)^{2}+y_{3}\\ \hbox { s.t. }&{}x\in [0,6]\\ &{} y \in {\varPsi }(x):= \underset{y}{\text {Argmin }} \left\{ 2y_{1}+3.5y_{2}+3y_{3}:\; 2y_{1}+y_{2}+5y_{3}\ge x,\;y\in \mathbb {Z}^{3}_{+} \right\} . \end{array} \end{aligned}$$

The first nonzero point of discontinuity is \(a=2\) and we have \(\varphi (a)=2\). We consider the mixed integer programming problem

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} (x-4.5)^{2}+y_{3}\\ \hbox { s.t. }&{}x\in [0,6]\\ &{}2y_{1}+y_{2}+5y_{3}\ge x\\ &{} 2y_{1}+3.5y_{2}+3y_{3}\le S\varphi (x)\\ &{} x \in \mathbb {R}_{+}, y\in \mathbb {Z}^{3}_{+}, \end{array} \end{aligned}$$

where

$$\begin{aligned} S\varphi (x) = {\left\{ \begin{array}{ll} 0 \;\;\;\;&{}\text {if} \; x=0,\\ 2(n+1)\;\; &{}\text {if} \; 2n<x\le 2(n+1),\; n\in \mathbb {N}. \end{array}\right. } \end{aligned}$$

The problems \((P_{n})_{n\in \{0,1,2\}}\) are given by:

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} (x-4.5)^{2}+y_{3}\\ \hbox { s.t. }&{} x\in [0,6]\\ &{} 2y_{1}+y_{2}+5y_{3}\ge x\\ &{} 2y_{1}+3.5y_{2}+3y_{3}\le 2(n+1) \\ &{} 2n\le x\le 2(n+1)\\ &{} y\in \mathbb {Z}^{3}_{+}. \end{array}\quad \quad \quad \quad \quad \quad (P_{n}) \end{aligned}$$
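To make the following walkthrough reproducible, here is a hedged brute-force solver for these sub-problems (our own helper, not the paper's code): for fixed y, the best x in the admissible interval is 4.5 clamped to that interval.

```python
from itertools import product

def solve_Pn(n, bound):
    """Solve (P_n) above by enumeration; bound = S*phi on ]2n, 2(n+1)].
    The constraint c^T y <= 6 bounds every component of y by 3."""
    best = None
    for y in product(range(4), repeat=3):
        if 2*y[0] + 3.5*y[1] + 3*y[2] > bound:     # c^T y <= S*phi(x)
            continue
        lo, hi = 2*n, min(2*(n + 1), 6, 2*y[0] + y[1] + 5*y[2])
        if lo > hi:
            continue                               # no feasible x
        x = min(max(4.5, lo), hi)                  # argmin of (x-4.5)^2
        cand = ((x - 4.5)**2 + y[2], x, y)
        best = cand if best is None else min(best, cand)
    return best

print(min(solve_Pn(n, 2*(n + 1)) for n in range(3)))
# (0.0, 4.5, (3, 0, 0)): the point (4.5, 3, 0, 0) found at Step 0
```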
Step 0::

\((4.5,3,0,0)\in \text {Argmin}\big \{\big (x-4.5\big )^{2}+y_{3}:\;(x,y)\in \{(2,1,0,0), (4,2,0,0), (4.5,3,0,0)\}\big \}\) and \((3,0,0)\notin {\varPsi }(4.5)=\{(0,0,1)\}\).

Step 1::

Replace \(S\varphi \) by \(S\varphi _{0}\) defined by:

$$\begin{aligned} S\varphi _{0}(x) := {\left\{ \begin{array}{ll} 0 \;\;\;\;&{}\text {if} \; x=0,\\ 2 \;\;\;\;&{}\text {if} \; x\in ]0,2],\\ 3 \;\;\;\;&{}\text {if} \; x\in ]2,4.5],\\ 6 \;\;\;\;&{}\text {if} \; x\in ]4.5,6]. \end{array}\right. } \end{aligned}$$
Step 3::

The solution of each problem \((P_{n})\), \(n\in \{0,1,2\}\), is computed with respect to \(S\varphi _{0}\), and we again obtain (4.5, 3, 0, 0) as the optimal solution of

$$\begin{aligned} \min \left\{ (x-4.5)^{2}+y_{3}:\;(x,y)\in \{(2,1,0,0), (4.5,0,0,1),(4.5,3,0,0)\}\right\} . \end{aligned}$$

This implies that \(S\varphi _{1}=S\varphi _{0}\). In order to continue, we must update \(S\varphi _{0}\). We consider the point 5.5 with \(\varphi (5.5)=5\) and define \(S\varphi _{1}\) by

$$\begin{aligned} S\varphi _{1}(x) := {\left\{ \begin{array}{ll} 0 \;\;\;\;&{}\text {if} \; x=0,\\ 2 \;\;\;\;&{}\text {if} \; x\in ]0,2],\\ 3 \;\;\;\;&{}\text {if} \; x\in ]2,4.5],\\ 5 \;\;\;\;&{}\text {if} \; x\in ]4.5,5.5],\\ 6 \;\;\;\;&{}\text {if} \; x\in ]5.5,6]. \end{array}\right. } \end{aligned}$$

At Step 3 of the next iteration, we solve for each \(n\in \{0,1,2,3\}\) the problem \((P_{n})\) with respect to \(S\varphi _{1}\) and compute the solution of (3.5) by solving the following problem

$$\begin{aligned}&\min \left\{ (x-4.5)^{2}+y_{3}:\;(x,y)\in \{(2,1,0,0), (4.5,1,0,1), (4.5,0,0,1),\right. \\&\quad \left. (5.5,3,0,0)\}\right\} . \end{aligned}$$

We get the point (4.5, 0, 0, 1) with \((0,0,1)\in {\varPsi }(4.5)\).

This implies that (4.5, 0, 0, 1) solves the bilevel programming problem.

3.4 Upper bound in the general case

Now, we return to the general case and extend the result to higher-dimensional spaces, i.e., to the case \(n>1\). We recall that \(\varphi \) is nondecreasing, piecewise constant, and subadditive. To solve problem (3.1), we need to extend the upper approximation to this general setting.

Definition 3.10

Let a real-valued function f be subadditive on the cuboid \([0,h_{1}]\times \ldots \times [0,h_{n}]\), \(h\in \mathbb {R}^{n}_{+}\). We denote the maximal subadditive extension of f as follows:

$$\begin{aligned} Sf(d) := {\left\{ \begin{array}{ll} f(d) &{}\quad \text {if} \; d_{i}\in [0,h_{i}],\;\forall i\in \{1,\ldots ,n\}\\ \inf \left\{ \sum _{i=1}^l f(\rho ^{i}): \{\rho ^{1},\ldots , \rho ^{l}\}\in \mathcal {C}(d)\right\} &{}\quad \text {if} \;\exists i\in \{1,\ldots ,n\}: d_{i}\notin [0,h_{i}], \end{array}\right. } \end{aligned}$$

where \( \mathcal {C}(d) \) is the set of all finite collections \(\{\rho ^{1},\ldots ,\rho ^{l}\}\) such that \(\rho ^{j}\in \mathbb {R}^{n}\) with \(\rho ^{j}_{i}\in [0,h_{i}]\) for all \(j \in \{1,\ldots ,l\}\) and \(i\in \{1,\ldots ,n\}\), and \(\sum _{j=1}^l\rho ^{j}=d\).

Proposition 3.11

[22] Let \(h\in \mathbb {R}^{n}_{+}\) be given and choose a real-valued function f which is subadditive on the cuboid \([0,h_{1}]\times \ldots \times [0,h_{n}]\) with \(f(0)=0\). The maximal subadditive extension of f is subadditive, and if \(\mathcal {G}\) is any other subadditive extension of f on \(\mathbb {R}^{n}_{+}\), then \(\mathcal {G}\le Sf\).

On the other hand, it turns out that the results of Theorem 3.4 obtained in one dimension have their analogue in the n-dimensional setting, where the interval \(I=[0,e]\) is replaced by the region defined by the inequalities \(0\le x_{i} \le e_{i}\) for \(i= 1,\ldots , n\). Then, we can replace \(S\varphi (x)\) by \((n+1)\varphi (a)\) for \(na< x\le (n+1)a\), \(n\in \mathbb {N}\), where a is a combination of the columns of the matrix A with nonnegative integer coefficients (Proposition 3.1, Statement 4). We then substitute problem (3.2) by the following one

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0\\ &{} Ay\ge x \\ &{} c^{\top }y\le S\varphi (x)\\ &{} x \in \mathbb {R}^{n}_{+},\;y\in \mathbb {Z}^{m}_{+}, \end{array} \end{aligned}$$
(3.8)

where

$$\begin{aligned} S\varphi (x):= {\left\{ \begin{array}{ll} \varphi (0) \;\;\;\;&{}\text {if} \; x=0,\\ (n+1)\varphi (a) \;\; &{}\text {if} \; na<x\le (n+1)a,\;n\in \mathbb {N},\\ r \;\; &{}\text {otherwise}, \end{array}\right. } \end{aligned}$$

and r is an arbitrary real number with \(\varphi (d)\le r\) for all \(d\in \mathbb {R}^{n}_{+} \); note that such an r exists due to the boundedness of X.

We also consider the sub-problems \((P_{n})\) with respect to \(S\varphi \) defined by

$$\begin{aligned} \begin{array}{rl} \underset{x,y}{\min }&{} F(x,y)\\ \hbox { s.t. }&{} G(x)\le 0\\ &{} Ay\ge x \\ &{} c^{\top }y\le S\varphi '(x)\\ &{} na\le x\le (n+1)a, \;\;n\in \mathbb {N}\\ &{} y\in \mathbb {Z}^{m}_{+}, \end{array}\quad \quad \quad \quad \quad \quad (P_{n}) \end{aligned}$$

where

$$\begin{aligned} S\varphi '(x) := {\left\{ \begin{array}{ll} (n+1)\varphi (a) \;\; &{}\text {if} \; na\le x\le (n+1)a,\;\;n\in \mathbb {N},\\ r \;\;&{} \text {otherwise}, \end{array}\right. } \end{aligned}$$

and the problem

$$\begin{aligned} \begin{array}{rl} \underset{(x^{n},y^{n})}{\min }&{} F(x^{n},y^{n})\\ \hbox { s.t. }&{} (x^{n},y^{n})\;\text {optimal solution of}\; (P_{n})\\ &{} n\in \mathbb {N}. \end{array} \end{aligned}$$
(3.9)

In the sequel, we can simply apply the previous algorithm, replacing Steps 1 and 3 by the following one:

New step::

Let \((x^{k},y^{k})\) be an optimal solution of (3.9). Replace \(S\varphi _{k-1}\) by \(S\varphi _{k}\) given by

$$\begin{aligned} S\varphi _{k}(x) := {\left\{ \begin{array}{ll} \min \{S\varphi _{k-1}(x), \varphi (x^{k})\} \;\;\;\;&{}\text {if} \; x\le x^{k},\\ S\varphi _{k-1}(x) \;\; &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$

4 Conclusion

In this paper, we proposed two approaches for solving linear discrete bilevel programming problems: first, for the case when all variables are integer, and second, for the case when the upper level variables are continuous and the lower level variables are discrete. Both approaches use the optimal value reformulation in order to replace the original problem by a single level program. This led to algorithms providing global solutions of the bilevel programming problem. Adapting the algorithm of the first case to the situation where the upper level variables also appear in the lower level constraints is not straightforward; the main difficulty lies in the fact that, in the inequality \(x^{\top }y\le x^{\top }\bar{y}\), \(\bar{y}\) is a feasible point of the lower level problem which depends on the parameter x. Exploring this direction further is an area of future research.