
1 Introduction

Most real-life problems, according to Leibniz and Euler [3, 10, 13], can be stated as optimization problems, because the laws of nature follow the principles of Fermat, Lagrange, Euler, and other equations derived from extremum principles.

On the other hand, the contemporary situation is characterized by the crucial impact and increasing value of numerical methods for the computational solution of problems of practical interest.

It is worth noting that optimization problems must be separated into two classes: convex and nonconvex. From the viewpoint of the numerical treatment of a problem of rather general kind, there exists a "solvable case": that of the convex optimization problems, those where the domain S and the functions f_0 and f_i are all convex [3, 13, 22, 40].

Under minimal additional computability assumptions, a convex optimization problem is computationally tractable [3, 22]: the computational effort required to solve the problem to a given accuracy grows moderately with the dimension of the problem and the number of required accuracy digits.

In contrast, general nonconvex problems are too difficult for numerical solution, since in a real-life nonconvex optimization problem there can exist many (often a huge number of!) local extrema and stationary points which are rather far from a global solution [9, 17, 25, 28].

As a consequence, the classical optimization methods (conjugate gradients, Newton's and quasi-Newton methods, TRM, SQP, IPM, etc.) turn out to be inoperative in general and ineffective at finding a global solution in nonconvex problems, because they are unable to escape a local pit.

Moreover, specialists in applied problems often do not question the correctness of directly applying classical optimization methods to nonconvex problems, and the numerical results are interpreted only in terms of their content, ignoring the fact that all classical optimization methods converge to the global solution only in convex problems [3, 22].

At the same time, in nonconvex problems the direct application of standard methods may have unpredictable consequences [3, 17, 24, 28, 40], and sometimes may even lead one away from the desired solution. Thus, various approaches arise that completely reject the classical optimization methods and use direct enumeration employing, for example, the branch-and-bound (B&B) idea or cutting methods. As is well known, the latter algorithms suffer from the curse of dimensionality: the volume of computations grows exponentially with the growth of the problem's dimension [17]. We are sure that there also exists another way of solving nonconvex problems of high dimension [18, 19, 26–37].

Over the last two decades, we have managed to construct a theory of global search which is harmonious from the viewpoint of optimization theory and which has unexpectedly turned out to be rather efficient computationally, especially for problems of high dimension. Global Optimality Conditions (GOCs), simultaneously necessary and sufficient, for the principal classes of nonconvex problems can be viewed as the kernel of the theory (see below) [28].

Furthermore, we have proposed a family of local search methods (LSMs) which, on the one hand, in some cases develop methods earlier known for special problems and, on the other hand, represent a joint ensemble of methods that is harmonious from the viewpoint of the GOCs [28, 31, 33].

Moreover, the procedures of escape from stationary or local solutions, which are based on the GOCs, are unique and quite efficient even in the simplest implementation [18, 19, 26–28, 31–37].

Besides, the approach elaborated has been tested on a wide range of popular nonconvex problems (some of which are presented below). It has demonstrated an unexpected efficiency in the numerical solution of high-dimensional problems. Note that convex optimization methods are successfully used "inside" the proposed local and global search procedures [18, 19, 26–28, 31–37].

Finally, we have to add that, in the opinion of numerous recognized specialists in optimization, the most attractive and promising fields of investigation and, maybe, even modelling paradigms of twenty-first-century optimization can be represented (see [24]), in particular, by the following examples, both of which possess hidden nonconvex structures:

  • the search for equilibriums in competitions (conflict situations or games);

  • hierarchical optimization problems.

Unexpectedly for us, we turned out to be in this mainstream, but rather well prepared, i.e. possessing a suitable mathematical apparatus.

2 Examples of Applied Problems

2.1 Linear Complementarity Problems

As is well known [7], the linear complementarity problem (LCP) consists in finding a pair of vectors \((x,w) \in \mathbb{R}^{n+n}\) which satisfies the following conditions:

$$\displaystyle{ \left.\begin{array}{c} Mx + q = w,\quad \langle x,w\rangle = 0,\\ x \geq 0, \quad w \geq 0, \end{array} \right \} }$$
(1)

for a given vector \(q \in \mathbb{R}^{n}\) and a given real (n × n)-matrix M which is, in general, indefinite. Many physical and engineering problems (the braking problem, the contact problem, the viscoelastic-twisting problem, etc.), some economic problems (market equilibrium problems, the problem of optimal constant basic capital, etc.), and problems of computational geometry can often be stated as an LCP. At first glance, no nonconvexity is visible in (1). Even if we consider a similar formulation of the LCP, for example,

$$\displaystyle\begin{array}{rcl} \left.\begin{array}{c} \langle x,Mx + q\rangle = 0,\quad M = M^{\top }, \\ x \geq 0,\quad Mx + q \geq 0, \end{array} \right \}& &{}\end{array}$$
(1′)

a nonconvexity does not appear, since all the data remain linear. However, if one looks at the problem as an optimization problem:

$$\displaystyle{ \left.\begin{array}{c} \varPhi (x):=\langle x,Mx\rangle +\langle x,q\rangle \downarrow \min,\quad x \in S,\\ S\stackrel{\bigtriangleup }{= } \{\,x \in \mathbb{R}^{n } \mid x \geq 0,\;\; Mx + q \geq 0\,\}, \end{array} \right \} }$$
(2)

it becomes clear that the properties and the structure of the LCP (1) depend on the features of the matrix M, as follows.

  (a) If M is nonnegative definite, then problem (2) is convex, i.e. solvable by the classical methods, for instance the conjugate gradient method (CGM).

  (b) If M is negative definite, then problem (2) turns out to be a nonconvex (anticonvex) optimization problem of concave minimization (which is equivalent to convex maximization).

  (c) If M is indefinite, i.e. it has positive and negative eigenvalues, then problem (2) must be classified as a d.c. minimization problem:

$$\displaystyle{ \varPhi (x) = g(x) - h(x) \downarrow \min \limits _{x},\quad x \in S, }$$
(3)

where \(g(x) =\langle x,M_{1}x\rangle +\langle q,x\rangle\), \(M_{1} = M_{1}^{\top } > 0\), \(h(x) =\langle x,M_{2}x\rangle\), \(M_{2} = M_{2}^{\top } > 0\), \(M = M_{1} - M_{2}\), and g(⋅ ) and h(⋅ ) are strongly convex functions on \(\mathbb{R}^{n}\). In each of the cases (a), (b), and (c) one has to apply different methods of local and global search in order to find a global solution to Problem (2) or, equivalently, a solution to Problem (1).
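One standard way to obtain such a splitting M = M₁ − M₂ is to shift M by a multiple of the identity beyond its most negative eigenvalue. The following is a minimal sketch of this idea, not the authors' prescribed construction; the names `dc_split` and `tau` are illustrative:

```python
import numpy as np

def dc_split(M, tau=1.0):
    """Represent a symmetric indefinite M as M = M1 - M2 with
    M1, M2 symmetric positive definite, via an identity shift
    slightly beyond the most negative eigenvalue of M."""
    M = 0.5 * (M + M.T)                   # symmetrize defensively
    lam_min = np.linalg.eigvalsh(M).min()
    shift = max(0.0, -lam_min) + tau      # tau > 0 gives strict definiteness
    M2 = shift * np.eye(M.shape[0])       # M2 = shift * I > 0
    M1 = M + M2                           # M1 = M + shift * I > 0
    return M1, M2

M = np.array([[0.0, 2.0], [2.0, 0.0]])    # indefinite: eigenvalues +2 and -2
M1, M2 = dc_split(M)
```

With this choice, g(x) = ⟨x, M₁x⟩ + ⟨q, x⟩ and h(x) = ⟨x, M₂x⟩ are strongly convex, as required in (3); infinitely many other splittings exist, e.g. via a full eigendecomposition.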

The conclusion is unexpected: anyone who needs a solution to the LCP (1) has to choose one of three different paths (global search methods (GSMs)), depending on the properties of the matrix M. Below we will show how to do it.

So, Problem (1), very simple at first sight, may turn out to be very difficult to solve, since it possesses, in general, a hidden nonconvexity.

2.2 Search for an Equilibrium

As an example of equilibrium problems, let us consider bimatrix games [33], which reflect the conflict of two parties (players), each having a finite number of strategies. After introducing mixed strategies, we obtain

$$\displaystyle{ \left.\begin{array}{c} \langle x,Ay\rangle \uparrow \max \limits _{x},\quad x \in S_{m}, \\ \langle x,By\rangle \uparrow \max \limits _{y},\quad y \in S_{n}, \\ S_{p} ={\Bigl \{\, x \in \mathbb{R}_{+}^{p}\mid \sum \limits _{ i=1}^{p}x_{ i} = 1\,\Bigr \}},\;\;p = m,n. \end{array} \right \} }$$
(4)

Some economic, engineering, and ecological problems can be represented in the form of bimatrix games, for which the Nash equilibrium is the common solution concept and can be stated as follows: find an equilibrium situation \((x^{{\ast}},y^{{\ast}}) \in S_{m} \times S_{n}\):

$$\displaystyle{ \left.\begin{array}{c} \langle x^{{\ast}},Ay^{{\ast}}\rangle \geq \langle x,Ay^{{\ast}}\rangle \quad \forall x \in S_{m}, \\ \langle x^{{\ast}},By^{{\ast}}\rangle \geq \langle x^{{\ast}},By\rangle \quad \forall y \in S_{n}. \end{array} \right \} }$$
(5)

In formulae (4) and (5), at first sight, no nonconvexity is visible either, since all the data are linear, and problem (4)–(5) seems to be convex, i.e. solvable by the classical methods and approaches.

However, it turns out that the search for the Nash equilibrium can be reduced [20, 21] to solving the following nonconvex (in general) problem of mathematical programming:

$$\displaystyle{ \left.\begin{array}{c} F(x,y,\alpha,\beta ):=\langle x,(A + B)y\rangle -\alpha -\beta \uparrow \max, \\ x^{\top }B -\beta e_{n} \leq 0_{n},\quad x \in S_{m}, \\ Ay -\alpha e_{m} \leq 0_{m},\quad y \in S_{n}, \end{array} \right \} }$$
(6)

where \(\alpha,\beta \in \mathbb{R}\), \(e_{p} = (1,1,\ldots,1)^{\top }\in \mathbb{R}^{p}\), p = m, n. Note that the numbers \(\alpha_{\ast}\) and \(\beta_{\ast}\) in a global solution \((x^{{\ast}},y^{{\ast}},\alpha _{{\ast}},\beta _{{\ast}})\) to Problem (6) are the optimal profits of the first and second players, respectively, in the game (4)–(5), while the pair \((x^{\ast},y^{\ast})\) turns out to be exactly a Nash equilibrium point of the game (4)–(5).

On account of the formulation (6) it becomes clear that a way (method) of finding a Nash point strongly depends on the properties of the matrix (A + B).
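The reduction can be checked numerically on a small game. The sketch below assumes that at a Nash point the objective of (6) vanishes with α = ⟨x, Ay⟩, β = ⟨x, By⟩ (as in Mills-type reformulations); the battle-of-the-sexes matrices and their mixed equilibrium are an illustrative example, not data from the source:

```python
import numpy as np

# Illustrative 2x2 game (battle of the sexes) with a known mixed equilibrium
A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
x = np.array([2/3, 1/3])   # player 1's mixed strategy
y = np.array([1/3, 2/3])   # player 2's mixed strategy

alpha = x @ A @ y          # optimal profit of player 1
beta = x @ B @ y           # optimal profit of player 2

# Feasibility of (6): Ay <= alpha * e_m  and  x^T B <= beta * e_n^T
feasible = np.all(A @ y <= alpha + 1e-12) and np.all(B.T @ x <= beta + 1e-12)

# Objective of (6) at the Nash point: it should vanish
F = x @ (A + B) @ y - alpha - beta
print(feasible, abs(F) < 1e-12)  # -> True True
```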

So, the conclusion here is obvious: the initial statement (4)–(5) of a bimatrix game is deceptive in the sense that it possesses, in general, a hidden (implicit) nonconvexity.

2.3 Hierarchical Optimization Problems

Hierarchical problems are encountered in practice because it is impossible to accumulate all the available information at the upper level when investigating structurally complex control systems (social, economic, ecological-economic, etc.); as a consequence, such problems possess a hidden nonconvexity generated precisely by the hierarchical structure.

For example, the financial systems of economically powerful countries are usually constructed as bilevel systems. Besides, the electric energy system of the USSR was organized as a four-level system.

As to the mathematical statement, bilevel programming problems are extremum problems which, side by side with standard constraints expressed in terms of equalities and inequalities, include constraints described with the aid of an optimization subproblem representing the lower level of the bilevel problem (the player called the follower, in contrast to the upper-level player called the leader).

To begin with, let us consider the linear bilevel problem

$$\displaystyle{(\mathcal{L}\mathcal{B}\mathcal{P}): \left \{\begin{array}{c} F(x,y):=\langle c,x\rangle +\langle d,y\rangle \downarrow \min \limits _{x,y}, \\ x \in X =\{\, x \in \mathbb{R}^{m}\mid Ax \leq b\,\}, \\ y \in Y _{{\ast}}(x):=\mathop{ \mathrm{Arg}}\min \limits _{y}\{\langle d_{1},y\rangle \mid y \in Y (x)\}, \\ Y (x) =\{\, y \in \mathbb{R}^{n}\mid A_{1}x + B_{1}y \leq b_{1}\,\}, \end{array} \right.}$$

where \(c \in \mathbb{R}^{m}\), \(d,d_{1} \in \mathbb{R}^{n}\), \(b \in \mathbb{R}^{p}\), \(b_{1} \in \mathbb{R}^{q}\), and A, A_1, B_1 are matrices of corresponding dimensions. Suppose that

\((\mathcal{H}_{1})\): the function F(x, y) is bounded below on the nonempty set Z,

$$\displaystyle{Z:=\{\, x \in \mathbb{R}^{m},\;\;y \in \mathbb{R}^{n}\mid Ax \leq b,\;\;A_{ 1}x + B_{1}y \leq b_{1}\,\};}$$

\((\mathcal{H}_{2})\): the function \(\langle d_{1},y\rangle\) is bounded below on the set Y (x) for all x ∈ X.

Even in this very simple case it is easy to construct an example showing the nonconvexity of the problem \((\mathcal{L}\mathcal{B}\mathcal{P})\).

Example 1.

([8]) Consider the problem

$$\displaystyle{(\mathcal{P}_{L}): \left \{\begin{array}{c} f(y) = -y \downarrow \min \limits _{y}, \\ x + y \leq 8,\quad x + 4y \geq 8, \\ x + 2y \leq 13. \end{array} \right.}$$

Regardless of the convexity of the set

$$\displaystyle{Z =\{\, (x,y) \in \mathbb{R}^{2}\mid 1 \leq x \leq 6,\;\;x + y \leq 8,\;\;x + 4y \geq 8,\;\;x + 2y \leq 13\,\},}$$

it is easy to see even geometrically that the set

$$\displaystyle{Z_{{\ast}} =\{\, (x,y) \in Z\mid y \in Y _{{\ast}}(x)\,\}}$$

is nonconvex, which accounts for the nonconvexity of the problem \((\mathcal{L}\mathcal{B}\mathcal{P})\). □ 
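The nonconvexity of Z ∗ in Example 1 can also be checked numerically. In the sketch below, the follower's reaction y∗(x) = min{8 − x, (13 − x)/2} is derived by hand from the constraints of \((\mathcal{P}_{L})\) (the follower minimizes −y, i.e. maximizes y, and the bound x + 4y ≥ 8 is inactive at that maximum for x ∈ [1, 6]); the name `y_star` is illustrative:

```python
def y_star(x):
    """Follower's reaction in Example 1: maximize y subject to
    x + y <= 8 and x + 2y <= 13 (x + 4y >= 8 is inactive here)."""
    return min(8.0 - x, (13.0 - x) / 2.0)

# Two points of Z* = {(x, y_star(x))} and their midpoint
p = (1.0, y_star(1.0))          # (1.0, 6.0)
q = (6.0, y_star(6.0))          # (6.0, 2.0)
mid_x, mid_y = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2

# The midpoint (3.5, 4.0) is NOT in Z*, since y_star(3.5) = 4.5:
print(mid_y, y_star(mid_x))     # -> 4.0 4.5
```

Since the midpoint of two points of Z ∗ lies outside Z ∗, the set is nonconvex, in agreement with the geometric argument of the example.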

So, Example 1 shows the importance of a preliminary theoretical study of hierarchical optimization problems and, as perhaps the simplest case of \((\mathcal{B}\mathcal{P})\), of \((\mathcal{L}\mathcal{B}\mathcal{P})\).

2.4 Problems of Financial and Medical Diagnostics

Such problems are well known as applied ones and, on the other hand, are often interpreted as problems of generalized separability. For example, if the two sets of points \(\mathcal{A}\) and \(\mathcal{B}\) are characterized by the matrices A = [a^1, …, a^M], B = [b^1, …, b^N], \(a^{i},b^{j} \in \mathbb{R}^{n}\), then the problem of polyhedral separability may be reduced to the problem of minimizing the nonconvex nondifferentiable error function (V = (v^p), \(\varGamma = (\gamma _{p}),\;\;\gamma _{p} \in \mathbb{R},\;\;v^{p} \in \mathbb{R}^{n},\;\;p = 1,\ldots,P\))

$$\displaystyle{ F(V,\varGamma ) = F_{1}(V,\varGamma ) + F_{2}(V,\varGamma ), }$$
(7)
$$\displaystyle{ \left.\begin{array}{c} F_{1}(V,\varGamma ) = \frac{1} {M}\sum \limits _{i=1}^{M}\max \{0;\max \limits _{1\leq p\leq P}(\langle a^{i},v^{p}\rangle -\gamma _{p} + 1)\}, \\ F_{2}(V,\varGamma ) = \frac{1} {N}\sum \limits _{j=1}^{N}\max \{0;\min \limits _{1\leq p\leq P}(-\langle b^{j},v^{p}\rangle +\gamma _{p} + 1)\}. \end{array} \right \} }$$
(8)

In this problem it is also not clear what kind of nonconvexity we are dealing with and how to overcome not only the nonsmoothness of the problem but also the nonconvexity generated, apparently, by F_2(⋅ ).
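The error function (7)–(8) itself is straightforward to evaluate. A vectorized sketch follows; the function name `polyhedral_error` and the tiny one-dimensional data are illustrative assumptions:

```python
import numpy as np

def polyhedral_error(A, B, V, gamma):
    """Error function (7)-(8): A is M x n, B is N x n, V is P x n
    (hyperplane normals v^p), gamma has length P (offsets gamma_p)."""
    mA = A @ V.T - gamma + 1.0                       # M x P margins for A-points
    F1 = np.mean(np.maximum(0.0, mA.max(axis=1)))    # (7): max over hyperplanes
    mB = -(B @ V.T) + gamma + 1.0                    # N x P margins for B-points
    F2 = np.mean(np.maximum(0.0, mB.min(axis=1)))    # (8): min over hyperplanes
    return F1 + F2

# 1-D example with one hyperplane (P = 1): A left of 0, B right of 0
A = np.array([[-2.0], [-3.0]])
B = np.array([[2.0], [3.0]])
err = polyhedral_error(A, B, V=np.array([[1.0]]), gamma=np.array([0.0]))
print(err)  # -> 0.0: the sets are separated with unit margin
```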

But, anyway, the question arises of how to attack optimization problems with hidden or explicit nonconvexities.

3 Optimization Problems with the Functions of A.D.Alexandrov

The scope of our presentation can be limited to the class \(DC(\mathbb{R}^{n})\) of functions f(⋅ ) which can be represented as the difference of two convex functions (d.c. functions). The class was first introduced in 1934 [1, 2] by the Russian mathematician A.D. Alexandrov, member of the Academy of Sciences of the USSR.

Nowadays, this class is viewed by specialists [9, 17, 25, 39] as rather wide for consideration. Furthermore, \(DC(\mathbb{R}^{n})\) possesses several remarkable properties.

  (a) The set \(DC(\mathbb{R}^{n})\) is generated by the well-studied convex cone of convex functions and forms a linear space [12, 13, 17, 28, 39].

  (b) \(DC(\mathbb{R}^{n})\) includes such well-known classes as twice differentiable functions and power and trigonometric polynomials [12, 13, 17, 39].

  (c) Any continuous function on a compact set \(\mathcal{K}\subset \mathbb{R}^{n}\) can be approximated to any desired accuracy (in the topology of uniform convergence) by a function from \(DC(\mathcal{K})\) [12]. Consequently, any optimization problem with continuous functions can be approximated to any desired accuracy by an extremum problem with functions of A.D. Alexandrov.

Note only that, if f(⋅ ) is a d.c. function, then there exists an infinite number of d.c. representations of the form (3), for example, in the form of a difference of strongly convex functions.

The closedness of the set \(DC(\mathbb{R}^{n})\) of functions of A.D. Alexandrov with respect to the majority of operations used in optimization is also essential from the optimization viewpoint. For example, a sum, a difference, the modulus, the maximum, the minimum, etc. of a family of d.c. functions also belong to the class \(DC(\mathbb{R}^{n})\).
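For instance, for the maximum of two d.c. functions a standard explicit d.c. representation (stated here for completeness) is the following: if \(f_i = g_i - h_i\), i = 1, 2, with \(g_i, h_i\) convex, then

```latex
\max\{f_{1},f_{2}\}
  = \max\{g_{1} + h_{2},\; g_{2} + h_{1}\} - (h_{1} + h_{2}),
```

where the first term is convex as a maximum of sums of convex functions, and the subtrahend \(h_{1} + h_{2}\) is convex; the identity is verified by adding and subtracting \(h_{1} + h_{2}\) inside the maximum.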

Besides, the number of problems with d.c. functions is so large that the majority of specialists with long experience in solving d.c. programming problems are sure [9, 12, 13, 17, 39] that all (or almost all) nonconvex optimization problems really turn out to be d.c. problems.

In this connection, the following statement of an optimization problem can be viewed as rather general:

$$\displaystyle{ \left.\begin{array}{ll} f_{0}(x) = g_{0}(x) - h_{0}(x) \downarrow \min \limits _{x},\;\;x \in S, \\ f_{i}(x) = g_{i}(x) - h_{i}(x) \leq 0,\;\;i = 1,\ldots,m; \\ f_{j}(x) = g_{j}(x) - h_{j}(x) = 0,\;\;j = 1,\ldots,N. \end{array} \right \} }$$
(9)

Here \(g_{i},\;g_{j},\;h_{i},\;h_{j}\) are convex functions and S is a convex set in \(\mathbb{R}^{n}\).

Apparently, almost all specialists in optimization would classify Problem (9) as very difficult and unsolvable by the existing approaches and methods, even for middle dimensions (say, n = 100, …, 1,000).

Actually, even the very simple (from the viewpoint of Problem (9)) problem of maximizing a convex quadratic over a box, which is a very particular case of (9):

$$\displaystyle{ \left.\begin{array}{ll} h(x) = \frac{1} {2}\langle x,Qx\rangle \uparrow \max \limits _{x},\;\;Q = Q^{T} > 0, \\ x \in S:=\varPi =\{ x \in \mathbb{R}^{n}\;\vert \;\alpha _{i} \leq x_{i} \leq \beta _{i},\;\;i = 1,\ldots,n\} \end{array} \right \} }$$
(10)

is proved to be NP-hard [9]. Therefore, to begin with, let us simplify the situation and start with nonconvex optimization problems that are rather simple at first glance.

  1. D.C. minimization

    $$\displaystyle{ (\mathcal{P}): f(x) = g(x) - h(x) \downarrow \min,\;\;x \in D, }$$
    (11)

    where g(⋅ ), h(⋅ ) are convex functions, and D is a convex set, \(D \subset \mathbb{R}^{n}\).

  2. D.C. constraint problem

    $$\displaystyle{ (\mathcal{D}\mathcal{C}\mathcal{C}): \left.\begin{array}{ll} f_{0}(x) \downarrow \min \limits _{x},\;x \in S, \\ F(x) = g(x) - h(x) \leq 0,\end{array} \right \} }$$
    (12)

    where g(⋅ ) and h(⋅ ) are as above, \(S \subset \mathbb{R}^{n}\), and f_0(⋅ ) is a continuous function.

  3. Convex maximization

    $$\displaystyle{ h(x) \uparrow \max,\;\;x \in D, }$$
    (13)

    (when g ≡ 0 in (11)).

  4. Reverse-convex constraint problem

    $$\displaystyle{ \left.\begin{array}{ll} f_{0}(x) \downarrow \min,\;x \in S, \\ h(x) \geq 0,\end{array} \right \} }$$
    (14)

    (g ≡ 0 in (12)).

Note that any quadratic optimization problem with arbitrary matrices falls into the classification (11)–(14) or takes the form (9).

4 Global Search Methodology

In our approach the general global search procedure includes two principal parts:

  (a) local search;

  (b) procedures of escaping a critical point provided by an LSM.

Therefore we are going, first, to consider special (for each class of d.c. programming problems) LSMs.

4.1 Local Search

The ideas of most LSMs are rather simple and consist in the consecutive solution of (partially) linearized problems, which for Problems (11)–(14) turn out to be convex. As a consequence, it becomes possible to apply classical convex optimization methods (Newton-type methods, CGM, TRM, etc.) to find solutions of the linearized problems, i.e. within the framework of local search schemes.

So, unlike the well-known methods of so-called Global Optimization (such as B&B and cutting methods), which, so to speak, "deny and ignore" the modern and classical optimization methods, we insist on the obligatory, but "indirect", application of these methods.

For example, as regards the d.c. minimization problem \((\mathcal{P})\)–(11), the basic element, the "cornerstone" of the global and local search methods, is solving the following convex problem, linearized at a current iteration point x^s ∈ D:

$$\displaystyle{ (\mathcal{P}\mathcal{L}_{s}):\;\;\;\;\;\;\;\;\varPhi _{s}(x):= g(x) -\langle h'(x^{s}),x\rangle \downarrow \min \limits _{ x},\;\;\;x \in D, }$$
(15)

where \(h'(x^{s}) = h'_{s} \in \partial h(x^{s}),\;\;s = 1,2,\ldots\), is a subgradient of the convex function h(⋅ ) at the point x^s [13]. It is clear that in the differentiable case \(h'_{s}\) coincides with the usual gradient \(\nabla h(x^{s})\) [13].

Furthermore, the LSM itself for \((\mathcal{P})\)–(11) may consist in consecutively solving (as in the method of "direct iterations") Problems \((\mathcal{P}\mathcal{L}_{s})\)–(15). More precisely, given x^s ∈ D, we find x^{s+1} ∈ D as an approximate solution to \((\mathcal{P}\mathcal{L}_{s})\) by means of some suitable convex optimization method (for example, BFGS) or one of the applied software packages (Xpress-MP, IBM CPLEX, etc.).

So, we produce the sequence {x s} according to the inequality:

$$\displaystyle{ \varPhi _{s}(x^{s+1}):= g(x^{s+1}) -\langle h'(x^{s}),x^{s+1}\rangle \ \leq \ \inf \limits _{ x}\{g(x) -\langle h'(x^{s}),x\rangle \;\vert \;\;x \in D\} +\delta _{ s} }$$
(16)

where the sequence {δ s } fulfils the condition

$$\displaystyle{\sum \limits _{s=0}^{\infty }\delta _{ s} < +\infty,\;\;\delta _{s} > 0,\;\;s = 1,2,\ldots.}$$
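As a minimal illustration of the scheme (15)–(16), assume g(x) = ½‖x‖² and D a box, so that each linearized problem is solved exactly in closed form by projecting h′(x^s) onto the box (hence δ_s = 0). The names `dca_box` and `h_grad`, and the concave test instance, are illustrative assumptions, not the authors' code:

```python
import numpy as np

def dca_box(h_grad, lo, hi, x0, tol=1e-10, max_iter=100):
    """Local search (16) for f = g - h with g(x) = 0.5*||x||^2 on a box.
    The linearized problem (15), min 0.5*||x||^2 - <h'(x^s), x> on the box,
    is separable, and its exact solution is clip(h'(x^s)) coordinate-wise."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = np.clip(h_grad(x), lo, hi)   # exact linearized solve, delta_s = 0
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

# Instance with h(x) = ||x||^2, so f(x) = -0.5*||x||^2 (concave) on [-1, 1]^2;
# every vertex of the box is a global minimizer with f = -1.
x_star = dca_box(lambda x: 2.0 * x, -1.0, 1.0, x0=[0.5, -0.3])
print(x_star)  # -> [ 1. -1.]
```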

It was rather surprising that the process in this case converges in the following sense.

Theorem 1.

Suppose the cost function of Problem \((\mathcal{P})\) (11) is bounded below, so that

$$\displaystyle{\mathcal{V}(\mathcal{P}):=\inf (f,D)\stackrel{\bigtriangleup }{=}\inf \limits _{x}\{f(x)\;\vert \;\;x \in D\} > -\infty.}$$

Then the sequence \(\{x^{s}\} \subset D\) generated by the rule (16) satisfies the following conditions.

(a) The number sequence \(\{f_{s}\},\;\;f_{s} = f(x^{s})\), converges in the following sense:

$$\displaystyle{ \lim \limits _{s\rightarrow \infty }f_{s} = f_{{\ast}}\geq \mathcal{V}(\mathcal{P}). }$$
(17)

(b)

$$\displaystyle{ \lim \limits _{s\rightarrow \infty }[\inf \limits _{x}\{g(x) - g(x^{s+1}) +\langle h'(x^{s}),x^{s+1} - x\rangle \;\vert \;\;x \in D\}] = 0. }$$
(18)

or, what is the same (see \((\mathcal{P}\mathcal{L}_{s})\)–(15)),

$$\displaystyle\begin{array}{rcl} \lim \limits _{s\rightarrow \infty }[\mathcal{V}(\mathcal{P}\mathcal{L}_{s}) -\varPhi _{s}(x^{s+1})] = 0,& &{} \end{array}$$
(18′)

where

$$\displaystyle{ \mathcal{V}_{s}:= \mathcal{V}(\mathcal{P}\mathcal{L}_{s}):=\inf \limits _{x}\{g(x) -\langle h'(x^{s}),x\rangle \;\vert \;\;x \in D\} }$$
(19)

is the optimal value of the linearized problem \((\mathcal{P}\mathcal{L}_{s})\)–(15).

(c) If the function h(⋅) in (15) is strongly convex, then we have

$$\displaystyle{ \lim \limits _{s\rightarrow \infty }\|x^{s} - x^{s+1}\| = 0. }$$
(20)

(d) Any limit point \(x_{{\ast}}\) of the sequence {x^s} generated by LSM (16) is a solution of the following linearized problem

$$\displaystyle{ (\mathcal{P}\mathcal{L}_{{\ast}}):\;\;\;\;\;\;\;\;\varPhi _{{\ast}}(x):= g(x) -\langle y_{{\ast}},x\rangle \downarrow \min \limits _{x},\;\;\;x \in D, }$$
(21)

where \(y_{{\ast}} = h'(x_{{\ast}}) \in \partial h(x_{{\ast}}).\)

Note that very frequently in cases of small dimension (n ≤ 7, 8, 10) LSM (16) provides a global solution to \((\mathcal{P})\)–(11).

It is interesting that historically the particular case (g ≡ 0) of LSM (16) for the differentiable convex maximization problem (13) was proposed by Bulatov in 1969 [4] and can be represented in modern form as follows:

$$\displaystyle{ \langle h'(x^{s}),x^{s+1}\rangle +\delta _{ s} \geq \sup \limits _{x}\{\langle h'(x^{s}),x\rangle \;\vert \;x \in D\}. }$$
(22)

Besides, the well-known “power” method for finding the maximal eigenvalue of a symmetric positive definite matrix A, or what is the same, for solving the problem [38]

$$\displaystyle{ \langle x,Ax\rangle \uparrow \max \limits _{x},\;\;\;\|x\| \leq 1, }$$
(23)

turns out to be a very particular case of LSM (22), in which the linearized problem \((\mathcal{P}\mathcal{L}_{s})\) (with g ≡ 0) can be solved analytically. Note that in Problem (23) the method (22) converges to the global solution [38].
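Indeed, with h(x) = ⟨x, Ax⟩ one has h′(x) = 2Ax, and the linearized problem max{⟨h′(x^s), x⟩ : ‖x‖ ≤ 1} is solved analytically by x^{s+1} = Ax^s/‖Ax^s‖, i.e. by one power-method step. A small sketch; the matrix and the iteration count are illustrative:

```python
import numpy as np

def power_method(A, x0, iters=200):
    """LSM (22) for problem (23): each step solves the linearized problem
    max{<2*A*x^s, x> : ||x|| <= 1} analytically, x^{s+1} = A x^s / ||A x^s||."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = A @ x
        x = y / np.linalg.norm(y)
    return x, x @ A @ x          # eigenvector estimate and Rayleigh quotient

A = np.array([[4.0, 1.0], [1.0, 3.0]])        # symmetric positive definite
x, lam = power_method(A, x0=[1.0, 0.0])
print(round(lam, 6))  # -> 4.618034, the largest eigenvalue (7 + sqrt(5))/2
```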

Thus, one can conclude that the idea of linearization with respect to the basic nonconvexity of a nonconvex problem certainly has some history. Anyway, it is worth mentioning the works of the group of Pham Dinh Tao, in which the idea of linearization with respect to the basic nonconvexity has also demonstrated its effectiveness [14–16].

Furthermore, special local search methods have been developed for the problems with d.c. constraints (12), (14) (see [31]). These methods are also based, for example, on linearized problems of the form

$$\displaystyle{ \left.\begin{array}{c} g(x) -\langle h'(x^{s}),x\rangle \downarrow \min \limits _{ x}, \\ x \in S,\;\;\;\;f_{0}(x) \leq \zeta _{s} = f_{0}(x^{s}), \end{array} \right \} }$$
(24)

and on the duality of Tuy [39].

4.2 Global Optimality Conditions

The second step in the global search methodology can be viewed as the most important and even crucial one, because the question is how to escape a critical point (provided by an LSM) that is not a global solution.

Such a procedure is substantiated by the theoretical basis produced with the help of the so-called GOCs, which for the d.c. minimization problem (\(\mathcal{P}\))–(11) take the following form.

Theorem 2.

If z is a global solution to ( \(\mathcal{P}\) ), \(z \in Sol(\mathcal{P}),\;\;\zeta:= f(z),\) then

$$\displaystyle{ (\mathcal{E}):\;\;\;\;\;\;\;\; \left \{\begin{array}{c} \forall (y,\beta ) \in \mathbb{R}^{n} \times \mathbb{R}:\;\;\; h(y) =\beta -\zeta, \\ g(x)-\beta \geq \langle h'(y),x - y\rangle \;\;\forall x \in D. \end{array} \right. }$$
(25)

Proof.

Suppose that for some pair (y, β) satisfying (25) and some feasible point \(\hat{x} \in D\) the inequality in (25) is violated:

$$\displaystyle{g(\hat{x}) <\beta +\langle h'(y),\hat{x} - y\rangle.}$$

Then due to convexity of h(⋅ ) we have

$$\displaystyle{f(\hat{x})\stackrel{\bigtriangleup }{=}g(\hat{x}) - h(\hat{x}) < h(y) +\zeta -h(y) = f(z),}$$

or \(f(\hat{x}) < f(z)\). Thus, \(\hat{x}\) is "better" than z, which contradicts \(z \in Sol(\mathcal{P})\). □ 

So, selecting the "perturbation parameters" (y, β) satisfying (25) and solving the linearized problem (cf. (15))

$$\displaystyle{ \varPhi _{y}(x):= g(x) -\langle h'(y),x\rangle \downarrow \min \limits _{x},\;\;\;x \in D, }$$
(26)

(where \(y \in \mathbb{R}^{n}\) is not necessarily feasible!), we obtain a family of starting points x(y, β) for a further local search.

Moreover, at each level ζ_k = f(z^k) it is not necessary to investigate all the pairs (y, β) satisfying (25) with \(\zeta _{k} =\beta -h(y)\); it is sufficient to discover a violation of the variational inequality in (25) for just one pair \((\hat{y},\hat{\beta })\).

After that, one proceeds to the next iteration of the global search: \(z^{k+1}:=\hat{ x}\), \(\zeta _{k+1}:= f(z^{k+1})\), and starts the procedure from the very beginning. So, the idea of the GSM becomes considerably clearer.
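The escape procedure just described can be sketched on a one-dimensional concave-minimization instance, f(x) = −x² on D = [−1, 2] (so g ≡ 0, h(x) = x²). The grid over y, the helper names, and the instance itself are illustrative assumptions; a practical GSM builds the resolving set of y's far more carefully:

```python
import numpy as np

D = (-1.0, 2.0)                       # feasible segment
def f(x): return -x**2                # f = g - h with g ≡ 0, h(x) = x^2
def h(x): return x**2
def h_prime(x): return 2.0 * x

def lin_argmax(c):
    """Maximize the linear function c*x on D (linearized problem, g ≡ 0)."""
    return D[1] if c > 0 else D[0]

def local_search(x):
    """LSM (16) with g ≡ 0: repeat the linearized solve until it stabilizes."""
    for _ in range(50):
        x = lin_argmax(h_prime(x))
    return x

def escape(z, y_grid):
    """GOC-based escape: for each y set beta = h(y) + f(z), so h(y) = beta - zeta,
    and test the inequality of (25), g(x) - beta >= <h'(y), x - y>, at the
    solution x of the linearized problem (26); a violation yields f(x) < f(z)."""
    zeta = f(z)
    for y in y_grid:
        beta = h(y) + zeta
        x = lin_argmax(h_prime(y))
        if 0.0 - beta < h_prime(y) * (x - y):   # (25) is violated
            return x
    return z                                     # z passed the test on this grid

z = local_search(-0.5)                # stops in the local pit z = -1, f(z) = -1
z = escape(z, np.linspace(-2.0, 2.0, 41))
print(z, f(z))  # -> 2.0 -4.0, the global solution
```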

For the d.c. constraint problem (12) the character of the GOCs is a little different. In particular, the necessary conditions are rather far from the sufficient ones. More precisely, we have the following result:

Theorem 3.

Assume that in Problem (DCC)– (12) the following condition holds:

$$\displaystyle{ (\mathcal{G}):\;\;\;\;\;\;\;\; \left.\begin{array}{l} \mathit{ there\ does\ not\ exist\ a\ solution}\ x_{{\ast}}\in S \\ \mathit{ to\ Problem\ (12)\ such\ that}\ F(x_{{\ast}}) < 0. \end{array} \right \}\;\;\;\;\;\; }$$
(27)

If a point z ∈ S is a global solution to Problem (12) such that F(z) = 0, then

$$\displaystyle{ (\mathcal{E}_{1}):\;\;\;\;\;\;\;\; \left \{\begin{array}{c} \forall (y,\beta ) \in \mathbb{R}^{n} \times \mathbb{R}:\;\;\;\beta = h(y),\;\;\forall h'(y) \in \partial h(y) \\ g(x)-\beta \geq \langle h'(y),x - y\rangle \;\;\forall x \in S,\;\;f_{0}(x) \leq f_{0}(z). \end{array} \right. }$$
(28)

Proof.

Suppose we find some parameters \((y_{0},\beta _{0}),\;\;h'(y_{0}) \in \partial h(y_{0})\) and a point x 0 ∈ S such that

$$\displaystyle{\beta _{0} = h(y_{0}),\quad f_{0}(x_{0}) \leq f_{0}(z),\mbox{ and }g(x_{0}) -\beta _{0} <\langle h'(y_{0}),x_{0} - y_{0}\rangle.}$$

Then due to convexity of h(⋅ ) we obtain

$$\displaystyle{0 <\beta _{0} - g(x_{0}) + h(x_{0}) - h(y_{0}) = -F(x_{0}).}$$

Hence, we have a feasible point x_0 ∈ S with F(x_0) < 0 = F(z) and the property \(f_{0}(x_{0}) \leq f_{0}(z)\). This means that x_0 is a solution to Problem (12), as is the point z. The latter contradicts condition \((\mathcal{G})\)–(27). □ 

A procedure of escaping a local pit can be conducted in a manner similar to that explained after Theorem 2.

In the next subsection such a procedure will be made precise for the case of the d.c. minimization problem (11).

4.3 Global Search Methods

In order to deal with nonconvex optimization problems, and on the basis of rather extensive computational experience [11, 18, 19, 26–28, 31–37], we propose three principles on which a search for a global solution to d.c. optimization problems of the forms considered above can be based.

  1. Linearization with respect to the basic nonconvexities of the problem under scrutiny and, consequently, the reduction of the original problem to a family of (partially) linearized problems.

  2. Application of contemporary convex optimization methods for solving the linearized problems and, as a consequence, "within" special LSMs.

  3. Construction of "good" approximations (resolving sets) of the level surfaces/epigraph boundaries of convex functions.

Moreover, while working for rather a long time (about 30 years) in the field of nonconvex optimization, we have elaborated several practical rules, which can be stated as follows:

  1. Never apply convex optimization methods directly.

  2. Exact classification of the problem under scrutiny.

  3. Application of an LSM special to the class of problems to which your problem belongs, or of methods specific to your problem.

  4. Application of a GSM specialized for the class which includes your problem.

  5. Construction of suitable approximations of level surfaces (and the boundaries of the epigraphs) of convex functions with the aid of the experience obtained during the solution of similar problems.

  6. Application of convex optimization methods for solving linearized problems and within the framework of special LSMs.

These rules may also be explained by examples, such as the following.

  1. Never apply CGM or BFGS if you are not convinced that your problem is convex.

  2. Try to separate the data of your problem into two parts: convex and anticonvex.

For example, dealing with a quadratic function of the kind

$$\displaystyle{q(x) = \frac{1} {2}\langle x,Qx\rangle,}$$

where the matrix Q is indefinite, you have to split the (n × n) matrix Q into a difference \(Q = Q_{1} - Q_{2}\) of two symmetric positive definite matrices \(Q_{i} = Q_{i}^{T} > 0\), i = 1, 2. Note that there exists an infinity of such representations, and several methods and ways to obtain them [28, 38].

Further, it is very important where your quadratic function q(x) is situated, in the objective function or among the constraints, because depending on the situation you have different types of problem to solve: d.c. minimization (11) or the d.c. constraint problem (12), respectively. As a consequence, you have to follow a different strategy (GSM, see below).

To demonstrate the effectiveness of these practical rules, let us consider the following example.

Example 2 (Incorrect Classification).

Consider the problem

$$\displaystyle{ \left.\begin{array}{c} \varphi (x) =\sum \limits _{ i=1}^{n}\ln (1 + x_{i}) \downarrow \min \limits _{x}, \\ x \in \varPi = \left \{x \in \mathbb{R}^{n}\;\vert \;-\frac{1} {2} \leq x_{i} \leq 3\right \} \subset \mathbb{R}^{n}. \end{array} \right \} }$$
(29)

Obviously, the point \(z = \left (-\frac{1} {2},\ldots,-\frac{1} {2}\right )^{T}\) is the solution to the problem. Suppose the current iterate is \(x^{k} = (0,\ldots,0)^{T}\). Then

$$\displaystyle{\nabla \varphi (x) = \left ( \frac{1} {1 + x_{1}},\ldots, \frac{1} {1 + x_{n}}\right )^{T},\;\;\nabla \varphi (x^{k}) = (1,\ldots,1)^{T},}$$
$$\displaystyle{\nabla ^{2}\varphi (x) = \left [\begin{array}{ccc} - \frac{1} {(1+x_{1})^{2}} & \ldots & 0\\ \ldots &\ldots & \ldots \\ 0 &\ldots & - \frac{1} {(1+x_{n})^{2}} \end{array} \right ],\;\;\nabla ^{2}\varphi (x^{k}) = \left [\begin{array}{ccc} - 1&\ldots & 0\\ \ldots &\ldots & \ldots \\ 0 &\ldots & - 1 \end{array} \right ].}$$

The auxiliary problem of Newton’s method

$$\displaystyle{\varPhi (d) =\sum \limits _{ i=1}^{n}d_{ i} -\frac{1} {2}\sum \limits _{i=1}^{n}d_{ i}^{2} \downarrow \min,\;\;d \in \varPi = (\varPi -x^{k})}$$

obviously has the solution \(d = (3,\ldots,3)^{T}\) (take, e.g., the case n = 2), which is a direction toward the worst feasible point \(x = (3,\ldots,3)^{T}\). As a consequence, the iteration of the line-search method \(x^{k+1} = x^{k} + td^{k}\) cannot escape from \(x^{k} = (0,\ldots,0)^{T}\), and the process stops at \(x^{k}\). Besides, note that the auxiliary problem preserves the nonconvex character of the original problem (29).

In contrast to the incorrect classification above, let us look at the goal function of Problem (29) as a concave function and, correspondingly, at Problem (29) as a concave minimization problem. Then we immediately conclude that we are dealing just with Problem (13) (\(h(\cdot ) = -\varphi (\cdot )\)). Hence, we have to apply, first, the special LSM (16) with g(x) ≡ 0. So, beginning at an arbitrary feasible point \(x^{0} \in \varPi\), we have to solve the linearized problem

$$\displaystyle{(\mathcal{P}\mathcal{L}_{0}):\;\;\;\;\langle \nabla \varphi (x^{0}),x\rangle = -\langle \nabla h(x^{0}),x\rangle \downarrow \min \limits _{ x},\;\;x \in \varPi \subset \mathbb{R}^{n},}$$

i.e.

$$\displaystyle{\sum \limits _{i=1}^{n} \frac{1} {1 + x_{i}^{0}} \cdot x_{i} \downarrow \min \limits _{x},\;\; -\frac{1} {2} \leq x_{i} \leq 3,\;\;i = 1,\ldots,n}$$

that provides for the global solution to the original problem (29)

$$\displaystyle{x^{1} = z\stackrel{\bigtriangleup }{=}\left (-\frac{1} {2},\ldots,-\frac{1} {2}\right )^{T} \in Sol(\mathcal{P}).}$$

So, the special LSM (in one step!) has found the global solution to Problem (29). □ 
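This one-step computation can be replayed numerically; the sketch below hard-codes the box bounds from (29) and exploits the fact that a linear function attains its minimum over a box componentwise (the dimension n = 5 is an arbitrary choice):

```python
import numpy as np

n = 5                          # any dimension works; (29) is separable
x0 = np.zeros(n)               # the feasible starting point x^0
grad = 1.0 / (1.0 + x0)        # gradient of phi at x^0; all components positive
# The linearized objective is linear with positive coefficients, so over
# the box [-1/2, 3]^n each component goes to its lower bound:
x1 = np.where(grad > 0, -0.5, 3.0)
print(x1)                      # the global solution z = (-1/2, ..., -1/2)
```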

Let us return now to the construction of a GSM (strategy) based on GOC presented in Theorem 2 and specialized only for Problem \((\mathcal{P})\)–(11).

The basic stages of such a GSM (strategy) can be described as follows:

  I.

    Find a critical point z by means of the special LSM ((16), for example).

  II.

    Choose a number \(\beta \in [\beta _{-},\beta _{+}]\), where \(\beta _{-} =\inf (g,D),\;\;\beta _{+} =\sup (g,D)\) can be approximated by rather rough estimates.

  III.

    Construct an approximation

    $$\displaystyle{\mathcal{A}(\beta ) =\{ y^{1},\ldots,y^{N}\mid h(y^{i}) =\beta -\zeta,\;\;i = 1,\ldots,N = N(\beta )\}}$$

    of the level surface of the function h(⋅ ).

  IV.

    Beginning at every point \(y^{i}\) of the approximation \(\mathcal{A}(\beta )\), find a feasible point \(u^{i}\) by means of the special local search algorithm (16).

  V.

    Verify the VI (25) from GOC

    $$\displaystyle{ g(u^{i})-\beta \geq \langle h'(w^{i}),u^{i} - w^{i}\rangle \quad \forall i = 1,\ldots,N, }$$
    (30)

    where \(w^{i}\) may be found as the projection of the point \(u^{i}\) onto the convex set

    $$\displaystyle{\mathcal{L}(h,\beta -\zeta ) =\{\, x \in \mathbb{R}^{n}\mid h(x) \leq \beta -\zeta \,\}.}$$
  VI.

    If there exists \(j \in \{ 1,\ldots,N\}\) such that (30) is violated, then set \(x^{k+1}:= u^{j}\) and return to Stage I. Otherwise, change β and return to Stage III.
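As an illustration of Stage V: for the model choice \(h(x) = \frac{1}{2}\langle x,x\rangle\) (an assumption of this sketch, not the general case), the projection onto \(\mathcal{L}(h,\beta -\zeta )\) is just a radial scaling:

```python
import numpy as np

def project_to_level_ball(u, c):
    """Project u onto {x : h(x) <= c} for h(x) = 0.5*<x, x>.
    A point inside the set stays put; a point outside is scaled
    radially onto the boundary h(x) = c."""
    r = np.sqrt(2.0 * c)                  # radius of the sublevel set
    norm = np.linalg.norm(u)
    return u if norm <= r else (r / norm) * u

u = np.array([3.0, 4.0])                  # ||u|| = 5, outside the set for c = 2
w = project_to_level_ball(u, 2.0)
print(w)                                  # u scaled down to norm 2: [1.2 1.6]
```

For a general convex h the projection requires a (convex) auxiliary problem, but the quadratic case already conveys the idea.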

Example 3.

Consider the problem

$$\displaystyle{ f(x) \downarrow \min,\quad x \in \mathbb{R}, }$$
(31)
$$\displaystyle{ f(x) = \left \{\begin{array}{l} \frac{1} {4}x^{4} -\frac{1} {2}x^{2},\quad x \geq 0, \\ \frac{1} {2}x^{4} - x^{2},\quad x < 0. \end{array} \right. }$$
(32)

A d.c. representation is obvious here, for example,

$$\displaystyle{f(x) = g(x) - h(x),}$$

where

$$\displaystyle{ g(x) = \left \{\begin{array}{l} \frac{1} {4}x^{4},\quad x \geq 0, \\ \frac{1} {2}x^{4},\quad x < 0, \end{array} \right.\qquad h(x) = \left \{\begin{array}{l} \frac{1} {2}x^{2},\quad x \geq 0, \\ x^{2},\quad x < 0. \end{array} \right. }$$
(33)

Let us choose the starting point \(x_{0} = 100\); the global solution, as can be readily seen, is \(z = -1\).

(A) Local search

We have s = 0, \(x_{0} = 100\), \(\nabla h(x_{0}) = x_{0} = 100\), since

$$\displaystyle{ \nabla h(x) = \left \{\begin{array}{c} \nabla h_{1}(x) = x,\quad x \geq 0, \\ \nabla h_{2}(x) = 2x,\quad x < 0. \end{array} \right. }$$
(34)

Then the linearized (convex) problem \((\mathcal{P}\mathcal{L}_{0})\)–(15) takes the form

$$\displaystyle{(\mathcal{P}\mathcal{L}_{0}): \varPhi _{0}(x) = g(x) -\langle \nabla h(x_{0}),x\rangle = \frac{1} {4}x^{4} - 100x \downarrow \min \limits _{ x},\quad x \in \mathbb{R}.}$$

To simplify the situation, in order to solve \((\mathcal{P}\mathcal{L}_{s})\), let us apply the Fermat rule instead of a numerical method. This yields

$$\displaystyle{\nabla \varPhi _{0}(x) = x^{3} - 100 = 0.}$$

Taking into account that \(4^{3} = 64\) and \(5^{3} = 125\), let us venture to set \(x_{1} \approx 4.5\). Further, we have to solve the next linearized problem (s = 1)

$$\displaystyle{(\mathcal{P}\mathcal{L}_{1}): \varPhi _{1}(x) = \frac{1} {4}x^{4} - 4.5x \downarrow \min \limits _{ x},\quad x \in \mathbb{R},}$$

and, as a consequence, the equation

$$\displaystyle{\nabla \varPhi _{1}(x) = x^{3} - 4.5 = 0,}$$

whence it follows that \(x_{2} \approx 1.7\).

For s = 2 consider the next linearized problem

$$\displaystyle{(\mathcal{P}\mathcal{L}_{2}): \varPhi _{2}(x) = \frac{1} {4}x^{4} - 1.7x \downarrow \min \limits _{ x},\quad x \in \mathbb{R},}$$

and the corresponding equation

$$\displaystyle{x^{3} - 1.7 = 0,}$$

that provides the solution \(x_{3} \approx 1.2\). Hence, it is clear that \(\{x_{s}\}\) tends to \(z^{0} = 1 \in \mathop{\mathrm{Arglocmin}}\) (31).
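These iterations are easy to reproduce: for \(x \geq 0\) the Fermat rule for \((\mathcal{P}\mathcal{L}_{s})\) reads \(x^{3} = \nabla h(x_{s}) = x_{s}\), so each step is simply a cube root:

```python
# Replaying the local search of Example 3. Since the iterates stay
# positive, the branch h'(x) = x of (34) applies throughout, and the
# exact minimizer of each linearized problem is x_{s+1} = x_s**(1/3).
x = 100.0                  # the starting point x_0
for s in range(30):
    x = x ** (1.0 / 3.0)   # exact solution of (PL_s) by the Fermat rule
print(round(x, 6))         # -> 1.0, i.e. the sequence tends to z^0 = 1
```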

(B) Global search.

Step 1.:

Thus, beginning at \(x_{0} = 100\), the LSM provided the point \(z^{0} = 1\), \(\zeta _{0}:= f(z^{0}) = -\frac{1} {4}\).

Step 2.:

To begin with, let us choose \(\beta _{0} = g(z^{0}) = g(1) = \frac{1} {4}\).

Step 3.:

Now we need to construct an approximation

$$\displaystyle\begin{array}{rcl} & & \mathcal{A}_{0} = \left \{\,y^{1},y^{2},\ldots,y^{N}\mid h(y^{i}) =\beta _{ 0} -\zeta _{0} = \frac{1} {4} -\left (-\frac{1} {4}\right ) = \frac{1} {2}\,\right \}. {}\\ & & \qquad \qquad \qquad i = 1,\quad h_{1}(y)\stackrel{\bigtriangleup }{=}\frac{1} {2}y^{2} = \frac{1} {2},\quad y > 0,\quad y_{1} = 1, {}\\ & & \qquad \qquad \quad i = 2,\quad h_{2}(y)\stackrel{\bigtriangleup }{=}y^{2} = \frac{1} {2},\quad y < 0,\quad y_{2} = -\frac{\sqrt{2}} {2}. {}\\ \end{array}$$

So, we obtain \(\mathcal{A}_{0} =\{\, y_{1} = 1,\;\;y_{2} = -\frac{\sqrt{2}} {2} \,\}\).

Step 4.:

Further, we have to solve the linearized problems (i = 1, 2)

$$\displaystyle{(\mathcal{P}\mathcal{L}_{i}): \varPhi _{i}(x) = g(x) -\langle \nabla h(y_{i}),x\rangle \downarrow \min \limits _{x},\quad x \in \mathbb{R}.}$$
  a)

    i = 1,

    $$\displaystyle{\varPhi _{1}(x) = \frac{1} {4}x^{4} -\langle \nabla h_{ 1}(y_{1}),x\rangle = \frac{1} {4}x^{4} - x \downarrow \min \limits _{ x}.}$$

    The Fermat rule provides for u 1 = 1.

    i = 2,

    $$\displaystyle{\varPhi _{2}(x) = \frac{1} {4}x^{4} -\langle \nabla h_{ 2}(y_{2}),x\rangle = \frac{1} {4}x^{4} -\langle 2y_{ 2},x\rangle = \frac{1} {4}x^{4} + \sqrt{2}x \downarrow \min \limits _{ x},\quad x \in \mathbb{R},}$$

    whence it follows \(\nabla \varPhi _{2}(x) = x^{3} + \sqrt{2} = 0\) that yields \(u_{2} = -(2)^{1/6}\).

  b)

    i = 1,

    $$\displaystyle{\varPhi _{3}(x) = \frac{1} {2}x^{4} -\langle \nabla h_{ 1}(y_{1}),x\rangle = \frac{1} {2}x^{4} - x \downarrow \min \limits _{ x}.}$$

    As above, one has \(u_{3} = \sqrt[3]{0.5}\).

Here, it is necessary to note that the points \(u_{2}\) and \(u_{3}\) are unacceptable, because the initial data and the final results are incompatible, i.e.

  a)

    i = 2: \(g(x) = g_{1}(x) = \frac{1} {4}x^{4}\) applies when x ≥ 0, while the solution \(u_{2} = -(2)^{1/6}\) is negative;

  b)

    i = 1: \(g(x) = g_{2}(x) = \frac{1} {2}x^{4}\) applies when x < 0, while \(u_{3} = \sqrt[3]{0.5} > 0\).

    Further we consider the last case.

  c)

    i = 2.

    $$\displaystyle{\varPhi _{4}(x) = g_{2}(x) -\langle \nabla h_{2}(y_{2}),x\rangle = \frac{1} {2}x^{4} -\langle 2y_{ 2},x\rangle = \frac{1} {2}x^{4} + \sqrt{2}x \downarrow \min \limits _{ x}.}$$

    It is easy to see that the Fermat rule \(\nabla \varPhi _{4}(x) = 2x^{3} + \sqrt{2} = 0\) yields \(u_{4} = -\left (\frac{\sqrt{2}} {2} \right )^{1/3}\).

Now, we have to verify VI (30).

Step 5. a) i = 1.

$$\displaystyle{g(u_{1}) -\beta _{0} -\langle \nabla h(y_{1}),u_{1} - y_{1}\rangle = \frac{1} {4}u_{1}^{4} -\frac{1} {4} -\langle y_{1},u_{1} - y_{1}\rangle = \frac{1} {4} -\frac{1} {4} - 0 = 0.}$$
  b)

    i = 2.

    $$\displaystyle{\begin{array}{l} g_{2}(u_{4}) -\beta _{0} -\langle \nabla h_{2}(y_{2}),u_{4} - y_{2}\rangle = \frac{1} {2}u_{4}^{4} -\frac{1} {4} -\langle 2y_{2},u_{4} - y_{2}\rangle \\ = \frac{1} {2}\left (\frac{\sqrt{2}} {2} \right )^{4/3} -\frac{1} {4} +\langle \sqrt{2},-\left (\frac{\sqrt{2}} {2} \right )^{1/3} + \frac{\sqrt{2}} {2} \rangle \\ = \frac{1} {2}\left [\left (\frac{1} {2}\right )^{2/3} -\frac{1} {2}\right ] +\langle \sqrt{2},\left (\frac{\sqrt{2}} {2} \right ) -\left (\frac{\sqrt{2}} {2} \right )^{1/3}\rangle.\end{array} }$$

Without a computer it is rather difficult to decide on the sign of the latter expression. Suppose our program behaved incorrectly at this point, so that we failed to violate the VI (30). What do we have to do next? It is necessary to change β to another value, i.e. to loop to Step 2.

Step 2.:

Change \(\beta _{0}\) to \(\beta _{1} = \frac{3} {4}\).

Step 3.:

We need a new set \(\mathcal{A}_{1}\) of points \(y_{i}\) satisfying

$$\displaystyle{h(y) = \left \{\begin{array}{c} \frac{1} {2}y^{2},\quad y \geq 0, \\ y^{2},\quad y < 0\end{array} \right \} =\beta _{1}-\zeta _{0} = \frac{3} {4}-\left (-\frac{1} {4}\right ) = 1,}$$

whence it follows

$$\displaystyle{\begin{array}{c} i = 1,\quad y_{1} = \sqrt{2},\\ i = 2, \quad y_{ 2} = -1.\end{array} }$$

On account of (34) we have to solve the linearized (convex) problems (i = 1, 2)

$$\displaystyle{(\mathcal{P}\mathcal{L}_{i}): \varPhi _{i}(x) = g(x) -\langle \nabla h(y_{i}),x\rangle \downarrow \min \limits _{x},\quad x \in \mathbb{R}.}$$

Note that here it is sufficient to investigate only the cases (i = 1, a) and (i = 2, b), because the other two, (i = 1, b) and (i = 2, a), turn out to be unacceptable, as above.

i = 1, a)

$$\displaystyle{g_{1}(x) -\langle \nabla h_{1}(y_{1}),x\rangle = \frac{1} {4}x^{4} - x\sqrt{2} \downarrow \min \limits _{ x},\quad x \in \mathbb{R}.}$$

With the help of Fermat rule one has

$$\displaystyle{x^{3} -\sqrt{2} = 0,\quad x \geq 0,\quad u_{ 1} = (2)^{1/6}.}$$

i = 2, b)

$$\displaystyle{g_{2}(x) -\langle \nabla h_{2}(y_{2}),x\rangle =\varPhi _{4}(x) = \frac{1} {2}x^{4} + 2x \downarrow \min,\quad x \in \mathbb{R},}$$

which provides for

$$\displaystyle{2x^{3} + 2 = 0,\quad u_{ 4} = -1.}$$

Now, we need to verify VI (30). However, it is sufficient to consider only the case (i = 2, b) with \(u_{4} = -1\), \(y_{2} = -1\). Actually, in this case due to (34) we have

$$\displaystyle{g_{2}(u_{4})-\beta _{1}-\langle \nabla h_{2}(y_{2}),u_{4}-y_{2}\rangle = \frac{1} {2}u_{4}^{4}-\frac{3} {4}-\langle 2(-1),-1-(-1)\rangle = \frac{1} {2} -\frac{3} {4} - 0 = -\frac{1} {4} < 0.}$$

The latter inequality means that the GOCs have been violated and, moreover, that we succeeded in "jumping" out of the local pit \(z^{0} = 1\) directly to the global solution \(z^{1} = -1\) by means of the Global Search Strategy (Method). □ 
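The decisive check can be redone in a few lines (our arithmetic, using the data of Example 3 for the branch x < 0):

```python
# Violation test of the VI (30) at beta_1 = 3/4 for Example 3,
# branch x < 0: g2(x) = x**4 / 2, h2(x) = x**2, grad h2(x) = 2x.
beta1, zeta0 = 0.75, -0.25
y2 = -1.0                          # from h2(y) = beta1 - zeta0 = 1, y < 0
u4 = -1.0                          # Fermat rule for Phi_4(x) = x**4/2 + 2x
lhs = 0.5 * u4**4 - beta1 - (2.0 * y2) * (u4 - y2)
print(lhs)                         # negative, so the VI (30) is violated
assert lhs < 0                     # hence u4 = -1 improves on z^0 = 1
```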

5 Numerical Solution of the Applied Problems

5.1 Linear Complementarity Problem

As was said in Sect. 2, we have to look at the LCP (1) as the optimization problem (2). Besides, we will consider the most difficult case, when (2) is nonconvex; more precisely, the matrix M in the statement (2) is indefinite, i.e. possesses both positive and negative eigenvalues.

Note that the LCP (1) represents necessary optimality conditions for the problem

$$\displaystyle{ f(x) = \frac{1} {2}\langle x,Mx\rangle +\langle q,x\rangle \downarrow \min,\quad x \geq 0. }$$
(35)

However, this problem cannot replace (2), because M is indefinite. As a consequence, f(⋅ ) in (35) can be unbounded below, while the objective function in (2) is nonnegative and takes the zero value only at a solution to Problem (2). It is clear that this provides additional information for the computational process.

Now, let us describe the principal stage of the Global Search Algorithm (GSA).

0. Classification. Thus, we decide to classify the LCP with an indefinite matrix M as a d.c. minimization problem (3) with strongly convex functions g(⋅ ) and h(⋅ ):

$$\displaystyle\begin{array}{rcl} & & g(x) =\langle x,M_{1}x\rangle +\langle q,x\rangle,\quad h(x) =\langle x,M_{2}x\rangle, {}\\ & & M_{i} = M_{i}^{\top } > 0,\quad i = 1,2,\quad M = M_{ 1} - M_{2}. {}\\ \end{array}$$

I. The next stage is the local search. To this end, let us apply the LSM (16), which (for the LCP (2)) takes the form of consecutive solution of the following linearized problems

$$\displaystyle{ \left.\begin{array}{c} \varPhi _{s}(x) =\langle M_{1}x,x\rangle +\langle q - 2M_{2}x^{s},x\rangle \downarrow \min, \\ x \geq 0,\quad Mx + q \geq 0. \end{array} \right \} }$$
(36)

For solving Problems (36) we used the well-known XPress solver, which was especially designed for convex quadratic and linear programming problems.

On the other hand, to organize a rather effective testing of the LSM (16), the entries of the (n × n) matrix M were randomly generated in the interval [−n, n] (see [34]). So, a set of randomly generated LCPs of type (2), of dimension varying from n = 2 to n = 1000, has been formed. Moreover, for every LCP we used three different starting points. Further, the local solution process has been performed and analyzed on this field of test LCPs. In particular, the results enable us to observe the behavior of the method (15)–(16) and to choose appropriate starting points for the global search ("good-bad" points, starting from which the LSM (15)–(16) does not provide a global solution to Problem (2)).

Note separately that, due to Theorem 1, the linearized problems (36) may be solved at a low accuracy at the first steps; further, the accuracy \(\delta _{s}\) can be gradually improved (\(\delta _{s} \downarrow 0\)), for example, \(\delta _{0} = 0.1\), \(\delta _{s+1} = 0.5\delta _{s}\), until the condition \(\delta _{s} \leq \delta\) is fulfilled with a given accuracy δ > 0. The results of computational testing of the LSM (16) were presented for the first time in [34] and have since been considerably improved (up to the dimension n = 1000, with a 3.4 GHz Pentium computer with 1 Gb of memory). The auxiliary linearized problems (36) have been solved by the XPress solver.

On the basis of the analysis of the results of computational testing [34], one can conclude that the LSM (16) proved rather effective for the LCP (2). Moreover, it was considerably more effective in comparison with the XPress solver, because the latter was unable to deal with nonconvex LCPs (2) of dimension n ≥ 10, while the LSM (15)–(16) yielded a critical (feasible) point for (2) in all considered test problems up to the dimension n = 1000, with an obligatory and considerable decrease of the goal function Φ(⋅ ) in (2) (see [34]).

So, LSM (15)–(16) can be applied in a GSA, although it is not able, in general, to reach a global solution.
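A minimal sketch of this local search loop is given below; it assumes the d.c. split from Stage 0 above and substitutes a general-purpose convex solver (SLSQP from SciPy) for the XPress solver actually used:

```python
import numpy as np
from scipy.optimize import minimize

def lcp_local_search(M, q, x0, steps=10):
    """Sketch of the LSM (16) for the LCP (2): consecutive solution of
    the convex linearized problems (36). The solver choice is ours."""
    n = len(q)
    # one of the many d.c. splits M = M1 - M2, M1 = M1^T > 0, M2 = M2^T > 0
    Ms = 0.5 * (M + M.T)
    mu = max(-np.linalg.eigvalsh(Ms).min(), 0.0) + 1.0
    M1, M2 = Ms + mu * np.eye(n), mu * np.eye(n)
    cons = ({'type': 'ineq', 'fun': lambda x: x},            # x >= 0
            {'type': 'ineq', 'fun': lambda x: M @ x + q})    # Mx + q >= 0
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        c = q - 2.0 * M2 @ x                                 # linearization at x^s
        x = minimize(lambda v: v @ M1 @ v + c @ v, x, constraints=cons).x
    return x

M = np.array([[1.0, 2.0], [2.0, -1.0]])                      # indefinite matrix
q = np.array([-1.0, -1.0])
x = lcp_local_search(M, q, x0=np.array([1.0, 1.0]))
print(x @ (M @ x + q))   # complementarity gap <x, Mx+q> at the critical point
```

By the monotonicity of the scheme, the gap does not exceed its starting value (here 2 at x⁰ = (1, 1)).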

Now we can pass to the global search described in Sect. 4. To begin with, one has to propose a numerical solution of the equation

$$\displaystyle{h(y)\stackrel{\bigtriangleup }{=}\left \langle y,M_{2}y\right \rangle =\beta -\zeta _{k},\;\;M_{2} = M_{2}^{T} > 0,}$$

where \(\zeta _{k}:=\varPhi (z^{k}),\;\;\beta \in [\beta _{-},\beta _{+}]\); more precisely, to construct an approximation

$$\displaystyle{\mathcal{A}_{k}(\beta ) =\{ y^{1},\ldots,y^{N}\;\vert \;h(y^{i}) =\beta -\zeta _{ k},\;\;i = 1,\ldots,N_{k}\}}$$

of the level surface \(U_{k}(\beta ) =\{ x\mid h(x) =\beta -\zeta _{k}\}\). The construction of such an approximation is a key point in the implementation of the global search.

Despite the importance of this procedure, the construction may be performed in a rather simple fashion, for example,

$$\displaystyle{ y^{i} =\mu _{ i}d^{i},\quad i = 1,\ldots,N, }$$
(37)

where \(d^{i}\) are elements of some set in \(\mathbb{R}^{n}\), for instance, \(d^{i} = e^{i}\), \(\{e^{1},\ldots,e^{n}\}\) being the Euclidean basis of \(\mathbb{R}^{n}\), and the numbers \(\mu _{i}\) are chosen as roots of the equation \(h(\mu _{i}d^{i}) =\beta -\zeta _{k}\), which is quadratic due to the quadratic structure of h(⋅ ).
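For \(d^{i} = e^{i}\) and any positive definite \(M_{2}\), the roots \(\mu _{i}\) are available in closed form; a short sketch (the matrix below is illustrative only):

```python
import numpy as np

def level_points(M2, beta_minus_zeta):
    """Points y^i = mu_i * e^i with h(y^i) = <y^i, M2 y^i> = beta - zeta,
    following (37) with d^i = e^i: the quadratic equation
    mu^2 * (M2)_{ii} = beta - zeta gives mu_i directly."""
    n = M2.shape[0]
    ys = []
    for i in range(n):
        mu = np.sqrt(beta_minus_zeta / M2[i, i])
        y = np.zeros(n)
        y[i] = mu
        ys.append(y)
    return ys

M2 = np.array([[1.0, 0.0], [0.0, 4.0]])      # an illustrative M2 = M2^T > 0
for y in level_points(M2, 2.0):
    assert abs(y @ M2 @ y - 2.0) < 1e-12     # each point lies on the level surface
```

Taking the negative roots as well doubles the approximation, as in \(\mathcal{R}_{1}\) below.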

In order to solve Problem (2) we used the approximations as follows:

$$\displaystyle{\mathcal{R}_{1} =\{ y^{i} =\mu _{ i}e^{i},\;\;y^{i+n} = -y^{i}\mid i = 1,\ldots,n\},}$$
$$\displaystyle{\mathcal{R}_{2} =\{ y^{i} = z^{k} +\mu _{ i}e^{i},\;\;y^{i+n} = z^{k} -\mu _{ i}e^{i}\mid i = 1,\ldots,n\},}$$

where z k is the current iteration point.

Besides, we also applied a third approximation using the form (37), where \(d^{i}\) (i = 1,…,n) have been produced as solutions of the linear programs

$$\displaystyle{ \left \langle e^{i},x\right \rangle \downarrow \min \limits _{ x},\;\;x \geq 0,\;\;Mx + q \geq 0,\;\;i = 1,\ldots,n }$$
(38)

and d n+1 is the solution of the similar problem

$$\displaystyle{ \left \langle e,x\right \rangle \downarrow \min \limits _{x},\;\;x \geq 0,\;\;Mx + q \geq 0, }$$
(39)

with \(e = (1,\ldots,1)^{T} \in \mathbb{R}^{n}\).

The results of computational testing of the developed GSA have first been published in [34] and turned out to be rather promising for the test problems of dimension till 400.

Now we have software which is able to solve the LCP (2) up to dimension \(10^{3}\) in 10–12 min and up to dimension \(10^{4}\) in 90–150 min; recall, by means of only one computer, almost the same as was used in [34], without applying any parallel technology.

In addition, in order to compare the efficiency of the GSA with existing software, the same series of randomly generated problems has been solved using the solver PATH [34], which was especially designed for solving the LCP (1). Note that all computational simulations have been carried out by students and postgraduates.

So, the qualities of the programs implemented may vary significantly.

Nevertheless, we can conclude that the developed GSA proved to be rather effective for solving (nonconvex) LCPs with indefinite matrix M.

5.2 Bimatrix Games

Here we present the principal points of the numerical search for a Nash equilibrium (NE), defined in (5), in the two-person game stated in (4). The computational algorithm has been developed on the basis of the following result of Mills [21].

Theorem 4.

([21]) (i) A situation \((x^{{\ast}},y^{{\ast}})\) is a Nash equilibrium in the bimatrix game G(A,B) (4) if and only if \((x^{{\ast}},y^{{\ast}})\) is a part of a global solution \((x^{{\ast}},y^{{\ast}},\alpha _{{\ast}},\beta _{{\ast}}) \in \mathbb{R}^{m} \times \mathbb{R}^{n} \times \mathbb{R}^{2}\) to Problem (6).

  (ii)

    Moreover, \(\alpha _{{\ast}}\) and \(\beta _{{\ast}}\) are the payoffs of the first and the second players, respectively:

    $$\displaystyle{\langle x^{{\ast}},Ay^{{\ast}}\rangle =\alpha _{ {\ast}},\quad \langle x^{{\ast}},By^{{\ast}}\rangle =\beta _{ {\ast}}.}$$
  (iii)

    Finally, the optimal value of the goal function in Problem (6) is equal to zero:

    $$\displaystyle{\mathcal{V}(6) = F(x^{{\ast}},y^{{\ast}},\alpha _{ {\ast}},\beta _{{\ast}}) = 0.}$$

5.2.1 Classification

In order to develop a numerical method for solving Problem (6), we first have to classify it as a nonconvex problem. Because all the constraints in (6) are linear, we have to examine the features of the cost function of Problem (6). It can be readily seen that this function admits the following d.c. decomposition:

$$\displaystyle{ F(x,y,\alpha,\beta ) = h(x,y) - g(x,y,\alpha,\beta ), }$$
(40)

where

$$\displaystyle{ \left.\begin{array}{c} h(x,y) = \frac{1} {4}(\|x + Ay\|^{2} +\| x + By\|^{2}) \\ g(x,y,\alpha,\beta ) = \frac{1} {4}(\|x - Ay\|^{2} +\| x - By\|^{2}) +\alpha +\beta \end{array} \right \} }$$
(41)

are convex functions (h(⋅ ) on \(\mathbb{R}^{m+n}\), g(⋅ ) on \(\mathbb{R}^{m+n+2}\)). In other words, we have the d.c. minimization problem \((\mathcal{P}_{\mathrm{BM}})\) as follows:

$$\displaystyle\begin{array}{rcl} (\mathcal{P}_{\mathrm{BM}}): \left.\begin{array}{c} F_{0}(x,y,\alpha,\beta ) = -F(x,y,\alpha,\beta ) = g(x,y,\alpha,\beta ) - h(x,y) \downarrow \min \limits _{x,y,\alpha,\beta }, \\ (x,\beta ) \in X =\{\, x \in S_{m},\quad \beta \in \mathbb{R}\mid x^{\top }B \leq \beta e_{n}\,\}, \\ (y,\alpha ) \in Y =\{\, y \in S_{n},\quad \alpha \in \mathbb{R}\mid Ay \leq \alpha e_{m}\,\}. \end{array} \right \}& &{}\\ \end{array}$$
(6′)

It is easy to see that the cost function \(F_{0}(x,y,\alpha,\beta )\) is nonnegative (\(F_{0}(\cdot ) \geq 0\), F(⋅ ) ≤ 0) on the feasible set of Problem \((\mathcal{P}_{\mathrm{BM}})\)–(6′).
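The decomposition (40)–(41) and the identity \(F(x,y,\alpha,\beta ) =\langle x,(A + B)y\rangle -\alpha -\beta\) behind it can be verified numerically at random data:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
A, B = rng.normal(size=(m, n)), rng.normal(size=(m, n))
x, y = rng.normal(size=m), rng.normal(size=n)
alpha, beta = 0.7, -1.3                      # arbitrary values

# h and g from (41); the polarization identity  ||a+b||^2 - ||a-b||^2 = 4<a,b>
# collapses h - g to the bilinear form of F.
h = 0.25 * (np.sum((x + A @ y) ** 2) + np.sum((x + B @ y) ** 2))
g = 0.25 * (np.sum((x - A @ y) ** 2) + np.sum((x - B @ y) ** 2)) + alpha + beta
F = x @ (A + B) @ y - alpha - beta           # F(x, y, alpha, beta) = h - g
assert abs((h - g) - F) < 1e-10              # hence F0 = g - h = -F
```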

In addition, if we denote

$$\displaystyle{ \alpha (y):=\max \limits _{1\leq i\leq m}(Ay)_{i},\quad \beta (x):=\max \limits _{1\leq j\leq n}(x^{\top }B)_{ j}, }$$
(42)

then due to the necessity proof of Theorem 2 we can reformulate Theorem 2 in the contrapositive form as follows.

Theorem 5.

([28]) If a feasible tuple \((\hat{x},\hat{y},\hat{\alpha },\hat{\beta })\) is not a global solution to Problem \((\mathcal{P}_{\mathrm{BM}})\)–(6′), then there exist vectors \((u,v) \in S_{m} \times S_{n}\) and \((\bar{x},\bar{y}) \in S_{m} \times S_{n}\), and a number γ, such that

$$\displaystyle\begin{array}{rcl} \gamma - h(u,v) =\zeta:= F_{0}(\hat{x},\hat{y},\hat{\alpha },\hat{\beta }) > 0,& &{}\end{array}$$
(43)
$$\displaystyle\begin{array}{rcl} g(u,v,\alpha (v),\beta (u)) \leq \gamma \leq \sup (g,D),& &{}\end{array}$$
(44)
$$\displaystyle\begin{array}{rcl} g(\bar{x},\bar{y},\alpha (\bar{y}),\beta (\bar{x}))-\gamma <\langle \nabla _{x}h(u,v),\bar{x} - u\rangle +\langle \nabla _{y}h(u,v),\bar{y} - v\rangle.& &{}\end{array}$$
(45)

Applying just this result, we will develop a GSM for finding a global solution to \((\mathcal{P}_{\mathrm{BM}})\)–(6′). The first step of this GSM is a local search algorithm which takes into account the bilinear structure of the cost function \(\alpha +\beta -\langle x,(A + B)y\rangle\) of Problem \((\mathcal{P}_{\mathrm{BM}})\)–(6′).

5.2.2 Local Search

First, let us repeat that LSMs play an important role in the search for a global solution to nonconvex problems, since they provide the so-called critical (stationary) points, which may be considerably better than a simple feasible point. Moreover, if a starting point happens to be rather close to a global solution (as in the case of Newton's method for solving systems of nonlinear equations), then an LSM is able to provide the global solution.

Therefore, we have to devote attention and considerable effort to the creation (design or choice) and the substantiation of local search procedures.

For instance, in the case of Problem (6′) it might be possible to apply the LSM (15)–(16), taking into account the d.c. representation (40)–(41) and applying the corresponding methods of quadratic programming.

However, in this case the specific character of the cost function of Problem \((\mathcal{P}_{\mathrm{BM}})\)–(6′), namely the bilinearity of \(\langle x,(A + B)y\rangle\), would be lost. Therefore, we propose to follow another, more natural way in this case, taking into account the bilinear structure of the goal function \(F_{0}(x,y,\alpha,\beta ) =\alpha +\beta -\langle x,(A + B)y\rangle\). Combining the linearization idea with the separation of the variables into two groups according to the statement of Problem \((\mathcal{P}_{\mathrm{BM}})\)–(6′), we naturally obtain a procedure of consecutive solution of the following two problems (linearized at a point \((u,v) \in \mathbb{R}^{m} \times \mathbb{R}^{n}\))

$$\displaystyle{ (\mathcal{P}\mathcal{L}_{x}): \left \{\begin{array}{c} \beta -\langle (A + B)v,x\rangle \downarrow \min \limits _{(x,\beta )}, \\ (x,\beta ) \in X =\{ (x,\beta ) \in S_{m} \times \mathbb{R}\mid x^{\top }B \leq \beta e_{n}\}, \end{array} \right. }$$
(46)
$$\displaystyle{ (\mathcal{P}\mathcal{L}_{y}): \left \{\begin{array}{c} \alpha -\langle u^{\top }(A + B),y\rangle \downarrow \min \limits _{ (y,\alpha )}, \\ (y,\alpha ) \in Y =\{ (y,\alpha ) \in S_{n} \times \mathbb{R}\mid Ay \leq \alpha e_{m}\}. \end{array} \right. }$$
(47)

Rather unexpectedly, the procedure of consecutively solving Problems \((\mathcal{P}\mathcal{L}_{x})\) and \((\mathcal{P}\mathcal{L}_{y})\) converges in the following sense.

Theorem 6.

([33]) The sequence of tuples \((x^{s},y^{s},\alpha _{s},\beta _{s})\) generated by the LSM, consisting in the consecutive fulfillment of the following inequalities,

$$\displaystyle\begin{array}{rcl} \alpha _{s+1} -\langle x^{s}(A + B),y^{s+1}\rangle -\frac{\rho _{s}} {2} \leq \inf \limits _{(y,\alpha )}\{\alpha -\langle x^{s}(A + B),y\rangle \mid (y,\alpha ) \in Y \},& &{}\end{array}$$
(48)
$$\displaystyle\begin{array}{rcl} \beta _{s+1} -\langle x^{s+1}(A + B),y^{s+1}\rangle -\frac{\rho _{s}} {2} \leq \inf \limits _{(x,\beta )}\{\beta -\langle x,(A + B)y^{s+1}\rangle \mid (x,\beta ) \in X\},& &{}\end{array}$$
(49)

converges to a quadruple \((\hat{x},\hat{y},\hat{\alpha },\hat{\beta })\) satisfying the conditions as follows

$$\displaystyle{ \left.\begin{array}{c} F_{0}(\hat{x},\hat{y},\hat{\alpha },\hat{\beta }) \leq F_{0}(\hat{x},y,\alpha,\hat{\beta })\quad \forall (y,\alpha ) \in Y, \\ F_{0}(\hat{x},\hat{y},\hat{\alpha },\hat{\beta }) \leq F_{0}(x,\hat{y},\hat{\alpha },\beta )\quad \forall (x,\beta ) \in X,\\ \end{array} \right \} }$$
(50)

provided that \(\rho _{s} > 0\), s = 0,1,2,…, \(\sum \limits _{s=0}^{\infty }\rho _{s} < +\infty \). □

Henceforth, we will call a point satisfying (50) a critical point of Problem \((\mathcal{P}_{\mathrm{BM}})\)–(6′). The LSM (46)–(49) has been tested on a rather large field of well-known test problems [33], as well as on various test problems especially constructed with the help of the idea from [5], beginning with known games of small dimension (2 × 2, 3 × 3) and up to test games of rather high size (say, \(m = n = 1{,}000\)).

Computational simulations certify the unexpected effectiveness of the developed LSM, which naturally depends on the method or software package (CPLEX) used for solving the linear problems (46), (47). Now we are able to perform the LSM with data of size \(m = n = 10^{6}\) rather easily and effectively.
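A compact sketch of the alternating scheme (46)–(49) is given below; it uses SciPy's `linprog` in place of CPLEX (our substitution) and a small prisoner's-dilemma-type game, chosen for illustration, whose pure Nash equilibrium is known:

```python
import numpy as np
from scipy.optimize import linprog

def solve_PL_x(A, B, v):
    """Linearized problem (46): min over (x, beta) of beta - <(A+B)v, x>
    subject to x in the simplex S_m and x^T B <= beta * e_n."""
    m, n = A.shape
    c = np.concatenate([-(A + B) @ v, [1.0]])
    A_ub = np.hstack([B.T, -np.ones((n, 1))])              # B^T x - beta <= 0
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # sum(x) = 1
    bounds = [(0, None)] * m + [(None, None)]              # x >= 0, beta free
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    return res.x[:m], res.x[m]

def local_search(A, B, y0, steps=5):
    """Alternate (46) and (47); by symmetry, (47) is (46) with (B^T, A^T, x)."""
    y = y0
    for _ in range(steps):
        x, beta = solve_PL_x(A, B, y)
        y, alpha = solve_PL_x(B.T, A.T, x)
    return x, y, alpha, beta

A = np.array([[3.0, 0.0], [5.0, 1.0]])   # payoffs of the first player
B = np.array([[3.0, 5.0], [0.0, 1.0]])   # payoffs of the second player
x, y, alpha, beta = local_search(A, B, y0=np.array([1.0, 0.0]))
print(alpha + beta - x @ (A + B) @ y)    # F0 vanishes (up to LP tolerance) at a NE
```

For this game the procedure reaches the pure equilibrium x = y = (0, 1) in one sweep.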

5.2.3 Global Search Algorithm

Recall that, in addition to local search, the basic stages of a GSM include an approximation of the level surface of the convex function h(⋅ ) (which creates the basic nonconvexity in Problem \((\mathcal{P}_{\mathrm{BM}})\)–(6′)), the solution of a linearized problem \((\mathcal{P}\mathcal{L}_{s})\), the verification of the VI (30) with \(w^{i} \in U(h,\gamma -\zeta ) =\{ (x,y) \in \mathbb{R}^{m+n}\mid h(x,y) =\gamma -\zeta \}\), and, finally, a line search along the variable \(\gamma \in \mathbb{R}\).

Taking into account the particularities of Problem (6′), the following modifications have been introduced into the general scheme of the Global Search on the basis of Theorem 5.

  1.

    Due to the properties of the cost function \(F_{0}(x,y,\alpha,\beta )\) (see Theorem 4), a supplementary stopping criterion was introduced.

  2.

    Two new parameters (q and ν) have been introduced in order to control the speed and the accuracy of the algorithm [23].

Let us now describe the GSM for solving Problem \((\mathcal{P}_{\mathrm{BM}})\)–(6′) in a more algorithmic form.

Assume we are given a starting feasible point \((x^{0},y^{0},\alpha _{0},\beta _{0}) \in D = X \times Y\); number sequences \(\{\tau _{k}\}\) and \(\{\delta _{k}\}\), k = 0, 1, 2,…, \(\tau _{k} \downarrow 0\), \(\delta _{k} \downarrow 0\) (k → ∞); a set of directions \(Dir =\{ (\bar{u}^{1},\bar{v}^{1}),\ldots,(\bar{u}^{N},\bar{v}^{N}) \in \mathbb{R}^{m+n}\}\); the bounds \(\gamma _{-}\approx \inf (g,D)\), \(\gamma _{+} \approx \sup (g,D)\); and parameters ν ∈ ]0, 1[ and q.

Global Search Method for \((\mathcal{P}_{\mathrm{BM}})\)–(6′).

Step 0. :

Set k: = 0, \((\bar{x}^{k},\bar{y}^{k},\bar{\alpha }_{k},\bar{\beta }_{k}):= (x^{0},y^{0},\alpha _{0},\beta _{0})\), s: = 0, p: = 1, \(\gamma:=\gamma _{-}\), \(\varDelta \gamma = (\gamma _{+} -\gamma _{-})/q\).

Step 1. :

Starting from \((\bar{x}^{k},\bar{y}^{k},\bar{\alpha }_{k},\bar{\beta }_{k}) \in D\), move to a \(\tau _{k}\)-critical point \((x^{k},y^{k},\alpha _{k},\beta _{k}) \in D\) by means of the special LSM (48)–(49).

Set \(\zeta _{k}:= F_{0}(x^{k},y^{k},\alpha _{k},\beta _{k}) \leq F_{0}(\bar{x}^{k},\bar{y}^{k},\bar{\alpha }_{k},\bar{\beta }_{k}).\)

Step 2. :

(Stopping criterion). If \(\zeta _{k} \leq \varepsilon\), where \(\varepsilon\) is the prescribed accuracy, then STOP: \((x^{k},y^{k}) \in NE(G,\varepsilon ).\)

Step 3. :

With the help of the point \((\bar{u}^{p},\bar{v}^{p}) \in Dir\) (p = 1,…,N), construct a point \((u^{p},v^{p})\) such that \(h(u^{p},v^{p}) =\gamma -\zeta _{k}\). Compute the numbers \(\alpha _{p}:=\max \limits _{1\leq i\leq m}(Av^{p})_{i},\;\beta _{p}:=\max \limits _{1\leq j\leq n}(u^{p}B)_{j}.\)

Step 4. :

If \(g(u^{p},v^{p},\alpha _{p},\beta _{p}) >\gamma +\nu \gamma\), then set \(p:= p + 1\) and return to Step 3. Else go to Step 5.

Step 5. :

Starting at the point \((u^{p},v^{p},\alpha _{p},\beta _{p})\), find a \(2\tau _{k}\)-critical point \((\hat{x}^{p},\hat{y}^{p},\hat{\alpha }_{p},\hat{\beta }_{p}) \in D\) of Problem (6′) by means of the special LSM.

Step 6. :

(Stopping criterion). If \(F_{0}(\hat{x}^{p},\hat{y}^{p},\hat{\alpha }_{p},\hat{\beta }_{p}) \leq \varepsilon\), then STOP: \((\hat{x}^{p},\hat{y}^{p}) \in NE(G,\varepsilon )\).

Step 7. :

Find a \(\delta _{k}\)-solution \((x_{0}^{p},y_{0}^{p})\) to the level problem or, equivalently, a point satisfying

$$\displaystyle{ \begin{array}{c} \langle \nabla _{x}h(x_{0}^{p},y_{0}^{p}),\hat{x}^{p} - x_{0}^{p}\rangle +\langle \nabla _{y}h(x_{0}^{p},y_{0}^{p}),\hat{y}^{p} - y_{0}^{p}\rangle +\delta _{k} \\ \geq \sup \limits _{(x,y)}\{\langle \nabla _{x}h(x_{0}^{p},y_{0}^{p}),\hat{x}^{p} - x\rangle +\langle \nabla _{y}h(x_{0}^{p},y_{0}^{p}),\hat{y}^{p} - y\rangle \mid h(x,y) =\gamma -\zeta _{k}\},\end{array} }$$
(51)

where \(h(x_{0}^{p},y_{0}^{p}) =\gamma -\zeta _{k}\).

Step 8. :

Compute

$$\displaystyle{\eta _{k}(\gamma ) = g(\hat{x}^{p},\hat{y}^{p},\hat{\alpha }_{ p},\hat{\beta }_{p}) -\gamma -\langle \nabla _{x}h(x_{0}^{p},y_{ 0}^{p}),\hat{x}^{p} - x_{ 0}^{p}\rangle -\langle \nabla _{ y}h(x_{0}^{p},y_{ 0}^{p}),\hat{y}^{p} - y_{ 0}^{p}\rangle.}$$
Step 9. :

If \(\eta _{k}(\gamma ) \geq 0\) and p < N, then set \(p:= p + 1\) and loop to Step 3.

Step 10. :

If \(\eta _{k}(\gamma ) \geq 0\) and p = N, then set \(\gamma:=\gamma +\varDelta \gamma\), p: = 1, and return to Step 3.

Step 11. :

If \(\eta _{k}(\gamma ) < 0\), then set \(k:= k + 1\), \((\bar{x}^{k+1},\bar{y}^{k+1},\bar{\alpha }_{k+1},\bar{\beta }_{k+1}):= (\hat{x}^{p},\hat{y}^{p},\hat{\alpha }_{p},\hat{\beta }_{p})\), and return to Step 1.

Step 12. :

If p = N and \(\eta _{k}(\gamma ) \geq 0\forall \gamma \in [\gamma _{-},\gamma _{+}]\) (i.e., the line search with respect to \(\gamma \in [\gamma _{-},\gamma _{+}]\) is finished), then stop.

The GSM presented above is not yet an algorithm, since some of its steps have not been described completely and must be made precise. For instance, it is not clear how to find the pair \((x_{0}^{p},y_{0}^{p})\) at Step 7 and, besides, how to construct a point \((u^{p},v^{p})\) on the level surface of h(⋅ ), \(h(u^{p},v^{p}) =\gamma -\zeta _{k}\), with the help of a given direction \((\bar{u}^{p},\bar{v}^{p})\) at Step 3. As to the first problem, at Step 7, it can be solved analytically for the quadratic function

$$\displaystyle{ h(x,y) = \frac{1} {4}(\|x + Ay\|^{2} +\| x + By\|^{2}), }$$
(52)

more precisely, the exact solution is given by the formula [32, 33]

$$\displaystyle{(x_{0}^{p},y_{ 0}^{p}) = t(\hat{x}^{p},\hat{y}^{p}),\ t ={\Bigl [ \frac{\gamma -\zeta _{k}} {h(\hat{x}^{p},\hat{y}^{p})}\Bigr ]}^{\frac{1} {2} }.}$$

A point \((u^{p},v^{p})\) satisfying \(h(u^{p},v^{p}) =\gamma -\zeta _{k}\) can be constructed in a similar way:

$$\displaystyle\begin{array}{rcl} & & (u^{p},v^{p}) =\lambda _{p}(\bar{u}^{p},\bar{v}^{p}),\quad p = 1,\ldots,N, {}\\ & & \lambda _{p} =\lambda _{p}(\zeta _{k},\gamma ) = \pm {\Bigl [ \frac{\gamma -\zeta _{k}} {h(\bar{u}^{p},\bar{v}^{p})}\Bigr ]}^{\frac{1}{2}}. {}\\ \end{array}$$
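
Both scaling formulas above exploit the fact that the function h in (52) is positively homogeneous of degree 2, so placing a point on a level surface reduces to one square root. A minimal sketch in Python (the matrices A, B and the direction are hypothetical illustration data, not taken from the experiments):

```python
import math

def h(x, y, A, B):
    """The quadratic function h(x, y) = (1/4)(||x + A y||^2 + ||x + B y||^2) from (52)."""
    def matvec(M, v):
        return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
    Ay, By = matvec(A, y), matvec(B, y)
    s1 = sum((xi + ai) ** 2 for xi, ai in zip(x, Ay))
    s2 = sum((xi + bi) ** 2 for xi, bi in zip(x, By))
    return 0.25 * (s1 + s2)

def level_point(u, v, A, B, gamma, zeta):
    """Scale a direction (u, v) so that the result lies on the level surface
    h = gamma - zeta; since h is positively homogeneous of degree 2, a single
    square-root scaling suffices (requires gamma > zeta and h(u, v) > 0)."""
    lam = math.sqrt((gamma - zeta) / h(u, v, A, B))
    return [lam * ui for ui in u], [lam * vi for vi in v]

# hypothetical 2x2 illustration data
A = [[1.0, 2.0], [0.0, 1.0]]
B = [[2.0, 0.0], [1.0, 1.0]]
u, v = [1.0, 1.0], [1.0, -1.0]
z, w = level_point(u, v, A, B, gamma=3.0, zeta=1.0)
print(abs(h(z, w, A, B) - 2.0) < 1e-12)  # True: the scaled point lies on the level surface
```

The same one-line scaling gives the exact solution \((x_{0}^{p},y_{0}^{p}) = t(\hat{x}^{p},\hat{y}^{p})\) of Step 7 when the direction is taken to be \((\hat{x}^{p},\hat{y}^{p})\) itself.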

For the numerical solution of Problem (\(\mathcal{P}_{\mathrm{BM}}\))–(6), approximations of the level surface \(U(h,\gamma -\zeta ) =\{ (x,y)\vert \ h(x,y) =\gamma -\zeta \}\) (see Step 3) were used, constructed with the help of the following sets of directions:

$$\displaystyle{Dir1 =\{ (e^{i},e^{j}) \in \mathbb{R}^{m+n}\vert \ i = 1,\ldots,m,\ j = 1,\ldots,n\},}$$

where \(\{e^{i}\}\) and \(\{e^{j}\}\) are the Euclidean bases in \(\mathbb{R}^{m}\) and \(\mathbb{R}^{n}\), respectively;

$$\displaystyle{Dir2 =\{ (e^{i} + x,e^{j} + y)\vert \ i = 1,\ldots,m,\ j = 1,\ldots,n\},}$$

where (x, y) is a critical point provided by the special LSM;

$$\displaystyle{Dir3 =\{ (a^{j} + e_{ m},b^{i} + e_{ n})\vert \ i = 1,\ldots,m,\ j = 1,\ldots,n\},}$$

where \(a^{j} \in \mathbb{R}^{m}\) are the columns of A, \(b^{i} \in \mathbb{R}^{n}\) are the rows of B, and \(e_{p} = (1,\ldots,1) \in \mathbb{R}^{p}\), p = m, n.

Note that the sets Dir1, Dir2, Dir3 were selected as the most efficient ones after comparative computational experiments. On the other hand, it is easy to see that the number of points in the constructed approximations strongly depends on the size of the problem: it is equal to m × n.

So, the number of points in the approximations grows as \(q^{2}\), where \(q =\min \{ m,n\}\).

It is clear that this circumstance makes the numerical solution of Problem (\(\mathcal{P}_{\mathrm{BM}}\))–(6) of high dimension prohibitively expensive because of the excessive solution time.

In order to avoid this drawback, a reduction procedure was employed that shrinks the sets Dir1, Dir2, Dir3 to sets with the number of points equal to 2(m + n) [23, 33].

5.2.4 Computational Simulations

The numerical experiments were conducted applying software programs implementing the GSAs described above. For all the problems, a starting point was chosen as follows:

$$\displaystyle\begin{array}{rcl} & & x_{i}^{0} = \frac{1} {m},\ i = 1,2,\ldots,m;\ y_{j}^{0} = \frac{1} {n},\ j = 1,2,\ldots,n, {}\\ & & \alpha _{0} =\max _{i}(Ay^{0})_{ i},\ \beta _{0} =\max _{j}(x^{0}B)_{ j}. {}\\ \end{array}$$
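
This starting point can be computed directly from the payoff matrices; a small sketch in Python (the 2 × 2 matrices are hypothetical illustration data):

```python
def starting_point(A, B):
    """Uniform mixed strategies x0, y0 together with alpha0 = max_i (A y0)_i
    and beta0 = max_j (x0 B)_j, as prescribed above."""
    m, n = len(A), len(A[0])
    x0 = [1.0 / m] * m
    y0 = [1.0 / n] * n
    Ay0 = [sum(A[i][j] * y0[j] for j in range(n)) for i in range(m)]
    x0B = [sum(x0[i] * B[i][j] for i in range(m)) for j in range(n)]
    return x0, y0, max(Ay0), max(x0B)

# hypothetical payoff matrices
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[4.0, 3.0], [2.0, 1.0]]
x0, y0, alpha0, beta0 = starting_point(A, B)
print(x0, y0, alpha0, beta0)  # [0.5, 0.5] [0.5, 0.5] 3.5 3.0
```

Note that \((x^{0},y^{0})\) is feasible by construction (each vector lies on its simplex), and \(\alpha _{0},\beta _{0}\) are the matching payoff bounds.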

The computational simulations have been separated into several stages, and the first results of these experiments have been published in [23].

Further, the analysis of the results allowed us to identify some shortcomings of the software developed. First of all, this concerned the method for solving the linearized problems (46)–(47). Recall that, for this purpose, an implementation of the simplex method or of the supporting cone method was employed, which proved excessively time-consuming for problems of high dimension.

As a consequence, for solving the BM games of rather high dimension (up to 1000 × 1000) we decided to apply ILOG CPLEX 9.1 (http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer/index.html), which is especially oriented to LP problems. In addition, in order to create the worst conditions for the global search software, the entries of the matrices A and B were randomly generated from the interval [−n, n], where n = m.

The global search programs were run on a Pentium 4 (CPU 3 GHz, 512 MB of RAM) and were implemented by postgraduate students without long computational experience.

Nevertheless, the results of the computational solution of BM games (m = n) presented in Table 1 can be viewed as rather promising.

Table 1 Computational results

In Table 1, m = n is the number of pure strategies of players 1 and 2, \(F_{0}\) stands for the value of the goal function at the starting point, \(F_{k}\) is the corresponding value at the best point obtained, st is the number of iterations of the GSA (or, what is the same, the number of critical (stationary) points passed by the GS algorithm), and LP and Loc represent the number of linearized problems solved and the number of applications of the local search algorithm, respectively.

It can readily be seen that in the cases \(m = n = 250\) and 400 the random generation happened to produce very difficult problems.

Despite these difficulties, all test problems were successfully solved, which attests to the computational effectiveness of the software created on the basis of the GSA and the Global Search Theory.

Now we are preparing to attack the bimatrix game of dimension \(m = n = 10^{4}\) and similar three-person games of dimensions \(m = n = l = 5\) and 10.

5.3 Quadratic-Linear Bilevel Optimization

In this subsection we will consider the following problem of bilevel programming

$$\displaystyle\begin{array}{rcl} (\mathcal{B}\mathcal{P}): F(x,y):= \frac{1} {2}\langle x,Cx\rangle +\langle c,x\rangle + \frac{1} {2}\langle y,C_{1}y\rangle +\langle c_{1},y\rangle \downarrow \min \limits _{x,y},& &{}\end{array}$$
(53)
$$\displaystyle\begin{array}{rcl} (x,y) \in X:=\{ (x,y) \in \mathbb{R}^{m} \times \mathbb{R}^{n}\mid Ax + By \leq a,\quad x \geq 0\},& &{}\end{array}$$
(54)
$$\displaystyle\begin{array}{rcl} y \in Y _{{\ast}}(x):= \mathrm{Arg}\min \limits _{y}\{\langle d,y\rangle \mid y \in Y (x)\}& &{}\end{array}$$
(55)
$$\displaystyle\begin{array}{rcl} Y (x):=\{ y \in \mathbb{R}^{n}\mid A_{ 1}x + B_{1}y \leq b,\quad y \geq 0\},& &{}\end{array}$$
(56)

where we are seeking an optimistic solution [6, 8], i.e. the upper level (x, leader) and the lower level (y, follower) search together (in cooperation) for a common solution \((x^{\ast},y^{\ast})\). Here, \(c \in \mathbb{R}^{m}\), \(d,c_{1} \in \mathbb{R}^{n}\), \(a \in \mathbb{R}^{p}\), \(b \in \mathbb{R}^{q}\), and the matrices \(C,C_{1},A,B,A_{1},B_{1}\) are of corresponding dimensions. In addition, \(C = C^{\top } > 0\), \(C_{1} = C_{1}^{\top } > 0\), so that the leader's cost function is a convex quadratic function, while the follower's goal function is linear.

Assume that

\((\mathcal{H})\): (i) the function F(x, y) is bounded from below on X,

(ii) the function \(\langle d,y\rangle\) is bounded from below on \(Y (x)\ \forall \,x \in Pr(X)\).

At first glance, no nonconvexity is visible in the formulation \((\mathcal{B}\mathcal{P})\)–(53)–(56). In order to make it explicit, we apply the KKT conditions to the follower problem (55)–(56):

$$\displaystyle{ \left.\begin{array}{c} d + vB_{1} \geq 0,\quad v \geq 0,\quad A_{1}x + B_{1}y \leq b, \\ \langle d,y\rangle -\langle A_{1}x - b,v\rangle = 0, \end{array} \right \} }$$
(57)

where v is the vector of Lagrange multipliers. Since the follower problem is also convex, the relations (57) are equivalent to the statement (55)–(56).

Let us now replace the follower problem in \((\mathcal{B}\mathcal{P})\) by (57). This yields the new problem

$$\displaystyle{ \left.\begin{array}{c} (\mathcal{P}): F(x,y) \downarrow \min \limits _{x,y,v}, \\ Ax + By \leq a,\quad A_{1}x + B_{1}y \leq b,\quad d + vB_{1} \geq 0, \\ \langle d,y\rangle =\langle A_{1}x - b,v\rangle,\quad x \geq 0,\quad y \geq 0,\quad v \geq 0. \end{array} \right \} }$$
(58)

The following result establishes the relation between the \((\mathcal{B}\mathcal{P})\) and Problem \((\mathcal{P})\).

Theorem 7.

([8]) For the pair \((x^{\ast},y^{\ast})\) to be a global solution to Problem \((\mathcal{B}\mathcal{P})\) (53)–(56), it is necessary and sufficient that there exists a vector \(v^{{\ast}}\in \mathbb{R}^{q}\) such that the triple \((x^{{\ast}},y^{{\ast}},v^{{\ast}})\) is a global solution to Problem \((\mathcal{P})\) (58).□

Note, first, that the relation holds only between the global solutions of Problems \((\mathcal{P})\) and \((\mathcal{B}\mathcal{P})\); it does not hold between local solutions, or between local and global ones.

Second, it is easy to see that the feasible set of Problem \((\mathcal{P})\)–(58) is nonconvex because of the bilinear equality constraint in (58). Thus, Problem \((\mathcal{P})\)–(58) turns out to be nonconvex.

Let us denote

$$\displaystyle{ H(x,y,v):=\langle d,y\rangle -\langle A_{1}x - b,v\rangle }$$
(59)

and introduce a μ-parametric family of problems as follows:

$$\displaystyle{ \left.\begin{array}{c} (\mathcal{P}(\mu )): F_{1}(x,y,v,\mu ) = F(x,y) +\mu H(x,y,v) \downarrow \min \limits _{x,y,v}, \\ (x,y,v) \in D:=\{ (x,y,v)\mid Ax + By \leq a,\quad A_{1}x + B_{1}y \leq b, \\ d + vB_{1} \geq 0,\quad x \geq 0,\quad y \geq 0,v \geq 0\}, \end{array} \right \} }$$
(60)

where μ > 0 is a penalty parameter. If we rewrite the function H(⋅ ) in the form

$$\displaystyle\begin{array}{rcl} H(x,y,v) =\langle d + vB_{1},y\rangle -\langle A_{1}x + B_{1}y - b,v\rangle,& &{} \\ \end{array}$$
(59′)

then it becomes clear that

$$\displaystyle{ H(x,y,v) \geq 0\quad \forall \,(x,y,v) \in D. }$$
(61)
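
The equivalence of (59) and (59′), and hence the nonnegativity (61) at feasible points, can be checked numerically; a minimal sketch, with hypothetical two-dimensional data chosen to satisfy the constraints of D:

```python
def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def matvec(M, u):
    return [dot(row, u) for row in M]

def vecmat(u, M):  # u^T M, i.e. the product v B_1 in the text
    return [sum(u[i] * M[i][j] for i in range(len(M))) for j in range(len(M[0]))]

def H(x, y, v, d, A1, b):
    """H(x, y, v) = <d, y> - <A1 x - b, v>, definition (59)."""
    return dot(d, y) - dot([a - bi for a, bi in zip(matvec(A1, x), b)], v)

def H_alt(x, y, v, d, A1, B1, b):
    """Equivalent form (59'): <d + v B1, y> - <A1 x + B1 y - b, v>."""
    t1 = dot([di + wi for di, wi in zip(d, vecmat(v, B1))], y)
    resid = [a + c - bi for a, c, bi in zip(matvec(A1, x), matvec(B1, y), b)]
    return t1 - dot(resid, v)

# hypothetical feasible data: d + v B1 >= 0, y >= 0, v >= 0, A1 x + B1 y <= b
A1 = [[1.0, 0.0], [0.0, 1.0]]
B1 = [[1.0, 0.0], [0.0, 1.0]]
b, d = [2.0, 2.0], [1.0, 1.0]
x, y, v = [0.5, 0.5], [0.5, 0.5], [1.0, 1.0]
print(H(x, y, v, d, A1, b), H_alt(x, y, v, d, A1, B1, b))  # 4.0 4.0 (equal and >= 0)
```

At a feasible point the first scalar product in (59′) is nonnegative and the second is nonpositive, which is exactly the mechanism behind (61).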

Furthermore, it can readily be seen that for a fixed value of μ Problem \((\mathcal{P}(\mu ))\) is convex and quadratic with respect to the variables (x, y) and, besides, bilinear with respect to the variables x and v. So, Problem \((\mathcal{P}(\mu ))\) can be called quadratic-bilinear; in any case it remains nonconvex, with the convex feasible set D (defined in (60)) and the nonconvex objective function F 1(⋅ ). Below we will show that F 1(x, y, v, μ) is a d.c. function.

Now let (x(μ), y(μ), v(μ)) be a solution to Problem \((\mathcal{P}(\mu ))\)–(60) for a given \(\mu \in \mathbb{R}\). Further, denote H[μ] = H(x(μ), y(μ), v(μ)). Then the following relations between Problems \((\mathcal{P})\)–(58) and \((\mathcal{P}(\mu ))\)–(60) take place.

  1. (i)

    If the equality \(H[\hat{\mu }] = 0\) holds for some value \(\hat{\mu }\in \mathbb{R}\) and \((\hat{x},\hat{y},\hat{v}) = (x(\hat{\mu }),y(\hat{\mu }),v(\hat{\mu }))\) is a solution to Problem \((\mathcal{P}(\hat{\mu }))\), then the triple \((\hat{x},\hat{y},\hat{v})\) is a solution to Problem \((\mathcal{P})\).

  2. (ii)

    Moreover, for all \(\mu >\hat{\mu }\) the equality \(H[\mu ] = H(x(\mu ),y(\mu ),v(\mu )) = 0\) holds, and, in addition, (x(μ), y(μ), v(μ)) is a solution to Problem \((\mathcal{P})\).

In connection with these assertions, we have results that are more suitable for computational use.

Proposition 1.

([36]) Let (x(μ),y(μ),v(μ)) ∈ D be a \(\tau _{1}\)-solution to Problem \((\mathcal{P}(\mu ))\) and, besides,

$$\displaystyle{H(x(\mu ),y(\mu ),v(\mu )) \leq \tau _{2}.}$$

Then

  1. (i)

    y(μ) is a τ 2 -solution to the follower problem (55)–(56) with parameter x = x(μ);

  2. (ii)

    (x(μ),y(μ)) is an approximate τ 1 -solution to Problem \((\mathcal{B}\mathcal{P})\) (53)–(56) .□

The above assertions allow us to apply the global search methodology developed in Sect. 4 for solving Problem \((\mathcal{P}(\mu ))\)–(60) and, as a consequence, for finding an approximate global solution to Problem \((\mathcal{B}\mathcal{P})\)–(53)–(56).

5.3.1 Local Search

It can be readily seen that Problem \((\mathcal{P}(\mu ))\) can be rewritten in the following form

$$\displaystyle\begin{array}{rcl} (\mathcal{P}(\mu )): F_{1}(x,y,v):= \frac{1} {2}\langle x,Cx\rangle +\langle c,x\rangle + \frac{1} {2}\langle y,C_{1}y\rangle +\langle c_{1},y\rangle & & \\ +\mu {\bigl [\langle d,y\rangle -\langle A_{1}x - b,v\rangle \bigr ]} \downarrow \min \limits _{x,y,v},& &{}\end{array}$$
(62)
$$\displaystyle\begin{array}{rcl} (x,y) \in Z:=\{ (x,y)\mid Ax + By \leq a,\quad A_{1}x + B_{1}y \leq b,\quad x \geq 0,\quad y \geq 0\},& &{}\end{array}$$
(63)
$$\displaystyle\begin{array}{rcl} v \in V:=\{ v\mid d + vB_{1} \geq 0,\quad v \geq 0\}.& &{}\end{array}$$
(64)

On account of the assumptions \((\mathcal{H})\), it is easy to see that the cost function F 1(⋅ ) is bounded from below on the set D = Z × V. Further, the statement (62)–(64) of Problem \((\mathcal{P}(\mu ))\) suggests a local search idea consisting in solving problem (62)–(64) consecutively with respect to groups of variables: in the case of (62)–(64), first with respect to the pair (x, y) and then with respect to the variable v, or in the reverse order.

Note that Problem \((\mathcal{P}(\mu ))\) with a fixed value of the variable v becomes a convex quadratic optimization problem. On the other hand, for a fixed pair (x, y) we obtain a linear programming (LP) problem with respect to v. So, these auxiliary problems can be solved by standard software packages (CPLEX, Xpress, etc.).

Therefore, we can perform a local search as it was done for the bimatrix games.

Given some starting point v 0 ∈ V, we describe a so-called V -procedure as follows:

Step 0. :

Set s := 0, \(v^{s}:= v^{0}\).

Step 1. :

Find a \(\frac{\rho _{s}} {2}\)-solution \((x^{s+1},y^{s+1})\) of the problem \(F_{1}(x,y,v^{s}) \downarrow \min \limits _{(x,y)},\ (x,y) \in Z\),

so that the following inequality holds

$$\displaystyle{ F_{1}(x^{s+1},y^{s+1},v^{s}) \leq \inf \limits _{ (x,y)}\{F_{1}(x,y,v^{s})\mid (x,y) \in Z\} + \frac{\rho _{s}} {2}. }$$
(65)
Step 2. :

Find a \(\frac{\rho _{s}} {2}\)-solution \(v^{s+1}\) of the LP problem \(F_{1}(x^{s+1},y^{s+1},v) \downarrow \min \limits _{v},\ v \in V\),

so that the following inequality is satisfied

$$\displaystyle{ F_{1}(x^{s+1},y^{s+1},v^{s+1}) \leq \inf \limits _{ v}\{F_{1}(x^{s+1},y^{s+1},v)\mid v \in V \} + \frac{\rho _{s}} {2}. }$$
(66)
Step 3. :

Set \(s:= s + 1\) and loop to Step 1.

Under the condition

$$\displaystyle{\rho _{s} > 0,\quad s = 0,1,2,\ldots,\quad \sum \limits _{s=0}^{\infty }\rho _{ s} < +\infty,}$$

we can prove, as in the case of bimatrix games, that the numerical sequence \(\{F_{1s} = F_{1}(x^{s},y^{s},v^{s})\}\) generated by the V-procedure above converges.

Moreover, if \((x^{s},y^{s},v^{s}) \rightarrow (\hat{x},\hat{y},\hat{v})\), then the point \((\hat{x},\hat{y},\hat{v})\) turns out to be a critical point of Problem \((\mathcal{P}(\mu ))\)–(62)–(64) [35] or partially global solution to \((\mathcal{P}(\mu ))\), i.e.

$$\displaystyle{ \left.\begin{array}{c} F_{1}(\hat{x},\hat{y},\hat{v}) \leq F_{1}(x,y,\hat{v})\quad \forall \,(x,y) \in Z, \\ F_{1}(\hat{x},\hat{y},\hat{v}) \leq F_{1}(\hat{x},\hat{y},v)\quad \forall \,v \in V. \end{array} \right \} }$$
(67)

Note that if a point \((\bar{x},\bar{y},\bar{v})\) is a local solution to Problem \((\mathcal{P}(\mu ))\)–(62)–(64), then \((\bar{x},\bar{y},\bar{v})\) turns out to be a critical point of Problem \((\mathcal{P}(\mu ))\). Thus, the notion of a critical point just introduced is really substantiated by, and connected with, the common notion of a local solution. The so-called XY-procedure, similar to the V-procedure but starting at a point (x 0, y 0) ∈ Z, has also been studied and substantiated.
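
The alternating scheme underlying the V- and XY-procedures can be sketched generically; for illustration the QP and LP subproblems are replaced by exact closed-form minimizations of a toy bilinear function on [0, 1]² (all data below are hypothetical, not the actual bilevel objective):

```python
def v_procedure(solve_xy, solve_v, v0, F1, max_iter=100, tol=1e-10):
    """Generic alternating scheme of the V-procedure: minimize over (x, y)
    with v fixed, then over v with (x, y) fixed, until F1 stops decreasing."""
    v = v0
    xy = solve_xy(v)
    best = F1(xy, v)
    for _ in range(max_iter):
        v_new = solve_v(xy)
        xy_new = solve_xy(v_new)
        val = F1(xy_new, v_new)
        if best - val < tol:  # no further decrease: (xy, v) is (approximately) critical
            break
        xy, v, best = xy_new, v_new, val
    return xy, v, best

# toy bilinear objective F1(x, v) = x^2 - x*v on [0, 1] x [0, 1];
# both partial minimizations are available in closed form
F1 = lambda x, v: x * x - x * v
solve_xy = lambda v: min(max(v / 2.0, 0.0), 1.0)   # argmin over x in [0, 1]
solve_v = lambda x: 1.0 if x > 0 else 0.0          # argmin of -x*v over v in [0, 1]

x_hat, v_hat, f_hat = v_procedure(solve_xy, solve_v, v0=1.0, F1=F1)
print(x_hat, v_hat, f_hat)  # 0.5 1.0 -0.25
```

The returned point satisfies both inequalities of (67) for this toy function, i.e. it is partially global in each group of variables; starting from v0 = 0 the same scheme stops at the other critical point (0, 0), illustrating the dependence on the starting point that motivates the global search.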

In order to test the developed LSM, a rather large set of test problems of the form \((\mathcal{B}\mathcal{P})\)–(53)–(56) was constructed with the help of the idea of Calamai and Vicente [5], which provides bilevel problems with known properties and known local and global solutions (even whose numbers are known).

Now, a few words about the numerical testing of LSM.

First, the computational simulation was threefold:

  1. (a)

    to choose a suitable value of the penalty parameter μ that provides for the equality H(x(μ), y(μ), v(μ)) = 0 (the exact penalty [3, 22, 40]);

  2. (b)

    to find starting points suitable for Global Search, i.e. from which the LSM was not able to reach a global solution;

  3. (c)

    and finally, to compare two versions of LSM (with V - or XY -procedures).

Analyzing the testing results, we concluded that the computational time was rather short (less than 0.1 s) when the stopping criterion was satisfied with the accuracy \(\tau = 10^{-4}\).

Furthermore, the value μ = 10 of the penalty parameter turned out to be sufficient to reach the equality H(x(μ), y(μ), v(μ)) = 0 at a τ-critical point (x(μ), y(μ), v(μ)). The targets (b) and (c) have also been reached.

Moreover, the high convergence rate of the XY- and V-procedures on the considered series of randomly generated problems should be specially noted: only two iterations (starting from an arbitrary feasible point) were needed to reach a critical point. So, the results of the computational testing of the LSM were rather promising [35, 37].

5.3.2 Global Search

Let us repeat that the numerical test results showed that the special LSMs (the V- and XY-procedures) do not, in general, yield a global solution, even in problems of small size.

According to the methodology of Sect. 4, first we need to derive an explicit d.c. decomposition (if possible) of the cost function of the problem under scrutiny.

It is not hard to see that the goal function F 1(x, y, v) of the problem \((\mathcal{P}(\mu ))\) can be represented as a difference of two convex functions, for instance, as follows:

$$\displaystyle{ F_{1}(x,y,v) = g(x,y,v) - h(x,v), }$$
(68)

where

$$\displaystyle{ \left.\begin{array}{c} g(x,y,v) = \frac{1} {2}\langle x,Cx\rangle +\langle c,x\rangle + \frac{1} {2}\langle y,C_{1}y\rangle +\langle c_{1},y\rangle \\ +\mu {\bigl (\langle v,b\rangle +\langle y,d\rangle + \frac{1} {4}\|v - A_{1}x\|^{2}\bigr )}, \\ h(x,v) = \frac{\mu } {4}\|v + A_{1}x\|^{2} \end{array} \right \} }$$
(69)

are convex functions. Note that this d.c. decomposition differs from the ones used in [35–37].
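
The decomposition (68)–(69) rests on the identity ‖v − A₁x‖² − ‖v + A₁x‖² = −4⟨v, A₁x⟩, which collapses the difference of the two squared norms to the bilinear part of μH. A numerical sanity check in Python (μ and all matrices/vectors are hypothetical illustration values):

```python
def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def matvec(M, u):
    return [dot(row, u) for row in M]

def quad(z, M):  # (1/2) <z, M z>
    return 0.5 * dot(z, matvec(M, z))

def norm2(u):
    return dot(u, u)

def F1(x, y, v):
    """Objective of (P(mu)) in the form (62)."""
    return (quad(x, C) + dot(c, x) + quad(y, C1) + dot(c1, y)
            + mu * (dot(d, y) - dot([a - bi for a, bi in zip(matvec(A1, x), b)], v)))

def g(x, y, v):
    """Convex component of the d.c. decomposition (69)."""
    A1x = matvec(A1, x)
    return (quad(x, C) + dot(c, x) + quad(y, C1) + dot(c1, y)
            + mu * (dot(v, b) + dot(y, d)
                    + 0.25 * norm2([vi - ai for vi, ai in zip(v, A1x)])))

def h(x, v):
    """Subtracted convex component: h(x, v) = (mu/4)||v + A1 x||^2."""
    A1x = matvec(A1, x)
    return (mu / 4.0) * norm2([vi + ai for vi, ai in zip(v, A1x)])

# hypothetical illustration data
C = [[1.0, 0.0], [0.0, 1.0]]
C1 = [[1.0, 0.0], [0.0, 1.0]]
c, c1, d, b = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 1.0]
A1 = [[1.0, 2.0], [0.0, 1.0]]
mu = 10.0
x, y, v = [0.3, 0.7], [0.2, 0.5], [0.4, 0.6]
print(abs(F1(x, y, v) - (g(x, y, v) - h(x, v))) < 1e-9)  # True: F1 = g - h
```

Both g and h are convex (sums of convex quadratics and linear terms), so (68) is indeed a d.c. representation.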

As noted above, the procedures for escaping critical points are based on the GOC of Theorem 2 (see (25)) and employ the constructive (algorithmic) property of the GOC. In the case of Problem \((\mathcal{P}(\mu ))\) these GOCs take the following form:

$$\displaystyle\begin{array}{rcl} (x_{{\ast}},y_{{\ast}},v_{{\ast}}) \in Sol(\mathcal{P}(\mu )),\quad \zeta:= F_{1}(x_{{\ast}},y_{{\ast}},v_{{\ast}})\Longrightarrow& & \\ \forall (z,w,\gamma ) \in \mathbb{R}^{m+q+1}: \;\;h(z,w) =\gamma -\zeta,& &{}\end{array}$$
(70)
$$\displaystyle\begin{array}{rcl} g(x,y,v)-\gamma \geq \langle \nabla _{xv}h(z,w),(x,v) - (z,w)\rangle \quad \forall (x,y,v) \in D.& &{}\end{array}$$
(71)

Besides, if for some \((\hat{z},\hat{w},\hat{\gamma })\) in (70) and \((\hat{x},\hat{y},\hat{v}) \in D\)

$$\displaystyle{g(\hat{x},\hat{y},\hat{v}) <\hat{\gamma } +\langle \nabla _{xv}h(\hat{z},\hat{w}),(\hat{x},\hat{v}) - (\hat{z},\hat{w})\rangle,}$$

i.e. the inequality (71) is violated, then due to the convexity of h(⋅ ) it follows that

$$\displaystyle{F_{1}(\hat{x},\hat{y},\hat{v}) < F_{1}(x_{{\ast}},y_{{\ast}},v_{{\ast}}).}$$

In other words, \((\hat{x},\hat{y},\hat{v}) \in D\) is “better” than \((x_{{\ast}},y_{{\ast}},v_{{\ast}})\).

Similarly to Sect. 4, and according to the methodology presented in Sect. 3, Problem \((\mathcal{P}(\mu ))\) is decomposed into several simpler problems as follows:

  1. (a)

    one-dimensional search along the variable γ;

  2. (b)

    constructing the level surface approximation of the convex function h(x, v), which, in contrast to our earlier papers [35–37], does not depend on y. It is clear that in this case the approximations must be easier to construct.

On the other hand, we have to pay attention to the fact that, in view of the different d.c. representation (68)–(69), the global search has to be modified and differs from that of [35, 36].

Assume we are given a point \((x_{0},y_{0},v_{0}) \in \mathbb{R}^{m+n+q}\), numerical sequences \(\{\tau _{k}\}\), \(\{\delta _{k}\}\), \(\tau _{k},\delta _{k} > 0\), \(k = 0,1,\ldots\), \(\tau _{k} \downarrow 0\), \(\delta _{k} \downarrow 0\ (k \rightarrow \infty )\), numbers \(\gamma _{-}\approx \inf \limits _{(x,y,v)}(g,D)\) and \(\gamma _{+} \approx \sup \limits _{(x,y,v)}(g,D)\), an algorithm parameter M, and a set of directions of the form

$$\displaystyle{Dir = \left \{(a^{l},c^{l}) \in \mathbb{R}^{m+q}\;\vert \;(a^{l},c^{l})\neq 0,\;\;l = 1,\ldots,N\right \}.}$$

The GS algorithm used here can be represented as follows:

Step 0. :

Set \(k:= 0,\;\;(\bar{x}^{k},\bar{y}^{k},\bar{v}^{k}):= (x_{0},y_{0},v_{0}),\;\;l:= 1,\;\gamma:=\gamma _{-},\;\varDelta \gamma:= (\gamma _{+} -\gamma _{-})/M.\)

Step 1. :

Starting at the point \((\bar{x}^{k},\bar{y}^{k},\bar{v}^{k})\) construct a τ k -critical point \((x_{k},y_{k},v_{k}) \in D\) in Problem (\(\mathcal{P}(\mu ))\) by applying V - or XY -procedure. Set \(\zeta _{k}:= F_{1}(x_{k},y_{k},v_{k})\).

Step 2. :

Given a point (a l, c l) ∈ Dir, construct a point (z l, w l) such that \(h(z^{l},w^{l}) =\gamma -\zeta _{k}.\)

Step 3. :

Solve the linearized problem as follows:

$$\displaystyle{(\mathcal{P}\mathcal{L}_{l}):\;\;\;\; g(x,y,v) -\langle \nabla _{xv}h(z^{l},w^{l}),(x,v)\rangle \downarrow \min _{ (x,y,v)},\;\;(x,y,v) \in D.}$$

Let the point \((\hat{x},\hat{y},\hat{v})\) be a solution to \((\mathcal{P}\mathcal{L}_{l})\).

Step 4. :

Starting at the point \((\hat{x},\hat{y},\hat{v})\) construct a δ k -critical point \((\hat{x}_{l},\hat{y}_{l},\hat{v}_{l})\).

Step 5. :

If \(F_{1}(\hat{x}_{l},\hat{y}_{l},\hat{v}_{l}) <\zeta _{k}\stackrel{\bigtriangleup }{=}F_{1}(x_{k},y_{k},v_{k})\), then set \((\bar{x}^{k+1},\bar{y}^{k+1},\bar{v}^{k+1}):= (\hat{x}_{l},\hat{y}_{l},\hat{v}_{l})\), \(k:= k + 1\), l: = 1, \(\gamma:=\gamma _{-}\) and loop to Step 1.

Step 6. :

If \(F_{1}(\hat{x}_{l},\hat{y}_{l},\hat{v}_{l}) \geq \zeta _{k}\) and l < N, then set \(l:= l + 1\) and return to Step 2.

Step 7. :

If \(F_{1}(\hat{x}_{l},\hat{y}_{l},\hat{v}_{l}) \geq \zeta _{k}\) and l = N, then set \(\gamma:=\gamma +\varDelta \gamma,l:= 1\) and come back to Step 2.

Step 8. :

If \(l = N,\;\;F_{1}(\hat{x}_{l},\hat{y}_{l},\hat{v}_{l}) \geq \zeta _{k}\;\;\forall \gamma \in [\gamma _{-},\gamma _{+}]\) (i.e., one-dimensional search along γ over the interval \([\gamma _{-},\gamma _{+}]\) is terminated), then STOP; \((x_{k},y_{k},v_{k})\) is a critical point provided by Algorithm of global search.

Remark 1.

It is clear that different values of the parameter M are responsible for the partitioning of the interval \([\gamma _{-},\gamma _{+}]\) into a suitable number of parts to implement a passive one-dimensional search along γ. On the other hand, it is necessary to specify how to construct a set of directions Dir and, furthermore, an approximation of the level surface \(h(z,w) =\gamma -\zeta _{k}\).

Taking into account that, in contrast to the earlier papers [35–37], here due to (69)

$$\displaystyle\begin{array}{rcl} h(x,v)\stackrel{\bigtriangleup }{=} \frac{\mu } {4}\|v + A_{1}x\|^{2},& &{} \\ \end{array}$$
(69′)

we have \(h \geq 0\), we must choose \(\gamma \geq \zeta _{k}\), so that \(\gamma _{-}\) can always be chosen as \(\gamma _{-}:=\zeta _{k}\). Other aspects of the implementation of the algorithm are similar to [35–37].

Let us focus now on the construction of an approximation of the level surface

$$\displaystyle{U(\gamma ) =\{ (z,w)\vert \ h(z,w) =\gamma -\zeta _{k}\}.}$$

Recall that, on the one hand, such an approximation should be representative enough to allow escaping a critical point (if possible). On the other hand, if we are rather far from a global solution, then the approximation must allow us to "jump out" of the critical point where we are.

Let us show how to construct an approximation. Given a set of directions

$$\displaystyle{Dir =\{ (a^{l},c^{l}) \in \mathbb{R}^{m+q}\vert (a^{l},c^{l})\neq 0,\,l = 1,\ldots,N\},}$$

we construct a point of an approximation \(\mathcal{A}_{n}\) in a rather simple manner as follows:

$$\displaystyle{ (z^{l},w^{l}) =\lambda _{ l}(a^{l},c^{l}),\ h(z^{l},w^{l}) =\gamma -\zeta _{ k},l = 1,\ldots,N. }$$
(72)

Due to (69 ) the corresponding equation

$$\displaystyle\begin{array}{rcl} \frac{\mu } {4}\|c^{l} + A_{ 1}a^{l}\|^{2}\lambda _{ l}^{2} =\gamma -\zeta _{ k}& &{} \\ \end{array}$$
(72′)

makes the computation of \(\lambda _{l}\) very simple.
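
A sketch of this computation in Python (μ, A₁, and the direction are hypothetical illustration data):

```python
import math

def h(x, v, A1, mu):
    """h(x, v) = (mu/4) ||v + A1 x||^2, see (69')."""
    w = [vi + sum(A1[i][j] * x[j] for j in range(len(x))) for i, vi in enumerate(v)]
    return (mu / 4.0) * sum(wi * wi for wi in w)

def lam_for_direction(a, c, A1, mu, gamma, zeta):
    """Solve (mu/4) ||c + A1 a||^2 lam^2 = gamma - zeta, equation (72'), for lam >= 0."""
    w = [ci + sum(A1[i][j] * a[j] for j in range(len(a))) for i, ci in enumerate(c)]
    return math.sqrt(4.0 * (gamma - zeta) / (mu * sum(wi * wi for wi in w)))

# hypothetical data
mu = 10.0
A1 = [[1.0, 0.0], [0.0, 1.0]]
a, c = [1.0, 0.0], [0.0, 1.0]
lam = lam_for_direction(a, c, A1, mu, gamma=6.0, zeta=1.0)
z = [lam * ai for ai in a]
w = [lam * ci for ci in c]
print(abs(h(z, w, A1, mu) - 5.0) < 1e-12)  # True: (z, w) lies on the level surface (72)
```

Each approximation point thus costs one matrix-vector product and one square root, which is what makes the construction cheap even for many directions.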

As the sets of directions, one can consider, for example, the set

$$\displaystyle{Dir1 =\{ (x^{k} + e^{i},v^{k} + e^{j}),(x^{k} - e^{i},v^{k} - e^{j})\vert \ i = 1,\ldots,m,j = 1,\ldots,q\}}$$

where (x k, v k) is the corresponding part of the current critical point \((x^{k},y^{k},v^{k})\), and \(e^{i} \in \mathbb{R}^{m}\), \(e^{j} \in \mathbb{R}^{q}\) are the Euclidean basis vectors. Further, the following set has also been employed:

$$\displaystyle{Dir0 =\{ (a^{i},b^{j})\vert (a^{i},b^{j})\neq 0,\ i = 1,\ldots,m,j = 1,\ldots,q\}.}$$

where a i and b j are, respectively, the rows and columns of the matrix A 1, which generates the nonconvexity in the goal function of Problem \((\mathcal{P}(\mu ))\)–(60).

Note that the number of points in the approximations constructed is equal to 2qm and grows rapidly with the dimension. Therefore, we also reduced the approximations as described in [35, 36]. The first stages of the computational testing of the developed GSA have been presented in [36, 37].

Here we will show the preliminary results of further computational experiments with the improved GSA described above (see Tables 2 and 3).

In particular, Table 2 compares the results of the computational solution of the test problems (generated, as above, with the help of the methodology from [5]) by the GSA described above and by the popular applied software package KNITRO (www.ziena.com/knitro.htm).

Since KNITRO is not able to solve bilevel problems directly, the equivalent formulation \((\mathcal{P}(\mu ))\) (with μ = 10, 15, 20) was used for this purpose.

On the other hand, the GSA was run on a computer with an Intel Core 2 Duo 2.0 GHz processor, while KNITRO used a computer with a more powerful Intel Core 2 Quad 2.8 GHz processor. In Table 2, F stands for the known optimal value of each test problem, F Kms and T are the best value of the goal function and the corresponding solution time provided by KNITRO, while F XY , F V , and T stand for the best values of the cost function and the solution times obtained by the GSA (using the XY- or V-procedure as the LSM). The bold values in Table 2 denote the successful cases, when the known global solutions of the test problems were reached by the algorithms used.

Analyzing the results of Table 2, it is easy to see that KNITRO (with multistart) succeeded in finding the global solutions in only 61% of the test problems of middle dimension with the accuracy \(\varepsilon = 10^{-2}\). Meanwhile, applying the programs implementing the GSA, all the considered test problems were solved to the same precision.

Table 2 Comparison of global search algorithm (GSA) with KNITRO
Table 3 Testing of global search algorithm (GSA) on problems of high dimension

Moreover, it is not hard to see the big difference in solution time between the GSA and KNITRO for problems of middle dimension greater than 10. For example, for \(m = n = 30\), KNITRO worked for about 1.5–2 h without reaching a global solution, while the GSA provided a global solution in approximately 2 min.

Now let us look at Table 3, which presents the results of the computational solution of test problems of high dimension (up to \(m = n = 200\)), obtained by the software program run on a computer with an Intel Core i5-2400 3.1 GHz processor. In Table 3, N is the number of test problems in a series, LocSol avg is the average number of local solutions that are not global in one problem of the series (a very important difficulty index of the problem), Loc avg stands for the average number of invocations of the LSM in the course of the GSA, St avg is the average number of iterations of the GSA, or critical points passed by the GSA, and T avg is the average running time of the program implementing the GSA.

From the results of the computational testing we can see, first, that all 2,350 randomly generated problems were successfully solved: despite the enormous difficulty of the test problems (\(m + n = 300\) with LocSol avg = \(3.93 \cdot 10^{44}\), \(m + n = 400\) with LocSol avg = \(8.64 \cdot 10^{59}\)), the GSA found a global solution in every considered test problem.

However, this version of the program is characterized by a rapid increase in computing time with the growth of the dimension: for \(m + n = 400\) it takes more than 9 h. On the other hand, this can be explained by the number Loc avg of local search applications, which exceeds 310,000.

Moreover, it is not hard to note that the number St avg of approximately critical points at which an improvement of the cost function occurred turned out to be rather moderate with respect to the number LocSol avg of local solutions (different from the global ones), which varies from rather big (\(m + n = 60\), LocSol avg = \(1.34 \cdot 10^{8}\)) to incalculable (\(m + n = 300\), LocSol avg = \(3.93 \cdot 10^{44}\); \(m + n = 400\), LocSol avg = \(8.64 \cdot 10^{59}\)).

So, we conclude that the new results of the computational solution of the bilevel problem can be viewed as rather promising and competitive. Moreover, at present we have not been able to find published results for similar problems of such dimensions in the existing literature.

6 Concluding Remarks

In the present paper, new procedures of finding the solution to the linear complementarity problem with indefinite matrices, the Nash equilibrium in bimatrix games, and optimistic solution in quadratic-linear bilevel optimization problems have been proposed, discussed, and illustrated.

Further, a new approach based on GOCs, LSMs, and GSMs was applied in order to solve all three problems. In addition, new computational results were presented in the paper. According to these results, the new approach has shown itself to be rather promising and competitive.