Abstract
A canonical d.c. (difference of canonical and convex functions) programming problem is proposed, which can be used to model general global optimization problems in complex systems. It is shown that, by using canonical duality theory, a large class of nonconvex minimization problems can be equivalently converted to a unified concave maximization problem over a convex domain, which can be solved easily under certain conditions. Additionally, a detailed proof of the triality theory is provided, which can be used to identify local extremal solutions. Applications are illustrated and open problems are presented.
1 Mathematical Modeling and Objectivity
It is known that in Euclidean space every continuous global optimization problem on a compact set can be reformulated as a d.c. optimization problem, i.e., a nonconvex problem which can be described in terms of d.c. functions (difference of convex functions) and d.c. sets (difference of convex sets) [19]. Since any constraint set can be equivalently relaxed by a nonsmooth indicator function, general nonconvex optimization problems can be written in the following standard d.c. programming form
$$\begin{aligned} \min \left\{ f(x) = g(x) - h(x) \;|\; \forall x \in \mathcal{X} \right\} , \end{aligned}$$(1)
where \(\mathcal{X} = \mathbb {R}^n\), g(x), h(x) are convex proper lower-semicontinuous functions on \(\mathbb {R}^n\), and the d.c. function f(x) to be optimized is usually called the “objective function” in mathematical optimization. A more general model is that g(x) can be an arbitrary function [19]. Clearly, this d.c. programming problem is artificial. Although it can be used to “model” a very wide range of mathematical problems [15] and has been studied extensively during the last thirty years (cf. [16, 18]), it comes at a price: it is impossible to have an elegant theory and powerful algorithms for solving this problem without detailed structures on these arbitrarily given functions. As a result, even some very simple d.c. programming problems are considered to be NP-hard. This dilemma is mainly due to the existing gap between mathematical optimization and mathematical physics.
Real-world applications show a simple fact: the functions g(x) and h(x) in the standard d.c. programming problem (1) cannot be arbitrarily given; they must obey certain fundamental laws of physics in order to model real-world systems. In Lagrange mechanics and continuum physics, a real-valued function \(W:\mathscr {Y}\rightarrow \mathbb {R}\) is said to be objective if and only if (see [6], Chap. 6)
$$\begin{aligned} W(R y) = W(y) \quad \forall y \in \mathscr {Y}, \;\; \forall R \in \mathcal{R}, \end{aligned}$$(2)
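where \(\mathcal{R}\) is a special rotation group such that \(R^{-1} = R^T, \;\; \det R = 1, \;\; \forall R \in \mathcal{R}\). Based on the original concept of objectivity, a general multi-scale mathematical model was proposed by Gao in [6]:
$$\begin{aligned} (\mathscr {P}): \;\; \inf \left\{ \varPi (x) = W(Dx) - F(x) \;|\; \forall x \in \mathscr {X} \right\} , \end{aligned}$$(3)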
where \(D: \mathcal{X} \rightarrow \mathcal{Y} \) is a linear operator; \(W: \mathcal{Y} \rightarrow \mathbb {R}\cup \{ + \infty \}\) is an objective function on its effective domain \(\mathscr {Y}_a \subset \mathscr {Y}\), in which certain physical constraints (such as constitutive laws, etc.) are given; correspondingly, \(F: \mathcal{X} \rightarrow \mathbb {R}\cup \{ -\infty \}\) is a so-called subjective function, which must be linear on its effective domain \(\mathscr {X}_a \subset \mathscr {X}\), wherein certain “geometrical constraints” (such as boundary/initial conditions, etc.) are given. By the Riesz representation theorem, the subjective function can be written as \(F(x) = \langle x, \bar{x}^* \rangle \), where \(\bar{x}^* \in \mathscr {X}^*\) is a given input (or source), and the bilinear form \(\langle x, x^* \rangle :\mathscr {X}\times \mathscr {X}^* \rightarrow \mathbb {R}\) puts \(\mathscr {X}\) and \(\mathscr {X}^*\) in duality. Therefore, with \(D^*\) denoting the adjoint of D, the extremality condition \(0 \in \partial \varPi (x)\) leads to the equilibrium equation [6]
$$\begin{aligned} D^* \partial W(Dx) \ni \bar{x}^* . \end{aligned}$$(4)
In this model, the objective duality relation \(y^* \in \partial W(y)\) is governed by the constitutive law, which depends on mathematical modeling of the system; the subjective duality relation \(x^* \in \partial F(x)\) leads to the input \(\bar{x}^* \) of the system, which depends only on each given problem. Thus, the problem \((\mathscr {P})\) can be used to model general real-world applications.
Canonical duality-triality is a breakthrough theory which can be used not only for modeling complex systems within a unified framework, but also for solving real-world problems with a unified methodology. This theory was developed originally from Gao and Strang’s work in nonconvex mechanics [11] and has been applied successfully for solving a large class of challenging problems in both nonconvex analysis/mechanics and global optimization, such as phase transitions in solids [12], post-buckling of large deformed beams [17], nonconvex polynomial minimization problems with box and integer constraints [8, 10, 13], Boolean and multiple integer programming [3, 20], fractional programming [4], mixed integer programming [14], polynomial optimization [9], and high-order polynomial problems with log-sum-exp terms [1].
The goal of this paper is to apply the canonical duality theory for solving the challenging d.c. programming problem (1). The rest of this paper is arranged as follows. Based on the concept of objectivity, a canonical d.c. optimization problem and its canonical dual are formulated in the next section. Analytical solutions and triality theory for a general d.c. minimization problem with sum of nonconvex polynomial and exponential functions are discussed in Sects. 3 and 4. Four special examples are illustrated in Sect. 5. Some conclusions and future work are given in Sect. 6.
2 Canonical D.C. Problem and Its Canonical Dual
It is known that the linear operator \(D:\mathscr {X}\rightarrow \mathscr {Y}\) cannot change the nonconvex W(Dx) to a convex function. According to the definition of objectivity, a nonconvex function \(W:\mathscr {Y}\rightarrow \mathbb {R}\) is objective if and only if there exists a function \(V:\mathscr {Y}\times \mathscr {Y}\rightarrow \mathbb {R}\) such that \(W(y) = V(y^T y)\). Based on this fact, a canonical transformation was proposed by Gao in 2000 [7].
Definition 1
(Canonical Transformation and Canonical Measure).
For a given nonconvex function \(g:\mathscr {X}\rightarrow \mathbb {R}\cup \{\infty \}\), if there exists a nonlinear mapping \(\varLambda :\mathscr {X}\rightarrow \mathscr {E}\) and a convex, l.s.c. function \(V:\mathscr {E}\rightarrow \mathbb {R}\cup \{ \infty \} \) such that
$$\begin{aligned} g(x) = V(\varLambda (x)), \end{aligned}$$(5)
then the nonlinear transformation (5) is called the canonical transformation and \(\xi = \varLambda (x) \) is called a canonical measure.
The canonical measure \(\xi = \varLambda (x)\) is also called the geometrically admissible measure in the canonical duality theory [7], and it is not necessarily objective. However, the simplest canonical measure in \(\mathbb {R}^n\) is the quadratic function \(\xi = x^T x\), which is clearly objective. Therefore, the canonical function can be viewed as a generalized objective function.
According to the canonical duality theory, the subjective function \(F(x) = \langle x , \bar{x}^* \rangle \) is necessary for any given real-world system in order to have non-trivial solutions (states or outputs). Since the function g(x) in the standard d.c. programming (1) could be nonconvex, it is reasonable to assume the convex function h(x) in (1) is a quadratic function
$$\begin{aligned} Q(x) = \frac{1}{2} \langle x , C x \rangle + \langle x , f \rangle , \end{aligned}$$
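where \(C: \mathscr {X}\rightarrow \mathscr {X}^*\) is a given symmetric positive definite operator (or matrix) and \(f\in \mathscr {X}^*\) is a given input. Thus, a canonical d.c. (CDC for short) minimization problem can be proposed as follows:
$$\begin{aligned} (CDC): \;\; \min \left\{ \varPi (x) = V(\varLambda (x)) - Q(x) \;|\; \forall x \in \mathscr {X} \right\} . \end{aligned}$$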
Since the canonical measure \(\xi = \varLambda (x) \in \mathscr {E}\) is nonlinear and \(V(\xi )\) is convex on \(\mathscr {E}\), the composition \(V(\varLambda (x))\) has a higher order nonlinearity than Q(x). Therefore, the coercivity of the target function \(\varPi (x)\) should naturally be satisfied, i.e.,
$$\begin{aligned} \lim _{\Vert x \Vert \rightarrow \infty } \varPi (x) = \infty , \end{aligned}$$
which is a necessary condition for the existence of the global minimal solution to (CDC). Clearly, this generalized d.c. minimization problem can be used to model a reasonably large class of real-world systems.
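Since \(V(\xi )\) is convex, l.s.c. on \(\mathscr {E}\), its conjugate can be uniquely defined by the Fenchel transformation
$$\begin{aligned} V^*(\xi ^*) = \sup \left\{ \langle \xi ; \xi ^* \rangle - V(\xi ) \;|\; \xi \in \mathscr {E} \right\} . \end{aligned}$$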
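The bilinear form \(\langle \xi ; \xi ^* \rangle \) puts \(\mathscr {E}\) and \(\mathscr {E}^*\) in duality. According to convex analysis (cf. [2]), \(V^*:\mathscr {E}^* \rightarrow \mathbb {R}\cup \{ + \infty \}\) is also convex, l.s.c. on its domain \(\mathscr {E}^*\) and the following generalized canonical duality relations [7] hold on \(\mathscr {E}\times \mathscr {E}^*\)
$$\begin{aligned} \xi ^* \in \partial V(\xi ) \;\; \Leftrightarrow \;\; \xi \in \partial V^*(\xi ^*) \;\; \Leftrightarrow \;\; V(\xi ) + V^*(\xi ^*) = \langle \xi ; \xi ^* \rangle . \end{aligned}$$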
Replacing \( V(\varLambda (x)) \) in the target function \(\varPi (x)\) by the Fenchel-Young equality \(V(\xi ) = \langle \xi ; \xi ^* \rangle - V^*(\xi ^*)\), Gao and Strang’s total complementary function (see [7]) \(\varXi : \mathscr {X}\times \mathscr {E}^* \rightarrow \mathbb {R}\cup \{ - \infty \}\) for this (CDC) can be obtained as
$$\begin{aligned} \varXi (x, \xi ^*) = \langle \varLambda (x) ; \xi ^* \rangle - V^*(\xi ^*) - Q(x). \end{aligned}$$
By this total complementary function, the canonical dual of \(\varPi (x)\) can be obtained as
$$\begin{aligned} \varPi ^d(\xi ^*) = Q^\varLambda (\xi ^*) - V^*(\xi ^*), \end{aligned}$$
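where \(Q^\varLambda :\mathscr {E}^* \rightarrow \mathbb {R}\cup \{ - \infty \}\) is the so-called \(\varLambda \)-conjugate of Q(x) defined by (see [7])
$$\begin{aligned} Q^\varLambda (\xi ^*) = \mathrm{sta} \left\{ \langle \varLambda (x) ; \xi ^* \rangle - Q(x) \;|\; x \in \mathscr {X} \right\} . \end{aligned}$$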
If this \(\varLambda \)-conjugate has a non-empty effective domain, the following canonical duality
holds under certain conditions, which will be illustrated in the next section.
3 Application and Analytical Solution
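Let us consider a special application in \(\mathbb {R}^n\) such that
$$\begin{aligned} g(x) = \sum _{i=1}^p \exp \left( \frac{1}{2} x^T A_i x - \alpha _i \right) + \sum _{j=1}^r \frac{1}{2} \left( \frac{1}{2} x^T B_j x - \beta _j \right) ^2 , \end{aligned}$$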
where \( \{ A_i \}_{i=1}^p \in \mathbb {R}^{n\times n}\) are symmetric matrices and \(\{B_j \}_{j=1}^r \in \mathbb {R}^{n\times n}\) are symmetric positive definite matrices, \(\alpha _i\) and \(\beta _j\) are real numbers. Clearly, \(g:\mathbb {R}^n \rightarrow \mathbb {R}\) is nonconvex and highly nonlinear. This type of nonconvex function covers many real applications.
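The canonical measure in this application can be given as
$$\begin{aligned} \xi = \varLambda (x) = (\theta , \eta ), \quad \theta _i = \frac{1}{2} x^T A_i x \;\; (i=1,\ldots ,p), \quad \eta _j = \frac{1}{2} x^T B_j x \;\; (j=1,\ldots ,r), \quad \varLambda : \mathbb {R}^n \rightarrow \mathscr {E}_a \subset \mathbb {R}^m , \end{aligned}$$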
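where \(m=p+r\). Therefore, a canonical function can be defined on \(\mathscr {E}_a\):
$$\begin{aligned} V(\xi ) = V_1(\theta ) + V_2(\eta ), \end{aligned}$$
where
$$\begin{aligned} V_1(\theta ) = \sum _{i=1}^p \exp (\theta _i - \alpha _i), \quad V_2(\eta ) = \sum _{j=1}^r \frac{1}{2} (\eta _j - \beta _j)^2 . \end{aligned}$$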
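Here \(\theta _i\) and \(\eta _j\) denote the ith component of \(\theta \) and the jth component of \(\eta \), respectively. Since \(V_1(\theta )\) and \(V_2(\eta )\) are convex, \(V(\xi )\) is a convex function. By the Legendre transformation, we have the following equation
$$\begin{aligned} V(\xi ) + V^*(\zeta ) = \xi ^T \zeta , \end{aligned}$$
where
$$\begin{aligned} \zeta = (\tau , \sigma ) = \nabla V(\xi ), \quad \tau _i = \exp (\theta _i - \alpha _i), \quad \sigma _j = \eta _j - \beta _j , \end{aligned}$$
and \(V^*(\zeta )\) is the conjugate function of \(V(\xi )\), defined as
$$\begin{aligned} V^*(\zeta ) = V_1^*(\tau ) + V_2^*(\sigma ) \end{aligned}$$
with
$$\begin{aligned} V_1^*(\tau ) = \sum _{i=1}^p \tau _i (\ln \tau _i + \alpha _i - 1), \quad V_2^*(\sigma ) = \frac{1}{2} \sigma ^T \sigma + \beta ^T \sigma , \end{aligned}$$
where \(\beta =\{\beta _j\}\).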
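The conjugate pair can be checked numerically. The following is a minimal sketch, assuming the scalar case p = r = 1 with \(V_1(\theta )=\exp (\theta -\alpha )\) and \(V_2(\eta )=\frac{1}{2}(\eta -\beta )^2\), a form consistent with \(\tau _i=\exp (\theta _i-\alpha _i)\) used later in the text. The closed-form conjugate in the code is derived under that assumption, and the parameter values and function names are illustrative only.

```python
import math

ALPHA, BETA = 1.0, 0.5  # illustrative parameter values (assumed)

def V(theta, eta):
    """Canonical function V = V1 + V2 for p = r = 1 (assumed form)."""
    return math.exp(theta - ALPHA) + 0.5 * (eta - BETA) ** 2

def grad_V(theta, eta):
    """Canonical dual variables zeta = (tau, sigma) = grad V(xi)."""
    return math.exp(theta - ALPHA), eta - BETA

def V_star(tau, sigma):
    """Legendre conjugate of V under the assumed form, valid for tau > 0."""
    return (tau * (math.log(tau) + ALPHA - 1.0)
            + 0.5 * sigma ** 2 + BETA * sigma)

def fenchel_young_gap(theta, eta, tau, sigma):
    """V(xi) + V*(zeta) - <xi; zeta>; zero exactly when zeta = grad V(xi)."""
    return V(theta, eta) + V_star(tau, sigma) - (theta * tau + eta * sigma)

theta, eta = 0.3, -1.2
tau, sigma = grad_V(theta, eta)

# At the canonical dual pair the Fenchel-Young equality holds.
gap_at_dual_pair = fenchel_young_gap(theta, eta, tau, sigma)

# Away from it, only the Fenchel-Young inequality holds (strictly here).
gap_elsewhere = fenchel_young_gap(theta, eta, tau + 0.4, sigma - 0.7)
```

The first gap vanishes up to rounding, while the second is strictly positive because V is strictly convex in this scalar setting.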
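Since the canonical measure in this application is a quadratic operator, the total complementary function \(\Xi : \mathbb {R}^n\times \mathscr {E}_a^*\rightarrow \mathbb {R}\) has the following form
$$\begin{aligned} \Xi (x,\zeta ) = \frac{1}{2} x^T G(\zeta ) x - f^T x - V^*(\zeta ), \end{aligned}$$
where
$$\begin{aligned} G(\zeta ) = \sum _{i=1}^p \tau _i A_i + \sum _{j=1}^r \sigma _j B_j - C . \end{aligned}$$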
Notice that for any given \(\zeta \), the total complementary function \(\Xi (x,\zeta )\) is a quadratic function of x and its stationary points are the solutions of the following equation
$$\begin{aligned} G(\zeta ) x = f . \end{aligned}$$(19)
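If \(\det (G(\zeta ))\ne 0\) for a given \(\zeta \), then (19) can be solved analytically to give the unique solution \(x= G(\zeta )^{{-1}}f\). Let
$$\begin{aligned} \mathscr {S}_a = \left\{ \zeta \in \mathscr {E}_a^* \;|\; \det (G(\zeta )) \ne 0 \right\} . \end{aligned}$$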
Thus, on \(\mathscr {S}_a\) the canonical dual function \(\Pi ^d(\zeta )\) can be written explicitly as
$$\begin{aligned} \Pi ^d(\zeta ) = -\frac{1}{2} f^T G(\zeta )^{-1} f - V^*(\zeta ). \end{aligned}$$
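Clearly, both \(\Pi ^d(\zeta )\) and its domain \(\mathscr {S}_a\) are nonconvex. The canonical dual problem is to find all stationary points of \(\Pi ^d(\zeta )\) on its domain, i.e.,
$$\begin{aligned} (\mathscr {P}^d): \;\; \mathrm{sta} \left\{ \Pi ^d(\zeta ) \;|\; \zeta \in \mathscr {S}_a \right\} . \end{aligned}$$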
Theorem 1.
(Analytic Solution and Complementary-Dual Principle).
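Problem (\(\mathscr {P}^d\)) is canonical dual to the problem (\(\mathscr {P}\)) in the sense that if \(\bar{\zeta }\in \mathscr {S}_a\) is a stationary point of \(\Pi ^d(\zeta )\), then
$$\begin{aligned} \bar{x} = G(\bar{\zeta })^{-1} f \end{aligned}$$
is a stationary point of \(\Pi (x)\), the pair \((\bar{x},\bar{\zeta })\) is a stationary point of \(\Xi (x,\zeta )\), and we have
$$\begin{aligned} \Pi (\bar{x}) = \Xi (\bar{x},\bar{\zeta }) = \Pi ^d(\bar{\zeta }). \end{aligned}$$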
The proof of this theorem is analogous to that in [6]. Theorem 1 shows that there is no duality gap between the primal problem (\(\mathscr {P}\)) and the canonical dual problem (\(\mathscr {P}^d\)).
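The zero duality gap can be illustrated on a scalar instance. The sketch below is not taken from the chapter: it assumes n = 1 and p = r = 1, with \(\Pi (x)=\exp (\frac{1}{2}Ax^2-\alpha )+\frac{1}{2}(\frac{1}{2}Bx^2-\beta )^2-\frac{1}{2}Cx^2-fx\), \(G(\zeta )=\tau A+\sigma B-C\) and \(\Pi ^d(\zeta )=-f^2/(2G(\zeta ))-V^*(\zeta )\), and all parameter values are made up. For convenience it runs in the reverse direction of Theorem 1: it locates a primal stationary point by bisection, maps it to \(\bar{\zeta }=\nabla V(\varLambda (\bar{x}))\), and then evaluates the dual gradient and both objective values.

```python
import math

# Scalar instance (n = 1, p = r = 1); every number here is an assumed value.
A, B, C = -1.0, 1.0, 1.0
ALPHA, BETA, F = 0.0, 2.0, 0.5

def primal(x):
    """Pi(x) = V(Lambda(x)) - Q(x) for the assumed scalar model."""
    return (math.exp(0.5 * A * x * x - ALPHA)
            + 0.5 * (0.5 * B * x * x - BETA) ** 2
            - 0.5 * C * x * x - F * x)

def dual_point(x):
    """zeta = (tau, sigma) = grad V(Lambda(x))."""
    return math.exp(0.5 * A * x * x - ALPHA), 0.5 * B * x * x - BETA

def G(tau, sigma):
    return tau * A + sigma * B - C

def dual(tau, sigma):
    """Pi^d(zeta) = -f^2 / (2 G) - V*(zeta), for G != 0 and tau > 0."""
    v_star = (tau * (math.log(tau) + ALPHA - 1.0)
              + 0.5 * sigma ** 2 + BETA * sigma)
    return -0.5 * F * F / G(tau, sigma) - v_star

def dual_grad(tau, sigma):
    """Gradient of Pi^d with respect to (tau, sigma), using x = f / G."""
    x = F / G(tau, sigma)
    return (0.5 * A * x * x - (math.log(tau) + ALPHA),
            0.5 * B * x * x - (sigma + BETA))

def primal_slope(x):
    """Pi'(x) = G(zeta(x)) x - f."""
    return G(*dual_point(x)) * x - F

def bisect(fun, lo, hi, iters=200):
    """Plain bisection; assumes fun(lo) and fun(hi) have opposite signs."""
    f_lo = fun(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = fun(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_bar = bisect(primal_slope, 0.0, 4.0)      # a primal stationary point
tau_bar, sigma_bar = dual_point(x_bar)      # its canonical dual point
gap = primal(x_bar) - dual(tau_bar, sigma_bar)
grad_tau, grad_sigma = dual_grad(tau_bar, sigma_bar)
```

In this instance \(G(\bar{\zeta })=f/\bar{x}>0\), so the stationary point found on the positive axis also falls under the min–max case of the triality theory; the gap and the dual gradient vanish up to rounding.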
4 Triality Theory
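In this section we will study the global optimality conditions for the critical solutions of the primal and dual problems. In order to identify both global and local extrema of the two problems, we let
$$\begin{aligned} \mathscr {S}_a^+ = \left\{ \zeta \in \mathscr {S}_a \;|\; G(\zeta ) \succ 0 \right\} , \quad \mathscr {S}_a^- = \left\{ \zeta \in \mathscr {S}_a \;|\; G(\zeta ) \prec 0 \right\} , \end{aligned}$$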
where \(G\succ 0\) means that G is a positive definite matrix and where \(G\prec 0\) means that G is a negative definite matrix. It is easy to prove that both \(\mathcal{S}_a^+\) and \(\mathcal{S}_a^-\) are convex sets and
This shows that \(\mathscr {S}_a^+\) is an effective domain of \(Q^\varLambda (\zeta )\).
For convenience, we first give the first and second derivatives of functions \(\Pi (x)\) and \(\Pi ^d(\zeta )\):
where \(Z_0,Z\in \mathbb {R}^{n\times m}\) and \(H\in \mathbb {R}^{m\times m}\) are defined as
where \(E_n\) is the \(n\times n\) identity matrix. Since \( \tau > 0\), the matrix \(H^{-1}\) is positive definite.
The following lemma is elementary, and its proof is omitted.
Lemma 1.
If \(M_1,M_2,\ldots ,M_N\in \mathbb {R}^{n\times n}\) are symmetric positive semi-definite matrices, then \(M=M_1+M_2+\ldots +M_N\) is also a positive semi-definite matrix.
Lemma 2.
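If \(\lambda _{G}\) is an arbitrary eigenvalue of G, it follows that
$$\begin{aligned} \lambda _{G} \ge \sum _{i=1}^p\tau _i \lambda ^{A_i}_{min}+\sum _{j=1}^r\sigma _j\bar{\lambda }^{B_j}-\lambda ^{C}_{max}, \end{aligned}$$
in which \(\lambda ^{A_i}_{min}\) is the smallest eigenvalue of \(A_i\), \(\lambda ^{C}_{max}\) is the largest eigenvalue of C, and
$$\begin{aligned} \bar{\lambda }^{B_j}= \left\{ \begin{array}{ll} \lambda ^{B_j}_{min}, &{}~~ \sigma _j>0,\\ \lambda ^{B_j}_{max},&{}~~ \sigma _j\le 0, \end{array}\right. \end{aligned}$$
where \(\lambda ^{B_j}_{min}\) and \(\lambda ^{B_j}_{max}\) are the smallest eigenvalue and the largest eigenvalue of \(B_j\) respectively.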
Proof.
First, we prove that \(\tau _i(A_i-\lambda ^{A_i}_{min}E_n)\), \(\lambda ^{C}_{max}E_n-C\) and \(\sigma _j(B_j-\bar{\lambda }^{B_j}E_n)\) are all symmetric positive semi-definite matrices.
(a) As \(\lambda ^{A_i}_{min}\) is the smallest eigenvalue of \(A_i\), \(A_i-\lambda ^{A_i}_{min}E_n\) is symmetric positive semi-definite, so \(\tau _i(A_i-\lambda ^{A_i}_{min}E_n)\) is symmetric positive semi-definite with \(\tau _i=\exp \left( \theta _i-\alpha _i\right) >0\).
(b) As \(\lambda ^{C}_{max}\) is the largest eigenvalue of C, \(\lambda ^{C}_{max}E_n-C\) is a symmetric positive semi-definite matrix.
(c) (c.1) As \(\lambda ^{B_j}_{min}\) is the smallest eigenvalue of \(B_j\), \(B_j-\lambda ^{B_j}_{min}E_n\) is symmetric positive semi-definite, so when \(\sigma _j> 0\) it holds that \(\sigma _j(B_j-\lambda ^{B_j}_{min}E_n)\) is symmetric positive semi-definite.
(c.2) As \(\lambda ^{B_j}_{max}\) is the largest eigenvalue of \(B_j\), \(B_j-\lambda ^{B_j}_{max}E_n\) is symmetric negative semi-definite, so when \(\sigma _j\le 0\) it holds that \(\sigma _j(B_j-\lambda ^{B_j}_{max}E_n)\) is symmetric positive semi-definite.
From (c.1) and (c.2), we know \(\sigma _j(B_j-\bar{\lambda }^{B_j}E_n)\) is always symmetric positive semi-definite.
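Then by (a), (b), (c) and Lemma 1, we have that
$$\begin{aligned} \sum _{i=1}^p\tau _i(A_i-\lambda ^{A_i}_{min}E_n)+\sum _{j=1}^r\sigma _j(B_j-\bar{\lambda }^{B_j}E_n)+(\lambda ^{C}_{max}E_n-C) \end{aligned}$$
is a positive semi-definite matrix, which is equivalent to saying that
$$\begin{aligned} G-\left( \sum _{i=1}^p\tau _i \lambda ^{A_i}_{min}+\sum _{j=1}^r\sigma _j\bar{\lambda }^{B_j}-\lambda ^{C}_{max}\right) E_n \end{aligned}$$
is a positive semi-definite matrix, which implies that every eigenvalue of G is greater than or equal to \(\sum _{i=1}^p\tau _i \lambda ^{A_i}_{min}+\sum _{j=1}^r\sigma _j\bar{\lambda }^{B_j}-\lambda ^{C}_{max}\). \(\square \)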
Based on the above lemma, the following assumption is given for the establishment of solution method.
Assumption 1
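There is a critical point \(\zeta =(\tau ,\sigma )\) of \(\Pi ^d(\zeta )\) satisfying \(\varDelta >0\), where
$$\begin{aligned} \varDelta = \sum _{i=1}^p\tau _i \lambda ^{A_i}_{min}+\sum _{j=1}^r\sigma _j\bar{\lambda }^{B_j}-\lambda ^{C}_{max}. \end{aligned}$$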
Lemma 3.
If \(\bar{\zeta }\) is a stationary point of \(\varPi ^d(\zeta ) \) satisfying Assumption 1, then \(\bar{\zeta }\in \mathscr {S}_a^+\).
Proof
From Lemma 2, we know that if \(\lambda _{G}\) is an arbitrary eigenvalue of G, it holds that \(\lambda _{G}\ge \varDelta \). If \(\bar{\zeta }\) is a critical point satisfying Assumption 1, then \(\varDelta >0\), so every eigenvalue of G satisfies \(\lambda _{G}\ge \varDelta >0\). Hence G is a positive definite matrix, i.e., \(\bar{\zeta }\in \mathscr {S}_a^+\). \(\square \)
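The eigenvalue bound behind Lemmas 2 and 3 is easy to probe numerically. The following is a minimal sketch for n = 2 and p = r = 1, assuming \(G(\zeta )=\tau A_1+\sigma B_1-C\) as implied by the proof of Lemma 2. The three matrices are made-up data and the helper names are hypothetical; the sketch samples \(\tau >0\) and \(\sigma \) of either sign and compares the smallest eigenvalue of G with \(\varDelta \).

```python
import math
import random

def eig_sym2(m):
    """Eigenvalues (min, max) of a symmetric 2x2 matrix ((a, b), (b, d))."""
    (a, b), (_, d) = m
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    return mean - rad, mean + rad

def lin_comb(coeffs, mats):
    """Entrywise sum of coeff * matrix over a list of 2x2 matrices."""
    return tuple(
        tuple(sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2))
        for i in range(2)
    )

# Assumed data: A1 symmetric indefinite, B1 and C symmetric positive definite.
A1 = ((1.0, 2.0), (2.0, -1.0))
B1 = ((2.0, 0.5), (0.5, 1.0))
C = ((3.0, 1.0), (1.0, 2.0))

def delta(tau, sigma):
    """Lower bound tau*lam_min(A1) + sigma*lam_bar(B1) - lam_max(C)."""
    lam_a_min, _ = eig_sym2(A1)
    lam_b_min, lam_b_max = eig_sym2(B1)
    _, lam_c_max = eig_sym2(C)
    lam_b_bar = lam_b_min if sigma > 0 else lam_b_max
    return tau * lam_a_min + sigma * lam_b_bar - lam_c_max

def smallest_eig_G(tau, sigma):
    """Smallest eigenvalue of G = tau*A1 + sigma*B1 - C."""
    return eig_sym2(lin_comb((tau, sigma, -1.0), (A1, B1, C)))[0]

random.seed(0)
samples = [(random.uniform(0.01, 5.0), random.uniform(-5.0, 5.0))
           for _ in range(1000)]

# Lemma 2: every eigenvalue of G is at least Delta, so this margin is >= 0.
worst_margin = min(smallest_eig_G(t, s) - delta(t, s) for t, s in samples)

# Lemma 3: whenever Delta > 0, G is positive definite.
pd_when_delta_positive = all(
    smallest_eig_G(t, s) > 0 for t, s in samples if delta(t, s) > 0
)
```

The bound is a sufficient test only: G can be positive definite at points where \(\varDelta \le 0\), which is why Assumption 1 is described later as a sufficient condition.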
The following lemma is also needed. Its proof is omitted, since it is similar to that of Lemma 6 in [5].
Lemma 4.
Suppose that \( P\in \mathbb {R}^{n\times n}\), \( U\in \mathbb {R}^{m\times m}\) and \( W\in \mathbb {R}^{n\times m}\) are given symmetric matrices with
where \( P_{11}\), \( U_{11}\) and \( W_{11}\) are \(r\times r\)-dimensional matrices, and \( W_{11}\) is nonsingular. Then,
Now we give the main result of this paper, the triality theorem, which describes the relationships between the primal and canonical dual problems on global and local solutions under Assumption 1.
Theorem 2.
(Triality Theorem) Suppose that \(\bar{\zeta }\) is a critical point of \(\Pi ^d(\zeta )\), and \(\bar{x}=G(\bar{\zeta })^{{-1}}f\).
1. Min–max duality: If \(\bar{\zeta }\) is the critical point satisfying Assumption 1, then the canonical min–max duality holds in the form of
$$\begin{aligned} \Pi (\bar{x}) =\min _{x\in \mathbb {R}^n} \Pi (x)=\max _{\zeta \in \mathscr {S}_a^+} \Pi ^d(\zeta )=\Pi ^d(\bar{\zeta }). \end{aligned}$$(32)
2. Double-max duality: If \(\bar{\zeta }\in \mathscr {S}_a^-\), the double-max duality holds in the form that if \(\bar{x}\) is a local maximizer of \(\Pi (x)\) or \(\bar{\zeta }\) is a local maximizer of \(\Pi ^d(\zeta )\), we have
$$\begin{aligned} \Pi (\bar{x}) =\max _{x\in \mathscr {X}_0} \Pi (x)=\max _{\zeta \in \mathscr {S}_0} \Pi ^d(\zeta )=\Pi ^d(\bar{\zeta }) \end{aligned}$$(33)
where \(\bar{x}\in \mathscr {X}_0 \subset \mathbb {R}^n\) and \(\bar{\zeta }\in \mathscr {S}_0 \subset \mathscr {S}_a^-\).
3. Double-min duality: If \(\bar{\zeta }\in \mathscr {S}_a^-\), then the double-min duality holds in the form that when \(m=n\), if \(\bar{x}\) is a local minimizer of \(\Pi (x)\) or \(\bar{\zeta }\) is a local minimizer of \(\Pi ^d(\zeta )\), we have
$$\begin{aligned} \Pi (\bar{x}) =\min _{x\in \mathscr {X}_0} \Pi (x)=\min _{\zeta \in \mathscr {S}_0} \Pi ^d(\zeta )=\Pi ^d(\bar{\zeta }) \end{aligned}$$(34)
where \(\bar{x}\in \mathscr {X}_0 \subset \mathbb {R}^n\) and \(\bar{\zeta }\in \mathscr {S}_0 \subset \mathscr {S}_a^-\).
Proof
1. Because \(\bar{\zeta }\) is a critical point satisfying Assumption 1, by Lemma 3 it holds that \(\bar{\zeta }\in \mathscr {S}_a^+\), i.e., \(G(\bar{\zeta })\succ 0\). As \(G(\bar{\zeta })\succ 0\) and \( H\succ 0\), by (29) we know the Hessian of the dual function is negative definite, i.e., \(\nabla ^2\Pi ^d(\zeta )\prec 0\), which implies that \(\Pi ^d(\zeta )\) is strictly concave over \(\mathscr {S}_a^+\). Hence, we get
$$ \Pi ^d(\bar{\zeta })=\max _{\zeta \in \mathscr {S}_a^+} \Pi ^d(\zeta ). $$
By the convexity of \(V(\xi )\), we have \(V(\xi )-V(\bar{\xi })\ge (\xi -\bar{\xi })^T\nabla V(\bar{\xi })=(\xi -\bar{\xi })^T\bar{\zeta }\) (see [11]), so
$$V(\varLambda (x))-V(\varLambda (\bar{x}))\ge (\varLambda (x)-\varLambda (\bar{x}))^T\bar{\zeta },$$
which implies
$$\begin{aligned} \Pi (x)-\Pi (\bar{x})\ge & {} (\varLambda (x)-\varLambda (\bar{x}))^T\bar{\zeta }-\frac{1}{2}x^T C x+\frac{1}{2}\bar{x}^T C \bar{x}-f^T(x-\bar{x})\nonumber \\= & {} \frac{1}{2}x^T G(\bar{\zeta })x-\frac{1}{2}\bar{x}^T G(\bar{\zeta })\bar{x}-(x-\bar{x})^TG(\bar{\zeta })\bar{x}, \end{aligned}$$(35)
where the equality uses \(G(\bar{\zeta })\bar{x}=f\). Because \(G(\bar{\zeta })\succ 0\), the convexity of \(\frac{1}{2}x^T G(\bar{\zeta })x\) with respect to x in \(\mathbb {R}^n\) leads to
$$\frac{1}{2}x^T G(\bar{\zeta })x-\frac{1}{2}\bar{x}^T G(\bar{\zeta })\bar{x}\ge (x-\bar{x})^TG(\bar{\zeta })\bar{x}.$$
Then by (35), \(\Pi (x)\ge \Pi (\bar{x})\) for any \(x\in \mathbb {R}^n\), which, together with Theorem 1 and (4), shows that Eq. (32) holds.
2. If \(\bar{\zeta }\) is a local maximizer of \(\Pi ^d(\zeta )\) over \(\mathscr {S}_a^-\), it is true that \(\nabla ^2\Pi ^d(\bar{\zeta })=- Z^T G^{-1}Z- H^{-1}\preceq 0\) and there exists a neighborhood \(\mathscr {S}_0\subset \mathscr {S}_a^-\) such that for all \(\zeta \in \mathscr {S}_0\), \(\nabla ^2\Pi ^d(\zeta )\preceq 0\). Since the map \(x= G^{{-1}}f\) is continuous over \(\mathscr {S}_a\), the image of the map over \(\mathscr {S}_0\) is a neighborhood of \(\bar{x}\), which is denoted by \(\mathscr {X}_0\). Now we prove that for any \(x\in \mathscr {X}_0\), \(\nabla ^2\Pi (x)\preceq 0\), which, together with the fact that \(\bar{x}\) is a critical point of \(\Pi (x)\), implies that \(\bar{x}\) is a maximizer of \(\Pi (x)\) over \(\mathscr {X}_0\). By singular value decomposition, there exist orthogonal matrices \( J\in \mathbb {R}^{n\times n}\), \( K\in \mathbb {R}^{m\times m}\) and a matrix \( R\in \mathbb {R}^{n\times m}\) with
$$\begin{aligned} R_{ij}= \left\{ \begin{array}{ll} \delta _i, &{}~~ i=j \text { and }i=1,\ldots ,r,\\ 0,&{}~~\text {otherwise}, \end{array}\right. \end{aligned}$$(36)
where \(\delta _i>0\) for \(i=1,\ldots ,r\) and \(r={\text {rank}}( Z)\), such that \(Z H^{\frac{1}{2}}= J R K\). Then
$$\begin{aligned} Z = J R K H^{-\frac{1}{2}}. \end{aligned}$$(37)
For any \(x\in \mathscr {X}_0\), let \(\zeta \) be a point satisfying \(x= G^{{-1}}f\). Since \(\nabla ^2\Pi ^d(\zeta )=- Z^T G^{-1}Z- H^{-1}\preceq 0\), it holds that
$$\begin{aligned} - H^{-\frac{1}{2}} K^T R^T J^T G^{-1}J R K H^{-\frac{1}{2}}-H^{-1}\preceq 0. \end{aligned}$$(38)
Multiplying the above inequality by \( K H^{\frac{1}{2}}\) from the left and \( H^{\frac{1}{2}} K^T\) from the right, we obtain
$$\begin{aligned} - R^T J^T G^{-1}J R-E_m\preceq 0, \end{aligned}$$(39)
which, by Lemma 4, is further equivalent to
$$\begin{aligned} J^T G J+ R R^T\preceq 0. \end{aligned}$$(40)
It then follows that
$$\begin{aligned} -G\succeq J R R^T J^T=J R K H^{-\frac{1}{2}} H H^{-\frac{1}{2}} K^T R^T J^T= Z H Z^T. \end{aligned}$$(41)
Thus, \(\nabla ^2\Pi (x)= G+ Z H Z^T \preceq 0\), and \(\bar{x}\) is a maximizer of \(\Pi (x)\) over \(\mathscr {X}_0\). Similarly, we can prove that if \(\bar{x}\) is a maximizer of \(\Pi (x)\) over \(\mathscr {X}_0\), then \(\bar{\zeta }\) is a maximizer of \(\Pi ^d(\zeta )\) over \(\mathscr {S}_0\). By Theorem 1, Eq. (33) is proved.
3. Now we prove the double-min duality. Suppose that \(\bar{\zeta }\) is a local minimizer of \(\Pi ^d(\zeta )\) in \(\mathscr {S}_a^-\). Then there exists a neighborhood \(\mathscr {S}_0\subset \mathscr {S}_a^-\) of \(\bar{\zeta }\) such that for any \(\zeta \in \mathscr {S}_0\), \(\nabla ^2\Pi ^d(\zeta )\succeq 0\). Let \(\mathscr {X}_0\) denote the image of the map \(x= G^{{-1}}f\) over \(\mathscr {S}_0\), which is a neighborhood of \(\bar{x}\). For any \(x\in \mathscr {X}_0\), let \(\zeta \) be a point that satisfies \(x= G^{{-1}}f\). It follows from \(\nabla ^2\Pi ^d(\zeta )=- Z^T G^{-1}Z- H^{-1}\succeq 0\) that \(- Z^T G^{-1}Z\succeq H^{-1}\succ 0\), which implies that the square matrix Z (recall \(m=n\)) is invertible. Then it is true that
$$\begin{aligned} - G^{-1}\succeq ( Z^T)^{-1} H^{-1} Z^{-1}, \end{aligned}$$(42)
which is further equivalent to
$$\begin{aligned} - G\preceq Z H Z^T. \end{aligned}$$(43)
Thus, \(\nabla ^2\Pi (x)= G+ Z H Z^T\succeq 0\) and x is a local minimizer of \(\Pi (x)\). The converse can be proved similarly. By Theorem 1, Eq. (34) then holds.
The theorem is proved. \(\square \)
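The double-max statement can be checked on a scalar instance. The following is a sketch only, assuming n = 1 and p = r = 1 (so m = 2) with \(\Pi (x)=\exp (\frac{1}{2}Ax^2-\alpha )+\frac{1}{2}(\frac{1}{2}Bx^2-\beta )^2-\frac{1}{2}Cx^2-fx\), \(G(\zeta )=\tau A+\sigma B-C\), \(Z=[Ax,\,Bx]\) and \(H={\text {diag}}(\tau ,1)\). These concrete forms of Z and H, like the parameter values, are assumptions chosen to be consistent with the Hessian expressions \(\nabla ^2\Pi = G+ZHZ^T\) and \(\nabla ^2\Pi ^d=-Z^TG^{-1}Z-H^{-1}\) used in the proof.

```python
import math

# Scalar instance (n = 1, p = r = 1, hence m = 2); all values are assumed.
A, B, C = -1.0, 1.0, 1.0
ALPHA, BETA, F = 0.0, 2.0, 0.5

def dual_point(x):
    """zeta = (tau, sigma) = grad V(Lambda(x))."""
    return math.exp(0.5 * A * x * x - ALPHA), 0.5 * B * x * x - BETA

def G(tau, sigma):
    return tau * A + sigma * B - C

def primal_slope(x):
    """Pi'(x) = G(zeta(x)) x - f."""
    return G(*dual_point(x)) * x - F

def primal_curvature(x):
    """Pi''(x) = G + Z H Z^T with Z = [A x, B x] and H = diag(tau, 1)."""
    tau, sigma = dual_point(x)
    return G(tau, sigma) + tau * (A * x) ** 2 + (B * x) ** 2

def dual_hessian(tau, sigma):
    """-Z^T G^{-1} Z - H^{-1} as a symmetric 2x2 matrix, with x = f / G."""
    g = G(tau, sigma)
    x = F / g
    z = (A * x, B * x)
    h_inv = (1.0 / tau, 1.0)
    return [[-z[i] * z[j] / g - (h_inv[i] if i == j else 0.0)
             for j in range(2)] for i in range(2)]

def largest_eig_sym2(m):
    """Largest eigenvalue of a symmetric 2x2 matrix."""
    mean = 0.5 * (m[0][0] + m[1][1])
    return mean + math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])

def bisect(fun, lo, hi, iters=200):
    """Plain bisection; assumes fun(lo) and fun(hi) have opposite signs."""
    f_lo = fun(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = fun(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Stationary point on the negative axis, where G = f / x is negative.
x_bar = bisect(primal_slope, -1.0, 0.0)
tau_bar, sigma_bar = dual_point(x_bar)
g_bar = G(tau_bar, sigma_bar)

curvature = primal_curvature(x_bar)                      # primal Hessian
top_eig = largest_eig_sym2(dual_hessian(tau_bar, sigma_bar))  # dual Hessian
```

With \(G(\bar{\zeta })<0\), both the primal second derivative and the largest eigenvalue of the dual Hessian come out negative in this instance, so \(\bar{x}\) and \(\bar{\zeta }\) are local maximizers together, as the double-max duality predicts. Since m = 2 and n = 1 here, the double-min case (which requires \(m=n\)) is not covered by this sketch.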
5 Examples
In this section, let \(p=r=1\). From the definition of the (CDC) problem, \(A_1\) is a symmetric matrix, and \(B_1\) and \(C_1\) are two positive definite matrices. According to the different cases of \(A_1\), the following four examples are provided to illustrate the proposed canonical duality method. By examining the critical points of the dual function, we show how the dualities in the triality theory are verified by these examples.
Example 1
We consider the case that \(A_1\) is positive definite. Let \(\alpha _1=\beta _1=1\) and
then the primal problem:
The corresponding canonical dual function is
so there is no duality gap, then \((\bar{x}_1,\bar{y}_1)\) is the global solution of the primal problem, which demonstrates the min–max duality (see Fig. 1).
Example 2
We consider the case that \(A_1\) is negative definite. Let \(\alpha _1=-4\), \(\beta _1=0.5\) and
then the primal problem:
The corresponding canonical dual function is
In this problem, \(\lambda ^{A_1}_{min}=-1.5\), \(\lambda ^{B_1}_{min}=1\), \(\lambda ^{B_1}_{max}=2\), and \(\lambda ^{C_1}_{max}=3\). Note that \((\bar{\tau }_1,\bar{\sigma }_1)=(0.145563,3.95352)\) is a critical point of the dual function \(\Pi ^d(\tau ,\sigma )\) (see Fig. 2a). As \(\bar{\sigma }_1>0\), we have \(\bar{\lambda }^{B_1}=\lambda ^{B_1}_{min}\) and
so Assumption 1 is satisfied, then \((\bar{\tau }_1,\bar{\sigma }_1)\) is in \(\mathcal{S}_a^+\). By Theorem 1, we get \((\bar{x}_1,\bar{y}_1)=(0.867833, 2.72044)\). Moreover, we have
so there is no duality gap, then \((\bar{x}_1,\bar{y}_1)\) is the global solution of the primal problem, which demonstrates the min–max duality (see Fig. 2).
For showing the double-max duality of Example 2, we find a local maximum point of \(\Pi ^d(\tau ,\sigma )\) in \(\mathscr {S}_a^-\): \((\bar{\tau }_2,\bar{\sigma }_2)=(54.3685,-0.492123)\). By Theorem 1, we get \((\bar{x}_2,\bar{y}_2)=(-0.0871798, -0.023517)\). Moreover, we have
and \((\bar{x}_2,\bar{y}_2)\) is also a local maximum point of \(\Pi (x,y)\), which demonstrates the double-max duality (see Fig. 3).
Example 3
We consider the case that \(A_1\) is indefinite. Let \(\alpha _1=\beta _1=1\) and
then the primal problem:
The corresponding canonical dual function is
In this problem, \(\lambda ^{A_1}_{min}=-2\), \(\lambda ^{B_1}_{min}=\lambda ^{B_1}_{max}=1\), and \(\lambda ^{C_1}_{max}=1.5\). Note that \((\bar{\tau }_1,\bar{\sigma }_1)=(0.143473,1.91093)\) is a critical point of the dual function \(\Pi ^d(\tau ,\sigma )\) (see Fig. 4a). As \(\bar{\sigma }_1>0\), we have \(\bar{\lambda }^{B_1}=\lambda ^{B_1}_{min}\) and
so Assumption 1 is satisfied, then \((\bar{\tau }_1,\bar{\sigma }_1)\) is in \(\mathcal{S}_a^+\). By Theorem 1, we get \((\bar{x}_1,\bar{y}_1)=(1.80375, 1.60261)\). Moreover, we have
so there is no duality gap, then \((\bar{x}_1,\bar{y}_1)\) is the global solution of the primal problem, which demonstrates the min–max duality (see Fig. 4).
For showing the double-max duality of Example 3, we find a local maximum point of \(\Pi ^d(\tau ,\sigma )\) in \(\mathscr {S}_a^-\): \((\bar{\tau }_2,\bar{\sigma }_2)=(0.358833,-0.785507)\). By Theorem 1, we get \((\bar{x}_2,\bar{y}_2)=(-0.519029, -0.399493)\). Moreover, we have
and \((\bar{x}_2,\bar{y}_2)\) is also a local maximum point of \(\Pi (x,y)\), which demonstrates the double-max duality (see Fig. 5).
Example 4
We also consider the case that \(A_1\) is indefinite. Let \(\alpha _1=1\), \(\beta _1=2\) and
then the primal problem:
The corresponding canonical dual function is
In this problem, \(\lambda ^{A_1}_{min}=-3\), \(\lambda ^{B_1}_{min}=\lambda ^{B_1}_{max}=1\), and \(\lambda ^{C_1}_{max}=4.4\). Note that \((\bar{\tau }_1,\bar{\sigma }_1)=(0.0612941,4.67004)\) is a critical point of the dual function \(\Pi ^d(\tau ,\sigma )\) (see Fig. 6a). As \(\bar{\sigma }_1>0\), we have \(\bar{\lambda }^{B_1}=\lambda ^{B_1}_{min}\) and
so Assumption 1 is satisfied, then \((\bar{\tau }_1,\bar{\sigma }_1)\) is in \(\mathcal{S}_a^+\). By Theorem 1, we get \((\bar{x}_1,\bar{y}_1)=(2.05695, 3.01812)\). Moreover, we have
so there is no duality gap, then \((\bar{x}_1,\bar{y}_1)\) is the global solution of the primal problem, which demonstrates the min–max duality (see Fig. 6).
For showing the double-max duality of Example 4, we find a local maximum point of \(\Pi ^d(\tau ,\sigma )\) in \(\mathscr {S}_a^-\): \((\bar{\tau }_2,\bar{\sigma }_2)=(0.361948,-1.97615)\). By Theorem 1, we get \((\bar{x}_2,\bar{y}_2)=(-0.141603, -0.166273)\). Moreover, we have
and \((\bar{x}_2,\bar{y}_2)\) is also a local maximum point of \(\Pi (x,y)\), which demonstrates the double-max duality (see Fig. 7).
For showing the double-min duality of Example 4, we find a local minimum point of \(\Pi ^d(\tau ,\sigma )\) in \(\mathscr {S}_a^-\): \((\bar{\tau }_3,\bar{\sigma }_3)=(0.149286,3.90584)\). By Theorem 1, we get \((\bar{x}_3,\bar{y}_3)=(-1.84496, -2.89962)\). Moreover, we have
and \((\bar{x}_3,\bar{y}_3)\) is also a local minimum point of \(\Pi (x,y)\), which demonstrates the double-min duality (see Fig. 8).
From the above double-min duality in Example 4, we can see that the proposed canonical dual method can avoid the local minimum point \((\bar{x}_3,\bar{y}_3)\) of the primal problem. In fact, the canonical dual method obtains the global solution, so any local minimum point is avoided. For instance, the point \((1.29672,-2.09209)\) is a local minimum point of the primal problem in Example 2 (see Fig. 9a) with minimum value −3.98411, whereas the canonical dual method obtains the global minimum value −13.6736; the point \((1.88536,-1.10196)\) is a local minimum point of the primal problem in Example 3 (see Fig. 9b) with minimum value −2.45219, whereas the canonical dual method obtains the global minimum value −22.6111.
6 Conclusions
Based on the original definition of objectivity in continuum physics, a canonical d.c. optimization problem is proposed, which can be used to model general nonconvex optimization problems in complex systems. A detailed application is provided by solving a challenging problem in \(\mathbb {R}^n\). By the canonical duality theory, this nonconvex problem can be reformulated as a concave maximization dual problem over a convex domain. A detailed proof of the triality theory is provided under a reasonable assumption. This theory can be used to identify both global and local extrema, and to develop a powerful algorithm for solving this general d.c. optimization problem. Several examples are given to illustrate detailed situations. All these examples satisfy Assumption 1. However, we should emphasize that this assumption is only a sufficient condition for the existence of a canonical dual solution in \(\mathscr {S}_a^+\). How to relax this assumption and to obtain a necessary condition for \(\mathscr {S}_a^+ \ne \emptyset \) are open questions and deserve detailed study.
References
1. Chen, Y., Gao, D.Y.: Global solutions to nonconvex optimization of 4th-order polynomial and log-sum-exp functions. J. Global Optim. 64(3), 1–15 (2016)
2. Ekeland, I., Temam, R.: Convex Analysis and Variational Problems. North-Holland, Amsterdam (1976)
3. Fang, S.-C., Gao, D.Y., Sheu, R.-L., Wu, S.Y.: Canonical dual approach for solving 0–1 quadratic programming problems. J. Ind. Manag. Optim. 4(1), 125–142 (2007)
4. Fang, S.-C., Gao, D.Y., Sheu, R.-L., Xing, W.X.: Global optimization for a class of fractional programming problems. J. Global Optim. 45(3), 337–353 (2009)
5. Gao, D., Wu, C.: On the triality theory for a quartic polynomial optimization problem. J. Ind. Manag. Optim. 8, 229–242 (2012)
6. Gao, D.Y.: Duality Principles in Nonconvex Systems: Theory, Methods, and Applications. Kluwer, Dordrecht (2000)
7. Gao, D.Y.: Canonical dual transformation method and generalized triality theory in nonsmooth global optimization. J. Global Optim. 17(1/4), 127–160 (2000)
8. Gao, D.Y.: Sufficient conditions and perfect duality in nonconvex minimization with inequality constraints. J. Ind. Manag. Optim. 1(1), 59–69 (2005)
9. Gao, D.Y.: Complete solutions and extremality criteria to polynomial optimization problems. J. Global Optim. 35, 131–143 (2006)
10. Gao, D.Y.: Solutions and optimality to box constrained nonconvex minimization problems. J. Ind. Manag. Optim. 3(2), 293–304 (2007)
11. Gao, D.Y., Strang, G.: Geometric nonlinearity: potential energy, complementary energy, and the gap function. Q. J. Appl. Math. 47(3), 487–504 (1989)
12. Gao, D.Y., Yu, H.F.: Multi-scale modelling and canonical dual finite element method in phase transitions of solids. Int. J. Solids Struct. 45, 3660–3673 (2008)
13. Gao, D.Y., Ruan, N.: Solutions to quadratic minimization problems with box and integer constraints. J. Global Optim. 47(3), 463–484 (2010)
14. Gao, D.Y., Ruan, N., Sherali, H.D.: Canonical duality solutions for fixed cost quadratic program. Optim. Optimal Control 39, 139–156 (2010)
15. Hiriart-Urruty, J.B.: Generalized differentiability, duality and optimization for problems dealing with differences of convex functions. Lecture Notes in Economics and Mathematical Systems, vol. 256, pp. 37–70. Springer, Berlin (1985)
16. Horst, R., Thoai, N.V.: DC Programming: overview. J. Opt. Theory Appl. 103, 1–43 (1999)
17. Santos, H.A.F.A., Gao, D.Y.: Canonical dual finite element method for solving post-buckling problems of a large deformation elastic beam. Int. J. Nonlinear Mech. 47, 240–247 (2011)
18. Tao, P.D., An, L.T.H.: Recent advances in DC programming and DCA. Trans. Comput. Collect. Intell. 13, 1–37 (2014)
19. Tuy, H.: D.C. Optimization: Theory, Methods and Algorithms. In: Horst, R., Pardalos, P.M. (eds.) Handbook of Global Optimization, pp. 149–216. Kluwer Academic Publishers, Dordrecht (1995)
20. Wang, Z., Fang, S.-C., Gao, D.Y., Xing, W.: Global extremal conditions for multi-integer quadratic programming. J. Ind. Manag. Optim. 4(2), 213–225 (2008)
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Jin, Z., Gao, D.Y. (2017). On D.C. Optimization Problems. In: Gao, D., Latorre, V., Ruan, N. (eds) Canonical Duality Theory. Advances in Mechanics and Mathematics, vol 37. Springer, Cham. https://doi.org/10.1007/978-3-319-58017-3_10
DOI: https://doi.org/10.1007/978-3-319-58017-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58016-6
Online ISBN: 978-3-319-58017-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)