
1 Mathematical Modeling and Objectivity

It is known that in Euclidean space every continuous global optimization problem on a compact set can be reformulated as a d.c. optimization problem, i.e., a nonconvex problem which can be described in terms of d.c. functions (difference of convex functions) and d.c. sets (difference of convex sets) [19]. Since any constraint set can be equivalently relaxed by a nonsmooth indicator function, general nonconvex optimization problems can be written in the following standard d.c. programming form

$$\begin{aligned} \min \{ f(x)= g(x)-h(x) \; | \;\; \forall {x\in \mathcal{X}} \}, \end{aligned}$$
(1)

where \(\mathcal{X} = \mathbb {R}^n\), g(x) and h(x) are convex proper lower-semicontinuous functions on \(\mathbb {R}^n\), and the d.c. function f(x) to be optimized is usually called the “objective function” in mathematical optimization. A more general model is that g(x) can be an arbitrary function [19]. Clearly, this d.c. programming problem is artificial. Although it can be used to “model” a very wide range of mathematical problems [15] and has been studied extensively during the last thirty years (cf. [16, 18]), it comes at a price: it is impossible to have elegant theory and powerful algorithms for solving this problem without detailed structures on these arbitrarily given functions. As a result, even some very simple d.c. programming problems are considered to be NP-hard. This dilemma is mainly due to the existing gap between mathematical optimization and mathematical physics.

Real-world applications show a simple fact: the functions g(x) and h(x) in the standard d.c. programming problem (1) cannot be arbitrarily given; they must obey certain fundamental laws of physics in order to model real-world systems. In Lagrangian mechanics and continuum physics, a real-valued function \(W:\mathscr {X}\rightarrow \mathbb {R}\) is said to be objective if and only if (see [6], Chap. 6)

$$\begin{aligned} W(x) = W(Rx) \;\; \forall x \in \mathscr {X}, \;\; \forall R \in \mathcal{R}, \end{aligned}$$
(2)

where \(\mathcal{R}\) is a special rotation group such that \(R^{-1} = R^T, \;\; \det R = 1, \;\; \forall R \in \mathcal{R}\). Based on the original concept of objectivity, a general multi-scale mathematical model was proposed by Gao in [6]:

$$\begin{aligned} (\mathscr {P}):~~~~\inf \{ \varPi (x)= W(Dx) - F(x) \; | \;\; \forall {x \in \mathcal{X}} \}, \end{aligned}$$
(3)

where \(D: \mathcal{X} \rightarrow \mathcal{Y} \) is a linear operator; \(W: \mathcal{Y} \rightarrow \mathbb {R}\cup \{ + \infty \}\) is an objective function on its effective domain \(\mathscr {Y}_a \subset \mathscr {Y}\), in which certain physical constraints (such as constitutive laws) are given; correspondingly, \(F: \mathcal{X} \rightarrow \mathbb {R}\cup \{ -\infty \}\) is a so-called subjective function, which must be linear on its effective domain \(\mathscr {X}_a \subset \mathscr {X}\), in which certain “geometrical constraints” (such as boundary/initial conditions) are given. By the Riesz representation theorem, the subjective function can be written as \(F(x) = \langle x, \bar{x}^* \rangle \), where \(\bar{x}^* \in \mathscr {X}^*\) is a given input (or source) and the bilinear form \(\langle x, x^* \rangle :\mathscr {X}\times \mathscr {X}^* \rightarrow \mathbb {R}\) puts \(\mathscr {X}\) and \(\mathscr {X}^*\) in duality. Therefore, the extremality condition \(0 \in \partial \varPi (x)\) leads to the equilibrium equation [6]

$$\begin{aligned} 0 \in D^* \partial W(Dx) - \partial F (x) \;\; \Leftrightarrow \;\; D^* y^* - x^* = 0 \;\; \forall x^* \in \partial F(x), \;\; y^* \in \partial W(y). \end{aligned}$$
(4)

In this model, the objective duality relation \(y^* \in \partial W(y)\) is governed by the constitutive law, which depends on mathematical modeling of the system; the subjective duality relation \(x^* \in \partial F(x)\) leads to the input \(\bar{x}^* \) of the system, which depends only on each given problem. Thus, the problem \((\mathscr {P})\) can be used to model general real-world applications.

Canonical duality-triality is a breakthrough theory which can be used not only for modeling complex systems within a unified framework, but also for solving real-world problems with a unified methodology. This theory was developed originally from Gao and Strang’s work in nonconvex mechanics [11] and has been applied successfully to a large class of challenging problems in both nonconvex analysis/mechanics and global optimization, such as phase transitions in solids [12], post-buckling of large deformed beams [17], nonconvex polynomial minimization problems with box and integer constraints [8, 10, 13], Boolean and multiple integer programming [3, 20], fractional programming [4], mixed integer programming [14], polynomial optimization [9], and high-order polynomial with log-sum-exp problems [1].

The goal of this paper is to apply the canonical duality theory to solve the challenging d.c. programming problem (1). The rest of this paper is arranged as follows. Based on the concept of objectivity, a canonical d.c. optimization problem and its canonical dual are formulated in the next section. Analytical solutions and triality theory for a general d.c. minimization problem with a sum of nonconvex polynomial and exponential functions are discussed in Sects. 3 and 4. Four special examples are illustrated in Sect. 5. Some conclusions and future work are given in Sect. 6.

2 Canonical D.C. Problem and Its Canonical Dual

It is known that the linear operator \(D:\mathscr {X}\rightarrow \mathscr {Y}\) cannot change the nonconvex W(Dx) to a convex function. According to the definition of objectivity, a nonconvex function \(W:\mathscr {Y}\rightarrow \mathbb {R}\) is objective if and only if there exists a function \(V:\mathscr {Y}\times \mathscr {Y}\rightarrow \mathbb {R}\) such that \(W(y) = V(y^T y)\). Based on this fact, a canonical transformation was proposed by Gao in 2000 [7].

Definition 1

(Canonical Transformation and Canonical Measure).

For a given nonconvex function \(g:\mathscr {X}\rightarrow \mathbb {R}\cup \{\infty \}\), if there exists a nonlinear mapping \(\varLambda :\mathscr {X}\rightarrow \mathscr {E}\) and a convex, l.s.c function \(V:\mathscr {E}\rightarrow \mathbb {R}\cup \{ \infty \} \) such that

$$\begin{aligned} g(x) = V( \varLambda (x)), \end{aligned}$$
(5)

then the nonlinear transformation (5) is called the canonical transformation and \(\xi = \varLambda (x) \) is called a canonical measure.

The canonical measure \(\xi = \varLambda (x)\) is also called the geometrically admissible measure in the canonical duality theory [7], and it is not necessarily objective. However, the simplest canonical measure in \(\mathbb {R}^n\) is the quadratic function \(\xi = x^T x\), which is clearly objective. Therefore, the canonical function can be viewed as a generalized objective function.
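The objectivity of the quadratic measure can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes \(n=2\), uses a hypothetical helper `rotate` for a planar rotation R with \(R^T R = I\) and \(\det R = 1\), and picks the illustrative function \(W(x) = V(x^T x)\) with \(V(s)=\frac{1}{2}(\frac{1}{2}s-1)^2\).

```python
import math

def rotate(x, angle):
    """Apply a planar rotation R (R^T R = I, det R = 1) to the vector x."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def xi(x):
    """Quadratic canonical measure xi = x^T x."""
    return x[0] ** 2 + x[1] ** 2

def W(x):
    """Illustrative (assumed) function W(x) = V(x^T x), V(s) = 0.5 * (0.5 * s - 1)^2."""
    return 0.5 * (0.5 * xi(x) - 1.0) ** 2

x = (0.3, -1.7)
for angle in (0.1, 1.0, 2.5):
    Rx = rotate(x, angle)
    # Both the measure and the composed function are unchanged by the rotation.
    print(angle, abs(xi(Rx) - xi(x)), abs(W(Rx) - W(x)))
```

The printed differences should be at the level of floating-point round-off, reflecting \((Rx)^T(Rx) = x^T R^T R x = x^T x\).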

According to the canonical duality theory, the subjective function \(F(x) = \langle x , \bar{x}^* \rangle \) is necessary for any given real-world system in order to have non-trivial solutions (states or outputs). Since the function g(x) in the standard d.c. programming (1) could be nonconvex, it is reasonable to assume the convex function h(x) in (1) is a quadratic function

$$\begin{aligned} Q(x) = \frac{1}{2} \langle x , C x \rangle + \langle x, f\rangle , \end{aligned}$$
(6)

where \(C: \mathscr {X}\rightarrow \mathscr {X}^*\) is a given symmetric positive definite operator (or matrix) and \(f\in \mathscr {X}^*\) is a given input. Thus, a canonical d.c. (CDC for short) minimization problem can be proposed as follows:

$$\begin{aligned} (CDC): \;\; \min \left\{ \varPi (x) = V(\varLambda (x) ) - Q (x) | \;\; x \in \mathscr {X}\right\} \end{aligned}$$
(7)

Since the canonical measure \(\xi = \varLambda (x) \in \mathscr {E}\) is nonlinear and \(V(\xi )\) is convex on \(\mathscr {E}\), the composition \(V(\varLambda (x))\) has a higher order nonlinearity than Q(x). Therefore, the coercivity condition for the target function \(\varPi (x)\) should naturally be satisfied, i.e.,

$$\begin{aligned} \lim _{\Vert x \Vert \rightarrow \infty } \{\varPi (x) = V(\varLambda (x) ) - Q(x) \} = \infty \end{aligned}$$
(8)

which is a necessary condition for the existence of the global minimal solution to (CDC). Clearly, this generalized d.c. minimization problem can be used to model a reasonably large class of real-world systems.

By the fact that \(V(\xi )\) is convex, l.s.c. on \(\mathscr {E}\), its conjugate can be uniquely defined by the Fenchel transformation

$$\begin{aligned} V^*(\xi ^*) = \sup \{ \langle \xi ; \xi ^* \rangle - V(\xi ) | \;\; \xi \in \mathscr {E}\}. \end{aligned}$$
(9)

The bilinear form \(\langle \xi ; \xi ^* \rangle \) puts \(\mathscr {E}\) and \(\mathscr {E}^*\) in duality. According to convex analysis (cf. [2]), \(V^*:\mathscr {E}^* \rightarrow \mathbb {R}\cup \{ + \infty \}\) is also convex, l.s.c. on its domain \(\mathscr {E}^*\) and the following generalized canonical duality relations [7] hold on \(\mathscr {E}\times \mathscr {E}^*\)

$$\begin{aligned} \xi ^* \in \partial V(\xi ) \;\; \Leftrightarrow \;\; \xi \in \partial V^*(\xi ^*) \;\; \Leftrightarrow \;\; V(\xi ) + V^*(\xi ^*) = \langle \xi ; \xi ^* \rangle . \end{aligned}$$
(10)

Replacing \( V(\varLambda (x)) \) in the target function \(\varPi (x)\) by the Fenchel-Young equality \(V(\xi ) = \langle \xi ; \xi ^* \rangle - V^*(\xi ^*)\), Gao and Strang’s total complementary function (see [7]) \(\varXi : \mathscr {X}\times \mathscr {E}^* \rightarrow \mathbb {R}\cup \{ - \infty \}\) for this (CDC) can be obtained as

$$\begin{aligned} \varXi (x, \xi ^* ) = \langle \varLambda (x) ; \xi ^* \rangle - V^*(\xi ^*) - Q(x) . \end{aligned}$$
(11)

By this total complementary function, the canonical dual of \(\varPi (x)\) can be obtained as

$$\begin{aligned} \varPi ^d(\xi ^*) = \inf \{ \varXi (x, \xi ^*) | \;\; x \in \mathscr {X}\} = Q^{\varLambda }(\xi ^*) - V^*(\xi ^*), \end{aligned}$$
(12)

where \(Q^\varLambda :\mathscr {E}^* \rightarrow \mathbb {R}\cup \{ - \infty \}\) is the so-called \(\varLambda \)-conjugate of Q(x) defined by (see [7])

$$\begin{aligned} Q^\varLambda (\xi ^*) = \inf \{ \langle \varLambda (x) ; \xi ^* \rangle - Q(x) \; | \;\; x \in \mathscr {X}\}. \end{aligned}$$
(13)

If this \(\varLambda \)-conjugate has a non-empty effective domain, the following canonical duality

$$\begin{aligned} \inf _{x \in \mathscr {X}} \varPi (x) = \sup _{\xi ^* \in \mathscr {E}^*} \varPi ^d(\xi ^*) \end{aligned}$$
(14)

holds under certain conditions, which will be illustrated in the next section.

3 Application and Analytical Solution

Let us consider a special application in \(\mathbb {R}^n\) such that

$$\begin{aligned} g(x)= \sum _{i=1}^p\exp \left( \frac{1}{2}x^T A_i x-\alpha _i\right) + \sum _{j=1}^r\frac{1}{2}\left( \frac{1}{2}x^T B_jx-\beta _j\right) ^2, \end{aligned}$$
(15)

where \( \{ A_i \}_{i=1}^p \in \mathbb {R}^{n\times n}\) are symmetric matrices and \(\{B_j \}_{j=1}^r \in \mathbb {R}^{n\times n}\) are symmetric positive definite matrices, \(\alpha _i\) and \(\beta _j\) are real numbers. Clearly, \(g:\mathbb {R}^n \rightarrow \mathbb {R}\) is nonconvex and highly nonlinear. This type of nonconvex function covers many real applications.

The canonical measure in this application can be given as

$$ \xi =\begin{pmatrix}\theta \\ \eta \end{pmatrix}=\varLambda (x)=\begin{pmatrix}\left\{ \,\frac{1}{2}x^T A_ix\right\} _{i=1}^p\\ \left\{ \,\frac{1}{2}x^T B_jx\right\} _{j=1}^r \end{pmatrix}~:~\mathbb {R}^n\rightarrow \mathscr {E}_a\subseteq \mathbb {R}^m $$

where \(m=p+r\). Therefore, a canonical function can be defined on \(\mathscr {E}_a\):

$$ V(\xi )=V_1(\theta )+V_2(\eta ) $$

where

$$\begin{aligned}&V_1(\theta )= \sum _{i=1}^p\exp \left( \theta _i-\alpha _i\right) ,\nonumber \\&V_2(\eta )=\sum _{j=1}^r\frac{1}{2}(\eta _j-\beta _j)^2.\nonumber \end{aligned}$$

Here \(\theta _i\) and \(\eta _j\) denote the ith component of \(\theta \) and the jth component of \(\eta \), respectively. Since \(V_1(\theta )\) and \(V_2(\eta )\) are convex, \(V(\xi )\) is a convex function. By the Legendre transformation, we have the following equation

$$\begin{aligned} V(\xi )+V^*(\zeta )=\xi ^T\zeta , \end{aligned}$$
(16)

where

$$ \zeta =\begin{pmatrix}\tau \\ \sigma \end{pmatrix}=\begin{pmatrix}\nabla V_1(\theta )\\ \nabla V_2(\eta ) \end{pmatrix}=\begin{pmatrix}\left\{ \,\exp \left( \theta _i-\alpha _i\right) \right\} _{i=1}^p\\ \left\{ \,\eta _j-\beta _j\right\} _{j=1}^r\end{pmatrix}~:~\mathscr {E}_a\rightarrow \mathscr {E}_a^* \subset \mathbb {R}^m $$

and \(V^*(\zeta )\) is the conjugate function of \(V(\xi )\), defined as

$$\begin{aligned} V^*(\zeta )=V_1^*(\tau )+V_2^*(\sigma ) \end{aligned}$$
(17)

with

$$\begin{aligned}&V_1^*(\tau )=\sum _{i=1}^p\left( \alpha _i+\ln (\tau _i)-1\right) \tau _i,\nonumber \\&V_2^*(\sigma )=\frac{1}{2}\sigma ^T\sigma +\beta ^T\sigma ,\nonumber \end{aligned}$$

where \(\beta =\{\beta _j\}\).
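The conjugate pair above can be sanity-checked numerically in the scalar case \(p=r=1\). The sketch below is an illustration added here, not taken from the paper: the function names and the sample values of \(\theta ,\eta ,\alpha ,\beta \) are assumptions. It evaluates the gap \(V(\xi )+V^*(\zeta )-\xi ^T\zeta \), which should vanish when \(\zeta =\nabla V(\xi )\) as in (16) and be non-negative otherwise.

```python
import math

def V(theta, eta, alpha, beta):
    """Canonical function V = V_1 + V_2 for p = r = 1."""
    return math.exp(theta - alpha) + 0.5 * (eta - beta) ** 2

def V_star(tau, sigma, alpha, beta):
    """Conjugate V^* = V_1^* + V_2^* for p = r = 1 (requires tau > 0)."""
    return (alpha + math.log(tau) - 1.0) * tau + 0.5 * sigma ** 2 + beta * sigma

def young_gap(theta, eta, tau, sigma, alpha, beta):
    """V(xi) + V^*(zeta) - xi^T zeta for arbitrary xi = (theta, eta), zeta = (tau, sigma)."""
    return V(theta, eta, alpha, beta) + V_star(tau, sigma, alpha, beta) - (theta * tau + eta * sigma)

def dual_of(theta, eta, alpha, beta):
    """Dual variables zeta = grad V(xi)."""
    return math.exp(theta - alpha), eta - beta

alpha, beta = 1.0, 0.5          # sample parameters (assumed)
theta, eta = 0.3, -1.2          # sample canonical measure (assumed)
tau, sigma = dual_of(theta, eta, alpha, beta)
print(young_gap(theta, eta, tau, sigma, alpha, beta))              # equality case
print(young_gap(theta, eta, tau + 0.4, sigma - 0.3, alpha, beta))  # mismatched pair
```

The first printed value should be zero up to round-off; the second should be strictly positive, in line with the Fenchel-Young inequality.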

Since the canonical measure in this application is a quadratic operator, the total complementary function \(\Xi : \mathbb {R}^n\times \mathscr {E}_a^*\rightarrow \mathbb {R}\) has the following form

$$\begin{aligned} \Xi (x,\zeta ) =\frac{1}{2}x^T G(\zeta )x-f^Tx-V_1^*(\tau )-V_2^*(\sigma ), \end{aligned}$$
(18)

where

$$ G(\zeta )= \sum _{i=1}^p\tau _i A_i+\sum _{j=1}^r\sigma _j B_j-C. $$

Notice that for any given \(\zeta \), the total complementary function \(\Xi (x,\zeta )\) is a quadratic function of x and its stationary points are the solutions of the following equation

$$\begin{aligned} \nabla _{x}\Xi (x,\zeta )= G(\zeta )x-f=0. \end{aligned}$$
(19)

If \(\det (G(\zeta ))\ne 0\) for a given \(\zeta \), then (19) can be solved analytically and has the unique solution \(x= G(\zeta )^{{-1}}f\). Let

$$\begin{aligned} \mathscr {S}_a=\left\{ \,\zeta \in \mathscr {E}_a^*| \; ~ \det (G(\zeta ))\ne 0 \right\} . \end{aligned}$$
(20)

Thus, on \(\mathscr {S}_a\) the canonical dual function \(\Pi ^d(\zeta )\) can then be written explicitly as

$$\begin{aligned} \Pi ^d(\zeta )=-\frac{1}{2}f^T G(\zeta )^{{-1}}f - V_1^*(\tau )-V_2^*(\sigma ). \end{aligned}$$
(21)
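For completeness, the substitution behind (21) can be written out as follows. This is a routine step added here as a sketch; it assumes only that \(\zeta \in \mathscr {S}_a\) and that \(G(\zeta )\) is symmetric, which holds because \(A_i\), \(B_j\) and C are symmetric.

$$\begin{aligned} \Xi \left( G(\zeta )^{-1}f,\zeta \right)&= \frac{1}{2}f^T G(\zeta )^{-1} G(\zeta ) G(\zeta )^{-1}f - f^T G(\zeta )^{-1}f - V_1^*(\tau )-V_2^*(\sigma ) \\&= -\frac{1}{2}f^T G(\zeta )^{-1}f - V_1^*(\tau )-V_2^*(\sigma ). \end{aligned}$$

Note that this is the stationary value of \(\Xi (\cdot ,\zeta )\) in x; it coincides with the infimum over x only when \(G(\zeta )\) is positive definite.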

Clearly, both \(\Pi ^d(\zeta )\) and its domain \(\mathscr {S}_a\) are nonconvex. The canonical dual problem is to find all stationary points of \(\Pi ^d(\zeta )\) on its domain, i.e.,

$$\begin{aligned} (\mathscr {P}^d):~~~~\text {sta}\left\{ \,\Pi ^d(\zeta )~|~\zeta \in \mathscr {S}_a\right\} . \end{aligned}$$
(22)

Theorem 1.

(Analytic Solution and Complementary-Dual Principle).

Problem (\(\mathscr {P}^d\)) is canonical dual to the problem (\(\mathscr {P}\)) in the sense that if \(\bar{\zeta }\in \mathscr {S}_a\) is a stationary point of \(\Pi ^d(\zeta )\), then

$$\begin{aligned} \bar{x}=G(\bar{\zeta })^{{-1}}f \end{aligned}$$
(23)

is a stationary point of \(\Pi (x)\), the pair \((\bar{x},\bar{\zeta })\) is a stationary point of \(\Xi (x,\zeta )\), and we have

$$\begin{aligned} \Pi (\bar{x})=\Xi (\bar{x},\bar{\zeta })=\Pi ^d(\bar{\zeta }). \end{aligned}$$
(24)

The proof of this theorem is analogous to that in [6]. Theorem 1 shows that there is no duality gap between the primal problem (\(\mathscr {P}\)) and the canonical dual problem (\(\mathscr {P}^d\)).

4 Triality Theory

In this section we study the global optimality conditions for the critical solutions of the primal and dual problems. In order to identify both global and local extrema of the two problems, we let

$$\begin{aligned}&\mathscr {S}_a^+= \left\{ \,\zeta \in \mathscr {S}_a~|~G (\zeta ) \succ 0\right\} ,\\&\mathscr {S}_a^-= \left\{ \,\zeta \in \mathscr {S}_a~|~G (\zeta ) \prec 0\right\} . \end{aligned}$$

where \(G\succ 0\) means that G is a positive definite matrix and \(G\prec 0\) means that G is a negative definite matrix. It is easy to prove that both \(\mathscr {S}_a^+\) and \(\mathscr {S}_a^-\) are convex sets and

$$\begin{aligned} Q^\varLambda (\zeta ) = \inf \{ \langle \varLambda (x) ; \zeta \rangle - Q(x) | \;\; x \in \mathbb {R}^n \} = \left\{ \begin{array}{ll} -\frac{1}{2}f^T G(\zeta )^{{-1}}f \;\; &{} \text{ if } \zeta \in \mathscr {S}_a^+\\ -\infty &{} \text{ otherwise } \end{array} \right. \end{aligned}$$
(25)

This shows that \(\mathscr {S}_a^+\) is the effective domain of \(Q^\varLambda (\zeta )\).

For convenience, we first give the first and second derivatives of functions \(\Pi (x)\) and \(\Pi ^d(\zeta )\):

$$\begin{aligned}&\nabla \Pi (x)= Gx-f, \end{aligned}$$
(26)
$$\begin{aligned}&\nabla ^2\Pi (x)= G+Z_0 H Z_0^T, \end{aligned}$$
(27)
$$\begin{aligned}&\nabla \Pi ^d(\zeta )= \left( \begin{array}{l} \left\{ \,\frac{1}{2}f^T G^{-1}A_i G^{{-1}}f-\alpha _i-\ln (\tau _i)\right\} _{i=1}^p \\ \left\{ \,\frac{1}{2}f^T G^{-1}B_j G^{{-1}}f-\sigma _j-\beta _j\right\} _{j=1}^r \end{array}\right) , \end{aligned}$$
(28)
$$\begin{aligned}&\nabla ^2\Pi ^d(\zeta )=- Z^T G^{-1}Z-H^{-1}, \end{aligned}$$
(29)

where \(Z_0,Z\in \mathbb {R}^{n\times m}\) and \(H\in \mathbb {R}^{m\times m}\) are defined as

$$\begin{aligned}&Z_0= \begin{bmatrix} A_1x,\ldots , A_px, B_1x,\ldots , B_rx \end{bmatrix},\\&Z= \begin{bmatrix} A_1G^{{-1}}f,\ldots , A_pG^{{-1}}f, B_1G^{{-1}}f,\ldots , B_rG^{{-1}}f \end{bmatrix},\\&H= \begin{bmatrix} \mathop {\mathrm {diag}}(\tau )&0\\ 0&E_r \end{bmatrix} , \end{aligned}$$

where \(E_r\) is the \(r\times r\) identity matrix, so that H is \(m\times m\) with \(m=p+r\). By the fact that \( \tau > 0\), the matrix \(H^{-1}\) is positive definite.
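The gradient formula (28) can be compared against central finite differences of (21). The sketch below is an illustration added here, not the authors' code: it assumes \(p=r=1\) and diagonal \(2\times 2\) matrices stored as tuples of their diagonal entries, with sample data matching the matrices of Example 2 in Sect. 5; all function names are hypothetical.

```python
import math

# Diagonal 2x2 data for p = r = 1 (assumed sample values, diagonals as tuples).
A = (-1.0, -1.5)
B = (2.0, 1.0)
C = (2.0, 3.0)
f = (5.0, 2.0)
alpha, beta = -4.0, 0.5

def G_diag(tau, sigma):
    """Diagonal of G(zeta) = tau*A + sigma*B - C."""
    return tuple(tau * a + sigma * b - c for a, b, c in zip(A, B, C))

def dual(tau, sigma):
    """Canonical dual function (21); needs tau > 0 and nonzero diagonal of G."""
    g = G_diag(tau, sigma)
    quad = sum(fi * fi / gi for fi, gi in zip(f, g))
    v1 = (alpha + math.log(tau) - 1.0) * tau
    v2 = 0.5 * sigma ** 2 + beta * sigma
    return -0.5 * quad - v1 - v2

def dual_grad(tau, sigma):
    """Analytic gradient (28), using x = G^{-1} f."""
    g = G_diag(tau, sigma)
    x = tuple(fi / gi for fi, gi in zip(f, g))
    d_tau = 0.5 * sum(a * xi * xi for a, xi in zip(A, x)) - alpha - math.log(tau)
    d_sigma = 0.5 * sum(b * xi * xi for b, xi in zip(B, x)) - sigma - beta
    return d_tau, d_sigma

def dual_grad_fd(tau, sigma, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    d_tau = (dual(tau + h, sigma) - dual(tau - h, sigma)) / (2 * h)
    d_sigma = (dual(tau, sigma + h) - dual(tau, sigma - h)) / (2 * h)
    return d_tau, d_sigma

print(dual_grad(0.5, 5.0))
print(dual_grad_fd(0.5, 5.0))
```

The two printed pairs should agree to several digits; a mismatch would indicate a sign or index slip in either (21) or (28).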

The following lemma is elementary, and its proof is omitted.

Lemma 1.

If \(M_1,M_2,\ldots ,M_N\in \mathbb {R}^{n\times n}\) are symmetric positive semi-definite matrices, then \(M=M_1+M_2+\ldots +M_N\) is also a positive semi-definite matrix.

Lemma 2.

If \(\lambda _{G}\) is an arbitrary eigenvalue of G, it follows that

$$\lambda _{G}\ge \sum _{i=1}^p\tau _i \lambda ^{A_i}_{min}+\sum _{j=1}^r\sigma _j\bar{\lambda }^{B_j}-\lambda ^{C}_{max},$$

in which \(\lambda ^{A_i}_{min}\) is the smallest eigenvalue of \(A_i\), \(\lambda ^{C}_{max}\) is the largest eigenvalue of C, and

$$\begin{aligned} \bar{\lambda }^{B_j}= \left\{ \begin{array}{ll} \lambda ^{B_j}_{min},&{}~~\sigma _j> 0\\ \lambda ^{B_j}_{max},&{}~~\sigma _j\le 0, \end{array}\right. \end{aligned}$$
(30)

where \(\lambda ^{B_j}_{min}\) and \(\lambda ^{B_j}_{max}\) are the smallest eigenvalue and the largest eigenvalue of \(B_j\) respectively.

Proof.

First, we need to prove that \(\tau _i(A_i-\lambda ^{A_i}_{min}E_n)\), \(\lambda ^{C}_{max}E_n-C\) and \(\sigma _j(B_j-\bar{\lambda }^{B_j}E_n)\) are all symmetric positive semi-definite matrices.

  1. (a)

    As \(\lambda ^{A_i}_{min}\) is the smallest eigenvalue of \(A_i\), then \(A_i-\lambda ^{A_i}_{min}E_n\) is symmetric positive semi-definite, so \(\tau _i(A_i-\lambda ^{A_i}_{min}E_n)\) is symmetric positive semi-definite with \(\tau _i=\exp \left( \theta _i-\alpha _i\right) >0\).

  2. (b)

    As \(\lambda ^{C}_{max}\) is the largest eigenvalue of C, then \(\lambda ^{C}_{max}E_n-C\) is a symmetric positive semi-definite matrix.

  3. (c)
    1. (c.1)

      As \(\lambda ^{B_j}_{min}\) is the smallest eigenvalue of \(B_j\), then \(B_j-\lambda ^{B_j}_{min}E_n\) is symmetric positive semi-definite, so when \(\sigma _j> 0\) it holds that \(\sigma _j(B_j-\lambda ^{B_j}_{min}E_n)\) is symmetric positive semi-definite.

    2. (c.2)

      As \(\lambda ^{B_j}_{max}\) is the largest eigenvalue of \(B_j\), then \(B_j-\lambda ^{B_j}_{max}E_n\) is symmetric negative semi-definite, so when \(\sigma _j\le 0\) it holds that \(\sigma _j(B_j-\lambda ^{B_j}_{max}E_n)\) is symmetric positive semi-definite.

    From (c.1) and (c.2), we know \(\sigma _j(B_j-\bar{\lambda }^{B_j}E_n)\) is always symmetric positive semi-definite.

Then by (a), (b), (c) and Lemma 1, we have

$$\sum _{i=1}^p\tau _i(A_i-\lambda ^{A_i}_{min}E_n)+\sum _{j=1}^r\sigma _j(B_j-\bar{\lambda }^{B_j}E_n)+\lambda ^{C}_{max}E_n-C$$

is a positive semi-definite matrix, which is equivalent to

$$G-\left( \sum _{i=1}^p\tau _i\lambda ^{A_i}_{min}+\sum _{j=1}^r\sigma _j\bar{\lambda }^{B_j}-\lambda ^{C}_{max}\right) E_n$$

is a positive semi-definite matrix, which implies that every eigenvalue of G is greater than or equal to \(\sum _{i=1}^p\tau _i \lambda ^{A_i}_{min}+\sum _{j=1}^r\sigma _j\bar{\lambda }^{B_j}-\lambda ^{C}_{max}\). \(\square \)
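The bound of Lemma 2 can be illustrated in the simplest setting. The sketch below is an added example under stated assumptions: \(p=r=1\) and diagonal matrices (given as tuples of diagonal entries), so that the eigenvalues of G are just its diagonal entries. The helper names and sample data are hypothetical; the data matches the matrices of Example 2 in Sect. 5.

```python
def delta_bound(tau, sigma, A, B, C):
    """Right-hand side of Lemma 2 for p = r = 1 and diagonal A, B, C."""
    # lambda-bar switches between the smallest and largest eigenvalue of B
    # according to the sign of sigma, as in (30).
    lam_B = min(B) if sigma > 0 else max(B)
    return tau * min(A) + sigma * lam_B - max(C)

def eigenvalues_G(tau, sigma, A, B, C):
    """Eigenvalues of G = tau*A + sigma*B - C (diagonal entries for diagonal data)."""
    return [tau * a + sigma * b - c for a, b, c in zip(A, B, C)]

A, B, C = (-1.0, -1.5), (2.0, 1.0), (2.0, 3.0)   # sample diagonals (assumed)
for tau, sigma in [(0.145563, 3.95352), (54.3685, -0.492123)]:
    bound = delta_bound(tau, sigma, A, B, C)
    eigs = eigenvalues_G(tau, sigma, A, B, C)
    print(bound, eigs, all(lam >= bound for lam in eigs))
```

The first pair has \(\sigma >0\) and the second has \(\sigma \le 0\), so both branches of (30) are exercised; in each case every eigenvalue should lie at or above the bound.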

Based on the above lemma, the following assumption is made in order to establish the solution method.

Assumption 1

There is a critical point \(\zeta =(\tau ,\sigma )\) of \(\Pi ^d(\zeta )\), satisfying \(\varDelta >0\) where

$$\varDelta =\sum _{i=1}^p\tau _i\lambda ^{A_i}_{min}+\sum _{j=1}^r\sigma _j\bar{\lambda }^{B_j}-\lambda ^{C}_{max}.$$

Lemma 3.

If \(\bar{\zeta }\) is a stationary point of \(\varPi ^d(\zeta ) \) satisfying Assumption 1, then \(\bar{\zeta }\in \mathscr {S}_a^+\).

Proof

From Lemma 2, we know that if \(\lambda _{G}\) is an arbitrary eigenvalue of G, then \(\lambda _{G}\ge \varDelta \). If \(\bar{\zeta }\) is a critical point satisfying Assumption 1, then \(\varDelta >0\), so every eigenvalue of G satisfies \(\lambda _{G}\ge \varDelta >0\). Hence G is a positive definite matrix, i.e., \(\bar{\zeta }\in \mathscr {S}_a^+\). \(\square \)

The following lemma is also needed. Its proof is similar to that of Lemma 6 in [5] and is omitted.

Lemma 4.

Suppose that \( P\in \mathbb {R}^{n\times n}\), \( U\in \mathbb {R}^{m\times m}\) and \( W\in \mathbb {R}^{n\times m}\) are given symmetric matrices with

$$ P=\begin{bmatrix} P_{11}&P_{12}\\ P_{21}&P_{22} \end{bmatrix}\prec 0, ~~ U=\begin{bmatrix} U_{11}&0\\0&U_{22} \end{bmatrix}\succ 0, \text { and }W=\begin{bmatrix} W_{11}&0\\0&0 \end{bmatrix}, $$

where \( P_{11}\), \( U_{11}\) and \( W_{11}\) are \(r\times r\)-dimensional matrices, and \( W_{11}\) is nonsingular. Then,

$$\begin{aligned} - W^T P^{-1} W- U^{-1}\preceq 0\Leftrightarrow P+ W U W^T\preceq 0. \end{aligned}$$
(31)

Now we give the main result of this paper, the triality theorem, which illustrates the relationships between the primal and canonical dual problems on global and local solutions under Assumption 1.

Theorem 2.

(Triality Theorem) Suppose that \(\bar{\zeta }\) is a critical point of \(\Pi ^d(\zeta )\), and \(\bar{x}=G(\bar{\zeta })^{{-1}}f\).

  1. 1.

    Min–max duality: If \(\bar{\zeta }\) is the critical point satisfying Assumption 1, then the canonical min–max duality holds in the form of

    $$\begin{aligned} \Pi (\bar{x}) =\min _{x\in \mathbb {R}^n} \Pi (x)=\max _{\zeta \in \mathscr {S}_a^+} \Pi ^d(\zeta )=\Pi ^d(\bar{\zeta }). \end{aligned}$$
    (32)
  2. 2.

    Double-max duality: If \(\bar{\zeta }\in \mathscr {S}_a^-\), the double-max duality holds in the form that if \(\bar{x}\) is a local maximizer of \(\Pi (x)\) or \(\bar{\zeta }\) is a local maximizer of \(\Pi ^d(\zeta )\), we have

    $$\begin{aligned} \Pi (\bar{x}) =\max _{x\in \mathscr {X}_0} \Pi (x)=\max _{\zeta \in \mathscr {S}_0} \Pi ^d(\zeta )=\Pi ^d(\bar{\zeta }) \end{aligned}$$
    (33)

    where \(\bar{x}\in \mathscr {X}_0 \subset \mathbb {R}^n\) and \(\bar{\zeta }\in \mathscr {S}_0 \subset \mathscr {S}_a^-\).

  3. 3.

    Double-min duality: If \(\bar{\zeta }\in \mathscr {S}_a^-\), then the double-min duality holds in the form that when \(m=n\), if \(\bar{x}\) is a local minimizer of \(\Pi (x)\) or \(\bar{\zeta }\) is a local minimizer of \(\Pi ^d(\zeta )\), we have

    $$\begin{aligned} \Pi (\bar{x}) =\min _{x\in \mathscr {X}_0} \Pi (x)=\min _{\zeta \in \mathscr {S}_0} \Pi ^d(\zeta )=\Pi ^d(\bar{\zeta }) \end{aligned}$$
    (34)

    where \(\bar{x}\in \mathscr {X}_0 \subset \mathbb {R}^n\) and \(\bar{\zeta }\in \mathscr {S}_0 \subset \mathscr {S}_a^-\).

Proof

  1. 1.

    Because \(\bar{\zeta }\) is a critical point satisfying Assumption 1, by Lemma 3 it holds that \(\bar{\zeta }\in \mathscr {S}_a^+\), i.e., \(G(\bar{\zeta })\succ 0\). As \(G(\bar{\zeta })\succ 0\) and \( H\succ 0\), by (29) we know the Hessian of the dual function is negative definite, i.e., \(\nabla ^2\Pi ^d(\zeta )\prec 0\), which implies that \(\Pi ^d(\zeta )\) is strictly concave over \(\mathscr {S}_a^+\). Hence, we get

    $$ \Pi ^d(\bar{\zeta })=\max _{\zeta \in \mathscr {S}_a^+} \Pi ^d(\zeta ). $$

    By the convexity of \(V(\xi )\), we have \(V(\xi )-V(\bar{\xi })\ge (\xi -\bar{\xi })^T\nabla V(\bar{\xi })=(\xi -\bar{\xi })^T\bar{\zeta }\) (see [11]), so

    $$V(\varLambda (x))-V(\varLambda (\bar{x}))\ge (\varLambda (x)-\varLambda (\bar{x}))^T\bar{\zeta },$$

    which implies

    $$\begin{aligned} \Pi (x)-\Pi (\bar{x})\ge & {} (\varLambda (x)-\varLambda (\bar{x}))^T\bar{\zeta }-\frac{1}{2}x^T C x+\frac{1}{2}\bar{x}^T C \bar{x}-f^T(x-\bar{x})\nonumber \\= & {} \frac{1}{2}x^T G(\bar{\zeta })x-\frac{1}{2}\bar{x}^T G(\bar{\zeta })\bar{x}-(x-\bar{x})^TG(\bar{\zeta })\bar{x}, \end{aligned}$$
    (35)

    where the equality uses \(\varLambda (x)^T\bar{\zeta }-\frac{1}{2}x^T C x=\frac{1}{2}x^T G(\bar{\zeta })x\) and \(f=G(\bar{\zeta })\bar{x}\). Because \(G(\bar{\zeta })\succ 0\), the convexity of \(\frac{1}{2}x^T G(\bar{\zeta })x\) with respect to x in \(\mathbb {R}^n\) leads to

    $$\frac{1}{2}x^T G(\bar{\zeta })x-\frac{1}{2}\bar{x}^T G(\bar{\zeta })\bar{x}\ge (x-\bar{x})^TG(\bar{\zeta })\bar{x}.$$

    Then by (35), \(\Pi (x)\ge \Pi (\bar{x})\) for any \(x\in \mathbb {R}^n\), which together with Theorem 1 and (4) shows that Eq. (32) is true.

  2. 2.

    If \(\bar{\zeta }\) is a local maximizer of \(\Pi ^d(\zeta )\) over \(\mathscr {S}_a^-\), it is true that \(\nabla ^2\Pi ^d(\bar{\zeta })=- Z^T G^{-1}Z- H^{-1}\preceq 0\) and there exists a neighborhood \(\mathscr {S}_0\subset \mathscr {S}_a^-\) such that for all \(\zeta \in \mathscr {S}_0\), \(\nabla ^2\Pi ^d(\zeta )\preceq 0\). Since the map \(x= G^{{-1}}f\) is continuous over \(\mathscr {S}_a\), the image of the map over \(\mathscr {S}_0\) is a neighborhood of \(\bar{x}\), which is denoted by \(\mathscr {X}_0\). Now we prove that for any \(x\in \mathscr {X}_0\), \(\nabla ^2\Pi (x)\preceq 0\), which, together with the fact that \(\bar{x}\) is a critical point of \(\Pi (x)\), implies that \(\bar{x}\) is a maximizer of \(\Pi (x)\) over \(\mathscr {X}_0\). By singular value decomposition, there exist orthogonal matrices \( J\in \mathbb {R}^{n\times n}\) and \( K\in \mathbb {R}^{m\times m}\), and a matrix \( R\in \mathbb {R}^{n\times m}\) with

    $$\begin{aligned} R_{ij}= \left\{ \begin{array}{ll} \delta _i, &{}~~ i=j \text { and }i=1,\ldots ,r,\\ 0,&{}~~\text {otherwise}, \end{array}\right. \end{aligned}$$
    (36)

    where \(\delta _i>0\) for \(i=1,\ldots ,r\) and \(r={\text {rank}}( Z)\), such that \(Z H^{\frac{1}{2}}= J R K\). Then

    $$\begin{aligned} Z = J R K H^{-\frac{1}{2}}. \end{aligned}$$
    (37)

    For any \(x\in \mathscr {X}_0\), let \(\zeta \) be a point satisfying \(x= G^{{-1}}f\). Therefore, \(\nabla ^2\Pi ^d(\zeta )=- Z^T G^{-1}Z- H^{-1}\preceq 0\), then it holds that

    $$\begin{aligned} - H^{-\frac{1}{2}} K^T R^T J^T G^{-1}J R K H^{-\frac{1}{2}}-H^{-1}\preceq 0. \end{aligned}$$
    (38)

    Multiplying above inequality by \( K H^{\frac{1}{2}}\) from the left and \( H^{\frac{1}{2}} K^T\) from the right, it can be obtained that

    $$\begin{aligned} - R^T J^T G^{-1}J R-E_m\preceq 0, \end{aligned}$$
    (39)

    which, by Lemma 4, is further equivalent to

    $$\begin{aligned} J^T G J+ R R^T\preceq 0, \end{aligned}$$
    (40)

    then it follows that

    $$\begin{aligned} -G\succeq J R R^T J^T=J R K H^{-\frac{1}{2}} H H^{-\frac{1}{2}} K^T R^T J^T= Z H Z^T. \end{aligned}$$
    (41)

    Thus, \(\nabla ^2\Pi (x)= G+ Z H Z^T \preceq 0\), so \(\bar{x}\) is a maximizer of \(\Pi (x)\) over \(\mathscr {X}_0\). Similarly, we can prove that if \(\bar{x}\) is a maximizer of \(\Pi (x)\) over \(\mathscr {X}_0\), then \(\bar{\zeta }\) is a maximizer of \(\Pi ^d(\zeta )\) over \(\mathscr {S}_0\). By Theorem 1, Eq. (33) is proved.

  3. 3.

    Now we prove the double-min duality. Suppose that \(\bar{\zeta }\) is a local minimizer of \(\Pi ^d(\zeta )\) in \(\mathscr {S}_a^-\). Then there exists a neighborhood \(\mathscr {S}_0\subset \mathscr {S}_a^-\) of \(\bar{\zeta }\) such that for any \(\zeta \in \mathscr {S}_0\), \(\nabla ^2\Pi ^d(\zeta )\succeq 0\). Let \(\mathscr {X}_0\) denote the image of the map \(x= G^{{-1}}f\) over \(\mathscr {S}_0\), which is a neighborhood of \(\bar{x}\). For any \(x\in \mathscr {X}_0\), let \(\zeta \) be a point that satisfies \(x= G^{{-1}}f\). It follows from \(\nabla ^2\Pi ^d(\zeta )=- Z^T G^{-1}Z- H^{-1}\succeq 0\) that \(- Z^T G^{-1}Z\succeq H^{-1}\succ 0\), which implies that the square matrix Z (recall \(m=n\)) is invertible. Then it is true that

    $$\begin{aligned} - G^{-1}\succeq ( Z^T)^{-1} H^{-1} Z^{-1}, \end{aligned}$$
    (42)

    which is further equivalent to

    $$\begin{aligned} - G\preceq Z H Z^T. \end{aligned}$$
    (43)

    Thus, \(\nabla ^2\Pi (x)= G+ Z H Z^T\succeq 0\) and x is a local minimizer of \(\Pi (x)\). The converse can be proved similarly. By Theorem 1, Eq. (34) then holds.

The theorem is proved. \(\square \)

5 Examples

In this section, let \(p=r=1\). From the definition of the (CDC) problem, \(A_1\) is a symmetric matrix, and \(B_1\) and \(C_1\) are two positive definite matrices. According to different cases of \(A_1\), the following five motivating examples are provided to illustrate the proposed canonical duality method. By examining the critical points of the dual function, we will show how the dualities in the triality theory are verified by these examples.

Example 1

We consider the case that \(A_1\) is positive definite. Let \(\alpha _1=\beta _1=1\) and

$$ A_1=\begin{bmatrix} 2&0\\ 0&3 \end{bmatrix}, ~~ B_1=\begin{bmatrix} 1&0\\ 0&1.5 \end{bmatrix}, ~~ C_1=\begin{bmatrix} 0.5&0\\ 0&2 \end{bmatrix},\text { and }f=\begin{bmatrix} 1 \\ 2 \end{bmatrix}, $$

then the primal problem:

$$\begin{aligned} \min _{(x,y)\in \mathbb {R}^2}\Pi (x,y)= & {} \exp \left( x^2+1.5y^2-1\right) +0.5\left( 0.5x^2+0.75y^2-1\right) ^2-0.25x^2\nonumber \\&-y^2-x-2y.\nonumber \end{aligned}$$

The corresponding canonical dual function is

$$ \Pi ^d(\tau ,\sigma )=-0.5\left( \frac{1}{2\tau +\sigma -0.5}+\frac{4}{3\tau +1.5\sigma -2}\right) -\tau \ln (\tau )-0.5\sigma ^2-\sigma . $$

so there is no duality gap, and \((\bar{x}_1,\bar{y}_1)\) is the global solution of the primal problem, which demonstrates the min–max duality (see Fig. 1).

Fig. 1 The min–max duality in Example 1: (a) contour plot of function \(\Pi ^d(\tau ,\sigma )\) near \((\bar{\tau }_1,\bar{\sigma }_1)\); (b) contour plot of function \(\Pi (x,y)\); (c) graph of function \(\Pi (x,y)\) near \((\bar{x}_1,\bar{y}_1)\)

Example 2

We consider the case that \(A_1\) is negative definite. Let \(\alpha _1=-4\), \(\beta _1=0.5\) and

$$ A_1=\begin{bmatrix} -1&0\\ 0&-1.5 \end{bmatrix}, ~~ B_1=\begin{bmatrix} 2&0\\ 0&1 \end{bmatrix}, ~~ C_1=\begin{bmatrix} 2&0\\ 0&3 \end{bmatrix},\text { and }f=\begin{bmatrix} 5 \\ 2 \end{bmatrix}, $$

then the primal problem:

$$\begin{aligned} \min _{(x,y)\in \mathbb {R}^2}\Pi (x,y)= & {} \exp \left( -0.5x^2-0.75y^2+4\right) +0.5\left( x^2+0.5y^2-0.5\right) ^2-x^2 \nonumber \\&-1.5y^2-5x-2y .\nonumber \end{aligned}$$

The corresponding canonical dual function is

$$ \Pi ^d(\tau ,\sigma )=-0.5\left( \frac{25}{-\tau +2\sigma -2}+\frac{4}{-1.5\tau +\sigma -3}\right) -\tau \ln (\tau )+5\tau -0.5\sigma ^2-0.5\sigma . $$

In this problem, \(\lambda ^{A_1}_{min}=-1.5\), \(\lambda ^{B_1}_{min}=1\), \(\lambda ^{B_1}_{max}=2\), and \(\lambda ^{C_1}_{max}=3\). It is noticed that \((\bar{\tau }_1,\bar{\sigma }_1)=(0.145563,3.95352)\) is a critical point of the dual function \(\Pi ^d(\tau ,\sigma )\) (see Fig. 2a). As \(\bar{\sigma }_1>0\), we have \(\bar{\lambda }^{B_1}=\lambda ^{B_1}_{min}\) and

$$\varDelta =\bar{\tau }_1 \lambda ^{A_1}_{min}+\bar{\sigma }_1\lambda ^{B_1}_{min}-\lambda ^{C_1}_{max}=0.7352>0,$$

so Assumption 1 is satisfied and \((\bar{\tau }_1,\bar{\sigma }_1)\) is in \(\mathscr {S}_a^+\). By Theorem 1, we get \((\bar{x}_1,\bar{y}_1)=(0.867833, 2.72044)\). Moreover, we have

$$\begin{aligned} \Pi (\bar{x}_1,\bar{y}_1)=\Pi ^d(\bar{\tau }_1,\bar{\sigma }_1)=-13.6736,\nonumber \end{aligned}$$

so there is no duality gap, and \((\bar{x}_1,\bar{y}_1)\) is the global solution of the primal problem, which demonstrates the min–max duality (see Fig. 2).
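The zero duality gap reported for Example 2 can be reproduced numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: the first primal term is taken as \(\exp (0.5x^TA_1x-\alpha _1)\), matching the form used in Examples 3 and 4; the primal point is recovered as \(\bar{x}=G^{-1}f\) with \(G(\tau ,\sigma )=\tau A_1+\sigma B_1-C_1\), which reproduces the values attributed to Theorem 1; and the function names `primal`, `dual` and `g_diag` are hypothetical helpers introduced only for this illustration.

```python
import math

# Example 2 data. All matrices are diagonal, so only the diagonals are stored.
A = (-1.0, -1.5)
B = (2.0, 1.0)
C = (2.0, 3.0)
f = (5.0, 2.0)
alpha, beta = -4.0, 0.5


def primal(x):
    """Pi(x) = exp(xi) + 0.5*eps^2 - 0.5*x'Cx - f'x (assumed general form)."""
    xi = 0.5 * sum(a * v * v for a, v in zip(A, x)) - alpha
    eps = 0.5 * sum(b * v * v for b, v in zip(B, x)) - beta
    quad = 0.5 * sum(c * v * v for c, v in zip(C, x))
    lin = sum(fi * v for fi, v in zip(f, x))
    return math.exp(xi) + 0.5 * eps ** 2 - quad - lin


def g_diag(tau, sigma):
    """Diagonal of G(tau, sigma) = tau*A + sigma*B - C."""
    return tuple(tau * a + sigma * b - c for a, b, c in zip(A, B, C))


def dual(tau, sigma):
    """Pi^d, written so that it reduces to the dual displayed for Example 2."""
    g = g_diag(tau, sigma)
    fractions = sum(fi * fi / gi for fi, gi in zip(f, g))
    return (-0.5 * fractions - tau * math.log(tau) + (1.0 - alpha) * tau
            - 0.5 * sigma ** 2 - beta * sigma)


tau1, sigma1 = 0.145563, 3.95352  # reported dual critical point
x1 = tuple(fi / gi for fi, gi in zip(f, g_diag(tau1, sigma1)))  # x = G^{-1} f
delta = tau1 * min(A) + sigma1 * min(B) - max(C)  # Delta from Assumption 1
gap = primal(x1) - dual(tau1, sigma1)

print("x1    =", x1)
print("Delta =", delta)
print("Pi    =", primal(x1), " Pi^d =", dual(tau1, sigma1), " gap =", gap)
```

Because the dual point is given only to six significant digits, the computed gap is not exactly zero; it should be small compared with the reported value \(-13.6736\).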

To show the double-max duality of Example 2, we find a local maximum point of \(\Pi ^d(\tau ,\sigma )\) in \(\mathscr {S}_a^-\): \((\bar{\tau }_2,\bar{\sigma }_2)=(54.3685,-0.492123)\). By Theorem 1, we get \((\bar{x}_2,\bar{y}_2)=(-0.0871798, -0.023517)\). Moreover, we have

$$\begin{aligned} \Pi (\bar{x}_2,\bar{y}_2)=\Pi ^d(\bar{\tau }_2,\bar{\sigma }_2)=54.9641,\nonumber \end{aligned}$$

and \((\bar{x}_2,\bar{y}_2)\) is also a local maximum point of \(\Pi (x,y)\), which demonstrates the double-max duality (see Fig. 3).

Fig. 2. The min–max duality in Example 2: (a) contour plot of function \(\Pi ^d(\tau ,\sigma )\) near \((\bar{\tau }_1,\bar{\sigma }_1)\); (b) contour plot of function \(\Pi (x,y)\); (c) graph of function \(\Pi (x,y)\) near \((\bar{x}_1,\bar{y}_1)\)

Fig. 3. The double-max duality in Example 2: (a) contour plot of function \(\Pi ^d(\tau ,\sigma )\) near \((\bar{\tau }_2,\bar{\sigma }_2)\); (b) contour plot of function \(\Pi (x,y)\) near \((\bar{x}_2,\bar{y}_2)\); (c) graph of function \(\Pi (x,y)\) near \((\bar{x}_2,\bar{y}_2)\)

Example 3

We consider the case where \(A_1\) is indefinite. Let \(\alpha _1=\beta _1=1\) and

$$ A_1=\begin{bmatrix} 1&0\\ 0&-2 \end{bmatrix}, ~~ B_1=\begin{bmatrix} 1&0\\ 0&1 \end{bmatrix}, ~~ C_1=\begin{bmatrix} 1.5&0\\ 0&1 \end{bmatrix},\text { and }f=\begin{bmatrix} 1 \\ 1 \end{bmatrix}, $$

then the primal problem is

$$\begin{aligned} \min _{(x,y)\in \mathbb {R}^2}\Pi (x,y)= & {} \exp \left( 0.5x^2-y^2-1\right) +0.5\left( 0.5x^2+0.5y^2-1\right) ^2 \nonumber \\&-0.75x^2-0.5y^2-x-y.\nonumber \end{aligned}$$

The corresponding canonical dual function is

$$ \Pi ^d(\tau ,\sigma )=-0.5\left( \frac{1}{\tau +\sigma -1.5}+\frac{1}{-2\tau +\sigma -1}\right) -\tau \ln (\tau )-0.5\sigma ^2-\sigma . $$

In this problem, \(\lambda ^{A_1}_{min}=-2\), \(\lambda ^{B_1}_{min}=\lambda ^{B_1}_{max}=1\), and \(\lambda ^{C_1}_{max}=1.5\). Note that \((\bar{\tau }_1,\bar{\sigma }_1)=(0.143473,1.91093)\) is a critical point of the dual function \(\Pi ^d(\tau ,\sigma )\) (see Fig. 4a). As \(\bar{\sigma }_1>0\), we have \(\bar{\lambda }^{B_1}=\lambda ^{B_1}_{min}\) and

$$\varDelta =\bar{\tau }_1 \lambda ^{A_1}_{min}+\bar{\sigma }_1\lambda ^{B_1}_{min}-\lambda ^{C_1}_{max}=0.1240>0,$$

so Assumption 1 is satisfied and \((\bar{\tau }_1,\bar{\sigma }_1)\) lies in \(\mathscr {S}_a^+\). By Theorem 1, we get \((\bar{x}_1,\bar{y}_1)=(1.80375, 1.60261)\). Moreover, we have

$$\begin{aligned} \Pi (\bar{x}_1,\bar{y}_1)=\Pi ^d(\bar{\tau }_1,\bar{\sigma }_1)=-5.16136,\nonumber \end{aligned}$$

so there is no duality gap, and \((\bar{x}_1,\bar{y}_1)\) is the global solution of the primal problem, which demonstrates the min–max duality (see Fig. 4).

Fig. 4. The min–max duality in Example 3: (a) contour plot of function \(\Pi ^d(\tau ,\sigma )\) near \((\bar{\tau }_1,\bar{\sigma }_1)\); (b) contour plot of function \(\Pi (x,y)\); (c) graph of function \(\Pi (x,y)\) near \((\bar{x}_1,\bar{y}_1)\)

To show the double-max duality of Example 3, we find a local maximum point of \(\Pi ^d(\tau ,\sigma )\) in \(\mathscr {S}_a^-\): \((\bar{\tau }_2,\bar{\sigma }_2)=(0.358833,-0.785507)\). By Theorem 1, we get \((\bar{x}_2,\bar{y}_2)=(-0.519029, -0.399493)\). Moreover, we have

$$\begin{aligned} \Pi (\bar{x}_2,\bar{y}_2)=\Pi ^d(\bar{\tau }_2,\bar{\sigma }_2)=1.30402,\nonumber \end{aligned}$$

and \((\bar{x}_2,\bar{y}_2)\) is also a local maximum point of \(\Pi (x,y)\), which demonstrates the double-max duality (see Fig. 5).
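The local-maximum claim for \((\bar{x}_2,\bar{y}_2)\) in Example 3 can be checked with a second-order test. The sketch below is illustrative only. Writing \(\tau =\exp (0.5x^TA_1x-\alpha _1)\) and \(\sigma =0.5x^TB_1x-\beta _1\) at the point \(x\), differentiation of the primal function gives \(\nabla \Pi (x)=(\tau A_1+\sigma B_1-C_1)x-f\) and \(\nabla ^2\Pi (x)=\tau A_1+\sigma B_1-C_1+\tau (A_1x)(A_1x)^T+(B_1x)(B_1x)^T\). The helper names `canonical_measures`, `gradient` and `hessian` are hypothetical.

```python
import math

# Example 3 data (diagonal matrices stored by their diagonals).
A = (1.0, -2.0)
B = (1.0, 1.0)
C = (1.5, 1.0)
f = (1.0, 1.0)
alpha, beta = 1.0, 1.0


def canonical_measures(x):
    """Return tau = exp(xi(x)) and sigma = eps(x) at a primal point x."""
    xi = 0.5 * sum(a * v * v for a, v in zip(A, x)) - alpha
    eps = 0.5 * sum(b * v * v for b, v in zip(B, x)) - beta
    return math.exp(xi), eps


def gradient(x):
    """grad Pi(x) = (tau*A + sigma*B - C) x - f, with tau and sigma taken at x."""
    tau, sigma = canonical_measures(x)
    return tuple((tau * a + sigma * b - c) * v - fi
                 for a, b, c, fi, v in zip(A, B, C, f, x))


def hessian(x):
    """Hess Pi(x) = G + tau*(Ax)(Ax)' + (Bx)(Bx)' as a 2x2 nested tuple."""
    tau, sigma = canonical_measures(x)
    ax = [a * v for a, v in zip(A, x)]
    bx = [b * v for b, v in zip(B, x)]
    g = [tau * a + sigma * b - c for a, b, c in zip(A, B, C)]
    return tuple(
        tuple((g[i] if i == j else 0.0) + tau * ax[i] * ax[j] + bx[i] * bx[j]
              for j in range(2))
        for i in range(2))


x2 = (-0.519029, -0.399493)  # reported primal point of the double-max case
grad = gradient(x2)
H = hessian(x2)
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
# A symmetric 2x2 matrix is negative definite iff H11 < 0 and det(H) > 0.
is_local_max = H[0][0] < 0 and det > 0

print("tau, sigma =", canonical_measures(x2))
print("gradient   =", grad)
print("Hessian    =", H, " negative definite:", is_local_max)
```

The recovered pair \((\tau ,\sigma )\) should agree with the reported \((\bar{\tau }_2,\bar{\sigma }_2)\) up to rounding, and a near-zero gradient together with a negative definite Hessian is the standard sufficient condition for a strict local maximum.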

Fig. 5. The double-max duality in Example 3: (a) contour plot of function \(\Pi ^d(\tau ,\sigma )\) near \((\bar{\tau }_2,\bar{\sigma }_2)\); (b) contour plot of function \(\Pi (x,y)\) near \((\bar{x}_2,\bar{y}_2)\); (c) graph of function \(\Pi (x,y)\) near \((\bar{x}_2,\bar{y}_2)\)

Example 4

We consider another case where \(A_1\) is indefinite. Let \(\alpha _1=1\), \(\beta _1=2\), and

$$ A_1=\begin{bmatrix} -3&0\\ 0&1 \end{bmatrix}, ~~ B_1=\begin{bmatrix} 1&0\\ 0&1 \end{bmatrix}, ~~ C_1=\begin{bmatrix} 4&0\\ 0&4.4 \end{bmatrix},\text { and }f=\begin{bmatrix} 1 \\ 1 \end{bmatrix}, $$

then the primal problem is

$$\begin{aligned} \min _{(x,y)\in \mathbb {R}^2}\Pi (x,y)= & {} \exp \left( -1.5x^2+0.5y^2-1\right) +0.5\left( 0.5x^2+0.5y^2-2\right) ^2 \nonumber \\&-2x^2-2.2y^2-x-y .\nonumber \end{aligned}$$

The corresponding canonical dual function is

$$ \Pi ^d(\tau ,\sigma )=-0.5\left( \frac{1}{-3\tau +\sigma -4}+\frac{1}{\tau +\sigma -4.4}\right) -\tau \ln (\tau )-0.5\sigma ^2-2\sigma . $$

In this problem, \(\lambda ^{A_1}_{min}=-3\), \(\lambda ^{B_1}_{min}=\lambda ^{B_1}_{max}=1\), and \(\lambda ^{C_1}_{max}=4.4\). Note that \((\bar{\tau }_1,\bar{\sigma }_1)=(0.0612941,4.67004)\) is a critical point of the dual function \(\Pi ^d(\tau ,\sigma )\) (see Fig. 6a). As \(\bar{\sigma }_1>0\), we have \(\bar{\lambda }^{B_1}=\lambda ^{B_1}_{min}\) and

$$\varDelta =\bar{\tau }_1 \lambda ^{A_1}_{min}+\bar{\sigma }_1\lambda ^{B_1}_{min}-\lambda ^{C_1}_{max}=0.0862>0,$$

so Assumption 1 is satisfied and \((\bar{\tau }_1,\bar{\sigma }_1)\) lies in \(\mathscr {S}_a^+\). By Theorem 1, we get \((\bar{x}_1,\bar{y}_1)=(2.05695, 3.01812)\). Moreover, we have

$$\begin{aligned} \Pi (\bar{x}_1,\bar{y}_1)=\Pi ^d(\bar{\tau }_1,\bar{\sigma }_1)=-22.6111,\nonumber \end{aligned}$$

so there is no duality gap, and \((\bar{x}_1,\bar{y}_1)\) is the global solution of the primal problem, which demonstrates the min–max duality (see Fig. 6).

To show the double-max duality of Example 4, we find a local maximum point of \(\Pi ^d(\tau ,\sigma )\) in \(\mathscr {S}_a^-\): \((\bar{\tau }_2,\bar{\sigma }_2)=(0.361948,-1.97615)\). By Theorem 1, we get \((\bar{x}_2,\bar{y}_2)=(-0.141603, -0.166273)\). Moreover, we have

$$\begin{aligned} \Pi (\bar{x}_2,\bar{y}_2)=\Pi ^d(\bar{\tau }_2,\bar{\sigma }_2)=2.52149,\nonumber \end{aligned}$$

and \((\bar{x}_2,\bar{y}_2)\) is also a local maximum point of \(\Pi (x,y)\), which demonstrates the double-max duality (see Fig. 7).

To show the double-min duality of Example 4, we find a local minimum point of \(\Pi ^d(\tau ,\sigma )\) in \(\mathscr {S}_a^-\): \((\bar{\tau }_3,\bar{\sigma }_3)=(0.149286,3.90584)\). By Theorem 1, we get \((\bar{x}_3,\bar{y}_3)=(-1.84496, -2.89962)\). Moreover, we have

$$\begin{aligned} \Pi (\bar{x}_3,\bar{y}_3)=\Pi ^d(\bar{\tau }_3,\bar{\sigma }_3)=-12.7833,\nonumber \end{aligned}$$

and \((\bar{x}_3,\bar{y}_3)\) is also a local minimum point of \(\Pi (x,y)\), which demonstrates the double-min duality (see Fig. 8).
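The three dual critical points of Example 4 can be sorted into \(\mathscr {S}_a^+\) and \(\mathscr {S}_a^-\) by the sign of \(G(\tau ,\sigma )=\tau A_1+\sigma B_1-C_1\). The sketch below assumes, as is usual in canonical duality theory, that \(\mathscr {S}_a^+\) and \(\mathscr {S}_a^-\) collect the dual points where \(G\) is positive definite and negative definite, respectively; since all matrices here are diagonal, this reduces to checking the signs of the diagonal entries. The names `region`, `g_diag` and `primal` are hypothetical helpers.

```python
import math

# Example 4 data (diagonal matrices stored by their diagonals).
A = (-3.0, 1.0)
B = (1.0, 1.0)
C = (4.0, 4.4)
f = (1.0, 1.0)
alpha, beta = 1.0, 2.0


def g_diag(tau, sigma):
    """Diagonal of G(tau, sigma) = tau*A + sigma*B - C."""
    return tuple(tau * a + sigma * b - c for a, b, c in zip(A, B, C))


def region(tau, sigma):
    """Classify a dual point by the definiteness of the diagonal matrix G."""
    g = g_diag(tau, sigma)
    if all(gi > 0 for gi in g):
        return "S_a^+"
    if all(gi < 0 for gi in g):
        return "S_a^-"
    return "neither"


def primal(x):
    """Pi(x) = exp(xi) + 0.5*eps^2 - 0.5*x'Cx - f'x for the Example 4 data."""
    xi = 0.5 * sum(a * v * v for a, v in zip(A, x)) - alpha
    eps = 0.5 * sum(b * v * v for b, v in zip(B, x)) - beta
    quad = 0.5 * sum(c * v * v for c, v in zip(C, x))
    lin = sum(fi * v for fi, v in zip(f, x))
    return math.exp(xi) + 0.5 * eps ** 2 - quad - lin


# Reported dual critical points, keyed by the duality they illustrate.
points = {
    "min-max": (0.0612941, 4.67004),
    "double-max": (0.361948, -1.97615),
    "double-min": (0.149286, 3.90584),
}

results = {}
for label, (tau, sigma) in points.items():
    x = tuple(fi / gi for fi, gi in zip(f, g_diag(tau, sigma)))  # x = G^{-1} f
    results[label] = (region(tau, sigma), x, primal(x))
    print(label, results[label])
```

Only the first point should fall in \(\mathscr {S}_a^+\); the other two should fall in \(\mathscr {S}_a^-\), consistent with their roles as local extrema rather than the global minimizer.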

Fig. 6. The min–max duality in Example 4: (a) contour plot of function \(\Pi ^d(\tau ,\sigma )\) near \((\bar{\tau }_1,\bar{\sigma }_1)\); (b) contour plot of function \(\Pi (x,y)\); (c) graph of function \(\Pi (x,y)\) near \((\bar{x}_1,\bar{y}_1)\)

Fig. 7. The double-max duality in Example 4: (a) contour plot of function \(\Pi ^d(\tau ,\sigma )\) near \((\bar{\tau }_2,\bar{\sigma }_2)\); (b) contour plot of function \(\Pi (x,y)\) near \((\bar{x}_2,\bar{y}_2)\); (c) graph of function \(\Pi (x,y)\) near \((\bar{x}_2,\bar{y}_2)\)

Fig. 8. The double-min duality in Example 4: (a) contour plot of function \(\Pi ^d(\tau ,\sigma )\) near \((\bar{\tau }_3,\bar{\sigma }_3)\); (b) contour plot of function \(\Pi (x,y)\) near \((\bar{x}_3,\bar{y}_3)\); (c) graph of function \(\Pi (x,y)\) near \((\bar{x}_3,\bar{y}_3)\)

The double-min duality in Example 4 shows that the proposed canonical dual method avoids the local minimum point \((\bar{x}_3,\bar{y}_3)\) of the primal problem. In fact, since the canonical dual method yields the global solution, every local minimum point is avoided. For instance, the point \((1.29672,-2.09209)\) is a local minimum point of the primal problem in Example 2 (see Fig. 9a) with value \(-3.98411\), whereas the canonical dual method obtains the global minimum value \(-13.6736\); the point \((1.88536,-1.10196)\) is a local minimum point of the primal problem in Example 3 (see Fig. 9b) with value \(-2.45219\), whereas the canonical dual method obtains the global minimum value \(-5.16136\).
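The comparison between local and global minima can be checked directly from the primal functions. The following sketch evaluates \(\Pi \) and its gradient \(\nabla \Pi (x)=(\tau A_1+\sigma B_1-C_1)x-f\), with \(\tau =\exp (0.5x^TA_1x-\alpha _1)\) and \(\sigma =0.5x^TB_1x-\beta _1\), at the reported local minimizers of Examples 2 and 3, and compares the values with those at the global solutions reported in those examples. It tests stationarity only, which is a necessary condition and not a proof of local minimality. The primal form with an exponential first term is assumed for Example 2, and the helper names are hypothetical.

```python
import math


def measures(x, A, B, alpha, beta):
    """tau = exp(0.5*x'Ax - alpha) and sigma = 0.5*x'Bx - beta (diagonal A, B)."""
    xi = 0.5 * sum(a * v * v for a, v in zip(A, x)) - alpha
    eps = 0.5 * sum(b * v * v for b, v in zip(B, x)) - beta
    return math.exp(xi), eps


def primal(x, A, B, C, f, alpha, beta):
    """Pi(x) = tau + 0.5*sigma^2 - 0.5*x'Cx - f'x."""
    tau, sigma = measures(x, A, B, alpha, beta)
    quad = 0.5 * sum(c * v * v for c, v in zip(C, x))
    lin = sum(fi * v for fi, v in zip(f, x))
    return tau + 0.5 * sigma ** 2 - quad - lin


def gradient(x, A, B, C, f, alpha, beta):
    """grad Pi(x) = (tau*A + sigma*B - C) x - f."""
    tau, sigma = measures(x, A, B, alpha, beta)
    return tuple((tau * a + sigma * b - c) * v - fi
                 for a, b, c, fi, v in zip(A, B, C, f, x))


EX2 = dict(A=(-1.0, -1.5), B=(2.0, 1.0), C=(2.0, 3.0), f=(5.0, 2.0),
           alpha=-4.0, beta=0.5)
EX3 = dict(A=(1.0, -2.0), B=(1.0, 1.0), C=(1.5, 1.0), f=(1.0, 1.0),
           alpha=1.0, beta=1.0)

# (name, data, reported local minimizer, reported global minimizer)
cases = [
    ("Example 2", EX2, (1.29672, -2.09209), (0.867833, 2.72044)),
    ("Example 3", EX3, (1.88536, -1.10196), (1.80375, 1.60261)),
]

report = []
for name, data, x_loc, x_glob in cases:
    grad_norm = max(abs(c) for c in gradient(x_loc, **data))
    report.append((name, grad_norm, primal(x_loc, **data),
                   primal(x_glob, **data)))
    print(name, "max |grad| at local point:", grad_norm,
          " local value:", report[-1][2], " global value:", report[-1][3])
```

A gradient-based local solver started near either local point would be expected to stop there, which is the situation the canonical dual approach is designed to avoid.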

Fig. 9. Graph of the primal problem near a local minimum point: (a) in Example 2; (b) in Example 3

6 Conclusions

Based on the original definition of objectivity in continuum physics, a canonical d.c. optimization problem is proposed, which can be used to model general nonconvex optimization problems in complex systems. A detailed application is provided by solving a challenging problem in \(\mathbb {R}^n\). By the canonical duality theory, this nonconvex problem can be reformulated as a concave maximization dual problem over a convex domain. A detailed proof of the triality theory is provided under a reasonable assumption. This theory can be used to identify both global and local extrema and to develop a powerful algorithm for solving this general d.c. optimization problem. Several examples are given to illustrate the different cases, and all of them satisfy Assumption 1. However, we emphasize that this assumption is only a sufficient condition for the existence of a canonical dual solution in \(\mathscr {S}_a^+\). How to relax this assumption and how to obtain a necessary condition for \(\mathscr {S}_a^+ \ne \emptyset \) are open questions that deserve detailed study.