Abstract
In this paper, we study global optimal solutions of minimizing a nonconvex quadratic function subject to a sphere constraint. The main challenge is to solve the problem when it has multiple global solutions on the boundary of the sphere, which is called hard case. By canonical duality theory, a concave maximization problem is formulated, which is one-dimensional and without duality gaps to the primal problem. Then sufficient and necessary conditions are provided to identify whether the problem is in the hard case or not. A perturbation method and associated algorithms are proposed to solve hard-case problems. Theoretical results and methods are verified by numerical examples.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
Mathematics Subject Classification (2010):
1 Introduction
In mathematical programming, the problem of minimizing a nonconvex quadratic function over a sphere constraint is known as the trust region subproblem, which arises in trust region methods [1, 2]. Here, we formulate it as
where the given matrix \(\boldsymbol{Q} \in \mathbb{R}^{n\times n}\) is assumed to be symmetric, \(\boldsymbol{f} \in \mathbb{R}^{n}\) is a given vector, and the feasible region is defined as
in which r is a positive real number and \(\|\boldsymbol{x}\| =\|\boldsymbol{ x}\|_{2}\) represents ℓ 2 norm in \(\mathbb{R}^{n}\). In literatures, two types of similar problems are also discussed: one considers a general quadratic constraint, i.e. the convexity is removed [3, 4]; the other one is equipped with a two-sided (lower and upper bound) quadratic constraint [5, 6].
It is proved that the problem (\(\mathcal{P}\)) possesses hidden convexity, i.e. it is actually equivalent to a convex optimization problem [5]. Thus, if the vector \(\bar{\boldsymbol{x}}\) is a solution of (\(\mathcal{P}\)), there exists a Lagrange multiplier \(\bar{\mu }\) such that besides the KKT conditions
we also have [7]
which demonstrates the hidden convexity. If we let λ 1 be the smallest eigenvalue of the matrix \(\boldsymbol{Q}\), from conditions (3) and (4), it is true that \(\bar{\mu }\geq \max \{ 0,-\lambda _{1}\}\). If the problem (\(\mathcal{P}\)) has no solution on the boundary of \(\mathcal{X}_{a}\), then \(\boldsymbol{Q}\) must be positive definite and \(\|\boldsymbol{Q}^{-1}\boldsymbol{f}\| < r\), which leads to \(\bar{\mu }= 0\). While if (\(\mathcal{P}\)) has a solution on the boundary of \(\mathcal{X}_{a}\) and \((\boldsymbol{Q} +\bar{\mu }\boldsymbol{ I}) \succ 0\), we have \(\|(\boldsymbol{Q} +\bar{\mu }\boldsymbol{ I})^{-1}\boldsymbol{f}\| = r\). In this case, the multiplier \(\bar{\mu }\) can be easily found. However, if the solution \(\bar{\boldsymbol{x}}\) is located on the boundary of \(\mathcal{X}_{a}\) and \(\mathrm{det}(\boldsymbol{Q} +\bar{\mu }\boldsymbol{ I}) = 0\), this situation is the so-called hard case (see [8]), which leads to numerical difficulties [9–13]. As pointed in [9, 12–14], the hard case always implies that \(\boldsymbol{f}\) is perpendicular to the subspace generated by all the eigenvectors corresponding to λ 1. We will show by Theorem 3 in this paper that this condition is only a necessary condition for the hard case. Many methods have been proposed for solving this spherical constrained quadratic minimization problem, especially focusing on the hard case. They include Newton type methods [8, 15], methods recasting the problem in terms of a parameterized eigenvalue problem [12, 13], methods sequential searching Krylov subspaces [14, 16], semidefinite programming methods [9, 11], and the D.C. (difference of convex functions) method [17].
The canonical duality theory was developed from Gao and Strang’s original work [18] for solving the nonconvex/nonsmooth variational problems. It is a powerful methodological theory which has been used successfully for solving a large class of difficult problems (nonconvex, nonsmooth or discrete) in global optimization (see [19, 20]) within a unified framework. This theory is mainly comprised of (1) a canonical dual transformation, which can be used to reformulate nonconvex/discrete problems from different systems as a unified canonical dual problem without duality gaps; (2) a complementary-dual principle, which provides a unified analytical solution form in terms of the canonical dual variable; and (3) a triality theory, which can be used to identify both global and local extrema.
The goal of this paper is to apply the canonical dual approach to find global solutions for the problem (\(\mathcal{P}\)), especially when it is in the hard case. We first show in the next section that the canonical dual problem is canonically (i.e., perfectly) dual to \((\mathcal{P})\) in the sense that both problems have the same set of KKT solutions. Then sufficient and necessary conditions are provided for identifying global optimal solutions. In Sect. 3, a perturbation method is proposed for problems in the hard case. Numerical results are presented in Sect. 4. The paper is ended with some conclusion remarks.
2 Canonical Dual Problem
Let \(\mathcal{S}_{a} =\{\sigma \in \mathbb{R}\ \vert \ \sigma \geq 0,\ \mathrm{det}\boldsymbol{G}(\sigma )\neq 0\;\}\), where \(\boldsymbol{G}(\sigma ) =\boldsymbol{ Q} +\sigma \boldsymbol{ I}\). The canonical dual function \(P^{d}: \mathcal{S}_{a} \rightarrow \mathbb{R}\) is defined by
and the stationary canonical dual problem [21] is defined by
Theorem 1 (Analytical Solution and Complementary-Dual Principle [20, 22]).
The problem(6)is canonically dual to the problem ( \(\mathcal{P}\) ) in the sense that if \(\bar{\sigma }\in \mathcal{S}_{a}\) is a critical point of P d (σ), then \(\bar{\boldsymbol{x}} =\boldsymbol{ G}_{a}(\bar{\sigma })^{-1}\boldsymbol{f}\) is a KKT point of the primal problem ( \(\mathcal{P}\) ), and we have \(P(\bar{\boldsymbol{x}}) = P^{d}(\bar{\sigma })\) .
Here, we focus the discussion on global optimal solutions and define the canonical dual problem to \((\mathcal{P})\) as the following maximization problem:
where \(\mathcal{S}_{a}^{+} = \left \{\sigma \in \mathcal{S}_{a}\ \vert \ \boldsymbol{G}(\sigma ) \succ \mathbf{0}\right \}\).
Theorem 2 (Global Optimality Condition [20, 22]).
If \(\bar{\sigma }\in \mathcal{S}_{a}^{+}\) is a critical point of P d (σ), then \(\bar{\sigma }\) is a global maximal solution of the problem ( \(\mathcal{P}^{d}\) ) and \(\bar{\boldsymbol{x}} =\boldsymbol{ G}(\bar{\sigma })^{-1}\boldsymbol{f}\) is a global minimal solution of the primal problem ( \(\mathcal{P}\) ), i.e. \(P(\bar{\boldsymbol{x}}) =\min _{\boldsymbol{x}\in \mathcal{X}_{a}}P(\boldsymbol{x}) =\max _{\sigma \in \mathcal{S}_{a}^{+}}P^{d}(\sigma ) = P^{d}(\bar{\sigma })\) .
According to the triality theorem [22, 23], the global optimality condition is called canonical min–max duality. Similar results are also discussed by Corollary 5.3 in [6] and Theorem 1 in [11].
By the symmetry of the matrix \(\boldsymbol{Q}\), there exist diagonal matrix \(\Lambda \) and orthogonal matrix \(\boldsymbol{U}\) such that \(\boldsymbol{Q} =\boldsymbol{ U}\Lambda \boldsymbol{U}^{T}\). The diagonal entities of \(\Lambda \) are the eigenvalues of the matrix Q and are arranged in nondecreasing order, \(\lambda _{1} = \cdots =\lambda _{k} <\lambda _{k+1} \leq \cdots \leq \lambda _{n}.\) The columns of \(\boldsymbol{U}\) are corresponding eigenvectors. Let \(\hat{\boldsymbol{f}} =\boldsymbol{ U}^{T}\boldsymbol{f}\).
Theorem 3 (Existence Conditions [24]).
Suppose that for any given symmetric matrix \(\boldsymbol{Q} \in \mathbb{R}^{n\times n}\) and vector \(\boldsymbol{f} \in \mathbb{R}^{n}\) , λ i and \(\hat{f}_{i}\) are defined as above. The canonical dual problem \((\mathcal{P}^{d})\) has a unique solution \(\bar{\sigma }\in (-\lambda _{1},+\infty )\) if and only if either \(\sum _{i=1}^{k}\hat{f}_{i}^{2}\neq 0\) or \(\sum _{i=k+1}^{n} \frac{\hat{f}_{i}^{2}} {(\lambda _{i}-\lambda _{1})^{2}} > r^{2}\) . If λ 1 ≤ 0, \(\bar{\boldsymbol{x}} =\boldsymbol{ G}(\bar{\sigma })^{-1}\boldsymbol{f}\) is a unique global solution of the problem ( \(\mathcal{P}\) ). Moreover, given that ( \(\mathcal{P}^{d}\) ) has no solutions in \((-\lambda _{1},+\infty )\) , the problem \((\mathcal{P})\) has exactly two global solutions if the multiplicity of λ 1 is k = 1 and infinite number of solutions if k > 1.
Generally speaking, the case that the canonical dual problem \((\mathcal{P}^{d})\) has no critical point in \(\mathcal{S}_{a}^{+}\) does not imply that the problem \((\mathcal{P})\) is in the hard case. For example, if λ 1 > 0, i.e. the matrix \(\boldsymbol{Q}\) is positive definite, \((\mathcal{P}^{d})\) may have no critical point in \(\mathcal{S}_{a}^{+} = [0,+\infty )\). But, the matrix \(\boldsymbol{G}\) is not singular and therefore, the problem is not in the hard case. If λ 1 ≤ 0, the hard case of \((\mathcal{P})\) is equivalent to that \((\mathcal{P}^{d})\) has no critical point in \(\mathcal{S}_{a}^{+} = (-\lambda _{1},+\infty )\). In this case, \(P^{d}(-\lambda _{1}) =\sup \{ P^{d}(\sigma )\vert \;\sigma \in \mathcal{S}_{a}^{+}\}\). The theorem can also be stated equivalently as: If \(\lambda _{1} \leq 0\) , the nonconvex problem \((\mathcal{P})\) is in the hard case if and only if \(\sum _{i=1}^{k}\hat{f}_{i}^{2} = 0\) and \(\sum _{i=k+1}^{n} \frac{\hat{f}_{i}^{2}} {(\lambda _{i}-\lambda _{1})^{2}} \leq r^{2}\) . The first condition indicates that a problem could be in the hard case only when the coefficient \(\boldsymbol{f}\) is perpendicular to the subspace generated by eigenvectors of the smallest eigenvalue. The second one adds that if the norm of \(\boldsymbol{f}\) is relative small, comparing to the radius r and differences between the smallest eigenvalue and all other eigenvalues, the hard case would happen.
3 Perturbation Methods
In order to reinforce the existence conditions, a perturbation \(\sum _{i=1}^{k}\alpha _{i}\boldsymbol{U}_{i}\) to \(\boldsymbol{f}\) with parameters \(\boldsymbol{\alpha }=\{\alpha _{i}\}_{i=1}^{k}\neq 0\) is introduced. Let \(\boldsymbol{p} =\boldsymbol{ f} +\sum _{ i=1}^{k}\alpha _{i}\boldsymbol{U}_{i}\) and \(\hat{\boldsymbol{p}} =\boldsymbol{ U}^{T}\boldsymbol{p}\). The perturbed problem is
It is true that the existence condition, \(\sum _{i=1}^{k}\hat{p}_{i}^{2}\neq 0\), holds for the perturbed problem.
Theorem 4.
[24]Suppose that λ 1 ≤ 0, there is no critical point of P d (σ) in \(\mathcal{S}_{a}^{+}\) , and \(\bar{\boldsymbol{x}}^{{\ast}}\) is the optimal solution of the problem ( \(\mathcal{P}_{\alpha }\) ). Then, there is a global solution of the problem ( \(\mathcal{P}\) ), denoted as \(\bar{\boldsymbol{x}}\) , which is on the boundary of \(\mathcal{X}_{a}\) and, for any \(\varepsilon > 0\) , if the parameter \(\boldsymbol{\alpha }\) satisfies
we have \(\|\bar{\boldsymbol{x}}^{{\ast}}-\bar{\boldsymbol{ x}}\| \leq \varepsilon\) .
Actually, if the perturbation parameter \(\boldsymbol{\alpha }\) is properly chosen, each solution of the problem (\(\mathcal{P}\)) can be approximated. When the multiplicity of λ 1 is equal to one, as stated in Theorem 3, there are exactly two global solutions. In this case, \(\boldsymbol{\alpha }\) becomes a scalar and has exactly two possible directions, which are mutual opposite and, respectively, lead to the two global solutions. For general cases, there may be infinite number of global solutions for the problem (\(\mathcal{P}\)), and we can show that between solutions of the problem (\(\mathcal{P}\)) and directions of \(\boldsymbol{\alpha }\) there is a one-to-one correspondence [24].
4 Numerical Results
Based on the perturbation method discussed in the previous section, a canonical primal-dual algorithm is developed [24], which is matrix inverse free and the essential cost of calculation is only the matrix-vector multiplication.
One hundred examples are randomly generated, containing 50 examples of the general case and 50 examples of the hard case. Both cases have ten examples for dimensions of 500, 1,000, 2,000, 3,000, and 5,000. All elements of the coefficients, \(\boldsymbol{Q}\), \(\boldsymbol{f}\) and r, are integer numbers in [−100, 100]. For each example of the hard case, in order to make \(\boldsymbol{f}\) can be easily chosen, we use a matrix \(\boldsymbol{Q}\) of whom the multiplicity of the smallest eigenvalue is equal to one. Then, the vector \(\boldsymbol{f}\) is constructed such that it is perpendicular to the eigenvector of the smallest eigenvalue, and a proper radius \(r\) is selected such that the existence conditions are violated.
For the hard case, a perturbation \(\alpha \boldsymbol{U}_{1}\) is added to the vector \(\boldsymbol{f}\), where \(\boldsymbol{U}_{1}\) is the eigenvector of the smallest eigenvalue, and two values of α, 1e−3 and 1e−4, are tried. The algorithm is implemented on Matlab 7.13, which was runned in the platform with a Linux 64-bit system and a quad-core CPU.
Results are shown in Tables 1, 2, 3, and 4, and they contain the number of examples which are successfully solved (Succ.Solv.), the distance of the optimal solution to the boundary of the sphere (Dist.Boun.), the number of iterations in Algorithm: Main (Numb.Iter.) and the running time (in second) of the algorithm (Runn.Time). The values in the columns of Dist.Boun., Numb.Iter. and Runn.Time are averages of the examples successfully solved. We compare the results of the algorithm adopting “left division” and that of the algorithm adopting “quadprog” in the same table, where LD denotes “left division” and QP denotes “quadprog.”
We can see that the examples are solved very accurately with error allowance being less than 1e-09. The failure in solving some examples is due to “left division” and “quadprog” being unable to handle very nearly singular matrices. For general cases, all the examples can be solved within no more than 30 iterations, while for hard cases, the number of iterations is around 40. From the running time, we notice that our method is capable to handle very large problems in reasonable time. The algorithms using “left division” and “quadprog” have similar performances in the accuracy and the number of iterations. The one using “left division” needs much less time than that of the one using “quadprog”, but “quadprog” is able to solve more examples successfully.
5 Conclusion Remarks
We have presented a detailed study on the quadratic minimization problem with a sphere constraint. By the canonical duality, this nonconvex optimization is equivalent to a concave maximization dual problem over a convex domain \(\mathcal{S}_{a}^{+}\), which is true also for many other global optimization problems (see [25–31]). Based on this canonical dual problem, sufficient and necessary conditions are obtained for both general and hard cases. In order to solve hard-case problems, a perturbation method and the associated algorithm are proposed. Numerical results demonstrate that the proposed approach is able to handle the problem effectively. Combining with the trust region method, the results presented in this paper can be used to solve general global optimizations.
References
Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust-Region Methods. SIAM, Philadelphia (2000)
Powell M.J.D.: On trust region methods for unconstrained minimization without derivatives. Math. Program. 97(3), 605–623 (2003)
Boyd, S.P., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Xing, W.X., Fang, S.C., Gao, D.Y., Sheu, R.L., Zhang, L.: Canonical dual solutions to the quadratic programming over a quadratic constraint. ICOTA 7, 35–36 (2007)
Ben-Tal, A., Teboulle, M.: Hidden convexity in some nonconvex quadratically constrained quadratic programming. Math. Program. 72(1), 51–63 (1996)
Stern, R.J., Wolkowicz, H.:. Indefinite trust region subproblems and nonsymmetric eigenvalue perturbations. SIAM J. Optim. 5(2), 286–313 (1995)
Sorensen, D.C.: Newton’s method with a model trust region modification. SIAM J. Numer. Anal. 19(2), 409–426 (1982)
Moré, J.J., Sorensen, D.C..: Computing a trust region step. SIAM J. Sci. Stat. Comput. 4(3), 553–572 (1983)
Fortin, C., Wolkowicz, H.: The trust region subproblem and semidefinite programming. Optim. Method Softw. 19(1), 41–67 (2004)
Jorge, N., Wright, S.J.: Numerical Optimization, vol. 2. Springer, New York (1999)
Rendl, F., Wolkowicz, H.: A semidefinite framework for trust region subproblems with applications to large scale minimization. Math. Program. 77, 273–299 (1997)
Rojas, M., Santos, S.A., Sorensen, D.C.: A new matrix-free algorithm for the large-scale trust-region subproblem. SIAM J. Optim. 11(3), 611–646 (2001)
Sorensen, D.C.: Minimization of a large-scale quadratic functionsubject to a spherical constraint. SIAM J. Optim. 7(1), 141–161 (1997)
Hager, W.W.: Minimizing a quadratic over a sphere. SIAM J. Optim. 12(1), 188–208 (2001)
Gay, D.M.: Computing optimal locally constrained steps. SIAM J. Sci. Stat. Comp. 2(2), 186–197 (1981)
Gould, N.I.M., Lucidi, S., Roma, M., Toint, P.L.: Solving the trust-region subproblem using the lanczos method. SIAM J. Optim. 9(2), 504–525 (1999)
Tao, P.D., An, L.T.H.: A dc optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)
Gao, Y., Strang, G.: Geometric nonlinearity: potential energy, complementary energy, and the gap function. Q. Appl. Math. 47, 487–504 (1989)
Gao, D.Y.: Canonical duality theory: unified understanding and generalized solution for global optimization problems. Comput. Chem. Eng. 33(12), 1964–1972 (2009)
Gao, D.Y., Ruan, N., Sherali, H.D.: Solutions and optimality criteria for nonconvex constrained global optimization problems with connections between canonical and Lagrangian duality. J. Glob. Optim. 45(3), 473–497 (2009)
Gao, D.Y.: Canonical Duality theory and solutions to constrained nonconvex quadratic programming. J. Glob. Optim. 29(4), 377–399 (2004)
Gao, D.Y.: Duality Principles in Nonconvex Systems: Theory, Methods, and Applications. Springer, Netherlands (2000)
Gao, D.Y., Wu, C.: On the triality theory for a quartic polynomial optimisation problem. J. Ind. Manag. Optim. 8(1), 229–242 (2012)
Chen, Y., Gao, D.Y.: Global solutions to spherical constrained quadratic minimization via canonical dual approach. arXiv:1308.4450 (2013)
Gao, D.Y.: Perfect duality theory and complete solutions to a class of global optimization problems. Optimization 52(4–5), 467–493 (2003)
Gao, D.Y.: Sufficient conditions and perfect duality in nonconvex minimization with inequality constraints. J. Ind. Manag. Optim. 1(1), 53–63 (2005)
Gao, D.Y.: Complete solutions and extremality criteria to polynomial optimization problems. J. Glob. Optim. 35(1), 131–143 (2006)
Gao, D.Y.: Solutions and optimality criteria to box constrained nonconvex minimization problems. J. Ind. Manag. Optim. 3(2), 293–304 (2007)
Gao, D.Y., Ruan, N., Pardalos, P.M.: Canonical dual solutions to sum of fourth-order polynomials minimization problems with applications to sensor network localization. In: Pardalos, P.M., Ye, Y.Y., Boginski, V., Commander, C. (eds.) Sensors: Theory, Algorithms, and Applications, pp. 37–54. Springer, New York (2010)
Gao, D.Y., Ruan, N., Sherali, H.D.: Canonical dual solutions for fixed cost quadratic programs. In: Chinchuluun, A., Pardalos, P.M., Enkhbat, R., Tseveendorj, I. (eds.) Optimization and Optimal Control, pp. 139–156. Springer, New York (2010)
Gao, D.Y., Watson, L.T., Easterling, D.R., Thacker, W.I., Billups, S.C.: Solving the canonical dual of box- and integer-constrained nonconvex quadratic programs via a deterministic direct search algorithm. Optim. Methods Softw. 28, 313–326 (2013)
Acknowledgements
This research is supported by US Air Force Office of Scientific Research under the grant AFOSR FA9550-10-1-0487, as well as by a grant from the Australian Government under the Collaborative Research Networks (CRN) program. The main results of this paper have been announced at the 3rd World Congress of Global Optimization, July 9–11, 2013, the Yellow Mountains, China.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, Y., Gao, D.Y. (2015). Canonical Dual Approach for Minimizing a Nonconvex Quadratic Function over a Sphere. In: Gao, D., Ruan, N., Xing, W. (eds) Advances in Global Optimization. Springer Proceedings in Mathematics & Statistics, vol 95. Springer, Cham. https://doi.org/10.1007/978-3-319-08377-3_16
Download citation
DOI: https://doi.org/10.1007/978-3-319-08377-3_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08376-6
Online ISBN: 978-3-319-08377-3
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)