Abstract
We provide a simple counter-example to prove and illustrate that the backward differential flow approach, proposed by Zhu, Zhao and Liu for finding a global minimizer of coercive even-degree polynomials, can converge to a local minimizer rather than a global minimizer. We provide additional counter-examples to stress that convergence to a local minimum via the backward differential flow method is not a rare occurence.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
In their recent article, Zhu et al. [1] provide a method for finding a solution to global minimization of multivariate polynomials of even degree. In this note, we exemplify, and thus prove, that their method does not necessarily yield a global minimizer.
2 Preliminaries
For simplicity, we focus on the special case of monic quartic univariate polynomials \(f:\mathbb {R}\rightarrow \mathbb {R}\) such that
where \(a_0\), \(a_1\), \(a_2\) and \(a_3\) are real numbers. What Zhu et al. propose in [1] can be translated into this setting as one of solving the following initial value problem. With \(x:\mathbb {R}\rightarrow \mathbb {R}\) as the dependent variable and \(t\) as the independent variable,
where \(\dot{x} = \hbox {d}x/\hbox {d}t\), such that
and
Theorem 4.1 in [1], which is the main result for the so-called backward differential flow method, can then be rephrased as follows.
“If \(x(t)\) solves (1) and \(f''(x(t)) + t > 0\) for all \(t\in \,]0, t_0]\), then \(x(0)\) is a global minimizer of \(f(x)\).”
We note that, because \(f\) is a monic quartic polynomial, and so is coercive, a large enough positive \(t_0\) can always be found so that Condition (3) is satisfied. Zhu et al. provide an estimate of \(t_0\) by restricting the domain of \(f\) to a closed ball (in the univariate case, \(-a\le x\le a\)), in which a global minimizer is contained. In the quartic univariate case, one can even find the smallest \(t_0\) satisfying (3) easily (as illustrated in the counter-example below). Therefore, an estimate for \(t_0\) as proposed in [1] is not needed. Then, by (3), there exists a unique solution \(x_0\) to (2). Finally, the initial value problem (1) is solved from \(x(t_0) = x_0\) backward in \(t\), with the resulting solution referred to as backward differential flow by Zhu et al., to obtain \(x(0)\). The point \(x(0)\) is claimed in [1] to be a global minimizer. We will prove, via a counter-example, that \(x(0)\) is not necessarily a global minimizer.
Before providing a counter-example to Theorem 4.1 of [1], we will make some remarks in order to view the problem from a slightly different point.
Remark 2.1
Define
Then, \(\varphi (x,t)\) can be viewed as a quadratic regularization of \(f(x)\), with regularization parameter \(t>0\). Note that \(\varphi _x(x,t) = f'(x) + t\,x\) and \(\varphi _{xx}(x,t) = f''(x) + t\), where the subscripts \(x\) and \(xx\) stand for \(\partial /\partial x\) and \(\partial ^2/\partial x^2\), respectively. Therefore, (2)–(3) above can be rewritten as
and
We now recall a well-known fact regarding maximal extension of solutions of ODEs.
Remark 2.2
Assume that \(f:\mathbb {R}\rightarrow \mathbb {R}\) is twice continuously differentiable everywhere. Let \(t_0\in \mathbb {R} \) and \(x_0\in \mathbb {R}\) such that
The following hold.
-
(a)
There exists \(r>0\) such that there is a unique solution \(x(\cdot )\) of (1) in \(]t_0-r,t_0+r[\).
-
(b)
There exists a maximal interval to the left of \(t_0\), say \(]m_0,t_0]\), such that there exists a solution of (1) in \(]m_0,t_0]\).
-
(c)
Either \(m_0 = -\infty \), or \(m_0\in \mathbb {R}\) and \(f''(x(m_0)) + m_0=0\).
Part (a) follows from the classical Picard-Lindelöf existence and uniqueness theorem (see [2]), because the right-hand side of the ODE in (1) is Lipschitz continuous in \(x\) and continuous in \(t\) in a neighborhood of \(t_0\). Part (b) is the classical result on maximal extension of solutions of ODEs. The option \(m_0 = -\infty \) of part (c) corresponds to the case in which the right-hand side remains Lipschitz continuous in \(x\) for all \(t<t_0\). The remaining option happens when the denominator
vanishes at \(t = m_0\).
In the following simple lemma, we state a straightforward reformulation of the initial value problem in (1).
Lemma 2.1
Assume that \(f:\mathbb {R}\rightarrow \mathbb {R}\) is twice continuously differentiable everywhere. Let \(t_0\in \mathbb {R} \) and \(x_0\in \mathbb {R}\) be chosen as in (4). Let \(x(\cdot )\) be the maximally extended solution of (1), and \(]m_0,t_0]\) the corresponding maximal interval. Then, we have that
Proof
Solvability of (1) over \(]m_0,t_0]\) implies that the right-hand side of the ODE is continuous on \(]m_0,t_0]\). In other words, the denominator of the right-hand side of the ODE is not zero and so it does not change sign on \(]m_0,t_0]\). Since \(\varphi _{xx}(x(t_0),t_0) > 0\) and the solution exists in \(]m_0,t_0]\), we must have
for all \(t\in \,]m_0,t_0]\). Then, for all \(t\in \,]m_0,t_0]\), we can rewrite the ODE in (1) as
which can be rewritten in terms of \(\varphi \) as
By (4), we also have
Equalities (7) and (8) imply that
for all \(t\in \,]m_0,t_0]\). Equality (9) holds at \(t=m_0\) by continuity of \(f'\) and \(x(\cdot )\). \(\square \)
Next lemma shows that if we start with a negative initial value at \(t_0\), then the solution of the initial value problem (1) remains negative over its maximal domain of definition.
Lemma 2.2
Let \(f:\mathbb {R}\rightarrow \mathbb {R}\) be twice continuously differentiable everywhere. Let \(t_0\in \mathbb {R} \) and \(x_0\in \mathbb {R}\) be chosen as in (4). Consider the initial value problem (1). Let \(x(\cdot )\) be the maximally extended solution of (1), and \(]m_0,t_0]\) the corresponding (finite or infinite) maximal interval of definition of \(x(\cdot )\). If \(x_0<0\), then \(x(t)<0\) for all \(t\in \,]m_0,t_0]\). If \(m_0\in \mathbb {R}\), then \(x(m_0)<0\).
Proof
Suppose that for some \(t\in \,]m_0,t_0]\), we have \(x(t)\ge 0\). Consider the set \(S:=\{t\in \,]m_0,t_0]\,:\, x(t)\ge 0\}\). This set is non-empty and bounded above by \(t_0\). Let
Note that \(t_1\in S\) and \(t_1<t_0\). We claim that \(x(t_1)\ge 0\). Indeed, if \(x(t_1)<0\), then for some \(r>0\), we have
By definition of \(t_1\) as a supremum of \(S\), there exists \(t\in S\) such that \(t\in \,]t_1-r,t_1]\), which means that \(x(t)\ge 0\), contradicting (10). Hence, \(x(t_1)\ge 0\) and by definition of \(t_1\), we have
Using (11) and Lemma 2.1 in the ODE in (1), we conclude that
By the mean value theorem, there exists \(s\in \,]t_1,t_0]\) such that
where we used (12). The above expression implies that
which is a contradiction. Hence, \(x(t)<0\), for all \(t\in \,]m_0,t_0]\). To prove the last assertion of the lemma, assume on the contrary that \(x(m_0)\ge 0\). Since \(x(t)<0\), for all \(t\in \,]m_0,t_0]\), use again Lemma 2.1 in the ODE in (1), to obtain (12) with \(m_0\) in the place of \(t_1\). Using the mean value theorem again, we get
for some \(s\in \,]m_0,t_0]\). The above expression entails a contradiction, which implies that \(x(m_0) < 0\). \(\square \)
Lemma 2.3
Let \(f:\mathbb {R}\rightarrow \mathbb {R}\) be twice continuously differentiable everywhere. Let \(t_0\in \mathbb {R} \) and \(x_0\in \mathbb {R}\) be chosen as in (4). Consider the initial value problem (1) with \(x_0<0\). Assume that the system, with the unknown \((x,t)\in \mathbb {R}^2\), given by
has a unique real solution \((\overline{x},\overline{t})\) with \(\overline{x}>0\) and \(\overline{t}>0\). Then, the solution of (1) can be infinitely extended to the left; in other words, \(m_0=-\infty \), and so \(x(t)<0\), for all \(t\le t_0\).
Proof
Indeed, assume that, on the contrary, \(m_0\in \mathbb {R}\). By Remark 2.2(c), this can only happen if the right-hand side of (1) becomes discontinuous at \(t=m_0\). This implies that
By Lemma 2.1, we have
for all \(t\in [m_0,t_0]\). This fact combined with (4) implies that
By Lemma 2.2, we have that \(x(m_0)<0\). Equations (15) and (16) imply that there is a pair \((x,t) = (x(m_0),m_0)\) which solves system (14), with \(x < 0\). Since system (14) has a unique solution \((\overline{x},\overline{t})\) with \(\bar{x}>0\), we arrive at a contradiction. Hence, we must have \(m_0=-\infty \). It follows by Lemma 2.2 that \(x(t)<0\), for all \(t\le t_0\). \(\square \)
3 Counter-Example
Proposition 3.1
Consider
Suppose that \(x(t)\) solves (1). Then, one has that \(f''(x(t)) + t > 0\) for all \(t\in \,]0, t_0]\), but that \(x(0)\) is not a global minimizer of \(f(x)\).
Proof
We will first show that this quartic polynomial function \(f(x)\) verifies the hypotheses of Lemma 2.3. Then, we will conclude that there exists \(t_0\) such that the denominator \(q(t)\), defined in (5), is positive for all \(t\in \,]\!\!-\!\infty ,t_0]\). Hence, \(f(x)\) satisfies the assumptions of Theorem 4.1 in [1].
Note that \(f(x)\) has local minima at \(x=-2\) and \(x=7\) and a local maximum at \(x=1\). We also note that \(f(-2) = -104\), \(f(7) = -833\) and \(f(1) = 31\). Therefore, \(x=7\) is the global minimizer of \(f(x)\).
Let us now compute \(t_0\) and \(x_0\). We have
and
The minimum of the quadratic function \(\varphi _{xx}(x,t_0)\) above occurs at \(x = 2\). Therefore, one gets \(t_0 > 84\), to guarantee that (3) holds. Let \(t_0=100\). Then we obtain, as the only real solution of (17),
by means of some computer algebra package, e.g., Matlab. Approximately, \(x_0 \approx -0.681220\). The initial value problem (1) becomes
Next, let us show that \(f\) verifies the hypotheses of Lemma 2.3. From \(\varphi _{xx}(\overline{x},\overline{t}) = 0\), which is the second equation of (14), we get
Substitution of this expression for \(\overline{t}\) into \(\varphi _x(\overline{x},\overline{t}) = 0\), which is the first equation of (14), yields
The only real solution of the latter equation is found as
by Matlab. Approximately, \(\overline{x}\approx 3.554149\) and, in turn, \(\overline{t}\approx 55.01544\).
Therefore, the hypotheses of Lemma 2.3 are satisfied. Note also that the denominator in (5), \(q(t_0)=q(100)>0\). Since, by Lemma 2.3, the solution of (18) is well-defined on \(]\!\!-\!\infty ,100]\), we have that the denominator \(q(t)>0\) for all \(t\in [0,100]\), satisfying the hypotheses of Theorem 4.1 in [1].
Since \(x_0<0\) and \(q(100)>0\), we have \(\dot{x}(100)>0\), and so, by Lemma 2.3, the unique \(x(t)\) which solves (18) is negative for all \(t\in [0,100]\). However, \(x(0)<0\) is not the global minimizer of \(f(x)\). \(\square \)
In Fig. 1, an illustration of the backward differential flow method, as applied to the polynomial in Proposition 3.1, is given. The solution curve of (18) is depicted on a surface plot of the function \(\varphi (x,t)\). The curve is generated by solving (18) numerically using the Matlab function ode113, with RelTol = 1e-06. It can be clearly observed in the figure that \(x(0)\) approximates the local minimizer \(x=-2\), rather than the global minimizer \(x=7\).
3.1 Other Counter-Examples
The fact that \(x(0)\) is not a global minimizer is not a rare occurence; indeed, it is frequently encountered. In what follows, we provide a few more examples for which \(x(0)\) of the backward differential flow is not a global minimizer.
\(f(x) = x^4 - (16/3)\,x^3 - 2\,x^2 + 16\,x + 2\) (global minimizer: \(x=4\); local minimizer: \(x=-1\))
\(f(x) = x^4 + (20/3)\,x^3 - 2\,x^2 -20\,x + 3\) (global minimizer: \(x=-5\); local minimizer: \(x=1\))
4 Conclusions
We have demonstrated, via a counter-example, that the backward differential flow approach presented by Zhu et al. [1] does not necessarily yield a global minimizer of a coercive even-degree polynomial. The counter-example will hopefully help/prompt to determine where the proof of Theorem 4.1 in [1] breaks down. This might in turn help find a correct statement for the theorem.
References
Zhu, J., Zhao, S., Liu, G.: Solution to global minimization of polynomials by backward differential flow. J. Optim. Theory Appl. 161, 828–836 (2014)
Arnold, V.I.: Ordinary Differential Equations. The MIT Press, Cambridge (1978)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Arıkan, O., Burachik, R.S. & Kaya, C.Y. “Backward Differential Flow” May Not Converge to a Global Minimizer of Polynomials. J Optim Theory Appl 167, 401–408 (2015). https://doi.org/10.1007/s10957-015-0727-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10957-015-0727-7