“Backward Differential Flow” May Not Converge to a Global Minimizer of Polynomials

Arıkan, Orhan; Burachik, Regina S.; Kaya, C. Yalçın

doi:10.1007/s10957-015-0727-7

“Backward Differential Flow” May Not Converge to a Global Minimizer of Polynomials

Published: 20 March 2015

Volume 167, pages 401–408, (2015)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Journal of Optimization Theory and Applications Aims and scope Submit manuscript

“Backward Differential Flow” May Not Converge to a Global Minimizer of Polynomials

Download PDF

Orhan Arıkan¹,
Regina S. Burachik² &
C. Yalçın Kaya²

201 Accesses
2 Citations
Explore all metrics

Abstract

We provide a simple counter-example to prove and illustrate that the backward differential flow approach, proposed by Zhu, Zhao and Liu for finding a global minimizer of coercive even-degree polynomials, can converge to a local minimizer rather than a global minimizer. We provide additional counter-examples to stress that convergence to a local minimum via the backward differential flow method is not a rare occurence.

New analysis of linear convergence of gradient-type methods via unifying error bound conditions

Article 04 January 2019

Forward–backward quasi-Newton methods for nonsmooth optimization problems

Article 10 April 2017

On the Acceleration of Forward-Backward Splitting via an Inexact Newton Method

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

In their recent article, Zhu et al. [1] provide a method for finding a solution to global minimization of multivariate polynomials of even degree. In this note, we exemplify, and thus prove, that their method does not necessarily yield a global minimizer.

2 Preliminaries

For simplicity, we focus on the special case of monic quartic univariate polynomials $f:\mathbb {R}\rightarrow \mathbb {R}$ such that

$$\begin{aligned} f(x) = x^4 + a_3\,x^3 + a_2\,x^2 + a_1\,x + a_0, \end{aligned}$$

where $a_0$, $a_1$, $a_2$ and $a_3$ are real numbers. What Zhu et al. propose in [1] can be translated into this setting as one of solving the following initial value problem. With $x:\mathbb {R}\rightarrow \mathbb {R}$ as the dependent variable and $t$ as the independent variable,

$$\begin{aligned} \dot{x}(t) = -\frac{x(t)}{f''(x(t)) + t},\quad 0\le t\le t_0\,,\quad x(t_0) = x_0, \end{aligned}$$

(1)

where $\dot{x} = \hbox {d}x/\hbox {d}t$, such that

$$\begin{aligned} f'(x_0) + t_0\,x_0 = 0 \end{aligned}$$

(2)

and

$$\begin{aligned} f''(x) + t_0 > 0\,,\quad \text{ for } \text{ all } x\in \mathbb {R}. \end{aligned}$$

(3)

Theorem 4.1 in [1], which is the main result for the so-called backward differential flow method, can then be rephrased as follows.

“If $x(t)$ solves (1) and $f''(x(t)) + t > 0$ for all $t\in \,]0, t_0]$, then $x(0)$ is a global minimizer of $f(x)$.”

We note that, because $f$ is a monic quartic polynomial, and so is coercive, a large enough positive $t_0$ can always be found so that Condition (3) is satisfied. Zhu et al. provide an estimate of $t_0$ by restricting the domain of $f$ to a closed ball (in the univariate case, $-a\le x\le a$), in which a global minimizer is contained. In the quartic univariate case, one can even find the smallest $t_0$ satisfying (3) easily (as illustrated in the counter-example below). Therefore, an estimate for $t_0$ as proposed in [1] is not needed. Then, by (3), there exists a unique solution $x_0$ to (2). Finally, the initial value problem (1) is solved from $x(t_0) = x_0$ backward in $t$, with the resulting solution referred to as backward differential flow by Zhu et al., to obtain $x(0)$. The point $x(0)$ is claimed in [1] to be a global minimizer. We will prove, via a counter-example, that $x(0)$ is not necessarily a global minimizer.

Before providing a counter-example to Theorem 4.1 of [1], we will make some remarks in order to view the problem from a slightly different point.

Remark 2.1

Define

$$\begin{aligned} \varphi (x,t) := f(x) + \frac{t}{2}\,x^2. \end{aligned}$$

Then, $\varphi (x,t)$ can be viewed as a quadratic regularization of $f(x)$, with regularization parameter $t>0$. Note that $\varphi _x(x,t) = f'(x) + t\,x$ and $\varphi _{xx}(x,t) = f''(x) + t$, where the subscripts $x$ and $xx$ stand for $\partial /\partial x$ and $\partial ^2/\partial x^2$, respectively. Therefore, (2)–(3) above can be rewritten as

$$\begin{aligned} \varphi _x(x_0,t_0) = 0, \end{aligned}$$

and

$$\begin{aligned} \varphi _{xx}(x,t_0) > 0\,,\quad \text{ for } \text{ all } x\in \mathbb {R}. \end{aligned}$$

We now recall a well-known fact regarding maximal extension of solutions of ODEs.

Remark 2.2

Assume that $f:\mathbb {R}\rightarrow \mathbb {R}$ is twice continuously differentiable everywhere. Let $t_0\in \mathbb {R} $ and $x_0\in \mathbb {R}$ such that

$$\begin{aligned} f'(x_0) + t_0\,x_0=0\quad \hbox {and}\quad f''(x_0) + t_0 > 0. \end{aligned}$$

(4)

The following hold.

(a)
There exists $r>0$ such that there is a unique solution $x(\cdot )$ of (1) in $]t_0-r,t_0+r[$.
(b)
There exists a maximal interval to the left of $t_0$, say $]m_0,t_0]$, such that there exists a solution of (1) in $]m_0,t_0]$.
(c)
Either $m_0 = -\infty $, or $m_0\in \mathbb {R}$ and $f''(x(m_0)) + m_0=0$.

Part (a) follows from the classical Picard-Lindelöf existence and uniqueness theorem (see [2]), because the right-hand side of the ODE in (1) is Lipschitz continuous in $x$ and continuous in $t$ in a neighborhood of $t_0$. Part (b) is the classical result on maximal extension of solutions of ODEs. The option $m_0 = -\infty $ of part (c) corresponds to the case in which the right-hand side remains Lipschitz continuous in $x$ for all $t<t_0$. The remaining option happens when the denominator

$$\begin{aligned} q(t) := f''(x(t)) + t \end{aligned}$$

(5)

vanishes at $t = m_0$.

In the following simple lemma, we state a straightforward reformulation of the initial value problem in (1).

Lemma 2.1

Assume that $f:\mathbb {R}\rightarrow \mathbb {R}$ is twice continuously differentiable everywhere. Let $t_0\in \mathbb {R} $ and $x_0\in \mathbb {R}$ be chosen as in (4). Let $x(\cdot )$ be the maximally extended solution of (1), and $]m_0,t_0]$ the corresponding maximal interval. Then, we have that

$$\begin{aligned} \varphi _x(x(t),t)&= f'(x(t)) + t x(t) = 0\,,\ \ \varphi _{xx}(x(t),t)\\&= f''(x(t)) + t>0\,,\ \forall t\in [m_0,t_0]. \end{aligned}$$

Proof

Solvability of (1) over $]m_0,t_0]$ implies that the right-hand side of the ODE is continuous on $]m_0,t_0]$. In other words, the denominator of the right-hand side of the ODE is not zero and so it does not change sign on $]m_0,t_0]$. Since $\varphi _{xx}(x(t_0),t_0) > 0$ and the solution exists in $]m_0,t_0]$, we must have

$$\begin{aligned} \varphi _{xx}(x(t),t) = f''(x(t)) + t>0, \end{aligned}$$

(6)

for all $t\in \,]m_0,t_0]$. Then, for all $t\in \,]m_0,t_0]$, we can rewrite the ODE in (1) as

$$\begin{aligned} \dot{x}(t)\, (f''(x(t)) + t ) + x(t) = 0, \end{aligned}$$

which can be rewritten in terms of $\varphi $ as

$$\begin{aligned} \frac{\hbox {d}}{\hbox {d}t}\,\varphi _{x}(x(t),t) = 0. \end{aligned}$$

(7)

By (4), we also have

$$\begin{aligned} \varphi _x(x(t_0),t_0) = f'(x(t_0)) + x(t_0)\,t_0 = 0. \end{aligned}$$

(8)

Equalities (7) and (8) imply that

$$\begin{aligned} \varphi _{x}(x(t),t) = f'(x(t)) + x(t)\, t = 0, \end{aligned}$$

(9)

for all $t\in \,]m_0,t_0]$. Equality (9) holds at $t=m_0$ by continuity of $f'$ and $x(\cdot )$. $\square $

Next lemma shows that if we start with a negative initial value at $t_0$, then the solution of the initial value problem (1) remains negative over its maximal domain of definition.

Lemma 2.2

Let $f:\mathbb {R}\rightarrow \mathbb {R}$ be twice continuously differentiable everywhere. Let $t_0\in \mathbb {R} $ and $x_0\in \mathbb {R}$ be chosen as in (4). Consider the initial value problem (1). Let $x(\cdot )$ be the maximally extended solution of (1), and $]m_0,t_0]$ the corresponding (finite or infinite) maximal interval of definition of $x(\cdot )$. If $x_0<0$, then $x(t)<0$ for all $t\in \,]m_0,t_0]$. If $m_0\in \mathbb {R}$, then $x(m_0)<0$.

Proof

Suppose that for some $t\in \,]m_0,t_0]$, we have $x(t)\ge 0$. Consider the set $S:=\{t\in \,]m_0,t_0]\,:\, x(t)\ge 0\}$. This set is non-empty and bounded above by $t_0$. Let

$$\begin{aligned} t_1:=\sup \,S. \end{aligned}$$

Note that $t_1\in S$ and $t_1<t_0$. We claim that $x(t_1)\ge 0$. Indeed, if $x(t_1)<0$, then for some $r>0$, we have

$$\begin{aligned} x(t) < 0\,,\hbox { for all } t\in \,]t_1-r,t_1+r [\,. \end{aligned}$$

(10)

By definition of $t_1$ as a supremum of $S$, there exists $t\in S$ such that $t\in \,]t_1-r,t_1]$, which means that $x(t)\ge 0$, contradicting (10). Hence, $x(t_1)\ge 0$ and by definition of $t_1$, we have

$$\begin{aligned} x(t)<0\,,\hbox { for all } t\in \,]t_1,t_0]. \end{aligned}$$

(11)

Using (11) and Lemma 2.1 in the ODE in (1), we conclude that

$$\begin{aligned} \dot{x}(t) > 0,\hbox { for all } t\in \,]t_1,t_0]. \end{aligned}$$

(12)

By the mean value theorem, there exists $s\in \,]t_1,t_0]$ such that

$$\begin{aligned} \begin{array}{rcl} x(t_1)= & {} x(t_0) + \dot{x}(s)\,(t_1-t_0) < x_0, \end{array} \end{aligned}$$

where we used (12). The above expression implies that

$$\begin{aligned} x(t_1) < x_0 < 0\,,\hbox { for all } t\in \,]t_1,t_0], \end{aligned}$$

(13)

which is a contradiction. Hence, $x(t)<0$, for all $t\in \,]m_0,t_0]$. To prove the last assertion of the lemma, assume on the contrary that $x(m_0)\ge 0$. Since $x(t)<0$, for all $t\in \,]m_0,t_0]$, use again Lemma 2.1 in the ODE in (1), to obtain (12) with $m_0$ in the place of $t_1$. Using the mean value theorem again, we get

$$\begin{aligned} 0\le x(m_0)=x(t_0)+\dot{x}(s)\,(m_0-t_0) < x_0 < 0, \end{aligned}$$

for some $s\in \,]m_0,t_0]$. The above expression entails a contradiction, which implies that $x(m_0) < 0$. $\square $

Lemma 2.3

Let $f:\mathbb {R}\rightarrow \mathbb {R}$ be twice continuously differentiable everywhere. Let $t_0\in \mathbb {R} $ and $x_0\in \mathbb {R}$ be chosen as in (4). Consider the initial value problem (1) with $x_0<0$. Assume that the system, with the unknown $(x,t)\in \mathbb {R}^2$, given by

$$\begin{aligned} f'(x) + t\,x = 0\,,\quad f''(x) + t = 0, \end{aligned}$$

(14)

has a unique real solution $(\overline{x},\overline{t})$ with $\overline{x}>0$ and $\overline{t}>0$. Then, the solution of (1) can be infinitely extended to the left; in other words, $m_0=-\infty $, and so $x(t)<0$, for all $t\le t_0$.

Proof

Indeed, assume that, on the contrary, $m_0\in \mathbb {R}$. By Remark 2.2(c), this can only happen if the right-hand side of (1) becomes discontinuous at $t=m_0$. This implies that

$$\begin{aligned} f''(x(m_0)) + m_0 = 0. \end{aligned}$$

(15)

By Lemma 2.1, we have

$$\begin{aligned} f'(x(t)) + t\,x(t) = 0, \end{aligned}$$

for all $t\in [m_0,t_0]$. This fact combined with (4) implies that

$$\begin{aligned} f'(x(m_0)) + m_0\,x(m_0) = 0. \end{aligned}$$

(16)

By Lemma 2.2, we have that $x(m_0)<0$. Equations (15) and (16) imply that there is a pair $(x,t) = (x(m_0),m_0)$ which solves system (14), with $x < 0$. Since system (14) has a unique solution $(\overline{x},\overline{t})$ with $\bar{x}>0$, we arrive at a contradiction. Hence, we must have $m_0=-\infty $. It follows by Lemma 2.2 that $x(t)<0$, for all $t\le t_0$. $\square $

3 Counter-Example

Proposition 3.1

Consider

$$\begin{aligned} f(x) = x^4 - 8\,x^3 - 18\,x^2 + 56\,x. \end{aligned}$$

Suppose that $x(t)$ solves (1). Then, one has that $f''(x(t)) + t > 0$ for all $t\in \,]0, t_0]$, but that $x(0)$ is not a global minimizer of $f(x)$.

Proof

We will first show that this quartic polynomial function $f(x)$ verifies the hypotheses of Lemma 2.3. Then, we will conclude that there exists $t_0$ such that the denominator $q(t)$, defined in (5), is positive for all $t\in \,]\!\!-\!\infty ,t_0]$. Hence, $f(x)$ satisfies the assumptions of Theorem 4.1 in [1].

Note that $f(x)$ has local minima at $x=-2$ and $x=7$ and a local maximum at $x=1$. We also note that $f(-2) = -104$, $f(7) = -833$ and $f(1) = 31$. Therefore, $x=7$ is the global minimizer of $f(x)$.

Let us now compute $t_0$ and $x_0$. We have

$$\begin{aligned} \varphi _x(x_0,t_0) = 4\,x_0^3 - 24\,x_0^2 + (t_0 - 36)\,x_0 + 56 = 0 \end{aligned}$$

(17)

and

$$\begin{aligned} \varphi _{xx}(x,t_0) = 12\,x^2 - 48\,x + t_0 - 36 > 0, \quad \text{ for } \text{ all } x\in \mathbb {R}. \end{aligned}$$

The minimum of the quadratic function $\varphi _{xx}(x,t_0)$ above occurs at $x = 2$. Therefore, one gets $t_0 > 84$, to guarantee that (3) holds. Let $t_0=100$. Then we obtain, as the only real solution of (17),

$$\begin{aligned} x_0 = 2 + \left( (\sqrt{18417}/9) - 15\right) ^{1/3} - 4\bigg /\left( 3\,\left( \displaystyle (\sqrt{18417}/9) - 15\right) ^{1/3}\right) < 0, \end{aligned}$$

by means of some computer algebra package, e.g., Matlab. Approximately, $x_0 \approx -0.681220$. The initial value problem (1) becomes

$$\begin{aligned} \dot{x}(t) = -\frac{x(t)}{12\,x^2(t) - 48\,x(t) + t - 36},\quad 0\le t\le 100\,,\quad x(100) = x_0. \end{aligned}$$

(18)

Next, let us show that $f$ verifies the hypotheses of Lemma 2.3. From $\varphi _{xx}(\overline{x},\overline{t}) = 0$, which is the second equation of (14), we get

$$\begin{aligned} \overline{t}= -12\,\overline{x}^2 + 48\,\overline{x}+ 36. \end{aligned}$$

Substitution of this expression for $\overline{t}$ into $\varphi _x(\overline{x},\overline{t}) = 0$, which is the first equation of (14), yields

$$\begin{aligned} 8\,\overline{x}^3 - 24\,\overline{x}^2 - 56 = 0. \end{aligned}$$

The only real solution of the latter equation is found as

$$\begin{aligned} \overline{x}= 1 + \left( \frac{9 + \sqrt{77}}{2}\right) ^{1/3} - \left( \frac{2}{9 + \sqrt{77}}\right) ^{1/3} > 0, \end{aligned}$$

by Matlab. Approximately, $\overline{x}\approx 3.554149$ and, in turn, $\overline{t}\approx 55.01544$.

Therefore, the hypotheses of Lemma 2.3 are satisfied. Note also that the denominator in (5), $q(t_0)=q(100)>0$. Since, by Lemma 2.3, the solution of (18) is well-defined on $]\!\!-\!\infty ,100]$, we have that the denominator $q(t)>0$ for all $t\in [0,100]$, satisfying the hypotheses of Theorem 4.1 in [1].

Since $x_0<0$ and $q(100)>0$, we have $\dot{x}(100)>0$, and so, by Lemma 2.3, the unique $x(t)$ which solves (18) is negative for all $t\in [0,100]$. However, $x(0)<0$ is not the global minimizer of $f(x)$. $\square $

In Fig. 1, an illustration of the backward differential flow method, as applied to the polynomial in Proposition 3.1, is given. The solution curve of (18) is depicted on a surface plot of the function $\varphi (x,t)$. The curve is generated by solving (18) numerically using the Matlab function ode113, with RelTol = 1e-06. It can be clearly observed in the figure that $x(0)$ approximates the local minimizer $x=-2$, rather than the global minimizer $x=7$.

3.1 Other Counter-Examples

The fact that $x(0)$ is not a global minimizer is not a rare occurence; indeed, it is frequently encountered. In what follows, we provide a few more examples for which $x(0)$ of the backward differential flow is not a global minimizer.

$f(x) = x^4 - (16/3)\,x^3 - 2\,x^2 + 16\,x + 2$ (global minimizer: $x=4$; local minimizer: $x=-1$)

$f(x) = x^4 + (20/3)\,x^3 - 2\,x^2 -20\,x + 3$ (global minimizer: $x=-5$; local minimizer: $x=1$)

4 Conclusions

We have demonstrated, via a counter-example, that the backward differential flow approach presented by Zhu et al. [1] does not necessarily yield a global minimizer of a coercive even-degree polynomial. The counter-example will hopefully help/prompt to determine where the proof of Theorem 4.1 in [1] breaks down. This might in turn help find a correct statement for the theorem.

References

Zhu, J., Zhao, S., Liu, G.: Solution to global minimization of polynomials by backward differential flow. J. Optim. Theory Appl. 161, 828–836 (2014)
Article MATH MathSciNet Google Scholar
Arnold, V.I.: Ordinary Differential Equations. The MIT Press, Cambridge (1978)
Google Scholar

Download references

Author information

Authors and Affiliations

Electrical and Electronics Engineering Department, Bilkent University, Bilkent, 06800, Ankara, Turkey
Orhan Arıkan
School of IT and Mathematical Sciences, University of South Australia, Mawson Lakes, SA, 5095, Australia
Regina S. Burachik & C. Yalçın Kaya

Authors

Orhan Arıkan
View author publications
You can also search for this author in PubMed Google Scholar
Regina S. Burachik
View author publications
You can also search for this author in PubMed Google Scholar
C. Yalçın Kaya
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to C. Yalçın Kaya.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Arıkan, O., Burachik, R.S. & Kaya, C.Y. “Backward Differential Flow” May Not Converge to a Global Minimizer of Polynomials. J Optim Theory Appl 167, 401–408 (2015). https://doi.org/10.1007/s10957-015-0727-7

Download citation

Received: 08 November 2014
Accepted: 11 March 2015
Published: 20 March 2015
Issue Date: October 2015
DOI: https://doi.org/10.1007/s10957-015-0727-7

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

“Backward Differential Flow” May Not Converge to a Global Minimizer of Polynomials

Abstract

Similar content being viewed by others

New analysis of linear convergence of gradient-type methods via unifying error bound conditions

Forward–backward quasi-Newton methods for nonsmooth optimization problems

On the Acceleration of Forward-Backward Splitting via an Inexact Newton Method

1 Introduction

2 Preliminaries

Remark 2.1

Remark 2.2

Lemma 2.1

Proof

Lemma 2.2

Proof

Lemma 2.3

Proof

3 Counter-Example

Proposition 3.1

Proof

3.1 Other Counter-Examples

4 Conclusions

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

“Backward Differential Flow” May Not Converge to a Global Minimizer of Polynomials

Abstract

Similar content being viewed by others

New analysis of linear convergence of gradient-type methods via unifying error bound conditions

Forward–backward quasi-Newton methods for nonsmooth optimization problems

On the Acceleration of Forward-Backward Splitting via an Inexact Newton Method

1 Introduction

2 Preliminaries

Remark 2.1

Remark 2.2

Lemma 2.1

Proof

Lemma 2.2

Proof

Lemma 2.3

Proof

3 Counter-Example

Proposition 3.1

Proof

3.1 Other Counter-Examples

4 Conclusions

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation