1 Introduction

Mathematical programming problems arise in practically every area of production, the physical sciences, engineering and government planning. The nonlinear programming (NLP) problem plays an important part among them. Convex programming is a broad class of NLP problems in which the objective function and the constraints are convex.

NNs are computing systems composed of a large number of highly interconnected simple information processing units, and thus can usually solve optimization problems faster than most popular optimization algorithms. Moreover, numerical ordinary differential equation techniques can be applied directly to a continuous-time NN to solve constrained optimization problems effectively. NN models are often more competent than numerical optimization methods because of their inherent parallelism. Tank and Hopfield (1986) first proposed a NN model for linear programming (LP) problems. Kennedy and Chua (1988) then proposed a NN model with a finite penalty parameter for solving NLP problems. Rodriguez et al. (1990) proposed a switched-capacitor NN for solving a class of constrained nonlinear convex optimization problems. Zhang and Constantinides (1992) proposed a two-layer NN model to solve some strictly convex programming problems. Bouzerdoum and Pattison (1993) presented a recurrent NN for solving convex quadratic optimization problems with bound constraints. Zhang (1996) and Zhang et al. (2002) proposed adaptive NN models for NLP problems. Wang (1994), Xia (1996) and Xia and Wang (2000, 2004, 2005) presented several NN models for solving LP and NLP problems, monotone variational inequality problems and monotone projection equations. Effati and Baymani (2005), Effati et al. (2011, 2015), Effati and Nazemi (2006), Effati and Ranjbar (2011) and Ranjbar et al. (2017) proposed NN models for solving LP, NLP and binary programming problems. Nazemi (2012, 2013, 2014) proposed dynamic system models for solving convex NLP problems. Recently, Huang and Cui (2016) proposed a NN model for solving convex quadratic programming (QP) problems.

Nonlinear complementarity problems (NCPs) have attracted much attention because of their wide applications in operations research, economics and engineering (Chen et al. 2010; Ferris et al. 2001). Liao et al. (2001) presented a NN approach for solving NCPs whose model is derived from an unconstrained minimization reformulation of the complementarity problem. A popular NCP function is the Fischer–Burmeister function (see Fischer 1992, 1997); Chen and Pan (2008) proposed a family of NCP functions that subsumes the Fischer–Burmeister function as a special case, and Chen et al. (2010) developed a NN for solving NCPs based on the generalized Fischer–Burmeister function.

In this paper, by using the Karush–Kuhn–Tucker (KKT) conditions and the MS implicit Lagrangian function (see Mangasarian and Solodov 1993) as an NCP function, we present a one-layer NN model for solving convex optimization problems. The rest of the paper is organized as follows. In Sect. 2, some preliminary results are provided. In Sect. 3, an equivalent formulation of the convex optimization problem and a corresponding NN model are proposed. Convergence and stability results are discussed in Sect. 4. Simulation results of the new model on several numerical examples are reported in Sect. 5. Finally, some concluding remarks are drawn in Sect. 6.

2 Preliminaries

In this section, we recall some mathematical background concepts and materials that play an important role in designing the desired NN and in studying its stability. First, we recall some basic classes of functions and matrices, and then we introduce some properties of special NCP functions.

Definition 2.1

A matrix \(A \in {\mathbb {R}}^{n \times n}\) is a

  1. (i)

    \(P_0\)-matrix if each of its principal minors is nonnegative.

  2. (ii)

    P-matrix if each of its principal minors is positive.

Obviously, a positive-definite matrix is a P-matrix and a positive-semidefinite matrix is a \(P_0\)-matrix. For more properties of P-matrices and \(P_0\)-matrices, refer to Chen and Pan (2008).
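Both conditions can be checked directly, if expensively, by enumerating principal minors. The following Python sketch is purely illustrative (the function names and the \(2^n - 1\) enumeration are assumptions of this presentation, practical only for small n):

```python
import numpy as np
from itertools import combinations

def principal_minors(A):
    """Yield all 2^n - 1 principal minors of a square matrix A."""
    n = A.shape[0]
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            yield np.linalg.det(A[np.ix_(idx, idx)])

def is_P0_matrix(A, tol=1e-12):
    return all(m >= -tol for m in principal_minors(A))  # all minors nonnegative

def is_P_matrix(A, tol=1e-12):
    return all(m > tol for m in principal_minors(A))    # all minors positive

A = np.array([[2.0, -1.0], [-1.0, 2.0]])  # positive definite, hence a P-matrix
B = np.array([[1.0, 2.0], [0.0, 0.0]])    # a P_0-matrix that is not a P-matrix
print(is_P_matrix(A), is_P0_matrix(A))    # True True
print(is_P_matrix(B), is_P0_matrix(B))    # False True
```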

Definition 2.2

The function \(F:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) is said to be

  1. (i)

    monotone if \((x-y)^\mathrm{T}(F(x)-F(y)) \ge 0\) for all \(x,y \in {\mathbb {R}}^n\);

  2. (ii)

    \(P_0\)-function if \(\max _{1 \le i \le n,~ x_i \ne y_i}(x_i - y_i)(F_i(x) - F_i(y)) \ge 0\) for all \(x,y \in {\mathbb {R}}^n\) with \(x \ne y\);

  3. (iii)

    P-function if \(\max _{1 \le i \le n }(x_i - y_i)(F_i(x)-F_i(y)) > 0\) for all \(x,y \in {\mathbb {R}}^n\) with \(x \ne y\).

Note 2.1

It is known that a continuously differentiable F is a \(P_0\)-function if and only if \(\frac{\mathrm{d} F}{\mathrm{d} x}\) is a \(P_0\)-matrix for all \(x \in {\mathbb {R}}^n\), and that if \(\frac{\mathrm{d} F}{\mathrm{d} x}\) is a P-matrix for all \(x \in {\mathbb {R}}^n\), then F must be a P-function.

The NCP is to find a point \(x \in {\mathbb {R}}^n\) such that

$$\begin{aligned} x \ge 0, \,\,\,\,\, F(x) \ge 0, \,\,\,\,\, \langle x , F(x) \rangle =0, \end{aligned}$$
(1)

where \(\langle \cdot , \cdot \rangle \) is the Euclidean inner product and \(F=(F_1, F_2, \ldots , F_n)^\mathrm{T}\) maps \({\mathbb {R}}^n\) into \({\mathbb {R}}^n\). There are many methods for solving the NCP; one of the most popular and powerful approaches, which has been studied intensively, is to reformulate the NCP as a system of nonlinear equations or as an unconstrained minimization problem. A function that yields an equivalent unconstrained minimization problem for the NCP is called a merit function. Many studies on NCP functions and their applications have appeared during the past three decades (see Chen 2007; Cottle et al. 1992; Ferris and Kanzow 2002; He et al. 2015; Hu et al. 2009; Kanzow et al. 1997; Mangasarian and Solodov 1993; Miri and Effati 2015). The class of NCP functions defined below is used to construct a merit function.

Definition 2.3

A function \(\phi : {\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) is called an NCP function if it satisfies

$$\begin{aligned} {\phi (a,b)=0 \Longleftrightarrow a \ge 0, \; \; b \ge 0, \; \; ab=0. } \end{aligned}$$
(2)

For example, the following functions are NCP functions:

  1. (a)

    \(\varphi (a, b) = \min \lbrace a, b \rbrace \)

  2. (b)

    \(\varphi (a, b) =(\frac{1}{2})((ab)^2 + \min ^2\lbrace 0, a \rbrace + \min ^2\lbrace 0, b \rbrace )\)

  3. (c)

    \(\varphi (a, b)=\sqrt{a^2 + b^2} - a - b\)

  4. (d)

    \(\varphi (a, b)=ab + \frac{1}{2\alpha }\big ( ((a-\alpha b)_{+})^2 - a^2 + ((b - \alpha a)_{+})^2 - b^2\big ), \,\, \alpha > 1,\)

where \((x)_{+}= \max \lbrace 0, x\rbrace \). Some other NCP functions are listed in Chen (2007), Ferris and Kanzow (2002), Hu et al. (2009) and Kanzow et al. (1997). Among the NCP functions introduced above, we use function (d), the well-known MS NCP function, which in vector form reads:

$$\begin{aligned} M(x, \alpha )= & {} x^\mathrm{T}F(x) + \frac{1}{2\alpha }\big (\Vert (x-\alpha F(x))_{+} \Vert ^2 - \Vert x \Vert ^2 \\&+ \Vert (F(x) - \alpha x)_{+} \Vert ^2 -\Vert F(x) \Vert ^2\big ). \end{aligned}$$

This NCP function has the following favorable properties:

  • \(M(x, \alpha )\) is nonnegative on \({\mathbb {R}}^n \times (1, \infty )\); this is not true for NCP functions (a) and (c).

  • \(M(x, \alpha )\) equals zero if and only if x is a solution of the NCP, regardless of whether F is monotone.

  • \(M(x, \alpha )\) is continuously differentiable at all points; this is not true for NCP function (a).

  • If F is differentiable on \({\mathbb {R}}^n\), then \(M(x, \alpha )\) satisfies \(M(x, \alpha )=0 \Leftrightarrow \nabla M(x, \alpha )=0\) for \(\alpha > 1\); this is not true for NCP function (b).

  • It is a merit function; this is not true for NCP functions (a) and (c).

For other features, and for verification of the properties listed above, refer to Mangasarian and Solodov (1993). For \(F=(F_1, F_2, \ldots , F_n)^\mathrm{T}\), NCP(F) can be equivalently reformulated as finding a solution of the following system of equations:

$$\begin{aligned} {\left[ \begin{array}{c} M_1(x , \alpha )\\ \vdots \\ M_n(x , \alpha )\\ \end{array} \right] =0,} \end{aligned}$$
(3)

where \(M_i(x , \alpha )=x_iF_i(x) + \frac{1}{2\alpha }(((x_i-\alpha F_i(x))_{+})^2 - x_i^2 + ((F_i(x) - \alpha x_i)_{+})^2 - F_i^2(x))\) for \(i=1, 2, \ldots , n.\) For convenience, we define \(M_{\alpha }(x)\) as follows.

Definition 2.4

We define \(M_{\alpha }(x):{\mathbb {R}}^n \rightarrow {\mathbb {R}}_{+}\) by

$$\begin{aligned} {M_{\alpha }(x)=\sum _{i=1}^{n} M_i(x, \alpha ). } \end{aligned}$$
(4)

In what follows, we present some favorable properties of \(M_{\alpha }(x)\); their proofs can be found in Mangasarian and Solodov (1993).

Theorem 2.1

For \(\alpha \in (1, \infty )\), \(M_{\alpha }(x) \ge 0\) for all \(x \in {\mathbb {R}}^{n}\) and \(M_{\alpha }(x)=0\) if and only if x solves the NCP.
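As a quick numerical illustration of Theorem 2.1, consider the scalar NCP with \(F(x)=x-1\), whose unique solution is \(x^*=1\) (an example assumed here purely for illustration). The merit function vanishes exactly at the solution and is positive elsewhere:

```python
alpha = 2.0  # any alpha > 1 works, per Theorem 2.1

def M_alpha(x):
    # Componentwise MS function (d) for the scalar NCP with F(x) = x - 1
    F = x - 1.0
    pos = lambda v: max(v, 0.0)
    return x * F + (pos(x - alpha * F)**2 - x**2
                    + pos(F - alpha * x)**2 - F**2) / (2.0 * alpha)

for x in (0.0, 0.5, 1.0, 2.0):
    print(x, M_alpha(x))  # 0.75, 0.1875, 0.0, 0.75: zero only at x* = 1
```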

Theorem 2.1 establishes a one-to-one correspondence between solutions of the NCP and global unconstrained minima of the merit function \(M_{\alpha }(x)\). As a consequence, the following immediate result is obtained.

Corollary 2.1

If F is differentiable at a solution \({\overline{x}}\) of NCP, then \(\nabla M_{\alpha }({\overline{x}})=0\) for \(\alpha \in (1, \infty )\).

Theorem 2.2 shows that under certain assumptions, each stationary point of the unconstrained objective function \(M_{\alpha }(x)\) is already a global minimum and therefore a solution of problem (1).

Theorem 2.2

Let \(F: {\mathbb {R}}^{n} \rightarrow {\mathbb {R}}^{n}\) be continuously differentiable with a positive-definite Jacobian \(\frac{\mathrm{d} F}{\mathrm{d} x}\) for all \(x \in {\mathbb {R}}^{n}\), and assume that the complementarity problem is solvable. Then \(x^*\) is a stationary point of \(M_{\alpha }(x)\) if and only if \(x^*\) solves the NCP.

We also recall some material about first-order differential equations and Lyapunov functions. This material can be found in ordinary differential equation and nonlinear control textbooks (see Miller and Michel 1982; Slotine and Li 1991).

Definition 2.5

Let \(K \subseteq {\mathbb {R}}^{n}\) be an open neighborhood of \(x^{*}\). A continuously differentiable function \(E:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}\) is said to be a Lyapunov function at the state \(x^{*}\) over the set K for \(\dot{x}=f(x(t))\) if

  1. (i)

    \( E(x^{*})=0\) and \(E(x)>0\) for all \(x \in K\) with \(x\ne x^{*}\).

  2. (ii)

    \(\frac{\mathrm{d}(E(x(t)))}{\mathrm{d}t}=(\nabla _{x} E(x(t)))^\mathrm{T} {\dot{x}}=(\nabla _{x} E(x(t)))^\mathrm{T} f(x(t)) \le 0 \) for all \(x(t) \in K\).

Lemma 2.1

  1. (i)

    An isolated equilibrium point \(x^{*}\) is Lyapunov stable if there exists a Lyapunov function over some neighborhood K of \(x^{*}.\)

  2. (ii)

    An isolated equilibrium point \(x^{*}\) is asymptotically stable if there is a Lyapunov function over some neighborhood K of \(x^{*}\) such that \(\frac{\mathrm{d}E}{\mathrm{d}t}<0\) for all \(x\ne x^{*} \in K\).

3 Problem formulation and neural network design

Consider the following convex optimization problem:

$$\begin{aligned} \begin{array}{cl} \mathrm{Min} &{}\qquad {f(x)}\\ \\ \mathrm{s.t.} &{}\qquad g(x) \le 0 \\ &{}\qquad x \ge 0, \\ \end{array} \end{aligned}$$
(5)

where \(x \in {\mathbb {R}}^n\), \(f:{\mathbb {R}}^n \rightarrow {\mathbb {R}}\), \(g(x)=[g_1(x), \ldots , g_m(x)]^\mathrm{T} \) is an m-dimensional vector-valued continuous function of n variables and \(f, g_1, \ldots , g_m\) are convex and twice differentiable.

It is well known (see Bazaraa and Shetty 1979) that by the KKT conditions, \(x \in {\mathbb {R}}^{n}\) is an optimal solution to (5) if and only if there exists \(u \in {\mathbb {R}}^{m}\) such that

$$\begin{aligned} \begin{array}{ll} \displaystyle \nabla f(x) + \sum _{i=1}^{m} u_i \nabla g_i(x) \ge 0 &{} \\ \displaystyle \Big [\nabla f(x) + \sum _{i=1}^{m} u_i \nabla g_i(x)\Big ]^\mathrm{T} x=0 &{} \\ u_{i}g_{i}(x)=0, &{} \quad i=1,2,\ldots ,m\\ u_{i} \ge 0, &{} \quad i=1,2,\ldots ,m \\ g_{i}(x) \le 0, &{} \quad i=1,2,\ldots ,m \\ x \ge 0. &{} \end{array} \end{aligned}$$
(6)

It is clear that the KKT optimality conditions of this problem lead to the following complementarity problem:

$$\begin{aligned} z \ge 0, \,\,\,\,\, F(z) \ge 0, \,\,\,\,\, \langle z , F(z) \rangle =0, \end{aligned}$$

where \(z=(x,u)\) and \(F: {\mathbb {R}}^{n+m} \rightarrow {\mathbb {R}}^{n+m}\) is defined by

$$\begin{aligned} {F(z)=\left[ \begin{array}{c} \nabla _{x_1} f(x) + \sum _{i=1}^{m} u_i \nabla _{x_1} g_i(x)\\ \vdots \\ \nabla _{x_n} f(x) + \sum _{i=1}^{m} u_i \nabla _{x_n} g_i(x)\\ - g_1(x)\\ \vdots \\ - g_m(x)\\ \end{array} \right] ,} \end{aligned}$$
(7)

that is,

$$\begin{aligned} F_k(z)= & {} \nabla _{x_k} f(x) + \sum _{i=1}^{m} u_i \nabla _{x_k} g_i(x), \,\,\, k=1, 2, \ldots , n, \\ F_{n+j}(z)= & {} -g_j(x), \,\,\, j=1, 2, \ldots , m. \end{aligned}$$

Solving problem (5) is therefore equivalent to finding a solution of the following equation:

$$\begin{aligned} {\left[ \begin{array}{c} M_1(z , \alpha )\\ \vdots \\ M_{n+m}(z , \alpha )\\ \end{array} \right] =0,} \end{aligned}$$
(8)

where \(M_i(z , \alpha )=z_iF_i(z) + \frac{1}{2\alpha }(((z_i-\alpha F_i(z))_{+})^2 - z_i^2 + ((F_i(z) - \alpha z_i)_{+})^2 - F_i^2(z))\) for \(i=1, 2, \ldots , n+m.\)

By Definition 2.4 and Theorem 2.1, we have \(M_{\alpha }(z) \ge 0\) for all \(z \in {\mathbb {R}}^{n+m}\), and z solves (8) if and only if \(M_{\alpha }(z) = 0\). Hence, solving (8) is equivalent to finding a global minimizer of \(M_{\alpha }(z)\), provided that (8) has a solution.

We utilize \(M_{\alpha }(z)\) as the energy function. As mentioned above, problem (5) is then equivalent to the following unconstrained smooth minimization problem:

$$\begin{aligned} \min _{z \in {\mathbb {R}}^{n+m}} \;\;\;\;\; M_{\alpha }(z), \end{aligned}$$
(9)

where \(M_{\alpha }(z)=\sum _{i=1}^{n+m} M_i(z, \alpha )\). Hence, it is natural to adopt the following steepest-descent-based NN for problem (5):

$$\begin{aligned} \frac{\mathrm{d}z(t)}{\mathrm{d}t}= -\eta \nabla M_{\alpha }(z), \end{aligned}$$
(10)

where \(\eta > 0\) is a scaling factor.
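To make the dynamics concrete, the following is a minimal Python/SciPy sketch of (10) for the data of Example 5.1 in Sect. 5. It is an illustration only: the paper's experiments use MATLAB's ode15s, for which SciPy's stiff BDF integrator serves as an analogue here, and the finite-difference helper approx_fprime stands in for an analytic \(\nabla M_{\alpha }\).

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import approx_fprime

alpha, eta = 2.0, 1.0  # alpha > 1 as Theorem 2.1 requires; eta is the scaling factor

# Data of Example 5.1: n = 2 variables, m = 3 inequality constraints g(x) <= 0
def grad_f(x):
    return np.array([x[0]**3 + x[0] - 0.9 * x[1],
                     x[1]**3 + x[1] - 0.9 * x[0]])

def g(x):
    return np.array([x[0] + x[1] - 2.0,
                     -x[0] + x[1] - 2.0,
                     x[0] - 3.0 * x[1] + 2.0])

G = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -3.0]])  # rows are the gradients of g_i

def F(z):
    """NCP mapping (7) built from the KKT conditions of (5); z = (x, u)."""
    x, u = z[:2], z[2:]
    return np.concatenate([grad_f(x) + G.T @ u, -g(x)])

def M(z):
    """Merit function M_alpha(z): the sum (4) of the componentwise MS terms in (8)."""
    Fz = F(z)
    pos = lambda v: np.maximum(v, 0.0)
    return np.sum(z * Fz + (pos(z - alpha * Fz)**2 - z**2
                            + pos(Fz - alpha * z)**2 - Fz**2) / (2.0 * alpha))

def rhs(t, z):
    # NN dynamics (10): dz/dt = -eta * grad M_alpha(z), gradient by finite differences
    return -eta * approx_fprime(z, M, 1e-7)

z0 = np.ones(5)                                      # an arbitrary initial state
sol = solve_ivp(rhs, (0.0, 50.0), z0, method='BDF',  # BDF is SciPy's analogue of ode15s
                rtol=1e-4, atol=1e-6)
print(sol.y[:, -1])  # expected to approach z* = [0.3461, 0.7820, 0, 0, 0.3163]
```

An analytic gradient of \(M_{\alpha }\) can of course replace the finite-difference approximation; the sketch only fixes the structure of (10).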

Compared with existing NNs for solving such nonlinear optimization problems, the proposed NN has a one-layer architecture, shown in Fig. 1. It also has a simple form and better performance in convergence time, as shown in Sect. 5. Moreover, the proposed NN is stable in the sense of Lyapunov and globally convergent to an exact optimal solution of the original problem.

Fig. 1 Architecture of the proposed NN in (10)

4 Stability analysis

In this section, we study the stability of the equilibrium point of NN (10) and its convergence to the optimal solution. We first state the relationship between an equilibrium point of (10) and a solution to the NCP.

Lemma 4.1

Let S be a nonempty open convex set in \({\mathbb {R}}^n\), and let \(f: S \rightarrow {\mathbb {R}}\) be twice differentiable on S. Then f is convex if and only if the Hessian matrix is positive semidefinite at each point of S.

Lemma 4.2

If the objective function and all constraint functions are convex, then the function F defined in (7) is a \(P_0\)-function on \({\mathbb {R}}^{n+m}\).

Proof

For this purpose, it is sufficient to show that \(\frac{\partial F}{\partial z}\) is positive semidefinite on \({\mathbb {R}}^{n+m}\); it is then a \(P_0\)-matrix, and Note 2.1 applies. We have

$$\begin{aligned} \frac{\partial F}{\partial z}= \left[ \begin{array}{cc} \nabla ^2f(x) + \sum _{i=1}^{m} u_i \nabla ^2g_i(x) &{} \nabla g(x) \\ -\nabla g(x)^\mathrm{T} &{} 0_{m\times m} \\ \end{array} \right] _{(n+m)\times (n+m)}, \end{aligned}$$

now let \(z=(x,u) \ne 0\) be an arbitrary vector in \({\mathbb {R}}^{n+m}\); since f(x) and the \(g_i(x)\) are convex and \(u_i \ge 0\) for \(i=1,2, \ldots ,m\), by Lemma 4.1 we have

$$\begin{aligned} z^\mathrm{T} \frac{\partial F}{\partial z} z= x^\mathrm{T}\Big (\nabla ^2f(x) + \sum _{i=1}^{m} u_i \nabla ^2g_i(x)\Big )x \ge 0; \end{aligned}$$

hence, F is a \(P_0\)-function. \(\square \)
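The block structure in this proof can also be checked numerically. The following sketch (an illustration assuming the data of Example 5.1, whose constraints are linear so that \(\nabla ^2 g_i=0\)) verifies that the symmetric part of \(\frac{\partial F}{\partial z}\), in which the skew blocks \(\nabla g(x)\) and \(-\nabla g(x)^\mathrm{T}\) cancel, is positive semidefinite:

```python
import numpy as np

G = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -3.0]])  # constraint gradients, Example 5.1

def jacobian_F(x):
    """Block Jacobian dF/dz of Lemma 4.2 (linear g, so the u_i Hess(g_i) terms vanish)."""
    H = np.array([[3.0 * x[0]**2 + 1.0, -0.9],
                  [-0.9, 3.0 * x[1]**2 + 1.0]])  # Hessian of f in Example 5.1
    return np.block([[H, G.T],
                     [-G, np.zeros((3, 3))]])

rng = np.random.default_rng(0)
for _ in range(5):
    J = jacobian_F(rng.standard_normal(2))
    sym = 0.5 * (J + J.T)  # the off-diagonal blocks cancel in the symmetric part
    print(np.linalg.eigvalsh(sym).min() >= -1e-10)  # True: z^T (dF/dz) z >= 0 for all z
```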

In the next theorem, we establish the existence and uniqueness of the solution trajectory of (10).

Theorem 4.1

For any initial state \(z_0 = z(t_0)\), there exists a unique solution z(t), \(t \in [t_{0},\tau (z_0))\), of NN (10).

Proof

Since \(\nabla M_{\alpha }(z)\) is continuous, there is a local solution z(t) of (10) with \(t\in [t_{0},\tau )\) for some \(\tau > t_{0}\), and since \(\nabla M_{\alpha }(z)\) is locally Lipschitz continuous, this solution is unique, which completes the proof. \(\square \)

Theorem 4.2

Let \(z^*\) be an isolated equilibrium point of NN (10). Then \(z^*\) is globally asymptotically stable for (10).

Proof

Since \(z^*\) is a solution to the NCP, \(M_{\alpha }(z^*)=0\). In addition, since \(z^*\) is an isolated equilibrium point of (10), there is a neighborhood \(K \subseteq {\mathbb {R}}^{n+m}\) of \(z^*\) such that

$$\begin{aligned} \nabla M_{\alpha }(z^*)=0, ~~~~ \nabla M_{\alpha }(z) \ne 0 ~~ \forall z \in K{\setminus }\lbrace z^*\rbrace . \end{aligned}$$

Next, we define a Lyapunov function as

$$\begin{aligned} E(z(t))=M_{\alpha }(z), \end{aligned}$$

we show that E(z(t)) is a Lyapunov function over the set K for NN (10). By Theorem 2.1, we have \(E(z) \ge 0\) over \({\mathbb {R}}^{n+m}\), and since \(z^*\) is a solution to NCP(F), obviously \(E(z^*)=0\). If there were a \(z \in K{\setminus }\lbrace z^*\rbrace \) satisfying \(E(z)=0\), then by Corollary 2.1 we would have \(\nabla M_{\alpha }(z)=0\); hence, z would be an equilibrium point of (10), contradicting the assumption that \(z^*\) is isolated in K. On the other hand, since F is a \(P_0\)-function, Theorem 3.2 in Facchinei (1998) shows that NCP(F) has no solution other than \(z^*\); therefore, \(E(z)>0\) for all \(z \ne z^*\).

It remains to show that \(\frac{\mathrm{d}E(z)}{\mathrm{d}t}<0\) for all \(z \in K{\setminus }\lbrace z^*\rbrace \). Taking the derivative of E(z) with respect to time t along the trajectories of (10), for all \(z(t) \in K{\setminus }\lbrace z^*\rbrace \) we have:

$$\begin{aligned} \frac{\mathrm{d}E(z)}{\mathrm{d}t}= & {} \frac{\mathrm{d}E(z)}{\mathrm{d}z}\frac{\mathrm{d}z}{\mathrm{d}t}=(\nabla M_{\alpha }(z) )^\mathrm{T}(-\eta \nabla M_{\alpha }(z)) \\= & {} -\eta \Vert \nabla M_{\alpha }(z) \Vert ^2 <0, \end{aligned}$$

and thus, by Lemma 2.1(ii), \(z^*\) is globally asymptotically stable. \(\square \)
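This energy decrease is easy to observe numerically. Reusing the objects M and sol from the sketch in Sect. 3 (an illustrative assumption, not part of the paper's experiments), \(E(z(t))=M_{\alpha }(z(t))\) is nonincreasing along the computed trajectory:

```python
# E(z(t)) evaluated along the trajectory returned by solve_ivp in the Sect. 3 sketch
E = np.array([M(z) for z in sol.y.T])
print(E[0], E[-1])                 # E decays toward 0 as z(t) approaches z*
print(np.all(np.diff(E) <= 1e-8))  # monotone decrease, up to integrator error
```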

5 Simulation results

In order to demonstrate the effectiveness and performance of the proposed NN model, we discuss several illustrative examples. To solve the proposed NN in (10), we use the MATLAB solver ode15s with the tolerances AbsTol \(=\) 1e-6 and RelTol \(=\) 1e-4.

Example 5.1

Consider the following NLP problem:

$$\begin{aligned} \begin{array}{cl} \mathrm{{Min}} &{} \qquad f(x)=\frac{1}{4}x_1^4+\frac{1}{2}x_1^2+\frac{1}{4}x_2^4+\frac{1}{2}x_2^2 -\frac{9}{10}x_1x_2\\ \mathrm{{s.t.}}&{} \qquad \\ &{} \qquad x_1+x_2 \le 2\\ &{} \qquad -x_1+x_2 \le 2\\ &{} \qquad x_1-3x_2 \le -2\\ &{} \qquad x_i \ge 0, \quad \, i=1,2 .\\ \end{array} \end{aligned}$$

In this problem, f(x) is strictly convex and the feasible region is a convex set; the optimal solution of the NLP problem is \(x^*=[0.3461, 0.7820]^\mathrm{T}\). We apply the proposed NN in (10) to solve this problem. Simulation results show that the trajectory of (10) with 16 initial points always converges to

$$\begin{aligned} z^*=[0.3461, 0.7820, 0.0000, 0.0000, 0.3163]^\mathrm{T}. \end{aligned}$$

Figure 2 displays the transient behavior of z(t) with 16 initial points. Moreover, Fig. 3 shows the phase diagram of the state variables (\(x_1(t), x_2(t)\)) with 16 different initial points, which illustrates global convergence to the exact optimal solution of Example 5.1.

Fig. 2 State variables of Example 5.1 obtained by the proposed NN (10) with 16 initial points

Fig. 3 Phase diagram of NN (10) with 16 different initial points in Example 5.1

Example 5.2

Consider the following NLP problem (see Bazaraa and Shetty 1979):

$$\begin{aligned} \begin{array}{cl} \mathrm{{Min}} &{} \qquad f(x)=\frac{4}{3}(x_1^2 - x_1x_2 + x_2^2)^{\frac{3}{4}} - x_3 \\ \mathrm{{s.t.}}&{} \qquad \\ &{} \qquad x_3 \le 2\\ &{} \qquad x_i \ge 0, \qquad i=1, 2, 3.\\ \end{array} \end{aligned}$$

In this problem, the objective function is convex and the feasible region is a convex set; the optimal solution is achieved at the unique point \(x^*=(0, 0, 2)\) (see Bazaraa and Shetty 1979). We apply the proposed NN in (10) to solve this problem. Simulation results show that the trajectory of (10) with 20 initial points always converges to

$$\begin{aligned} z^*=[0.0000, 0.0000, 2.0000, 1.0000]^\mathrm{T}. \end{aligned}$$

Figure 4 displays the transient behavior of z(t) with 20 initial points. Moreover, Fig. 5 shows the phase diagram of the state variables (\(x_1(t), x_2(t), x_3(t)\)) with 20 different initial points, which illustrates global convergence to the exact optimal solution of Example 5.2.

Fig. 4 State variables of Example 5.2 obtained by the proposed NN (10) with 20 initial points

Fig. 5 Phase diagram of NN (10) with 20 different initial points in Example 5.2

Example 5.3

Consider the following NLP problem:

$$\begin{aligned} \begin{array}{cl} \mathrm{{Min}} &{} \qquad f(x)=(x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_3 - x_4)^4\\ \mathrm{{s.t.}} &{} \qquad \\ &{} \qquad x_1^2 + x_2^2 + x_3^2 + x_4^2 \le 10\\ &{} \qquad (x_1 - 4)^2 + (x_2 + 4)^2 + (x_3 - 1)^2 + (x_4 + 1)^4 \le 18\\ &{} \qquad x_i \ge 0, \quad i=1,2,3,4. \\ \end{array} \end{aligned}$$

It is easy to see that the objective function is convex and the constraint functions are strictly convex. The optimal solution is achieved at the unique point \(x^*=(3.0660, 0, 0.6426, 0)\). We apply the proposed NN in (10) to solve this problem. Simulation results show that the trajectory of (10) with 10 initial points always converges to:

$$\begin{aligned} z^*=[3.0660, 0.0000, 0.6426, 0.0000, 0.0000, 3.2829]^\mathrm{T}. \end{aligned}$$

Figure 6 displays the transient behavior of z(t) with 20 initial points. Moreover, Fig. 7 shows the phase diagram of the state variables (\(x_1(t), x_2(t), x_3(t)\)) with 20 different initial points, which illustrates global convergence to the exact optimal solution of Example 5.3.

Fig. 6 State variables of Example 5.3 obtained by the proposed NN (10) with 20 initial points

Fig. 7 Phase diagram of NN (10) with 20 different initial points in Example 5.3

Example 5.4

Consider the following NLP problem:

$$\begin{aligned} \begin{array}{cl} \mathrm{{Min}} &{} \qquad f(x)=0.4x_1 + x_1^2 + x_2^2 - x_1x_2 + 0.5x_3^2 + 0.5x_4^2 + \frac{1}{30}x_1^3\\ \mathrm{{s.t.}} &{} \qquad \\ &{} \qquad -x_1+x_2 -x_3\le 2\\ &{} \qquad 3x_1+x_2 -x_3-x_4\le 18\\ &{} \qquad \frac{1}{3}x_1 + x_2 - x_4= 2\\ &{} \qquad x_i \ge 0,\,\,\,\,\,\,\,\, i=1,2,3,4 .\\ \end{array} \end{aligned}$$

The optimal solution is achieved at the unique point \(x^*=(0.982, 1.672, 0, 0)\) (see Nazemi 2012). We apply the proposed NN in (10) to solve this problem. Simulation results show that the trajectory of (10) with 20 initial points always converges to

$$\begin{aligned} z^*= & {} [0.9820, 1.6727, 0.0000, 0.0000, 0.0000,\\&0.0000, 0.0000, 2.3635]^\mathrm{T}. \end{aligned}$$

Figure 8 displays the transient behavior of z(t) with 20 initial points. The \(l_2\)-norm error between z(t) and \(z^*\) with 20 different initial points is also shown in Fig. 9.

Fig. 8 State variables of Example 5.4 obtained by the proposed NN (10) with 20 initial points

Fig. 9 Convergence behavior of \(\Vert z(t) - z^* \Vert ^2\) in Example 5.4 with 20 initial points

Example 5.5

Consider the following NLP problem:

$$\begin{aligned} \begin{array}{cl} \mathrm{{Min}} &{} \qquad f(x)=x_1^2+x_2^2+x_1x_2-14x_1-16x_2+(x_3-10)^2\\ &{} \qquad +4(x_4-5)^2+(x_5-3)^2+2(x_6-1)^2+5x_7^2\\ &{} \qquad +7(x_8-11)^2+2(x_9-10)^2+(x_{10}-7)^2+45\\ \mathrm{{s.t.}} &{} \qquad \\ &{} \qquad 3(x_1-2)^2+4(x_2-3)^2+2x_3^2-7x_4 \le 120\\ &{} \qquad 5x_1^2+8x_2+(x_3-6)^2-2x_4 \le 40\\ &{} \qquad \frac{1}{2}(x_1-8)^2+2(x_2-4)^2+3x_5^2-x_6 \le 30\\ &{} \qquad x_1^2+2(x_2-2)^2-2x_1x_2+14x_5-6x_6 \le 0\\ &{} \qquad 4x_1+5x_2-3x_7+9x_8 \le 105\\ &{} \qquad 10x_1-8x_2-17x_7+2x_8 \le 0\\ &{} \qquad -3x_1+6x_2+12(x_9-8)^2-7x_{10} \le 0\\ &{} \qquad -8x_1+2x_2+5x_9-2x_{10} \le 12\\ &{} \qquad x_i \ge 0,\,\,\,\,\,\,\,\, i=1,2, \ldots ,10 \\ \end{array} \end{aligned}$$

The optimal solution of this problem is given in Xia and Wang (2004) as \(x^*=[2.17199, 2.36368, 8.77392, 5.09598, 0.99065, 1.43057, 1.32164, 9.82872, 8.28009, 8.37592]^\mathrm{T}\). We apply the proposed NN in (10) to solve this problem. Simulation results show that the trajectory of (10) with 20 initial points always converges to:

$$\begin{aligned} z^*= & {} [x_{1}^*=2.1720,\; x_{2}^*=2.3637,\; x_{3}^*=8.7739,\\&x_{4}^*=5.0960,\; x_{5}^*=0.9907,\; x_{6}^*=1.4306,\\&x_{7}^*=1.3216,\; x_{8}^*=9.8287,\; x_{9}^*=8.2801, \\&x_{10}^*=8.3759,\; u_{1}^*=0.0205,\; u_{2}^*=0.3120, \\&u_{3}^*=0.0000,\; u_{4}^*=0.2870,\; u_{5}^*=1.7165, \\&u_{6}^*=0.4745,\; u_{7}^*=0.0000,\; u_{8}^*=1.3759]^\mathrm{T}, \end{aligned}$$

which corresponds to the optimal solution. Figure 10 shows the transient behavior of x(t) with 20 initial points. The \(l_2\)-norm error between z(t) and \(z^*\) with 20 different initial points is also shown in Fig. 11.

Fig. 10 State variables of Example 5.5 obtained by the proposed NN (10) with 20 initial points

Fig. 11 Convergence behavior of \(\Vert z(t) - z^* \Vert ^2\) in Example 5.5 with 20 initial points

For comparison with existing NN models, we consider three NN models. First, Kennedy and Chua (1988) proposed the following NN model for solving (5):

$$\begin{aligned} \frac{\mathrm{d}x}{\mathrm{d}t}=-\lbrace \nabla f(x) + s(\nabla g(x) g(x)^{+} - (-x)^+ ) \rbrace , \end{aligned}$$
(11)

where s is a penalty parameter. This NN has low model complexity, but for any given finite penalty parameter it converges only to an approximate solution of (5). Afterward, Xia and Wang (2004), based on the projection formulation, proposed a recurrent NN model for solving (5) with the following dynamical equations:

$$\begin{aligned} \frac{\mathrm{d}x}{\mathrm{d}t}= & {} -x + (x - \alpha (\nabla f(x) + \nabla g(x) u) )^+ \end{aligned}$$
(12)
$$\begin{aligned} \frac{\mathrm{d}u}{\mathrm{d}t}= & {} -u + (u+\alpha g(x))^+, \end{aligned}$$
(13)

where \(x \in {\mathbb {R}}^{n}, u \in {\mathbb {R}}^{m}\) and \(\alpha > 0\). This NN has a one-layer structure. Recently, Nazemi (2012), based on the Lagrange function, proposed a NN model for solving the convex NLP problem of the form

$$\begin{aligned} \begin{array}{cl} \mathrm{Min} &{} \quad {f(x)} \\ \\ \mathrm{s.t.} &{} \quad {g(x) \le 0}\\ &{} \quad h(x) = 0,\\ \end{array} \end{aligned}$$
(14)

with the following dynamical system:

$$\begin{aligned} \frac{\mathrm{d}x}{\mathrm{d}t}= & {} - \left( \nabla f(x) + \frac{1}{2} \nabla g(x)^\mathrm{T} u^2 + \nabla h(x)^\mathrm{T} v\right) \end{aligned}$$
(15)
$$\begin{aligned} \frac{\mathrm{d}u}{\mathrm{d}t}= & {} \mathrm{diag}(u_1, \ldots , u_m) g(x) \end{aligned}$$
(16)
$$\begin{aligned} \frac{\mathrm{d}v}{\mathrm{d}t}= & {} h(x). \end{aligned}$$
(17)

Under the condition that the objective function is convex and the constraint functions are strictly convex, or that the objective function is strictly convex and the constraint functions are convex, Table 1 shows that for Examples 5.1–5.5 the proposed NN has better performance in convergence time than the NNs introduced in (11), (12) and (13). Note that the numerical implementations of all the models were coded in MATLAB with the ordinary differential equation solver ode15s, and the initial states for all the implementations in Table 1 are equal. The stopping criterion for all the models is \(\Vert x(t_\mathrm{f}) - x^* \Vert \le 10^{-4}\), where \(t_\mathrm{f}\) denotes the final time at which the stopping criterion is met.

Table 1 Final time (seconds) to reach the stopping criterion for the proposed model (10) compared with the NN models (11), (12) and (13) for Examples 5.1–5.5
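For reproducibility, the final time \(t_\mathrm{f}\) can be measured with a terminal integration event. The following sketch does so in the SciPy setting of Sect. 3, reusing rhs from that code (the MATLAB implementations would use the Events option of odeset analogously); it is an illustration, not the paper's measurement script:

```python
# Stop integrating as soon as ||x(t) - x*|| <= 1e-4 (the stopping criterion of Table 1)
x_star = np.array([0.3461, 0.7820])  # optimizer of Example 5.1

def hit(t, z):
    return np.linalg.norm(z[:2] - x_star) - 1e-4
hit.terminal = True   # solve_ivp stops at the first zero crossing
hit.direction = -1    # trigger only while the error is decreasing

sol = solve_ivp(rhs, (0.0, 100.0), np.ones(5), method='BDF',
                events=hit, rtol=1e-4, atol=1e-6)
t_f = sol.t_events[0][0] if sol.t_events[0].size else np.inf
print(t_f)
```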

Note that in models (12) and (13), the objective or constraint functions should be strictly convex. Therefore, since in Example 5.2 the objective and constraint functions are merely convex, NNs (12) and (13) do not converge to the optimal solution.
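For the comparison itself, model (12)–(13) can be simulated in the same way. A minimal sketch, reusing grad_f, g, G and solve_ivp from the Sect. 3 code (the constant a_proj is an illustrative tuning assumption, not a value reported in the paper):

```python
a_proj = 0.5  # the alpha > 0 of (12)-(13)

def rhs_projection(t, z):
    # Xia-Wang projection dynamics (12)-(13) for problem (5), with z = (x, u)
    x, u = z[:2], z[2:]
    dx = -x + np.maximum(x - a_proj * (grad_f(x) + G.T @ u), 0.0)
    du = -u + np.maximum(u + a_proj * g(x), 0.0)
    return np.concatenate([dx, du])

sol_proj = solve_ivp(rhs_projection, (0.0, 50.0), np.ones(5),
                     method='BDF', rtol=1e-4, atol=1e-6)
print(sol_proj.y[:2, -1])  # should also approach x* for this strictly convex example
```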

It should be noted that the proposed model (10) can also solve LP and convex QP problems. To demonstrate this, we give one LP example and one convex QP example below.

Example 5.6

Consider the following LP problem (see Bazaraa et al. 1990):

$$\begin{aligned} \begin{array}{cl} \mathrm{{Min}} &{} \qquad f(x)=x_1 + x_2 - 4x_3 \\ \mathrm{{s.t.}}&{} \qquad \\ &{} \qquad x_1 + x_2 + 2x_3 \le 9\\ &{} \qquad x_1 + x_2 - x_3 \le 2\\ &{} \qquad -x_1 + x_2 + x_3 \le 4\\ &{} \qquad x_i \ge 0,\,\,\,\,\,\,\,\, i=1,2,3 .\\ \end{array} \end{aligned}$$

It is easy to see that the objective function is convex and the feasible region is a convex set; the optimal solution of the LP problem is \(x^*=[\frac{1}{3}, 0, \frac{13}{3}]^\mathrm{T}\) (see Bazaraa et al. 1990). We apply the proposed NN in (10) to solve this problem. Simulation results show that the trajectory of (10) with 10 initial points always converges to

$$\begin{aligned} z^*=[0.3333, 0.0000, 4.3333, 1.0000, 0.0000, 1.9999 ]^\mathrm{T}. \end{aligned}$$

Figure 12 displays the transient behavior of z(t) with 10 initial points.

Fig. 12 State variables of Example 5.6 obtained by the proposed NN (10) with 10 initial points

Example 5.7

Consider the following QP problem (see Huang and Cui 2016):

$$\begin{aligned} \begin{array}{cl} \mathrm{{Min}} &{} \qquad f(x)=0.4 x_1^2 +0.3 x_2^2 -0.1 x_1x_2 -0.2x_1 -0.4x_2+ 0.7x_3\\ \mathrm{{s.t.}}&{} \qquad \\ &{} \qquad x_1-x_2 +x_3 = 5\\ &{} \qquad 0.9x_1+0.2x_2 -0.2x_3 \le 4\\ &{} \qquad 0.2x_1 +0.7x_2 -0.1x_3 \le 10.\\ \end{array} \end{aligned}$$

This problem is a convex QP, and the optimal solution is achieved at the unique point \(x^*=(1.0851, -0.3191, 3.5957)\) (see Huang and Cui 2016). We apply the proposed NN in (10) to solve this problem. Simulation results show that the trajectory of (10) with 20 initial points always converges to

$$\begin{aligned} z^*= & {} [1.0851, -0.3191, 3.5957, 0.6489,\\&1.3489 , 0.0000, 0.0000]^\mathrm{T}. \end{aligned}$$

Figure 13 displays the transient behavior of z(t) with 20 initial points. Moreover, the final time to reach the stopping criterion for the proposed model (10) is 0.002 s, whereas the final time for the NN proposed in Huang and Cui (2016) with \(k=10\) is 2 s; this shows that the new model has better performance in convergence time.

Fig. 13 State variables of Example 5.7 obtained by the proposed NN (10) with 20 initial points

6 Conclusions

In this paper, we proposed a one-layer recurrent NN model for solving convex optimization problems. By using the KKT conditions and applying the MS merit function as an NCP function, we derived a one-layer NN model that, in comparison with other existing NNs for such problems, has a simple form and better performance in convergence time. Moreover, the proposed NN is stable in the sense of Lyapunov and globally convergent to an optimal solution of the original problem.