
1 Introduction

There are many studies of recurrent neural networks in the fields of signal processing [1, 2], pattern classification [3, 4], robotics [5, 6], optimization [7], and so on. In particular, since the invention of the Hopfield network [8], which was designed specifically for solving optimization problems online, recurrent neural networks have become a popular research branch in online optimization: they offer powerful parallelism and online solving capability, and they have made great advances in both theory and application. A recurrent neural network [9] was developed for nonlinear programming problems, where a penalty term is introduced for the equality and inequality constraints; it converges only to an approximate optimal solution. A switched-capacitor neural network was proposed [10] for solving nonlinear convex programming problems; however, the model may be unstable when the optimal solution lies outside the feasible region. A neural network was proposed for solving linear and quadratic programming problems [11] and is proven to converge globally to the optimal solution; however, slack variables are introduced into the problem, which makes the dimension of the model large. A dual neural network was proposed to reduce the dimension: it is composed of a single layer of neurons, and its dimension equals the number of its neurons. This model and its modifications [13, 14] have been applied to the kinematic control of robots [12, 15]. A simplified dual neural network was proposed in [16]; it greatly reduces the complexity while its convergence property remains sound. The model was applied to the KWTA problem in real time [17] with just a single neuron; however, it only deals with quadratic programming problems with a simple square quadratic term in the cost function and box constraints. In this paper, a recurrent neural network for solving general quadratic programming problems is proposed. It uses fewer neurons, and the dimension of the model is greatly reduced while sound accuracy and efficiency are retained.

The remainder of this paper is organized as follows. In Sect. 2, a neural network model is presented for solving quadratic programming problems. In Sect. 3, the convergence of the neural network is analyzed and it is proven to be globally convergent to the optimal solution of the quadratic programming problem. In Sect. 4, a discrete-time model for solving the same problem and an alternative neural network model for solving the quadratic programming problem under irredundant equality constraints are studied. In Sect. 5, numerical examples are given to demonstrate the effectiveness of our method. Section 6 concludes the paper.

In this paper, \( {\mathbb{R}} \) denotes the real number field, \( A^{T} \) denotes the transpose of a matrix \( A \), and \( I \) denotes the identity matrix.

2 Mathematical Model

The following general quadratic programming problem is studied:

$$ \begin{aligned} \hbox{min} (\frac{1}{2}x^{T} Wx + c^{T} x) \hfill \\ {\text{s}} . {\text{t}} .\,\,\,\,\,\,\,\begin{array}{*{20}c} {Ax = b} \\ {Ex \le e} \\ \end{array} \hfill \\ \end{aligned} $$
(1)

where \( W \in {\mathbb{R}}^{n \times n} \) is a positive definite matrix, \( x \in {\mathbb{R}}^{n} \) is the decision variable, \( c \in {\mathbb{R}}^{n} \), \( A \in {\mathbb{R}}^{m \times n} \), \( b \in {\mathbb{R}}^{m} \), \( e \in {\mathbb{R}}^{q} \) and \( E \in {\mathbb{R}}^{q \times n} \). The equality constraint \( Ax = b \) is equivalently transformed into two inequalities, \( -Ax \le - b \) and \( Ax \le b \), so problem (1) becomes a quadratic programming problem subject only to inequality constraints:

$$ \begin{aligned} \hbox{min} (\frac{1}{2}x^{T} Wx + c^{T} x) \hfill \\ {\text{s}} . {\text{t}} .\,\,\,\,\,Bx \le d \hfill \\ \end{aligned} $$
(2)
$$ B = \left[ \begin{aligned} A \hfill \\ - A \hfill \\ E \hfill \\ \end{aligned} \right],\;d = \left[ \begin{aligned} b \hfill \\ - b \hfill \\ e \hfill \\ \end{aligned} \right]\text{ } $$
(3)

where \( B \in {\mathbb{R}}^{p \times n} \), \( d \in {\mathbb{R}}^{p} \) and \( p = 2m + q \). The inequality constraint of (2) can be written as \( \hbox{max} (Bx - d) \le 0 \), where \( \hbox{max} (x) \) denotes the largest element of the vector \( x \); problem (1) is therefore equivalent to problem (4) below.
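As an aside, the stacking in (3) can be sketched in a few lines of Python (a minimal sketch assuming NumPy arrays; the function name is illustrative, not part of the proposed method):

```python
import numpy as np

def stack_constraints(A, b, E, e):
    """Rewrite Ax = b, Ex <= e as a single inequality system Bx <= d as in (3).

    The equality Ax = b is split into Ax <= b and -Ax <= -b,
    so B has p = 2*m + q rows.
    """
    B = np.vstack((A, -A, E))
    d = np.concatenate((b, -b, e))
    return B, d
```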

$$ \begin{aligned} \hbox{min} (\frac{1}{2}x^{T} Wx + c^{T} x) \hfill \\ {\text{s}} . {\text{t}} .\,\,\,\,B_{\sigma }^{T} x - d_{\sigma } \le 0 \hfill \\ \end{aligned} $$
(4)

In (4), \( \sigma \) denotes the index of the largest element of \( Bx - d \), \( B_{\sigma }^{T} \) denotes the \( \sigma \)th row of \( B \), and \( d_{\sigma } \) denotes the \( \sigma \)th element of \( d \). According to the KKT conditions, the solution of problem (4) satisfies:

$$ Wx + c - \mu B_{\sigma } = 0 $$
$$ \left\{ \begin{aligned} B_{\sigma }^{T} x - d_{\sigma } = 0\quad {\text{if}}\;\mu < 0 \hfill \\ B_{\sigma }^{T} x - d_{\sigma } \le 0\quad {\text{if}}\;\mu = 0 \hfill \\ \end{aligned} \right. $$
(5)

Here \( \mu \in {\mathbb{R}} \) denotes the dual variable of the inequality constraint in (4). Equation (5) can be written compactly with an upper saturation function as follows:

$$ \begin{array}{*{20}l} {Wx + c - \mu B_{\sigma } = 0} \hfill \\ {B_{\sigma }^{T} x - d_{\sigma } = g(B_{\sigma }^{T} x - d_{\sigma } - \mu )} \hfill \\ \end{array} $$
(6)

where the upper saturation function \( g(\cdot) \) is defined as:

$$ g(x) = \left\{ {\begin{array}{*{20}c} 0 & {x > 0} \\ x & {x \le 0} \\ \end{array} } \right. $$
(7)

Since \( W \) is positive definite, \( x \) can be solved explicitly in terms of \( \mu \) from the first equality in (6):

$$ x = \mu W^{ - 1} B_{\sigma } - W^{ - 1} c $$
(8)

A single dynamic neuron is used to solve for \( \mu \) in (6):

$$ \varepsilon \dot{\mu } = g(B_{\sigma }^{T} x - d_{\sigma } - \mu ) - B_{\sigma }^{T} x + d_{\sigma } $$
(9)

where \( \varepsilon > 0 \) is a scaling parameter. Substituting (8) into (9) yields the neural network dynamics with the following state equation and output equation.

State equation:

$$ \varepsilon \dot{\mu } = g( - \mu + B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu - B_{\sigma }^{T} W^{ - 1} c - d_{\sigma } ) - B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu + B_{\sigma }^{T} W^{ - 1} c + d_{\sigma } $$
(10)

Output equation:

$$ x = \mu W^{ - 1} B_{\sigma } - W^{ - 1} c $$
(11)

where \( \sigma \) is the index of the largest element of \( Bx - d \), and \( B \) and \( d \) are given in (3).
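To make the switched dynamics concrete, the following is a minimal forward-Euler sketch of (10) and (11) in Python, assuming NumPy; the function name, step size and iteration count are illustrative choices, not prescribed by the model.

```python
import numpy as np

def g(u):
    """Upper saturation function (7): g(u) = 0 if u > 0, u otherwise."""
    return min(u, 0.0)

def simulate_network(W, c, B, d, mu0=0.0, eps=1e-8, dt=1e-11, steps=5000):
    """Forward-Euler sketch of the single-neuron network (10)-(11)."""
    Winv = np.linalg.inv(W)
    mu = float(mu0)
    sigma = 0                                    # arbitrary initial switch
    for _ in range(steps):
        x = mu * Winv @ B[sigma] - Winv @ c      # output equation (11)
        sigma = int(np.argmax(B @ x - d))        # WTA switching signal
        Bs, ds = B[sigma], d[sigma]
        q = Bs @ Winv @ Bs                       # B_sigma^T W^{-1} B_sigma
        v = -mu + q * mu - Bs @ Winv @ c - ds
        mu += dt * (g(v) - q * mu + Bs @ Winv @ c + ds) / eps   # state eq (10)
    return mu, mu * Winv @ B[sigma] - Winv @ c
```

The explicit Euler step is used only for illustration; any standard ODE solver can integrate (10), as long as the switching index is re-evaluated at each step.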

Remark 2.1:

Only one dynamic neuron is required in the neural network (10), independent of the dimensions of problem (1). In contrast, the recurrent neural networks in [13,14,15,16, 18] require at least \( q \) dynamic neurons to solve a general quadratic programming problem, where \( q \) is the number of inequality constraints of problem (1). The proposed model uses just a single dynamic neuron, which greatly reduces the number of neurons and the computational complexity.

Remark 2.2:

The neural network dynamics modeled by (10) form a switched dynamic system: it switches within a family of dynamic systems under the endogenous switching signal \( \sigma \), as described by (12) (the signal flow in the neural network is plotted in Fig. 1).

Fig. 1. Signal flow of the simplified neural network.

$$ \begin{array}{*{20}l} { \varepsilon \dot{\mu } = f_{\sigma } (\mu ),\;\sigma \in S = \left\{ {1,2, \ldots ,p} \right\}} \hfill \\ {\sigma :\mu \to S} \hfill \\ \end{array} $$
(12)

where \( f_{\sigma } (\mu ) = g(B_{\sigma }^{T} x - d_{\sigma } - \mu ) - B_{\sigma }^{T} x + d_{\sigma } \).

Remark 2.3:

The proposed model is amenable to hardware implementation. Compared with [16], the only new part of the hardware implementation is the switching of the proposed network (10). The switching signal \( \sigma \) equals the index of the largest element of \( Bx - d \), i.e., the winner of \( Bx - d \); it can be obtained with a winner-take-all (WTA) circuit [19] and drives the selection of \( B_{\sigma } \) and \( d_{\sigma } \) in (10). WTA circuits have been studied extensively and are easily implemented in CMOS [20]; a WTA can also be implemented by a recurrent neural network with just a single dynamic neuron [19]. In total, the resulting circuit contains only two dynamic neurons (the state neuron of (10) plus the WTA neuron), which is much fewer than in the solutions of [13,14,15,16,17,18].

3 Convergence Analysis

A feasible way to prove the global convergence of the proposed system is to construct a common Lyapunov function; however, choosing such a common Lyapunov function is difficult. Results from contraction theory [21, 22] greatly simplify the proof by working with the virtual dynamics of the system. In this paper, contraction analysis is used to prove that the proposed model (10) is convergent. The proof is based on one definition and two lemmas:

Definition 3.1

([22]): Given the system equations \( \dot{x} = f(x,t) \), a region of the state space is called a contraction region with respect to a uniformly positive definite metric \( M(x,t) = \Uptheta^{T} \Uptheta \) if \( \frac{{\partial f^{T} }}{\partial x}M + M\frac{\partial f}{\partial x} + \dot{M} \le - \beta M \) (with \( \beta > 0) \) in that region.

Lemma 3.1

[22]: Given the system equations

\( \dot{x} = f(x, t) \), any trajectory which starts in a ball of constant radius with respect to the metric \( M(x, t) \), centered at a given trajectory and contained at all times in a contraction region with respect to \( M(x, t) \), remains in that ball and converges exponentially to this trajectory. Furthermore, global exponential convergence to the given trajectory is guaranteed if the whole state space is a contraction region with respect to the metric \( M(x, t) \).

Remark 3.1:

As pointed out in [21], Lemma 3.1 also holds for switched systems provided that all subsystems in the switched family are contracting with respect to the same metric \( M(x,t) \).

Lemma 3.2:

The solution of equation set (6) (denoted by \( \mu^{*} \), \( x^{*} \)) is the optimal solution of problem (1). At the same time, \( \mu^{*} \) is an equilibrium point of the state equation of the proposed model, i.e.,

$$ g( - \mu + B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu - B_{\sigma }^{T} W^{ - 1} c - d_{\sigma } ) - B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu + B_{\sigma }^{T} W^{ - 1} c + d_{\sigma } = 0 $$

The optimal solution also satisfies (11): \( x^{*} = \mu^{*} W^{ - 1} B_{\sigma } - W^{ - 1} c. \)

Proof:

According to the KKT conditions, the solution of equation set (5) is the optimal solution of problem (4). Since equation set (6) is equivalent to equation set (5), the solution of equation set (6) is the optimal solution of problem (1). It satisfies:

$$ g( - \mu^{*} + B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu^{*} - B_{\sigma }^{T} W^{ - 1} c - d_{\sigma } ) - B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu^{*} + B_{\sigma }^{T} W^{ - 1} c + d_{\sigma } = 0 $$

where \( x^{*} = \mu^{*} W^{ - 1} B_{\sigma } - W^{ - 1} c \). Comparing these equations with (10) and (11) shows that \( \mu^{*} \) is an equilibrium point of (10), which completes the proof.

We are now ready to state the convergence result of the proposed model.

Theorem 3.1:

The proposed neural network (10) converges to its equilibrium point from any initial point \( \mu \in {\mathbb{R}} \), and the corresponding output (11) is the optimal solution of problem (1).

Proof:

Contraction theory is applied to analyze the convergence of (10). Rewrite (10) as

$$ \dot{\mu } = f(\mu ,t) = \frac{1}{\varepsilon }(g( - \mu + B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu - B_{\sigma }^{T} W^{ - 1} c - d_{\sigma } ) - B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu + B_{\sigma }^{T} W^{ - 1} c + d_{\sigma } ) $$
(13)

The partial derivative \( \frac{\partial f}{\partial \mu } \) is:

$$ \frac{\partial f}{\partial \mu } = \left\{ {\begin{array}{*{20}l} { - \frac{{B_{\sigma }^{T} W^{ - 1} B_{\sigma } }}{\varepsilon }} \hfill & {{\text{if}}\;\upsilon \ge 0} \hfill \\ { - \frac{1}{\varepsilon }} \hfill & {{\text{if}}\;\upsilon < 0} \hfill \\ \end{array} } \right. $$
(14)

where \( \upsilon = - \mu + B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu - B_{\sigma }^{T} W^{ - 1} c - d_{\sigma} \). Note that the derivative \( g^{\prime}(x) = 0 \) if \( x \ge 0 \) and \( g^{\prime}(x) = 1 \) if \( x < 0 \) (more precisely, the upper right-hand derivative, since \( g(x) \) is non-smooth at \( x = 0 \)) is used in the derivation of (14). Choosing \( M = 1 \), we have

$$ \frac{{\partial f^{T} }}{\partial \mu }M + M\frac{\partial f}{\partial \mu } + \dot{M} = 2\frac{\partial f}{\partial \mu } \le - \frac{{2\hbox{min} (1,B_{\sigma }^{T} W^{ - 1} B_{\sigma } )}}{\varepsilon }\text{ } $$
(15)

Since \( W^{ - 1} \) is positive definite and \( B_{\sigma } \ne 0 \) (otherwise the corresponding inequality in (2) would not involve \( x \)), we conclude that \( B_{\sigma }^{T} W^{ - 1} B_{\sigma } > 0 \) and \( \frac{{\hbox{min} (1,B_{\sigma }^{T} W^{ - 1} B_{\sigma } )}}{\varepsilon } > 0. \)

Defining \( \beta = \frac{{\hbox{min} \{ 1,\min_{i} B_{i}^{T} W^{ - 1} B_{i} \} }}{\varepsilon } \), with \( B_{i}^{T} \) denoting the \( i{\text{th}} \) row of \( B \) and \( {\mathbf{S}} = \{ 1,2, \ldots ,p\} \), we have the following inequality for all \( \mu \in {\mathbb{R}} \) and all \( \sigma \in {\mathbf{S}} \) under the same metric \( M \),

$$ \frac{{\partial f^{T} }}{\partial \mu }M + M\frac{\partial f}{\partial \mu } + \dot{M} \le - \beta M $$
(16)

Hence the whole state space of \( \mu \) is a contraction region. By Lemma 3.2, \( \mu^{*} \) (with output \( x^{*} \)) is an equilibrium point of (10) and its output is the optimal solution of problem (1). Taking this equilibrium as the given trajectory, Lemma 3.1 and Remark 3.1 guarantee that (10) converges exponentially to \( \mu^{*} \) from any initial point, which completes the proof.

4 Extensions

4.1 Discrete-Time Model

In this part, we propose a discrete-time model to solve problem (1) and give conditions for its global convergence.

Replacing \( \mu \) and \( \dot{\mu } \) with \( \mu_{n} \) and \( \frac{{\mu_{n + 1} - \mu_{n} }}{\Delta t} \) in (10), respectively, and denoting \( \gamma = \frac{\Delta t}{\varepsilon } \), we obtain, after some trivial manipulations, the following discrete model. State equation:

$$ \mu_{n + 1} = \gamma g( - \mu_{n} + B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu_{n} - B_{\sigma }^{T} W^{ - 1} c - d_{\sigma } ) + (1 - \gamma B_{\sigma }^{T} W^{ - 1} B_{\sigma } )\mu_{n} + \gamma B_{\sigma }^{T} W^{ - 1} c + \gamma d_{\sigma } $$
(17)

Output equation:

$$ x_{n} = \mu_{n} W^{ - 1} B_{\sigma } - W^{ - 1} c $$
(18)

where \( \sigma \) is the index of the largest element of \( Bx_{n} - d \), and \( B \) and \( d \) are given in (3).
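A minimal Python sketch of the iteration (17)-(18) is given below, assuming NumPy and the saturation function \( g(u) = \min (u,0) \); the function name and default arguments are illustrative, not part of the proposed method.

```python
import numpy as np

def discrete_iteration(W, c, B, d, gamma=0.05, mu0=0.0, iters=2000):
    """Sketch of the discrete-time model (17)-(18).

    gamma = dt / eps should satisfy condition (19) below for convergence.
    """
    g = lambda u: min(u, 0.0)                    # upper saturation function (7)
    Winv = np.linalg.inv(W)
    mu = float(mu0)
    sigma = 0
    for _ in range(iters):
        x = mu * Winv @ B[sigma] - Winv @ c      # output equation (18)
        sigma = int(np.argmax(B @ x - d))        # switching signal
        Bs, ds = B[sigma], d[sigma]
        q = Bs @ Winv @ Bs
        v = -mu + q * mu - Bs @ Winv @ c - ds
        mu = gamma * g(v) + (1 - gamma * q) * mu + gamma * (Bs @ Winv @ c + ds)  # (17)
    return mu, mu * Winv @ B[sigma] - Winv @ c
```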

In the discrete-time case, contraction theory is an extension of the well-known contraction mapping theorem. We still use contraction theory to analyze the convergence. For discrete-time systems, the definition of a contraction region and the condition for contraction are stated below.

Definition 4.1

([22]): Given the discrete-time system \( x_{n + 1} = f_{n} (x_{n} ,n) \), a region of the state space is called a contraction region with respect to a uniformly positive definite metric \( M_{n} (x_{n} ,n) = \Uptheta_{n}^{T} \Uptheta_{n} \) if there exists \( \beta > 0 \) such that \( F_{n}^{T} F_{n} - I \le - \beta I < 0 \) in that region, where \( F_{n} = \Uptheta_{n + 1} \frac{{\partial f_{n} }}{{\partial x_{n} }}\Uptheta_{n}^{ - 1} \).

Lemma 4.1

([22]): If the whole state space is a contraction region, global convergence to the given trajectory is guaranteed.

Theorem 4.1:

The discrete model (17) converges to its equilibrium point from any initialization \( \mu \in {\mathbb{R}} \), and the output at the equilibrium point, given by (18), is the optimal solution of problem (1), provided that:

$$ 0 < \gamma < 2,\quad \gamma < \frac{2}{{B_{i}^{T} W^{ - 1} B_{i} }}\;{\text{for all}}\;i \in {\mathbf{S}} $$
(19)

where \( B_{i}^{T} \) denotes the \( i{\text{th}} \) row of \( B \) and \( {\mathbf{S}} = \{ 1,2, \ldots ,p\} \).

Proof:

The proof proceeds in two steps: first, solve for the equilibrium point of the discrete model (17) and obtain its output via (18); second, prove global contraction to the equilibrium point.

Step 1: Comparing the discrete-time model (17) and its output (18) with the continuous-time model (10) and its output (11), we find that they have the same equilibrium and the same output at this equilibrium. According to Lemma 3.2, the solution of equation set (6) is also an equilibrium of the discrete model (17), and the output at this point is the optimal solution of problem (1).

Step 2: For the discrete model (17),

$$ \begin{array}{*{20}l} {\mu_{n + 1} = f_{n} (\mu_{n} ,n)} \hfill \\ {\quad \quad = \gamma g( - \mu_{n} + B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu_{n} - B_{\sigma }^{T} W^{ - 1} c - d_{\sigma } ) + (1 - \gamma B_{\sigma }^{T} W^{ - 1} B_{\sigma } )\mu_{n} + \gamma B_{\sigma }^{T} W^{ - 1} c + \gamma d_{\sigma } } \hfill \\ \end{array} $$
(20)

Calculate \( \frac{{\partial f_{n} }}{{\partial \mu_{n} }} \) as follows:

$$ \frac{{\partial f_{n} }}{{\partial \mu_{n} }} = \left\{ {\begin{array}{*{20}l} {1 - \gamma B_{\sigma }^{T} W^{ - 1} B_{\sigma } } \hfill & {{\text{if}}\;v_{n} \ge 0} \hfill \\ {1 - \gamma } \hfill & {{\text{if}}\;v_{n} < 0} \hfill \\ \end{array} } \right. $$
(21)

where \( v_{n} = - \mu_{n} + B_{\sigma }^{T} W^{ - 1} B_{\sigma } \mu_{n} - B_{\sigma }^{T} W^{ - 1} c - d_{\sigma } \). Choosing \( \Uptheta_{n} = 1 \) for all \( n \), we have

$$ F_{n} = \Uptheta_{n + 1} \frac{{\partial f_{n} }}{{\partial \mu_{n} }}\Uptheta_{n}^{ - 1} = \frac{{\partial f_{n} }}{{\partial \mu_{n} }} $$
(22)

Based on (21) and (22),

$$ \begin{aligned} F_{n}^{T} F_{n} & \le (\hbox{max} \{ |1 - \gamma |,|1 - \gamma B_{\sigma }^{T} W^{ - 1} B_{\sigma } |\} )^{2} \\ & \le (\hbox{max} \{ |1 - \gamma |,\hbox{max}_{{i \in {\mathbf{S}}}} |1 - \gamma B_{i}^{T} W^{ - 1} B_{i} |\} )^{2} \end{aligned} $$
(23)

Set \( \beta = 1 - (\hbox{max} \{ |1 - \gamma |,\hbox{max}_{{i \in {\mathbf{S}}}} |1 - \gamma B_{i}^{T} W^{ - 1} B_{i} |\} )^{2} \). For all \( \mu_{n} \in {\mathbb{R}} \) and \( \sigma \in {\mathbf{S}} \), the following inequality (24) holds:

$$ F_{n}^{T} F_{n} - I \le - \beta I $$
(24)

Since \( 0 < \gamma < 2 \) and \( 0 < \gamma B_{i}^{T} W^{ - 1} B_{i} < 2 \) for all \( i \in {\mathbf{S}} \) by (19), we have \( |1 - \gamma | < 1 \) and \( |1 - \gamma B_{i}^{T} W^{ - 1} B_{i} | < 1 \) for all \( i \in {\mathbf{S}} \). Therefore, \( (\hbox{max} \{ |1 - \gamma |,\hbox{max}_{{i \in {\mathbf{S}}}} |1 - \gamma B_{i}^{T} W^{ - 1} B_{i} |\} )^{2} < 1 \) and \( \beta > 0 \).

Hence the whole state space is a contraction region. Taking the equilibrium point of (17) as the given trajectory, Lemma 4.1 implies that the discrete model (17) converges to this equilibrium point.
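Condition (19) is easy to check numerically. A small helper is sketched below (assuming NumPy; the function name is illustrative): it returns the supremum of admissible step sizes, and any \( \gamma \) strictly below this value satisfies (19).

```python
import numpy as np

def gamma_upper_bound(W, B):
    """Supremum of step sizes allowed by condition (19):
    gamma must satisfy gamma < 2 and gamma < 2 / (B_i^T W^{-1} B_i) for all i."""
    Winv = np.linalg.inv(W)
    q_max = max(float(row @ Winv @ row) for row in B)
    return min(2.0, 2.0 / q_max)
```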

4.2 Irredundant Equality Constraint

When the equality constraints are irredundant, i.e., the matrix \( A \in {\mathbb{R}}^{m \times n} \) in (1) satisfies \( {\text{rank}}(A) = m \) with \( m < n \), problem (1) can be tackled in another way.

Replacing \( Ex \le e \) in problem (1) with \( \hbox{max}(Ex - e) \le 0 \) gives

$$ \begin{aligned} & \hbox{min} (\frac{1}{2}x^{T} Wx + c^{T} x) \hfill \\ & {\text{s}} . {\text{t}} .\;Ax = b \hfill \\ & \quad \;\;\,E_{\sigma }^{T} x - e_{\sigma } \le 0 \hfill \\ \end{aligned} $$
(25)

Based on the KKT conditions and the upper saturation function \( g(\cdot) \), the solution of problem (25) satisfies

$$ \begin{array}{*{20}c} {Wx + c - A^{T} y - \mu E_{\sigma } = 0} \\ {Ax = b} \\ {E_{\sigma }^{T} x - e_{\sigma } = g(E_{\sigma }^{T} x - e_{\sigma } - \mu )} \\ \end{array} $$
(26)

where \( y \in {\mathbb{R}}^{m} \) and \( \mu \in {\mathbb{R}} \). From the above three equations, \( x \) and \( y \) can be solved explicitly in terms of \( \mu \):

$$ \begin{array}{*{20}l} {x = \mu PE_{\sigma } + s} \hfill \\ {y = (AW^{ - 1} A^{T} )^{ - 1} ( - AW^{ - 1} E_{\sigma } \mu + AW^{ - 1} c + b)} \hfill \\ \end{array} $$
(27)

where \( P = - W^{ - 1} A^{T} (AW^{ - 1} A^{T} )^{ - 1} AW^{ - 1} + W^{ - 1} \in {\mathbb{R}}^{n \times n} \) and \( s = W^{ - 1} (A^{T} (AW^{ - 1} A^{T} )^{ - 1} (AW^{ - 1} c + b) - c) \). The third equation in (26) is solved with a dynamic neuron:

$$ \varepsilon \dot{\mu } = g(E_{\sigma }^{T} x - e_{\sigma } - \mu ) \;{ - }\;E_{\sigma }^{T} x + e_{\sigma } $$
(28)

where \( \varepsilon > 0 \) is a scaling parameter. Substituting (27) into (28) gives the state equation of the neural network:

$$ \varepsilon \dot{\mu } = g(E_{\sigma }^{T} PE_{\sigma } \mu - \mu + E_{\sigma }^{T} s - e_{\sigma } ) \;{ - }\;E_{\sigma }^{T} PE_{\sigma } \mu - E_{\sigma }^{T} s + e_{\sigma } $$
(29)
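A minimal Python sketch of this alternative network, computing \( P \) and \( s \) from (27) and integrating (29) with forward Euler, is given below (assuming NumPy and \( {\text{rank}}(A) = m \); the function name and step sizes are illustrative choices).

```python
import numpy as np

def simulate_equality_network(W, c, A, b, E, e, mu0=0.0,
                              eps=1e-8, dt=1e-11, steps=5000):
    """Forward-Euler sketch of the network (29) with output (27)."""
    g = lambda u: min(u, 0.0)                            # saturation function (7)
    Winv = np.linalg.inv(W)
    M = np.linalg.inv(A @ Winv @ A.T)                    # (A W^{-1} A^T)^{-1}
    P = Winv - Winv @ A.T @ M @ A @ Winv                 # matrix P in (27)
    s = Winv @ (A.T @ M @ (A @ Winv @ c + b) - c)        # vector s in (27)
    mu = float(mu0)
    sigma = 0
    for _ in range(steps):
        x = mu * P @ E[sigma] + s                        # output equation (27)
        sigma = int(np.argmax(E @ x - e))                # switching signal
        Es, es = E[sigma], e[sigma]
        q = Es @ P @ Es
        mu += dt * (g(q * mu - mu + Es @ s - es)
                    - q * mu - Es @ s + es) / eps        # state equation (29)
    x = mu * P @ E[sigma] + s
    y = M @ (-A @ Winv @ E[sigma] * mu + A @ Winv @ c + b)   # dual variable y in (27)
    return mu, x, y
```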

Before stating the convergence result about the neural network (29), the following lemma is presented, which is used in the convergence proof.

Lemma 4.2:

The symmetric matrix \( P = - W^{ - 1} A^{T} (AW^{ - 1} A^{T} )^{ - 1} AW^{ - 1} + W^{ - 1} \in {\mathbb{R}}^{n \times n} \) is positive semi-definite, i.e., \( z^{T} Pz \ge 0 \) for all \( z \in {\mathbb{R}}^{n} \); moreover, \( z^{T} Pz > 0 \) whenever \( z^{T} (W^{ - 1} A^{T} (AW^{ - 1} A^{T} )^{ - 1} A - I) \ne 0 \).

Proof:

Since \( W^{ - 1} \in {\mathbb{R}}^{n \times n} \) is positive definite, it can be factorized as \( W^{ - 1} = QQ^{T} \) with \( Q \) positive definite. Define \( G = AQ \in {\mathbb{R}}^{m \times n} \); then \( G \) also has full row rank (since \( A \) has full row rank and \( Q \) is positive definite), and therefore \( G \) can be decomposed as \( G = U[\Lambda \;\,0]V \) via singular value decomposition, where \( U \in {\mathbb{R}}^{m \times m} \) and \( V \in {\mathbb{R}}^{n \times n} \) are unitary matrices, \( [\Lambda \;\,0] \in {\mathbb{R}}^{m \times n} \), and \( \Lambda \in {\mathbb{R}}^{m \times m} \) is a diagonal matrix with all positive diagonal elements. Substituting

$$ W^{ - 1} = QQ^{T} ,\quad G = AQ,\quad G = U[\Lambda \;\,0]V\quad {\text{into}}\quad P = - W^{ - 1} A^{T} (AW^{ - 1} A^{T} )^{ - 1} AW^{ - 1} + W^{ - 1} , $$

we obtain (30) below, which shows that \( P \) is positive semi-definite. The strict inequality \( z^{T} Pz > 0 \) for \( z^{T} (W^{ - 1} A^{T} (AW^{ - 1} A^{T} )^{ - 1} A - I) \ne 0 \) then follows, since \( PW = - (W^{ - 1} A^{T} (AW^{ - 1} A^{T} )^{ - 1} A - I) \) and, for a symmetric positive semi-definite \( P \), \( z^{T} Pz = 0 \) only if \( Pz = 0 \). The stability of the neural network (29) and its global convergence to the optimal solution of (1) are guaranteed by the following theorem.

Theorem 4.2:

The neural network (29) exponentially converges to its equilibrium point from any initial point \( \mu \in {\mathbb{R}} \), and its output at this equilibrium, given by (27), is the optimal solution of problem (1), provided that \( {\text{rank}}(A) = m \) in (1) and \( E_{i}^{T} (W^{ - 1} A^{T} (AW^{ - 1} A^{T} )^{ - 1} A - I) \ne 0 \) for \( i = 1,2, \ldots ,q \), where \( E_{i}^{T} \) denotes the \( i{\text{th}} \) row of the matrix \( E \).

Proof:

This theorem can be proven in two steps, similar to the proof of Theorem 4.1. The first step is to solve for the equilibrium and show that the output at this equilibrium is the optimal solution of problem (1). The second step is to prove global contraction to the equilibrium. Note that the condition \( E_{i}^{T} (W^{ - 1} A^{T} (AW^{ - 1} A^{T} )^{ - 1} A - I) \ne 0 \) for \( i = 1,2, \ldots ,q \), together with Lemma 4.2, guarantees \( E_{\sigma }^{T} PE_{\sigma } > 0 \) for all possible switching signals \( \sigma \). Based on this result, global contraction of the neural network (29) can be proven. The detailed proof of the two steps is omitted.

$$ \begin{aligned} P & = - QQ^{T} A^{T} (AQQ^{T} A^{T} )^{ - 1} AQQ^{T} + QQ^{T} \\ & = - QG^{T} (GG^{T} )^{ - 1} GQ^{T} + QQ^{T} \\ & = - QV^{T} \left[ {\begin{array}{*{20}c} \varLambda \\ 0 \\ \end{array} } \right]U^{T} (U\varLambda^{ - 2} U^{T} )U\left[ {\begin{array}{*{20}c} \varLambda & 0 \\ \end{array} } \right]VQ^{T} + QQ^{T} \\ & = - QV^{T} \left[ {\begin{array}{*{20}c} {I_{m \times m} } & 0 \\ 0 & 0 \\ \end{array} } \right]VQ + QV^{T} VQ^{T} \\ & = (VQ^{T} )^{T} \left[ {\begin{array}{*{20}c} 0 & 0 \\ 0 & {I_{(n - m) \times (n - m)} } \\ \end{array} } \right]VQ^{T} \ge 0 \\ \end{aligned} $$
(30)

5 Simulations

5.1 Continuous-Time Neural Network Model

To show the convergence of the continuous-time neural network modeled by (10) and (11), the following problem is solved:

$$ \begin{aligned} \hbox{min} (3x_{1}^{2} + 3x_{2}^{2} + 4x_{3}^{2} + 5x_{4}^{2} + 3x_{1} x_{2} + 5x_{1} x_{3} + x_{2} x_{4} - 11x_{1} - 5x_{4} ) \hfill \\ {\text{s}} . {\text{t}} .\begin{array}{*{20}c} {\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, - 3x_{1} + 3x_{2} + 2x_{3} - x_{4} \le 0} \\ {\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, - 4x_{1} - x_{2} + x_{3} + 2x_{4} \le 0} \\ {\, - x_{1} + x_{2} \le - 1} \\ {\,\,\,\,\,\,\,\,\, - 2 \le 3x_{1} + x_{3} \le 4} \\ \end{array} \,\,\, \hfill \\ \end{aligned} $$
(31)

For this problem, the model parameters are:

$$ \begin{array}{*{20}l} {W = \left[ {\begin{array}{*{20}c} 6 & 3 & 5 & 0 \\ 3 & 6 & 0 & 1 \\ 5 & 0 & 8 & 0 \\ 0 & 1 & 0 & {10} \\ \end{array} } \right], \, c = \left[ {\begin{array}{*{20}c} { - 11} \\ 0 \\ 0 \\ { - 5} \\ \end{array} } \right]} \hfill \\ {B = \left[ {\begin{array}{*{20}l} { - 1} \hfill & 1 \hfill & 0 \hfill & 0 \hfill \\ 3 \hfill & 0 \hfill & 1 \hfill & 0 \hfill \\ { - 3} \hfill & 0 \hfill & { - 1} \hfill & 0 \hfill \\ { - 3} \hfill & 3 \hfill & 2 \hfill & { - 4} \hfill \\ { - 4} \hfill & { - 1} \hfill & 1 \hfill & 2 \hfill \\ \end{array} } \right],\;d = \left[ \begin{aligned} - 1\hfill \\ 4\hfill \\ 2\hfill \\ 0\hfill \\ 0\hfill \\ \end{aligned} \right]} \hfill \\ \end{array} $$
(32)

We run the neural network model in simulation with a random initialization of the state variable \( \mu \) and \( \varepsilon = 10^{ - 8} \). The simulation is run for \( 5 \times 10^{ - 8} \) s, and the evolution of \( \mu \) and \( x \) is shown in Figs. 2 and 3. The optimal value of (31) is about −19.1794310722. The network output is \( x = [1.8774617068, -1.0393873085, -1.6323851203, 0.6039387308] \), and the differences from the exact optimum are less than \( 1.0 \times 10^{ - 10} \) for both \( x \) and the optimal value.
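For reproducibility, the data in (32) can be fed to the illustrative simulate_network sketch from Sect. 2 as follows (the helper and its default parameters are assumptions of the sketch, not part of the reported experiment):

```python
import numpy as np

# Parameters of Example V-A, taken from (32).
W = np.array([[6., 3., 5., 0.],
              [3., 6., 0., 1.],
              [5., 0., 8., 0.],
              [0., 1., 0., 10.]])
c = np.array([-11., 0., 0., -5.])
B = np.array([[-1.,  1.,  0.,  0.],
              [ 3.,  0.,  1.,  0.],
              [-3.,  0., -1.,  0.],
              [-3.,  3.,  2., -4.],
              [-4., -1.,  1.,  2.]])
d = np.array([-1., 4., 2., 0., 0.])

mu, x = simulate_network(W, c, B, d, eps=1e-8, dt=1e-11, steps=5000)
print(x, 0.5 * x @ W @ x + c @ x)   # should approach the reported optimum
```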

Fig. 2. Transient behavior of \( \mu \) in Example V-A.

Fig. 3. Transient behavior of \( x \) in Example V-A.

5.2 Discrete-Time Model

We again consider the problem of Example V-A and solve it with the discrete model based on (17) and (18). In the simulation, \( \gamma = 0.05 \) and \( \mu \) is randomly initialized. The evolution of \( \mu \) and \( x \) is shown in Figs. 4 and 5. The network output is \( x = [1.8774617068, - 1.0393873085, - 1.6323851203, 0.6039387308] \), and the optimal value is −19.179. The differences are less than \( 5.85 \times 10^{ - 6} \) for \( x \) and \( 4.2 \times 10^{ - 5} \) for the optimal value.
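Continuing the Example V-A snippet above (same W, c, B, d) and reusing the illustrative discrete_iteration and gamma_upper_bound helpers sketched in Sect. 4.1, this experiment corresponds roughly to the following call:

```python
assert 0.05 < gamma_upper_bound(W, B)            # step size satisfies condition (19)
mu, x = discrete_iteration(W, c, B, d, gamma=0.05, iters=5000)
print(x, 0.5 * x @ W @ x + c @ x)                # should approach the reported optimum
```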

Fig. 4. Transient behavior of \( \mu \) in Example V-B.

Fig. 5. Transient behavior of \( x \) in Example V-B.

5.3 Problem with Irredundant Equality Constraint

To show the convergence of the neural network (29), we solve the following quadratic programming problem with irredundant equality constraints:

$$ \begin{aligned} \hbox{min} (3x_{1}^{2} + 3x_{2}^{2} + 4x_{3}^{2} + 5x_{4}^{2} + 3x_{1} x_{2} + 5x_{1} x_{3} + x_{2} x_{4} - 11x_{1} - 5x_{4} ) \hfill \\ {\text{s}} . {\text{t}} .\,\,\,\,\,\,\,\,\,\begin{array}{*{20}c} {\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, - 3x_{1} + 3x_{2} + 2x_{3} - x_{4} = 0} \\ {\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, - 4x_{1} - x_{2} + x_{3} + 2x_{4} = 0} \\ {\, - x_{1} + x_{2} \le - 1} \\ {\,\,\,\,\,\,\,\,\,\, - 2 \le 3x_{1} + x_{3} \le 4\text{ }} \\ \end{array} \hfill \\ \end{aligned} $$
(33)

The parameters are given in (34). A simulation is performed with a random initialization of the state variable \( \mu \) and the scaling parameter \( \varepsilon = 10^{ - 8} \). The simulation results are shown in Figs. 6, 7 and 8, from which we can observe that the neural network converges very fast, after a transient period shorter than \( 10^{ -6} \) s. The error of the solution at the final time of the simulation is less than \( 0.9 \times 10^{ -10} \) in the \( l_{2} \)-norm sense, and the optimal solution is \( x = [0.5, -0.5, -1.5, 0] \).

$$ \begin{array}{*{20}l} {W\text{ = }\left[ {\begin{array}{*{20}c} 6 & 3 & 5 & 0 \\ 3 & 6 & 0 & 1 \\ 5 & 0 & 8 & 0 \\ 0 & 1 & 0 & {10} \\ \end{array} } \right], \, c = \left[ {\begin{array}{*{20}c} { - 11} \\ 0 \\ 0 \\ { - 5} \\ \end{array} } \right]} \hfill \\ {A\text{ = }\left[ {\begin{array}{*{20}c} 3 & { - 3} & { - 2} & 1 \\ 4 & 1 & { - 1} & { - 2} \\ \end{array} } \right],\text{ }b\text{ = }\left[ \begin{aligned} 0 \hfill \\ 0\hfill \\ \end{aligned} \right]\text{ }} \hfill \\ {E = \left[ {\begin{array}{*{20}c} { - 1} & 1 & 0 & 0 \\ 3 & 0 & 1 & 0 \\ { - 3} & 0 & { - 1} & 0 \\ \end{array} } \right],\text{ }e\text{ = }\left[ \begin{aligned} - 1 \hfill \\ 4\hfill \\ 2 \hfill \\ \end{aligned} \right]\text{ }} \hfill \\ \end{array} $$
(34)
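Analogously, with the illustrative simulate_equality_network sketch from Sect. 4.2, Example V-C corresponds roughly to the following setup (data taken from (34); the helper and its defaults are assumptions of the sketch):

```python
import numpy as np

# Parameters of Example V-C, taken from (34).
W = np.array([[6., 3., 5., 0.],
              [3., 6., 0., 1.],
              [5., 0., 8., 0.],
              [0., 1., 0., 10.]])
c = np.array([-11., 0., 0., -5.])
A = np.array([[3., -3., -2.,  1.],
              [4.,  1., -1., -2.]])
b = np.array([0., 0.])
E = np.array([[-1., 1.,  0., 0.],
              [ 3., 0.,  1., 0.],
              [-3., 0., -1., 0.]])
e = np.array([-1., 4., 2.])

mu, x, y = simulate_equality_network(W, c, A, b, E, e)
print(x)   # reported optimal solution: x = [0.5, -0.5, -1.5, 0]
```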
Fig. 6. Transient behavior of \( \mu \) in Example V-C.

Fig. 7. Transient behavior of \( x \) in Example V-C.

Fig. 8. Transient behavior of \( y \) in Example V-C.

6 Conclusions

A new recurrent neural network is proposed for solving general quadratic programming problems. The proposed model has fewer neurons and a simple structure. The neural network is shown to be globally convergent based on contraction analysis. A discrete-time counterpart of the proposed neural network is studied, and an alternative recurrent neural network for solving the problem under irredundant equality constraints is also presented. Finally, simulation results demonstrate that the models are efficient and accurate.