
1 Introduction

Over the years, there have been many studies on solving constrained nonlinear programming problems [13] based on the neural circuit approach. When real-time solutions are needed, the analog neural circuit approach is preferred. Although many neural models [4, 5] have been introduced, they are designed for particular kinds of problems only. The Lagrange programming neural network (LPNN) [3, 6, 7] is a framework for solving general constrained nonlinear programming problems. However, it can handle only differentiable objective and constraint functions.

In signal processing, one of the important topics is sparse approximation [8, 9]. Sparse approximation aims at recovering an unknown sparse signal from measurements. It has many potential applications, such as channel estimation in MIMO wireless communication channels [10] and image denoising [11]. However, in sparse approximation, many problems involve nondifferentiable objectives or constraints. By introducing the concept of the soft threshold function, the local competition algorithm (LCA) [12], an analog method, is able to solve the basis pursuit denoising problem [9], which is an unconstrained nonlinear programming problem. However, the LCA is not designed to handle constrained optimization problems.

This paper focuses on \(l_1\)-norm constrained quadratic minimization (L1CQM) for sparse approximation. Section 2 reviews the basic concepts of the LPNN, sparse approximation, and the LCA. Section 3 presents the proposed LPNN model for solving the L1CQM. Section 4 presents some properties of the LPNN. Simulation results are then presented in Sect. 5.

2 Background

LPNN: The LPNN approach considers a general constrained nonlinear programming problem:

$$\begin{aligned} \text{ EP: } \min f(\varvec{x}) \, \, \text{ s.t. } \varvec{h}(\varvec{x})= \varvec{0}, \end{aligned}$$
(1)

where \(\varvec{x}\in \mathfrak {R}^n\) is the state vector, \(f:\mathfrak {R}^n \rightarrow \mathfrak {R}\) is the objective function, and \(\varvec{h}: \mathfrak {R}^n \rightarrow \mathfrak {R}^m\) (\(m<n\)) describes the m equality constraints. The objective function f and constraints \(\varvec{h}\) are assumed to be twice differentiable. It should be noted that the LPNN approach can also handle inequality constraints by introducing dummy variables; for instance, an inequality constraint \(g(\varvec{x}) \le 0\) can be converted into the equality constraint \(g(\varvec{x}) + s^2 = 0\) with a dummy variable s.

In LPNN, a Lagrangian function is set up, given by

$$\begin{aligned} \mathcal{L}_{ep} = f(\varvec{x}) + \varvec{\lambda }^{\mathrm {T}} \varvec{h}(\varvec{x}) \, , \end{aligned}$$
(2)

where \(\varvec{\lambda }= [ \lambda _1, \cdots , \lambda _m ]^{\mathrm {T}}\) is the Lagrange multiplier vector. There are two kinds of neurons: variable neurons and Lagrange neurons. The variable neurons hold the variable vector \(\varvec{x}\), while the Lagrange neurons hold the multiplier vector \(\varvec{\lambda }\). The LPNN dynamics are given by

$$\begin{aligned} \frac{1}{\epsilon } \frac{d \varvec{x}}{dt} = - \frac{\partial {\mathcal{L}}_{ep}}{\partial \varvec{x}}, \text{ and } \frac{1}{\epsilon } \frac{d \varvec{\lambda }}{dt} = \frac{\partial {\mathcal{L}}_{ep}}{\partial \varvec{\lambda }}, \end{aligned}$$
(3)

where \(\epsilon \) is the time constant of the circuit. In this paper, \(\epsilon \) is set to 1 without loss of generality.
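To make the dynamics in (3) concrete, the following minimal sketch (in Python, with \(\epsilon = 1\)) integrates (3) with a forward-Euler scheme for a toy differentiable problem, \(\min |\varvec{x}|_2^2\) s.t. \(\mathbf{1}^{\mathrm T}\varvec{x} = 1\). The toy problem, step size, and iteration count are our own illustrative choices, not part of the original formulation.

```python
import numpy as np

# Toy differentiable problem:  min f(x) = ||x||_2^2   s.t.  h(x) = sum(x) - 1 = 0.
# Lagrangian: L(x, lam) = ||x||_2^2 + lam * (sum(x) - 1).
def lpnn_toy(n=4, dt=1e-2, steps=5000):
    x = np.zeros(n)       # variable neurons
    lam = 0.0             # Lagrange neuron (one equality constraint)
    for _ in range(steps):
        dx = -(2.0 * x + lam * np.ones(n))   # dx/dt   = -dL/dx
        dlam = np.sum(x) - 1.0               # dlam/dt = +dL/dlam = h(x)
        x += dt * dx
        lam += dt * dlam
    return x, lam

x_eq, lam_eq = lpnn_toy()
print(x_eq, lam_eq)   # x approaches [0.25, 0.25, 0.25, 0.25], lam approaches -0.5
```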

Sparse Approximation: In sparse approximation, we would like to estimate a sparse solution \(\varvec{x}\in \mathfrak {R}^n\) from the measurement \(\varvec{b}= \varvec{\varPhi }\varvec{x}+\varvec{\xi }\), where \(\varvec{b}\in \mathfrak {R}^m\) is the observation vector, \(\varvec{\varPhi }\in \mathfrak {R}^{m \times n}\) is the measurement matrix with rank m, \(\varvec{x}\in \mathfrak {R}^n\) is the unknown sparse vector (\(m<n\)), and the \(\xi _i\)'s are the measurement noise components. The recovery can be formulated as the following programming problem:

$$\begin{aligned} \text{ min } |\varvec{b}-\varvec{\varPhi }\varvec{x}|_2^2, \, \text{ s.t. } |\varvec{x}|_1 \le \psi , \end{aligned}$$
(4)

where \(\psi > 0\). This problem is called \(l_1\)-norm Constrained Quadratic Minimization (L1CQM). In this problem, we would like to minimize the residual subject to the constraint that the sum of the absolute values of the signal elements is less than a given value. Since the constraint function \(|\varvec{x}|_1 - \psi \) is nondifferentiable, the conventional LPNN is unable to solve the L1CQM directly.

Subdifferential: The subdifferential was developed to handle nondifferentiable functions. The definitions are stated in the following.

Definition 1

Given a convex function f, a vector \(\varvec{\rho }\) is a subgradient of f at \(\varvec{x}\) if

$$\begin{aligned} f(\varvec{y}) \ge f(\varvec{x}) + \varvec{\rho }^{\mathrm {T}} (\varvec{y}-\varvec{x}), \, \forall \varvec{y}. \end{aligned}$$
(5)

Definition 2

The subdifferential \(\partial f(\varvec{x})\) at \(\varvec{x}\) is the set of all subgradients:

$$\begin{aligned} \partial f(\varvec{x}) = \left\{ \varvec{\rho }\, | \, \varvec{\rho }^{\mathrm {T}} (\varvec{y}-\varvec{x}) \le f(\varvec{y}) - f(\varvec{x}), \, \forall \varvec{y}\right\} . \end{aligned}$$
(6)

Note that when \(f(\cdot )\) is differentiable at \(\varvec{x}_o\), its subdifferential at \(\varvec{x}_o\) reduces to the singleton containing the conventional gradient. Let us use the absolute value function \(f(x)=|x| \) as an example to illustrate the subdifferential:

$$\begin{aligned} \partial |x| = {\left\{ \begin{array}{ll} [-1,1] &{} x=0, \\ \text{ sign }(x) &{} x \ne 0. \end{array}\right. } \end{aligned}$$
(7)

Concept of LCA: The LCA is designed to minimize the following unconstrained objective:

$$\begin{aligned} \mathcal {L}_{lca} = \frac{1}{2} |\varvec{\varPhi }\varvec{x}-\varvec{b}|_2^2 + \lambda | \varvec{x}|_1, \end{aligned}$$
(8)

where \(\lambda \) is a trade-off parameter. In the LCA, there are n neurons. The neuron outputs are denoted as \(\varvec{x}\) and their internal states are denoted as \(\varvec{u}\). The mapping from \(\varvec{u}\) to \(\varvec{x}\) is given by a threshold function:

$$\begin{aligned} x_i = T_{\lambda }(u_i) = \left\{ \begin{array}{lcl} 0, &{} \text{ for } &{} |u_i| \le \lambda , \\ u_i - \lambda \text{ sign } (u_i), &{} \text{ for } &{} |u_i| > \lambda . \end{array}\right. \end{aligned}$$
(9)
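As a small illustrative sketch (in Python; the function name is ours), the threshold mapping (9) can be implemented elementwise as follows.

```python
import numpy as np

def soft_threshold(u, lam):
    """Elementwise threshold T_lambda of Eq. (9): zero inside [-lam, lam],
    shrink towards zero by lam outside."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= lam, 0.0, u - lam * np.sign(u))

print(soft_threshold([-2.0, -0.3, 0.0, 0.7, 1.5], lam=1.0))
# -> [-1., 0., 0., 0., 0.5]
```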

The forward mapping \(T_\lambda \) from \(u_i\) to \(x_i\) is one-to-one for \(|u_i| > \lambda \) and is many-to-one for \(|u_i| \le \lambda \). The inverse mapping \(T^{-1}_{\lambda }\) from \(x_i\) to \(u_i\) is one-to-one for \(x_i \ne 0\), while it is one-to-many for \(x_i = 0\). This implies that \(T^{-1}_{\lambda }(0)\) is equal to the set \([-\lambda ,\lambda ]\). The LCA defines the dynamics on \(\varvec{u}\) rather than \(\varvec{x}\), given by

$$\begin{aligned} \frac{d \varvec{u}}{dt} = -\partial _{\varvec{x}} \mathcal {L}_{lca} = -\lambda \partial |\varvec{x}|_1 + \varvec{\varPhi }^{\mathrm {T}} (\varvec{b}- \varvec{\varPhi }\varvec{x}) . \end{aligned}$$
(10)

With the property of \(T_{\lambda }(\cdot )\) and the definition of the subdifferential [12], the LCA replaces “\(\lambda \partial |\varvec{x}|_1\)” with “\(\varvec{u}- \varvec{x}\)”. The dynamics become

$$\begin{aligned} \frac{d \varvec{u}}{dt} = -\varvec{u}+\varvec{x}+ \varvec{\varPhi }^{\mathrm {T}} (\varvec{b}- \varvec{\varPhi }\varvec{x}). \end{aligned}$$
(11)

If we do not introduce the internal state vector \(\varvec{u}\), then \(\partial |\varvec{x}|_1\), which may be a set rather than a single value, is not implementable.
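The LCA dynamics (11) can be summarized in a short simulation sketch (Python, forward-Euler). The step size, iteration count, and the assumption that the columns of \(\varvec{\varPhi }\) are roughly unit-norm (so the Euler step is stable) are our own illustrative choices.

```python
import numpy as np

def lca(Phi, b, lam, dt=1e-2, steps=20000):
    """Forward-Euler simulation of the LCA dynamics (11).
    Assumes the columns of Phi are (roughly) unit-norm so the step size is stable."""
    n = Phi.shape[1]
    u = np.zeros(n)                                   # internal states
    threshold = lambda v: np.where(np.abs(v) <= lam, 0.0, v - lam * np.sign(v))
    for _ in range(steps):
        x = threshold(u)                              # neuron outputs, Eq. (9)
        u += dt * (-u + x + Phi.T @ (b - Phi @ x))    # Eq. (11)
    return threshold(u)
```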

3 LPNN for L1CQM

Our aim is to solve the programming problem:

$$\begin{aligned} \text{ min } |\varvec{b}-\varvec{\varPhi }\varvec{x}|_2^2, \, \text{ s.t. } |\varvec{x}|_1 \le \psi . \end{aligned}$$
(12)

From convex optimization theory, one can obtain the following theorem.

Theorem 1

For the programming problem (12), \(\varvec{x}^{\star }\) is an optimal solution if and only if there exists a \(\lambda ^{\star }\) such that

$$\begin{aligned}&\varvec{0} \in - 2 \varvec{\varPhi }^{\mathrm {T}} (\varvec{b}-\varvec{\varPhi }\varvec{x}^{\star }) + \lambda ^{\star } ({\partial |\varvec{x}^{\star }|_1}), \end{aligned}$$
(13a)
$$\begin{aligned}&|\varvec{x}^{\star }|_1 - \psi \le 0, \end{aligned}$$
(13b)
$$\begin{aligned}&\lambda ^{\star } \ge 0, \end{aligned}$$
(13c)
$$\begin{aligned}&\lambda ^{\star } (|\varvec{x}^{\star }|_1 - \psi ) = 0 . \end{aligned}$$
(13d)

where \(\lambda ^{\star }\) is the optimal dual variable (Lagrange multiplier).

Since the problem (12) has an inequality constraint, we cannot directly use the LPNN. However, if \(\psi ^2\) is less than a certain value, namely \(\psi ^2 < \frac{|\varvec{\varPhi }^{\mathrm {T}} \varvec{b}|_2^2}{|\varvec{\varPhi }^{\mathrm {T}}\varvec{\varPhi }|_2^2}\), then the inequality constraint is active and becomes an equality constraint. Therefore, Theorem 1 becomes the following theorem.

Theorem 2

If \(\psi ^2 < \frac{|\varvec{\varPhi }^{\mathrm {T}} \varvec{b}|_2^2}{|\varvec{\varPhi }^{\mathrm {T}}\varvec{\varPhi }|_2^2}\), then the optimization problem (12) becomes

$$\begin{aligned} \min |\varvec{b}-\varvec{\varPhi }\varvec{x}|_2^2, \, \text{ s.t. } |\varvec{x}|_1 = \psi . \end{aligned}$$
(14)

Moreover, \(\varvec{x}^\star \) is the optimal solution if and only if there exists a \(\lambda ^\star \) (Lagrange multiplier) such that

$$\begin{aligned}&\varvec{0} \in - 2 \varvec{\varPhi }^{\mathrm {T}} (\varvec{b}-\varvec{\varPhi }\varvec{x}^{\star }) + \lambda ^{\star } ({\partial |\varvec{x}^{\star }|_1}), \end{aligned}$$
(15a)
$$\begin{aligned}&|\varvec{x}^{\star }|_1 - \psi = 0, \end{aligned}$$
(15b)
$$\begin{aligned}&\lambda ^{\star } > 0 . \end{aligned}$$
(15c)

Note that (15) summarizes the KKT conditions (necessary and sufficient).

Proof: Suppose \((\varvec{x}^\star ,\lambda ^\star )\) satisfies the conditions of Theorem 1; in particular, \(\lambda ^\star \ge 0\). First, we prove by contradiction that \(\lambda ^\star \) cannot be equal to zero when \(\psi ^2 < \frac{|\varvec{\varPhi }^{\mathrm {T}} \varvec{b}|_2^2}{|\varvec{\varPhi }^{\mathrm {T}}\varvec{\varPhi }|_2^2}\).

Since \((\varvec{x}^\star ,\lambda ^\star )\) is optimal, it satisfies the KKT conditions (13) of Theorem 1. If \(\lambda ^\star = 0\), then from (13a) we have

$$\begin{aligned} \varvec{\varPhi }^{\mathrm {T}}\varvec{\varPhi }\varvec{x}^{\star } = \varvec{\varPhi }^{\mathrm {T}} \varvec{b}\Rightarrow |\varvec{\varPhi }^{\mathrm {T}}\varvec{\varPhi }|_2 |\varvec{x}^{\star }|_2 \ge |\varvec{\varPhi }^{\mathrm {T}} \varvec{b}|_2 \Rightarrow |\varvec{x}^{\star }|_2^2 \ge \frac{|\varvec{\varPhi }^{\mathrm {T}} \varvec{b}|_2^2}{|\varvec{\varPhi }^{\mathrm {T}}\varvec{\varPhi }|_2^2} \, . \end{aligned}$$
(16)

From (13b),

$$\begin{aligned} \psi ^2 \ge |\varvec{x}^\star |^2_1 = \Big (\sum _{i=1}^{n}| x_i^{\star }|\Big )^2 = \sum _{i=1}^{n}{x_i^{\star }}^2 + \sum _{i =1}^{n} \sum _{j \ne i} |x_i^{\star }||x_j^{\star }| \ge \sum _{i=1}^{n}{x_i^{\star }}^2 = |\varvec{x}^{\star }|_2^2 . \end{aligned}$$
(17)

From (16) and (17),

$$\begin{aligned} \psi ^2 \ge |\varvec{x}^{\star }|_2^2 \Rightarrow \psi ^2 \ge \frac{|\varvec{\varPhi }^{\mathrm {T}} \varvec{b}|_2^2}{|\varvec{\varPhi }^{\mathrm {T}}\varvec{\varPhi }|_2^2}. \end{aligned}$$
(18)

The above contradicts the assumption of \(\psi ^2 < \frac{|\varvec{\varPhi }^{\mathrm {T}} \varvec{b}|_2^2}{|\varvec{\varPhi }^{\mathrm {T}}\varvec{\varPhi }|_2^2}\). As a result, \(\lambda ^{\star } > 0\). Then, by the complementary slackness condition (13d), \(|\varvec{x}^{\star }|_1 = \psi \), so the inequality constraint in (12) is active and the KKT conditions of Theorem 1 can be rewritten as (15). Thus, the optimization problem can be written as (14). The proof is complete. \(\blacksquare \)
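As a minimal numerical sketch of the condition in Theorem 2 (the function and variable names are ours, and the \(\pm 1\) matrix below is only a placeholder instance), the bound on \(\psi \) can be computed as follows.

```python
import numpy as np

def psi_upper_bound(Phi, b):
    """Return |Phi^T b|_2 / |Phi^T Phi|_2 from Theorem 2.
    If psi is strictly smaller than this value, the inequality constraint
    in (12) is active and can be replaced by the equality in (14)."""
    num = np.linalg.norm(Phi.T @ b, 2)            # |Phi^T b|_2
    den = np.linalg.norm(Phi.T @ Phi, 2)          # spectral norm |Phi^T Phi|_2
    return num / den

rng = np.random.default_rng(0)
Phi = rng.choice([-1.0, 1.0], size=(30, 128)) / np.sqrt(128)   # placeholder matrix
b = rng.standard_normal(30)                                    # placeholder measurement
print(psi_upper_bound(Phi, b))   # choose psi below this value to apply Theorem 2
```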

With Theorem 2, we can define the Lagrangian function:

$$\begin{aligned} \mathcal{L}_{n} = |\varvec{b}-\varvec{\varPhi }\varvec{x}|^2_2 + \lambda (|\varvec{x}|_1 - \psi ). \end{aligned}$$
(19)

Since the traditional LPNN model cannot handle nondifferentiable constraints, directly implementing the neuron dynamics is impossible. Using the concept of the LCA, we introduce \(\varvec{u}\) as the internal state vector and \(\varvec{x}\) as the corresponding neuron outputs. Besides, from the property of the threshold function and the definition of the subdifferential [12], the dynamics of \(\varvec{u}\) and \(\lambda \) are given by

$$\begin{aligned} \frac{d\varvec{u}}{dt}&= 2 \varvec{\varPhi }^{\mathrm {T}} (\varvec{b}-\varvec{\varPhi }\varvec{x}) - \lambda (\varvec{u}- \varvec{x}) \end{aligned}$$
(20a)
$$\begin{aligned} \frac{d\lambda }{dt}&= |\varvec{x}|_1 - \psi . \end{aligned}$$
(20b)

The role of (20a) is to minimize the objective value, while the role of (20b) is to constrain \(\varvec{x}\) within the feasible region.
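A forward-Euler sketch of the dynamics (20) is given below (Python). Here we assume the output mapping \(x_i = T_{1}(u_i)\), i.e., the threshold function (9) with unit threshold, so that \(\varvec{u}-\varvec{x}\in \partial |\varvec{x}|_1\); the step size, iteration count, and initial value of \(\lambda \) are our own illustrative choices. In an analog realization the integration is performed by the circuit itself; the loop below only emulates the continuous-time behaviour.

```python
import numpy as np

def lpnn_l1cqm(Phi, b, psi, dt=1e-3, steps=100000, lam0=1.0):
    """Forward-Euler sketch of the proposed LPNN dynamics (20a)-(20b)."""
    n = Phi.shape[1]
    u = np.zeros(n)                    # internal states of the variable neurons
    lam = lam0                         # Lagrange neuron (assumed positive)
    T1 = lambda v: np.where(np.abs(v) <= 1.0, 0.0, v - np.sign(v))  # assumed x = T_1(u)
    for _ in range(steps):
        x = T1(u)
        du = 2.0 * Phi.T @ (b - Phi @ x) - lam * (u - x)   # Eq. (20a)
        dlam = np.sum(np.abs(x)) - psi                     # Eq. (20b)
        u += dt * du
        lam += dt * dlam
    return T1(u), lam
```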

Fig. 1. Illustration example for the LPNN. (a) The original signal. (b) Recovery from the pseudoinverse. (c) Recovery from the LPNN. (d) The dynamics of the LPNN.

We can use Fig. 1 to illustrate the idea of the LPNN for L1CQM. In Fig. 1(a), there is a 1D sparse signal of length 128 with five nonzero elements. The number of measurements is 30. The measurement matrix is a random \(\pm 1\) matrix. In Fig. 1(b), we show the recovery from the pseudoinverse. Clearly, it is not the expected signal. In Fig. 1(c), we show the signal recovered by the LPNN, which is very close to the original signal. Figure 1(d) shows the dynamics of \(\varvec{x}\). It can be seen that the LPNN settles down within around five characteristic times.

4 Properties of LPNN

This section discusses the properties of the proposed LPNN for L1CQM. First, we show that an equilibrium point of the LPNN dynamics corresponds to the global minimum point of the L1CQM problem.

Theorem 3

Let \((\varvec{u}^\star , \lambda ^\star )\), with \(\lambda ^\star >0\) and \(\varvec{u}^\star \ne \varvec{0}\), be an equilibrium point of the LPNN dynamics (20), and let \(\varvec{x}^\star \) be the corresponding output vector. Then the KKT conditions in Theorem 2 are satisfied at this equilibrium point, and the corresponding output vector \(\varvec{x}^\star \) is the optimal solution of the problem (14).

Proof: We will show that \((\varvec{u}^\star , \lambda ^\star )\), with \(\lambda ^\star >0\) and \(\varvec{u}^\star \ne \varvec{0}\), satisfies the KKT conditions in Theorem 2. At the equilibrium point, from (20a) and (20b), we have

$$\begin{aligned}&2 \varvec{\varPhi }^{\mathrm {T}} (\varvec{b}- \varvec{\varPhi }\varvec{x}^{\star }) -\lambda ^{\star } (\varvec{u}^{\star } - \varvec{x}^{\star }) = \varvec{0}, \end{aligned}$$
(21)
$$\begin{aligned}&|\varvec{x}^{\star }|_1 - \psi =0. \end{aligned}$$
(22)

For (21), by the relation \(\varvec{u}- \varvec{x}\in \partial |\varvec{x}|_1\),

$$\begin{aligned} 2 \varvec{\varPhi }^{\mathrm {T}}(\varvec{b}- \varvec{\varPhi }\varvec{x}^{\star }) -\lambda ^{\star } (\varvec{u}^{\star } - \varvec{x}^{\star }) = \varvec{0}\Rightarrow \varvec{0} \in -2 \varvec{\varPhi }^{\mathrm {T}}(\varvec{b}- \varvec{\varPhi }\varvec{x}^{\star }) + \lambda ^{\star } \, \partial |\varvec{x}^\star |_1 . \end{aligned}$$

This satisfies the KKT condition (15a). Based on the assumption that \(\lambda ^{\star } > 0\), (15c) is satisfied. Also, since \(|\varvec{x}^{\star }|_1 - \psi = 0\) from (22), the KKT condition (15b) is satisfied. As the KKT conditions are necessary and sufficient, any equilibrium point \((\varvec{u}^{\star }, \lambda ^{\star })\), with \(\varvec{u}^{\star } \ne \varvec{0}\) and \(\lambda ^{\star }>0\), is an optimal solution of the problem. The proof is complete.     \(\blacksquare \)

Another issue of concern is the stability of the equilibrium point; otherwise, the equilibrium points of the LPNN are not reachable. Based on the approach in [3, 6], it can be shown that the equilibrium point of (20) is an asymptotically stable point.

5 Simulations

The proposed LPNN approach is evaluated in a number of experiments using the standard settings in [13, 14]. The aim of the experiments is to verify whether the proposed LPNN achieves performance similar to that of the conventional numerical LASSO method. For the sparse vector \(\varvec{x}\in \mathfrak {R}^n\), we consider two signal lengths: \({n=512,4096}\). The numbers of non-zero elements in \(\varvec{x}\) are selected as 15 and 25 for \({n=512}\), and 75 and 125 for \({n=4096}\), respectively. The non-zero elements are uniformly distributed random numbers, drawn either from between \(-\)5 and \(-\)1 or from between 1 and 5.

In the tests, the measurement matrix is a random \(\pm 1\) matrix, which is then normalized with the signal length. In the experiments, we vary the number m of measurement signals. For each setting, we repeat the experiment 100 times using different measurement matrices and sparse signals. The variances of the Gaussian noise introduced in the measured signal are \(\sigma ^2= \{0.05^2,0.025^2,0.005^2\}\).
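The following sketch (Python) generates one test instance of this setup. The function and parameter names are ours, and the \(\sqrt{n}\) scaling is an assumption, since the text only states that the matrix is normalized with the signal length.

```python
import numpy as np

def make_instance(n=512, k=15, m=80, sigma=0.05, seed=None):
    """Generate one sparse-recovery test instance as described above."""
    rng = np.random.default_rng(seed)
    # Sparse signal: k non-zero entries, uniform in [-5,-1] or [1,5].
    x = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x[support] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 5.0, size=k)
    # Random +/-1 measurement matrix, normalized with the signal length (assumed sqrt(n)).
    Phi = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(n)
    # Noisy measurements b = Phi x + xi, with Gaussian noise of variance sigma^2.
    b = Phi @ x + sigma * rng.standard_normal(m)
    return Phi, x, b
```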

Fig. 2. The MSE of the recovered signals from noisy measurements for signal length \(n = 512\). The signals contain 15 and 25 non-zero data points in the first and second rows, respectively.

Fig. 3. The MSE of the recovered signals from noisy measurements for signal length \(n = 4096\). The signals contain 75 and 125 non-zero data points in the first and second rows, respectively.

In order to compare the proposed LPNN approach with a digital method, the LASSO algorithm from SPGL1 [13, 14] is applied to recover the sparse signals. For the comparison, the mean square error (MSE) values of the recovered signals are recorded. The results are shown in Figs. 2 and 3. From the figures, the performance of the LPNN is similar to that of the LASSO algorithm. In addition, as shown in Fig. 2, for \(n = 512\) with 15 non-zero data points, around 80 measurements are required to recover the sparse signal. For 25 non-zero data points, around 120 measurements are required. As shown in Fig. 3, for \(n = 4096\) with 75 non-zero data points, around 500 measurements are required to recover the sparse signal. For 125 non-zero data points, around 700 measurements are required.

6 Conclusion

This paper proposed a new LPNN model for solving the L1CQM problem. On the theoretical side, we proved that the equilibrium points of the LPNN are optimal solutions of the problem. Besides, simulations are carried out to verify the effectiveness of the LPNN. There are some possible extensions of our work. From the simulations, the network always converges. However, we have not theoretically shown that the LPNN is globally stable. Hence, it is interesting to theoretically study the global stability.