Introduction

Target detection, an important research direction in the field of hyperspectral imaging, aims to detect small objects or anomalies in hyperspectral images (HSIs). It has broad application potential in military security, environmental pollution monitoring, geological exploration, and agriculture and forest monitoring (Matteoli et al. 2010; Li et al. 2017; Xie et al. 2019a, b). An HSI contains rich spectral and spatial information. Since spectral features differ among substances, objects in a scene can be effectively distinguished using an HSI.

Many target detection algorithms have been proposed and applied to HSIs. The Reed-Xiaoli (RX) detector (Reed and Yu 1990; Xie et al. 2019a, b) obtains the detection result by constructing a generalized likelihood ratio and estimating the background covariance matrix. However, the RX algorithm ignores the rich nonlinear information in an HSI, resulting in poor detection accuracy. The collaborative representation detector (CRD) (Li and Du 2015) is based directly on the concept that each pixel in the background can be approximately represented by its spatial neighborhood, while anomalies cannot. However, CRD considers only the spectral features of an HSI, ignoring the spatial features.
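For concreteness, the following is a minimal NumPy sketch of the global RX statistic, the squared Mahalanobis distance of each pixel from the background mean; reshaping the image cube into a pixels-by-bands matrix and using the pseudo-inverse for stability are illustrative assumptions made here, not details from Reed and Yu (1990).

```python
import numpy as np

def rx_detector(pixels):
    """Global RX statistic for each pixel.
    pixels: (M, N) array of M pixel spectra with N bands."""
    mu = pixels.mean(axis=0)
    centered = pixels - mu
    cov_inv = np.linalg.pinv(np.cov(centered, rowvar=False))  # (N, N)
    # D(y) = (y - mu)^T Sigma^{-1} (y - mu), evaluated for every pixel at once
    return np.einsum('ij,jk,ik->i', centered, cov_inv, centered)
```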

The combined sparse and collaborative representation detector (CSCR) (Li et al. 2015) assumes that the representation of known target signatures is sparse and can be solved by L1-norm minimization of the representation weight vector, whereas the representation over the background atoms is assumed to be collaborative and can be solved by L2-norm minimization. The decision is then made by computing the difference between the two representation residuals.

Support vector machines (SVMs) (Zhao et al. 2012) are a highly effective method for nonlinear signals: they map the signals into a new feature space in which the classes are easier to separate (Tan and Du 2008). Kernel methods have yielded satisfactory results in HSI processing (Li et al. 2010; Zhao et al. 2010). In addition, many algorithms use statistical hypothesis testing, such as the spectral matched filter (SMF) (Manolakis and Shaw 2002); all of these must assume a mathematical distribution for the pixel spectra of HSIs. The accuracy of the assumed distribution model has a substantial impact on the detection results.
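As a sketch of such statistics-based detectors, the following NumPy implementation computes an SMF-style score under one common normalization; the use of global sample statistics and the function signature are assumptions made here for illustration, not the exact formulation of Manolakis and Shaw (2002).

```python
import numpy as np

def smf_detector(pixels, target_sig):
    """Spectral matched filter score under a common normalization.
    pixels: (M, N) array of M pixel spectra; target_sig: (N,) target spectrum."""
    mu = pixels.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(pixels, rowvar=False))
    s = target_sig - mu                      # centered target signature
    w = cov_inv @ s / (s @ cov_inv @ s)      # matched-filter weights
    return (pixels - mu) @ w                 # one score per pixel
```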

In recent years, the sparse representation method has attracted increasing attention (Chen et al. 2011a, b). It assumes that target detection can be achieved by using target and background dictionaries to represent the test pixels. Under this formulation, the representation of the target and background signatures can be solved by L1-norm minimization of the weight coefficients, and the target detection problem is thus transformed into the optimization problem of solving for the dictionary atom coefficients (Chen et al. 2011a, b). This approach requires neither assumptions about the mathematical distributions of the background and target pixels nor independence among the atoms of the training sample dictionary. Because background and target pixel spectra belong to different subspaces, the atoms that constitute their spectral dictionaries differ. Whether a pixel belongs to the background or the target can therefore be determined from the positions of the nonzero coefficients.

To solve the dictionary atom optimization problem, convex optimization methods typically replace the L0-norm with the L1-norm so that the problem can be solved efficiently. However, the two yield equivalent solutions only under strict conditions, and the true signal often cannot be recovered. In addition, owing to the similarity of the endmembers in the spectral library and the insufficient constraints of the objective function, the estimated abundances can differ substantially from the true solutions. Moreover, because hyperspectral images contain a large amount of data, convex optimization algorithms are slow.

Another approach is sparse Bayesian learning (SBL) (Themelis et al. 2012), which models the unknown variables using Bayesian priors and obtains sparse solutions by Bayesian inference (Wipf 2006; Zhang and Rao 2011, 2012, 2013; Qiu and Dogandzic 2010; Kong et al. 2017). The core strategy of Bayesian theory is to obtain the posterior probability of an unknown parameter by combining prior information with the observed data. Wipf and Rao (2004) proved that the SBL algorithm can obtain the sparsest solution, and SBL remains reliable even when the endmembers in the spectral library are strongly correlated. However, SBL uses the expectation maximization (EM) algorithm to update the parameters, which entails a large amount of computation and fails to exploit the joint sparsity of the endmember combinations in adjacent pixels, resulting in low efficiency.

To make better use of the spatial correlation in HSIs, a regularized multiple sparse Bayesian learning (RMSBL) method for target detection is proposed, which is established by Bayesian inference using the conditional posterior distributions of the model parameters under a hierarchical Bayesian model. Based on the cost function of multiple sparse Bayesian learning (MSBL), the representation of the target and background signatures is obtained by an iterative L2,1-norm minimization method, and the detection result is given by the difference between the two representation residuals. In simulation experiments, the RMSBL algorithm achieves superior detection performance compared with other commonly used detection algorithms.


The remainder of this paper is organized as follows: the “Target Detection Based on Sparse Representation” section discusses target detection based on sparse representation. The “Regularized Multiple Sparse Bayesian Learning for Target Detection” section describes the derivation of RMSBL. The “Experimental Results and Analysis” section presents the experimental setup and evaluates the detection performance. Finally, the conclusions of this work are presented in the “Conclusions” section.

Target Detection Based on Sparse Representation

Let Y be a set of hyperspectral image data and y ∈ RN × 1 be an N-dimensional spectral vector in Y. Vector y can be represented as follows (Zhang et al. 2017):

$$ {\displaystyle \begin{array}{l}\mathbf{y}={\mathbf{a}}_1^b{x}_1^b+{\mathbf{a}}_2^b{x}_2^b+\cdots +{\mathbf{a}}_{N_b}^b{x}_{N_b}^b+{\mathbf{a}}_1^t{x}_1^t+{\mathbf{a}}_2^t{x}_2^t+\cdots +{\mathbf{a}}_{N_t}^t{x}_{N_t}^t+\mathbf{n}\\ {}\kern0.9000001em =\left[{\mathbf{a}}_1^b{\mathbf{a}}_2^b\cdots {\mathbf{a}}_{N_b}^b\right]\;{\left[{x}_1^b{x}_2^b\cdots {x}_{N_b}^b\right]}^T+\left[{\mathbf{a}}_1^t{\mathbf{a}}_2^t\cdots {\mathbf{a}}_{N_t}^t\right]\;{\left[{x}_1^t{x}_2^t\cdots {x}_{N_t}^t\right]}^T+\mathbf{n}\\ {}\kern0.9000001em ={\mathbf{A}}_b{\mathbf{x}}_b+{\mathbf{A}}_t{\mathbf{x}}_t+\mathbf{n}=\left[{\mathbf{A}}_b\;{\mathbf{A}}_t\right]\;\left[\begin{array}{c}{\mathbf{x}}_b\\ {}{\mathbf{x}}_t\end{array}\right]+\mathbf{n}=\mathbf{Ax}+\mathbf{n}\end{array}} $$
(1)

where Ab and At are the background dictionary and the target dictionary, respectively; x denotes the weight coefficients that correspond to the dictionary; x is a sparse vector, of which only a few coefficients are nonzero; and n represents the observation error.

In the sparse model, it is not necessary to assume distributions for the target and the background because the spectral characteristics of background and target pixels differ and lie in different subspaces. The sparse vector x is composed of the background weight coefficients xb and the target weight coefficients xt. If y is a target pixel, then xb is a zero vector and xt is a sparse vector; if y is a background pixel, then xb is a sparse vector and xt is a zero vector. Therefore, whether pixel y is a background or target pixel can be determined from the positions of the nonzero entries of its coefficient vector x.

To obtain the weight coefficient x of the pixel y, one must solve the optimization problem that is defined by the following formula:

$$ \mathbf{x}=\mathrm{argmin}{\left\Vert \mathbf{x}\right\Vert}_1\;\mathrm{subject}\ \mathrm{to}\;\mathbf{Ax}=\mathbf{y} $$
(2)

where argmin denotes the value of the variable at which the objective function attains its minimum.

Pixel y can be classified by comparing the values of the reconstructed residuals after x has been obtained. Therefore, the output of the detector is expressed as follows:

$$ R\left(\mathbf{y}\right)={\left\Vert \mathbf{y}-{\mathbf{A}}_b{\mathbf{x}}_b\right\Vert}_2-{\left\Vert \mathbf{y}-{\mathbf{A}}_t{\mathbf{x}}_t\right\Vert}_2 $$
(3)

where ‖y − Abxb‖2 and ‖y − Atxt‖2 are the background residual and the target residual, respectively. For a specified threshold δ, if R(y) > δ, then y is classified as a target; otherwise, y belongs to the background.
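A minimal sketch of this detector is given below; it replaces the equality-constrained problem (2) with the common Lasso relaxation, and the solver choice and the regularization weight `alpha` are illustrative assumptions rather than details from the cited works.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_detector(y, A_b, A_t, alpha=1e-3):
    """Residual-difference score R(y) of Eq. (3) for one pixel.
    A_b, A_t: background and target dictionaries (N x N_b, N x N_t)."""
    A = np.hstack([A_b, A_t])                # combined dictionary
    x = Lasso(alpha=alpha, fit_intercept=False,
              max_iter=10000).fit(A, y).coef_   # L1 relaxation of Eq. (2)
    x_b, x_t = x[:A_b.shape[1]], x[A_b.shape[1]:]
    r_b = np.linalg.norm(y - A_b @ x_b)      # background residual
    r_t = np.linalg.norm(y - A_t @ x_t)      # target residual
    return r_b - r_t                         # R(y) > delta => target
```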

The above is a target detection model based on sparse representation. Because adjacent pixels in hyperspectral data contain similar information, they can be represented by linear combinations of the same endmembers in the spectral library. Accordingly, the spectral vector is extended to a spectral matrix and the weight coefficient vector is expanded to a weight coefficient matrix. The mathematical model of the multiple sparse representation is then as follows:

$$ \mathbf{Y}=\mathbf{AX}+\mathbf{N}={\mathbf{A}}_t{\mathbf{X}}_t+{\mathbf{A}}_b{\mathbf{X}}_b+\mathbf{N}\kern0.2em $$
(4)

where Y ∈ RN × M denotes the observed values of M pixels in N bands; A ∈ RN × L represents the spectral library, which contains the reflected values of L objects in N bands; X ∈ RL × M is the weight coefficient matrix; and N is the observation noise.
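In practice, Y is built by flattening the image cube; a short helper (the rows × columns × bands cube layout assumed here is an illustrative choice) might look as follows.

```python
import numpy as np

def cube_to_matrix(cube):
    """Reshape an HSI cube of shape (rows, cols, bands) into the N x M
    matrix Y of Eq. (4), with one pixel spectrum per column."""
    rows, cols, bands = cube.shape
    return cube.reshape(rows * cols, bands).T   # (N, M), M = rows * cols
```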

The key step in target detection is to find a weight coefficient matrix X that minimizes \( {\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2 \) while also keeping ‖X‖2,1 small. Therefore, the objective function is as follows:

$$ \mathbf{X}=\arg \underset{\mathbf{X}}{\min }{\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2+{\left\Vert \mathbf{X}\right\Vert}_{2,1} $$
(5)

The L2,1-norm optimization problem is typically solved by a convex optimization algorithm; a representative one is the collaborative spectral unmixing by variable splitting and augmented Lagrangian algorithm (CLSUnSAL) (Bioucas-Dias and Figueiredo 2010).

The values of the target and the background under the sparse representation can be obtained separately after solving for the weight coefficients:

$$ {\mathbf{Y}}_t={\mathbf{A}}_t{\mathbf{X}}_t $$
(6)
$$ {\mathbf{Y}}_b=\mathbf{AX}-{\mathbf{Y}}_t $$
(7)

Then, the target residual and the background residual of each pixel are calculated as follows:

$$ {r}_t\left(\mathbf{y}\right)={\left\Vert \mathbf{y}-{\mathbf{y}}_t\right\Vert}_2^2 $$
(8)
$$ {r}_b\left(\mathbf{y}\right)={\left\Vert \mathbf{y}-{\mathbf{y}}_b\right\Vert}_2^2 $$
(9)

Pixel y is classified by comparing the values of the reconstructed residuals. Therefore, the output of the detector is expressed as follows:

$$ R\left(\mathbf{y}\right)={r}_b\left(\mathbf{y}\right)-{r}_t\left(\mathbf{y}\right) $$
(10)

For a specified threshold δ, if R(y) > δ, then y is classified as a target; otherwise, y belongs to the background.
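Putting Eqs. (6)-(10) together, the following sketch computes the residual-difference score for every pixel at once; the assumption that the target atoms occupy the last columns of A is made here purely for illustration.

```python
import numpy as np

def detection_map(Y, A, X, n_target_atoms):
    """Residual-difference scores for all pixels (Eqs. (6)-(10)).
    Assumes the last n_target_atoms columns of A form A_t in Eq. (4)."""
    A_t = A[:, -n_target_atoms:]
    X_t = X[-n_target_atoms:, :]
    Y_t = A_t @ X_t                          # target part, Eq. (6)
    Y_b = A @ X - Y_t                        # background part, Eq. (7)
    r_t = np.sum((Y - Y_t) ** 2, axis=0)     # Eq. (8), per pixel
    r_b = np.sum((Y - Y_b) ** 2, axis=0)     # Eq. (9), per pixel
    return r_b - r_t                         # Eq. (10)
```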

Regularized Multiple Sparse Bayesian Learning for Target Detection

Multiple Sparse Bayesian Learning Model

Multiple sparse Bayesian learning (MSBL) is an efficient method for solving the simultaneous sparse approximation problem. Based on the multiple measurement vector (MMV) model, a prior distribution over the sparse coefficient matrix with shared hyperparameters is established. Because a pixel in a hyperspectral image and its surrounding pixels contain similar information, the coefficient matrix should be row sparse; hence, the prior distribution of each row vector is governed by a hyperparameter so that the matrix satisfies the row sparsity property. From the cost function of MSBL, an iterative method is derived that effectively reduces the number of iterations.

Let Y.j and X.j represent the jth columns of Y and X, respectively. (Note that in this section the coefficient matrix X has M rows, one per dictionary atom, and L columns, one per pixel.) The likelihood function is obtained as (Kong et al. 2016):

$$ p\left({\mathbf{Y}}_{.j}|{\mathbf{X}}_{.j}\right)={\left(\pi {\sigma}^2\right)}^{-N}\exp \left(-\frac{1}{\sigma^2}{\left\Vert {\mathbf{Y}}_{.j}-\mathbf{A}{\mathbf{X}}_{.j}\right\Vert}_2^2\right) $$
(11)

The general approach is to impose sparsity on the abundance matrix directly via a Laplace prior; however, the Laplace prior is not conjugate to the Gaussian likelihood. Therefore, a hierarchical Bayesian model is used to design the prior distribution. Assume that the ith row Xi. of the coefficient matrix X obeys the Gaussian distribution p(Xi.; γi) = N(0, γiI) with parameter γi. Then, the prior distribution of the coefficient matrix is a high-dimensional Gaussian distribution:

$$ p\left(\mathbf{X};\gamma \right)=\prod \limits_{i=1}^Mp\left({\mathbf{X}}_{i.};{\gamma}_i\right) $$
(12)

where \( \gamma ={\left[{\gamma}_1,{\gamma}_2,\cdots, {\gamma}_M\right]}^T\in {R}_{+}^M \) and γi controls the sparsity of the ith row of the coefficient matrix. If γi = 0, then Xi. is an all-zero row; that is, the conditional probability p(Xi. = 0 | Y; γi = 0) = 1 is satisfied. The parameter γi obeys the Gamma distribution p(γi| λi) ~ Γ(γi| 1, λi/2).

According to Bayesian theory, a posterior distribution is obtained:

$$ p\left({\mathbf{X}}_{.j}|{\mathbf{Y}}_{.j};\gamma \right)=\frac{p\left({\mathbf{X}}_{.j},{\mathbf{Y}}_{.j};\gamma \right)}{\int p\left({\mathbf{X}}_{.j},{\mathbf{Y}}_{.j};\gamma \right)d{\mathbf{X}}_{.j}}=N\left({\mu}_{.j},\sum \right) $$
(13)

The mean and variance can be expressed as follows:

$$ \mathbf{M}=\left[{\mu}_{.1},{\mu}_{.2},\cdots, {\mu}_{.L}\right]=E\left[\mathbf{X}|\mathbf{Y};\gamma \right]=\varGamma {\mathbf{A}}^T{\sum}_{\mathbf{Y}}^{-1}\mathbf{Y} $$
(14)
$$ \sum = Cov\left[{\mathbf{X}}_{.j}|{\mathbf{Y}}_{.j};\gamma \right]=\varGamma -\varGamma {\mathbf{A}}^T{\sum}_{\mathbf{Y}}^{-1}\mathbf{A}\varGamma, \forall j $$
(15)

where Γ = diag(γ) and ∑Y = σ2I + AΓAT.
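In code, Eqs. (14)-(15) reduce to a few linear-algebra operations; the NumPy sketch below assumes γ and σ² are given and uses a direct inverse for clarity rather than efficiency.

```python
import numpy as np

def posterior_moments(Y, A, gamma, sigma2):
    """Posterior mean M (Eq. (14)) and covariance Sigma (Eq. (15))."""
    N = A.shape[0]
    Gamma = np.diag(gamma)
    Sigma_Y = sigma2 * np.eye(N) + A @ Gamma @ A.T
    G = Gamma @ A.T @ np.linalg.inv(Sigma_Y)  # Gamma A^T Sigma_Y^{-1}
    M = G @ Y                                 # Eq. (14)
    Sigma = Gamma - G @ A @ Gamma             # Eq. (15)
    return M, Sigma
```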

The logarithmic form of the cost function of MSBL is expressed as follows (Wang et al. 2013):

$$ {\displaystyle \begin{array}{l}L\left(\gamma, {\sigma}^2\right)=-2\log \int p\left(\mathbf{Y}|\mathbf{X}\right)p\left(\mathbf{X};\gamma, {\sigma}^2\right)d\mathbf{X}\\ {}\kern1.44em =L\log \mid {\sum}_{\mathbf{Y}}\mid +{\sum}_{j=1}^L{\mathbf{Y}}_{.j}^T{\sum}_{\mathbf{Y}}^{-1}{\mathbf{Y}}_{.j}\end{array}} $$
(16)

For parameter optimization, the EM method is commonly used in sparse Bayesian learning. It consists of two steps: in the E-step, the posterior mean is calculated via formula (14); in the M-step, the hyperparameters are updated by the following formula:

$$ {\gamma}_i={\mu}_i^2+{\sum}_{ii} $$
(17)

The MacKay method obtains the parameter iteration formula by computing the extremum directly. This method is equivalent to point estimation and iterates faster than formula (17):

$$ {\gamma}_i=\frac{\mu_i^2}{1-{\gamma}_i^{-1}{\sum}_{ii}} $$
(18)
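The two updates can be compared side by side; the sketch below states them for a single hyperparameter exactly as in Eqs. (17)-(18) (in the MMV case, replacing μi² by the row-wise average ‖Mi.‖²/L is a common choice, assumed here but not spelled out above).

```python
def gamma_updates(mu_i, sigma_ii, gamma_i):
    """EM update (Eq. (17)) and MacKay fixed-point update (Eq. (18)) for
    one hyperparameter, given the posterior mean mu_i, the posterior
    variance sigma_ii, and the current value gamma_i."""
    gamma_em = mu_i ** 2 + sigma_ii                        # Eq. (17)
    gamma_mackay = mu_i ** 2 / (1.0 - sigma_ii / gamma_i)  # Eq. (18)
    return gamma_em, gamma_mackay
```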

Optimal Solution

This paper proposes a new optimization method. In Eq. (16), the former term, namely, log ∣ ∑Y∣, is a smooth concave function of γ and can be transformed via the properties of conjugate functions, while the latter term, namely, \( {\mathbf{Y}}_{.j}^T{\sum}_{\mathbf{Y}}^{-1}{\mathbf{Y}}_{.j} \), is quadratic. After transforming the two terms, the following is obtained:

$$ {\displaystyle \begin{array}{l}{L}_z\left(\mathbf{X},\gamma \right)=\frac{1}{\sigma^2}\underset{\mathbf{X}}{\min }{\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2+{z}^T\gamma +{\mathbf{X}}^T{\varGamma}^{-1}\mathbf{X}\\ {}\kern1.44em =\frac{1}{\sigma^2}\underset{\mathbf{X}}{\min }{\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2+{z}^T\gamma +{\sum}_{i=1}^M{\gamma}_i^{-1}{\left\Vert {\mathbf{X}}_{i.}\right\Vert}_2^2\end{array}} $$
(19)

where minX denotes minimization of the objective function with respect to X.

The optimal iteration for γi is obtained by setting the derivative of Eq. (19) with respect to γi to zero:

$$ {\gamma}_i={z}_i^{-1/2}\sqrt{{\mathbf{X}}_{i.}{\mathbf{X}}_{i.}^T}={z}_i^{-1/2}{\left\Vert {\mathbf{X}}_{i.}\right\Vert}_2\left(\forall i\right) $$
(20)

Substituting formula (20) into formula (19) and normalizing the coefficients of the regularization term yields:

$$ \mathbf{X}=\underset{\mathbf{X}}{\mathrm{argmin}}\frac{1}{2}{\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2+{\sum}_{i=1}^M{\sigma}^2{\gamma}_i^{-1}{\left\Vert {\mathbf{X}}_{i.}\right\Vert}_2^2 $$
(21)

Let \( {w}_i={\sigma}^2{z}_i^{1/2} \). The optimal expression for weight coefficient estimation is as follows:

$$ {\displaystyle \begin{array}{l}\mathbf{X}=\underset{\mathbf{X}}{\mathrm{argmin}}\frac{1}{2}{\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2+{\sum}_{i=1}^M{\sigma}^2{z}_i^{1/2}{\left\Vert {\mathbf{X}}_{i.}\right\Vert}_2\\ {}\kern0.36em =\arg \underset{\mathbf{X}}{\min}\frac{1}{2}{\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2+{\sum}_{i=1}^M{w}_i{\left\Vert {\mathbf{X}}_{i.}\right\Vert}_2\end{array}} $$
(22)

Define the matrix norm:

$$ {\left\Vert B\right\Vert}_{2,1}={\sum}_i\sqrt{\sum_j\left({B}_{ij}^2\right)}={\sum}_i\sqrt{B_{i.}{B}_{i.}^T}={\sum}_i{\left\Vert {B}_{i.}\right\Vert}_2 $$
(23)

Equation (22) can be rewritten as follows:

$$ \mathbf{X}=\arg \underset{\mathbf{X},\kern0.48em W}{\min}\frac{1}{2}{\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2+{\left\Vert W\mathbf{X}\right\Vert}_{2,1} $$
(24)

where W = diag(wi) denotes a diagonal matrix with diagonal elements wi. Equation (24) is a weighted iterative L2,1-regularization problem. The results obtained by the alternating iteration method are globally convergent and correspond to the sparsest solution (Rakotomamonjy 2011).
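For reference, Eq. (23) and the weighted objective of Eq. (24) translate directly into NumPy; the helper names below are ours, not identifiers from the cited works.

```python
import numpy as np

def l21_norm(B):
    """L2,1 norm of Eq. (23): the sum of the L2 norms of the rows of B."""
    return np.sum(np.linalg.norm(B, axis=1))

def weighted_objective(Y, A, X, w):
    """Objective value of Eq. (24) with W = diag(w)."""
    return 0.5 * np.linalg.norm(Y - A @ X, 'fro') ** 2 + l21_norm(np.diag(w) @ X)
```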

In addition, the noise variance affects only the convergence speed, not the accuracy of the sparse solution. To set this parameter adaptively, we update the variance as follows (Wipf and Rao 2007):

$$ {\sigma}^2=\frac{{\left\Vert \mathbf{Y}-\mathbf{AX}\right\Vert}_F^2/L}{N-M+{\sum}_{i=1}^M{\sum}_{ii}/{\gamma}_i} $$
(25)

For problem (16), parameter learning can be performed by alternating iteration; the update expressions are listed in Table 1.

Table 1 Parameter updating in each iteration

The overall RMSBL algorithm is summarized as Algorithm 1.

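The following NumPy sketch mirrors the alternating updates of Table 1 under simplifying assumptions (zi absorbed into a constant, a fixed iteration count, and direct matrix inverses); it is a minimal illustration, not the reference implementation of Algorithm 1.

```python
import numpy as np

def rmsbl(Y, A, n_iter=50, eps=1e-10):
    """Alternating updates behind Algorithm 1 (sketch).
    Y: (N, L) pixel matrix; A: (N, M) spectral library."""
    N, L = Y.shape
    M = A.shape[1]
    gamma = np.ones(M)
    sigma2 = 1e-2                                   # initial noise variance (assumed)
    for _ in range(n_iter):
        Gamma = np.diag(gamma)
        Sigma_Y = sigma2 * np.eye(N) + A @ Gamma @ A.T
        G = Gamma @ A.T @ np.linalg.inv(Sigma_Y)    # Gamma A^T Sigma_Y^{-1}
        X = G @ Y                                   # posterior mean, Eq. (14)
        diag_ratio = 1.0 - np.einsum('ij,ji->i', G, A)  # Sigma_ii / gamma_i
        sigma2 = (np.linalg.norm(Y - A @ X, 'fro') ** 2 / L) / (
            N - M + np.sum(diag_ratio))             # Eq. (25)
        gamma = np.linalg.norm(X, axis=1) + eps     # Eq. (20) with z_i = 1 (assumed)
    return X
```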

Experimental Results and Analysis

In this section, experiments are conducted on four datasets, and we compare the proposed RMSBL algorithm with four widely used methods: CRD, CSCR, RX, and LRX. The parameters of each algorithm are optimized in the experiments, and the receiver operating characteristic (ROC) curve is employed to quantitatively evaluate the detection performance. We also compute the area under the ROC curve (AUC) to compare RMSBL with the other algorithms.
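For completeness, the ROC curve and AUC can be computed with scikit-learn (an implementation choice assumed here, not the tooling used in the original experiments):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

def evaluate_detector(scores, labels):
    """ROC curve and AUC for per-pixel detector scores against a binary
    ground-truth map (1 = target, 0 = background)."""
    fpr, tpr, _ = roc_curve(np.ravel(labels), np.ravel(scores))
    return fpr, tpr, auc(fpr, tpr)
```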

Hyperspectral Data

The first experimental dataset, namely, airport2, uses a portion of the hyperspectral image of Los Angeles Airport that was collected by the airborne visible/infrared imaging spectrometer (AVIRIS) sensor. This scene consists of 100 × 100 pixels (as shown in Fig. 1a) and the spatial resolution is 7.1 m. After removing the water absorption and low-SNR bands, 205 bands remain, including 87 target pixels to be detected. Figure 1b shows the ground-truth image of the target.

Fig. 1

The first column shows the color composites of four datasets and the second column shows the ground-truth map of the target. a, b Airport2. c, d Cuprite. e, f San Diego. g, h The HYDICE Urban scene

The second dataset, namely, Cuprite, was captured by the AVIRIS sensor in 1997 over the Cuprite mine in Nevada. Only a small part of the data is used in this experiment. Figure 1c and d show the color composites of Cuprite and the ground-truth image of the target, respectively. This scene consists of 250 × 191 pixels. After removing the water absorption and low-SNR bands, 188 bands remain, including approximately 35 to 40 target pixels to be detected.

The third dataset, San Diego, uses a portion of the hyperspectral image of the San Diego Airport in the USA that was collected by the AVIRIS sensor. This scene consists of 200 × 200 pixels (as shown in Fig. 1e) and the spatial resolution is 3.5 m. After removing the water absorption and low-SNR bands, 189 bands remain, including approximately 132 target pixels to be detected. Figure 1f shows the ground-truth image of the target.

The fourth dataset, the HYDICE Urban scene, is a hyperspectral image of a suburban residential area in Texas, USA, that was captured by the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensor. Figure 1g and h show the color composites of the HYDICE Urban scene and the ground-truth image of the target, respectively. This scene consists of 80 × 100 pixels, and the spatial resolution is approximately 1 m. After removing the water absorption and low-SNR bands, 162 bands remain, including approximately 21 target pixels to be detected.

Detection Performance

The RMSBL, CRD, CSCR, and LRX algorithms all use a background dictionary. In practical target detection, the background dictionary is typically obtained via local, adaptive methods. For the CRD, CSCR, and LRX algorithms, the sliding dual-window method is used to obtain the background dictionary; hence, the sizes of the inner and outer windows affect performance, and these parameters are tuned during the simulation experiments. The RMSBL algorithm obtains its dictionary via the vertex component analysis (VCA) method; thus, the RMSBL and RX algorithms are not affected by the dual-window scheme.

We conduct an experimental simulation on the four datasets that are described above and analyze the results of the RMSBL algorithm and the four comparison algorithms in terms of their ROC curves and AUC values.

To optimize the performance of each algorithm on the airport2 dataset, the parameters were determined after extensive experiments as follows: for the CRD algorithm, the outer window size is wout = 11, the inner window size is win = 5, and the regularization parameter is λ = 10−6; for the CSCR algorithm, the window sizes are (wout, win) = (11, 3) and the regularization parameters are λ1 = 10−2 and λ2 = 10−1; and for the LRX algorithm, the window sizes are (wout, win) = (15, 3). The detection outputs of the algorithms are shown in Fig. 2; the proposed RMSBL algorithm yields the best result. Figure 6a shows the ROC curves of the proposed algorithm and the comparison algorithms. The detection rate of the RMSBL algorithm is lower than those of the other algorithms when the false alarm rate is below 10−2; however, once the false alarm rate exceeds 10−2, it increases rapidly, becomes significantly higher than those of the other algorithms, and reaches 1 first. On this dataset, the CRD algorithm is inferior to RMSBL but outperforms the other algorithms.

Fig. 2

Detection outputs for dataset airport2. a RMSBL. b CRD. c CSCR. d LRX. e RX

For the Cuprite dataset, the parameters of each algorithm are as follows: for the CRD algorithm, the window sizes are (wout, win) = (11, 5) and the regularization parameter is λ = 10−6; for the CSCR algorithm, the window sizes are (wout, win) = (11, 5) and the regularization parameters are λ1 = 10−1 and λ2 = 10−2; and for the LRX algorithm, the window sizes are (wout, win) = (13, 9). Figure 3 shows the outputs of the proposed algorithm and the comparison algorithms, and the ROC curves are plotted in Fig. 6b. The experimental results demonstrate that the RMSBL algorithm far outperforms the other algorithms in terms of detection probability: its detection rate reaches 1 while the false alarm rate is still below 10−2, so the RMSBL algorithm performs well on this dataset. The RX algorithm is inferior to RMSBL but outperforms the other algorithms.

Fig. 3

Detection outputs for the Cuprite dataset. a RMSBL. b CRD. c CSCR. d LRX. e RX

For the San Diego dataset, the parameters of each algorithm are as follows: for the CRD algorithm, the window sizes are (wout, win) = (17, 9) and the regularization parameter is λ = 10−6; for the CSCR algorithm, the window sizes are (wout, win) = (7, 5) and the regularization parameters are λ1 = 10−2 and λ2 = 10−1; and for the LRX algorithm, the window sizes are (wout, win) = (13, 7). Figure 4 shows the outputs of the algorithms. Figure 6c shows the ROC curves, according to which the detection probability of the RMSBL algorithm exceeds those of the other algorithms; its detection rate reaches 1 when the false alarm rate approaches 10−1. On this dataset, the CRD algorithm is inferior to RMSBL but outperforms the other algorithms.

Fig. 4

Detection outputs for the San Diego dataset. a RMSBL. b CRD. c CSCR. d LRX. e RX

For the HYDICE dataset, the parameters of each algorithm are as follows: for the CRD algorithm, the window sizes are (wout, win) = (13, 7) and the regularization parameter is λ = 10−6; for the CSCR algorithm, the window sizes are (wout, win) = (9, 5) and the regularization parameters are λ1 = 10−2 and λ2 = 10−1; and for the LRX algorithm, the window sizes are (wout, win) = (13, 7). Figure 5 shows the outputs of the algorithms, and the ROC curves are shown in Fig. 6d. When the false alarm rate is below 10−3, the detection rate of the RMSBL algorithm is low; once the false alarm rate exceeds 10−3, the detection rate increases rapidly and reaches 1 at a false alarm rate of 10−2.

Fig. 5

Detection outputs for the HYDICE Urban scene dataset. a RMSBL. b CRD. c CSCR. d LRX. e RX

Fig. 6

ROC performance of the proposed method. a Airport2 dataset. b Cuprite dataset. c San Diego dataset. d HYDICE dataset

The AUC values of the algorithms are listed in Table 2, from which the performance of each algorithm can be judged more precisely. The AUC value of RMSBL is the largest on every dataset; that is, its performance is the best. On the Cuprite and HYDICE datasets, the AUC value of the RX algorithm is only slightly smaller than that of RMSBL; however, the RX algorithm does not perform well on the airport2 and San Diego datasets. The RMSBL algorithm performs well on all test datasets, especially Cuprite and HYDICE, on which its AUC value is close to 1.

Table 2 AUC (%) values for the proposed algorithm and the comparison algorithms

Finally, we report the computational cost of the compared detection methods with their optimal parameters. All experiments were conducted in MATLAB R2014a on a machine with an Intel Core i5-3470 CPU and 12 GB of RAM. The execution times (in seconds) on the experimental datasets are listed in Table 3. All algorithms except RX have higher computational costs than RMSBL.

Table 3 Execution times (in seconds) on all experimental datasets

Conclusions

This paper proposes a hyperspectral target detection algorithm based on RMSBL. The weight coefficients are calculated by L2,1-norm regularization, and the target and background residuals are obtained; target detection is then achieved by evaluating the difference between the two residuals. The proposed method is compared with the CRD, CSCR, LRX, and RX methods on four datasets, and the results demonstrate that it outperforms these state-of-the-art methods. In future research, we will explore deep learning methods for solving for the weight coefficients to achieve better results.