
1 Introduction

In recent years, deep learning has been widely adopted in computer vision and has gradually been introduced into image classification, natural language processing, and remote sensing. At present, the autoencoder (AE), the restricted Boltzmann machine, the deep belief network, and the convolutional neural network are the most common deep learning models. The AE is a reconstruction-based, unsupervised framework, so the extracted features contain enough information to represent the input signal. Owing to this property, the AE is well suited to hyperspectral unmixing: the main features of the original data are first extracted by the encoder, and the original data is then reconstructed by the decoder, yielding the abundance coefficients and the endmember matrix, respectively. Because the traditional AE is unconstrained, it can simply copy the input to the output, or make only minor changes that produce small reconstruction errors, so model performance is usually poor. The denoising autoencoder (DAE), built on the AE, adds random noise to the input and trains the network to reconstruct the noise-free input.

In this paper, a deep denoising autoencoder network (DDAE) for hyperspectral unmixing is proposed. According to the physical meaning of unmixing, both the abundances and the endmembers must be nonnegative, and the abundance coefficients must sum to one. Therefore, when using a DAE for hyperspectral unmixing, the weights of the hidden layer and the decoding layer are constrained to be nonnegative, and the hidden layer is constrained to satisfy the sum-to-one requirement. If the noise and the endmembers are estimated incorrectly, unmixing performance degrades dramatically [1]. To address this problem, an L2,1-norm constraint is added to the objective function as a regularization term, which not only reduces the redundant rows of the encoder but also takes advantage of the multiple sparsity between adjacent pixels, improving the performance of abundance estimation. In the experiments, the DDAE algorithm outperforms the other commonly used unmixing algorithms on both simulated and real data.

2 The Model of the Denoising Autoencoder

A traditional AE without any constraint can simply copy the input to the output, or make only minor changes that produce small reconstruction errors, so model performance is usually poor. In this context, Vincent et al. proposed the DAE algorithm [2] on top of the traditional AE: noise is added to the input, and the "corrupted" noisy samples are then used to reconstruct the "clean" noise-free input, which is the main difference from the traditional AE. This training strategy also enables the DAE to learn more of the essential characteristics of the input data. In this paper, additive Gaussian noise is added to the data, and experiments are conducted separately for different noise levels.
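
As a rough illustration of this corruption step, the following NumPy sketch adds zero-mean Gaussian noise to a clean data matrix; the noise level `sigma` is a hypothetical parameter, since the exact values used in the experiments are not specified here.

```python
import numpy as np

def corrupt(X, sigma=0.05, seed=0):
    """Additive Gaussian corruption of the clean input X.

    X     : (bands, pixels) hyperspectral data matrix.
    sigma : noise standard deviation (hypothetical value, varied per experiment).
    """
    rng = np.random.default_rng(seed)
    return X + sigma * rng.standard_normal(X.shape)
```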

The DAE network includes the following two parts:

  1. Encode the input data X with the encoder \(f(x)\) to obtain the hidden layer output S (that is, the abundance coefficients):

$${\mathbf{S}} = f({\mathbf{X}}) = \sigma ({\mathbf{WX}})$$
(1)

The activation function is represented by \(\sigma (x)\), and the weight of the encoder is represented by W.

  2. Decoder \(g(x)\) uses S to reconstruct the data:

$${\hat{\mathbf{X}}} = g({\mathbf{S}}) = {\mathbf{AS}}$$
(2)

The decoder weight (i.e., endmember) is represented by A, and \({\hat{\mathbf{X}}}\) represents the reconstructed data.

The network learns the weights and the hidden-layer representation by minimizing the average reconstruction error between the output reconstructed from the corrupted input and the clean input, defined as follows:

$$J({\mathbf{W}},{\mathbf{A}}) = \frac{1}{n}\sum\limits_{i = 1}^{n} {\frac{1}{2}} \left\| {g(f({\mathbf{x}}_{i} )) - {\mathbf{x}}_{i} } \right\|_{2}^{2}$$
(3)
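
A minimal sketch of Eqs. (1)-(3) in NumPy might look as follows, assuming pixels are stored as the columns of X and, for concreteness, using ReLU as the activation \(\sigma\) (the choice made in Sect. 3); the actual optimization of W and A is omitted.

```python
import numpy as np

def dae_forward(W, A, X):
    """Eq. (1): S = sigma(W X); Eq. (2): X_hat = A S."""
    S = np.maximum(W @ X, 0)      # hidden layer, i.e., abundance estimate
    return A @ S, S

def reconstruction_loss(W, A, X_noisy, X_clean):
    """Eq. (3): average reconstruction error against the clean input."""
    X_hat, _ = dae_forward(W, A, X_noisy)
    n = X_clean.shape[1]          # number of pixels (training samples)
    return (0.5 / n) * np.sum((X_hat - X_clean) ** 2)
```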

Building on these formulas, we next address the unmixing problem by incorporating the nonnegativity and sum-to-one constraints.

3 Deep Denoising Autoencoder Networks for Hyperspectral Unmixing

In this section, the proposed deep denoising autoencoder is introduced in detail. Figure 1 shows the network structure of the proposed DDAE algorithm.

Fig. 1 Network structure of the proposed DDAE

For unmixing problems, the sum-to-one property of the abundance coefficients is an important constraint. To satisfy it, the input data \({\mathbf{X}}\) and the weight matrix A are each augmented with a constant row of ones, and the augmented matrices are denoted by \({\bar{\mathbf{X}}}\) and \({\bar{\mathbf{A}}}\), respectively:

$${\bar{\mathbf{X}}} = \begin{bmatrix} {\mathbf{X}} \\ 1_{m}^{T} \end{bmatrix},\quad {\bar{\mathbf{A}}} = \begin{bmatrix} {\mathbf{A}} \\ 1_{l}^{T} \end{bmatrix}$$
(4)

Thus \(1_{l}^{T} {\mathbf{s}}_{j} = 1\); that is, each column vector of the abundance matrix satisfies the sum-to-one constraint.
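
In code, this augmentation is a single row of ones appended to both matrices; a sketch under the same column-wise data layout:

```python
import numpy as np

def augment(X, A):
    """Eq. (4): append a row of ones to X and A, so that fitting
    A_bar @ S to X_bar pushes each abundance column toward sum-to-one."""
    X_bar = np.vstack([X, np.ones((1, X.shape[1]))])  # m pixels
    A_bar = np.vstack([A, np.ones((1, A.shape[1]))])  # l endmembers
    return X_bar, A_bar
```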

In practical applications, the data is often accompanied by noise, and the presence of noise together with incorrect endmember estimation causes unmixing performance to drop sharply. A regularization term \(\left\| {{\mathbf{W}}^{T} } \right\|_{2,1}\) can be introduced to reduce the redundant rows of the encoder, but this strategy cannot reflect the multiple sparsity between adjacent pixels. In this paper, we improve the term to \(\left\| {\sigma ({\mathbf{WX}})} \right\|_{2,1}\), which not only reduces redundant endmembers but also introduces multiple sparsity to improve the performance of abundance estimation. Summarizing the above analysis, the objective function of W is defined as:

$$J({\mathbf{W}}) = \frac{1}{2}\left\| {{\bar{\mathbf{A}}}\sigma ({\mathbf{WX}}) - {\bar{\mathbf{X}}}} \right\|_{F}^{2} + \lambda \left\| {\sigma ({\mathbf{WX}})} \right\|_{2,1}$$
(5)

According to the requirements of linear unmixing, both the encoding and decoding functions should be linear mappings. At the same time, the activation function must ensure that the hidden layer (the abundances S) is nonnegative. Therefore, the ReLU function, defined as \(\sigma (x) = \hbox{max} (x,0)\), is selected as the activation function. However, ReLU has the disadvantage that it can cause the gradient to grow excessively; according to prior work, this can be mitigated with an L1-norm or L2-norm penalty [3]. In the proposed network, we use the L2,1-norm to address the problems caused by ReLU [1]. The unmixing requirements also demand that the decoder weights be nonnegative, that is, \({\mathbf{A}} \ge 0\). We likewise enforce this with the ReLU function, which guarantees the nonnegativity of A during the optimization process.
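
Putting these pieces together, the objective of Eq. (5) can be evaluated as in the following sketch. The regularization weight `lam` is a hypothetical hyperparameter, and nonnegativity of A is assumed here to be maintained by applying ReLU to it during optimization, consistent with the text above.

```python
import numpy as np

def l21_norm(M):
    """L2,1 norm: sum of the Euclidean norms of the rows of M."""
    return np.sum(np.sqrt(np.sum(M ** 2, axis=1)))

def objective(W, A_raw, X_noisy, X_bar, lam=0.1):
    """Eq. (5): augmented data-fit term plus L2,1 penalty on the abundances.

    X_noisy : corrupted input data; X_bar : augmented clean data (Eq. 4).
    """
    A = np.maximum(A_raw, 0)                       # keep decoder weights nonnegative
    A_bar = np.vstack([A, np.ones((1, A.shape[1]))])
    S = np.maximum(W @ X_noisy, 0)                 # ReLU keeps abundances nonnegative
    fit = 0.5 * np.sum((A_bar @ S - X_bar) ** 2)   # squared Frobenius norm
    return fit + lam * l21_norm(S)
```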

4 Experimental Results and Analysis

In this section, experiments on a set of real data are conducted to verify the performance of the proposed algorithm, which is compared with the SUnSAL, SUnSAL-TV, and SMP algorithms.

The experiments used the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) Cuprite data as the hyperspectral data [4]. The data size is 250 × 191 pixels with 188 spectral bands. The mineral map generated by the Tricorder 3.3 software and the reconstructed abundance maps of each algorithm are shown in Fig. 2.

Fig. 2 Abundance images of three different elements and reconstructed abundance images of the different algorithms

In the experiments, the performance of the unmixing algorithms is evaluated using sparsity and reconstruction error. Sparsity is the number of non-zero values in the abundance matrix of the hyperspectral image; to prevent negligible values from being counted, an abundance value greater than 0.001 is defined as non-zero [5]. The RMSE is defined as follows [6]:

$${\text{RMSE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\sqrt {\frac{1}{m}\sum\limits_{j = 1}^{m} {\left( {{\mathbf{X}}_{ij} - {\hat{\mathbf{X}}}_{ij} } \right)^{2} } } }$$
(6)

where X represents the original hyperspectral image, \({\hat{\mathbf{X}}}\) its reconstruction, n the number of bands, and m the number of pixels. The lower the RMSE, the better the unmixing quality.
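
Both evaluation measures are straightforward to compute; a sketch, assuming X and X_hat are (bands × pixels) matrices and S is the abundance matrix:

```python
import numpy as np

def sparsity(S, tol=1e-3):
    """Number of non-zero abundances, counting values above the 0.001 threshold [5]."""
    return int(np.sum(S > tol))

def rmse(X, X_hat):
    """Eq. (6): per-band root-mean-square error over pixels, averaged over bands."""
    per_band = np.sqrt(np.mean((X - X_hat) ** 2, axis=1))
    return float(np.mean(per_band))
```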

Table 1 lists the sparsity and RMSE of each algorithm. The sparsity and reconstruction error of the DDAE algorithm are the smallest, far better than those of the other algorithms, indicating that the proposed algorithm achieves high unmixing performance on real hyperspectral images. From Fig. 2, it can be seen that the reconstructed abundance images have fewer noise points and retain the edge and feature information of the image, coming closer to the abundance distribution maps generated by the Tricorder software.

Table 1 Sparsity and reconstruction errors of each algorithm

5 Conclusion

In this paper, a deep denoising autoencoder network is proposed to solve the hyperspectral unmixing problem. On the basis of the DAE, we add nonnegativity and sum-to-one constraints on the abundance coefficients, and add an L2,1-norm constraint to the objective function as a regularization term, which makes good use of the multiple sparsity between adjacent pixels and improves the accuracy of abundance estimation. The experimental results show that the DDAE algorithm is superior to the other comparison algorithms, especially for hyperspectral data with high noise.