1 Introduction

Improvements in the spectral resolution of imaging sensors have led to hyperspectral imaging, in which a location or scene is captured in hundreds of bands. It is one of the most important technological developments in the field of remote sensing. Hyperspectral sensors can acquire contiguous or noncontiguous bands, each about 10 nm wide, over the 400–2500 nm region [1] of the electromagnetic spectrum. Because a hyperspectral image contains a large amount of spectral and spatial data, many specialized algorithms are being developed to extract the information present in it. The abundant spectral information allows different land cover types to be discriminated precisely. It can also be used to detect minerals and to support precision farming, urban planning, and similar applications.

However good the sensors are, noise in the captured images is unavoidable. Noise degrades the performance of image classification, target detection, spectral unmixing, and similar applications. Hyperspectral image (HSI) denoising has therefore become an essential pre-processing step.

Many denoising techniques have been employed in previous years. Ting Li [2] used a total variation algorithm, Hao Yang [3] used a wavelet-based denoising method, and Nikhila Haridas [4] employed Legendre-Fenchel transformation-based denoising. Each of these techniques has inherent limitations. Total variation causes the denoised image to lose edge information. Wavelet-based denoising owes much of its success to its satisfactory performance on piecewise smooth one-dimensional signals, but this smoothing ability is lost when it is applied to signals of two or more dimensions [5].

Each hyperspectral dataset consists of around 150–200 bands, or even more. When a large number of images are captured, denoising all bands of all images takes a very long computation time. A fast and robust denoising technique is therefore needed. The proposed least square-based method addresses this need.

2 Methodology

2.1 Hyperspectral Denoise Model

Denoising can be modelled as constructing a clean image X from a noisy image Y. Let each pixel be represented by \(x_{ijb}\) for X and \(y_{ijb}\) for Y, where the indices i, j, and b denote the row number, column number, and band number, respectively. Assuming the noise to be additive zero-mean Gaussian, denoted by \(w_b\), with standard deviation \(\sigma \), we can express Y as

$$\begin{aligned} y_{ijb} = x_{ijb} + w_b \end{aligned}$$
(1)

Here, the noise \(w_b\) is band-dependent. Solving Eq. 1 means estimating \(\hat{X}\) from Y. With only an L2 fidelity constraint, i.e., \(\hat{X} = \arg \min _X \Vert Y - X\Vert _2^2\), the solution is in general not unique. Prior knowledge about the ideal image can be used to regularize the denoising problem and helps in finding a better solution.
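As an illustration of Eq. 1, the following minimal sketch adds band-dependent zero-mean Gaussian noise to a synthetic cube. The function name `add_band_noise`, the per-band list `sigmas`, and the choice to draw the noise independently for every pixel are assumptions made for this example; they are not part of the original experiments.

```python
import numpy as np

def add_band_noise(X, sigmas, seed=0):
    """Return Y = X + W, where band b of W is zero-mean Gaussian with std sigmas[b]."""
    rng = np.random.default_rng(seed)
    sigmas = np.asarray(sigmas, dtype=float)
    # Broadcast one standard deviation per band over the (rows, cols, bands) cube.
    W = rng.standard_normal(X.shape) * sigmas[np.newaxis, np.newaxis, :]
    return X + W

# Hypothetical clean cube: 64 x 64 pixels, 3 bands, constant reflectance.
X = np.ones((64, 64, 3))
Y = add_band_noise(X, sigmas=[0.01, 0.05, 0.2])

# The empirical noise level of each band should be close to its sigma.
print((Y - X).std(axis=(0, 1)))
```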

2.2 Least Square Model

The application of least square to one-dimensional denoising was proposed by Ivan W. Selesnick [8]. Here, its use for two-dimensional signals, i.e., images, is explained. For a one-dimensional signal, let y be the noisy input signal and x the unknown denoised signal. The least square problem is then formulated as

$$\begin{aligned} \min _{x} \Vert y-x \Vert _{2} ^{2} + \lambda \Vert \mathbf D x \Vert _{2}^{2} \end{aligned}$$
(2)

where \(\min _x\) indicates that the expression is minimized with respect to x, \(\lambda \) is a control parameter, \(\Vert \cdot \Vert _2\) denotes the L2 norm, and D is a second-order difference matrix.
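A second-order difference matrix can be built as in the sketch below. The \((n-2)\times n\) shape with the stencil [1, -2, 1] on each row is one common convention and is assumed here; the text does not state which boundary handling was used. The helper name `second_order_diff` is hypothetical.

```python
import numpy as np

def second_order_diff(n):
    """(n-2) x n matrix whose row i holds the stencil [1, -2, 1] at columns i..i+2."""
    D = np.zeros((n - 2, n))
    idx = np.arange(n - 2)
    D[idx, idx] = 1.0
    D[idx, idx + 1] = -2.0
    D[idx, idx + 2] = 1.0
    return D

D = second_order_diff(6)
ramp = np.arange(6, dtype=float)   # a straight line has no curvature
print(D @ ramp)                    # second differences of a straight line vanish
```

Because D annihilates constant and linear signals, the penalty \(\Vert \mathbf D x \Vert _2^2\) only grows when x bends, which is why it acts as a smoothness measure.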

The main idea behind any norm-based denoising is to generate a signal or image that contains no noise but still resembles the information present in the original noisy data. This similarity is handled by the first term of Eq. 2, whose minimization forces the output to be similar to the input. The second term measures the roughness of the output, so minimizing it smooths the signal and thereby reduces the noise.

Solving Eq. 2 yields the least square signal denoising formula

$$\begin{aligned} x=(\mathbf I +\lambda \mathbf D ^{T}{} \mathbf D )^{-1}y \end{aligned}$$
(3)
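The step from Eq. 2 to Eq. 3 can be sketched as follows, where the cost function name \(J(x)\) is introduced here only for convenience. Expanding the norms and setting the gradient with respect to x to zero gives

$$\begin{aligned} J(x)&= (y-x)^{T}(y-x) + \lambda x^{T}\mathbf D ^{T}\mathbf D x \\ \nabla _x J(x)&= -2(y-x) + 2\lambda \mathbf D ^{T}\mathbf D x = 0 \\ (\mathbf I +\lambda \mathbf D ^{T}\mathbf D )x&= y \end{aligned}$$

For \(\lambda > 0\), the matrix \(\mathbf I +\lambda \mathbf D ^{T}\mathbf D \) is symmetric positive definite and hence invertible, so the minimizer is unique and is given by Eq. 3.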

The control parameter \(\lambda > 0\) controls the trade-off between smoothing y and preserving the similarity of x to y. Eq. 3 can be applied directly to denoise one-dimensional (1D) signals. For two-dimensional (2D) signals (i.e., images), it is applied first to the columns of the 2D matrix and then to its rows. The whole procedure thus consists of simple matrix operations.
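The column-then-row procedure can be sketched with NumPy as below. This is an illustrative implementation under stated assumptions, not the authors' code: the function names, the \((n-2)\times n\) form of D, the synthetic test image, and the value `lam=5.0` are all chosen for the example. The linear system is solved directly instead of forming the inverse in Eq. 3, which gives the same result.

```python
import numpy as np

def second_order_diff(n):
    """(n-2) x n second-order difference matrix with rows [1, -2, 1]."""
    D = np.zeros((n - 2, n))
    idx = np.arange(n - 2)
    D[idx, idx] = 1.0
    D[idx, idx + 1] = -2.0
    D[idx, idx + 2] = 1.0
    return D

def ls_denoise_1d(y, lam):
    """Solve (I + lam * D^T D) x = y. If y is a matrix, every column is filtered."""
    n = y.shape[0]
    D = second_order_diff(n)
    A = np.eye(n) + lam * (D.T @ D)
    return np.linalg.solve(A, y)

def ls_denoise_2d(Y, lam):
    """Apply Eq. 3 to the columns of Y, then to its rows."""
    X = ls_denoise_1d(Y, lam)         # filter along each column
    X = ls_denoise_1d(X.T, lam).T     # filter along each row
    return X

# Synthetic smooth image with additive Gaussian noise (illustration only).
rng = np.random.default_rng(0)
r = np.linspace(0.0, 1.0, 64)
clean = np.outer(np.sin(2 * np.pi * r), np.cos(2 * np.pi * r))
noisy = clean + 0.2 * rng.standard_normal(clean.shape)
denoised = ls_denoise_2d(noisy, lam=5.0)

print("MSE before:", np.mean((noisy - clean) ** 2))
print("MSE after: ", np.mean((denoised - clean) ** 2))
```

For large images, a sparse or banded solver would avoid building dense \(n\times n\) matrices; the dense version is kept here for readability.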

2.3 Experimental Procedure

A grayscale image is a two-dimensional matrix in which each element represents the corresponding pixel value. An image captured at multiple wavelengths is a three-dimensional matrix whose third dimension represents the spectral bands. For example, a color image has three bands, namely red, green, and blue. Similarly, a hyperspectral image consists of hundreds of bands. Most pre-processing techniques, including denoising, operate on each band separately.
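Band-by-band processing of a three-dimensional cube might look like the following sketch. It assumes a (rows, cols, bands) array layout and precomputes the two smoothing matrices once so that they can be reused for every band; the names `smoother` and `ls_denoise_cube` and the random test cube are hypothetical.

```python
import numpy as np

def smoother(n, lam):
    """Return (I + lam * D^T D)^{-1} for length-n signals, D = second-order difference."""
    D = np.diff(np.eye(n), n=2, axis=0)    # rows hold the stencil [1, -2, 1]
    return np.linalg.inv(np.eye(n) + lam * (D.T @ D))

def ls_denoise_cube(cube, lam):
    """Denoise every band of a (rows, cols, bands) cube independently."""
    rows, cols, bands = cube.shape
    H_r = smoother(rows, lam)    # filters the columns of a band (length = rows)
    H_c = smoother(cols, lam)    # filters the rows of a band (length = cols)
    out = np.empty_like(cube, dtype=float)
    for b in range(bands):
        # Columns first, then rows: H_r @ band @ H_c^T.
        out[:, :, b] = H_r @ cube[:, :, b] @ H_c.T
    return out

rng = np.random.default_rng(0)
cube = rng.standard_normal((20, 30, 4))    # stand-in for a small hyperspectral cube
out = ls_denoise_cube(cube, lam=2.0)
print(out.shape)
```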

Denoising may be assessed either by simulating noise or by identifying bands that are already noisy. The latter is more appropriate, because simulated noise may not always match real-world conditions. The choice of \(\lambda \) is crucial in least square-based denoising. To measure computation time, all the bands in each dataset are denoised and the total time required is recorded.

Least square-based denoising of an image involves only simple matrix operations and is therefore expected to be faster than most other denoising methods, which involve complex differential equations. Legendre-Fenchel transform-based denoising uses the concept of duality [7]; the wavelet technique uses the Discrete Wavelet Transform (DWT) and the inverse DWT; the total variation method is solved through a primal-dual method, and the solution is non-trivial. Comparing these methods, the proposed LS technique is conceptually the simplest and is expected to run relatively fast on any system or software.

2.4 Data Sources

For experimentation, the following standard hyperspectral image datasets [4, 6, 7] were used: Salinas scene, Pavia University, and Indian Pines. Salinas scene and Indian Pines were captured by the NASA AVIRIS sensor. Salinas scene was collected in 224 bands over Salinas Valley, California, and is characterized by high spatial resolution (3.7 m pixels). The area covered comprises 512 lines by 217 samples. The Indian Pines scene was gathered over the Indian Pines test site in north-western Indiana and consists of \(145\,\times \,145\) pixels and 220 spectral reflectance bands in the wavelength range 0.4–2.5 \(\upmu \)m. The Pavia University image was captured by the ROSIS sensor during a flight campaign over Pavia, northern Italy. The image has 103 bands with dimensions \(610\,\times \,340\). Some of the bands contain no information and need to be discarded during experimentation.

3 Results and Analysis

3.1 Visual Interpretation

The proposed method is tested on the original data without simulating noise. The evaluation of denoising quality is based primarily on visual analysis and on an a posteriori measure, the signal-to-noise ratio (SNR). For further analysis of the least square technique, noise is simulated on the ground truth of the Indian Pines image, which is then denoised.

Denoising effectiveness is analyzed by varying the control parameter \(\lambda \); the outputs obtained are shown in Fig. 1. As the value of \(\lambda \) increases, more noise is removed, but the image becomes smoother and edge information is gradually lost. There is thus a trade-off between the degree of noise removal and the smoothness of the image. The control parameter \(\lambda \) should therefore be set according to the amount of noise and the precision of information required from the noisy image.
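The trade-off can be illustrated numerically on a 1D signal containing a single sharp edge. The sketch below is a toy example under assumed settings (signal length, noise level, and the list of \(\lambda \) values are arbitrary); it reports the two terms of Eq. 2 and the height of the edge after smoothing.

```python
import numpy as np

def ls_denoise_1d(y, lam):
    """Solve (I + lam * D^T D) x = y with a second-order difference matrix D."""
    n = y.shape[0]
    D = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)

rng = np.random.default_rng(1)
clean = np.where(np.arange(200) < 100, 0.0, 1.0)    # one sharp edge at sample 100
noisy = clean + 0.1 * rng.standard_normal(200)

results = []
for lam in (0.1, 1.0, 10.0, 100.0):
    x = ls_denoise_1d(noisy, lam)
    fidelity = np.sum((noisy - x) ** 2)             # first term of Eq. 2
    roughness = np.sum(np.diff(x, n=2) ** 2)        # second term of Eq. 2 without lam
    edge_jump = x[100] - x[99]                      # remaining height of the edge
    results.append((lam, fidelity, roughness, edge_jump))
    print(lam, fidelity, roughness, edge_jump)
```

As \(\lambda \) grows, the roughness term can only decrease and the fidelity term can only increase, and the jump across the edge shrinks, which is the loss of edge information described above.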

Fig. 1 Least square denoised images with different \(\lambda \) values

Figs. 2 and 3 compare the different denoising methods applied to the Pavia University (band 15) and Salinas scene (band 2) datasets. Visual inspection shows that LS can perform as well as LF and better than the wavelet and TV denoising techniques.

Fig. 2 Clockwise from top-left: noisy band (band 15) of Pavia University; TV denoised; LF denoised; proposed LS denoised; wavelet denoised

Fig. 3 Clockwise from top-left: noisy band (band 2) of Salinas scene; TV denoised; LF denoised; proposed LS denoised; wavelet denoised

3.2 SNR Calculation

The numerical measure of denoised image quality is the signal-to-noise ratio (SNR), the ratio of signal power to noise power, expressed in decibels (dB). Each band is captured at a different wavelength, and every element has different reflectance properties at different wavelengths. Thus, no single band can be used as a reference band for calculating SNR. A new approach to SNR calculation for hyperspectral images is used by Linlin Xu [7]. For a given band, SNR may be calculated as

$$\begin{aligned} \mathrm{SNR} = 10 \log _{10} \frac{\sum _{ij} \hat{x}_{ijb}^2}{\sum _{ij}(\hat{x}_{ijb} - m_b)^2} \end{aligned}$$
(4)

where \(\hat{x}_{ijb}\) is the denoised pixel and \(m_b\) is the mean value of \(\{\hat{x}_{ijb}\}\) in an area where the pixels are homogeneous. The SNR estimate therefore relies on the selection of the homogeneous area. Class labels can be used to identify homogeneous areas, since pixels belonging to the same class are more similar to each other. Calculating SNR over all the pixels has the added advantage of reducing the bias that could arise from selecting one particular homogeneous area.
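One possible reading of Eq. 4 with class labels as homogeneous areas is sketched below: \(m_b\) is taken as the mean of each labelled class region, and the sums run over all labelled pixels. The function name `band_snr_db`, the convention that label 0 marks unlabelled pixels, and the synthetic two-class data are assumptions made for this example.

```python
import numpy as np

def band_snr_db(band, labels):
    """Eq. 4 for one band, with m_b computed separately for each labelled class."""
    signal = 0.0
    noise = 0.0
    for c in np.unique(labels):
        if c == 0:                 # assumption: label 0 means "unlabelled"
            continue
        region = band[labels == c].astype(float)
        signal += np.sum(region ** 2)
        noise += np.sum((region - region.mean()) ** 2)
    return 10.0 * np.log10(signal / noise)

# Synthetic ground truth with two classes and two bands of different noise levels.
rng = np.random.default_rng(0)
labels = np.zeros((64, 64), dtype=int)
labels[:, :32] = 1
labels[:, 32:] = 2
base = np.where(labels == 1, 10.0, 20.0)
cube = np.stack([base + 1.0 * rng.standard_normal(base.shape),
                 base + 0.5 * rng.standard_normal(base.shape)], axis=2)

snrs = [band_snr_db(cube[:, :, b], labels) for b in range(cube.shape[2])]
print("per-band SNR (dB):", snrs)
print("average over bands:", np.mean(snrs))
```

The function assumes that at least one labelled region contains some variation; a perfectly flat band would give a zero denominator.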

Table 1 shows the SNR values calculated after denoising particular bands in each dataset. Table 2 shows the SNR values obtained after all the bands are denoised using the various denoising models. The SNR values in Table 2 are averages over all bands.

Table 1 Signal-to-noise ratio for particular bands of various datasets

These two tables show that the proposed least square denoising gives SNR values comparable to those of the other denoising techniques. Increasing \(\lambda \) removes more noise and gives higher SNR values in both the TV and the proposed LS denoising models, but higher \(\lambda \) values also deplete edge information. For this reason, image quality after denoising must be interpreted both numerically and visually.

Table 2 Signal-to-noise ratio after denoising all bands for various datasets

3.3 Computational Time Requirements

The main advantage of least square is that its model is very simple and thus needs less computation time. This is demonstrated by the average time required to denoise all the bands in each of the three datasets (Table 3). The time taken by the total variation and Legendre-Fenchel techniques is very high compared to the wavelet and least square techniques, and LS is in turn far faster than wavelet. Low time requirements become important in real-time applications, where huge amounts of data are collected by the sensors every minute and need to be pre-processed. In such situations, techniques like LS gain more importance than slow-processing techniques.
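A timing measurement of this kind could be set up as in the sketch below. It is a generic harness, not the one used to produce Table 3: the function `time_all_bands`, the number of repeats, the random cube, and the LS band denoiser with `lam = 1.0` are assumptions, and measured times depend on the machine.

```python
import time
import numpy as np

def time_all_bands(denoise_band, cube, repeats=3):
    """Average wall-clock seconds needed to denoise every band of `cube`."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        for b in range(cube.shape[2]):
            denoise_band(cube[:, :, b])
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / repeats

def make_ls_denoiser(rows, cols, lam):
    """Build a band denoiser that reuses the two LS smoothing matrices."""
    def smoother(n):
        D = np.diff(np.eye(n), n=2, axis=0)
        return np.linalg.inv(np.eye(n) + lam * (D.T @ D))
    H_r, H_c = smoother(rows), smoother(cols)
    return lambda band: H_r @ band @ H_c.T

rng = np.random.default_rng(0)
cube = rng.standard_normal((64, 48, 10))     # stand-in for a real dataset
ls_band = make_ls_denoiser(64, 48, lam=1.0)
print("average seconds for all bands:", time_all_bands(ls_band, cube))
```

Any other band denoiser with the same call signature can be passed to `time_all_bands` to compare methods under identical conditions.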

Table 3 Time requirement of different denoise techniques

4 Conclusion

This paper presents a novel method of denoising hyperspectral images that can be used as an alternative to many of the denoising methods proposed to date. The proposed algorithm denoises images to a satisfactory level in very little time. It has been tested on different datasets to analyze its denoising capability and time consumption and thereby reach a more precise conclusion. The approach used for noise level approximation is simple, yet it gives a much better idea of the SNR of each band and of the whole dataset, and hence a clear picture of the denoising quality. However, another technique is still needed to measure the loss of edge information.

As future work, improving the denoising quality by altering the difference matrix D will be considered. Further, hyperspectral images can be denoised with the LS technique prior to classification to assess the improvement in classification accuracy.