Abstract
Band selection is of great importance for hyperspectral image processing, as it can effectively reduce data redundancy and computation time. When class labels are unknown, it is very difficult to select an effective band subset. In this paper, an unsupervised band selection algorithm is proposed that preserves the original information of the hyperspectral image while selecting a low-redundancy band subset. First, a search criterion is designed to find the band subset with maximum information entropy. Selecting a low-redundancy spectral band subset that maximizes this criterion is challenging because the problem is NP-hard. To overcome this, a double-graph model is proposed to capture the correlations between spectral bands while making full use of the spatial information. An improved determinantal point process (DPP) algorithm is then presented as the search method for finding a low-redundancy band subset from the double-graph model. Experimental results verify that our algorithm achieves better performance than other state-of-the-art methods.
1 Introduction
Hyperspectral data is a three-dimensional image that combines spatial information with abundant spectral information, which has driven great progress in hyperspectral imaging for remote sensing [1, 2]. Multidimensional spectral bands can improve the representation of ground objects. However, not all bands play an important role in hyperspectral image processing [3], because high-dimensional hyperspectral data can cause many problems, such as information redundancy, noisy bands, and the Hughes phenomenon. Therefore, an effective dimension-reduction method is necessary for hyperspectral data. Traditional dimensionality-reduction methods fall into two main types: band selection and feature extraction [5,6,7]. Compared with feature extraction, band selection preserves the original information as much as possible.
Research on band selection falls into two main directions: supervised band selection [8, 9] and unsupervised band selection [10,11,12]. Supervised band selection methods select a spectral band subset that has a strong correlation with class labels. However, labelling hyperspectral images is time-consuming, so unsupervised band selection is more widely applicable. In this paper, we propose an unsupervised band selection method that selects a low-redundancy band subset with the maximum amount of information.
The unsupervised band selection process mainly consists of two parts: the selection criterion and the search strategy. Existing unsupervised selection methods can be divided into ranking-based methods [13], clustering-based methods [17], and greedy algorithms [15]. However, most of these band selection methods consider only the properties of the band combination while ignoring the latent structure between spectral bands in the high-dimensional space [3]. This defect causes a lack of global consideration. In addition, most band selection methods do not measure the correlation between pairwise bands using spatial information, which would make the measurement of spectral redundancy more comprehensive. In [3], Yuan et al. proposed a multi-graph model that measures similarity between spectral bands with a spectral clustering method. However, spectral clustering has very high computational complexity and memory requirements. In view of this, this paper proposes a double-graph structure to describe the complex correlations between spectral bands using both pixel information and the neighbourhood information of pixels.
In addition, selecting a low-redundancy band subset from a graph model with traditional search methods is challenging. The determinantal point process (DPP) [16] is a probabilistic model that can select a diverse subset of features. The k-DPP [17], an improved version of the DPP, can select a low-redundancy band subset of dimension k. However, the original k-DPP can only select from a single graph model, without making full use of the spatial information of hyperspectral images, and it does not consider the physical properties of each band during selection.
To solve these problems, a double-graph DPP model is proposed that makes full use of pixel information and its neighbourhood information. To obtain a high-performance band subset, a spectral band selection criterion, maximum band subset information [4], is used, meaning that the selected spectral band subset carries a rich amount of information.
2 The Proposed Framework
2.1 Hyperspectral Data Representation
A hyperspectral dataset is represented by \( {\text{B}} = \left\{ {{\text{b}}_{1} ,{\text{b}}_{2} ,{\text{b}}_{3} , \ldots ,{\text{b}}_{l} } \right\} \in R^{n \times l} \), where n is the number of pixels in each spectral band and l is the total number of spectral bands. \( {\text{b}}_{i} \left( {1 \le i \le l} \right) \) denotes the i-th spectral band. Given a pixel \( p_{x}^{j} \), we use \( \hat{p}_{x}^{j} \) to denote the mean of the neighbourhood of \( p_{x}^{j} \). The neighbourhood information then forms a new dataset \( {\hat{\text{B}}} = \left\{ {{\hat{\text{b}}}_{1} ,{\hat{\text{b}}}_{2} ,{\hat{\text{b}}}_{3} , \ldots ,{\hat{\text{b}}}_{l} } \right\} \in R^{s \times l} \), where s is the number of entries in each \( {\hat{\text{b}}}_{i } \left( {1 \le i \le l} \right) \).
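The paper does not specify how the neighbourhood means are computed. As an illustrative sketch (assuming a 3 × 3 window with edge-replicated padding, and s = n), the function below, whose name is our own, builds \( \hat{B} \) from a hyperspectral cube:

```python
import numpy as np

def neighbourhood_mean_bands(cube):
    """cube: (H, W, l) hyperspectral cube.
    Returns an (H*W, l) matrix whose row for pixel p holds the mean of
    the 3x3 neighbourhood of p in every band (edge-replicated padding)."""
    H, W, l = cube.shape
    padded = np.pad(cube, ((1, 1), (1, 1), (0, 0)), mode="edge")
    acc = np.zeros((H, W, l), dtype=float)
    for dy in range(3):          # slide the 3x3 window by summing 9 shifts
        for dx in range(3):
            acc += padded[dy:dy + H, dx:dx + W, :]
    return (acc / 9.0).reshape(H * W, l)
```

With this choice, each column \( {\hat{\text{b}}}_{i} \) has the same length as \( {\text{b}}_{i} \), so both Gram matrices below are l × l.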
2.2 Double Graph Model
In [3], a multi-graph structure built with a spectral clustering method was proposed to represent the relationships between bands. In this section, we construct a double-graph model to capture the complex relationships between pairwise spectral bands using the statistical properties of the spatial domain. In the graph model, each vertex represents a band \( {\text{b}}_{i} \) and each edge corresponds to the correlation between a pair of bands. The correlations of the vertices can be described by an adjacency matrix, which is defined as a Gram matrix. The adjacency matrix of the bands is defined as follows:
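The display equation here did not survive extraction. A plausible reconstruction, consistent with the Gram-matrix definition in the text and with B treated as an n × l matrix whose columns are the bands, is:

```latex
L_{B} \;=\; B^{\mathsf{T}} B, \qquad
(L_{B})_{ij} \;=\; \mathbf{b}_{i}^{\mathsf{T}} \mathbf{b}_{j}
\tag{1}
```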
The adjacency matrix of the neighbourhood information data can be expressed as follows:
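This equation also did not survive extraction; by symmetry with the definition of \( L_{B} \), it is plausibly:

```latex
L_{\hat{B}} \;=\; \hat{B}^{\mathsf{T}} \hat{B}
\tag{2}
```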
\( L_{B} \) and \( L_{{\hat{B}}} \) are both l × l positive semidefinite similarity matrices. According to [18], it is clear that
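The equation referred to here is missing from the extracted text; following the geometric interpretation described immediately below, it is plausibly:

```latex
\det\left( L_{Y} \right) \;=\; \operatorname{Vol}^{2}\!\left( \left\{ \mathbf{b}_{i} \right\}_{i \in Y} \right)
\tag{3}
```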
where \( { \det }\left( {{\text{L}}_{Y} } \right) \) equals the squared k-dimensional volume of the parallelepiped spanned by the columns \( \left\{ {{\text{b}}_{i} } \right\}_{i \in Y} \) of B corresponding to the bands in subset Y, and k is the dimension of subset Y.
2.3 Search Criterion
To select a band subset with rich information, the evaluation criterion of maximum information entropy [4] is used. The information entropy of a spectral band subset is defined as:
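Equation (4) did not survive extraction. One reading consistent with the description below, treating the pixel values \( p_{x}^{i} \) of each band as normalized probabilities, is the summed per-band entropy of the selected subset Y:

```latex
H_{k} \;=\; -\sum_{i \in Y} \sum_{x} p_{x}^{i} \log p_{x}^{i}
\tag{4}
```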
where \( p_{x}^{i} \) represents the pixel values of the i-th spectral band and k is the number of bands in the subset. Equation (4) serves as the evaluation criterion for the performance of each band subset, and is called MI for short.
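As a concrete illustration, the entropy of each band can be estimated from a grey-level histogram; the 64-bin histogram below is our own assumption, not a value from the paper, as are the function names:

```python
import numpy as np

def band_entropy(band, bins=64):
    """Estimate the Shannon entropy (in bits) of one spectral band
    from its grey-level histogram."""
    hist, _ = np.histogram(band, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]  # 0 * log 0 is taken as 0
    return float(-np.sum(p * np.log2(p)))

def subset_entropy(B, Y, bins=64):
    """Total entropy H_k of the band subset Y (columns of B)."""
    return sum(band_entropy(B[:, i], bins) for i in Y)
```

A uniform-noise band scores higher than a constant one, matching the intuition that MI favours information-rich bands.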
2.4 MI-DPP Band Selection Method
The k-DPP [17] can be expressed as follows:
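Equation (5) did not survive extraction. Following the DPP literature [16], it is plausibly the ordinary DPP probability of a subset \( Y^{\prime} \):

```latex
P_{L}\left( Y^{\prime} \right) \;=\; \frac{\det\left( L_{Y^{\prime}} \right)}{\det\left( L + I \right)}
\tag{5}
```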
where \( P_{L} \left( {Y^{\prime } } \right) \) represents the probability of any band subset \( Y^{\prime } \) with dimension k. According to Eq. (5), the k-DPP can be expressed as follows:
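Equation (6) is also missing; in the k-DPP formulation [17] it is plausibly the DPP conditioned on subsets of size k:

```latex
P_{L}^{k}\left( Y \right) \;=\; \frac{\det\left( L_{Y} \right)}{\sum_{\left| Y^{\prime} \right| = k} \det\left( L_{Y^{\prime}} \right)}
\tag{6}
```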
According to reference [17], we can get
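Equation (7) did not survive extraction. In the eigendecomposition form of the k-DPP [17], matching the mixture description and the notation of Eq. (8) below, it is plausibly:

```latex
P_{L}^{k} \;=\; \frac{1}{e_{k}^{N}} \sum_{\left| Y \right| = k} P^{V_{Y}} \prod_{n \in Y} \lambda_{n}
\tag{7}
```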
Equation (7) is the mathematical expression of k-DPP which shows that \( P_{L}^{k} \) is a mixture of the elementary DPPs \( P^{{V_{Y} }} \).
With full use of the spatial information, our algorithm can be written as follows:
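Equation (8) is missing from the extracted text. A reconstruction that is term-by-term consistent with the description that follows (a u/\( \hat{u} \)-weighted mixture of the elementary DPPs of the two graphs) is:

```latex
P_{L_{B}, L_{\hat{B}}}^{k}\left( Y \right)
\;=\;
\frac{u}{e_{B,k}^{N}} \sum_{\left| Y \right| = k} P^{V_{Y}^{B}} \prod_{n \in Y} \lambda_{n}^{B}
\;+\;
\frac{\hat{u}}{e_{\hat{B},k}^{N}} \sum_{\left| Y \right| = k} P^{V_{Y}^{\hat{B}}} \prod_{n \in Y} \lambda_{n}^{\hat{B}}
\tag{8}
```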
where u and \( \hat{u} \) are scalars that balance the weights of the two k-DPPs. Equation (8) shows that our algorithm is a mixture of two elementary DPPs, \( P^{{V_{Y}^{B} }} \) and \( P^{{V_{Y}^{{\hat{B}}} }} \), where \( \lambda_{n}^{B} \) and \( \lambda_{n}^{{\hat{B}}} \) are the eigenvalues of the adjacency matrices \( L_{B} \) and \( L_{{\hat{B}}} \), respectively. \( e_{B,k}^{N} = \sum\nolimits_{\left| Y \right| = k} {\prod\nolimits_{n \in Y} {\lambda_{n}^{B} } } \) and \( e_{{\hat{B},k}}^{N} = \sum\nolimits_{\left| Y \right| = k} {\prod\nolimits_{n \in Y} {\lambda_{n}^{{\hat{B}}} } } \) are the elementary symmetric polynomials of the eigenvalues of \( L_{B} \) and \( L_{{\hat{B}}} \), respectively. \( P_{{L_{B} ,L_{{\hat{B}}} }}^{k} \left( Y \right) \) is the probability of sampling a band subset Y. Combined with the search criterion MI described in Sect. 2.3, we call the proposed algorithm MI-DPP. Details of the sampling process of MI-DPP are summarized in Algorithm 1.
We repeat Algorithm 1 α times to obtain α spectral band subsets Y of dimension k. Then we select, among the α subsets, the band subset that maximizes the search criterion \( H_{k} \). With the proposed MI-DPP, the selected band subset has the advantages of low redundancy and high information.
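Algorithm 1 is not reproduced here, but the sampling-and-selection loop can be sketched as follows. This is a hedged reading, not the authors' code: it assumes the standard two-phase k-DPP sampler of Kulesza and Taskar [17], interprets the mixture in Eq. (8) as picking one of the two graphs with probability proportional to u and \( \hat{u} \) before each draw, estimates band entropy from a 64-bin histogram, and assumes k does not exceed the rank of the kernels; all function names are ours.

```python
import numpy as np

def elem_sympoly(lams, k):
    """E[l, n]: elementary symmetric polynomial e_l of the first n eigenvalues."""
    N = len(lams)
    E = np.zeros((k + 1, N + 1))
    E[0, :] = 1.0
    for l in range(1, k + 1):
        for n in range(1, N + 1):
            E[l, n] = E[l, n - 1] + lams[n - 1] * E[l - 1, n - 1]
    return E

def sample_k_dpp(L, k, rng):
    """Two-phase k-DPP sampling from a PSD kernel L (assumes k <= rank(L))."""
    lams, vecs = np.linalg.eigh(L)
    lams = np.clip(lams, 0.0, None)
    E = elem_sympoly(lams, k)
    # Phase 1: choose k eigenvectors, each with probability proportional
    # to the eigenvalue products captured by the symmetric polynomials.
    J, l = [], k
    for n in range(len(lams), 0, -1):
        if l == 0:
            break
        if rng.random() < lams[n - 1] * E[l - 1, n - 1] / E[l, n]:
            J.append(n - 1)
            l -= 1
    V = vecs[:, J]
    # Phase 2: sample one item per step from the elementary DPP spanned by V.
    items = []
    while V.shape[1] > 0:
        probs = np.sum(V ** 2, axis=1)
        probs /= probs.sum()
        i = int(rng.choice(len(probs), p=probs))
        items.append(i)
        # Project V onto the subspace orthogonal to coordinate i.
        j = int(np.argmax(np.abs(V[i, :])))
        Vj = V[:, j].copy()
        V = V - np.outer(Vj, V[i, :] / Vj[i])
        V = np.delete(V, j, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sorted(items)

def band_entropies(B, bins=64):
    """Histogram-based Shannon entropy of every band (column) of B."""
    H = np.zeros(B.shape[1])
    for i in range(B.shape[1]):
        hist, _ = np.histogram(B[:, i], bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]
        H[i] = -np.sum(p * np.log2(p))
    return H

def mi_dpp(B, B_hat, k, alpha=5, u=0.5, u_hat=0.5, seed=0):
    """Draw alpha k-DPP subsets from the double-graph model and keep
    the one with maximum total band entropy (the MI criterion)."""
    rng = np.random.default_rng(seed)
    L_B, L_Bh = B.T @ B, B_hat.T @ B_hat  # the two Gram-matrix graphs
    H = band_entropies(B)                 # pre-computed entropy lookup table
    best, best_H = None, -np.inf
    for _ in range(alpha):
        L = L_B if rng.random() < u / (u + u_hat) else L_Bh
        Y = sample_k_dpp(L, k, rng)
        if H[Y].sum() > best_H:
            best, best_H = Y, H[Y].sum()
    return best
```

Pre-computing `band_entropies` once mirrors the lookup-table optimization described in Sect. 3.3, so the entropy evaluation inside the loop is a cheap sum.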
3 Experimental Result
In this section, we test the performance of the proposed algorithm and compare it with several well-known unsupervised selection algorithms, including MIC [18] and Lscore [19].
3.1 Hyperspectral Datasets
AVIRIS sensor: Indian Pines dataset
The Indian Pines dataset was acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) in 1992 [20]. The image covers the Indian Pines test site in north-western Indiana and has 220 spectral bands covering the spectral range 0.4–2.5 \( \upmu{\text{m}} \). Twenty water-absorption bands ([104–108], [150–163], and 220) are removed, and the remaining 200 spectral bands form the dataset. The image has 145 × 145 spatial pixels and ground truth covering 16 classes.
3.2 Experimental Parameter Setup
In this section, we introduce the parameter settings. The weight scalars u and \( \hat{u} \) are set to u = \( \hat{u} \) = 0.5, and the number of iterations is α = 5. The number of selected bands increases from 10 to 100 in steps of 10. We use the SVM algorithm [21] to test the performance of the selected bands. The parameters C and \( \gamma \) of the RBF kernel of the SVM are searched in the ranges \( 10^{\{1:4\}} \) and \( 2^{\{-4:4\}} \), respectively, and determined by five-fold cross-validation. In all experiments, 10% of the samples were randomly selected for training, and the remaining samples were used for testing. Table 1 lists the numbers of training and test samples for the Indian Pines dataset. The experimental results are evaluated with three widely used indexes: overall accuracy (OA), average accuracy (AA), and the kappa coefficient [22].
3.3 Runtime Analysis
As shown in [18], the k-DPP has a time complexity of \( {\mathcal{O}}\left( {Nk^{3} } \right) \) for sampling, and it takes \( {\mathcal{O}}\left( {N^{3} } \right) \) time to eigendecompose each adjacency kernel matrix of the bands. The MI-DPP algorithm therefore takes \( {\mathcal{O}}\left( {nN^{3} + nNk^{3} } \right) \) total time to sample k bands, where n is the number of kernel matrices. The computation time increases greatly when N and k are large; however, both are normally small in the selection process. We pre-calculate the information entropy of each band and build a lookup table for the Indian Pines dataset, which takes only 0.05 s. All experiments are carried out in Matlab on an Intel CPU E5-1620@3.5 GHz with 32 GB RAM.
Figure 1 shows the computing time of the three algorithms as the number of selected spectral bands increases from 10 to 100. The proposed MI-DPP clearly requires the least computation time when the dimension of the band subset is low. As the number of selected bands increases, the computation time of MIC and Lscore does not change significantly, while that of MI-DPP grows with the dimension of the selected subset. Overall, the proposed MI-DPP takes less time than the other band selection algorithms.
3.4 Classification Results
In this subsection, we test the performance of the selected spectral band subsets on the Indian Pines image. Figure 2 compares the classification accuracy of the band subsets selected by the different algorithms. When the dimension of the band subset is in the range 10–70, the proposed MI-DPP is significantly better than the other algorithms, and Lscore performs the worst. When the dimension is in the range 70–100, the three algorithms perform similarly, because the amount of information carried by the selected bands grows as more bands are added. In the following, we analyse the performance of the proposed MI-DPP at low dimension in detail. We set the dimension of the selected band subset to 10 and perform the classification tests; the results are shown in Table 2. Figure 3 shows the classification maps of the three band selection algorithms on Indian Pines. Among the three algorithms, Lscore obtains the worst result. Compared with the other two algorithms, the classification accuracy of the bands selected by the proposed MI-DPP is significantly improved in all categories, because the spectral band subset selected by MI-DPP is more diverse and less redundant and thus provides abundant information.
4 Conclusion
In this paper, an unsupervised band selection algorithm called MI-DPP is proposed for hyperspectral images. An evaluation criterion is used to assess spectral band performance, aiming to retain the maximum band information. A double-graph model captures the complex relationships between bands, and an improved k-DPP samples a diverse, low-redundancy band subset from the double-graph structure. Experimental results on the Indian Pines dataset show that the proposed algorithm performs well in the band selection task.
References
Qiao, T., Yang, Z., Ren, J., et al.: Joint bilateral filtering and spectral similarity-based sparse representation: a generic framework for effective feature extraction and data classification in hyperspectral imaging. Pattern Recogn. 77, 316–328 (2017)
Qiao, T., Ren, J., et al.: Effective denoising and classification of hyperspectral images using curvelet transform and singular spectrum analysis. IEEE Trans. Geosci. Remote Sens. 55(1), 119–133 (2017)
Yuan, Y., Zheng, X., Lu, X.: Discovering diverse subset for unsupervised hyperspectral band selection. IEEE Trans. Image Process. 26(1), 51–64 (2017)
Feng, J., Jiao, L., Liu, F., Sun, T., Zhang, X.: Unsupervised feature selection based on maximum information and minimum redundancy for hyperspectral images. Pattern Recogn. 51, 295–309 (2016)
Zabalza, J., Ren, J., Yang, M., et al.: Novel folded-PCA for improved feature extraction and data reduction with hyperspectral imaging and SAR in remote sensing. ISPRS J. Photogram. Remote Sens. 93, 112–122 (2014)
Zabalza, J., et al.: Structured covariance principal component analysis for real-time onsite feature extraction and dimensionality reduction in hyperspectral imaging. Appl. Opt. 53(20), 4440–4449 (2014)
Ren, J., Zabalza, J., Marshall, S., Zheng, J.: Effective feature extraction and data reduction in remote sensing using hyperspectral imaging. IEEE Sig. Process. Mag. 31(4), 149–154 (2014)
Sotoca, J.M., Pla, F.: Supervised feature selection by clustering using conditional mutual information-based distances. Pattern Recogn. 43(6), 2068–2081 (2010)
Yang, H., Du, Q., Su, H., Sheng, Y.: An efficient method for supervised hyperspectral band selection. IEEE Geosci. Remote Sens. Lett. 8(1), 138–142 (2011)
Sui, C., Tian, Y., Xu, Y., Xie, Y.: Unsupervised band selection by integrating the overall accuracy and redundancy. IEEE Geosci. Remote Sens. Lett. 2(1), 185–189 (2015)
Wang, C., Gong, M., Zhang, M., Chan, Y.: Unsupervised hyperspectral image band selection via column subset selection. IEEE Geosci. Remote Sens. Lett. 12(7), 1411–1415 (2015)
Zhang, M., Ma, J., Gong, M.: Unsupervised hyperspectral band selection by fuzzy clustering with particle swarm optimization. IEEE Geosci. Remote Sens. Lett. 4(5), 773–777 (2017)
Chang, C., Wang, S.: Constrained band selection for hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 44(6), 1575–1585 (2006)
Jia, S., Ji, Z., Qian, Y., Shen, L.: Unsupervised band selection for hyperspectral imagery classification without manual band removal. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 5(2), 531–543 (2012)
Geng, X., Sun, K., Ji, L., Zhao, Y.: A fast volume-gradient-based band selection method for hyperspectral image. IEEE Trans. Geosci. Remote Sens. 52(11), 7111–7119 (2014)
Kulesza, A., Taskar, B.: Determinantal point processes for machine learning. Found. Trends Mach. Learn. 5(2–3), 123–286 (2012)
Kulesza, A., Taskar, B.: k-DPPs: fixed-size determinantal point processes. In: Proceedings of International Conference on Machine Learning, pp. 1193–1200 (2011)
Mitra, P., Murthy, C.A., Pal, S.K.: Unsupervised feature selection using feature similarity. IEEE Trans. Pattern Anal. Mach. Intell. 24(3), 301–312 (2002)
He, X., Cai, D., Niyogi, P.: Laplacian score for feature selection. In: NIPS, pp. 507–514 (2006)
Green, R.O., et al.: Imaging spectroscopy and the airborne visible/infrared imaging spectrometer (AVIRIS). Remote Sens. Environ. 65(3), 227–248 (1998)
Bazi, Y., Melgani, F.: Toward an optimal SVM classification system for hyper-spectral remote sensing images. IEEE Trans. Geosci. Remote Sens. 44(11), 3374–3385 (2006)
Foody, G.M.: Status of land cover classification accuracy assessment. Remote Sens. Environ. 80, 185–201 (2002)
Acknowledgements
This work is supported by the National Natural Science Foundation of China (nos. 61471132 and 61372173) and the Training Program for Outstanding Young Teachers in Higher Education Institutions of Guangdong Province (no. YQ2015057).
© 2018 Springer Nature Switzerland AG
Yang, Z., Chen, W., Yan, Y., Cao, F., Cai, N. (2018). Unsupervised Hyperspectral Band Selection Based on Maximum Information Entropy and Determinantal Point Process. In: Ren, J., et al. Advances in Brain Inspired Cognitive Systems. BICS 2018. Lecture Notes in Computer Science(), vol 10989. Springer, Cham. https://doi.org/10.1007/978-3-030-00563-4_34
DOI: https://doi.org/10.1007/978-3-030-00563-4_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00562-7
Online ISBN: 978-3-030-00563-4