Introduction

Information extraction has long been an active field in remote sensing, since it is the key step in turning image data into information for practical applications (Hu et al. 2008; Giada et al. 2010). Water is one of the most common and valuable objects on the earth, and it is a frequent target of information extraction, both for its importance and because its distinctive spectral characteristics make it comparatively easy to extract. In previous research, water extraction has mainly been realized in two ways: classification and spectral signature analysis. Classification is widely used in remote sensing, not only for water but also for many other objects, and it can be realized by many means, such as maximum likelihood, parallelepiped and SVM classification (Robert 2007). Because of the distinct spectral characteristics of water, extraction via classification is usually realized with a simple method such as maximum likelihood classification into the two categories of water and background, whereas more intelligent methods such as SVM classification tend to over-fit. Spectral signature analysis can be divided into single-band and multi-band methods, according to the number of bands used. The single-band method usually chooses a characteristic band of water from a multispectral image and then determines a threshold for that band to discriminate water from other objects (Rundquist et al. 1987). The multi-band method takes advantage of reflectance differences among the bands involved, and extracts water by analysing signature differences between water and other objects. Recently, indices computed from two or more multispectral bands, such as the normalized difference water index (NDWI), have become popular for enhancing water features while suppressing other objects. A threshold on NDWI is still needed to delineate water from background; it is usually selected subjectively or empirically, and may therefore lead to over-estimation or under-estimation of water.

Nevertheless, all the methods mentioned above treat the extraction model as a single, simple one. In nature, however, water is usually mixed with impurities, and water in different areas is commonly not the same. Besides, water is easily confused with shadow, which is a further difficulty for water extraction. Thus, the former methods are not well suited to complicated water areas. Moreover, segmenting water with only an empirically selected threshold on a signature band or on NDWI is likely to cause considerable confusion, and is not automatic. Luo et al. (2009) developed an automatic water extraction method that obtains a better result through a step-by-step iterative transformation mechanism, but its segmentation and buffering may not be adequate.

In this context, an adaptive method for water extraction from remote sensing images based on NDWI is proposed. NDWI is adopted to enhance water information in the image, and is then redefined so that the best-fitting part can be segmented automatically using the peak-valley method of histogram segmentation. Afterwards, starting from the selected best-fitting pixels, local areas of water are obtained through region growing, and an iterative, precise approach to the water boundary is then carried out in each local area based on its own spectral features. The method extracts water step by step, and considers both the spatial distribution and the spectral characteristics of water in the image, through the “whole-local” spatial transformation and the precise approaching within each local water area, respectively. Furthermore, direction and SNR judgment is used to discriminate water from shadow. The method therefore fits various types of water well, while the processing is automatic and the result is more accurate and reliable.

Methods

The proposed method consists of several coherent procedures, which consider both spectral and spatial information (Li and Narayanan 2004; Jimenez et al. 2005; Myint 2008), including NDWI computation, “whole-local” progressive spatial scale transformation, iterative accurate boundary approaching, and direction and SNR judgment. Figure 1 illustrates the flow of this method.

Fig. 1 Flowchart of the adaptive water extraction

NDWI Computation

Index computation, such as NDVI and LSWI, is widely used in remote sensing information extraction because it is effective and simple (Chandranath et al. 2003; Lasaponara and Masini 2006). For water extraction, NDWI is adopted here, which is defined as

$$ NDWI = \frac{{GREEN - NIR}}{{GREEN + NIR}} $$
(1)

where GREEN and NIR represent the green band and the near infrared band, respectively. This index is designed to maximize the reflectance of water by using green wavelengths, to minimize the low reflectance of NIR by water features, and to take advantage of the high reflectance of NIR by vegetation and soil features. As a result, water features have positive values and are thus enhanced, while vegetation and soil usually have zero or negative values and are therefore suppressed (McFeeters 1996).

NDWI is widely used, not only for water extraction itself but also in many related applications, such as vegetation water content evaluation, wetland mapping and environmental change (Gao 1996; Ouma and Tateishi 2006; Xu 2006). It is usually applied by segmenting with a subjectively chosen threshold to determine the distribution of water or related objects, which may include some incorrect results; thus, rational and automatic algorithms for determining the segmentation threshold are needed. Since the histogram-based peak and valley method is commonly adopted in image processing for its simplicity and automation, a modified peak and valley method is adopted in this letter. Accordingly, NDWI is redefined (RNDWI) as its distance from the largest NDWI value, to fit the subsequent histogram segmentation, which is defined as

$$ RNDWI = Max\left( {NDWI} \right) - NDWI $$
(2)

where RNDWI represents the redefined NDWI.
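Equations (1) and (2) can be illustrated with a minimal NumPy sketch. The function names `ndwi` and `rndwi`, the zero-denominator handling and the toy reflectance values are assumptions made for illustration; they are not taken from the original implementation.

```python
import numpy as np

def ndwi(green, nir, eps=1e-12):
    """Eq. (1): (GREEN - NIR) / (GREEN + NIR), computed per pixel."""
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    denom = green + nir
    empty = np.abs(denom) < eps          # pixels where both bands are zero
    safe = np.where(empty, 1.0, denom)   # avoid division by zero
    # Assumption: the index is set to 0 where the denominator vanishes.
    return np.where(empty, 0.0, (green - nir) / safe)

def rndwi(ndwi_image):
    """Eq. (2): distance of each NDWI value from the largest NDWI value."""
    return np.max(ndwi_image) - ndwi_image

# Toy 2 x 2 reflectance values (hypothetical, for illustration only).
green = np.array([[0.30, 0.10], [0.20, 0.00]])
nir = np.array([[0.10, 0.30], [0.20, 0.00]])

index = ndwi(green, nir)      # water-like pixels get positive values
redefined = rndwi(index)      # water-like pixels now get the smallest values
```

In this toy case the top-left pixel, which has high green and low NIR reflectance, receives the largest NDWI and therefore an RNDWI of zero, which is the property the subsequent first-valley search relies on.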

Preliminary Auto-segmentation

Histogram segmentation by the peak and valley method is a basic segmentation technique in image processing, and is employed in a wide range of applications (Delon et al. 2007). After NDWI computation, there are mainly two classes in the image, water and background. They are likely to appear as two distinct waveforms in the histogram, so finding the valley between them with the peak and valley method can separate them. However, this method is susceptible to noise, and the valley is hard to find when there are more than two waveforms in the histogram (Gonzalez and Woods 2002). Therefore, a modified peak and valley method is proposed, on the basis that a pixel with a smaller value is more similar to water. Under this precondition, finding only the first valley identifies the best-fitting pixels as preliminary water, which reduces uncertainty and is more convenient.

In the previous step, a larger index value in the NDWI image means greater similarity to water, whereas in the RNDWI image a smaller index value does, and the latter fits the requirement of the modified peak and valley method. The detailed process is to search the histogram upwards from the minimum value until the first value g is found whose number of pixels is less than that of both value g−1 and value g+1; value g is the threshold used for segmentation. It is worth noting that this rule applies only when the histogram is continuous. If the histogram is discrete, the first valley chosen is that of the first waveform whose peak value is smaller than those of both the previous and the following adjacent waveforms; it is not limited to the very first lower valley, in order to eliminate contingencies and interference.
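The first-valley rule for a continuous histogram can be sketched as follows. This is only an illustration under stated assumptions: the RNDWI image is taken to be already quantized to non-negative integer grey levels, the function name `first_valley_threshold` is hypothetical, pixels strictly below the threshold are treated as preliminary water, and the additional rule for discrete histograms is not implemented.

```python
import numpy as np

def first_valley_threshold(image, levels=256):
    """Return the first grey level g whose pixel count is lower than the
    counts at both g-1 and g+1, scanning upwards from the low end.
    Returns None if no such level exists."""
    values = np.asarray(image, dtype=np.int64).ravel()
    hist = np.bincount(values, minlength=levels)
    for g in range(1, len(hist) - 1):
        if hist[g] < hist[g - 1] and hist[g] < hist[g + 1]:
            return g
    return None

# Toy quantized RNDWI values whose histogram is [5, 8, 3, 6, 9, 2, 0, 0].
image = np.array([0] * 5 + [1] * 8 + [2] * 3 + [3] * 6 + [4] * 9 + [5] * 2)

threshold = first_valley_threshold(image, levels=8)
# Assumption: values below the valley are kept as preliminary water.
water_mask = image < threshold
```

Here the first local minimum of the histogram lies at grey level 2, so the 13 pixels with values 0 and 1 form the preliminary water class.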

Overall Region Growing

Only pixels with smaller RNDWI values are selected as water pixels by the preliminary segmentation, which probably does not capture all water objects, especially when there are many different subtypes of water. Therefore, the rest of the water should be further extracted. Since objects usually have an aggregated spatial distribution (Miller 2004), region growing is adopted to determine the extent of water (Yu and Clausi 2008; Qin and Clausi 2010).

The mean value and standard deviation of the spectra of the extracted water pixels are calculated in each band, and the standard deviation is then taken as the threshold for region growing. Region growing here mainly consists of the following steps:

Step 1) Traverse the initial image, and check whether each pixel is water according to the preliminary segmentation result: if it is, take it as a seed point and add it to the stack used to store seed pixels; otherwise, skip it.

Step 2) Take a seed pixel from the stack, and add it to the current connected domain.

Step 3) Determine whether each pixel adjacent to the seed pixel is water: if it is already a water pixel, leave it as it is; otherwise, compute the absolute spectral distance between it and the seed pixel in each band, and if the distance in every band is less than the standard deviation of water in that band, reset it as a water pixel. Finally, put all these water pixels onto the stack.

Step 4) Repeat step 2) and step 3) until the stack is empty, at which point a complete connected domain has been obtained.

Step 5) Traverse the remaining pixels of the initial image, and repeat steps 1) to 4) to obtain all connected domains of water in the whole image.
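The five steps above can be sketched as a stack-based region growing routine. This is a minimal illustration rather than the original implementation: the function name `grow_water_regions`, the use of 4-connectivity and the toy single-band image are assumptions.

```python
import numpy as np

def grow_water_regions(image, seed_mask):
    """image: (rows, cols, bands) array; seed_mask: boolean (rows, cols)
    preliminary water mask. Returns the grown water mask and an integer
    label image with one label per connected domain."""
    image = np.asarray(image, dtype=np.float64)
    water = np.array(seed_mask, dtype=bool)          # working copy
    rows, cols = water.shape
    # Per-band standard deviation of preliminary water = growing threshold.
    threshold = image[water].std(axis=0)
    labels = np.zeros((rows, cols), dtype=np.int32)
    current = 0
    for r0 in range(rows):                           # steps 1 and 5
        for c0 in range(cols):
            if not water[r0, c0] or labels[r0, c0] != 0:
                continue
            current += 1
            labels[r0, c0] = current
            stack = [(r0, c0)]
            while stack:                             # step 4
                r, c = stack.pop()                   # step 2
                # Assumption: 4-connected neighbourhood.
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if labels[nr, nc] != 0:
                        continue
                    if not water[nr, nc]:            # step 3
                        distance = np.abs(image[nr, nc] - image[r, c])
                        if not np.all(distance < threshold):
                            continue
                        water[nr, nc] = True
                    labels[nr, nc] = current
                    stack.append((nr, nc))
    return water, labels

# Toy single-band image (hypothetical values) with three seed pixels.
band = np.array([[10, 12, 13, 50],
                 [14, 13, 12, 52],
                 [60, 61, 11, 55]], dtype=np.float64)
image = band[..., None]
seeds = np.zeros(band.shape, dtype=bool)
seeds[0, 0] = seeds[0, 1] = seeds[1, 0] = True

water, labels = grow_water_regions(image, seeds)
```

With these values the seed standard deviation is about 1.63, so neighbours differing by one grey level are absorbed while the bright pixels on the right and bottom-left are left as background, giving a single connected domain of seven pixels.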

Local Area Classification

Compared to the whole image, each connected domain acquired through region growing is a local area of water, obtained using criteria derived from the whole image. However, a local area of water may have its own characteristics, which cannot be fitted well by whole-image criteria; thus, precise classification of each local area according to its own features is necessary.

Based on the former steps, the already extracted water pixels are regarded as correct, and the aim of local area classification is to find water pixels near the boundary, again on the basis of the spatial aggregation of objects, as in region growing. To attain this goal, a template window is employed to obtain the boundary area iteratively. Within each local area, the boundary pixels are found and each is taken as the centre of a template window of a certain size (which can be adjusted as needed). Within each window, the spectral mean and standard deviation of water are defined as \( C_t = \left[ c_t^1, c_t^2, \cdots, c_t^n \right]^T \) and \( \mathbf{\Sigma}_t = \left[ \delta_t^1, \delta_t^2, \cdots, \delta_t^n \right]^T \), respectively; those of background are \( C_b = \left[ c_b^1, c_b^2, \cdots, c_b^n \right]^T \) and \( \mathbf{\Sigma}_b = \left[ \delta_b^1, \delta_b^2, \cdots, \delta_b^n \right]^T \); and the feature vector of a selected background pixel is defined as \( x = \left[ x_1, x_2, \cdots, x_n \right]^T \). Thus, the criterion for reclassifying a background pixel as water is

$$ \left| {{x_i} - c_b^i} \right| - \left| {{x_i} - c_t^i} \right| > \delta_b^i,i = 1,2, \cdots, n $$
(3)

where i and n represent the band index and the total number of multispectral bands, respectively. It can be seen that a background pixel that is markedly nearer to the water mean than to the background mean is revised to water. In this way, the remaining background pixels in this template window, and then the remaining template windows of this connected domain, are traversed.
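Criterion (3) can be illustrated for a single template window. The sketch below is written under assumptions that the text does not settle: the window statistics are computed once, before any pixel in the window is reclassified, and the function name `reclassify_window` and the toy values are hypothetical.

```python
import numpy as np

def reclassify_window(pixels, water):
    """pixels: (rows, cols, bands) template window; water: boolean
    (rows, cols) mask inside the window. Returns an updated mask in which
    background pixels satisfying criterion (3) in every band become water."""
    pixels = np.asarray(pixels, dtype=np.float64)
    water = np.asarray(water, dtype=bool)
    if water.all() or not water.any():
        return water.copy()                  # nothing to compare against
    c_t = pixels[water].mean(axis=0)         # water mean per band
    c_b = pixels[~water].mean(axis=0)        # background mean per band
    delta_b = pixels[~water].std(axis=0)     # background std per band
    # |x_i - c_b^i| - |x_i - c_t^i| > delta_b^i for i = 1..n
    gain = np.abs(pixels - c_b) - np.abs(pixels - c_t)
    promote = np.all(gain > delta_b, axis=-1) & ~water
    return water | promote

# Toy single-band 3 x 3 window (hypothetical values).
band = np.array([[10, 11, 40],
                 [10, 14, 42],
                 [12, 44, 46]], dtype=np.float64)
window = band[..., None]
water = np.array([[True, True, False],
                  [True, False, False],
                  [True, False, False]])

updated = reclassify_window(window, water)
```

In this window the centre pixel (value 14) lies far from the background mean of 37.2 and close to the water mean of 10.75, so it is the only background pixel that is revised to water.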

Statistics of Changed Pixels in Local Area

Local area classification is carried out iteratively (Miguel-Ayanz and Biging 1996), in order to reach a precise extraction result. A criterion is needed to stop the iteration, and the ratio of changed pixels is employed for this purpose. After each pass of local classification, the number of water pixels in each connected domain is counted and compared with that before reclassification. If the ratio of changed pixels exceeds the predetermined threshold, the new connected domain is reclassified again as in the previous step.
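The stopping rule can be sketched as a small driver loop. Several details here are assumptions rather than statements from the text: the ratio is taken as the number of changed pixels divided by the number of water pixels before the pass, a `max_iterations` guard is added, and `grow_right` is a purely hypothetical stand-in for one pass of local classification.

```python
import numpy as np

def iterate_until_stable(water, reclassify, ratio_threshold=0.2,
                         max_iterations=50):
    """Repeat `reclassify` on one connected domain until the ratio of
    changed pixels no longer exceeds `ratio_threshold`.
    Returns the final mask and the number of passes performed."""
    water = np.asarray(water, dtype=bool)
    for iteration in range(1, max_iterations + 1):
        updated = reclassify(water)
        before = int(water.sum())
        changed = int(np.count_nonzero(updated != water))
        water = updated
        # Assumption: ratio = changed pixels / water pixels before the pass.
        if before == 0 or changed / before <= ratio_threshold:
            return water, iteration
    return water, max_iterations

def grow_right(mask, limit=5):
    """Toy stand-in for local classification: on a 1-D strip whose water
    pixels are contiguous from index 0, add one pixel per pass up to `limit`."""
    out = mask.copy()
    nxt = int(mask.sum())        # index of the first non-water pixel
    if nxt < limit:
        out[nxt] = True
    return out

start = np.zeros(8, dtype=bool)
start[:2] = True
final, passes = iterate_until_stable(start, grow_right, ratio_threshold=0.2)
```

In the toy run the change ratios are 0.5, 0.33, 0.25 and 0, so the loop stops after the fourth pass with five water pixels.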

Direction and SNR Judgment

Local classification and change statistics are repeated iteratively until the ratio of changed water pixels meets the requirement for each connected domain.

After that, if shadow also exists in the image, direction and SNR judgment is needed to eliminate the interference caused by shadow. Because shadow is cast by sunlight, its direction is determined by the solar azimuth, while water follows its own natural direction. However, direction alone may be inadequate to discriminate water from shadow. Because of the reflection characteristics of the water surface, the variance of water is relatively large; thus, it has a higher SNR than shadow, which has lower reflectance (Nath and Dubois 2006). The combination of direction and SNR judgment can therefore separate water and shadow accurately. Furthermore, to improve computational efficiency, this process should also be applied after the preliminary water segmentation, to prevent shadow pixels from becoming seeds for the subsequent processing.

Experiments and Results

In this paper, we select three different types of water areas as experimental data for the proposed method: a coastal area, an urban river area and a natural lake area. Each of the three areas has its own characteristics, and together they demonstrate the generality of the proposed method.

Coastal Area Experiment

Study Area and Data

The data used for the coastal area experiment is an ETM+ image with a resolution of 30 m, comprising 236 lines and 253 columns and covering part of a coastal city in eastern China, an area of about 7.1 km × 7.6 km. It consists of six multispectral bands, excluding the thermal infrared bands. This area mainly contains the mainland, water and some coastal cultivated zones (Fig. 2a). Because this area is large, shadows caused by buildings are small and can be ignored. Here, water has several subtypes, because of interference from sand and from contamination discharged from inland into the sea, which makes it difficult to extract all of the water satisfactorily. Ordinary water extraction methods are usually not well suited to this condition, whereas the proposed adaptive method works well.

Fig. 2 Sea area water extraction: a ETM+ image of the experimental area, false colour composite of bands 4, 3, 2; b NDWI image; c RNDWI image; d preliminary auto-segmentation of RNDWI; e water region growing result; f final water extraction image by this method

Experiment Approach and Results

Following the description of the proposed method, NDWI computation is first executed to generate an index image that represents the degree of match with water (Fig. 2b), which is also the key to the automation of this method. NDWI is then redefined to obtain the RNDWI image (Fig. 2c), in which a darker pixel has a smaller index value, in accordance with the assumption of the modified peak and valley method.

According to the principle of the redefined NDWI, a pixel with a smaller index value is more similar to water, so the best-matching pixels can be found automatically through histogram segmentation with the first valley as threshold. The original histogram of the RNDWI image is rather concentrated, so histogram equalization is needed to spread it out. The equalized histogram is shown in Fig. 3; there are several peaks in the first half, which probably represent different kinds of objects, and mainly background information and noise in the latter part. Because the histogram here is discrete, the modified peak and valley method requires the valley of the first waveform whose peak value is smaller than those of both the previous and the following adjacent waveforms; the threshold found, with value 45, is marked by a red vertical line in the same figure.

Fig. 3 Equalized histogram of RNDWI image for auto-segmentation

After histogram auto-segmentation, the best-fitting part of water is extracted, as shown in Fig. 2d. However, water in the coastal area is often mixed with sand, which makes it differ from clean water to some extent, so it is missed. The same situation can be seen in Fig. 3, where the histograms of different subtypes of water are close to each other with no apparent separation among them, so that omission is inevitable when segmenting with the first valley. Therefore, further region growing is needed. Figure 2e shows the result after region growing, in which most of the mixed water has also been extracted.

Afterwards, the boundary of each connected domain of water is searched with a window template of size 9 × 9, to reconfirm the class of background pixels using local classification and change statistics. Here, the predetermined threshold for the changed pixel ratio is 20%, and Fig. 2f is the result after six iterations with some post-processing.

Comparative Algorithms

To check the validity of the proposed method, several methods are applied to the same experimental region for comparison: segmentation by the valley between the two highest peaks and by the valley with the highest drop height (McFeeters 1996), maximum likelihood classification (Hagner and Reese 2007) and SVM classification (Bazi and Melgani 2006). All of these are popular methods for classification and information extraction, and are widely used in many applications.

Segmentation by the valley between the two highest peaks is based on the idea that the highest peaks represent the two main objects in the image, and that the valley between them separates them. If there are two or more valleys between them, the one nearest to the second highest peak is selected, to preserve the dominant object. According to this method, the selected threshold, with value 96, is marked by a green vertical line in Fig. 3. Segmentation by the valley with the highest drop height is based on the idea that this valley represents the maximum change, and can therefore be used to discriminate the two related objects. By this method, the threshold found, with value 132, is marked by a blue vertical line in Fig. 3.

As to the ordinary classification methods, maximum likelihood classification assigns each object to the class with which it has the maximum similarity, while SVM classification maps the image into a higher-dimensional space to find the best hyperplane separating the classes. In addition, classification methods require samples to be selected, and these play an important role in the classification results. Thus, two groups of samples are selected here to verify their influence: one has more samples, and the other has fewer. The results and sample selection images are shown in Fig. 4.

Fig. 4 Sea area water extraction using reference methods: a segmentation result by valley between two highest peaks; b segmentation result by valley with the highest drop height; c experimental image with more samples; d maximum likelihood classification result with more samples; e SVM classification result with more samples; f experimental image with fewer samples; g maximum likelihood classification result with fewer samples; h SVM classification result with fewer samples

Spectral Analysis of Results

In the results of several methods, some coastal areas that would subjectively be considered water are classified as background. However, classification by visual interpretation may be wrong, and the actual spectral profiles of the objects should be the criterion for judgment. Therefore, the spectral profiles of some typical areas are selected to check the correctness of the results, and their positions are marked in Fig. 5e.

Fig. 5 Spectra of different objects in the experimental sea area: a spectral profile of typical water; b spectral profile of typical background; c spectral profile of a coastal area among coastal cultivated zones; d spectral profile of a coastal area near the mainland; e positions of selected objects’ spectral profiles

Figure 5a shows the spectral profile of typical water, whose reflectance declines from the visible bands to the NIR band, where it is nearly zero because of strong absorption. Figure 5b shows the spectral profile of typical background, which reflects strongly in the IR bands. Figure 5c shows the spectral profile of a coastal area among the coastal cultivated zones, which has an obvious reflection peak in the IR bands. Besides, as seen in the original image (Fig. 5e), the water here is severely mixed with impurities. Thus, it should not be classified as water, and its classification as background is actually correct. Figure 5d shows the spectral profile of a coastal area near the mainland, which has a declining trend similar to that of typical water; thus, it should be classified as water, although the water here is also mixed with impurities. It can be concluded that, although part of the water here is mixed with impurities, the degree of mixing differs: if the mixture changes the spectrum of water, the pixel becomes background; if the spectrum still shows the characteristics of water, it is still considered water. Thus, the classification of area (c) as background in most of the results is correct, whereas area (d) belongs to water whichever method is used.

Accuracy Analysis of Results

On the basis of the extraction results, accuracy analysis is carried out to compare the different methods, and the quantitative accuracy evaluation results are listed in Table 1. The proposed method gains the best result, with an overall accuracy of 97.6%; water and background are extracted properly, and the misclassification is partly caused by land inside the coastal cultivated zones, which is hard to handle (Fig. 2e). Neither segmentation by the valley between the two highest peaks nor segmentation by the valley with the highest drop height gives a satisfactory result (Fig. 4a–b), with both accuracies lower than 90%. The reason is that the characteristic waveforms belong entirely to water, and thus some subtypes of water are missed, as can be clearly seen in Fig. 3. This also reveals the drawback of common histogram segmentation by peak and valley methods: they are mainly suitable for separating two simple objects, and become rather uncertain when there are more objects or when an object is itself complex. With regard to the classification methods, when there are more samples, maximum likelihood classification gets a relatively higher accuracy than SVM classification (Fig. 4d–e), because the SVM method over-fits and classifies some background as water, which decreases its accuracy; when there are fewer samples, the over-fitting of the SVM method is less obvious, and it gains a relatively better result with an accuracy of 94.5% (Fig. 4h), whereas the accuracy of maximum likelihood classification falls as the number of samples is reduced (Fig. 4g).
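For reference, overall accuracy for a two-class water/background result is conventionally the fraction of evaluated pixels whose extracted class agrees with the reference class. The sketch below illustrates that convention only; how the reference data for Table 1 were collected is not stated in the text, and the function name and toy masks are hypothetical.

```python
import numpy as np

def overall_accuracy(predicted, reference):
    """Fraction of pixels whose predicted class (water = True,
    background = False) matches the reference class."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    return float(np.mean(predicted == reference))

# Toy masks (hypothetical): one of five pixels is misclassified.
predicted = np.array([True, True, False, False, True])
reference = np.array([True, False, False, False, True])

accuracy = overall_accuracy(predicted, reference)
```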

Table 1 Accuracies for different methods of coastal area water extraction

To sum up, the complex histogram of the experimental image is not suited to histogram segmentation by peak and valley methods alone, which leave out much of the mixed water. Maximum likelihood classification, the most basic and widely used method, can only extract relatively pure water, leaving most of the mixed water misclassified, and its accuracy depends to a large extent on sample selection. SVM classification, a relatively intelligent method, tends to assign many mixed pixels to the water class, although many of them probably belong to the background class. It can give a good result for ordinary classification of many objects, but may not be suited to the extraction of a single thematic class, where it tends to over-fit. Moreover, it is also influenced by the selection of samples, which makes the result hard to control. By comparison, the adaptive water extraction method based on NDWI proposed in this letter gains a better result, with more appropriate extraction of both water and background.

Urban River Area Experiment

Study Area Analysis

The data used for this experiment is a Quickbird image with the size of 2039 × 1407 pixels, which contains water, shadow, and other background objects (Fig. 8a). Shadows cast by buildings and trees are very common in urban areas, and they become a major disturbance for water extraction. However, all the shadows follow the same direction in a specific image, and the direction is just the solar azimuth which can be obtained from the header file (Ren et al. 2009). The sketch map of the direction of shadow is shown in Fig. 6.

Fig. 6 Sketch map of the direction of shadow

However, direction alone cannot be a firm criterion for determining water, since some water parcels may follow the same direction as shadows. In this situation, the high SNR caused by the reflectance properties of the water surface can serve as the other criterion. The histograms of water and shadow samples are shown in Fig. 7, which clearly shows the large fluctuation of water, whereas shadow has a relatively steady histogram. According to statistics, the SNR of water is commonly higher than 50 dB, and this is the criterion adopted here.

Fig. 7 Histograms comparison of water and shadow

Experiment Approach and Results

The experimental approach is the same as that for the coastal area, except that shadows exist extensively here. The overall processing can be seen in Fig. 8. Specifically, most of the shadow pixels that were taken as water pixels during preliminary segmentation are eliminated through direction and SNR judgment in Fig. 8d, and this procedure is also used in producing the final result. SVM classification is employed here for comparison, and the quantitative analysis of the two methods is shown in Table 2.

Fig. 8 Urban area water extraction: a Quickbird image of urban area; b NDWI image of urban area; c preliminary auto-segmentation of RNDWI; d shadow elimination by direction and SNR judgment; e water extraction using this method; f water extraction using SVM method

Table 2 Accuracies for different methods of urban river extraction

From both the figures and the accuracies, we can see that the proposed method clearly yields a better result in urban water extraction: it removes the confusion caused by shadows and obtains a complete and precise water extraction result. However, one problem remains: the proposed method cannot tell water from shadow if they are mixed with or adjacent to each other.

Natural Lake Area Experiment

The data used for this experiment is an ETM+ image with the size of 919 × 666 pixels, which contains water, clouds and their shadows, together with other background objects (Fig. 9a). Although there are clouds in the image, they have little influence on water extraction: because the reflectance of cloud is higher than that of water, they can be separated through NDWI computation, as shown in Fig. 9b. Although the shadows here are cast by clouds, their direction is also determined by the solar azimuth; thus, the experimental approach is the same as that for the urban river area. The overall procedure can be seen in Fig. 9, and the accuracies of this experiment are shown in Table 3.

Fig. 9 Natural area water extraction: a ETM+ image of natural area; b NDWI image of natural area; c preliminary auto-segmentation of RNDWI; d shadow elimination by direction and SNR judgment; e water extraction using this method; f water extraction using SVM method

Table 3 Accuracies for different methods of natural lake extraction

From both the figures and the accuracies, it can be seen that the proposed method performs better, extracting water precisely while eliminating shadows. SVM classification, in contrast, cannot get rid of shadows; besides, it suffers from disturbance caused by different constituents inside water bodies.

Conclusion

Water is a fundamental and typical object on the earth, and it is of great significance to many related fields of research. Because water is easily influenced by its natural surroundings, it commonly has different appearances in different areas. Former water extraction methods are mainly based on a single model using only spectral information, and are not suitable for complicated areas. Besides, these methods have inherent deficiencies: histogram segmentation by peak and valley methods is generally suitable for images with two simple objects, and the uncertainty is rather large if there are several complicated objects; maximum likelihood classification is not very effective for thematic information extraction, and it is strongly influenced by the selection of samples; SVM classification tends to over-fit, and it is also influenced by sample selection.

This letter proposes an adaptive water extraction method that considers both spectral and spatial information in remote sensing images. The novel aspects of this method are as follows: 1) NDWI computation is adopted to enhance the spectral characteristics of water, and NDWI is then redefined as RNDWI to serve as the criterion for segmentation; 2) a modified histogram segmentation by the peak and valley method is proposed to automatically select the best-fitting pixels in the RNDWI image, from which the local areas of water are obtained through region growing; 3) within each local area, reclassification is carried out iteratively in the boundary area according to the area's own characteristics, to approach a precise extraction of water, which is better suited to local areas and adapts to the diversity of water caused by disturbance; 4) direction and SNR judgment is used to discriminate water from shadow accurately.

As mentioned above, the precise expression of diverse spectral information is the essential part of a remote sensing image, and the procedure of spatial transformation integrates spatial knowledge, which is necessary for progressive and adaptive water extraction. As the scale transforms from the whole scale to the local scale, and even to the patch scale, both spectral and spatial operations transform accordingly, to adapt to the research area at each level. Based on all these procedures, the method achieves automation, adaptation and intelligence, and the different types of experiments verify its effectiveness and generality. Moreover, the method is not limited to the extraction of water, but is also suitable for many other thematic objects, especially those that have their own specific indices, such as vegetation. Therefore, the proposed adaptive extraction method can be widely used in many remote sensing applications.