
1 Introduction

Remote sensing change detection (CD) is an essential technique for identifying changes in multi-temporal images of the same geographical region [10, 16]. It provides valuable information for various applications, including deforestation monitoring, target detection, and agricultural advancement [2, 23]. Additionally, CD algorithms support decision-making during natural disasters, enabling timely actions to prevent material losses and save lives [13]. In practice, CD distinguishes changed from unchanged pixels in multi-temporal Earth Observation (EO) images, which must first be co-registered. Co-registration aligns the EO images to the same coordinate system, which is useful for obtaining consistent radiometric characteristics, such as brightness and contrast, and enhances change detection performance [14, 19]. Key point extraction techniques such as SIFT, SURF, and CNNs are often used for image registration [6]. In its classical form, change detection computes the intensity difference between the images; the result is called a change map (CM). However, co-registration errors, illumination variations, and speckle noise reduce the accuracy of change detection algorithms.

Synthetic aperture radar (SAR) offers advantages over optical sensors for change detection in remote sensing due to its all-weather capability, penetration through clouds and vegetation, and sensitivity to small changes. SAR change detection methods primarily rely on unsupervised learning due to the lack of annotated SAR datasets. Various unsupervised CD methods use principal component analysis or clustering algorithms, such as fuzzy c-means (FCM) [12] and fuzzy local information c-means (FLICM) [17]. Researchers have also sought to reduce the impact of speckle noise on CD methods. Qu et al. [22] introduced a dual domain neural network (DDNet) incorporating spatial and frequency domains to reduce speckle noise. Gao et al. [10] proposed a Siamese adaptive fusion network for SAR image change detection, which extracts semantic features from multi-temporal SAR images and suppresses speckle noise. Meng et al. [20] presented a noise-tolerant network called LANTNet that utilises feature correlations among multiple convolutional layers and employs a robust loss function to mitigate the impact of noisy labels. While these deep learning-based approaches show some robustness against speckle noise, they still struggle to eliminate it, which reduces their effectiveness in change detection. Furthermore, the level of speckle noise differs between single-look (pre-change) and multi-look (post-change) SAR imaging processes, further degrading the performance of change detection algorithms when comparing acquisitions from different times.

To address the issues with degrading CD performance, we propose a robust despeckling model (DM) architecture that effectively suppresses speckle noise in SAR CD datasets. This approach leads to significant improvements in change detection performance. Experimental evaluations on public SAR CD datasets provide compelling evidence of the superiority of our proposed method when compared to existing approaches.

2 Related Work

SAR change detection is widely used in various applications, including urban extension [16], agricultural monitoring [23], target detection [21], and disaster assessment [2]. Due to the lack of annotated SAR datasets, most researchers rely on unsupervised methods for SAR change detection. However, the presence of speckle noise poses a significant challenge and reduces the accuracy of change detection. Image pre-processing, including despeckling and image registration, is a crucial step in SAR change detection to enhance image quality and align multi-temporal images [19].

Generating a difference image (DI) is important in SAR change detection. Various methods, such as image differencing, log ratio, and neighbourhood-based ratio, have been proposed to generate the DI [5, 30]. The classification of the DI typically involves thresholding and clustering. Some approaches use the preclassification result to train a classifier model and combine the preclassification and classifier results to generate a change map. These methods aim to improve change detection performance by leveraging preclassification and classifier information [8].
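
As a rough illustration of the DI step described above, the sketch below computes a log-ratio difference image and thresholds it into a binary change map. The function names, the `eps` stabiliser, and the mean-plus-one-standard-deviation threshold are illustrative assumptions, not the procedure of any cited method.

```python
import numpy as np

def log_ratio_di(img1, img2, eps=1e-6):
    """Log-ratio difference image of two co-registered intensity images."""
    a = np.asarray(img1, dtype=np.float64) + eps  # eps avoids log(0) and division by zero
    b = np.asarray(img2, dtype=np.float64) + eps
    return np.abs(np.log(b / a))

def threshold_change_map(di, thresh=None):
    """Binary change map: True where the DI exceeds the threshold."""
    if thresh is None:
        # Assumed heuristic for illustration only: mean + one standard deviation.
        thresh = di.mean() + di.std()
    return di > thresh

# Toy example: a flat scene in which a 2x2 patch becomes four times brighter.
pre = np.full((8, 8), 10.0)
post = pre.copy()
post[2:4, 2:4] = 40.0
change_map = threshold_change_map(log_ratio_di(pre, post))
print(int(change_map.sum()))  # number of pixels flagged as changed
```

The ratio form suits SAR images because speckle is multiplicative, so taking the logarithm turns it into an additive term.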

Recent approaches in SAR change detection focus on explicitly suppressing speckle noise to improve accuracy. Methods such as DDNet [22], Siamese adaptive fusion networks [10], and LANTNet [20] have been proposed to mitigate the impact of speckle noise and extract high-level features from multi-temporal SAR images. However, these approaches have limitations in effectively handling the different speckle noise characteristics of images acquired before and after the change, especially when the number of looks varies. To address this challenge, we propose a despeckling model to suppress speckle noise and achieve effective SAR change detection for different numbers of looks in pre- and post-change images.

Fig. 1. An overview of the proposed modules

3 Methodology

The despeckling module applies a sequence of convolutional layers to reduce speckle noise in input SAR images. The resulting image with reduced noise is then passed to the subsequent CD methods. Figure 1 presents an overview of the DM and CD methods. The following sections explain the proposed despeckling model architecture and the change detection methods.

3.1 Despeckling Model Architecture

The proposed despeckling architecture aims to learn a mapping from the input SAR image using convolutional layers to generate a residual image containing only speckle noise. The resulting speckle-only image can be combined with the original image through either subtraction [4] or division [27] operations to produce the despeckled image. The division operation is preferred as it avoids an additional logarithmic transformation step and allows for end-to-end learning. However, training such a network requires reference despeckled images, which are typically unavailable for SAR images. To address this, researchers use synthetic reference images generated using multiplicative noise models [4, 27, 29]. This study also employs synthetic SAR reference images to train the proposed despeckling network architecture, which consists of ten convolutional layers with batch normalisation, ReLU activation functions, and a hyperbolic tangent as the final nonlinear function. The proposed architecture, presented in Fig. 2, is similar to [4, 27, 29], but with additional convolutional layers and an improved loss function. The hyperparameters are provided in Table 1.
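
To make the multiplicative noise model and the division operation concrete, the following is a minimal sketch under stated assumptions: speckle is drawn as unit-mean Gamma noise with shape equal to the number of looks (a common model for intensity images), and the true speckle field stands in for the network output. The function names and the `eps` guard are hypothetical and are not taken from the paper.

```python
import numpy as np

def add_speckle(clean, looks, rng):
    """Multiply a clean image by unit-mean Gamma speckle with the given number of looks."""
    noise = rng.gamma(shape=looks, scale=1.0 / looks, size=clean.shape)
    return clean * noise, noise

def despeckle_by_division(noisy, estimated_speckle, eps=1e-8):
    """Division residual: divide the noisy image by the estimated speckle-only image."""
    return noisy / np.maximum(estimated_speckle, eps)

rng = np.random.default_rng(0)
clean = np.full((64, 64), 0.5)  # stand-in for a natural image scaled to [0, 1]
noisy, speckle = add_speckle(clean, looks=1, rng=rng)

# The true speckle is used here as an oracle estimate (assumption);
# a trained network would only approximate it.
restored = despeckle_by_division(noisy, speckle)
print(float(np.abs(restored - clean).max()))
```

With a perfect speckle estimate the division recovers the clean image directly, without moving to the logarithmic domain first.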

Table 1. Proposed despeckling model configuration. L1 and L10 refer to Conv-ReLU layers, while layers L2 to L9 consist of Conv-BN-ReLU layers, as illustrated in Fig. 1.

Fig. 2. Proposed despeckling model architecture

3.2 Proposed Loss Function

A common approach to training the despeckling network is to use the per-pixel Euclidean loss function \(L_E(\theta )\), which computes the squared Euclidean distance between corresponding pixels of the predicted despeckled image and the noise-free SAR image. While effective in various image tasks, such as super-resolution, semantic segmentation, change detection, and style transfer, it often results in artifacts and visual abnormalities in the estimated image. Researchers have therefore combined the Euclidean loss \(L_E(\theta )\) with a total variation (TV) loss as a supplementary measure. The TV loss reduces artifacts but may lead to oversmoothing and information loss, thus impacting change detection performance. To overcome this, we design a loss function that combines \(L_E(\theta )\) with the structural similarity index (SSIM), initially proposed for image quality assessment. This combination offers a better trade-off by removing artifacts while preserving essential information, ultimately enhancing change detection performance.

$$\begin{aligned} L_E(\theta ) = \frac{1}{{W \cdot H}}\sum _{w=1}^{W}\sum _{h=1}^{H} \Vert X^{(w,h)} - \hat{X}^{(w,h)} \Vert ^2 \end{aligned} \tag{1}$$
$$\begin{aligned} SSIM(x,y) = \frac{(2\mu _x\mu _y + C_1) \cdot (2 \sigma _{xy} + C_2)}{(\mu _x^2 + \mu _y^2+C_1) \cdot (\sigma _x^2 +\sigma _y^2+C_2)} \end{aligned} \tag{2}$$

The total loss is thus calculated as follows:

$$\begin{aligned} L_{T} = L_E(\theta ) + \lambda _{\text {SSIM}}\cdot SSIM \end{aligned} \tag{3}$$

where X and \(\hat{X}\) are the reference (noise-free) and despeckled images, respectively, W and H are the image width and height, and the SSIM term in Eq. (3) is evaluated with \(x = X\) and \(y = \hat{X}\). \(\mu _{X}\) and \(\mu _{\hat{X}}\) are the mean values of X and \(\hat{X}\), respectively. Similarly, \(\sigma _{X}\) and \(\sigma _{\hat{X}}\) are the standard deviations of X and \(\hat{X}\), respectively, and \(\sigma _{X\hat{X}}\) is the covariance between X and \(\hat{X}\). Finally, \(C_1\) and \(C_2\) are constants set to 0.01 and 0.03, respectively [28].
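
The three equations can be evaluated numerically as in the sketch below. It is an illustration under assumptions rather than the training code: SSIM is computed from global image statistics instead of local windows, Eq. (3) is applied literally as written, and the function names are hypothetical. The constants 0.01 and 0.03 and the weight of 5 are the values stated in the text.

```python
import numpy as np

C1, C2 = 0.01, 0.03  # constants as stated in the text

def euclidean_loss(x, x_hat):
    """Eq. (1): mean squared per-pixel difference over the W x H image."""
    return float(np.mean((x - x_hat) ** 2))

def global_ssim(x, y, c1=C1, c2=C2):
    """Eq. (2), using whole-image statistics (an assumption; SSIM is often windowed)."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)

def total_loss(x, x_hat, lam=5.0):
    """Eq. (3) exactly as written: L_E plus lambda_SSIM times SSIM."""
    return euclidean_loss(x, x_hat) + lam * global_ssim(x, x_hat)

rng = np.random.default_rng(0)
reference = rng.random((16, 16))
inverted = 1.0 - reference
print(global_ssim(reference, reference), global_ssim(reference, inverted))
```

SSIM increases with similarity, so gradient-based implementations often minimise a dissimilarity such as 1 - SSIM; whether that form was used here is not stated, and the sketch does not assume it.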

3.3 Change Detection

Suppressing speckle noise is critical to enhancing CD performance. To evaluate the proposed despeckling model, we incorporated four existing CD methods: PCA-k-means [3], NR-ELM [9], DDNet [22] and LANTNet [20]. PCA-k-means [3] is an unsupervised change detection method that utilises principal component analysis and k-means clustering to identify changes by splitting the feature vector space into two clusters. NR-ELM [9] employs a neighbourhood-based ratio to create a difference image and subsequently utilises an extreme learning machine to model high-probability pixels in the difference image. This information is then combined with the initial change map to produce the final change detection result. DDNet [22] combines spatial and frequency domain techniques to reduce speckle noise, while LANTNet [20] leverages feature correlations across multiple convolutional layers and incorporates a robust loss function to mitigate the impact of noisy labels.
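
The idea behind PCA-k-means can be sketched as follows. This is a simplified stand-in, not the method of [3]: it builds one feature vector per pixel from an overlapping neighbourhood of the absolute difference image, projects the features onto the leading principal components, and splits them into two clusters. The block size, number of components, and deterministic initialisation are assumptions chosen for illustration.

```python
import numpy as np

def pca_kmeans_change_map(img1, img2, block=3, n_components=3, n_iter=20):
    """Simplified PCA + two-cluster k-means change detection (illustrative only)."""
    diff = np.abs(np.asarray(img1, dtype=np.float64) - np.asarray(img2, dtype=np.float64))
    h, w = diff.shape
    pad = block // 2
    padded = np.pad(diff, pad, mode="reflect")
    # One feature vector per pixel: its flattened block x block neighbourhood.
    feats = np.stack(
        [padded[i:i + h, j:j + w].ravel() for i in range(block) for j in range(block)],
        axis=1,
    )
    local_mean = feats.mean(axis=1)
    # PCA via SVD of the mean-centred feature matrix.
    centred = feats - feats.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    proj = centred @ vt[:n_components].T
    # Assumed initialisation: pixels with the weakest and strongest local difference.
    centres = proj[[local_mean.argmin(), local_mean.argmax()]]
    for _ in range(n_iter):
        dist = ((proj[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centres[k] = proj[labels == k].mean(axis=0)
    if np.all(labels == labels[0]):
        # Only one cluster was found: report no change.
        return np.zeros((h, w), dtype=bool)
    # The cluster with the larger mean difference is taken as "changed".
    flat = diff.ravel()
    changed = int(flat[labels == 1].mean() > flat[labels == 0].mean())
    return (labels == changed).reshape(h, w)

pre = np.zeros((20, 20))
post = pre.copy()
post[5:10, 5:10] = 10.0  # synthetic changed patch
cm = pca_kmeans_change_map(pre, post)
print(bool(cm[7, 7]), bool(cm[0, 0]))
```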

4 Experimental Results and Evaluation

In this section, we introduce the datasets and evaluation metrics. We then present the results and evaluate them against those obtained from state-of-the-art CD methods.

4.1 Datasets and Evaluation Metrics

Two types of datasets were used in this paper. The first is the Berkeley Segmentation Dataset 500, which is widely employed to generate synthetic SAR images and was used here to train the despeckling model. The second comprises real SAR images, which were used for testing, specifically to assess the model's performance for change detection. Detailed descriptions of both are given below:

  • Synthetic SAR Images

    The Berkeley Segmentation Dataset 500 (BSD-500) was originally developed to evaluate the segmentation of natural edges, including object contours, object interior and background boundaries [1]. It includes 500 natural images whose object boundaries and edges were manually annotated by multiple users. This dataset has been widely used to generate synthetic SAR images for the purpose of despeckling [15, 18, 25]. Inspired by these studies, we have used it to train our despeckling model.

  • Real SAR Images

    For the purpose of change detection, we employed three real SAR image datasets that are multi-temporal and have been co-registered and corrected geometrically.

    • Farmland and Yellow River Datasets: The images for both datasets were captured by RADARSAT-2 in the region of the Yellow River Estuary in China on 18th June 2008 (pre-change) and 19th June 2009 (post-change). The pre-change images are single-look, whereas the post-change images have been acquired via a multi-look (four) imaging process. The single-look pre-change image is significantly influenced by speckle noise compared to the four-look post-change image [10]. The disparity between the single and four looks in these two SAR datasets poses a significant challenge for change detection methods.

    • Ottawa Dataset: The images for this dataset were captured by the RADARSAT SAR sensor in May 1997 (pre-change) and August 1997 (post-change) in the areas affected by floods [11, 22, 26]. Because of the single imaging process, the pre- and post-change images are less affected by noise in this dataset.
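
The single-look versus four-look disparity described for the Farmland and Yellow River datasets can be illustrated with a small simulation. The sketch assumes fully developed speckle, so a single-look intensity is exponentially distributed with unit mean and a multi-look image is the average of independent looks; the function names are hypothetical, and the equivalent number of looks (ENL) is estimated as the squared mean over the variance of a homogeneous region.

```python
import numpy as np

def simulate_intensity(reflectivity, looks, rng):
    """Average `looks` independent single-look intensities (exponential, unit mean)."""
    single = rng.exponential(scale=1.0, size=(looks,) + reflectivity.shape)
    return reflectivity * single.mean(axis=0)

def equivalent_number_of_looks(region):
    """ENL of a homogeneous intensity region: squared mean over variance."""
    return float(region.mean() ** 2 / region.var())

rng = np.random.default_rng(0)
flat = np.ones((200, 200))  # homogeneous scene with constant reflectivity
enl_single = equivalent_number_of_looks(simulate_intensity(flat, 1, rng))
enl_four = equivalent_number_of_looks(simulate_intensity(flat, 4, rng))
print(round(enl_single, 2), round(enl_four, 2))
```

Under these assumptions the four-look image has roughly a quarter of the speckle variance of the single-look image, which is why a pixel-wise comparison of the two acquisitions is dominated by noise in the pre-change image.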

The synthetic SAR images were utilised to train the proposed DM, as depicted in Fig. 1. In contrast, the real SAR images were despeckled for the purpose of change detection (CD datasets). Figure 3 presents the real SAR datasets.

To evaluate the results, we used two common evaluation metrics, namely Overall Accuracy (OA) and F1 score. The F1 score is usually used to evaluate the change detection accuracy [7, 24].
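
For reference, both metrics can be computed from a binary change map and its ground truth using their standard definitions, as in the sketch below. The function names are illustrative, and returning 0 for an undefined F1 score is an assumed convention.

```python
import numpy as np

def confusion_counts(pred, gt):
    """True/false positives and negatives, with 'changed' as the positive class."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = int(np.sum(pred & gt))
    tn = int(np.sum(~pred & ~gt))
    fp = int(np.sum(pred & ~gt))
    fn = int(np.sum(~pred & gt))
    return tp, tn, fp, fn

def overall_accuracy(pred, gt):
    """Fraction of pixels labelled correctly."""
    tp, tn, fp, fn = confusion_counts(pred, gt)
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(pred, gt):
    """Harmonic mean of precision and recall for the changed class."""
    tp, _, fp, fn = confusion_counts(pred, gt)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # assumed convention when undefined

gt = np.array([[1, 1, 0, 0]])
pred = np.array([[1, 0, 1, 0]])
print(overall_accuracy(pred, gt), f1_score(pred, gt))
```

Because changed pixels are usually a small minority, OA can remain high even when most changes are missed, which is why the F1 score is reported alongside it.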

Fig. 3. The real SAR datasets. (a) Image acquired at T1. (b) Image acquired at T2. (c) Ground truth image (GT).

Table 2. Quantitative evaluation on three CD datasets based on despeckling model. Here, w/o means it is the original method without despeckling, and DM is our proposed despeckling model.

4.2 Experimental Results and Discussion

To evaluate the effectiveness of the despeckling model, we compared the results of change detection methods (namely PCA-k-means (PCAK) [3], NR-ELM [9], DDNet [22] and LANTNet [20]) with and without the despeckling model using three real SAR datasets. Figures 5, 6 and 7 demonstrate the proposed despeckling model performance on the Yellow River, Farmland and Ottawa datasets, respectively. DM has considerably enhanced the F1 score for existing (including state-of-the-art) change detection methods. In all these experiments, we empirically set \(\lambda _{\text {SSIM}}\) to 5 in the loss objective of Eq. (3) as a trade-off between despeckling and change detection performance. Table 2 presents the OA and F1 score on three real SAR datasets for four CD methods. However, as shown in Fig. 4, the NR-ELM algorithm with the despeckling model achieved a lower F1 score on the Ottawa dataset, which is less affected by speckle noise. For the same reason, the other methods already achieve a high F1 score on this dataset without DM. Additionally, compared to other methods, NR-ELM exhibits more resistance to speckle noise due to the built-in despeckling process within its architecture. Therefore, the decrease in the F1 score when incorporating the DM module is attributed to the extra despeckling process, which over-smooths the input image and subsequently reduces the F1 score.

Fig. 4. The correlation between DM and the F1 score for SAR CD datasets

On the Yellow River and Farmland datasets, the proposed DM achieves a superior F1 score for the CD methods compared with the results without DM (w/o), owing to its ability to cope efficiently with the single-look pre-change and multi-look post-change SAR images via the robust loss function. It should be noted that CD methods without the despeckling model perform well on the Ottawa dataset because the dataset is only slightly affected by speckle noise. Nevertheless, except for NR-ELM, the performance of CD methods was further improved with the proposed DM, as presented in Table 2 and Fig. 4.

4.3 Hardware and Running Times

The experiments were conducted using three datasets (described in Sect. 4.1) on a Tesla P100 GPU with 16 GB of RAM and 147.15 GB of disk space, resulting in a training duration of approximately 11 h. The proposed despeckling model was trained using TensorFlow 2.0.

Fig. 5. Change detection results on the Yellow River dataset. Rows: (1st row) Yellow River ground truth (GT), (2nd row) CD method results without despeckling, (3rd row) CD method results with the proposed DM. Columns: (a) PCAK [3], (b) NR-ELM [9], (c) DDNet [22], and (d) LANTNet [20].

Fig. 6. Change detection results on the Farmland dataset. Rows: (1st row) Farmland ground truth (GT), (2nd row) CD method results without despeckling, (3rd row) CD method results with the proposed DM. Columns: (a) PCAK [3], (b) NR-ELM [9], (c) DDNet [22], and (d) LANTNet [20].

Fig. 7. Change detection results on the Ottawa dataset. Rows: (1st row) Ottawa ground truth (GT), (2nd row) CD method results without despeckling, (3rd row) CD method results with the proposed DM. Columns: (a) PCAK [3], (b) NR-ELM [9], (c) DDNet [22], and (d) LANTNet [20].

5 Conclusion

In recent years, deep-learning architectures have shown promise in improving SAR change detection performance. However, the challenge of speckle noise persists in these methods. To overcome this challenge, we propose a despeckling model that effectively suppresses speckle noise and enhances the performance of existing change detection methods. Extensive evaluations and comparisons with state-of-the-art methods demonstrate the superior performance of our proposed despeckling model. It should be noted that our current approach focuses solely on a single imaging modality. Future work could explore multi-modal change detection, incorporating both optical and SAR data.