1 Introduction

Conventional (manual) crowdsourced map updating procedures utilise remote sensing imagery as a background layer to guide mappers as they manually digitise objects (e.g. buildings and roads). For example, OpenStreetMap (OSM) allows the use of multiple satellite image sources when updating its maps [1]. However, automatically detecting changes in satellite imagery relative to the latest versions of online vector maps is an important next step for many GIScience-related problems, including mapping.

Previously, we introduced our methodology for detecting changes (both constructions and destructions) between vector maps and raster images [2]. Subsequently, a series of experiments was conducted to evaluate the accuracy of this OSM-GAN procedure. This paper reports on these experiments and other related outcomes of OSM-GAN predictions with various datasets.

Specifically, this study evaluates the prediction accuracy of various OSM-GAN models on several spatial datasets to select the best change detection model for use in further map updating operations. Two raster and two vector data sources were tested. The raster data comprised 8-bit (panchromatic) and 24-bit (RGB) images with spatial resolutions of 15 cm/pixel and 30 cm/pixel [5]. The vector data comprised OpenStreetMap (OSM) map data and Ordnance Survey Ireland (OSi) building footprint data of Dublin city centre, used in conjunction with the raster data mentioned above.

1.1 Data Sources

Raster Data

Raster satellite image data was used to train the OSM-GAN models for detecting any changes to the map in a given Area of Interest (AoI). First, a satellite image dataset was created using freely available Google Earth satellite images discovered online using customised data crawlers that consider both spatial resolution (15 cm and 30 cm) and AoI. Second, a 25 cm resolution aerial orthophoto dataset of the Dublin area, registered to the Irish Transverse Mercator (ITM) coordinate system, was acquired from OSi under an academic research license.

These TIFF (Tagged Image File Format) orthophotos needed to be pre-processed before being input to the deep neural network: they were resampled to 30 cm pixels, co-registered, tiled, and served from QGIS. Figure 1 shows the qualitative differences between both data sources (Google Earth and OSi). Note how some buildings currently visible in the Google Earth imagery are not present in the OSi orthophotos, as they have since been demolished in preparation for constructing the new TU Dublin campus.

Fig. 1. Comparison of Google Earth satellite images (first row) and OSi aerial ortho images (second row) of the same AoI around Grangegorman, Dublin. The displayed resolution for both is 30 cm/pixel. The Google Earth images appear more vivid, whereas the OSi images can be obtained at a higher resolution.

Vector Data

OSM vector data was the primary map data source checked for changes in this study. As OSi building footprint (vector) data is produced by Ireland’s National Mapping Agency, it was used as ground truth for model training and prediction purposes. The OSM vector data was downloaded using their Overpass API [3] by first parsing the minimum bounding rectangle (MBR) of a user generated AoI. The OSi building footprints were provided in DWG (AutoCAD) format. A series of operations converted the DWG formatted data into GeoJSON format to be compatible with further processing steps. Table 1 gives a summary of both the OSM and OSi building footprint datasets.
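As an illustration of the download step, the following minimal sketch derives the MBR of an AoI polygon and formats it as an Overpass QL bounding-box query for building ways. The AoI coordinates and the function names are hypothetical, and the query is only one plausible form of the request; the crawler used in this study may differ.

```python
def mbr(polygon):
    """Return (south, west, north, east) for a list of (lat, lon) vertices."""
    lats = [lat for lat, _ in polygon]
    lons = [lon for _, lon in polygon]
    return min(lats), min(lons), max(lats), max(lons)


def building_query(polygon):
    """Build an Overpass QL query for building ways inside the polygon's MBR.

    Overpass QL expects bounding boxes in (south, west, north, east) order.
    """
    south, west, north, east = mbr(polygon)
    bbox = f"{south},{west},{north},{east}"
    return f'[out:json];way["building"]({bbox});out geom;'


# Hypothetical AoI vertices (lat, lon); illustrative values only.
aoi = [(53.354, -6.285), (53.359, -6.279), (53.352, -6.274)]
query = building_query(aoi)
print(query)
```

The resulting string would be sent to an Overpass API endpoint; the network call is omitted here so that the sketch stays self-contained.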

Table 1. Details of the two vector datasets.

1.2 The Kay Supercomputer

The Irish Centre for High-End Computing (ICHEC) allows institutional users (e.g. academic researchers) access to its supercomputing infrastructure, named Kay [4]. Kay comprises five sub-components: Cluster, GPU, Phi, High Memory, and Service and Storage. For the experiments in this study, the GPU service was utilised.

The GPU service is a partition of 16 nodes where each node has 2 × 20-core 2.4 GHz Intel Xeon Gold 6148 (Skylake) processors, 192 GiB of RAM, a 400 GiB local SSD for scratch space and a 100 Gbit Omni-Path network adaptor. Two NVIDIA Tesla V100 16 GB PCIe (Volta architecture) GPUs are integrated on each node. Each GPU has 5,120 CUDA cores and 640 Tensor Cores. In order to reduce training times, this study tested the Kay Supercomputer with various parameter settings [4]. As a result, the overall training time per model fell from a few days on a high-end “gamer spec” laptop to just a few hours on Kay.

2 Related Work

Spatial change detection is a well-researched area in both the GIScience and computer vision domains. Historically, many different image processing techniques, including Markov Random Field [6] and Principal Component Analysis [7], were used to perform spatial change detection operations. More recently, artificial neural network-based techniques have been introduced to address various limitations of more traditional approaches (e.g. to overcome low performance, low segmentation accuracy, higher time complexity, etc.).

Common solutions to this problem now rely on applying Machine Learning (ML) techniques such as U-Net [8], SegNet [9, 10], Mask R-CNN [11], and Pix2Pix, among others [12, 13]. These approaches follow the encoder-decoder architecture to perform image segmentation, a critical step in any change detection process. In GIS in particular, image segmentation attempts to isolate various entities in the environment that are visible in aerial imagery, such as buildings [14,15,16], road networks [17,18,19,20,21,22,23,24], and land-use classifications [25,26,27,28].

With the emergence of Generative Adversarial Networks (GANs), image segmentation procedures have been redefined as image-to-image translation. A GAN is an ML technique for training a generative neural network model by representing the task as a supervised learning problem with two sub-models: a generator that learns to generate new examples, and a discriminator that tries to classify examples as either real or fake (generated) [29]. Several studies have proposed GAN-based solutions for various spatial change detection problems, such as generating heat-maps of possible changes [30], seasonal change detection [31], and image classification [32].
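For reference, the generator/discriminator interplay described above is commonly written as the minimax objective of the original GAN formulation, where G is the generator, D the discriminator, x a real example, and z a noise input. This is the generic objective only; the exact loss used by OSM-GAN is not stated in this paper and may include additional terms.

```latex
\min_{G}\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```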

2.1 OSM-GAN for Spatial Change Detection

The OSM-GAN approach presented in this paper is a change detection methodology that employs spatial imagery (satellite images) and OSM vector map data [2] to train its models. The deep learning model should detect image objects (e.g. buildings) accurately enough for changes to these map features to be predicted reliably. Technically, the OSM-GAN model needs to perform a satellite image to feature-map translation with a high confidence level [2]. This paper proposes a methodology to evaluate various GAN models (trained with different datasets to perform satellite image to feature-map translation) to detect spatial changes accurately.

Producing our OSM-GAN model begins with the data crawling process. Freely available raster and vector data sources are crawled (mined) and saved in local directory structures ready for further processing. Geo-referenced satellite images are merged to construct the left half of the training sample, and OSM vectors are merged into a binary (black & white) single image to create the right half of the training sample. This process results in a single 600 × 300 pixel sized training sample as shown in Fig. 2.

Fig. 2. One sample of the OSM-GAN training dataset. The left side illustrates the satellite image component, and the right side shows the corresponding feature-map.
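The side-by-side layout of a training sample can be sketched as follows. The sketch assumes that each half is a 300 × 300 pixel single-channel image held as a list of pixel rows; the function name and the synthetic pixel values are illustrative and are not taken from the OSM-GAN implementation.

```python
TILE = 300  # assumed size of each half, giving a 600 x 300 sample


def compose_sample(satellite, feature_map):
    """Join two TILE x TILE images (lists of pixel rows) side by side."""
    for img in (satellite, feature_map):
        if len(img) != TILE or any(len(row) != TILE for row in img):
            raise ValueError("both halves must be 300 x 300 pixels")
    # Satellite image forms the left half, binary feature-map the right half.
    return [left + right for left, right in zip(satellite, feature_map)]


# Synthetic data: a uniform grey "satellite" tile and one white building blob.
satellite = [[128] * TILE for _ in range(TILE)]
feature_map = [[0] * TILE for _ in range(TILE)]
for y in range(100, 150):
    for x in range(50, 120):
        feature_map[y][x] = 255

sample = compose_sample(satellite, feature_map)
print(len(sample[0]), "x", len(sample))
```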

An object-density based data filtering mechanism was used to remove false-negative data samples from the training dataset [2]. The filtered dataset was then split into training and validation sets in a 3:2 ratio. Finally, these datasets were fed into the training algorithm on the Kay supercomputer to generate the resulting OSM-GAN model.
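The filtering and splitting steps might look like the sketch below. It assumes that object density is the fraction of building pixels in the feature-map and that samples below a minimum density are discarded; the threshold value, the sample structure, and the function names are hypothetical, since the actual criterion is defined in [2].

```python
import random


def object_density(feature_map):
    """Fraction of non-zero (building) pixels in a binary feature-map."""
    total = sum(len(row) for row in feature_map)
    white = sum(1 for row in feature_map for px in row if px)
    return white / total


def filter_and_split(samples, min_density=0.05, seed=0):
    """Drop low-density samples, then split the rest 3:2 (training:validation)."""
    kept = [s for s in samples if object_density(s["feature_map"]) >= min_density]
    random.Random(seed).shuffle(kept)  # fixed seed for a reproducible split
    cut = len(kept) * 3 // 5
    return kept[:cut], kept[cut:]


def make_sample(name, n_white):
    """Build a tiny 4 x 4 feature-map with n_white building pixels."""
    flat = [255] * n_white + [0] * (16 - n_white)
    return {"name": name, "feature_map": [flat[i:i + 4] for i in range(0, 16, 4)]}


# Ten samples with mapped buildings and two empty (likely unmapped) samples.
samples = [make_sample(f"tile_{i}", 4) for i in range(10)]
samples += [make_sample("empty_a", 0), make_sample("empty_b", 0)]

train, val = filter_and_split(samples)
print(len(train), len(val))
```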

To initiate the change detection process, a feature-map needs to be predicted for a particular satellite image using the OSM-GAN model generated previously. A feature-map is a binary image that represents particular map features (e.g. buildings) as white blobs. The predicted feature-map is then segmented into separate objects and compared to current OSM vector data using an Overlap Score Matrix (the percent overlap of a feature-map object and its OSM vector footprint). Finally, any detected changes are post-processed to compose OSM-acceptable changesets. Figure 3 illustrates the overall workflow for spatial change detection based on OSM-GAN.
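One possible reading of the Overlap Score Matrix is sketched below, with objects represented as sets of pixel coordinates. It assumes that each cell holds the percentage of an OSM footprint covered by a predicted object, and that rows or columns with no sufficiently large score indicate candidate constructions or destructions. The 50% threshold and all names are hypothetical; the definition used by OSM-GAN is given in [2].

```python
def box(x0, y0, x1, y1):
    """Pixel set of an axis-aligned rectangle (upper bounds exclusive)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def overlap_score_matrix(predicted, footprints):
    """Rows: predicted objects; columns: OSM footprints.

    Each cell is the percentage of the footprint covered by the object.
    """
    return [[100.0 * len(obj & fp) / len(fp) for fp in footprints]
            for obj in predicted]


def candidate_changes(matrix, n_footprints, threshold=50.0):
    """Return (unmatched predicted objects, unmatched OSM footprints)."""
    constructions = [i for i, row in enumerate(matrix)
                     if all(score < threshold for score in row)]
    destructions = [j for j in range(n_footprints)
                    if all(row[j] < threshold for row in matrix)]
    return constructions, destructions


predicted = [box(0, 0, 10, 10), box(20, 20, 30, 30)]
footprints = [box(0, 0, 10, 8), box(40, 40, 50, 50)]

matrix = overlap_score_matrix(predicted, footprints)
constructions, destructions = candidate_changes(matrix, len(footprints))
print(matrix, constructions, destructions)
```

In this toy case the second predicted object matches no footprint (a candidate construction) and the second footprint matches no predicted object (a candidate destruction).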

Fig. 3. System architecture of proposed OSM-GAN methodology for spatial change detection.

3 Experiments and Results

For this study, a series of experiments was conducted to evaluate the accuracy of OSM-GAN models qualitatively and quantitatively. Sixteen OSM-GAN models were trained with different datasets. A combination of two different spatial resolutions (15 cm and 30 cm) with two different types of images (panchromatic and RGB) was used to create the raster image segment (left half of the training sample). OSM and OSi vector data were used to create the right half of the training sample (Fig. 2). For instance, the Google_OSi_8bit_z19 dataset was created from panchromatic (8-bit) Google Earth satellite images at 30 cm/pixel and OSi vectors. After the training process, each final model takes the name of the dataset used to create it.
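The sixteen dataset (and model) names follow from the combinations described above, as the following sketch enumerates. It assumes that the naming pattern seen in names such as Google_OSi_8bit_z19 and OP_OSi_24bit_z19 extends to every combination and that z20 denotes the 15 cm/pixel resolution; names not quoted elsewhere in this paper are therefore inferred rather than confirmed.

```python
from itertools import product

RASTER = ("Google", "OP")  # "OP" = OSi orthophotos, as in OP_OSi_24bit_z19
VECTOR = ("OSi", "OSM")
BITS = (8, 24)             # panchromatic and RGB
ZOOM = {19: 30, 20: 15}    # assumed mapping of zoom suffix to cm/pixel


def dataset_names():
    """Enumerate every raster/vector/bit-depth/zoom combination."""
    return [f"{r}_{v}_{b}bit_z{z}"
            for r, v, b, z in product(RASTER, VECTOR, BITS, ZOOM)]


names = dataset_names()
for name in names:
    print(name)
```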

3.1 Modelling OSM-GAN with OSi Data

OSi raster data (orthophotos resampled to 30 cm and 15 cm pixels to match the satellite data) and OSi vector building footprint data were used in this experiment. Four datasets with different spatial resolutions and bit-depths were created from these sources. These datasets were smaller than the OSM datasets since the data provided by OSi covered a limited area of Dublin city centre only. Table 2 summarises the datasets produced for this experiment.

Table 2. Details of datasets used in OSi-OSi experiment

Upon completion of the training process, each model was evaluated on a new dataset within the same AoI. These results were qualitatively and quantitatively evaluated, and Accuracy, Recall, Precision and F1 score measurements were calculated for each model (Table 3). The model trained with 30 cm/pixel resolution RGB images can be considered more accurate than the other three models.
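The four measurements can be computed from a confusion matrix, as in the sketch below. It assumes a pixel-wise comparison between a predicted binary feature-map and its ground-truth mask; whether the evaluation in this study was pixel-based or object-based is not stated here, so this is one possible formulation, and the small masks are synthetic.

```python
def confusion_counts(predicted, truth):
    """Count TP, FP, FN, TN over two binary masks of equal shape."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(predicted, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def scores(tp, fp, fn, tn):
    """Accuracy, Precision, Recall and F1 score; undefined ratios become 0.0."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


predicted = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
truth = [[1, 0, 0, 0],
         [1, 1, 0, 0]]

counts = confusion_counts(predicted, truth)
result = scores(*counts)
print(counts, result)
```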

Table 3. Quantitative evaluation of the model trained with OSi Orthophoto and OSi vector data

Figure 4 qualitatively compares the outcomes of the above-listed OSM-GAN model predictions. It can be seen in Fig. 4a that the OP_OSi_24bit_z19 model gives comparatively more accurate results. The comparison also demonstrates the importance of qualitative analysis in testing: for example, the OP_OSi_8bit_z20 model predicted a large building that could be identified as an “extension” to existing OSi vector data by the subsequent change detection process.

Fig. 4. Predictions of OSM-GAN model trained with OSi Orthophotos and building footprints.

3.2 Modelling OSM-GAN with OSi-OSM Data

The second experiment was designed to investigate the consistency/coherence between OSi Orthophotos and OSM vectors. Apart from a difference in the spatial reference system used, it was observed that current OSM vectors of the test area are outdated. Therefore, many data samples were filtered out in the data filtering phase. Table 4 summarises the generated datasets using OSi Orthophoto images and OSM vectors.

Table 4. Details of datasets employed in OSi-OSM experiments.

A quantitative analysis of the testing results is listed in Table 5. In this case, the model trained with the 30 cm (z19) 8-bit dataset showed the highest accuracy, whereas in the first experiment the most accurate model was trained on RGB images. In both cases, the best model was trained at the same resolution.

Table 5. Quantitative results of the OSi-OSM experiment.

Figure 5 illustrates the qualitative comparisons of the models in Table 5. Since the visual comparisons show only two random instances taken from the test dataset, they may not fully reflect the quantitative measurements obtained above. However, to improve the change detection workflow, the results should be accurate both quantitatively and qualitatively.

Fig. 5. Qualitative analysis of predictions from the model trained with OSi Orthophotos and OSM building footprints.

3.3 Modelling OSM-GAN with Google-OSi Data

In a third experiment, models were trained with Google Earth satellite images and OSi building footprints. Four datasets were created using different spatial resolutions and bit-depths. A relatively small AoI fitting the OSi boundary constraints was applied to the Google Earth image crawler to collect the relevant satellite images of the area. Table 6 summarises the details of these datasets.

Table 6. Details of the datasets used in the Google-OSi experiments.

Table 7 lists the quantitative measurements calculated for the trained models. The model trained with 15 cm RGB images scored better quantitative results than the other three models. In contrast to the previous experiments (OSi-OSi and OSi-OSM), here the model trained with higher-resolution images is quantitatively more accurate.

Table 7. Quantitative results obtained from experiments conducted with Google-OSi datasets

Figure 6 shows a qualitative comparison of some prediction samples. The predictive results of the Google_OSi_24bit_z20 model (Fig. 6b) agree with the above quantitative results. The predicted polygons can be used in the subsequent change detection process since they align closely with the ground truth polygons.

Fig. 6. Predictions of models trained with Google Earth satellite images and OSi buildings.

3.4 Modelling OSM-GAN with Google-OSM Data

Finally, Google Earth satellite images and OSM vectors were combined to perform another training phase. Since both data resources are freely available and not restricted to a limited area, a wider AoI was chosen and crawled to create the datasets listed in Table 8.

Table 8. Details of datasets used in Google-OSM experiment.

The above-listed datasets were used to train four OSM-GAN models. These models were then evaluated using the same measurements as before: Accuracy, Recall, Precision, and F1 Score (Table 9). The model trained with 30 cm RGB Google Earth satellite images and OSM vector footprints performed best. Significantly, this is the most accurate OSM-GAN model among all the models evaluated in the four experiments.

Table 9. Quantitative measurements obtained from the final experiment.

Fig. 7. Qualitative comparisons of models trained with Google Earth satellite images and OSM building footprints.

Figure 7 qualitatively compares the prediction results of models trained with Google Earth satellite images and OSM building footprints. In this case, the predictions of Google_OSM_24bit_z19 show the best qualitative test result, agreeing with the quantitative results above. Comparing all 16 models tested, the results of the Google_OSM_24bit_z19 model suggest that training on 30 cm Google Earth satellite images with OSM vector data yields the most accurate OSM-GAN models for change detection.

4 Conclusions

This paper presented sixteen different OSM-GAN based experiments with a quantitative analysis of each model produced, as well as some qualitative observations. It evaluated different OSM-GAN models against different raster and vector data sources. Each dataset offers its own benefits and limitations, and the qualitative results motivated continued training with larger area datasets.

The study concludes that the vector footprint generated by OSM-GAN image-to-image translation could be extended to spatial change detection procedures. Experiments show that model training with larger datasets (i.e. datasets built from Google Earth satellite images and OSM building footprints) yielded more accurate feature-map predictions.

Since the proposed spatial change detection methodology depends heavily on OSM-GAN model accuracy, the model should translate a satellite image to its corresponding feature-map as accurately as possible. The final experiment reveals that training on larger sets of geographically similar areas could be a solution to generating more accurate OSM-GAN models. In other words, models trained with Dublin data should not be used for mapping Paris, for example. The model trained with Google Earth satellite images (24-bit, 30 cm/pixel) and OSM building footprints scored the highest accuracy (88.4%) among all the experiments. Moreover, the predictions of the Google_OSM_24bit_z19 model can also be observed qualitatively to be more accurate than the other model predictions.

The ultimate objective of this research is to build an end-to-end workflow to update crowdsourced maps automatically with the use of freely available data (satellite images, vector footprints) and Artificial Intelligence (AI) techniques. Automated map update success ultimately depends on the accuracy of the ML change detection process since correctly identifying spatial changes in the map is an initial key step in this process. Compared to contemporary approaches for automatically detecting spatial changes, the proposed OSM-GAN approach offers an appropriate mechanism to follow.

As a next step, a new OSM-GAN model will be trained on the original 25 cm Dublin orthophotos with a larger AoI to potentially produce even more accurate feature-map results. The training process will utilise a Transfer Learning approach, starting from the Google_OSM_24bit_z19 model parameters as the base input model. In order to evaluate the performance and accuracy of OSM-GAN against other spatial change detection models in the literature, a new test phase will also be carried out on the crowdAI Mapping Challenge dataset in future work.