Introduction

Meteorological and climatic conditions in urban areas can be controlled or improved by wise urban planning and management. Object recognition and land cover classification, building on recent advances in airborne and space-borne sensor technologies and digital imaging techniques, provide a powerful solution for urban planning (Abbate et al. 1995; Alavipanah 2008).

Excessive fossil fuel use and greenhouse gas emissions, together with the surface and atmospheric modifications caused by urbanization, generally lead to a climate warmer than that of the surrounding non-urbanized areas. This phenomenon is known as the urban heat island. To study heat islands, thermal remote sensing data have been used over urban areas to map temperature differences and to analyze the relationships between urban surface temperatures and land cover types (Voogt and Oke 2003). Synthetic-aperture radar (SAR) sensors, in turn, play an increasingly important role in land cover classification thanks to their ability to operate day and night, to see through cloud cover, and to capture the structure and dielectric properties of earth surface materials (Zhu et al. 2012).

Data fusion is an effective way to combine information from various sources synergistically in order to provide a better understanding of a given scene (Esteban et al. 2004; Stathaki 2008; Tabib Mahmoudi et al. 2013; Huang and Zhang 2012; Ran et al. 2012; Huang et al. 2012; Ye et al. 2014; Tabib Mahmoudi et al. 2014). Owing to their different characteristics, optical and SAR remote sensing data complement each other. Thus, fusing optical and SAR data increases the capabilities of object recognition algorithms in urban areas (Abbate et al. 1995; Borghys et al. 2007).

As many land cover classes in an urban environment have similar spectral signatures, textural and structural information such as energy, entropy, contrast and topological relationships must be exploited to produce accurate classification maps. Many researchers have already investigated the potential of object-based image analysis (OBIA) approaches for dealing with high-resolution images and the complexities of urban areas (Blaschke 2010; Laliberte et al. 2012; Tabib Mahmoudi et al. 2013).

In this research, an object-based image analysis methodology is proposed for feature-level fusion of Landsat 8 optical and thermal bands with a high-resolution SAR image, in order to improve the accuracy of land cover classification maps in urban areas.

Object Recognition Algorithm

The urban object recognition strategy proposed in this research is an object-based image analysis approach composed of two main steps: image segmentation and knowledge-based classification of the segmented regions (Fig. 1).

Fig. 1

General structure of the proposed object recognition strategy

Image Segmentation

In the first step of the proposed object-based image analysis strategy, a multi-resolution segmentation technique is applied individually to the SAR and Landsat 8 images in order to partition each of them into image regions. The multi-resolution segmentation procedure starts with single image objects of one pixel and repeatedly merges pairs of image objects into larger ones. The merging decision is based on a local homogeneity criterion describing the similarity between adjacent image objects (Baatz and Schape 2000; Tabib Mahmoudi et al. 2014).
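The merging loop can be sketched as follows. This is a drastically simplified, single-band illustration of the idea (greedy pairwise merging under a size-weighted squared-mean homogeneity cost), not the actual eCognition implementation; the `scale` parameter here plays the role of the homogeneity threshold:

```python
import numpy as np

def multires_segment(image, scale=2.0):
    """Toy multi-resolution segmentation: start from one-pixel regions
    and greedily merge the most similar adjacent pair until no merge
    stays below the homogeneity threshold `scale`."""
    h, w = image.shape
    labels = np.arange(h * w).reshape(h, w)
    means = {int(l): float(image.flat[l]) for l in range(h * w)}
    sizes = {int(l): 1 for l in range(h * w)}

    def adjacent_pairs(labels):
        pairs = set()
        for a, b in ((labels[:, :-1], labels[:, 1:]),
                     (labels[:-1, :], labels[1:, :])):
            for x, y in zip(a.ravel(), b.ravel()):
                if x != y:
                    pairs.add((min(x, y), max(x, y)))
        return pairs

    while True:
        best, cost = None, scale
        for i, j in adjacent_pairs(labels):
            # size-weighted increase in spectral heterogeneity
            # if regions i and j were merged
            c = (sizes[i] * sizes[j]) / (sizes[i] + sizes[j]) \
                * (means[i] - means[j]) ** 2
            if c < cost:
                best, cost = (i, j), c
        if best is None:
            break           # every remaining merge exceeds the threshold
        i, j = best
        n = sizes[i] + sizes[j]
        means[i] = (sizes[i] * means[i] + sizes[j] * means[j]) / n
        sizes[i] = n
        labels[labels == j] = i
    return labels
```

Applied to a tiny image with two homogeneous halves, the sketch collapses each half into one region and then stops, because merging across the boundary would exceed the threshold.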

Knowledge-Based Classification

For knowledge-based classification, the properties of each segmented region should be measured based on the characteristics of the input data. In this research, the per-segment spectral and textural characteristics of a segmented region form its properties. After that, a diversity analysis is required to select the optimum spectral and textural features among all of the generated ones. In this phase of our investigation, the selection is based on the visual inspection of an expert operator and on testing different thresholds for each feature.

Feature Measurement

In this proposed object-based image analysis algorithm, spectral features in the form of simple ratios (SR) and normalized difference indices (NDI) are measured based on the thermal and optical bands of Landsat 8 images (see Eqs. 1, 2).

$$ {\text{SR}}_{i,j} = {\frac{{\text{Band}}_{i}}{{\text{Band}}_j}} $$
(1)
$$ {\text{NDI}}_{i,j} = {\frac{{{\text{Band}}_{i} - {\text{Band}}_{j} }}{{{\text{Band}}_{i} + {\text{Band}}_{j} }}}. $$
(2)
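Eqs. 1 and 2 amount to simple per-band arithmetic; a minimal sketch, where the band inputs may be scalars or element-wise array types and a zero denominator is left unguarded for brevity:

```python
def simple_ratio(band_i, band_j):
    """SR_ij = Band_i / Band_j (Eq. 1)."""
    return band_i / band_j

def normalized_difference(band_i, band_j):
    """NDI_ij = (Band_i - Band_j) / (Band_i + Band_j) (Eq. 2)."""
    return (band_i - band_j) / (band_i + band_j)
```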

The following spectral features are selected as the effective optical and thermal features for utilizing in the proposed knowledge base:

  • Temperature–vegetation index (TVX) is negatively related to surface water conditions. Its major advantage is that it integrates both the reflective and thermal bands of remotely sensed data, offering more spectral information for drought detection (Amiri et al. 2009; Jiang and Tian 2010; Orhan et al. 2014; Jiang et al. 2015).

$$ {\text{TVX}} = {\frac{{{\text{LST}}}}{{{\text{NDVI}}}}}, $$
(3)

where NDVI is the normalized difference vegetation index between red and near-infrared bands of Landsat 8 image (Eq. 4).

$$ {\text{NDVI}} = {\frac{{{\text{NIR}} - {\text{Red}}}}{{{\text{NIR}} + {\text{Red}}}}}. $$
(4)
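Eqs. 3 and 4 can be sketched the same way; note that TVX is undefined where NDVI is zero (e.g. bare surfaces), so such regions would need masking in practice:

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red) (Eq. 4)."""
    return (nir - red) / (nir + red)

def tvx(lst, ndvi_value):
    """TVX = LST / NDVI (Eq. 3); undefined where NDVI is zero."""
    return lst / ndvi_value
```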

A standard algorithm was applied to retrieve the land surface temperature (LST). First, the DN values are converted back to radiance units using Eq. 5:

$$ {\text{Radiance}} = \frac{L_{\max } - L_{\min } }{Q_{\text{CalMax}} - Q_{\text{CalMin}} } \times (Q_{\text{Cal}} - Q_{\text{CalMin}} ) + L_{\min } , $$
(5)

where \( Q_{\text{CalMin}} = 0 \), \( Q_{\text{CalMax}} = 65535 \) (the 16-bit Landsat 8 quantization range) and \( Q_{\text{Cal}} \) is the digital number. Then, temperature is obtained from the following equation:

$$ T = \frac{{K_{2} }}{{\ln \left( {\frac{{K_{1} }}{{L_{\lambda } }} + 1} \right)}}, $$
(6)

where \( K_{1} \) and \( K_{2} \) are the sensor calibration coefficients for each thermal band of Landsat 8, \( L_{\lambda } \) is the calculated radiance of the band, and T is the temperature in kelvin. Finally, the land surface temperature (LST) is calculated in ERDAS Imagine software.
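Eqs. 5 and 6 can be chained as below. The calibration values shown are typical of Landsat 8 TIRS band 10 and are used here only as an example; the exact numbers must be read from each scene's metadata file:

```python
import numpy as np

# Example calibration values (Landsat 8 TIRS band 10); read the
# actual values from the scene metadata in practice.
L_MIN, L_MAX = 0.1, 22.0                 # radiance rescaling range
Q_CAL_MIN, Q_CAL_MAX = 0, 65535          # 16-bit DN range
K1, K2 = 774.8853, 1321.0789             # thermal conversion constants

def dn_to_radiance(dn):
    """Eq. 5: linear rescaling of digital numbers to radiance."""
    return (L_MAX - L_MIN) / (Q_CAL_MAX - Q_CAL_MIN) \
        * (dn - Q_CAL_MIN) + L_MIN

def radiance_to_temperature(radiance):
    """Eq. 6: at-sensor brightness temperature in kelvin."""
    return K2 / np.log(K1 / radiance + 1.0)
```

At the extremes of the DN range the rescaling returns `L_MIN` and `L_MAX` exactly, and mid-range radiances map to plausible brightness temperatures around 300 K.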

  • Chlorophyll vegetation index (CVI) is a spectral feature based on the relations between the near-infrared, red and green bands (Hunt et al. 2013).

    $$ {\text{CVI}} = \frac{{\text{NIR}} \times {\text{Red}}}{{\text{Green}}^{2} } $$
    (7)
  • Coloration index (CI) is a spectral feature based on the relations between red and blue bands (Vina et al. 2011).

    $$ {\text{CI}} = \frac{{{\text{Red}} - {\text{Blue}}}}{\text{Red}} $$
    (8)
  • SR (b11, b10) is the simple ratio between the two thermal bands of Landsat 8 data

    $$ {\text{SR}}_{11,10} = \frac{{\text{Band}}_{11} }{{\text{Band}}_{10} } $$
    (9)
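Eqs. 7 and 8 are again per-band arithmetic (and Eq. 9 is simply `simple_ratio` applied to bands 11 and 10); a minimal sketch:

```python
def cvi(nir, red, green):
    """CVI = NIR * Red / Green**2 (Eq. 7)."""
    return nir * red / green ** 2

def ci(red, blue):
    """CI = (Red - Blue) / Red (Eq. 8)."""
    return (red - blue) / red
```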

Moreover, for feature extraction from the SAR image, gray value relationships between each pixel and its neighbors within the pre-identified segmented regions are utilized. Many researchers have utilized different texture analysis methods in object recognition algorithms based on SAR data (Zhu et al. 2012). In this paper, entropy, contrast, correlation and mean are measured in the gray-level co-occurrence matrix (GLCM) space as the optimum features on the SAR image, based on their capability to discriminate each of the individual urban object types (Table 1).

Table 1 Basic mathematics of textural features
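The four GLCM statistics named above can be computed directly from a normalized co-occurrence matrix; the following is a minimal single-offset sketch (one displacement, no symmetrization or quantization handling, and correlation is undefined for a constant region):

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.
    `image` must hold integer gray levels in [0, levels)."""
    P = np.zeros((levels, levels))
    h, w = image.shape
    for y in range(h - dy):
        for x in range(w - dx):
            P[image[y, x], image[y + dy, x + dx]] += 1
    return P / P.sum()

def glcm_features(P):
    """Entropy, contrast, mean and correlation of a normalized GLCM."""
    i, j = np.indices(P.shape)
    nz = P > 0
    mu_i, mu_j = (P * i).sum(), (P * j).sum()
    var_i = (P * (i - mu_i) ** 2).sum()
    var_j = (P * (j - mu_j) ** 2).sum()
    return {
        "entropy": -(P[nz] * np.log(P[nz])).sum(),
        "contrast": (P * (i - j) ** 2).sum(),
        "mean": mu_i,
        # division by zero for a constant region (zero variance)
        "correlation": (P * (i - mu_i) * (j - mu_j)).sum()
                       / np.sqrt(var_i * var_j),
    }
```

In practice these statistics would be aggregated per segmented region rather than over the whole image, so that each region carries one texture value per feature.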

These measured features generate a knowledge base for urban object recognition and classification of the segmented image regions. After generating the knowledge base, the proposed methodology performs feature-level fusion in order to utilize the capabilities of both SAR and Landsat 8 images for improving the accuracy of the classification map. The object classification can be performed by encapsulating the knowledge base into a rule set (Table 2).

Table 2 Sample structure of reasoning rules in the proposed object classification scheme
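A rule set of the kind outlined in Table 2 can be encoded as ordered threshold tests over the fused per-segment features. The sketch below is purely illustrative: the feature names and every threshold are hypothetical, since the paper's actual rules are tuned semiautomatically by an expert operator in eCognition:

```python
def classify_segment(feats):
    """Assign a land-cover label to one segment from its fused
    optical, thermal and SAR texture features via threshold rules.
    All thresholds below are hypothetical placeholders."""
    if feats["cvi"] > 1.2 and feats["glcm_contrast"] < 0.5:
        return "vegetation"
    if feats["ci"] < 0.0:
        return "water"
    if feats["tvx"] > 600.0 and feats["sr_11_10"] > 1.0:
        return "building"          # warm built-up structures
    if feats["glcm_mean"] < 0.2:
        return "shadow"
    return "road"
```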

Experiments and Results

Dataset

The potential of the proposed object recognition and classification methodology is evaluated on Landsat 8 and SAR data over an urban area in Barcelona, Spain. The SAR data, collected in May 2011 with 2.5 m spatial resolution, are a TerraSAR-X StripMap scene with HH polarization and an incidence angle of 35.2°. The Landsat 8 data were collected in June 2014 and pan-sharpened with the panchromatic band of a SPOT 5 image of the same region. Pan-sharpening increases the spatial resolution of the multispectral images in order to improve object recognition results. The pan-sharpened Landsat 8 image, with all of its optical and thermal spectral channels, offers acceptable spatial and spectral resolution for object recognition (Fig. 2).

Fig. 2

a SAR image, b Landsat 8 image

The investigated area is a city district containing a number of large, tall buildings and a mixture of community parks and private housing.

Obtained Results

In the segmentation stage, the multi-resolution segmentation algorithm is applied to each of the individual images using eCognition software, with the scale, shape and compactness parameters set to 250, 0.8 and 0.9, respectively. Then, the spectral and textural features described in the “Feature Measurement” section are measured on all of the image regions of the Landsat 8 and SAR data.

Figure 3 depicts all of the measured spectral and textural features on the input data that are utilized for classification. Thresholds for each object class are set semiautomatically by an expert operator using quantitative and visual analysis of the features in the feature view of eCognition software.

Fig. 3

Measured features on Landsat 8 (a–d) and SAR (e, f): a SR (band 11, band 10), b TVX, c CVI, d CI, e GLCM contrast, f GLCM mean

As can be seen in Fig. 3, after applying the defined threshold values, the CVI spectral feature is capable of filtering out non-vegetation regions such as shadow and ground. Moreover, CI has been utilized for detecting vegetation and water bodies. According to the expert investigations performed, thermal features such as SR (b11, b10) and TVX are capable of detecting warmer structures such as built-up areas, whereas the SAR texture images are suited to recognizing vegetation and shadow areas. Therefore, fusing these two kinds of data, with their different natures, can help classify urban objects better.

In the second stage of the proposed object-based image analysis algorithm, building, road, vegetation, water body and shadow areas are recognized based on the feature-level fusion of Landsat 8 and SAR data. Moreover, to investigate the capabilities of the proposed feature fusion, the results of performing OBIA on Landsat 8 using only spectral features are compared as a baseline (Fig. 4).

Fig. 4

a Classification results of feature-level fusion on SAR and Landsat 8, b classification results of performing OBIA on Landsat 8

The comparison shows improvements in road and vegetation detection after performing feature-level fusion. Moreover, for the quantitative evaluation of the results, some segmented regions of the pre-defined object classes are manually selected by an expert operator. These sample areas are compared with their corresponding results from the object recognition algorithm. The comparison is based on the numbers of correctly detected pixels (true positives), wrongly detected pixels (false positives) and missed pixels (false negatives) determined after object recognition.

According to Table 3, relative to OBIA using only spectral features on Landsat 8, adding SAR texture features improves the overall accuracy and kappa by 2.48 % and 0.06, respectively. Moreover, using the quantitative values for each object class, correctness and quality criteria are determined for the results.

$$ {\text{Correctness}} = \frac{\text{TruePositive}}{{{\text{TruePositive}} + {\text{FalsePositive}}}} $$
(10)
$$ {\text{Quality}} = \frac{\text{TruePositive}}{{{\text{TruePositive}} + {\text{FalsePositive}} + {\text{FalseNegative}}}}. $$
(11)
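Eqs. 10 and 11 translate directly into per-class functions of the pixel counts defined above:

```python
def correctness(tp, fp):
    """Eq. 10: fraction of detected pixels that are correct."""
    return tp / (tp + fp)

def quality(tp, fp, fn):
    """Eq. 11: correct detections relative to all detections and misses."""
    return tp / (tp + fp + fn)
```

Quality is never larger than correctness, since it additionally penalizes missed (false-negative) pixels.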

The comparisons showed improvements in road and vegetation detection after performing feature-level fusion of SAR and Landsat 8 data. However, no considerable progress can be seen in the recognition of the other object classes. There is even a slight decline in the building class because of spectral similarities between asphalt roads and building roofs, which could be resolved using elevation data (Fig. 5).

Table 3 Comparison between the accuracies of classification results
Fig. 5

a Comparison between the quality of the classification results, b comparison between the correctness of the classification results

Discussion and Conclusion

An object-based image analysis strategy was proposed for feature-level fusion of Landsat 8 spectral features and SAR texture features. Because of the spectral and textural similarities between urban objects such as building roofs and road surfaces, generating an accurate classification map in urban areas is difficult. Fusing the capabilities of SAR data in capturing soil moisture and surface roughness with the thermal patterns derived from thermal infrared data was investigated as a way to produce a better classification map. The results of feature-level fusion of SAR and Landsat 8 data showed improvements in object recognition, especially for the vegetation and road classes, which benefit most from the SAR texture images.

Despite the improvement in accuracy, the method still needs further development: defining contextual information such as neighborhood relations for each region, utilizing digital elevation models as input data, and using artificial intelligence techniques such as multi-agent systems to reduce the classification errors caused by segmentation limitations. For applying the proposed feature-level fusion method to the object-based image analysis of other remotely sensed datasets, all aspects of the method are transferable; only the spectral features may need to be modified according to the spectral characteristics of the new datasets.