Introduction

Urbanization is a process characterized by urban land expansion, demographic change and a drastic transformation of land use patterns that alters the physical boundaries of cities (Grimm et al. 2008; Sun and Zhao 2018). Unrestrained urbanization has caused intensive changes in the natural landscape (Haas and Ban 2014), urban ecosystems (Zhang et al. 2017), biodiversity (Li et al. 2016a, b), urban microclimate (Weng et al. 2008; Sannigrahi et al. 2017) and energy flow (Decker et al. 2000) at various spatio-temporal scales (Sun and Zhao 2018). According to the UN World Urbanization Prospects report (United Nation 2014), about 54% of the world's population resided in urban areas in 2014, and this share is projected to reach 66% by 2050. The cities of India are no exception. The urban population of India is 31.16% of the total (Census of India 2011) and is estimated to reach 66% by 2050 (United Nation 2014). In addition, the urban population and urban land of the large mega-cities are growing at an unprecedented pace compared with the medium or small cities (Census of India 2011). According to Chen et al. (2014), the process of urbanization is the combined result of economic development, continuous migration and the various socio-environmental facilities available in urban areas. However, unplanned urbanization is a central concern of planners because of its many implications and impacts (Sharma et al. 2013). At present, urban land transformation and its impact on the urban landscape and urban system are among the most studied phenomena (Jiang and Tian 2010; Ng et al. 2011; Kuang 2012; Scolozzi and Geneletti 2012; Liu et al. 2016). Comprehensive studies have been carried out worldwide to quantify the environmental alteration caused by rapid urbanization (Kates et al. 2001; Ren et al. 2003; Sharma et al. 2013; Hull et al. 2015).

Continuous and rapid urbanization can alter a number of physical and biological characteristics of the urban landscape, including vegetation cover (Li and Liu 2017; Shifaw et al. 2018), water bodies (Yang et al. 2008; Popa et al. 2012; Sun et al. 2016; Ghosh et al. 2018), soil properties (Voogt and Oke 2003) and microclimate (Zhou et al. 2004; Chen et al. 2006; Li et al. 2009; Pal and Ziaul 2017), through the expansion of impervious surface, i.e. built-up area (Sharma et al. 2015; Li et al. 2016a, b; Lu et al. 2017). Understanding the impact of urbanization on the urban environment is indispensable because sustainable urban growth can be achieved through a proper understanding of the coupling between urbanization and its environmental impact (Li and Ma 2014). From this perspective, the aim of the present study is to quantify and analyze the spatio-temporal patterns of urbanization and urban biophysical components in the city of Kolkata. Biophysical components are generally defined as a set of indicators that can track the human impact on a given environment (Dietz et al. 2007; Sannigrahi et al. 2017). Oke (1987) stated that the increase of impervious surface due to rapid urbanization can alter the urban biophysical components, which can significantly affect the Earth–atmosphere energy exchange at the micro scale. The mean temperature at the local level is gradually increasing due to unprecedented change in land use land cover (LULC) (Hoffmann et al. 2012).

Unpredicted and uncontrolled urban growth, with an increase of impervious area and haphazard development, is the main feature of Indian urbanization. It causes significant decreases in agricultural land, vegetation cover, wetlands and other natural water bodies, and increases pollution, slum development and various socio-economic problems (Sudhira et al. 2004; Rahman et al. 2011; Punia and Singh 2012). Modern remote sensing techniques are therefore considered an invaluable tool for quantifying and monitoring urban land cover types with a high level of accuracy (Bhatta 2009; Khan et al. 2017). Detailed and enhanced information can be obtained from remote sensing data with automatic and semi-automatic techniques, and archived images provide this information for past dates. Consequently, the dynamics of the urban landscape can be easily monitored (Mushore et al. 2017; Santos et al. 2017). Landsat satellite imagery (such as MSS, TM and ETM+) is extensively used as the input database for extracting the urban built-up area and estimating various other biophysical elements (Yeh and Li 2001; Chen et al. 2006; Bhatta 2009; Sharma et al. 2013; Bhatti and Tripathi 2014; Sun and Zhao 2018). Researchers employ various remote sensing based indices to monitor urban expansion and its impacts, such as the Normalized Difference Built-up Index (NDBI), Normalized Difference Vegetation Index (NDVI), Normalized Difference Water Index (NDWI), Normalized Difference Bareness Index (NDBaI) and Modified Normalized Difference Water Index (MNDWI) (Chen et al. 2006; Sharma et al. 2013; Li and Liu 2017). Table 1 summarizes the various indices and their applications in urban research along with references.

Table 1 Different urban biophysical parameters and their applications in urban study

Land surface temperature (LST) is considered one of the important biophysical parameters for the analysis of urban health (Xiao and Weng 2007). The increase of impervious surface, i.e. built-up area, due to the modification and expansion of LULC in urban areas results in higher surface temperature (Qin et al. 2001; Amiri et al. 2009; Li et al. 2009; Zhou et al. 2011; Sharma et al. 2015; Wang et al. 2018). Kalnay and Cai (2003) considered LST one of the important indicators for monitoring urban ecological performance. Weng et al. (2008) stated that LST is closely associated with urban biophysical attributes such as NDVI, NDBI, NDWI and MNDWI. LST is usually higher in urban areas than in the surrounding non-urban areas, which is known as the Urban Heat Island (UHI) effect (Chen et al. 2006). The UHI is related to the intensity of the built-up area (Trlica et al. 2017). Several studies have analyzed the relation of greenness (NDVI), impervious land (NDBI) and land use and land cover changes (Chen et al. 2006; Weng et al. 2004; Xiao and Weng 2007; Zhou et al. 2011) with LST (Sandholt et al. 2002; Weng et al. 2006; Raynolds et al. 2008; Julien and Sobrino 2009; Estoque and Murayama 2017). It is also important to assess how the relationships among the various biophysical components vary with LULC. Against this background, the present study evaluates and analyzes the spatio-temporal patterns of LULC and the dynamics of biophysical composition and LST, and assesses the urban footprint using a mixed method of remote sensing techniques and statistical calculation. In addition, researchers have applied hot spot analysis, which is useful for visually assessing high concentrations of LST (Adeyeri et al. 2017; Tran et al. 2017).

There is thus extensive evidence from remote sensing based research on urban biophysical composition and the urban heat island (UHI) phenomenon (Chen et al. 2006; Amiri et al. 2009; Sharma et al. 2013; Sannigrahi et al. 2017). This study highlights the spatio-temporal dynamics of urban land use and its associated biophysical parameters, such as vegetation cover, urban water bodies, soil moisture and built-up area, as well as the change of land surface temperature (LST). The two explicit objectives of this research are (a) to assess the change of urban biophysical components with the change of land use, and (b) to find out the relationship between LST and urban biophysical components and their spatio-temporal patterns. The present study may therefore be significant for sustainable urban management.

Study area

Kolkata Urban Agglomeration (KUA), also known as Kolkata Metropolitan Area (KMA), is the 3rd largest Urban Agglomeration (UA) in India (Census of India 2011). It spreads over an area of 1851 km² between 22°00′19″ N and 23°00′01″ N latitude and 88°00′04″ E and 88°00′33″ E longitude (Fig. 1). It has developed as a linear and contiguous urban concentration in the North–South direction along both the eastern and western banks of the Hooghly River. This urban agglomeration consists of 3 Municipal Corporations, viz. Kolkata Municipal Corporation (KMC), Howrah Municipal Corporation (HMC) and Chandannagar Municipal Corporation (CMC), and 38 municipalities that are contiguous with these 3 Municipal Corporations, along with 77 Census towns, i.e. non-municipal towns, and 16 outgrowths (OGs). In addition, the urban areas are surrounded by 445 rural villages consisting of settlements, agricultural land and green belt. A tropical wet and dry climate dominates the area, with 1650 mm of rainfall and monthly temperatures ranging from 18 to 35 °C. The population of this agglomeration is 14.72 million and the average population density is 7950 persons/km² (Census of India 2011). The population of Kolkata urban agglomeration for 2021 is projected to be 21.10 million, with a 1.8% annual growth rate (Census of India 2011). High population density, compact urban growth and a widespread transit network with low infrastructure (only 6%) are the major spatial characteristics of this agglomeration (Census of India 2011). As a growing metropolis, KMA habitually faces problems such as air pollution, poverty, excessive migration, increasing slum population, traffic congestion and various socio-economic problems (Bhatta 2009) due to rapid and unplanned urban growth (Dasgupta et al. 2013). The surrounding rural areas in particular are becoming more vulnerable in terms of loss of vegetation and natural water bodies, and soil and water pollution (Ramachandra et al. 2014; Mukherjee 2015; Mithun et al. 2016).

Fig. 1 Location of the study area along with different administrative boundaries and validation points

Materials and methods

Database

Landsat images of 1991, 2001, 2011 and 2017 were used for this study. Landsat Thematic Mapper (TM) images of 1991, 2001 and 2011 and Landsat 8 Operational Land Imager (OLI-8) images of 2017 were used to measure the urban biophysical parameters. The Thermal Infrared Sensor (TIRS) bands were also used for retrieval of the LST. The multi-temporal images were collected from the United States Geological Survey (http://earthexplorer.usgs.gov/). The collected images were pre-georeferenced to the UTM Zone 45 North projection with the WGS-84 datum (further information about the collected images is given in Table 2).

Table 2 Selected satellite data used for this analysis

Image pre-processing

The collected images were pre-processed using ERDAS Imagine 2014 software. Spectral enhancement was done by computing band-ratio combinations in the form of vegetation and related indices, to illustrate the relationship between various biophysical parameters, such as built-up intensity (NDBI), greenness (NDVI), wetness (NDWI) and water bodies (MNDWI), and the thermal behaviour (LST) of an urban area. No further atmospheric correction was applied because cloud-free images were selected for analysis (Deng and Wu 2013; Bhatti and Tripathi 2014; Khan et al. 2017). The images were first clipped to the boundary of the study area to reduce null pixels at the processing stage, and band combinations were carefully inspected using false color composites (FCC) (Zha et al. 2003). Landsat TM bands 4 and 5 are the most appropriate for separating built-up area from other areas (Zha et al. 2003; Bhatti and Tripathi 2014; Khan et al. 2017). However, OLI bands differ from TM bands, so the spectral signatures of the built-up area should be checked (Bhatti and Tripathi 2014). At the same time, the other associated signatures, i.e. vegetation, water bodies, wetness etc., should be examined for the multi-temporal images (Khan et al. 2017). The Red band (Band-3, wavelength 0.63–0.69 µm) and the Near-Infrared (NIR) band (Band-4, wavelength 0.76–0.90 µm) were used to identify vegetation cover (Maxwell and Sylvester 2012). Zhang et al. (2009) estimated the built-up area using the NIR band (Band-4, wavelength 0.76–0.90 µm) and the Shortwave middle-infrared (SMIR) band (Band-5, wavelength 1.55–1.75 µm). We checked the Digital Numbers (DN) of the various land types using 30 samples of DN values for both the TM and OLI-8 bands and plotted them in Fig. 2, which shows the higher spectral reflectance of vegetation in both TM band 4 and OLI band 5. Accordingly, subtracting band 3 from band 4 for TM, and band 4 from band 5 for OLI-8, gives a positive value that indicates vegetation (Zha et al. 2003).

Fig. 2 Comparison of spectral signatures of built-up land, vegetation cover, water bodies, barren land and agricultural land between (a) optical bands 1–5 and 7 of Landsat TM and (b) optical bands 2–6 and 7 of OLI-8

To produce thermal maps, noise reduction is required, especially for the thermal infrared (TIR) band (Chen et al. 2006). Noise can distort the retrieved LST, so we adopted a self-adaptive filter method to reduce non-periodic noise and a fast Fourier transform (FFT) method to remove periodic noise. These tasks were performed in ERDAS Imagine 2014 software. The DN values of the thermal bands of Landsat OLI-8 (TIRS, bands 10 and 11) were converted to Top of Atmosphere (ToA) radiance, and the at-satellite brightness temperature in °C was determined (USGS 2015). All the images were geometrically corrected and rectified using 450 ground control points (GCPs) collected through GPS and Google Earth (Fig. 1). All the GCPs were validated by Root Mean Square Error (RMSE); the RMSE ranges from 0.30 to 0.81 pixels, which is acceptable for further processing (Askne et al. 2003).

Method of land uses land cover (LULC) classification

The Landsat satellite data were used for LULC classification. There are several methods for the classification of urban LULC, such as object-based classification (Drǎguţ and Blaschke 2006), algorithm-based classification (Mather 2004), artificial neural network (ANN) and support vector machine (Mitra et al. 2004; Van der Linden et al. 2007). Supervised classification is the most useful method for extracting LULC maps, particularly in urban areas (Sahana et al. 2018). The maximum likelihood (ML) classification algorithm was adopted for supervised LULC classification because it is one of the most useful and best known parametric classifiers (Otukei and Blaschke 2010). The ML algorithm uses Bayes’ theorem to assign a feature vector (\(x\)), i.e. the spectral values of a pixel, to the most likely class (\({\upomega}_{j}\)) out of a set of N classes, namely the class with the highest posterior probability \({\text{Pr}}({\upomega}_{j} \left| x \right.)\). Therefore, the posterior probabilities \({\text{Pr}}({\upomega}_{j} \left| x \right.)\), \(j\in \left[1, N\right]\), are all calculated and the class \({\upomega}_{j}\) with the highest value is selected. \({\text{Pr}}({\upomega}_{j}\left| x \right.)\) is calculated as:

$${\text{Pr}}({\upomega}_{j}\left| x \right.) = \frac{{\text{Pr}}(x\left| {\upomega}_{j} \right.) \times \Pr \left( {\upomega}_{j} \right)}{\Pr \left( x \right)}.$$
(1)

Each feature vector (\(x\)) is thus assigned to the class \({\upomega}_{j}\) with the highest posterior probability. The major advantage of this classifier is that it takes the variance–covariance within each class distribution into account and therefore performs better than other parametric classifiers (Erbek et al. 2004; ERDAS 2009). ERDAS Imagine 2014 was used to perform the supervised classification. The classification scheme was developed in two stages: first, the LULC types were determined (the detailed LULC classification scheme is shown in Table 3); then, accuracy assessment was carried out against ground truth. LULC change was calculated with a change matrix technique in ArcGIS.
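
The decision rule of Eq. (1) can be illustrated with a minimal Python sketch. It is not the ERDAS Imagine implementation used in this study: it assumes multivariate normal class distributions, and the function names, the two-band training samples and the equal priors are all hypothetical. Because \(\Pr(x)\) is the same for every class, the sketch compares only the logarithm of \(\Pr(x|{\upomega}_{j}) \times \Pr({\upomega}_{j})\).

```python
import numpy as np

def fit_class_stats(samples_by_class):
    """Estimate the mean vector and covariance matrix of each class from training pixels."""
    stats = {}
    for label, pixels in samples_by_class.items():
        pixels = np.asarray(pixels, dtype=float)
        stats[label] = (pixels.mean(axis=0), np.cov(pixels, rowvar=False))
    return stats

def log_score(x, mean, cov, prior):
    """Log of Pr(x | class) * Pr(class), dropping terms shared by all classes."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (logdet + mahalanobis) + np.log(prior)

def classify(x, stats, priors):
    """Return the class label with the highest posterior probability."""
    x = np.asarray(x, dtype=float)
    scores = {label: log_score(x, m, c, priors[label]) for label, (m, c) in stats.items()}
    return max(scores, key=scores.get)

# Hypothetical two-band training samples (synthetic, for illustration only).
rng = np.random.default_rng(42)
training = {
    "water": rng.normal([0.05, 0.02], 0.01, size=(50, 2)),
    "built-up": rng.normal([0.20, 0.28], 0.02, size=(50, 2)),
}
stats = fit_class_stats(training)
priors = {"water": 0.5, "built-up": 0.5}  # assumed equal priors

print(classify([0.06, 0.03], stats, priors))
print(classify([0.21, 0.27], stats, priors))
```

Working with log scores avoids numerical underflow when many bands are used; the class covariance matrix is what lets the classifier account for within-class variance and covariance.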

Table 3 Various LULC classes and their descriptions

Accuracy assessment

Accuracy assessment was carried out for the LULC classes and the LULC change of the study area. It indicates the degree of difference between the classified images and the reference data. To determine the quality of the information extracted from the data, the classification accuracy of the 1991, 2001, 2011 and 2017 images was analyzed. The overall accuracy (OA), user’s accuracy (UA), producer’s accuracy (PA) and Kappa coefficient were derived from the error matrix to assess the classification accuracies (Congalton and Green 2009):

$$OA\;=\;\frac{{Number~of~true~positive+Number~of~true~negative}}{{~Pixels~~in~the~ground~truth}}\; \times \;100\% .$$
(2)

UA measures the commission error and represents the probability that a classified pixel truly represents that category on the ground, whereas PA measures the omission error and represents how well the reference pixels of a category were classified (Zhou et al. 1998). The UA and PA are derived as:

$$UA=\frac{{Row~element{s_{diagonal}}}}{{Ro{w_{total}}}},$$
(3)
$$PA=\frac{{Column~element{s_{diagonal}}}}{{Colum{n_{total}}}}.$$
(4)

The Kappa coefficient measures the agreement between two categorical classifications and is widely used for LULC accuracy assessment (Foody 1992). The Kappa coefficient \((K)\) is calculated as:

$$K=\frac{n\sum\nolimits_{k=1}^{q} n_{kk} - \sum\nolimits_{k=1}^{q} n_{k+}\,n_{+k}}{n^{2} - \sum\nolimits_{k=1}^{q} n_{k+}\,n_{+k}},$$
(5)

where \(n\) is the total number of pixels in the reference data, \(n_{kk}\) is the number of correctly classified pixels of class k (the diagonal elements of the error matrix), \(n_{k+}\) is the total number of pixels of class k derived from the classified data, \(n_{+k}\) is the total number of pixels of class k derived from the reference data, and \(q\) is the total number of classes. A value of K above 0.85 is considered to indicate excellent agreement (Monserud and Leemans 1992).
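
As an illustration of Eqs. (2)–(5), the following Python sketch derives the four statistics from an error matrix. The function name and the 2 × 2 matrix are hypothetical and are not taken from Table 4; rows are assumed to hold the classified data and columns the reference data, and OA is taken as the sum of the diagonal elements divided by the total number of pixels.

```python
import numpy as np

def accuracy_stats(error_matrix):
    """Return OA, UA per class, PA per class and Kappa from an error matrix.

    Rows are assumed to be classified data, columns reference data.
    """
    cm = np.asarray(error_matrix, dtype=float)
    n = cm.sum()                 # total number of pixels
    diagonal = np.diag(cm)       # correctly classified pixels, n_kk
    row_total = cm.sum(axis=1)   # classified totals, n_k+
    col_total = cm.sum(axis=0)   # reference totals, n_+k
    oa = diagonal.sum() / n
    ua = diagonal / row_total
    pa = diagonal / col_total
    chance = (row_total * col_total).sum()
    kappa = (n * diagonal.sum() - chance) / (n ** 2 - chance)
    return oa, ua, pa, kappa

# Hypothetical error matrix for two classes (illustrative values only).
example = [[45, 5],
           [10, 40]]
oa, ua, pa, kappa = accuracy_stats(example)
print(oa, ua, pa, kappa)
```

For this hypothetical matrix OA is 0.85 and Kappa is 0.70, which shows how Kappa discounts the agreement expected by chance.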

Extraction of biophysical components

Normalized difference built-up index (NDBI)

The NDBI is an important index for extracting built-up area, i.e. impervious areas (Chen et al. 2006). It extracts built-up area from remote sensing data efficiently using the reflectance of the Middle Infrared (MIR) and Near Infrared (NIR) bands (Zha et al. 2003). It is a common and useful technique for the identification of impervious surface (Zha et al. 2003; Zhang et al. 2009). The generalized equation of NDBI is

$$NDBI=\frac{{MIR - NIR}}{{MIR+NIR}},$$
(6)

where MIR is the middle infrared band (band 5 for TM and band 6 for OLI-8) and NIR is the near-infrared band (band 4 for TM and band 5 for OLI-8). The value of NDBI ranges from −1 to +1: values close to 0 represent vegetation cover, negative values represent water bodies and positive values indicate built-up area (Zha et al. 2003).

Normalized difference vegetation index (NDVI)

An increase of built-up area can be readily inferred from the loss of vegetation cover, and NDVI is a useful technique for determining the greenness of an area (Sharma et al. 2013). It is a significant variable for the analysis of urban growth and urban micro-climatic phenomena (Chen et al. 2006). Townshend and Justice (1986) proposed a method of NDVI extraction using the reflectance of the NIR and red (R) bands. It is expressed as:

$$NDVI=\frac{{NIR - R}}{{NIR+R}},$$
(7)

where NIR represents band 4 for Landsat TM and band 5 for OLI-8, and R represents band 3 for TM and band 4 for OLI-8. NDVI values range from −1 to +1. Large NDVI values indicate vegetation cover, small positive values indicate built-up area or bare land, and negative values or values close to 0 indicate water bodies (Zhang et al. 2009).

Normalized difference water index (NDWI)

NDWI is used to assess the liquid water present in vegetation, as NDWI is directly proportional to vegetation water content (Chen et al. 2006). The equation of NDWI was proposed by McFeeters (1996) using the reflectance of the Green (G) and NIR bands, and it can be expressed as

$$NDWI=\frac{{G - NIR}}{{G+NIR}},$$
(8)

where G represents band 2 and NIR represents band 4 for TM, and G represents band 3 and NIR represents band 5 for OLI-8. The value of NDWI ranges from −1 to +1. NDWI has an advantage over NDVI because it is less sensitive to atmospheric scattering (Gábor and Jombach 2010).

Modified normalized difference water index (MNDWI)

MNDWI is useful for delineating water bodies while suppressing the noise from built-up area and vegetation (Xu 2006). Here this method is used for clear identification of urban water bodies and their changing patterns. MNDWI can be expressed as

$${\text{MNDWI}}=\frac{{G - MIR}}{{G+MIR}}.$$
(9)
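
Equations (6)–(9) share the same normalized-difference form, so they can be sketched with a single helper. The Python example below is illustrative only: the helper name and the three reflectance samples are hypothetical, and the band assignment follows the TM mapping given in the text (green = band 2, red = band 3, NIR = band 4, MIR = band 5).

```python
import numpy as np

def normalized_difference(a, b):
    """Compute (a - b) / (a + b); pixels where a + b == 0 are returned as NaN."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total != 0)
    return out

# Hypothetical reflectance of three pixels: vegetation, water, built-up.
green = np.array([0.08, 0.06, 0.12])  # TM band 2
red = np.array([0.05, 0.04, 0.14])    # TM band 3
nir = np.array([0.45, 0.02, 0.20])    # TM band 4
mir = np.array([0.20, 0.01, 0.28])    # TM band 5

ndbi = normalized_difference(mir, nir)     # Eq. (6)
ndvi = normalized_difference(nir, red)     # Eq. (7)
ndwi = normalized_difference(green, nir)   # Eq. (8)
mndwi = normalized_difference(green, mir)  # Eq. (9)
print(ndbi, ndvi, ndwi, mndwi)
```

In this made-up sample the vegetation pixel has the highest NDVI, the built-up pixel is the only one with a positive NDBI, and the water pixel has positive NDWI and MNDWI, in line with the interpretation of the indices given above.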

Normalized difference bareness index (NDBaI)

The dynamics of NDBaI are considered an important biophysical indicator of urban expansion. Chen et al. (2006) proposed the concept of NDBaI and extracted bare land using the following equation:

$$NDBaI=\frac{{SWIR~1~ - TIR}}{{SWIR~1+TIR}},$$
(10)

where SWIR 1 (shortwave infrared) represents band 5 for TM and band 6 for OLI-8, and TIR (thermal infrared) represents band 6 for TM and bands 10 and 11 for OLI-8. The index provides good contrast between bare land and vegetated or moist surfaces.

Soil adjusted vegetation index (SAVI)

SAVI was proposed by Huete (1988) to extract vegetation cover without the noise of the soil background. Liu et al. (2018) used SAVI as a significant indicator for land use mapping. SAVI is calculated as

$$SAVI=\frac{{(NIR - R)~(1+L)}}{{(NIR+R+L)}},$$
(11)

where L is a constant soil-brightness correction factor that is added to the denominator of the NDVI formula; with L = 0, SAVI reduces to NDVI.
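
A minimal Python sketch of Eq. (11) is given below. The value L = 0.5 is an assumption made for illustration (it is a commonly used default, but the value applied in this study is not stated), and the function name and reflectance values are hypothetical.

```python
import numpy as np

def savi(nir, red, L=0.5):
    """Soil adjusted vegetation index; L = 0.5 is an assumed default, not a value from the study."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) * (1.0 + L) / (nir + red + L)

# Hypothetical vegetation pixel: NIR = 0.45, red = 0.05.
savi_default = savi(0.45, 0.05)       # with the assumed L = 0.5
savi_no_soil = savi(0.45, 0.05, L=0)  # L = 0 gives the NDVI of the same pixel
print(savi_default, savi_no_soil)
```

The second call shows the role of L: with no soil adjustment the expression collapses to the NDVI of the same pixel.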

Extraction of LST from thermal band

The effects of the urban growth can be reflected through the change of the surface temperature. We used Landsat imageries after necessary correction for the determination of the LST. The step-wise procedures of LST retrieval are given below:

  1. Conversion of digital number (DN) to spectral radiance (Lλ)

    Any object with a temperature above absolute zero (0 K) emits thermal electromagnetic energy. Using this principle, the signal recorded by the thermal sensor is converted to at-sensor spectral radiance. The spectral radiance (Lλ) is given by (Markham and Barker 1985; Avdan and Jovanovska 2016; USGS 2016)

    $$L{{{\uplambda}}}={{\text{L}}_{{{\text{min}{\uplambda}}}}}+{\text{~}}\left[ {\frac{{({{\text{L}}_{{{\text{max}{\uplambda}}}}} - {\text{~}}{{\text{L}}_{{{\text{min}{\uplambda}}}}})}}{{({{\text{Q}}_{{\text{CAL~max}}}}{\text{~~}} - {\text{~}}{{\text{Q}}_{{\text{CALmin}} {\text{~}}}})}}{\text{~}} \times {\text{~}}{{\text{Q}}_{{\text{CAL}}}}} \right],$$
    (12)

    where Lλ is the sensor-derived spectral radiance (W m−2 sr−1 µm−1), and \({\text{L}}_{\text{max}{\uplambda }}\) and \({\text{L}}_{\text{min}{\uplambda }}\) are the maximum and minimum spectral radiances for band 6 respectively. \({\text{Q}}_{\text{CAL}}\) is the digital number (DN) of each pixel. \({\text{Q}}_{\text{CALmin}}\) is the minimum DN value of the image, here \({\text{Q}}_{\text{CALmin}}\) = 0; and \({\text{Q}}_{\text{CALmax} }\) is the maximum DN value of the image, here \({\text{Q}}_{\text{CALmax}}=255\).

  2. Conversion of spectral radiance (Lλ) to brightness temperature (Tβ)

    It is necessary to convert spectral radiance (Lλ) to brightness temperature (Tβ) before correcting for emissivity, which varies with land use. According to Nichol (1994), an emissivity of about 0.95 is assigned to vegetated areas and 0.92 to non-vegetated areas. The brightness temperature can be computed using the formula of Artis and Carnahan (1982):

    $${{\text{T}{\upbeta}}}=\frac{{{{\text{K}}_2}}}{{\ln \left( {\frac{{{{\text{K}}_1}}}{{{{\text{L}}_{{{\uplambda}}}}}}+1} \right)}} - 273.15,$$
    (13)

    where Tβ is the brightness temperature (°C), Lλ is the spectral radiance at the sensor (W m−2 sr−1 µm−1), and K1 and K2 are the calibration constants (K1 = 60.776 mW cm−2 sr−1 µm−1 and K2 = 1260.56 K for the Landsat TM thermal band). The term 273.15 is subtracted to convert the temperature from kelvin to degrees Celsius (Xu and Chen 2004).

  3. Emissivity correction through NDVI method

    Correction for spectral emissivity (\(\epsilon\)) is necessary for retrieving the temperature. The emissivity can be obtained either from the land use land cover classes or by computing an emissivity value from the NDVI of individual pixels. NDVI is needed because the proportion of vegetation (\(Pv\)) is calculated from it. NDVI is calculated using Eq. (7), and \(Pv\) is calculated as

    $$Pv={\left( {\frac{{NDVI - NDV{I_{soil}}}}{{NDV{I_{veg}} - NDV{I_{soil}}}}} \right)^2},$$
    (14)

    where \({NDVI}_{soil}\) and \({NDVI}_{veg}\) are the threshold values for soil pixels and vegetation pixels. The threshold value of \({NDVI}_{soil}\) is 0.2 and that of \({NDVI}_{veg}\) is 0.7 (Sobrino et al. 2004).

  4. Land surface emissivity calculation (\({\upepsilon }\))

    The calculation of land surface emissivity (\(\epsilon\)) is important for estimating LST. Emissivity is the proportionality factor that scales the blackbody radiance given by Planck’s law to predict the emitted radiance (Jiménez-Muñoz et al. 2006). The emissivity (\(\epsilon\)) is calculated as

    $${\varepsilon _{\uplambda}}={\varepsilon _{{\text{veg}}.{\uplambda}}}{P_v}+{\varepsilon _{{\text{soil}}.{\uplambda}}}\left( {1 - {P_v}} \right)+C,$$
    (15)

    where \({\epsilon }_{\text{veg}}\) and \({\epsilon }_{\text{soil}}\) are the vegetation and soil emissivity respectively, and C represents the surface roughness, taken as a constant value of 0.005. The emissivity is assigned according to the NDVI of each pixel as follows:

    $$\varepsilon_{\uplambda} = \left\{ {\begin{array}{*{20}l} {\varepsilon _{{{\text{soil}} \cdot \uplambda }} } \hfill & {NDVI < NDVI_{{soil}} } \hfill \\ {\varepsilon _{{{\text{veg}} \cdot \uplambda }} P_{v} + \varepsilon _{{{\text{soil}} \cdot \uplambda }} \left( {1 - P_{v} } \right) + C} \hfill & {NDVI_{{soil}} \le NDVI \le NDVI_{{veg}} } \hfill \\ {\varepsilon _{{{\text{veg}} \cdot \uplambda }} + C} \hfill & {NDVI> NDVI_{{veg}} .} \hfill \\ \end{array} } \right.$$
    (16)

  5. Retrieval of LST

    Emissivity corrected LST is the last stage of the LST retrieval and it is computed as:

    $$LST=\frac{{T\beta }}{{\left[ {1+\left\{ {\left( {\lambda \cdot T\beta /\rho } \right)\ln \varepsilon \lambda } \right\}} \right]}},$$
    (17)

    where LST is the land surface temperature in °C, \(T\beta\) is the at-sensor brightness temperature (°C) derived from Eq. (13), \(\lambda\) is the wavelength of emitted radiance, expressed in metres in the calculation (\(\lambda =10.895 \mu m)\) (Markham and Barker 1985), \(\epsilon \lambda\) is the emissivity determined by Eq. (16) and,

    $$\rho =\frac{h\,C}{\sigma }=1.438 \times {10^{ - 2}}\ {\text{m K}},$$
    (18)

    where \(\sigma\) is the Boltzmann constant (\(1.38 \times 10^{-23}\) J K−1), \(h\) is Planck’s constant (\(6.626 \times 10^{-34}\) J s) and \(C\) is the velocity of light (\(2.998 \times 10^{8}\) m s−1).
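
The five steps above can be chained in a short Python sketch. It is a hedged illustration rather than the processing chain actually run in ERDAS Imagine. Several values are assumptions: the radiance limits LMIN and LMAX are typical TM band 6 values and are not reported in the text, the emissivity values of 0.95 and 0.92 are the Nichol (1994) values quoted in step 2, and all function names are hypothetical. The sketch also assumes that the vegetation proportion is normalized by the difference NDVI_veg − NDVI_soil, that fully vegetated pixels take the vegetation emissivity plus C, and that the brightness temperature is kept in kelvin inside the LST expression and converted to °C only at the end.

```python
import numpy as np

# Constants from the text unless marked as assumed.
K1 = 607.76                 # W m-2 sr-1 um-1 (= 60.776 mW cm-2 sr-1 um-1)
K2 = 1260.56                # K
LMIN, LMAX = 1.238, 15.303  # assumed TM band 6 radiance limits, W m-2 sr-1 um-1
QCALMIN, QCALMAX = 0.0, 255.0
NDVI_SOIL, NDVI_VEG = 0.2, 0.7
EPS_SOIL, EPS_VEG = 0.92, 0.95  # assumed, after the Nichol (1994) values quoted above
C = 0.005                   # surface roughness term
WAVELENGTH = 10.895e-6      # m
RHO = 1.438e-2              # m K

def dn_to_radiance(dn):
    """Step 1: DN to spectral radiance, in the form of Eq. (12)."""
    dn = np.asarray(dn, dtype=float)
    return LMIN + (LMAX - LMIN) / (QCALMAX - QCALMIN) * dn

def radiance_to_bt_kelvin(radiance):
    """Step 2: spectral radiance to brightness temperature, kept in kelvin."""
    return K2 / np.log(K1 / radiance + 1.0)

def vegetation_proportion(ndvi):
    """Step 3: proportion of vegetation from NDVI."""
    return ((ndvi - NDVI_SOIL) / (NDVI_VEG - NDVI_SOIL)) ** 2

def emissivity(ndvi):
    """Step 4: NDVI-threshold emissivity (soil, mixed or fully vegetated pixel)."""
    ndvi = np.asarray(ndvi, dtype=float)
    pv = vegetation_proportion(ndvi)
    mixed = EPS_VEG * pv + EPS_SOIL * (1.0 - pv) + C
    return np.where(ndvi < NDVI_SOIL, EPS_SOIL,
                    np.where(ndvi > NDVI_VEG, EPS_VEG + C, mixed))

def lst_celsius(dn, ndvi):
    """Step 5: emissivity-corrected LST, converted to degrees Celsius at the end."""
    tb = radiance_to_bt_kelvin(dn_to_radiance(dn))
    eps = emissivity(ndvi)
    return tb / (1.0 + (WAVELENGTH * tb / RHO) * np.log(eps)) - 273.15

# Hypothetical pixel: thermal DN = 130, NDVI = 0.45.
bt_c = float(radiance_to_bt_kelvin(dn_to_radiance(130))) - 273.15
lst_c = float(lst_celsius(130, 0.45))
print(bt_c, lst_c)
```

Because the emissivity is below 1, its logarithm is negative and the corrected LST comes out higher than the brightness temperature of the same pixel.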

Statistical techniques

To examine the relationship between urban biophysical components and LST, we used Pearson’s correlation coefficient, computed in the R statistical programming environment. A total of 350 valid sample points were selected by systematic random sampling from the altered and unaltered areas, and the values extracted at these points were used for the correlation statistics. LST prediction modelling is considered an important task for urban suitability analysis and sustainable urban planning. The regression model is considered the most useful technique for this purpose (Adam and Smith 2014; Shi et al. 2018) because it is a common and suitable technique for factor-based empirical down-scaling studies (Crawley 2007). In this study, we used a Multiple Linear Regression (MLR) model, in which a dependent variable (\({y}_{i}\)) is explained by a linear combination of two or more independent variables \(({x_{in}})\), expressed as:

$${y_i}={\beta _0}+{\beta _1}{x_{i1}}+{\beta _2}{x_{i2}}+ \ldots +{\beta _n}{x_{in}}+{\epsilon _i}.$$
(19)

In this model, \({\beta }_{0}\) is the intercept (constant) and \({\beta }_{1}\), \({\beta }_{2}\), ... \({\beta }_{n}\) are the regression slopes, or regression coefficients, estimated by the ordinary least squares method. \({\epsilon }_{i}\) is the error term, i.e. the residual of the model. The MLR models of the four selected years were validated with the Durbin–Watson statistic and a multicollinearity test.
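
The statistical workflow can be sketched in Python as follows. The study itself used R, so this is only an illustrative equivalent: the function names are hypothetical and the 350 sample values are synthetic numbers generated for the example, not data from KMA.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson's correlation coefficient between two samples."""
    return np.corrcoef(a, b)[0, 1]

def fit_mlr(X, y):
    """Ordinary least squares fit of Eq. (19); returns coefficients and residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # first column is the intercept
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return beta, residuals

def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    residuals = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)

# Synthetic sample of 350 points (illustrative only, not study data).
rng = np.random.default_rng(0)
ndbi = rng.uniform(-0.5, 0.5, 350)
ndvi = rng.uniform(-0.5, 0.5, 350)
lst = 30.0 + 8.0 * ndbi - 6.0 * ndvi + rng.normal(0.0, 0.1, 350)

beta, residuals = fit_mlr(np.column_stack([ndbi, ndvi]), lst)
dw = durbin_watson(residuals)
print(pearson_r(ndbi, lst), beta, dw)
```

In this synthetic case the fitted coefficients recover the values used to generate the data, and the Durbin–Watson statistic lies between 0 and 4 by construction.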

Hotspot analysis using Getis-Ord Gi* statistics

A thermal index is commonly applied to LST to show the Urban Heat Island (UHI) effect and the impact of changes in urban biophysical composition (Mathew et al. 2017). However, hot spot analysis is more useful for visually assessing high concentrations of LST (Adeyeri et al. 2017; Tran et al. 2017). Hot spot maps of the study area were produced for the different periods using the mean values of LST. The hot spot analysis tool (Getis-Ord Gi*) in ArcGIS was used for this purpose. In general, a cluster of high values is considered a hot spot, but it must be statistically significant (Getis and Ord 1992). Getis-Ord Gi* is calculated as:

$$G_{i}^{*}=\frac{{\sum\nolimits_{{j=1}}^{n} {\varphi _{i,j}}\,{x_j} - \bar {X}\sum\nolimits_{{j=1}}^{n} {\varphi _{i,j}}}}{{S\sqrt {\frac{{n\sum\nolimits_{{j=1}}^{n} \varphi _{{i,j}}^{2} - {{\left( {\sum\nolimits_{{j=1}}^{n} {\varphi _{i,j}}} \right)}^2}}}{{n - 1}}} }},$$
(20)

where \({x}_{j}\) is the attribute value for feature j, \({\varphi }_{i,j}\) is the spatial weight between features i and j, and n is the total number of features. The mean \(\bar{X}\) and the standard deviation S are given by

$$\bar {X}=\frac{{\sum\nolimits_{{j=1}}^{n} {x_j}}}{n},$$
(21)

and,

$$S=\sqrt {\frac{{\sum\nolimits_{{j=1}}^{n} x_{j}^{2}}}{n} - {{(\bar {X})}^2}} .$$
(22)

The output of Getis-Ord Gi* is a z-score: the largest positive z-scores indicate intense clustering of high values, i.e. hot spots (Wulder and Boots 1998), and the largest negative z-scores indicate cold spots. In this study, we defined seven classes of hot and cold spots according to the significance of their z-scores: > 99% significance indicates a high hot spot (z-score ≥ 2.58), 95–99% significance indicates a hot spot (z-score = 1.96 to 2.58), 90–95% significance indicates a warm spot (z-score = 1.65 to 1.96), z-scores between −1.65 and 1.65 are not significant, 90–95% significance indicates a cool spot (z-score = −1.65 to −1.96), 95–99% significance indicates a cold spot (z-score = −1.96 to −2.58) and > 99% significance indicates a highly cold spot (z-score ≤ −2.58) (Tran et al. 2017; Ranagalage et al. 2018a, b). The methodology is summarized in Fig. 3.
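
The Gi* statistic of Eqs. (20)–(22) and the seven-class labelling can be sketched in Python as below. This is not the ArcGIS tool used in the study: the function names, the ten-cell LST transect and the binary neighbourhood weights (each cell, its own value included, plus its immediate neighbours) are assumptions chosen to keep the example small.

```python
import numpy as np

def getis_ord_gi_star(x, weights):
    """Gi* z-score of every feature, given attribute values and a spatial weight matrix."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt(np.mean(x ** 2) - mean ** 2)
    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

def classify_z(z):
    """Map a z-score to the seven hot and cold spot classes described in the text."""
    if z >= 2.58:
        return "high hot spot"
    if z >= 1.96:
        return "hot spot"
    if z >= 1.65:
        return "warm spot"
    if z > -1.65:
        return "not significant"
    if z > -1.96:
        return "cool spot"
    if z > -2.58:
        return "cold spot"
    return "highly cold spot"

# Hypothetical LST values (deg C) along a ten-cell transect.
lst = np.array([30, 30, 30, 30, 30, 30, 30, 38, 39, 38], dtype=float)
idx = np.arange(lst.size)
# Assumed binary weights: a cell and its immediate neighbours get weight 1.
weights = (np.abs(idx[:, None] - idx[None, :]) <= 1).astype(float)

z = getis_ord_gi_star(lst, weights)
labels = [classify_z(v) for v in z]
print(np.round(z, 2), labels)
```

In this made-up transect, the cell at the centre of the warm cluster is labelled a high hot spot, while the uniform cells far from it are not significant.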

Fig. 3 Methodological workflow of this study

Results and discussion

Analysis of land uses land cover (LULC) change

Supervised land classification with maximum likelihood (ML) classification algorithm was employed to classify LULC and change pattern as it is more suitable and accurate in urban areas. First, we classify the land use types for Kolkata Metropolitan Area (KMA) into six major categories (a) highly dense built-up area, (b) moderately dense built-up area, (c) vegetation and plantation, (d) water bodies, (e) agricultural land and (f) Barren land. Wetland and river are considered underwater bodies and similarly open place and fallow lands are considered under barren land. Considering the aim of the research we have analyzed the LULC change maps for the 4 selected years, i.e. 1991, 2001, 2011 and 2017. It is seen that KMA is experiencing rapid change of its LULC dynamics in last 25 years. All the classified maps were validated using Kappa statistics, producers accuracy, users accuracy and overall accuracy. The overall accuracy for LULC classifications are 90.22%, 92.00%, 94.22% and 96.78% for the 1991, 2001, 2011 and 2017 respectively. Similarly, Kappa coefficients for the four classified LULC maps are 0.882 (in 1991), 0.904 (in 2001), 0.930 (in 2011) and 0.951 (in 2017) respectively, which indicates that the output LULC maps can be significantly used. The producer accuracy and user’s accuracy of each LULC categories are shown in Table 4. Thus the calculated parameters and change patterns can be used for further analysis. The patterns of LULC and their area are tabulated which offers a comprehensive database for analysis in spatio-temporal dimensions (Table 5). It is seen that the total built-up area has been continuously increasing between 1991 and 2017. The built-up area has been increased from 322.68 to 982.86 Km2 with a significant encroachment of open space, agricultural land and vegetation area. Thus the barren land, vegetation cover and agricultural land have significantly decreased from 1991 to 2017 are − 10.83%, − 16.08% and − 12.09% respectively. 
Similarly, water bodies, particularly wetland areas, also decreased (−1.34%). The patterns of land use in the different time periods are shown in Fig. 4.
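The accuracy measures named above (overall accuracy, Kappa coefficient, producer's and user's accuracy) are all derived from a confusion matrix of validation samples. The following is a minimal sketch of that calculation, not the authors' workflow. The function name `accuracy_metrics` and the 3-class matrix are hypothetical and do not reproduce the values in Table 4; rows are assumed to hold the reference classes and columns the classified classes.

```python
def accuracy_metrics(matrix):
    """Overall accuracy, Cohen's kappa, producer's and user's accuracy.

    Assumes matrix[i][j] counts validation samples whose reference class
    is i and whose classified (mapped) class is j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    correct = sum(matrix[k][k] for k in range(n_classes))

    overall = correct / total
    # Agreement expected by chance, from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    producers = [matrix[k][k] / row_totals[k] for k in range(n_classes)]
    users = [matrix[k][k] / col_totals[k] for k in range(n_classes)]
    return overall, kappa, producers, users


# Hypothetical confusion matrix (illustrative only, not data from the study)
example = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 2, 47],
]
overall, kappa, producers, users = accuracy_metrics(example)
print(f"overall accuracy = {overall:.3f}, kappa = {kappa:.3f}")
```

Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance from the class totals alone.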

Table 4 Accuracy assessment table for the classified LULC maps of 1991, 2001, 2011 and 2017
Table 5 Area under various LULC classes and changes of area from 1991 to 2017 in KMA
Fig. 4 Land use land cover maps of KMA showing the patterns of land use dynamics in four different time periods, i.e. 1991, 2001, 2011 and 2017

The results reveal a drastic increase in the built-up area, which grew continuously across the time nodes (i.e. 1991, 2001, 2011 and 2017). Population growth and various developmental activities have driven the expansion of the urban area not only in the city core but also in the periphery of KMA (Sahana et al. 2018). Consequently, vegetation cover, agricultural land and barren land have continuously decreased.
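The scale of the built-up expansion reported above can be expressed as a net change, a growth factor and an average yearly increment. The short sketch below does this arithmetic with the two reported areas (322.68 km² in 1991 and 982.86 km² in 2017). The helper name `area_change` is hypothetical, and a constant yearly increment is assumed purely for illustration.

```python
def area_change(start_km2, end_km2, start_year, end_year):
    """Net change, growth factor, percent change and mean yearly change."""
    net = end_km2 - start_km2
    growth_factor = end_km2 / start_km2
    percent = 100 * net / start_km2
    per_year = net / (end_year - start_year)  # assumes linear growth
    return net, growth_factor, percent, per_year


net, factor, percent, per_year = area_change(322.68, 982.86, 1991, 2017)
print(f"net change: {net:.2f} km2")
print(f"growth factor: {factor:.2f}")
print(f"percent change: {percent:.1f}%")
print(f"mean yearly change: {per_year:.1f} km2/yr")
```

With these inputs the built-up area roughly triples, a net gain of about 660 km², or on the order of 25 km² per year.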

Dynamics of biophysical indices

Six major biophysical parameters were used to analyze urban dynamics and the spatio-temporal urbanization footprint: NDBI, NDVI, MNDWI, NDBaI, SAVI and NDWI, which are considered the major components of the urban landscape. NDBI, which designates the built-up area, gradually increases over the time periods. In 1991, NDBI ranged from −0.07 to 0.64 with a mean of −0.042 (SD = 0.137). The mean NDBI increases with rapid urbanization; the reported mean values are 0.017 (SD = 0.159), 0.031 (SD = 0.183) and −0.104 (SD = 0.098) for 2001, 2011 and 2017 respectively, and the NDBI range in 2017 is 0–0.53. The built-up area has expanded continuously in the northern, south-eastern and western directions (Fig. 5). Consequently, NDVI and SAVI have gradually decreased in response to built-up expansion. The decreasing pattern of NDVI is evident in the NDVI maps: the mean NDVI values are 0.121 (SD = 0.148), 0.098 (SD = 0.124), 0.083 (SD = 0.077) and 0.075 (SD = 0.125) for 1991, 2001, 2011 and 2017 respectively. Mean SAVI, applied here as a more accurate method of vegetation estimation (Shifaw et al. 2018), also shows a decreasing trend, with values of 0.152 (SD = 0.185), 0.142 (SD = 0.178), 0.129 (SD = 0.224) and 0.117 (SD = 0.105) respectively. The vegetation cover of KMA is therefore continuously decreasing. MNDWI, NDWI and NDBaI follow a similar trend, which indicates excessive urban expansion accompanied by the loss of biophysical components. The mean MNDWI changed from −0.126 in 1991 to −0.020 in 2017 (SD = 0.217). The bareness index (NDBaI) also changed continuously, from −0.534 (SD = 0.008) in 1991 to −0.264 (SD = 0.081) in 2017, due to the encroachment of built-up area, which is validated by the LULC maps.
Overall, the changes in these biophysical indices demonstrate the rapid urbanization of KMA within a short time period. To examine the dynamics of the various biophysical components, we computed correlation statistics using 350 sample points (discussed in Sect. 3.7). The analysis shows that NDBI is negatively correlated with NDVI, NDWI, MNDWI and SAVI. The negative correlation between NDVI and NDBI strengthened from −0.02 in 1991 (not significant) to −0.47 in 2017 (significant at p < 0.05), which indicates the impact of built-up expansion. SAVI, NDWI and MNDWI show a similarly strengthening negative relationship with NDBI (Fig. 5). Hence, this analysis indicates that water bodies, open land and vegetation cover are gradually being transformed into built-up areas, and that environmental components are highly responsive to urbanization.
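Whether a correlation coefficient is significant depends on both its size and the number of sample points. As a hedged illustration of the significance statements above, the sketch below converts the two reported NDVI–NDBI coefficients into t statistics for n = 350 sample points. The critical value of about 1.967 (two-sided 5% level, 348 degrees of freedom) is taken from standard t tables and is an assumption of this sketch, not a figure from the study.

```python
import math


def t_statistic(r, n):
    """t statistic for testing zero correlation, given Pearson r and n."""
    return r * math.sqrt((n - 2) / (1 - r * r))


N_POINTS = 350      # sample points reported in the text
T_CRITICAL = 1.967  # assumed two-sided 5% critical value for 348 d.f.

for year, r in [(1991, -0.02), (2017, -0.47)]:
    t = t_statistic(r, N_POINTS)
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{year}: r = {r:+.2f}, t = {t:.2f} ({verdict} at p < 0.05)")
```

The 1991 coefficient gives a t statistic of roughly −0.37, well inside the critical bounds, whereas the 2017 coefficient gives roughly −9.9, which agrees with the significance reported in the text.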

Fig. 5 Composition of urban biophysical elements and their spatio-temporal dynamics over the KMA
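The indices mapped in Fig. 5 are band ratios computed pixel by pixel. The sketch below shows commonly used formulations of four of them (NDVI, NDBI, MNDWI and SAVI with a soil adjustment factor of 0.5). These formulas, the function names and the reflectance values are assumptions for illustration only, since the study's own band definitions are given in its methods section. NDWI and NDBaI are omitted because their band combinations vary between authors.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir1, soil_factor=0.5):
    """Common index formulations from surface reflectance of one pixel."""
    return {
        "NDVI": normalized_difference(nir, red),      # vegetation
        "NDBI": normalized_difference(swir1, nir),    # built-up
        "MNDWI": normalized_difference(green, swir1), # open water
        "SAVI": (nir - red) * (1 + soil_factor) / (nir + red + soil_factor),
    }


# Hypothetical reflectances for a vegetated and a built-up pixel
veg = spectral_indices(green=0.08, red=0.05, nir=0.45, swir1=0.20)
built = spectral_indices(green=0.12, red=0.14, nir=0.20, swir1=0.28)
for name in veg:
    print(f"{name}: vegetation = {veg[name]:+.3f}, built-up = {built[name]:+.3f}")
```

In this toy example the vegetated pixel has high NDVI and SAVI and negative NDBI, while the built-up pixel shows the opposite pattern, which is the contrast the correlation analysis relies on.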

The relationship between urban biophysical composition and LST

The rapid change of the urban environment due to urbanization affects the urban landscape. Land surface temperature (LST) gradually increases with the growth of urban impervious surface (UIA), i.e. built-up area, a phenomenon known as the urban heat island (UHI) effect (Owen et al. 1998). Identifying the change in LST is one of the major objectives of this research. To assess it, we computed Pearson's correlation statistics between LST and the selected biophysical components. It is evident from Fig. 5 that NDVI, MNDWI, SAVI, NDWI and NDBaI are more dynamic over the built-up area. LST is strongly controlled by surface moisture content, vegetation cover, surface water bodies and the amount of impervious surface (Guo et al. 2016), and accordingly varies across the study area. The mean LST values are 18.47 °C (SD = 1.495), 18.39 °C (SD = 1.370), 21.04 °C (SD = 1.719) and 23.30 °C (SD = 0.093) in 1991, 2001, 2011 and 2017 respectively, so mean LST has risen overall during the last 25 years. Note that we measured only the change in LST between time periods and therefore considered only the winter season (January), because the intensity of the UHI effect can be clearly depicted in this season (Sharma et al. 2015). The maximum LST has also gradually increased, i.e. 25.83 °C, 26.68 °C, 28.35 °C and 29.78 °C in 1991, 2001, 2011 and 2017 respectively (Fig. 6). Among the six biophysical parameters, NDVI, SAVI, NDWI and MNDWI are negatively correlated with LST, since vegetation cover and water bodies can reduce the surface heat of a landscape (Deng and Wu 2013; Li et al. 2016a, b; Shiflett et al. 2017). However, the strength of these correlations gradually decreased from 1991 to 2017. The correlation (R) between NDVI and LST was −0.56 and −0.51 in 1991 and 2001 respectively (Fig. 7), which is significant at the 0.01 level and similar to the results of Yue et al. (2007), Zhu et al. (2013) and Guo et al. (2015). The correlation between NDVI and LST then weakened from −0.51 in 2001 to −0.38 in 2017, owing to a considerable decrease in vegetation cover under unremitting urban expansion. Similarly, the correlation between SAVI and LST decreased significantly from 1991 to 2017. The correlations between MNDWI and LST are −0.45, −0.44, −0.40 and −0.41 in 1991, 2001, 2011 and 2017 respectively, which indicates a decrease in water bodies and an increase in LST (Fig. 7).
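Pearson's correlation coefficient, used above to relate LST to each index, can be computed directly from paired sample-point values. The sketch below is a plain-Python illustration. The function name `pearson_r` and the six NDVI and LST values are hypothetical and are not drawn from the study's 350 sample points.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample points: greener pixels tend to be cooler
ndvi = [0.05, 0.10, 0.15, 0.22, 0.30, 0.41]
lst = [24.8, 23.9, 23.1, 22.0, 21.4, 19.9]
r = pearson_r(ndvi, lst)
print(f"r(NDVI, LST) = {r:.2f}")
```

A coefficient near −1 indicates a strong inverse relationship. The values reported in the text (for example −0.56 to −0.38 for NDVI) are weaker because real sample points include mixed land covers.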

Fig. 6 LST maps of KMA in different time periods. The mean LST values of 18.47 °C, 18.39 °C, 21.04 °C and 23.30 °C in 1991, 2001, 2011 and 2017 respectively show an overall rising trend

Fig. 7 Correlation coefficient graphs showing the relationship between urban biophysical parameters and LST in different time periods (***significance level: p < 0.01)

We introduced NDBI and NDBaI to demarcate the built-up area and bare land accurately, because the spectral reflectances of built-up area and bare land are difficult to separate. NDBaI is positively related to LST, with correlation coefficients (R) of 0.49, 0.50, 0.47 and 0.46 in 1991, 2001, 2011 and 2017 respectively. Earlier studies report a significantly positive relationship between NDBI and LST, with coefficients in the range 0.60–0.95 (Chen et al. 2006; Sannigrahi et al. 2017). In this study, the correlation between NDBI and LST ranges from 0.53 to 0.71 (p ≤ 0.01) and increases gradually, i.e. 0.53 in 1991, 0.58 in 2001, 0.67 in 2011 and 0.71 in 2017 (Fig. 7). Accordingly, LST varies significantly with urban biophysical composition. The cross-sectional profiles of LST also show the changes in LST across different land uses over the four selected periods (Fig. 8). The core area shows higher LST than the surrounding rural area due to the UHI effect (Ranagalage et al. 2018a, b).

Fig. 8 LST profiles of three selected rural–urban gradients, characterized by different LULC, in four selected time periods. The markings a–e indicate the various LULC classes and their respective LST. Comparison of the LST graphs indicates a continuous rise in LST

Multiple linear regression (MLR) model for LST prediction

LST varies with LULC and its spatio-temporal dynamics, and the urban biophysical parameters can be treated as predictor variables of LST. LST is positively correlated with built-up area and bare land and negatively correlated with water bodies and vegetation. It is therefore important to study all the predictors that control LST together rather than a single variable. A multiple linear regression (MLR) model was used to determine the relationship between the dependent variable and all independent variables, and their role in controlling LST (Connors et al. 2013). MLR can predict LST and its changing pattern from the selected control variables, i.e. NDBI, NDBaI, NDWI, MNDWI, NDVI and SAVI. MLR was performed for the four selected time periods to characterize the variability of LST with changing biophysical composition. Table 6 shows the results. The table shows that the coefficient of NDBI is consistently an important factor in predicting LST in all years. For the 1991 model, the correlation coefficient is R = 0.715, the coefficient of determination is R² = 0.512 (51.20%) and the predicted R² is 51.75%. Similarly, for 2001, R = 0.751, R² = 0.564 (56.40%) and the predicted R² is 54.89%. The MLR models for these two time periods (1991 and 2001) therefore do not fit the data well, which indicates that the independent variables do not adequately describe the dependent variable. In these years the built-up area had not yet increased substantially, and vegetation and water bodies were still widespread; the variation of LST is therefore not well represented by the models. In contrast, the results for 2011 and 2017 indicate a stronger fit (R² > 60%).
For 2011, R = 0.791, R² = 0.626 (62.60%) and the predicted R² is 60.16%, which shows that the controlling variables significantly describe the pattern and distribution of LST (p < 0.001). Similarly, for 2017, R = 0.864, R² = 0.716 (71.60%) and the predicted R² is 69.93%, which indicates a good fit for the linear model: it explains 71.60% of the variance of the dependent variable (Table 6). Consequently, the rapidly increasing built-up area and the decreasing vegetation and water bodies can be identified as the causes of rising LST in the study area.
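The R² and predicted R² values quoted above come from fitting LST against several indices at once. The sketch below fits an ordinary least-squares model through the normal equations and computes both statistics, assuming predicted R² is defined as 1 − PRESS/SS_total with PRESS obtained from leave-one-out refits. All function names and the eight data points (two predictors standing in for NDBI and NDVI) are hypothetical and do not reproduce Table 6.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_ols(rows, y):
    """Coefficients [intercept, b1, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


def r_squared(rows, y):
    coeffs = fit_ols(rows, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((t - predict(coeffs, r)) ** 2 for r, t in zip(rows, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - ss_res / ss_tot


def predicted_r_squared(rows, y):
    """1 - PRESS / SS_total, with PRESS from leave-one-out refits."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        coeffs = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coeffs, rows[i])) ** 2
    return 1 - press / sum((t - mean_y) ** 2 for t in y)


# Hypothetical sample points: (NDBI, NDVI) predictors and LST in deg C
rows = [(0.30, 0.05), (0.22, 0.10), (0.15, 0.12), (0.10, 0.20),
        (0.02, 0.25), (-0.05, 0.33), (-0.10, 0.30), (-0.18, 0.42)]
lst = [25.1, 24.0, 23.6, 22.5, 22.2, 20.8, 21.0, 19.6]
r2 = r_squared(rows, lst)
pred_r2 = predicted_r_squared(rows, lst)
print(f"R2 = {r2:.3f}, predicted R2 = {pred_r2:.3f}")
```

Predicted R² cannot exceed R² under this definition, because each left-out residual is at least as large in magnitude as the corresponding ordinary residual. A large gap between the two would suggest that the model is over-fitted to the sample.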

Table 6 Multiple linear regression equations showing the R² and predicted R² values of the predictor variables of LST

Assessment of model fit

We performed several statistical tests to validate the applicability of the selected MLR models. A t-test for the selected years gave values of (ti) > 2.0 at the \(\alpha\) < 0.001 significance level, which signifies the importance of the selected variables in the model. The Durbin–Watson (D–W) values are 1.743, 1.795, 1.866 and 1.876 for 1991, 2001, 2011 and 2017 respectively. As the D–W values are close to 2, the residuals are independent. The linearity of each model was checked using plots of fitted values versus residuals, observation order versus residuals, and the frequency histogram of residuals (Fig. 9). The centered leverage values and Cook's distances are within an acceptable range, so the regression models are not affected by outliers. Variance inflation factors (VIF) and tolerance values were used to measure multicollinearity. The normal probability plots of the regression models indicate that the residuals (error terms) are normally distributed and follow a straight line, particularly for the models of 2011 and 2017 (Fig. 9). This signifies that the models fit well and can be accepted.
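Two of the diagnostics above have simple closed forms. The Durbin–Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals, and the VIF of a predictor is 1 / (1 − R²) from regressing that predictor on the others, with tolerance as its reciprocal. The sketch below illustrates both. The function names and residual values are hypothetical and unrelated to the study's models.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no autocorrelation."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def variance_inflation_factor(r_squared_j):
    """VIF for predictor j, given R^2 of j regressed on the other predictors."""
    return 1.0 / (1.0 - r_squared_j)


# Hypothetical residuals in observation order
residuals = [0.4, -0.3, 0.1, -0.2, 0.5, -0.4, 0.2, -0.3]
dw = durbin_watson(residuals)

# Hypothetical case: a predictor half explained by the other predictors
vif = variance_inflation_factor(0.5)
tolerance = 1.0 / vif
print(f"D-W = {dw:.2f}, VIF = {vif:.1f}, tolerance = {tolerance:.2f}")
```

The statistic ranges from 0 (strong positive autocorrelation) to 4 (strong negative autocorrelation), which is why the reported values of 1.74 to 1.88 are read as evidence of independent residuals.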

Fig. 9 Results of MLR showing normal probability plots, residuals versus order and residuals versus fits for four selected time periods: a 1991, b 2001, c 2011 and d 2017

Hotspot analysis of LST in KMA

Hot spot analysis was carried out to identify the areas affected by increasing LST. Hot spot maps were prepared for the four selected time periods, and each map is significant at p < 0.0001. The maps show that hot spots are concentrated in the built-up area, whereas cold spots dominate over vegetation and water bodies, indicating the strongest clustering of high and low LST pixels in those areas respectively. Vegetation and water bodies thus play a significant role in controlling LST. The hot spot map of 1991 clearly indicates that high LST values are concentrated over the core built-up areas, while cold spots dominate over the water bodies of the study area. The maps for the successive years show the conversion of cold spot areas into hot and very hot spot areas (Fig. 10). Accordingly, with the increase in built-up area, the hot spot areas have extended from the core towards the periphery. The hot spot analysis thus depicts the existing LST situation in KMA and the related vulnerability. Hot spot maps can therefore support the assessment of LST exposure as well as ecological assessment for sustainable urban growth.
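Hot spot maps of this kind are typically built from the Getis-Ord Gi* statistic, which compares the sum of values in a cell's neighbourhood with what the global mean would predict and expresses the difference as a z-score. The sketch below computes Gi* on a small grid. The binary 3 × 3 neighbourhood that includes the cell itself is an assumption of this sketch, as the spatial weights used in the study are not stated here, and the 5 × 5 LST grid is hypothetical.

```python
import math


def getis_ord_gi_star(grid):
    """Gi* z-scores for a 2-D grid with binary 3x3 weights (self included)."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)

    z = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            neighbours = [
                grid[a][b]
                for a in range(max(0, i - 1), min(n_rows, i + 2))
                for b in range(max(0, j - 1), min(n_cols, j + 2))
            ]
            # With binary weights, sum(w) and sum(w^2) both equal the count
            w_sum = len(neighbours)
            numerator = sum(neighbours) - mean * w_sum
            denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
            z[i][j] = numerator / denominator
    return z


# Hypothetical LST grid (deg C): warm cluster top-left, cool bottom-right
lst_grid = [
    [29, 29, 28, 22, 21],
    [29, 28, 27, 21, 20],
    [27, 26, 24, 20, 19],
    [22, 21, 20, 19, 18],
    [21, 20, 19, 18, 18],
]
z_scores = getis_ord_gi_star(lst_grid)
for row in z_scores:
    print(" ".join(f"{value:+.2f}" for value in row))
```

Cells with large positive z-scores (above about 1.96 at the 5% level) form hot spots and cells with large negative z-scores form cold spots, so in this toy grid the warm corner and the cool corner are flagged in opposite directions.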

Fig. 10 Hot spot and cold spot maps of KMA for four time periods, showing that the hot spot areas are gradually extending into cold spot areas

Conclusion

In this study, we assessed the spatio-temporal pattern of land use dynamics and the response of urban biophysical composition to it. We also assessed the relationship between urban biophysical components and the rising trend of LST. As a megacity and the 3rd largest urban agglomeration, KMA faces a continuous influx of population and a resultant continuous expansion of the built-up area. The LULC maps show that KMA has undergone rapid LULC change since 2001. The built-up area of KMA increased rapidly from 322.68 km² in 1991 to 982.86 km² in 2017. The urban biophysical parameters respond significantly to the change in LULC: vegetation cover, agricultural land and water bodies have decreased significantly. Consequently, the mean LST of the study area increased from 18.47 °C in 1991 to 23.30 °C in 2017.

This analysis is based on a comprehensive approach combining geospatial and statistical methods. Remote sensing and GIS-based analysis proved useful and effective for such a study. In addition to the biophysical indices already used by other scholars, we introduced two related parameters, i.e. SAVI and NDBaI. The methodological pathway was clear and met the objectives of the study. The statistical tools, combining correlation statistics, MLR and Getis-Ord Gi* statistics, served the targeted analysis well. The results show that, among the six biophysical parameters, NDBI is the most important controlling variable, as the correlation between NDBI and LST gradually increased from 1991 to 2017. MLR was adopted to predict LST from the selected biophysical factors. The MLR result for 2017 shows that LST is significantly predicted by the predictor variables (R² = 71.60%). Thus, it is established that LST is gradually increasing with the changes in biophysical components.

Continuous and suitable conservation policy actions should be taken to minimize the impacts of urbanization. In terms of policy action, KMA comes under the Kolkata Metropolitan Development Authority (KMDA), but the KMDA area comprises several legal institutions, such as KMC, HMC, CMC, the West Bengal Housing Infrastructure Development Corporation (WBHIDCO), various municipal offices and panchayat offices. Management plans should be prepared through collaboration among these authorities for sustainable urban growth. KMDA has already taken some initiatives, for example the Land Use and Development Control Plan, which includes demarcation of development control zones, housing regulation, conservation of green space etc. (KMDA 2016). The following management options can be summed up for the sustainable development of KMA:

(a) Proper land use planning and regulation should be implemented without destroying natural biodiversity.

(b) The conservation of open places and green spaces should be strictly maintained to reduce the impact of LST.

(c) Housing construction and building materials should be scientifically selected and well planned to reduce surface temperature.

(d) A proper zoning system should be established according to the adjoining ecosystems and their resilience capacity.

(e) Moreover, a comprehensive plan, rather than a township plan, should be prepared and implemented.