1 Introduction

All materials on the Earth’s surface with a temperature above −273.15 °C (0 K, absolute zero) emit radiation, after absorbing energy from the sun, because of the random movement of their particles [14]. The emitted energy passes through the atmosphere, is recorded by thermal infrared (TIR) sensors, and is converted into a digital number (DN) value [18]. Knowledge of surface emissivity is crucial for retrieving the land surface temperature (LST) from satellite imagery [37]. Emissivity is difficult to determine because of the heterogeneity of the ground and the spectral variation of the Earth’s surface materials [36]. Land surface emissivity (aside from the ocean) can vary substantially with vegetation [6], soil moisture [22], surface roughness [28], and viewing angle [57]. In the classification-based technique, land use/land cover (LU-LC) information is retrieved from the satellite image and an emissivity value is assigned to each LU-LC class [56]. In urban areas, estimating the emissivity value for each LU-LC class is more problematic because the Earth’s surface features overlap [13]. The classification-based approach was widely implemented in the past but is no longer considered suitable, because it requires thorough knowledge of the study area as well as of the vegetation cover of the captured scene.

The proportion of vegetation on the Earth’s surface has been calculated with different remote sensing–based vegetation indices (VI), such as the normalized difference vegetation index (NDVI), soil-adjusted vegetation index (SAVI), enhanced vegetation index (EVI), and fractional vegetation cover (fc) [7, 33, 59]. The red and near-infrared bands are the main input data for determining a VI [52]. The NDVI of a satellite image depends on multiple factors, such as elevation, sensor angle, and level of vegetation [25, 44]. SAVI has represented vegetation better than NDVI and EVI in sparsely vegetated areas such as desert oases and highly sandy areas [63]. The fc parameter has been reported to be an unbiased measure of the surface vegetation percentage and has shown global applicability across different remote sensing sensors [13]. The fc and SAVI were better correlated with LST than NDVI [31, 60]. NDVI has been widely used to predict LST values at higher resolution with numerous statistical techniques. The NDVI-based method, which uses the ratios of vegetation and barren land, has been the most prominent way to calculate emissivity [1]. Researchers have also found that NDVI and LST are related to the overall reflectance pattern of the Earth’s surface [26].

The LST-NDVI relationship has been tested in diverse environments, such as high mountains, forests, deserts, and different ecosystems, over various LU-LC classes [23, 30, 49, 50, 55]. The LST-NDVI relationship was measurable for each LU-LC class, excluding water bodies such as the ocean [30]. An increase in NDVI has been associated with a decrease in LST over the selected satellite image [17]. In contrast, a positive relationship between LST and NDVI was found in the Arctic tundra and in high-altitude regions, because vegetation there is warmer than ice [17]. The pioneering LST downscaling research, conducted over corn and soybean fields in the USA, characterized the LST-NDVI relationship [34]. The bias of the LST-NDVI relationship shifted with LU-LC type [66], atmospheric condition [41], soil wetness [46], and vegetation density [21]. The slope of the regression model between LST and NDVI also varied with thermal properties, evapotranspiration, trapped radiation, and soil and vegetation water content [10, 24, 29, 55].

The vast majority of algorithms have used a physical relationship based on vegetation indices in LST downscaling models. Researchers have developed LST downscaling strategies supported by the statistical relationship between biophysical factors. The basic assumption behind LST downscaling (LSTD) techniques is that the model coefficients are scale-invariant from low to high resolution [39]. LSTD models typically include linear or multiple linear regression [32], principal component regression [27], regression tree [8], artificial neural networks (ANN) [60], support vector machine or regression (SVM or SVR) [15], geographically and temporally weighted regression [65], extreme learning machine [1], and random forest (RF) [20]. The performance of some of these LSTD models has been compared in past studies; still, the conclusions remain controversial, probably because of limited coverage in terms of land cover type, topography, and climate zone.

Recent studies emphasized that the performance of regression models depends primarily on the geographical location where the downscaling algorithm is applied. Their findings indicated that a seasonal comparison of LSTD models is necessary for different land covers to provide convincing assessments. Numerous LSTD algorithms have been proposed for improving the spatial or spatiotemporal resolution of satellite-derived LSTs. Among them, three categories of algorithm are most popular: statistical LSTD, data fusion–based LSTD, and hybrid LSTD. Statistical LSTD improves the spatial resolution of LSTs by directly applying the statistical relationships between LSTs and one or more LST predictors, referred to as auxiliary data, obtained at low spatial resolution. The visible and near-infrared (VNIR) and visible (blue, green, and red) bands have been used to acquire auxiliary data at a higher resolution [61]. Hybrid LSTD combines statistical LSTD and data fusion–based LSTD in diurnal (day-night) or annual temperature cycle models [8]. Implementing the hybrid methods is relatively difficult because of the complexity of achieving uniform projection and resolution.

A negative relationship between LST and NDVI can be expected in South Asian countries because of the tropical warm climate [3]. In any case, the relationship must be established before the LST downscaling model is created. The present research aims to analyze the seasonal performance of linear and polynomial regression LST downscaling models for freely accessible moderate-resolution imaging spectroradiometer (MODIS) satellite datasets. The study analyzes the characteristics of the LST-VI relationship for different seasons over the study area of Jaipur city, India. The three novel research issues addressed in this paper are as follows: (i) to identify the seasonal performance of linear and polynomial regression models using vegetation indices (VI) as auxiliary data; (ii) to compare the level of agreement between three different VI parameters (NDVI, SAVI, and fc) and LST for the summer, winter, and monsoon seasons; and (iii) to determine the highest resolution of LST that can be achieved from an LST-VI regression model without significant error. The investigation furthermore assesses the effect of LST-NDVI sample selection and sample filtering in building the LST downscaling model.

2 Remotely Sensed Materials and Study Area

2.1 Dataset

MODIS satellite images were used in this investigation for LST prediction at higher resolution. The vegetation, emissivity, and LST products acquired by the MODIS Terra sensor are listed in Table 1. The visible near-infrared (VNIR) reflectance bands at 250-m spatial scale served as supporting variables to obtain the vegetation indices (VI) of the Earth’s surface.

Table 1 Dataset used in the present study

2.2 Study Area

The case study area covers the urban and rural areas of Jaipur in the state of Rajasthan, India. The study area lies between 26°40′0″ and 27°10′0″ North latitude and 75°40′0″ and 76°0′0″ East longitude. The geographical location of the study area is shown in Fig. 1.

Fig. 1 Study area

2.3 Test Dates

The MOD11A2 dataset has been converted into LST by multiplying it by the scale factor given in the data user handbook, downloaded from the EarthExplorer website (https://earthexplorer.usgs.gov/). The India Meteorological Department (IMD) recognizes three main seasons: summer (April–June), monsoon (July–mid-September), and winter (November–February). The acquisition dates of the MODIS input data, from 7 February to 24 December 2019, are listed in Table 2. In total, 21 sampled test dates were obtained for the study year 2019 (DOY denotes the day of the year 2019).
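The DN-to-temperature conversion described above can be sketched as follows. This is a minimal illustration, not the authors' processing chain: the scale factor of 0.02 K per DN and the fill value of 0 are assumptions taken from the usual MOD11A2 product documentation and should be checked against the user handbook, and the function name `dn_to_celsius` is hypothetical.

```python
# Assumed MOD11A2 LST layer conventions (verify against the product user guide).
MOD11A2_SCALE = 0.02   # kelvin per digital number (assumption)
FILL_VALUE = 0         # DN used for pixels with no valid LST (assumption)


def dn_to_celsius(dn):
    """Convert one LST digital number to degrees Celsius.

    Returns None for fill pixels so they can be skipped downstream.
    """
    if dn == FILL_VALUE:
        return None
    kelvin = dn * MOD11A2_SCALE
    return kelvin - 273.15


# Example with a made-up DN: 15000 * 0.02 = 300 K, i.e. about 26.85 degrees Celsius.
example_lst = dn_to_celsius(15000)
```

Returning `None` for fill pixels keeps cloud-masked samples out of the later regression rather than letting them enter as 0 K.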

Table 2 Dataset timeline and availability

Accurate LST downscaling requires two consecutive cloud-free scenes. However, 9 of these dates (marked with a red tag) lacked at least one of the two required consecutive scenes. The 12 remaining dates have the required two consecutive cloud-free MODIS scenes 1 day apart (marked with a green tag). The average values of 4 dates in each of the summer, winter, and monsoon seasons (marked with a green tag) were used in the seasonal analysis of the regression model.

3 Methodology

The NDVI, SAVI, and fc images derived from MODIS at 250-m spatial resolution were resampled to 1000 m to match the LST spatially and to develop the LST-vegetation indices relationship. LST was treated as the dependent variable, and NDVI, SAVI, and fc were taken as independent physical factors. The fitting accuracy of the regression model from low to high resolution was assessed with the coefficient of determination (R2).
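One simple way to bring a 250-m index grid to the 1000-m LST grid is to average non-overlapping 4 × 4 blocks. The text does not state which resampling method was used, so block averaging is an assumption here, and the function name `aggregate_mean` is hypothetical.

```python
def aggregate_mean(grid, factor=4):
    """Average non-overlapping factor x factor blocks of a 2-D list.

    With factor=4, a 250-m grid becomes a 1000-m grid.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Made-up 4 x 4 NDVI tile at 250 m collapses to a single 1000-m value.
ndvi_250 = [[0.2, 0.3, 0.4, 0.5]] * 4
ndvi_1000 = aggregate_mean(ndvi_250, factor=4)
```

Averaging preserves the mean index value of each coarse pixel, which is what a pixel-wise comparison with the 1000-m LST needs.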

3.1 Processing of MODIS Data

Radiometric calibration and atmospheric corrections were applied to the MODIS VNIR bands. The at-sensor radiance of the VNIR bands was converted into atmospherically corrected at-surface reflectance with the dark object subtraction (DOS) atmospheric correction model. The DOS model was used because of its simplicity and because radiosonde data were not available. The MODIS surface reflectance acquired in sinusoidal projection was re-projected to the Universal Transverse Mercator (UTM) projection, zone 43 N, on the WGS 84 datum. Twenty well-distributed ground control points (GCPs) were taken as reference for geo-referencing the MODIS imagery. The nearest neighbor resampling technique was used for geo-referencing, with a root mean square error (RMSE) of 0.3. From the corrected MODIS surface reflectance, NDVI, SAVI, and fc were determined.

3.2 Spectral Indices Calculation

The atmospherically corrected VNIR bands were used to calculate NDVI, soil-adjusted vegetation index (SAVI), and fraction vegetation cover (fc).

3.2.1 Normalized Difference Vegetation Index (NDVI)

The NDVI is a numerical indicator of the vegetation, or greenness, of the Earth’s surface [51]. NDVI values depend on the electromagnetic radiation (EMR) reflected by the surface material in the red and near-infrared (NIR) spectrum. As the difference between NIR and red band reflectance increases, the amount of vegetation indicated also increases [11]. The value of NDVI lies between −1 and +1. The NDVI values were calculated as shown in Eq. (1), where ρNIR and ρRED are the reflectances in the NIR and red bands.

$$ NDVI=\frac{\rho_{NIR}-\rho_{RED}}{\rho_{NIR}+\rho_{RED}} \tag{1} $$
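Eq. (1) translates directly into code. The sketch below works on single reflectance values; the function name `ndvi` and the choice to return `None` when both reflectances are zero are illustrative assumptions, not part of the original method.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Undefined ratio (e.g. a no-data pixel); flag it instead of dividing by zero.
        return None
    return (nir - red) / total


# Made-up reflectances: strongly vegetated pixel vs. bare, bright soil.
vegetated = ndvi(0.50, 0.10)   # about 0.67
bare_soil = ndvi(0.30, 0.25)   # about 0.09
```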

3.2.2 Soil-Adjusted Vegetation Index (SAVI)

The SAVI is a parameter that quantifies the presence of vegetation over the Earth’s surface materials and biophysical components [12]. External factors influence NDVI values where the vegetated percentage is very low and a large percentage of the area is soil. The SAVI is a modified version of NDVI in which soil variation is taken into account to determine more accurate vegetation values in sandy areas. To remove the influence of soil and exposed surfaces, a soil adjustment factor (L) has been proposed [47]. The SAVI was calculated with this L adjustment as given in Eq. (2). SAVI values range from −1 to +1.

$$ SAVI=\frac{\rho_{NIR}-\rho_{RED}}{\rho_{NIR}+\rho_{RED}+L}\ast \left(1+L\right) \tag{2} $$
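A minimal sketch of Eq. (2) follows. The text does not report the value of L used in the study; the default of 0.5 below is a commonly used value and is an assumption here, as is the function name `savi`.

```python
def savi(nir, red, L=0.5):
    """Soil-adjusted vegetation index following Eq. (2).

    L is the soil adjustment factor; 0.5 is an assumed default,
    not a value taken from the study.
    """
    denominator = nir + red + L
    if denominator == 0:
        return None
    return (nir - red) / denominator * (1 + L)


# With L = 0 the expression reduces to NDVI, which is a quick sanity check.
savi_default = savi(0.50, 0.10)        # about 0.545
savi_as_ndvi = savi(0.50, 0.10, L=0)   # about 0.667
```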

3.2.3 Fraction Vegetation Cover (fc)

The fc value is a measure of the vegetation level of any Earth’s surface material, expressed as a fraction. For an image pixel, the fc value represents the share of the total pixel area that is covered by trees, plants, shrubs, grass, or any other form of vegetation. The fc was determined following [2], as shown in Eq. (3). The fc value ranges from 0 to 1.

$$ fc=1-\left(\frac{NDVI_{\mathrm{max}}-NDVI}{NDVI_{\mathrm{max}}-NDVI_{\mathrm{min}}}\right)^{0.625} \tag{3} $$

NDVImax and NDVImin are the maximum and minimum NDVI values over all pixels of the captured data.
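Eq. (3) can be sketched as below. Clamping the ratio to the range 0 to 1 is an added safeguard for pixels outside the scene minimum and maximum, not something stated in the text, and the function name `fractional_cover` is hypothetical.

```python
def fractional_cover(ndvi, ndvi_min, ndvi_max, exponent=0.625):
    """Fraction vegetation cover following Eq. (3)."""
    if ndvi_max <= ndvi_min:
        raise ValueError("ndvi_max must be greater than ndvi_min")
    ratio = (ndvi_max - ndvi) / (ndvi_max - ndvi_min)
    # Assumed safeguard: keep the ratio in [0, 1] so the power stays real.
    ratio = min(max(ratio, 0.0), 1.0)
    return 1.0 - ratio ** exponent


# Made-up scene limits: the scene maximum maps to 1 and the minimum to 0.
fc_full = fractional_cover(0.8, ndvi_min=0.1, ndvi_max=0.8)
fc_none = fractional_cover(0.1, ndvi_min=0.1, ndvi_max=0.8)
fc_mid = fractional_cover(0.45, ndvi_min=0.1, ndvi_max=0.8)
```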

3.3 Sample Filtering Using Pixel Coefficient of Variation (CV)

LST-NDVI plots contain a large number of outliers. The outliers arise from mixed pixels in urban areas at low resolution, which lead to a false representation of the Earth’s surface land cover. To build a robust regression model in urban regions, the outliers have to be removed from the input data before the regression model is calibrated. Pixels with low NDVI values (less than 0.3) introduce outliers into the physically based LSTD models. To reduce the impact of outliers on the model relationship and to minimize error, a filtering strategy based on the pixel coefficient of variation (CV) was established, as given in Eq. (4).

$$ CV=\frac{\sigma }{\mu } \tag{4} $$

The symbol “σ” is the standard deviation and “μ” is the mean, computed from the vegetation index and LST images. The regression model input data were filtered by first taking CV values of 0 to 15% and then CV values of 16 to 100% in subsets of 10% intervals.
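The filtering step can be sketched as follows. The text does not spell out which pixel group the CV is computed over; this sketch assumes it is the set of 250-m NDVI values falling inside each 1000-m pixel. The 15 to 25% window is the range reported later in the results, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = sigma / mu (Eq. 4), using the population standard deviation."""
    mu = mean(values)
    if mu == 0:
        return None
    return pstdev(values) / abs(mu)


def filter_by_cv(samples, cv_min=0.15, cv_max=0.25):
    """Keep (coarse_ndvi, coarse_lst) pairs whose sub-pixel CV is in range.

    samples: iterable of (fine_ndvi_values, coarse_ndvi, coarse_lst),
    where fine_ndvi_values are the fine pixels inside one coarse pixel
    (an assumption about how the CV is formed).
    """
    kept = []
    for fine_values, coarse_ndvi, coarse_lst in samples:
        cv = coefficient_of_variation(fine_values)
        if cv is not None and cv_min <= cv <= cv_max:
            kept.append((coarse_ndvi, coarse_lst))
    return kept


# Made-up samples: the first has CV of about 0.20 and is kept, the second has CV 0.
samples = [
    ([0.4, 0.6, 0.4, 0.6], 0.5, 30.0),
    ([0.5, 0.5, 0.5, 0.5], 0.5, 31.0),
]
calibration_pairs = filter_by_cv(samples)
```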

3.4 Uncertainty of the (LST-VI) Regression Model

The LST-VI regression model was tested for uncertainty. The uncertainty of the regression model is a measure of the resolution level that can be achieved without significant errors [58]. The resolution uncertainty was determined by monitoring how the model parameters, i.e., slope and intercept, and the coefficient of determination (R2) vary at different spatial scales. The LST-VI regression relationship was set up using a vegetation-based regression model at 100 m, 200 m, 300 m, 400 m, 500 m, 600 m, 700 m, and 800 m spatial resolution (Fig. 2).
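The regression step can be illustrated with an ordinary least squares fit at coarse resolution, followed by applying the same slope and intercept to a finer VI grid, which relies on the scale-invariance assumption stated in the introduction. This is a sketch under that assumption only; the function names and the sample values are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values are all identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r_squared


def downscale(fine_vi, slope, intercept):
    """Apply the coarse-resolution line to every pixel of a finer VI grid."""
    return [[slope * v + intercept for v in row] for row in fine_vi]


# Made-up coarse samples with a negative LST-NDVI relationship.
coarse_ndvi = [0.2, 0.4, 0.6]
coarse_lst = [40.0, 36.0, 32.0]
slope, intercept, r2 = fit_line(coarse_ndvi, coarse_lst)
fine_lst = downscale([[0.3, 0.5]], slope, intercept)
```

Repeating `fit_line` on grids aggregated to each target resolution gives the slope, intercept, and R2 series whose variation is monitored.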

Fig. 2 Graphical representation of VI-based LSTD model

3.5 Downscaled Temperature Validation

In situ surface temperatures were collected with data loggers to validate the downscaled MODIS LST image over different land uses. Data were collected over a 24-h duration in the monsoon, winter, and summer seasons at seven locations in the Jaipur study area simultaneously. The Google Earth image of the areas selected for thermal readings is shown in Fig. 3. Three locations lie within the urban boundary, and four lie within the rural boundary. About 10–15 measurements were taken at each land use location with infrared thermometers, and the average of the measurements was taken as the LST of that point, as shown in Fig. 4. The device used was a FLUKE infrared thermometer, model 59 mini, with a distance-to-spot (D:S) ratio of 8:1 during calibration.
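The validation arithmetic can be sketched as below: the repeated readings at each point are averaged, and the downscaled LST is compared with these point values. The text does not name the error metric used; root mean square error is an assumption here, and the function names and readings are made up.

```python
from math import sqrt


def point_lst(readings):
    """Average the repeated infrared thermometer readings at one location."""
    return sum(readings) / len(readings)


def rmse(predicted, observed):
    """Root mean square error between downscaled and in situ LST."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared) / len(squared))


# Made-up readings (degrees Celsius) at two locations and their downscaled values.
observed = [point_lst([30.0, 31.0, 32.0]), point_lst([35.0, 37.0])]
predicted = [32.0, 35.0]
validation_rmse = rmse(predicted, observed)
```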

Fig. 3 In situ LST collection points location

Fig. 4 Land cover types: (a) soil, (b) shrubs, (c) grass, (d) concrete, (e) bitumen, and (f) thermal logger used for identification of LST

4 Results and Discussion

In this research, three vegetation indices were used as auxiliary variables, so n = 3. Therefore, regression models up to order n − 1 = 2 were taken into consideration. The order of the polynomial model should be kept as low as possible. Higher-order polynomials should be attempted only if the linear model does not give satisfactory results. In addition, higher-degree models need more physical parameters to calculate the regression coefficients. Arbitrary fitting of higher-order polynomials can be a serious misuse of regression analysis.

4.1 Linear Model vs Polynomial Model

Figure 5 displays LST on the y-axis and VI (NDVI/SAVI/fc) on the x-axis for the linear and polynomial regression models. The 4 days of each season are marked with a green tag in the dataset (Table 2). The average values of the 4 dates of each season were used to build the regression models. The R2 values of the linear model were 0.71, 0.62, and 0.82 for winter, summer, and monsoon, respectively, whereas the polynomial models gave R2 values of 0.59, 0.54, and 0.73 for the respective seasons.

The R2 values of the linear model were much higher than those of the polynomial model for all the seasons, as seen in Fig. 5. The higher value of R2 implies that the LST-VI relationship is linear. The most scattered points and the lowest R2 were seen in the summer season data, followed by monsoon and winter. The LST-VI relationship was better in the winter and monsoon seasons than in the summer season, when vegetation is lacking. A similar seasonal variation in LSTD models has been reported by researchers in semiarid climatic regions [19, 40, 54]. The summer season has shown the lowest accuracy in LST prediction, as in past LSTD studies [45, 62]. The relationship between LST and NDVI was linear and negative [4]. The higher R2 was achieved in winter and monsoon because of the sparse vegetation and highly sandy zones in the summer period [24, 30].

Fig. 5 LST-VI regression models seasonal variation

4.2 Sample Filtering

The regression model inputs were filtered by CV values in the calibration process before estimating the LST. The NDVI values were used for LST estimation at higher spatial resolutions because of their better fit compared with SAVI and fc, as seen in the earlier section. The regression model parameters (slope, intercept, and R2) were plotted against CV in 10% intervals. The slope, intercept, and R2 of the fitted regression model in all seasons are shown in Table 3. The highest R2 values were observed for CV between 16 and 25% in all seasons. Beyond 25% CV, the R2 values started decreasing for all seasons.

Table 3 Coefficient of determination (R2) variation with sample filtering

Figure 6 shows the linear regression plots between LST and NDVI in the winter, summer, and monsoon seasons for the full data and the sampled (15 ≤ CV ≤ 25%) data. The R2 values of the full data were 0.58, 0.39, and 0.60 in the winter, summer, and monsoon seasons, respectively, whereas the sampled data showed much higher R2 values of 0.74, 0.63, and 0.85, respectively. The total error in LST estimation was reduced significantly by using the sampled data of 15 ≤ CV ≤ 25%. The sample screening idea introduced in this research facilitated the elimination of outliers. Fewer outliers strengthened the regression model performance and helped to predict the LST more accurately. The LST prediction models performed poorly at the lower end of NDVI values: the total error in the predicted LST was higher than 3 °C for NDVI values below 0.2. For all seasons, an average error of about 2 °C was found for NDVI values between 0.2 and 0.3, and the LST prediction error fell below 1 °C for NDVI values above 0.3. Higher uncertainty in the regression model parameters was found in the seasonal ANOVA investigation.
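The error-versus-NDVI breakdown described above can be reproduced by grouping pixels into the three NDVI ranges and averaging the absolute error in each. The sketch below assumes mean absolute error as the error measure, which the text does not specify; the function name, bin labels, and sample values are illustrative only.

```python
def error_by_ndvi_bin(ndvi_values, predicted, observed, low=0.2, high=0.3):
    """Mean absolute LST error for NDVI < low, low..high, and > high.

    Returns a dict with keys 'low', 'mid', 'high'; a bin with no
    samples maps to None.
    """
    bins = {"low": [], "mid": [], "high": []}
    for v, p, o in zip(ndvi_values, predicted, observed):
        if v < low:
            key = "low"
        elif v <= high:
            key = "mid"
        else:
            key = "high"
        bins[key].append(abs(p - o))
    return {k: (sum(e) / len(e) if e else None) for k, e in bins.items()}


# Made-up pixels: larger errors at low NDVI, smaller errors at high NDVI.
errors = error_by_ndvi_bin(
    ndvi_values=[0.10, 0.25, 0.50, 0.60],
    predicted=[33.0, 32.0, 30.5, 30.0],
    observed=[30.0, 30.0, 30.0, 30.0],
)
```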

Fig. 6 LST-NDVI seasonal relationship for full data and filtered (15 ≤ CV ≤ 25) data

In earlier studies, researchers suggested selecting pixel subsets from data with minimum inter-pixel variation [43]. Researchers found that spectral or spatial filtering of the regression model input data during the calibration stage led to higher accuracy in LST prediction [38, 48]. The total error in LST prediction has been plotted against varying NDVI values to assess the dependence of the LST prediction on the surface vegetation percentage [35, 64]. Earlier research reported a sudden drop in total error when NDVI values rise above 0.3, after which the error gradually increases [63]. The range and standard deviation of the slope and intercept images were higher in the summer season than on winter and monsoon dates, showing higher uncertainty in summer [53, 61].

4.3 LST Relationship with Vegetation Indices (NDVI, fc, and SAVI)

Linear regression models worked better than polynomial models for all the seasons, so the linear regression model was used to determine the strength of the relationship between LST and VI (NDVI, SAVI, and fc). Figure 7 shows the regression models of LST versus NDVI, SAVI, and fc obtained from MODIS data for winter, summer, and monsoon, respectively. The highest R2 values were observed in the monsoon season, followed by winter, and the lowest in the summer season. The highest R2 was seen between LST and NDVI, which outperformed SAVI and fc in all seasons. The LST relationship was marginally better with SAVI than with fc in the winter and summer seasons. The different forms of vegetation and their variation with time produce differences in the patterns of seasonal NDVI values.

Fig. 7 LST vs (NDVI/SAVI/fc) seasonal relationship with R2 value

NDVI calculations are highly influenced by the form of vegetation, such as deciduous or evergreen [9]. A comparable observation was made in LSTD models for the summer and monsoon seasons [16]. The seasonal downscaling results have shown that the linear regression model achieved better accuracy than the polynomial models. NDVI was strongly correlated with LST in the dataset of all seasons. Consequently, the LST-NDVI relationship was used to develop a downscaling model over a heterogeneous scene [5]. The higher performance of NDVI has also been reported in previous statistical prediction models connecting vegetation fractions to the temperature of the Earth’s surface materials [42]. The error distribution indicated an apparent seasonal variation in the performance of the LST simulations [17].

4.4 Resolution Dependency

The spatial resolution dependency is a measure of the resolution that can be achieved without any significant error in the predicted variable, i.e., LST. In heterogeneous scenes, pixel variability and scene heterogeneity increase with finer spatial resolution. The spatial uncertainty of the regression models was identified by measuring the variation of the regression coefficients, i.e., slope and intercept, and R2.

The LST-NDVI regression model parameters were plotted for spatial resolutions of 200 m, 400 m, 600 m, 800 m, and 1000 m. Figure 8 shows how the slope, intercept, and R2 values of the LST-NDVI regression models shift with spatial resolution. The values of the regression model coefficients change sharply once the spatial resolution becomes finer than 200 m.

Fig. 8 Plot of slope, intercept, and R2 for different resolutions in the LST-NDVI model

5 Conclusion

In this research, the nature of the regression relationship between land surface temperature (LST) and vegetation indices (VI) was identified. The full data and the sample data filtered with the pixel coefficient of variation (CV) were tested. The spatial dependency of the LST-NDVI regression model was identified by measuring the variation of the regression coefficients from low to high resolution. The research indicated that seasonal climatic fluctuations and variation in crop condition profoundly affected the LST-VI relationship. The following points are the concluding remarks for LST prediction with VI-based regression models in the case study of Jaipur city, India.

  • The linear model gave more accurate results than the polynomial models in LST estimation for all seasons. The polynomial model suited the data better than the linear model only at the peak and tail ends of the data distribution. The LST-VI data distribution was found to lie almost on a straight line (1:1) for the case study of Jaipur city, India.

  • Sample filtering by the coefficient of variation (CV) contributed significantly to increasing the R2 of the regression models for all seasons. A CV between 15 and 25% can be considered in the calibration process for LST prediction by vegetation indices–based regression models.

  • A higher R2 was achieved in the winter and monsoon seasons than in the summer season because of the low vegetation in the summer period.

  • The SAVI model was better than fc in winter season only, whereas fc was better than SAVI in summer and monsoon seasons.

  • The NDVI showed the highest coefficient of determination (R2) values compared with SAVI and fc. The NDVI parameters were best suited for LST prediction.

  • The R2 value changes gradually down to 200-m resolution; below 200 m, all the parameter values shift quickly. The LST-VI regression relationship can be used to downscale 1000-m spatial resolution data to a resolution of 200 m with an LST prediction error of less than 1 °C.

  • The seasonal variation of VI depended mainly on the materials’ greenness properties and chlorophyll quality. The regression line coefficients (slope and intercept) were found to be specific to the geographical location.

The research work presented in this paper contributes to estimating the thermal radiation of the Earth’s surface at higher resolution from low-resolution data. The LST-VI regression models can be used for applications related to the identification of fire-prone materials, thermal comfort monitoring in urban areas, and estimation of thermal emissions from a variety of materials. The regression-based models were practically proportionate in nature and require further examination in this respect. As future scope, the LST estimation can be further tested by obtaining the auxiliary variable from higher-resolution thermal sensors onboard satellites such as Landsat 8 and the Sentinel-2/3 series.