1 Introduction

Soil moisture (SM) is the water content of land surface soil expressed as a ratio of the volume (or mass) of water contained in a unit volume (or mass) of soil. SM quantity plays a significant role in controlling the exchange of water and heat energy between land surface and atmosphere. In addition, SM is an important geophysical variable in operational applications such as agricultural activity, global flood management, numerical weather forecasting, and climate change (e.g., GCOS, 2010).

Ground-based point measurements of SM suffer from high spatial and temporal variability, sparse coverage, and uneven consistency, integrity, and reliability of data quality. Satellites are useful for global SM estimation because they provide accurate and consistent measurements. In general, microwave satellite remote sensing is widely used for this purpose because of its minimal dependence on weather conditions and its direct relation to SM (e.g., Wang and Schmugge 1980; Dobson et al. 1985; Jackson 1993; De Jeu et al. 2008; Hong and Shin 2011). However, it has the disadvantage of relatively low spatial and temporal resolution compared to optical and infrared (IR) satellite remote sensing. In addition, strong radio-frequency interference (RFI) in microwave remote sensing (Ellingson and Johnson 2006; Li et al. 2004) causes difficulties in developing SM retrieval algorithms.

There have been many studies on visible and IR-based SM retrieval (e.g., Bowers and Hanks 1965; Price 1977; Stoner and Baumgardner 1981; Lobell and Asner 2002; Liu et al. 2003). However, these approaches have limitations due to the impact of land type difference and soil texture on SM (e.g., Stoner and Baumgardner 1981; Escadafal et al. 1989; Mattikalli 1997) and the low transmittance of vegetation cover layers.

To include the effect of vegetation on SM, the temperature vegetation dryness index (TVDI) (e.g., Price 1990; Gillies et al. 1997) has been proposed as a combination of two factors: the normalized difference vegetation index (NDVI) and land surface temperature (LST). It was developed for the empirical interpretation of water stress on vegetation (e.g., Sandholt et al. 2002; Wang et al. 2004; Han et al. 2010; Chen et al. 2011), and satellite-derived TVDI was found to have a negative linear relationship with in situ SM measurements (e.g., Chen et al. 2015). Carlson (2007) showed that TVDI could provide more complete information on SM monitoring based on the soil-vegetation-atmosphere transfer (SVAT) model. Gillies and Carlson (1995) described the NDVI-LST spatial distribution as a triangular distribution, interpreting it as TVDI. Moran et al. (1996) hypothesized that the distribution in the space of LST difference and NDVI is trapezoidal, and developed the water deficit index (WDI) (e.g., Petropoulos et al. 2009). Sandholt et al. (2002) found a high correlation coefficient (CC = 0.70) between TVDI and SM in a study of northern Senegal.

In this study, we present a TVDI-based SM retrieval algorithm using Moderate Resolution Imaging Spectroradiometer (MODIS) data to compensate for two disadvantages of passive microwave remote sensing: coarse spatial resolution and limited SM retrieval over densely vegetated regions.

2 Data and Method

2.1 Data

For this case study, we selected the Far East Asia region including the Korean Peninsula as the study area. For the land cover information and elevation map, we used the MODIS land cover product (MCD12Q1) and the Advanced Space-borne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM) (version 2.0) product. Figure 1 shows land cover information (Fig. 1a) and the elevation map (Fig. 1b) of the study area. Figure 1a shows that four land types occupy most of the study area: barren or sparsely vegetated land (pink), grassland (yellow), cropland (orange), and forest (green and dark green). The MCD12Q1 data from 2014 were used to obtain the relationship coefficients between TVDI and SM for the different land types, and the MCD12Q1 data from 2015 were used to verify our algorithm.

Fig. 1 (a) Land cover and (b) elevation information of the study area. Land covers are: (1) Evergreen Needleleaf Forests; (2) Evergreen Broadleaf Forests; (3) Deciduous Needleleaf Forests; (4) Deciduous Broadleaf Forests; (5) Mixed Forests; (6) Closed Shrublands; (7) Open Shrublands; (8) Woody Savannas; (9) Savannas; (10) Grasslands; (11) Permanent Wetlands; (12) Croplands; (13) Urban and Built-up Lands; (14) Cropland/Natural Vegetation Mosaics; (15) Permanent Snow and Ice; (16) Barren

For estimating TVDI, we used MODIS LST (MOD11A2, version 6) data and MODIS NDVI of vegetation index (MOD13A2, version 6) data. The MODIS instruments onboard the TERRA satellite (launched in 1999) have 36 spectral bands ranging from 0.4 μm to 14.4 μm. We used the MODIS 16-day NDVI products and 8-day LST products with 1 km resolution. Accordingly, we converted 8-day MODIS LST into 16-day MODIS LST by calculating the averages in order to coincide with the MODIS NDVI data.
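The temporal matching of the two products can be illustrated with a minimal sketch. It assumes that each 16-day NDVI period is covered by two consecutive 8-day LST composites and that a missing LST value is encoded as NaN; the function name, the sample values, and the rule of falling back to the single valid composite are assumptions of this example rather than details given in the text.

```python
import math

def composite_16day(lst_first, lst_second):
    """Average two consecutive 8-day LST values (K) for one pixel.

    NaN marks a missing value (for example, cloud). Using the single valid
    composite when the other is missing is an assumption of this sketch.
    """
    valid = [v for v in (lst_first, lst_second) if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan

# Hypothetical 8-day LST values (K) for three pixels.
first_8day = [300.0, 295.0, math.nan]
second_8day = [302.0, math.nan, math.nan]

lst_16day = [composite_16day(a, b) for a, b in zip(first_8day, second_8day)]
```

A pixel with no valid LST in either composite stays missing, so it can be excluded from the later TVDI calculation.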

Several parameters influence LST, including solar incident radiation, the angle of incidence of solar radiation, land cover, and air temperature. Recently, Kwon et al. (2017) and Khandelwal et al. (2017) studied the effect of changes in elevation on LST. They found consistent inverse linear trends between LST and elevation for all seasons. Therefore, we applied an elevation correction to the MODIS LST because the study area contains various types of land cover and a wide elevation range from 0 to more than 8000 m. To correct for the elevation effect on the LST product, we used the GDEM (version 2.0) product, which is generated from stereo pairs collected by ASTER. The GDEM version 2.0 has 30-m spatial resolution with 10–25 m accuracy in elevation and covers land between 83°N and 83°S.

For SM retrieval based on TVDI values, we used the Global Land Data Assimilation System (GLDAS) surface SM product. It produces high quality and synthesized SM data using a variety of observations and surface models. GLDAS data are available at spatial resolutions of 0.25° and 1.0° at 3-h intervals. Monthly-averaged GLDAS SM data are also provided. The GLDAS provides four different surface models: Mosaic, Noah, Community Land Model (CLM), and Variable Infiltration Capacity (VIC). In this study, we adopted the SM data from the Noah model (Noah 2.7: GAS/Noah experiment 881). The Noah model consists of four SM layers from the surface (0–0.1 m) to the root zone (1–2 m). Our TVDI values are comparable only with the surface SM data from GLDAS because of the shallow penetration depth of IR bands. The GLDAS SM data were converted from kg/m2 into volumetric SM (m3/m3). In addition, GLDAS SM products were averaged over 16 days to coincide with the MODIS data.
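The unit conversion follows from dividing the water mass per unit area by the density of water and the thickness of the soil layer. The sketch below assumes the 0–0.1 m surface layer and a water density of 1000 kg/m3; the function name and the sample value are hypothetical.

```python
WATER_DENSITY = 1000.0  # kg/m^3, assumed constant

def to_volumetric(sm_kg_m2, layer_thickness_m=0.1):
    """Convert layer soil moisture from kg/m^2 to m^3/m^3.

    The default thickness corresponds to the 0-0.1 m surface layer;
    other layers would need their own thickness.
    """
    return sm_kg_m2 / (WATER_DENSITY * layer_thickness_m)

# Hypothetical surface-layer value of 25 kg/m^2.
sm_volumetric = to_volumetric(25.0)
```

For the surface layer this reduces to dividing the kg/m2 value by 100.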

To assess the remote sensing data, we used the in situ measurements provided by the Korea Rural Development Administration databases (RDA; http://weather.rda.go.kr/). The RDA provides hourly ground SM data measured at 10 cm depth using Time Domain Reflectometry (TDR). Most of the RDA stations used in this study were located in croplands. Table 1 summarizes all input and auxiliary data used for developing this IR-based SM algorithm. The RDA SM is the point-based SM, while our IR-based SM and the GLDAS SM have spatial resolutions of 1 km and 25 km, respectively. Thus, three SMs should be compared qualitatively on the basis of the spatial pattern and temporal variation trends of each SM rather than quantitative SM amounts.

Table 1 Summary of data

2.2 Method

The TVDI is defined as follows (e.g., Sandholt et al. 2002):

$$ TVDI=\frac{\left({T}_s-{T}_{s,\mathit{\min}}\right)}{\left({T}_{s,\mathit{\max}}-{T}_{s,\mathit{\min}}\right)} $$
(1)

where Ts is the observed LST. Ts, max and Ts, min are obtained from distribution of observed Ts. TVDI values show a linear increase from 0 (wet edge) to 1 (dry edge).

In general, TVDI is 0 for saturated soil surfaces and 1 for completely dry soil. Therefore, TVDI values lie between 0 and 1 for general soil surfaces, from bare soil to vegetation canopy. Previous studies have demonstrated that TVDI-based SM retrieval is possible from bare soil to dense forest (e.g., Capehart and Carlson 1997; Sandholt et al. 2002; Cho et al. 2016). This is one advantage of using the TVDI; microwave satellite-based SM retrievals are not available in areas with dense vegetation, including forest regions (e.g., Wigneron et al. 2003).

In Equation (1), Ts, max and Ts, min are the maximum and minimum LST as a function of NDVI, respectively. They are described as follows:

$$ {T}_{s,\mathit{\min}}={a}_1+{b}_1\cdotp NDVI $$
(2)
$$ {T}_{s,\mathit{\max}}={a}_2+{b}_2\cdotp NDVI $$
(3)

where a1 and a2 are the offsets and b1 and b2 are the slopes of the lower and upper LST limit functions, respectively, which are linearly fitted and determined empirically from the given data.
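Equations (1) to (3) can be combined into a short per-pixel calculation. The following is a minimal sketch; the edge coefficients in the example are hypothetical values chosen for illustration and are not the coefficients fitted in this study.

```python
import math

def tvdi(ts, ndvi, a1, b1, a2, b2):
    """TVDI from Eqs. (1)-(3) for one pixel.

    a1, b1: offset and slope of the lower LST limit (wet edge), Eq. (2).
    a2, b2: offset and slope of the upper LST limit (dry edge), Eq. (3).
    """
    ts_min = a1 + b1 * ndvi
    ts_max = a2 + b2 * ndvi
    if ts_max <= ts_min:
        # The two edges meet or cross at this NDVI, so TVDI is undefined.
        return math.nan
    return (ts - ts_min) / (ts_max - ts_min)

# Hypothetical edges: wet edge at 290 K, dry edge 320 K - 25 K * NDVI.
value = tvdi(ts=300.0, ndvi=0.4, a1=290.0, b1=0.0, a2=320.0, b2=-25.0)
```

With these illustrative numbers the pixel lies halfway between the wet edge (290 K) and the dry edge (310 K at NDVI = 0.4), giving a TVDI of 0.5.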

In this study, we used the 16-day averaged NDVI with a 1-km resolution (MOD13A2.5). Fig. 2 shows a conceptual diagram of the TVDI with a triangular or trapezoidal shape in the space of vegetation index and LST (e.g., Sandholt et al. 2002; Cho et al. 2016). Previous studies (e.g., Sandholt et al. 2002; Patel et al. 2009) reported that there was a negative linear relationship between in situ measurements and satellite-derived TVDI.

Fig. 2 A conceptual diagram of TVDI in the space of LST and NDVI

In different circumstances, the TVDI depends on LST as well as NDVI values. LST is influenced by several parameters including SM, solar incident radiation, the angle of incidence of solar radiation, land cover, and air temperature. In general, air temperature decreases as altitude increases in a stationary atmosphere. Figure 3 shows that LST decreases approximately linearly with elevation. The LST of high-elevation land may therefore reflect not only SM but also the low air temperature. Khandelwal et al. (2017) reported that the elevation effect caused a difference of 3.5 to 4.6 °C/km in MODIS LST in the area surrounding Jaipur, India. Thus, this study used an elevation-corrected LST for the study area, based on separate simple linear relationships between LST and elevation for the different land types, as follows:

$$ {T}_{s, corr}={T}_s+(slope)\times Elevation $$
(4)

where Ts, corr is the elevation-corrected LST.
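A minimal sketch of the elevation correction for one land type is given below. It fits a straight line to LST against elevation by ordinary least squares and then removes the fitted trend so that every pixel is referred to 0 m. The sign convention is an assumption of this example: the fitted slope is negative when LST decreases with height, so the term added to the observed LST is the negative of the fitted slope times elevation. The helper name and the sample values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = offset + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical pixels of one land type: elevation (m) and LST (K).
elevation = [0.0, 500.0, 1000.0, 2000.0]
lst = [300.0, 297.0, 294.0, 288.0]

offset, slope = fit_line(elevation, lst)  # slope in K per metre

# Refer each LST to sea level by removing the fitted elevation trend.
lst_corr = [t - slope * z for t, z in zip(lst, elevation)]
```

In this synthetic case LST falls by 6 K per km, so all corrected values collapse to the sea-level offset of 300 K. Real data would retain the scatter caused by SM and the other factors.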

Fig. 3 Elevation effect on LST for different land types: (a) Evergreen Needleleaf Forests; (b) Evergreen Broadleaf Forests; (c) Deciduous Needleleaf Forests; (d) Deciduous Broadleaf Forests; (e) Mixed Forests; (f) Closed Shrublands; (g) Open Shrublands; (h) Woody Savannas; (i) Savannas; (j) Grasslands; (k) Croplands; (l) Cropland/Natural Vegetation Mosaics; (m) Barren

We used Ts, corr in equation (1) to calculate TVDI. Thus, for different land types, we calculated the Ts, corr corresponding to 0 m above sea level using equation (4). Table 2 summarizes the conversion coefficients with slopes and offsets for different land types between MODIS LST and Ts, corr, which were estimated using the MODIS LST product and GDEM elevation data. Figure 4 shows the effect of the elevation correction of the LST on the TVDI on August 13, 2015. MODIS LST overestimates TVDI values in high-elevation regions (Fig. 4a), while the TVDI values obtained from the elevation-corrected LST are more reasonable for the same regions (Fig. 4b). As a result, the Tibetan Plateau region exhibits high TVDI values (dry), while the southern China region shows lower TVDI values (wet).

Table 2 Regression coefficients between LST and elevation-corrected LST for different land types
Fig. 4 TVDI maps using (a) MODIS LST and (b) elevation-corrected LST on August 13, 2015

Subsequently, we determined the upper (dry edge) and lower (wet edge) lines in the Ts, corr-NDVI space to compute TVDI using equations (1) to (4). The data points in Ts, corr-NDVI space show a triangular distribution.
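The text does not specify how the two edges were fitted. One common approach, shown here purely as an assumed illustration, is to divide the NDVI range into bins, take the maximum and minimum LST in each bin, and fit a straight line through each set of extremes. The function names, the number of bins, and the sample pixels are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = offset + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_edges(ndvi, ts, n_bins=5):
    """Return ((a1, b1), (a2, b2)) for the wet and dry edges.

    Needs a non-zero NDVI range and at least two populated bins.
    """
    lo, hi = min(ndvi), max(ndvi)
    width = (hi - lo) / n_bins
    bins = {}
    for v, t in zip(ndvi, ts):
        k = min(int((v - lo) / width), n_bins - 1)
        bins.setdefault(k, []).append((v, t))
    x, t_low, t_high = [], [], []
    for members in bins.values():
        x.append(sum(v for v, _ in members) / len(members))  # bin-mean NDVI
        t_low.append(min(t for _, t in members))   # wet-edge sample
        t_high.append(max(t for _, t in members))  # dry-edge sample
    return fit_line(x, t_low), fit_line(x, t_high)

# Hypothetical pixels: one cool and one warm pixel at each NDVI value.
ndvi = [0.1, 0.1, 0.3, 0.3, 0.5, 0.5, 0.7, 0.7, 0.9, 0.9]
ts = [290.0, 317.5, 290.0, 312.5, 290.0, 307.5, 290.0, 302.5, 290.0, 297.5]

(a1, b1), (a2, b2) = fit_edges(ndvi, ts)
```

For this synthetic triangle the wet edge is flat at 290 K and the dry edge falls from 320 K with a slope of 25 K per unit NDVI. Operational schemes often add outlier screening before the fit, which is omitted here.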

We then derived the SM conversion equation using pairs of the calculated TVDI and GLDAS SM content with the following linear regression equation:

$$ {M}_v=c+d\cdotp TVDI $$
(5)

where Mv is the retrieved SM amount and c and d are the offset and slope, respectively.

We calculated 16-day averaged TVDI using MODIS elevation-corrected LST and NDVI products from January to December 2014. We calculated the coefficients in equation (5) separately for the different land types. The calculated TVDI were then compared with GLDAS SM to derive the relationship between TVDI and SM. Table 3 tabulates the offset (c) and slope (d) in equation (5). These coefficients were calculated using the collocation data of MODIS TVDI and GLDAS SM data throughout 2014. Figure 5 shows the flowchart of our TVDI-based SM retrieval algorithm.
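The per-land-type calibration of equation (5) can be sketched as follows. Collocated (land type, TVDI, SM) samples are grouped by land type, and an ordinary least-squares line is fitted for each group. The sample values and the function names are hypothetical and do not reproduce the coefficients in Table 3.

```python
from collections import defaultdict

def fit_line(x, y):
    """Ordinary least-squares fit y = offset + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical collocated samples: (land type, TVDI, reference SM in m3/m3).
samples = [
    ("Croplands", 0.2, 0.34), ("Croplands", 0.6, 0.22), ("Croplands", 0.8, 0.16),
    ("Grasslands", 0.3, 0.25), ("Grasslands", 0.7, 0.15),
]

by_type = defaultdict(lambda: ([], []))
for land_type, tvdi_value, sm in samples:
    by_type[land_type][0].append(tvdi_value)
    by_type[land_type][1].append(sm)

# coeffs[land_type] = (c, d), the offset and slope of Eq. (5).
coeffs = {lt: fit_line(x, y) for lt, (x, y) in by_type.items()}

def retrieve_sm(land_type, tvdi_value):
    """Apply Eq. (5) with the coefficients of the pixel's land type."""
    c, d = coeffs[land_type]
    return c + d * tvdi_value
```

Keeping one coefficient pair per land type allows the retrieval to be applied pixel by pixel at the resolution of the TVDI map once the land cover class is known.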

Table 3 Regression coefficients between TVDI and GLDAS soil moisture for different land types
Fig. 5 Flowchart of the proposed TVDI-based SM retrieval algorithm

Finally, the TVDI-based SM is validated with the 2015 in situ RDA measured data. We used the root mean square error (RMSE) (equation (6)) and Pearson correlation coefficient (equation (7)) as validation indicators. The RMSE and CC are defined as:

$$ \mathrm{RMSE}=\sqrt{\frac{1}{N}\sum \limits_{i=1}^N{\left({y}_i-{x}_i\right)}^2} $$
(6)
$$ {r}_{x,y}=\frac{\sum \left({x}_i-\overline{x}\right)\left({y}_i-\overline{y}\right)}{\sqrt{\sum {\left({x}_i-\overline{x}\right)}^2}\sqrt{\sum {\left({y}_i-\overline{y}\right)}^2}} $$
(7)

where N is the number of data points, and xi and yi are the i-th TVDI-based SM and in situ RDA SM measurements, respectively. rx,y is the CC between the TVDI-based SM (x) and the in situ RDA SM measurements (y), and \( \overline{x} \) and \( \overline{y} \) are the means of x and y, respectively.
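The validation indicators can be computed as in the sketch below. The RMSE follows equation (6), and the Pearson coefficient uses the standard form with the sum of products of deviations in the numerator. The bias is not defined in the text; taking it as the mean of retrieved minus in situ values is an assumption of this example, chosen so that a negative bias corresponds to underestimation. The sample values are hypothetical.

```python
import math

def rmse(x, y):
    """Root mean square error between retrieved (x) and in situ (y) SM."""
    return math.sqrt(sum((yi - xi) ** 2 for xi, yi in zip(x, y)) / len(x))

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mean_x) ** 2 for xi in x)) * \
          math.sqrt(sum((yi - mean_y) ** 2 for yi in y))
    return num / den

def bias(x, y):
    """Mean of retrieved minus in situ SM (assumed definition)."""
    return sum(xi - yi for xi, yi in zip(x, y)) / len(x)

# Hypothetical 16-day SM values (m3/m3).
retrieved = [0.20, 0.25, 0.30, 0.35]
in_situ = [0.25, 0.30, 0.35, 0.40]

scores = (rmse(retrieved, in_situ), pearson_r(retrieved, in_situ),
          bias(retrieved, in_situ))
```

The example shows why the indicators are reported together: a retrieval that is offset by a constant 0.05 m3/m3 has a perfect correlation but a non-zero RMSE and bias.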

3 Results

Figure 6 shows scatterplot examples of NDVI and LST. Subsequently, we compared the TVDI with GLDAS SM data. We obtained a 16-day averaged TVDI in the study area using the elevation-corrected MODIS LST and NDVI data in 2014. The MODIS TVDI was spatially-averaged in the GLDAS grid after collocation between MODIS TVDI and GLDAS SM data.
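The spatial averaging of the 1-km TVDI onto the coarser grid can be sketched as a simple cell-wise mean. The example assumes that the 0.25° cells are aligned to multiples of 0.25° in latitude and longitude and that missing TVDI values are stored as NaN; the function name and the pixel values are hypothetical.

```python
import math
from collections import defaultdict

def aggregate_to_grid(pixels, cell_deg=0.25):
    """Average pixel values within each grid cell.

    pixels: iterable of (latitude, longitude, value); NaN values are skipped.
    Returns {(row_index, col_index): mean value}, where the indices are the
    floor of latitude and longitude divided by the cell size.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in pixels:
        if math.isnan(value):
            continue
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Hypothetical 1-km TVDI pixels: (lat, lon, TVDI).
pixels = [
    (37.10, 128.40, 0.4),
    (37.20, 128.45, 0.6),
    (37.30, 128.40, 0.9),
    (37.15, 128.30, math.nan),  # cloud-masked pixel
]
grid = aggregate_to_grid(pixels)
```

The first two pixels fall in the same cell and average to 0.5, while the third lies in the next cell to the north and keeps its own value.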

Fig. 6 Scatterplots between NDVI and LST for various land types: (a) Evergreen Needleleaf Forests; (b) Evergreen Broadleaf Forests; (c) Deciduous Needleleaf Forests; (d) Deciduous Broadleaf Forests; (e) Mixed Forests; (f) Closed Shrublands; (g) Open Shrublands; (h) Woody Savannas; (i) Savannas; (j) Grasslands; (k) Croplands; (l) Cropland/Natural Vegetation Mosaics; (m) Barren

Figure 7 shows an example of the relationship between MODIS TVDI and GLDAS SM on August 21, 2014. Deciduous needleleaf forests, deciduous broadleaf forests, open shrublands, grasslands, croplands, and barren land show relatively strong negative correlations between TVDI and SM (Fig. 7a), while evergreen needleleaf forests, evergreen broadleaf forests, mixed forests, woody savannas, savannas, and cropland/natural vegetation mosaics show relatively weak positive correlations (Fig. 7b). The positive linear relationships may be due to misclassification of the land cover rather than to linear changes in the SM. The data points with high TVDI (> 0.5) and high SM (> 0.3 cm3/cm3) are distributed mainly in agricultural areas around southern China, where the data may be contaminated by lakes or reservoirs.

Fig. 7 Relationship between MODIS TVDI and GLDAS SM for (a) land types with a negative linear relationship (Deciduous Needleleaf Forests, Deciduous Broadleaf Forests, Open Shrublands, Grasslands, Croplands and Barren) and (b) land types with a positive linear relationship (Evergreen Needleleaf Forests, Evergreen Broadleaf Forests, Mixed Forests, Woody Savannas, Savannas and Cropland/Natural Vegetation Mosaics)

Figure 8 shows the results of our proposed SM retrieval algorithm applied to Far East Asia including the Korean Peninsula on August 13, 2015. Figures 8a-i show the distributions of land cover type, MODIS NDVI, MODIS LST, elevation-corrected LST, TVDI, estimated SM, GLDAS SM, SMOS SM, and SMAP SM. In this study, we did not calculate the TVDI for urban areas, permanent water bodies, or lakes. In addition, cloudy and frozen land (LST less than 273.15 K) areas were excluded, as shown in Fig. 8. The TVDI map in Fig. 8e was generated using NDVI and elevation-corrected LST (Fig. 8d). The NDVI and elevation-corrected LST show details of the dryness of the topography of the Korean Peninsula. The TVDI-based SM map (Fig. 8f) also describes a typical summer pattern of SM in the Korean Peninsula, which is similar to the GLDAS SM (Fig. 8g). However, the SMOS SM (Fig. 8h) showed relatively low SM in the Manchuria region. The SMAP SM (Fig. 8i) showed high SM along the coastlines of the Korean Peninsula and Southern China. The SMOS and SMAP SMs displayed different SM amounts and distributions in the Korean Peninsula and Manchuria regions.
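The exclusion rules described above can be expressed as a small per-pixel screening function. In this sketch the land cover is given as a class name, a cloudy pixel is assumed to have no LST (encoded as NaN), and the class label used for water is a hypothetical placeholder; only the 273.15 K threshold is taken from the text.

```python
import math

# Class labels are placeholders for the excluded urban and water classes.
EXCLUDED_CLASSES = {"Urban and Built-up Lands", "Water Bodies"}
FREEZING_K = 273.15

def is_retrievable(land_cover, lst_kelvin):
    """Return True if TVDI and SM should be computed for this pixel."""
    if land_cover in EXCLUDED_CLASSES:
        return False
    if math.isnan(lst_kelvin):  # assumed encoding of a cloudy pixel
        return False
    return lst_kelvin >= FREEZING_K  # frozen land is excluded

checks = [
    is_retrievable("Croplands", 295.0),
    is_retrievable("Croplands", 270.0),
    is_retrievable("Urban and Built-up Lands", 295.0),
    is_retrievable("Grasslands", math.nan),
]
```

Applying the screen before the TVDI step keeps frozen or cloud-affected LST values from distorting the fitted dry and wet edges.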

Fig. 8 Distributions of (a) land cover type, (b) MODIS NDVI, (c) MODIS LST, (d) elevation-corrected LST, (e) TVDI, (f) estimated SM, (g) GLDAS SM, (h) SMOS SM, and (i) SMAP SM

Figure 9 shows an example of a validation result as a time series assessment at station 2711 of the RDA (Yeongwol: longitude = 128.4618°E, latitude = 37.1836°N), one of 73 RDA stations. Figure 9a shows the location of RDA station 2711. Figure 9b shows the time series of 16-day averaged retrieved SM, GLDAS SM, and RDA SM amounts. GLDAS SM values are relatively high, from 0.25 m3/m3 to 0.38 m3/m3, in January to February 2015 (winter season), while our TVDI-based SM and RDA SM show good agreement in the same period. From spring to autumn, the three SM products agree better. In general, RDA SM shows higher SM content, and the GLDAS and TVDI-based SMs tend to underestimate RDA SM. A large discrepancy among the three SMs occurs in November and December 2015, when the RDA SM values decrease to approximately 0.1 m3/m3 and the GLDAS and TVDI-based SMs, which vary little, tend to overestimate RDA SM. Figures 9c and 9d show the statistical results of TVDI-based SM and GLDAS SM compared with in situ RDA measurements during 2015. If the winter season is included (Fig. 9c), both the GLDAS and the TVDI-based SMs show very low performance. If the winter season is excluded because the land surface is frozen during this period (Fig. 9d), TVDI-based SMs show reasonable agreement with the RDA SM, with CC = 0.556, bias = −0.039 m3/m3 and RMSE = 0.051 m3/m3. The performance of the TVDI-based SMs is very similar to that of the GLDAS SMs, which have CC = 0.609, bias = −0.035 m3/m3, and RMSE = 0.047 m3/m3.

Fig. 9 (a) Location of RDA station 2711; (b) time series of 16-day averaged retrieved MODIS TVDI-based, GLDAS and RDA SMs; scatterplots of MODIS TVDI-based SM vs. RDA SM and GLDAS SM vs. RDA SM (c) including the winter season and (d) excluding the winter season at RDA station 2711 during 2015

Figure 10 shows the spatial distribution of RMSE and CC between the TVDI-based SM and the in situ RDA SM measurements during the nine months from March to September 2015. The RMSE values are less than 0.1 m3/m3 at all but a few stations. The CC values are generally low.

Fig. 10 Spatial distribution of (a) RMSE and (b) correlation coefficient between the TVDI-based SM and the in situ RDA SM measurements over the course of 9 months, from March to September 2015

Figure 11 compares GLDAS SM with the SM products of the passive microwave satellites SMAP and SMOS during September 2015. The comparison between SMAP-provided SM and GLDAS SM gives CC = 0.637, bias = 0.042 m3/m3, and RMSE = 0.152 m3/m3. The SMOS-derived SM gives CC = 0.741, bias = 0.010 m3/m3, and RMSE = 0.103 m3/m3. Moreover, our TVDI-based SM showed CC = 0.609, bias = −0.035 m3/m3, and RMSE = 0.047 m3/m3. This discrepancy may arise from the land type classification, because the in situ RDA SM data were mainly measured in agricultural croplands. In addition, the presence of rivers or lakes around the measurement area within the MODIS pixel (1 km × 1 km) can lead to a difference between TVDI-based SM and in situ RDA SM. Overall, our SM algorithm showed agreement with the GLDAS SM comparable to that of the SMAP and SMOS SMs. Accordingly, the proposed approach is sufficient for effective SM retrieval.

Fig. 11 GLDAS SM vs. (a) SMAP SM and (b) SMOS SM during September 2015

4 Discussion

In general, the IR-based SM retrieval approach has physical shortcomings such as no direct response to the SM and shallow penetration depth compared to microwave satellite remote sensing. In addition, the IR-based SM retrieval approach has physical limitations because of the low transmittance of the vegetation layer and the impacts of land type dependence and soil texture on SM. This study presents an IR-based SM retrieval algorithm using the relationship between SM and TVDI. Our TVDI-based SM retrieval algorithm showed similar physical limitations to the previously investigated IR-based SM retrieval approaches in that the CC values were generally low in comparison to the RDA SM amounts in the Korean Peninsula.

However, this study has the advantage of explainable geophysical variables such as LST and NDVI, which, in particular, can be estimated via the IR satellite observations. Our TVDI-based SM showed similar amounts and distributions to the GLDAS SM because the latter was used to obtain the relationship between TVDI and SM. However, our TVDI-based SM retrieval algorithm has the advantage of higher spatial resolution than GLDAS SM. In addition, our TVDI-based SM showed good agreement with the ground observations (RDA SMs) but tended to underestimate RDA SM, except for the frozen season. Notably, the SMOS and SMAP SMs showed different SM amounts and distributions in the Korean Peninsula and Manchuria regions (Fig. 8).

This paper presents the land type dependence of the TVDI-SM relationship; in other words, the LST and vegetation type dependence of SM. A previous study by Chen et al. (2015) showed that satellite-derived TVDI has a negative linear relationship with in situ SM measurements. In this study, negative relationships between TVDI and SM were observed for land types such as deciduous needleleaf forests, deciduous broadleaf forests, open shrublands, grasslands, and croplands. In contrast, land types such as evergreen broadleaf forests, mixed forests, savannas, woody savannas, and cropland/natural vegetation mosaics showed a weak positive relationship between TVDI and SM.

5 Concluding Remarks

This study presented a TVDI-based SM retrieval algorithm using MODIS data, including elevation-corrected LST, to compensate for the coarse spatial resolution of passive microwave remote sensing and its limited SM retrieval over densely vegetated regions. The TVDI is estimated using LST and NDVI information from optical satellite observations. MODIS LST, NDVI products, and GLDAS SM data were used to develop the 16-day averaged TVDI and SM estimates. The LST dependence on elevation was also analyzed. The in situ RDA SM data were used for the validation of the proposed algorithm. As evident from the validation results, the TVDI-based SM algorithm produces an accuracy similar to that of the GLDAS SM products, with reasonable agreement with RDA SM (RMSE below 0.1 m3/m3) in the Korean Peninsula, excluding the winter season.

The proposed TVDI-based SM retrieval algorithm could provide SM at a higher spatial resolution than microwave satellites such as SMOS and SMAP, in addition to overcoming the temporal limitations of models such as GLDAS.