1 Introduction

The availability of reliable ground-based soil moisture data is crucial for the validation of remotely sensed satellite soil moisture estimates. Such validated satellite soil moisture products can be used for the initialization and validation of land models, which in turn improves the understanding of land-surface processes and land–atmosphere feedbacks. Many studies have shown that initializing soil moisture in the land model can significantly improve seasonal predictions and the representation of inter-annual variability (Douville and Chauvin 2000; Ni-Meister et al. 2006). An improper initialization of a land model can result in the accumulation of anomalies in the model, which can lead to faulty interpretation of the climate variables (Rodell et al. 2005). General circulation models are also used to predict long-term climate change, and it has been shown that soil moisture contributes significantly to surface temperature and precipitation trends (May et al. 2015, 2017). This makes it all the more necessary to investigate soil moisture biases in land models, which requires accurate soil moisture estimates with global coverage and high spatial resolution.

In recent times, the European Space Agency, under its Climate Change Initiative, has generated a 35-year (starting in 1978) global gridded soil moisture dataset based on satellite observations; the product is referred to as the European Space Agency Climate Change Initiative Soil Moisture (ESACCI SM) product. Dorigo et al. (2015) validated this dataset using in-situ soil moisture observations from 596 stations of the International Soil Moisture Network, with decent coverage over Asia, eastern Europe and the United States of America. They also noted an increase in the spatiotemporal coverage of the ESACCI dataset with time. However, over India and China, historic datasets were used for validation at weekly or monthly temporal resolution. Agrawal and Chakraborty (2016) used this satellite-derived soil moisture dataset to identify a surface soil moisture bias in the land model, which drastically modulated the seasonal cycle of the monsoon over the Indian region.

Many studies have validated satellite-derived soil moisture products against in-situ measurements over various parts of the world (Draper et al. 2009; Albergel et al. 2012; Bitar et al. 2012; An et al. 2016; McNally et al. 2016; Ikonen et al. 2018). These studies have helped in identifying the bias in the satellite products, consequently improving their accuracy over those regions. However, very few studies use a considerable number of in-situ measurements over the Indian region. A major reason for this research gap was the lack, until very recently, of ground-based soil moisture measurements at high temporal resolution over the Indian region. The ESACCI soil moisture dataset has not yet been validated over the Indian region using any recent in-situ dataset.

However, a recently available in-situ soil moisture dataset at high temporal resolution, measured during the Continental Tropical Convergence Zone (CTCZ) Programme, has proved very beneficial in filling this gap in soil moisture observations over the Indian region. In this paper, we use these CTCZ in-situ soil moisture observations to validate the ESACCI soil moisture dataset over the Indian region. The analysis is conducted for the June–September months of 2011 and 2012, for which both products have continuous data. Both datasets are briefly described in section 2, followed by the results in section 3 and a summary in section 4.

Figure 1
(a) JJAS climatology (years: 2000–2014) of volumetric surface soil moisture (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)) from ESACCI satellite observations. (b) Inter-annual standard deviation of JJAS mean volumetric surface soil moisture of ESACCI. (c) JJAS precipitation (mm/day) climatology from IMD data. (d) Inter-annual standard deviation of JJAS mean IMD precipitation.

Figure 2
The locations of the CTCZ Automatic Weather Stations (AWS) are marked on the map of India. The boxes show the six regions over which the soil moisture data were validated.

Figure 3
The mean percent volumetric surface soil moisture (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)) for June–September over the six regions. Left: year 2011, Right: year 2012.

Figure 4
The standard deviation of volumetric surface soil moisture (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)) for June–September over the six regions. Left: year 2011, Right: year 2012.

Figure 5
Root mean squared difference (RMSD) of volumetric surface soil moisture (in %) for JJAS over the six regions.

2 Data and methodology

2.1 ESACCI soil moisture data

The European Space Agency, under its Climate Change Initiative, provides a long-term (starting in 1978) surface soil moisture product titled ‘ESA CCI ECV Surface Soil Moisture Combined Product’. The recently released version 2.2 of this product is validated over the Indian region in this study, where it is referred to as ESACCI_SM. The dataset combines products from active and passive sensors (Liu et al. 2011, 2012; Wagner et al. 2012) and is available from 1978 to 2014 at daily temporal resolution. It covers the globe at a resolution of 0.25° × 0.25° and represents surface soil moisture for a soil depth of 0.5–2 cm. Regions with surface snow and permafrost are masked in the dataset.

The passive soil moisture product is derived from the passive sensors Scanning Multichannel Microwave Radiometer (SMMR), Special Sensor Microwave Imager (SSM/I), TRMM Microwave Imager (TMI), and Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E), which operate in the microwave frequency range. The active soil moisture product is derived from the active sensors ERS Scatterometer (SCAT) and METOP Advanced Scatterometer (ASCAT), which operate at microwave frequencies of 5.6 and 5.255 GHz, respectively. These two soil moisture products are combined into a merged soil moisture product by rescaling them against model output from GLDAS-1-NOAH (Global Land Data Assimilation System), using cumulative distribution function (CDF) matching techniques (Liu et al. 2011). Over regions where the active and passive soil moisture products have correlation values greater than 0.65, both products are combined to obtain the merged product. These regions are classified as ‘transitional’ regions. In regions where the two products show less agreement, that is, a correlation value smaller than 0.65, only one of the active or passive products is used to create the combined product. The choice between the active and passive products for such regions is based on the vegetation density over the region: the passive product is used for sparsely vegetated regions, whereas the active product is used for moderately vegetated regions. Over ‘transitional’ regions, merging the two products results in higher temporal and spatial coverage.
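As a rough illustration of the CDF matching step mentioned above, the following Python sketch rescales one series so that its empirical distribution matches that of a reference series. It is a simplified empirical quantile mapping and not the procedure of Liu et al. (2011); the function names `quantile` and `cdf_match` and the toy values are hypothetical.

```python
from bisect import bisect_left


def quantile(sorted_vals, q):
    """Linearly interpolated quantile (0 <= q <= 1) of a pre-sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def cdf_match(source, reference):
    """Rescale `source` so that its empirical CDF matches that of `reference`.

    Assumption: both series are gap-free lists of floats. Each source value
    is replaced by the reference value found at the same empirical percentile.
    """
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n = len(src_sorted)
    rescaled = []
    for value in source:
        # Rank of the value within the source series, scaled to [0, 1].
        rank = bisect_left(src_sorted, value)
        q = rank / (n - 1) if n > 1 else 0.0
        rescaled.append(quantile(ref_sorted, q))
    return rescaled


# Toy example: a satellite-like series rescaled to the range of a model-like series.
satellite_like = [0.10, 0.30, 0.20, 0.40]
model_like = [10.0, 20.0, 30.0, 40.0]
rescaled = cdf_match(satellite_like, model_like)
```

The rescaled series keeps the temporal ordering of the source values while taking on the range and distribution of the reference, which is what makes two retrievals comparable before they are merged.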

2.2 CTCZ soil moisture data

Ground-based soil moisture observations were obtained from the Continental Tropical Convergence Zone (CTCZ) Programme, which was funded by the Ministry of Earth Sciences, Government of India. The dataset is referred to as CTCZ_SM in this study. The soil moisture values represent top-layer in-situ measurements at 130 Automatic Weather Stations (AWS) spread across the country. Most of the sensors deployed to measure soil moisture are ML2 ThetaProbe sensors, manufactured by Delta-T Devices of the United Kingdom. These sensors are based on the frequency domain reflectometry (FDR) method of soil moisture measurement and operate with a 100 MHz signal. The sensors produce a voltage output that is proportional to the square root of the dielectric constant of the soil, which in turn is modulated by the moisture content of the soil. These sensors need to be calibrated according to the type of soil and have an accuracy of the order of \(\pm 0.02\) \(\hbox {m}^{3}\,\hbox {m}^{-3}\). Five of the AWS are equipped with HydraProbe sensors from Stevens, which are highly accurate and use the coaxial impedance dielectric reflectometry method of soil moisture measurement.

The soil sampling depth for the top layer is about 20 cm from the surface, and the soil moisture sensors are installed at approximately this depth at all AWS. The data are available only for the June–September (JJAS) months of 2010–2013. The percentage of volumetric soil moisture is recorded every hour at all AWS. We calculated the daily average volumetric soil moisture at each station by averaging these hourly values. Of the 130 AWS, 117 have hourly measurements of soil moisture, and measurements from these 117 stations are used for the analysis.
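The hourly-to-daily averaging described above can be sketched as follows for a single station. This is a minimal illustration: the record layout (timestamp, value) and the choice to skip missing hours marked as `None` are assumptions, not details given for the CTCZ data.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(hourly_records):
    """Average hourly soil moisture values into daily means for one station.

    hourly_records: iterable of (datetime, value) pairs, where value is the
    percent volumetric soil moisture or None for a missing hour (assumed).
    Returns a dict mapping each calendar date to the mean of its valid hours.
    """
    buckets = defaultdict(list)
    for timestamp, value in hourly_records:
        if value is not None:
            buckets[timestamp.date()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in sorted(buckets.items())}


# Toy example with made-up values for two days.
records = [
    (datetime(2011, 6, 1, 0), 20.0),
    (datetime(2011, 6, 1, 1), 22.0),
    (datetime(2011, 6, 2, 0), 30.0),
    (datetime(2011, 6, 2, 1), None),  # missing hour, skipped
]
station_daily = daily_means(records)
```

Repeating the call for each station yields one daily series per AWS, which can then be averaged within a region.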

In this work, we validate the ESACCI soil moisture data against these CTCZ observations for the overlapping time period, that is, JJAS of 2011 and 2012, for which continuous data are available.

Figure 6
Spatial distribution of uncertainty in the measurement of volumetric surface soil moisture (in \(\hbox {m}^{3}\,\hbox {m}^{-3}\)) averaged for June–September, 2011 (left) and 2012 (right). These values are provided along with the merged ESACCI soil moisture product.

Figure 7
Daily time series of volumetric surface soil moisture values from ESACCI and CTCZ, for year 2011, over (a) Northern India, (b) Gangetic Plains, (c) Western central India, (d) Eastern central India, (e) Southern India, and (f) Northeast India.

Figure 8
Daily time series of volumetric surface soil moisture values from ESACCI and CTCZ, for year 2012, over (a) Northern India, (b) Gangetic Plains, (c) Western central India, (d) Eastern central India, (e) Southern India, and (f) Northeast India.

3 Results

Figure 1(a) shows the spatial distribution of June–September (JJAS) mean volumetric surface soil moisture for the period 2000–2014 from ESACCI_SM. It captures the spatial heterogeneity of mean surface soil moisture over India, which is strongly affected by the regional soil type, topography and vegetation cover. Based on this, we divide our study region into six smaller domains for comparative analysis. Each domain is broadly representative of a homogeneous soil type and similar mean surface soil moisture values. These six regions are: (a) Northern India (NI), (b) Gangetic Plains (GP), (c) Western Central India (WCI), (d) Eastern Central India (ECI), (e) Southern India (SI), and (f) Northeast India (NEI). These regions are marked in figure 1(a). The inter-annual standard deviation of the mean JJAS surface soil moisture for the same period is shown in figure 1(b). Over most of the Indian subcontinent, the inter-annual standard deviation of soil moisture is less than 4% (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)). Notably, the north-western side of India shows higher soil moisture variability despite low mean soil moisture values.

Here, we would like to highlight that surface soil moisture and its variability do not depend entirely on the mean precipitation or its variability over a region, but also depend on soil type and vegetation. Figure 1(c and d) shows the JJAS mean and inter-annual standard deviation of precipitation, respectively, from the India Meteorological Department (Rajeevan et al. 2006). The precipitation mean and its variability are higher over eastern-central India than over western-central India, whereas mean soil moisture and its inter-annual variability are much higher towards the western side of central India. Thus, soil moisture has its own characteristic variability and is not solely governed by precipitation variability over a region. In fact, soil moisture variability can significantly influence atmospheric conditions. Chakraborty and Agrawal (2017) showed that the May surface pressure over western Asia, which is strongly affected by local soil temperature and moisture, modulates the onset of the summer monsoon over the central Indian region. A long-term satellite-based soil moisture dataset, such as ESACCI_SM, could be useful in studying soil moisture–precipitation coupling at inter-annual timescales and could also be used to examine pre-monsoon conditions over western Asia.

Next, we describe the CTCZ ground-based soil moisture dataset, which is used in the validation of ESACCI_SM. The locations of the CTCZ AWS are shown in figure 2. Although there are 130 such stations in total, only 117 stations have continuous soil moisture observations, and these are marked with a dot on the map. These stations are also grouped into the same six smaller domains described earlier for ESACCI_SM. The advantage of grouping the stations into smaller domains is that together they better represent the spatial variation of surface soil moisture within the region, rather than being just point observations. Grouping also reduces the bias in measurements due to instrument and operator errors at any single station.

To make the satellite measurements more comparable to the in-situ measurements, the following technique is used. For a CTCZ station with given coordinates, the ESACCI grid cell with the closest latitude–longitude is identified, along with its eight surrounding grid cells. The soil moisture is then averaged over this cluster of nine ESACCI grid cells to represent the ESACCI surface soil moisture value corresponding to the CTCZ station. This process is followed for each CTCZ station, and a corresponding ESACCI daily dataset is created for JJAS of 2011 and 2012. This reconstructed dataset is then used for the validation over India.
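The station-matching step above can be sketched in a few lines of Python. This is only an illustration: the function names, the nested-list grid layout, and the handling of masked cells (`None`) and domain edges are assumptions that the text does not specify.

```python
def nearest_index(axis, value):
    """Index of the grid coordinate in `axis` closest to `value`."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - value))


def esacci_at_station(field, lats, lons, st_lat, st_lon):
    """Mean of the 3 x 3 block of grid cells centred on the cell nearest a station.

    field: 2-D nested list indexed as field[lat_index][lon_index], holding one
           day of soil moisture; None marks a masked cell (assumed convention).
    Cells outside the grid or masked are skipped (an assumption).
    Returns None if no valid cell is found.
    """
    i0 = nearest_index(lats, st_lat)
    j0 = nearest_index(lons, st_lon)
    values = []
    for i in range(i0 - 1, i0 + 2):
        for j in range(j0 - 1, j0 + 2):
            if 0 <= i < len(lats) and 0 <= j < len(lons):
                cell = field[i][j]
                if cell is not None:
                    values.append(cell)
    return sum(values) / len(values) if values else None


# Toy 4 x 4 grid at 0.25 degree spacing with made-up values.
lats = [10.0, 10.25, 10.5, 10.75]
lons = [75.0, 75.25, 75.5, 75.75]
field = [[float(i * 4 + j) for j in range(4)] for i in range(4)]
value_at_station = esacci_at_station(field, lats, lons, 10.3, 75.3)
```

Applying the function to every day of JJAS and every station gives a satellite series on the same footing as each station series.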

The mean volumetric JJAS surface soil moisture over the Indian region based on CTCZ_SM, averaged over all 117 CTCZ stations, is 23% (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)) with an intra-seasonal standard deviation of 6.1% for the two years. ESACCI_SM follows the in-situ observations very closely over the Indian region, with a mean volumetric JJAS surface soil moisture value of 25.5% and an intra-seasonal standard deviation of 6%. ESACCI_SM thus has a positive bias of 2.5% (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)) with respect to CTCZ_SM. The average root mean squared difference (RMSD) is nearly 10% (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)) when all 117 stations are considered. The analysis is further carried out for smaller and more homogeneous domains to identify the regions where the two products are in good agreement and the possible causes of differences over other regions. Mean volumetric surface soil moisture values (in %) are shown for the six regions in figure 3 for JJAS in 2011 and 2012. The values differ considerably from region to region. However, the two datasets are fairly close to each other over all the regions (with a mean error of less than 5% \(\hbox {m}^{3}\,\hbox {m}^{-3}\)), except southern India, in both years. Figure 4 compares the intra-seasonal standard deviation of soil moisture values in JJAS for the two datasets. ESACCI_SM effectively captures the standard deviations over these six regions, except for an overestimation over southern India.
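For reference, the comparison statistics used in this section (bias, RMSD, and the correlation reported with the daily time series) can be computed as in the sketch below. It assumes two equal-length, gap-free daily series in percent volumetric soil moisture; how the statistics are aggregated across stations is not specified in the text, so the sketch works on a single pair of series, and the function names are hypothetical.

```python
import math


def bias(sat, obs):
    """Mean difference (satellite minus in-situ)."""
    return sum(s - o for s, o in zip(sat, obs)) / len(sat)


def rmsd(sat, obs):
    """Root mean squared difference between the two series."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sat, obs)) / len(sat))


def pearson_r(sat, obs):
    """Pearson correlation coefficient between the two series."""
    mean_s = sum(sat) / len(sat)
    mean_o = sum(obs) / len(obs)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sat, obs))
    var_s = sum((s - mean_s) ** 2 for s in sat)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_s * var_o)


# Toy example with made-up daily values (percent volumetric soil moisture).
satellite = [25.0, 27.0, 23.0]
in_situ = [23.0, 24.0, 22.0]
summary = (bias(satellite, in_situ), rmsd(satellite, in_situ), pearson_r(satellite, in_situ))
```

Note that RMSD includes the contribution of the mean bias, so a region can show a high correlation and still a large RMSD when one product is systematically offset from the other.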

Figure 5 shows the RMSD between ESACCI_SM and CTCZ_SM in 2011 (blue) and 2012 (red) for the same six regions. RMSD values between these two products are less than 5% (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)) for all the regions in northern and central India, whereas they are higher for southern India and northeast India. Figure 6 shows the spatial pattern of uncertainty in soil moisture measurements in the ESACCI product over the Indian region, in the same unit as percentage volumetric soil moisture, for 2011 and 2012. These values are provided by ESACCI along with the surface soil moisture values. Uncertainties in soil moisture measurements are higher along the Himalayas and the north-eastern region of India, which could be partly attributed to the hilly terrain and dense vegetation in these regions. Southern India also shows higher uncertainty in soil moisture measurements than the plains of northern and central India, in spite of moderate vegetation in the region. This could be because over southern India only the active soil moisture product contributes to the merged product, as the region does not fall in the ‘transitional’ regions (Liu et al. 2011) (refer section 2.1). In contrast, over most of central and northern India, both active and passive soil moisture products are merged, and the merged product thus has relatively smaller uncertainty.

Having validated ESACCI_SM at the seasonal timescale, we further examine the dataset at the daily timescale. We are interested in the daily timescale because the Indian region (especially the northern and central plains) shows strong land–atmosphere coupling (Koster et al. 2004; Agrawal and Chakraborty 2016) at the intra-seasonal timescale. Agrawal and Chakraborty (2016) demonstrated that the use of ESACCI_SM to correct the model land surface bias greatly improved the model’s ability to simulate the seasonal cycle of the Indian summer monsoon. Thus, ESACCI_SM could be used in the study of intra-seasonal variability of the Indian monsoon. Figure 7 compares daily soil moisture values for JJAS over the six smaller domains for the year 2011. ESACCI_SM follows the CTCZ in-situ observations very closely. It captures the range of variability very well, from low soil moisture values at the beginning of June to higher soil moisture values as the monsoon season progresses. Correlation values between the two products are fairly high at 0.8, 0.89 and 0.94 for northern India, the Gangetic Plains, and western-central India, respectively. The correlation value for eastern-central India is moderate at 0.58; nevertheless, ESACCI_SM closely follows CTCZ_SM. Over southern India and northeast India, correlation values are relatively small (0.46 and 0.31, respectively), and the two products agree little over these domains. The weaker agreement between the two datasets over southern and northeast India is also reflected in the RMSD values (figure 5).

Figure 8 shows a similar analysis for the year 2012, with high correlation values of 0.77, 0.92, 0.91 and 0.73 for northern India, the Gangetic Plains, western-central India, and eastern-central India, respectively. Over southern India, although the correlation is slightly higher at 0.64, ESACCI_SM has a positive bias compared to CTCZ_SM. Over northeast India, the correlation is extremely poor (0.01) for 2012, as ESACCI_SM has very strong intra-seasonal variability.

For both years, ESACCI_SM captures the strong intra-seasonal variability at the 8–12 day timescale in surface soil moisture, which is present in the CTCZ observations. However, ESACCI_SM also shows strong intra-seasonal oscillations with periods shorter than 4 days (clear from spectral analysis), especially towards the later phase of the monsoon season. This is not seen in the CTCZ_SM observations. It could possibly be related to the difference in the soil sampling depths of the two datasets. Satellite observations essentially capture the topmost soil layer, which represents a depth of around 0.5–2 cm, depending on the microwave frequency range. This top layer, owing to its exposure to the atmosphere, is heavily influenced by winds, insolation, and various other external factors, and responds to them quickly. On the other hand, in-situ observations are made using electronic sensors that are installed at slightly deeper and safer depths, depending on the AWS standards. Thus, these sensors sample a soil depth that responds more slowly and more weakly to surface–atmosphere interactions. Hence, the high-frequency variability with a period of 2–4 days in ESACCI_SM could be a response to surface–atmosphere interactions.
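The text refers to a spectral analysis without describing it. One simple way to look for dominant periods in a daily series is a periodogram based on the discrete Fourier transform, sketched below. This is an assumed approach for illustration, not necessarily the method used in this study; it presumes a gap-free daily series, and the function names are hypothetical.

```python
import cmath
import math


def periodogram(series):
    """Return (period_in_days, power) pairs for a gap-free daily series.

    The series mean is removed first; the power at each harmonic k is the
    squared magnitude of the discrete Fourier coefficient divided by n.
    """
    n = len(series)
    mean = sum(series) / n
    anomalies = [x - mean for x in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            a * cmath.exp(-2j * math.pi * k * t / n)
            for t, a in enumerate(anomalies)
        )
        spectrum.append((n / k, abs(coeff) ** 2 / n))
    return spectrum


def dominant_period(series):
    """Period (in days) of the harmonic carrying the most power."""
    return max(periodogram(series), key=lambda item: item[1])[0]


# Synthetic check: a pure 10-day oscillation sampled daily for 120 days.
synthetic = [math.sin(2 * math.pi * t / 10) for t in range(120)]
peak_period = dominant_period(synthetic)
```

With a JJAS season of about 122 days, the shortest resolvable period is 2 days, so both the 8–12 day and the 2–4 day bands discussed above fall within the range such a periodogram can resolve.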

It is worth mentioning here that the vegetation cover also undergoes strong changes after the onset of the monsoon over the Indian region. Thus, the changing vegetation cover could contribute partially to the high-frequency variability noted in ESACCI_SM. The high intra-seasonal variability over northeast India seen in ESACCI_SM could be attributed to the dense vegetation and hilly terrain of the region, which also explains the poor correlation with the in-situ data. Finally, as stated previously, one soil moisture product (CTCZ) consists of point observations, whereas the satellite product (ESACCI) represents an area of 25 × 25 km; this largely explains the small bias between the two products. Nevertheless, the satellite-derived ESACCI soil moisture dataset is in satisfactory agreement with the in-situ observations from CTCZ over the plains of northern and central India.

4 Summary

In this study, we validated the satellite-based merged surface soil moisture product (ESACCI_SM) over the Indian region using ground-based measurements from a widespread network of Automatic Weather Stations under the Continental Tropical Convergence Zone (CTCZ) Programme. The validation is carried out for the monsoon season (June–September) of 2011 and 2012, as both products have continuous data for these years.

Overall, ESACCI_SM is in good agreement with the ground-based observations over the Indian region. Further validation is carried out over six smaller domains with more homogeneous soil type and mean soil moisture values. The root mean squared difference (RMSD) in percent volumetric soil moisture (\(\hbox {m}^{3}\,\hbox {m}^{-3}\)) between the two products is less than 5% for the four domains in northern and central India. These four domains lie in the ‘transitional’ regions of the ESACCI_SM product, that is, both active and passive products are merged to form the soil moisture product in these regions. Over southern and north-east India, RMSD values are greater than 5%. These two domains lie outside the ‘transitional’ region, and the ESACCI_SM product in these regions mainly comprises soil moisture estimates from active sensors only. In ESACCI_SM, these regions also have a relatively higher uncertainty in soil moisture measurements than regions in central and northern India.

Additionally, ESACCI_SM shows considerable high-frequency variability at periods of 2–4 days, which is absent in the ground-based observations from CTCZ_SM. This is attributed to the difference in the soil sampling depths of the two products. The satellite-based soil moisture product, measured in the microwave frequency range, represents the topmost layer of soil (0.5–2 cm), which can be strongly influenced by atmospheric processes at shorter timescales. In contrast, the CTCZ in-situ observations are made at a relatively greater depth of 20 cm, where the soil is much more shielded from atmospheric influences. Dorigo et al. (2015) have also discussed the potential sources of uncertainty involved in the comparison of satellite-based observations with in-situ data, such as the difference in spatial coverage of the two datasets, the time of sampling, and the depth of soil that is sampled.

In summary, it can be concluded that ESACCI_SM is reliable over the Indian region and very close to the ground truth. Availability of this long-term dataset can greatly help in the study of land–atmosphere coupling and validation of land-surface models, which are coupled to the atmospheric models employed for prediction.