1 Introduction

Groundwater is a major part of the world's water resources and is critical for human life. Groundwater aquifers are deemed ideal water supply reservoirs, providing high-quality water with wide distribution and convenient access. Almost all the agricultural, domestic, and industrial water requirements in the semi-arid and arid parts of the world are met by groundwater. The rate of groundwater abstraction has risen sharply in recent decades, mainly on account of population growth, economic development, and climate change impacts (Khorrami and Gunduz 2021a). The uncontrolled use of groundwater has triggered a sharp decline in groundwater levels, which is generally accompanied by the deterioration of water quality and by geological problems such as land deformation (Khorrami et al. 2021). The problems ascribed to fluctuations in groundwater storage underline the importance of groundwater monitoring and management at regional and local scales (Long et al. 2016).

As the first remote sensing mission to monitor large-scale variations of the Earth's gravity field, the Gravity Recovery and Climate Experiment (GRACE) began operating in March 2002 and has been providing monthly Terrestrial Water Storage Anomalies (TWSA) at large scales (Meng et al. 2021; Khorrami and Gunduz 2021a). Although the spatial footprint of the GRACE twin satellites is coarse, the mission has provided unprecedented opportunities for the scientific community. To date, the GRACE/GRACE-FO estimates have been utilized to understand the regional and global trends in water storage changes over different areas of the world (e.g., Seyoum and Milewski 2016; Lezzaik and Milewski 2018; Banerjee and Kumar 2018; Moghim 2020; Ali et al. 2021; Meng et al. 2021).

As a central compartment of the hydrological cycle, Terrestrial Water Storage (TWS) controls the water, energy, and biogeochemical states (Ning et al. 2014). Because they have a determining impact on the Earth’s climate system, TWS variations are crucial for the sustainable management of water resources and for weather and climate modeling (Ning et al. 2014). The inception of the GRACE mission paved the way for overcoming the scarcity of direct observations of large-scale TWS. The medium and large-scale variations of the Earth’s TWSA can now be traced by GRACE (Rodell et al. 2018). However, the GRACE-derived TWSA values are of coarse resolution, which limits their application for local to regional-scale purposes (Miro and Famiglietti 2018). Therefore, downscaling the data is necessary to reveal the local-scale fluctuations of groundwater storage and depletion from the GRACE estimates.

Downscaling of the GRACE estimates has recently been implemented using simulated hydro-meteorological variables from hydrological models. To date, the Global Land Data Assimilation System (GLDAS) has been the most widely used source for downscaling the GRACE data (e.g., Rahaman et al. 2019; Zhang et al. 2021). Built on the same modeling algorithm and data assimilation technique, the Famine Early Warning Systems Network Land Data Assimilation System (FLDAS) offers the variables at an improved spatial resolution (10 km). Unlike GLDAS, FLDAS has not been widely applied, especially for downscaling purposes.

Given the limited availability of high-resolution independent variables, this study proposes using FLDAS model data to downscale the JPL mascon solution from GRACE/GRACE-FO with a machine learning model. The aim is to extract fine-resolution variations of groundwater storage and thereby better understand the local and basin-scale variability of groundwater storage and depletion, which is the main contribution and novelty of this study. The groundwater storage anomalies (GWSA) are downscaled on the basis of the TWSA downscaled with the high-resolution FLDAS model data. This study is one of the earliest examples of generating 10 km GWSA that can later be utilized for regional-scale assessment of TWSA and GWSA. The method is tested and applied over the Western Anatolian Basin (WAB), Türkiye, which has experienced significant groundwater depletion over recent decades.

2 Methodology

2.1 Overview of the Study Area

The Western Anatolian Basin (WAB) is situated between 37.00°–40.00° N and 26.00°–30.30° E in western Türkiye and comprises four river basins, namely the Kucuk Menderes Basin (KMB), the Buyuk Menderes Basin (BMB), the Gediz Basin (GB), and the North Aegean Basin (NAB) (Fig. 1). With an approximate area of 60,000 km², the WAB covers the majority (71%) of western Anatolia. The climate of the WAB is predominantly a typical Mediterranean climate with dry and hot summers and cold and rainy winters (Ay 2021). The total annual precipitation and mean annual temperature of the region are 650.8 mm and 15.8 °C, respectively. The description of each subbasin, their geological (Fig. SM1) and hydrogeological settings, and their groundwater potentials (Table SM1) are presented in the Supplementary Materials.

Fig. 1 Illustration of the geographic position of the Western Anatolian Basin (WAB) in western Türkiye

2.2 Data Used

Different sources of data sets obtained by remote sensing and in-situ observations were used to implement this study. The characteristics of the data used are given in Table 1 and discussed in the following sections.

Table 1 Description of the data used in this study

2.2.1 GRACE/GRACE-FO JPL Mascon

The twin-satellite GRACE project comprises two missions: GRACE-1 and the GRACE Follow-On (GRACE-FO). The GRACE-1 mission was terminated in 2017, and after a gap of about a year, GRACE-FO was put into orbit in 2018 and has been operating since then. The GRACE/GRACE-FO signals are processed by three data processing centers: the Center for Space Research (CSR), the GeoForschungsZentrum (GFZ), and the Jet Propulsion Laboratory (JPL) (Chambers 2006). In this study, the GRACE JPL mascon solution (release 06) was used to extract the time series of TWSA.

2.2.2 FLDAS Model

The Famine Early Warning Systems Network Land Data Assimilation System (FLDAS) is a data assimilation project which, similar to the commonly used GLDAS model, simulates different hydro-meteorological variables by integrating remotely sensed, modeled, and in-situ observations at a global scale (Loeser et al. 2020). The simulations are run with two land surface models (LSMs), the Variable Infiltration Capacity (VIC) model and the Noah model (McNally et al. 2017), with spatial resolutions of 0.25° and 0.1°, respectively. In this study, simulated soil moisture, snow water, evapotranspiration, rainfall, and runoff were extracted from the latest version (version 4) of FLDAS-Noah.

2.2.3 Field Observations

The field-based observations of precipitation and depth to groundwater were obtained from the Turkish State Meteorological Service (TSMS) and the State Hydraulic Works (SHW), respectively. The characteristics of the groundwater observation wells utilized in this study are given in Table SM2 in the Supplementary Material.

To validate the GRACE-derived GWS against in-situ observations of depth-to-groundwater (DTG), the latter must be converted into field-based GWSA. To this end, the DTG data of each observation well were first converted to groundwater level (GWL) using the topographical elevation above mean sea level (H) of the corresponding well (Eq. 1).

$$\mathrm{GWL}=\mathrm{H}-\mathrm{DTG}$$
(1)

Then, the groundwater level anomalies (∆GWL) were obtained by subtracting the mean over the same baseline period as used for the GRACE data (2004–2009). Finally, the in-situ GWSA values were calculated by using the specific yield (Sy) of each well for unconfined aquifers or the storativity for confined aquifers (Eq. 2).

$$\Delta \mathrm{GWS}={\mathrm{S}}_{\mathrm{y}}\times \Delta \mathrm{GWL}$$
(2)

The specific yield has a high spatial and temporal variability, especially for shallow water tables, and it can be affected by many factors including soil type, water table depth, time since the rise or fall of the water table, hysteresis, soil texture, antecedent soil moisture condition, temperature, and chemical composition of water, as well as plant water demand (Lv et al. 2021). Accordingly, the rock type, which determines the soil type and texture, is one of the main parameters that influence the specific yield, as shown in Table SM3 (Karasu 2019). Accurate estimation of this parameter requires pumping tests. Unfortunately, no estimates of Sy are available for the WAB. Therefore, the Sy of the seasonal observation wells used for validation was estimated from the values reported by Karasu (2019) for the lithology of each well in the study area. Accordingly, the Sy estimates used in this study ranged from 17 to 26 percent.
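The conversion from DTG to in-situ GWSA (Eqs. 1 and 2) can be illustrated with a minimal sketch. The function name, the well elevation, the Sy value, and the readings are hypothetical and chosen only for illustration; the baseline is passed as a list of indices standing in for the 2004–2009 period, and the output is expressed in mm on the assumption that levels are given in metres.

```python
from statistics import mean

def in_situ_gwsa(dtg_m, elevation_m, specific_yield, baseline_idx):
    """Convert depth-to-groundwater readings (m) into GWSA (mm of water)."""
    # Eq. 1: groundwater level above mean sea level
    gwl = [elevation_m - d for d in dtg_m]
    # Anomaly relative to the mean level over the baseline period
    baseline = mean(gwl[i] for i in baseline_idx)
    d_gwl = [g - baseline for g in gwl]
    # Eq. 2: scale by specific yield, then convert m to mm
    return [specific_yield * d * 1000.0 for d in d_gwl]

# Hypothetical well: elevation 120 m, Sy = 0.20,
# the first two readings stand in for the baseline period.
gwsa = in_situ_gwsa([10.0, 12.0, 14.0], 120.0, 0.20, baseline_idx=[0, 1])
```

For a confined aquifer the same function could be called with the storativity in place of Sy.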

2.3 Random Forest Machine Learning (RFML)

As an ensemble-based nonlinear machine learning model (Rahaman et al. 2019), Random Forest (RF) is a supervised learning algorithm (Breiman 2001) that can be applied to both classification and regression problems. Built from multiple classification and regression trees (CART), RFML grows each regression tree on a randomly drawn subset of the predictors and averages the results of the individual decision trees (Rahaman et al. 2019). In light of its advantages, such as the ability to work with numerous inputs, high precision, and the ability to quantify the significance of the variables (Rahaman et al. 2019; Chen et al. 2019), the RFML model has started to gain widespread application in remote sensing and hydrology (Hua et al. 2018). The RFML model used in this study was designed as follows:

  1. The model inputs were aggregated to the spatial resolution of the GRACE-JPL Mascon (50 km). The statistical associations between the TWSA, the Digital Elevation Model (DEM), and the hydrological variables of soil moisture, snow water, rainfall, surface runoff, and evapotranspiration at 50 km resolution were then used to develop the RFML model for predicting TWSA.

  2. The residual values were calculated by subtracting the model-derived TWSA from the original GRACE-TWSA.

  3. The model was applied to the fine-resolution inputs to obtain the estimated 10 km TWSA.

  4. Residuals were added to the estimated TWSA at 10 km to obtain the downscaled TWSA through three steps:

    (a) Re-aggregating the 10 km predictors to the original GRACE resolution (50 km),

    (b) Computing the residuals between the 50 km predictors and the 50 km original TWSA,

    (c) Resampling the residuals to 10 km and adding them to the 10 km predicted TWSA, which yields the final downscaled TWSA.

  5. Finally, the downscaled GWSA values were extracted by subtracting the FLDAS-derived SMS and SWE from the downscaled TWSA.
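The residual-correction part of this workflow (items 2 to 4) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the random forest regressor itself is omitted and `fine_pred` stands in for its fine-resolution output, aggregation is done by simple block averaging, and the residuals are resampled by nearest neighbour, since the text does not name a resampling method. The toy grids use an aggregation factor of 2; the 50 km to 10 km case would correspond to a factor of 5.

```python
def block_mean(fine, factor):
    """Aggregate a fine grid (list of rows) to a coarse grid by block averaging."""
    n_rows, n_cols = len(fine) // factor, len(fine[0]) // factor
    coarse = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            cells = [fine[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(cells) / len(cells))
        coarse.append(row)
    return coarse

def residual_correct(fine_pred, coarse_obs, factor):
    """Add coarse-scale residuals back to a fine-resolution prediction."""
    # Aggregate the fine prediction to the coarse grid
    coarse_pred = block_mean(fine_pred, factor)
    # Residual = original coarse value minus aggregated prediction
    residual = [[o - p for o, p in zip(obs_row, pred_row)]
                for obs_row, pred_row in zip(coarse_obs, coarse_pred)]
    # Nearest-neighbour resampling of the residuals (an assumption)
    return [[fine_pred[r][c] + residual[r // factor][c // factor]
             for c in range(len(fine_pred[0]))]
            for r in range(len(fine_pred))]

# Hypothetical fine-resolution prediction and coarse observation
fine_pred = [[1.0, 2.0, 3.0, 4.0],
             [3.0, 4.0, 5.0, 6.0]]
coarse_obs = [[3.0, 4.0]]
downscaled = residual_correct(fine_pred, coarse_obs, factor=2)
```

With this choice of aggregation and resampling, averaging the corrected fine grid back to the coarse grid reproduces the original coarse values, which is the purpose of the correction.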

The flow of the entire analysis including the RFML stage is shown in Fig. 2.

Fig. 2 Graphical description of the methodological flow of the study

2.4 Extraction of Groundwater Storage

TWSA is a vertically integrated hydrologic variable that encompasses different components such as soil moisture storage anomalies (SMSA), snow water equivalent anomalies (SWEA), groundwater storage anomalies (GWSA), and surface water storage anomalies (SWSA) (Eq. 3) (Khorrami and Gunduz 2021b). The GWSA is extracted through the GRACE isolation process (Eq. 4).

$$\mathrm{TWSA}=\mathrm{GWSA }+\mathrm{ SMSA}+\mathrm{SWEA}+\mathrm{ SWSA}$$
(3)
$$\mathrm{GWSA}=\mathrm{ TWSA}- \left[\mathrm{ SMSA}+\mathrm{SWEA }+\mathrm{ SWSA}\right]$$
(4)

Because the variations of TWSA in arid and semi-arid regions are mainly controlled by soil water storage (Deng and Chen 2017), the GRACE isolation is done through simplified Eq. (5) by taking only the SMSA and SWEA into account. In this equation, the SMSA and SWEA variables were derived from the FLDAS Noah model outputs.

$$\mathrm{GWSA}=\mathrm{ TWSA}- \left[\mathrm{SWEA}+\mathrm{ SMSA}\right]$$
(5)
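As a minimal sketch of Eq. (5), the isolation can be applied element-wise to co-registered series or pixel values. The function name and the numbers are hypothetical; the sketch assumes that all three anomalies are expressed in the same units (for example mm of equivalent water height) and relative to the same baseline period.

```python
def isolate_gwsa(twsa, smsa, swea):
    """Eq. 5: GWSA = TWSA - (SWEA + SMSA), applied element-wise."""
    if not (len(twsa) == len(smsa) == len(swea)):
        raise ValueError("all series must have the same length")
    return [t - (sw + sm) for t, sm, sw in zip(twsa, smsa, swea)]

# Hypothetical monthly anomalies in mm
gwsa_series = isolate_gwsa(twsa=[-50.0, -120.0],
                           smsa=[-20.0, -70.0],
                           swea=[0.0, -1.0])
```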

2.5 Uncertainty Estimation

The uncertainties of the downscaled GWSA values were estimated based on error propagation theory, according to which the errors in the TWSA (obtained from the data provider) and in the land surface model outputs propagate into the uncertainty of the resulting GWSA. The GWSA errors were calculated according to Eq. (6).

$${\upsigma }_{\mathrm{GWSA}}=\sqrt{{\left({\upsigma }_{\mathrm{TWSA}}\right)}^{2}+{\left({\upsigma }_{\mathrm{SMSA}}\right)}^{2}+{\left({\upsigma }_{\mathrm{SWEA}}\right)}^{2}}$$
(6)

where σ stands for the standard deviation of the corresponding parameter.
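Eq. (6) is a root-sum-of-squares combination, which implicitly treats the three error sources as uncorrelated. A minimal sketch with hypothetical uncertainty values (in mm) is given below; the function name is illustrative.

```python
import math

def gwsa_uncertainty(sigma_twsa, sigma_smsa, sigma_swea):
    """Eq. 6: combine the component uncertainties in quadrature."""
    return math.sqrt(sigma_twsa ** 2 + sigma_smsa ** 2 + sigma_swea ** 2)

# Hypothetical component uncertainties in mm
sigma_gwsa = gwsa_uncertainty(3.0, 4.0, 0.0)
```

Because the terms are squared before summation, the largest component dominates the combined uncertainty.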

3 Results

3.1 Sensitivity of the Model Inputs

The RFML model reports the relative contribution of each input variable to the simulation as a measure of its importance. The importance of each variable is given in terms of %IncMSE, the increase in mean squared error (MSE) that results when that variable is randomly permuted (Zhao et al. 2018). Larger values of %IncMSE signify a more important variable. Associations between the model inputs also affect the model performance. In this study, precipitation, soil moisture, snow water, surface runoff, evapotranspiration, surface elevation, and GRACE-TWSA were integrated into the RFML model. The predictive power of the inputs of the developed model for the study area was investigated with the variable importance measure (VIMP) test, which compares the prediction error for each input variable before and after permutation (Rahaman et al. 2019).
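The idea behind a permutation-based importance score can be sketched as follows. This is a simplified illustration, not the exact %IncMSE computation of any particular package: random forest implementations typically evaluate it on out-of-bag samples per tree, whereas the sketch uses a fixed stand-in `model` function and the full toy sample. All names and data are hypothetical.

```python
import random

def mse(model, X, y):
    """Mean squared error of model predictions against targets."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def percent_inc_mse(model, X, y, col, rng):
    """Percent increase in MSE after randomly permuting one input column."""
    base = mse(model, X, y)
    shuffled = [row[col] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
    return 100.0 * (mse(model, X_perm, y) - base) / base

# Toy data: the target depends on column 0 only; column 1 is irrelevant.
X = [[float(i), float((7 * i) % 11)] for i in range(50)]
y = [2.0 * row[0] + 0.1 * ((i % 3) - 1) for i, row in enumerate(X)]
model = lambda row: 2.0 * row[0]   # stand-in for a fitted regressor
rng = random.Random(0)
importance = [percent_inc_mse(model, X, y, c, rng) for c in (0, 1)]
```

Permuting the informative column degrades the predictions and yields a large score, while permuting the column that the model ignores leaves the error unchanged.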

The sensitivity results for the harsh climatic conditions of Oct 2020 are given in Fig. SM2 in the Supplementary Materials. They reveal that precipitation and surface elevation (DEM) are the two most important variables in predicting the GRACE TWSA with the RFML model. The snow water storage anomaly, on the other hand, turned out to be the least influential variable. This is reasonable because snow water variations are restricted to a small part of the country in the eastern regions. Therefore, with the smallest impact on the variations of Türkiye’s TWSA, snow water shows the least importance among the variables used. All the input variables had above-zero sensitivity values and thus contributed to downscaling the TWSA.

3.2 Investigating the Predictive Precision of the RFML Model

The performance of the RFML model was tested by comparing the original and modeled (predicted) TWSA values. Five statistical metrics (R, R², RMSE, MAE, and D) were applied to quantify the accuracy of the model. According to the results (Fig. SM3), with a correlation coefficient of more than 0.98, a coefficient of determination above 0.97, and a D value of about 0.99, the RFML model performed very well and can be applied to model the finer-resolution TWSA and GWSA over the study area.
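For reference, the five metrics can be computed as in the sketch below. Two points are assumptions rather than statements from the text: R² is taken as the square of the Pearson correlation coefficient, and D is taken to be Willmott's index of agreement. The function name and the sample values are hypothetical.

```python
import math

def evaluation_metrics(obs, pred):
    """Return R, R2, RMSE, MAE and D for paired observed/predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    r = cov / math.sqrt(var_o * var_p)
    sse = sum((p - o) ** 2 for o, p in zip(obs, pred))
    rmse = math.sqrt(sse / n)
    mae = sum(abs(p - o) for o, p in zip(obs, pred)) / n
    # Willmott's index of agreement (assumed meaning of "D")
    denom = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    d = 1.0 - sse / denom
    return {"R": r, "R2": r ** 2, "RMSE": rmse, "MAE": mae, "D": d}

# Hypothetical example: predictions offset from observations by a constant
metrics = evaluation_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

The example shows why several metrics are reported together: a constant offset leaves R at 1 while RMSE, MAE, and D all register the bias.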

To check the spatial consistency of the pre- and post-downscaling TWSA and GWSA over the WAB, the variations of Oct 2020 were mapped as an example. As observed in Fig. 3, the original and downscaled TWSA and GWSA are consistent in their spatial distribution. Figure 3 shows the good performance of the RFML model in filling in the spatial gaps of the TWSA and GWSA values in the basin, which overall correspond with the patterns of the original values. Both the original and the downscaled TWSA and GWSA increase towards the north and the south of the WAB, with concentrated low values inside the basin. The thematic maps for the TWSA and GWSA before and after the downscaling process suggest that the downscaled products retrieve the spatial texture of the area better than the original GRACE data (Zhang et al. 2021).

Fig. 3 Spatial variation of the original and downscaled TWSA and GWSA in Oct 2020

3.3 Validating the Downscaled Groundwater Storage

For validation, the GWSA derived from the original GRACE/GRACE-FO data and the RFML-downscaled GWSA were compared with the in-situ GWSA. The point-wise validation results (Fig. 4) show that while the correlation coefficient between the original GRACE-GWSA and the in-situ GWSA is 0.56 at the seasonal scale, the downscaled GWSA has a higher correlation of 0.60. The improvement in the correlation indicates that the RFML model was successful in modeling the finer-resolution (10 km) GWSA.

Fig. 4 Validation of the original (50 km) GWSA (a) and downscaled (10 km) GWSA (b) against seasonal groundwater level

3.4 Temporal Fluctuations of Terrestrial and Groundwater Storage

The time series of the downscaled TWSA and GWSA over the WAB are shown in Fig. 5a, b, respectively. The basin-wise monthly variations of the TWSA and GWSA show decreasing trends for all the studied basins, indicating that they have lost water storage over time. The monthly fluctuations of the TWSA and GWSA indicate a generally critical situation for the study area in 2007, 2008, 2014, 2016, 2018, 2019, and 2020. The maximum storage loss detected in 2020 for the TWSA over the basins is calculated as 266.86 mm (Oct 2020) for the NAB, 233.87 mm (Dec 2020) for the GB, 211.07 mm (Oct 2020) for the KMB, and 206.96 mm (Dec 2020) for the BMB. The GWSA, on the other hand, shows the maximum storage loss in Nov 2007 for all the basins, with loss values of 265.07 mm, 238.45 mm, 217.19 mm, and 191.29 mm over the NAB, the GB, the KMB, and the BMB, respectively. Overall, the dry periods in the TWSA and GWSA time series over the WAB are in accordance with the drought analysis results for Türkiye (Khorrami and Gunduz 2021b).

Fig. 5 Temporal fluctuations of terrestrial (a) and groundwater (b) storage over the WAB and thematic map of the annual GWSA trend (c)

3.5 Trend and Uncertainty Estimates

The trend values and associated uncertainties of each component of the hydrological cycle over the study area are reported in Table 2. The ERA5-Land model simulates hydrometeorological parameters at 10 km resolution, which were used to evaluate the uncertainty of the Noah model data (10 km). The standard deviation between the Noah-derived and ERA5-derived soil moisture and snow water values was used to estimate the uncertainties associated with the soil moisture and snow water over the WAB. The uncertainty of GWSA was then estimated based on error propagation theory.

Table 2 Zonal statistics of the annual trend and uncertainty values of the water cycle components and the volumetric GWS

The results reveal that, from 2003 to 2020, the NAB, the GB, the KMB, and the BMB have suffered from diminishing TWSA with annual rates of 6.63 mm/year, 6.44 mm/year, 6.69 mm/year, and 5.56 mm/year, respectively. The annual variations of the GWS over these basins are 2.65 mm/year, 3.20 mm/year, 2.40 mm/year, and 2.18 mm/year, respectively. The variations of GWSA over the GB are the largest among the basins of the study area, which is ascribed to heavy water consumption as a result of industrial and agricultural development as well as population growth (Harmancioglu et al. 2008). Although the NAB is a basin with good standing in terms of the groundwater situation, the results indicate that it ranks second in the WAB in terms of the decline in groundwater storage over time. This can be explained by the fact that the NAB experiences relatively large variations in the SWEA compared to the other basins. While the GB, the KMB, and the BMB show no significant variations of SWEA over the study period, the NAB shows an annual rate of 0.02 mm/year. The trend results also suggest that, except for the NAB, the variations of the SMSA over the WAB basins are larger than those of the GWSA, which underlines the determining impact of soil moisture changes on the variations of the TWSA over the study area. This finding is in accordance with Okay Ahi and Jin (2019), who reported a noteworthy impact of soil moisture on the variations of the GRACE-derived TWSA over Türkiye.

The spatial depiction of the annual trend of the GWSA (Fig. 5c) indicates that during the last 18 years, almost all of the WAB has suffered from groundwater depletion, with maximum depletion rates of -9.2 mm/year mainly in the eastern areas, which include the eastern parts of the BMB and GB. The annual trend map also suggests that in small parts of the southern BMB and western GB, the GWSA shows an increasing trend, which reaches +5.8 mm/year.

The volumetric trends of the variations of GWSA over each basin (Table 2) were calculated by taking the number of pixels and the underlying area of each basin into account. The results reveal that the groundwater aquifer in the WAB has lost about 2.57 km³ of its storage during the last 18 years. The BMB and GB turned out to be the most critical basins in terms of total groundwater depletion (0.97 km³). The total depletion in the KMB, on the other hand, is the least (0.29 km³) among the basins.
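The unit conversion behind a volumetric estimate can be sketched as follows. This is a simplification of the pixel-based calculation described above: it multiplies a single basin-mean trend by a total area instead of summing over pixels. The function name and the rate of -2.4 mm/year are hypothetical; the area of about 60,000 km² and the 18-year period are taken from the text.

```python
def volumetric_change_km3(trend_mm_per_year, area_km2, n_years):
    """Convert an equivalent-water-height trend into a storage volume change."""
    mm_to_km = 1e-6  # 1 mm = 1e-6 km
    return trend_mm_per_year * mm_to_km * area_km2 * n_years

# Hypothetical basin-mean rate applied to the whole WAB over 18 years
volume = volumetric_change_km3(-2.4, 60_000.0, 18)
```

Under these assumptions the result is a loss of roughly 2.6 km³, which is of the same order as the total reported for the WAB.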

3.6 Temporal Associations Between the TWSA and Drought Indices

To relate the water storage fluctuations of the study area to the climate of the region, the interactions between drought indices (SPI, SPEI, and scPDSI) and the variations of the TWSA over the WAB were investigated at various timescales. The SPI values were calculated based on the in-situ precipitation records. The SPEI and scPDSI, on the other hand, were retrieved from the gridded data repository, which provides global distribution of the SPEI and scPDSI values (Table 1).

According to the results, the basin-wise variations of TWSA are better correlated with the SPEI than the scPDSI and SPI at a monthly timescale. The monthly correlation between the zonal TWSA and SPEI is 0.4 over the NAB and the GB, 0.45 over the KMB, and 0.46 over the BMB. The monthly variations of the TWSA turn out to be in lower agreement with the SPI over all the basins with a correlation of 0.38 over the NAB and the KMB, 0.41 over the GB, and 0.43 over the BMB.

According to the annual time series, the annual variations of the drought indices show higher correlations with the fluctuations of the TWSA over the basins. The best correlation (0.71) for the annual TWSA with the SPEI and scPDSI is seen over the BMB. While the annual associations between the SPEI, scPDSI and the TWSA over the GB are the least among the basins, the variations of the TWSA over the GB show the highest correlation (0.71) with the annual SPI. The monthly and annual association graphs are given in Figs. SM4 and SM5, respectively, in the Supplementary Material.

4 Discussions

4.1 RFML Downscaling

The RFML demonstrated a good performance in simulating the finer resolution of TWSA and GWSA over the study area. High associations between the downscaled TWSA and the GRACE-TWSA have been found by several researchers in different study areas (e.g., Milewski et al. 2019; Rahaman et al. 2019). The RFML algorithm works based on the statistical relationships between the independent and the dependent variables. Therefore, the accuracy of the input variables is critical to the reliability and uncertainty of the downscaled values. GLDAS has been the leading data model used for downscaling the GRACE data. The FLDAS model provides better and more accurate simulations of different hydrometeorological variables at a higher resolution than GLDAS (Shahzaman et al. 2021). The downscaling in this study is based on the FLDAS simulations, and the results confirm that the FLDAS model has good potential to be integrated into the RFML for GRACE downscaling purposes.

The RFML sensitivity results revealed that precipitation and surface elevation contribute most to the RFML prediction over the study area. The highest weight of precipitation in predicting GRACE-TWSA has also been reported by Ali et al. (2021). It can be ascribed to the determining role of precipitation in the variations of different hydrological components, especially the total water storage (Khorrami and Gunduz 2021a). The surface topographic elevation is the second most influential parameter for the RFML model. This can be ascribed to the role that the diverse surface topography of the country plays in the variations of different hydrometeorological variables over Türkiye. In particular, the orographic effects (Ding et al. 2014) of the surface elevation on climatic parameters such as precipitation and temperature are probably a controlling factor for the RFML-based prediction of the GRACE TWSA over the study area. Rahaman et al. (2019) also reported a significant role of surface elevation in predicting the GRACE data over the United States.

4.2 TWSA Associations with Drought Indices

The drought indices used in this study showed dry events from 2007 to 2009, in 2014, and in 2016, which correspond to the results obtained by Okay Ahi and Jin (2019) and Khorrami and Gunduz (2021b). Although the SPI was calculated using the point precipitation observations over the study area, the results indicate that the GRACE-derived TWSA is more correlated with the simulated SPEI and scPDSI. The results also show that on a monthly and annual basis, the SPEI is better associated with the TWSA than the scPDSI. The high association of the TWSA with SPEI and scPDSI can be explained by the way these indices are calculated. While the SPI is solely based on the precipitation anomalies, SPEI and scPDSI integrate temperature into the calculation (Beguería et al. 2010; Pei et al. 2020). In particular, SPEI takes the evaporative demand into consideration to better describe the global hydrometeorological extremes (Beguería et al. 2010). The higher correlation between TWSA and SPEI over the WAB demonstrates the dual impact of precipitation and evapotranspiration on the variations of the TWSA (Jing et al. 2020).

5 Conclusions and Recommendations

In the current study, 10 km simulated parameters were used to generate GRACE data at a finer 10 km resolution based on the RFML algorithm. The findings suggest that the RFML model was successful in simulating the finer resolution of TWSA and GWSA over the study area with high accuracy and low error. From the perspective of spatial accuracy, it is found that the sub-grid variation and heterogeneity of the TWSA and GWSA values can be portrayed with good confidence by using the data inputs and following the proposed methodology. From the viewpoint of the correlations with the in-situ observations of groundwater level, it is also found that the RFML model is highly competent at improving the spatial resolution of the coarse GRACE estimates, as reflected in the improved correlation value.

Statistical downscaling techniques are based on the relationship between the target and the input variables; therefore, the final accuracy of the simulations is highly dependent on the accuracy of the input variables. Since the model-based parameters carry uncertainties from the land surface model used, its algorithms, etc., even more accurate downscaled values may be achievable for other watersheds around the world by integrating high-accuracy satellite estimates of hydrometeorological parameters into the downscaling model.

Another limitation of this study concerns the validation of the downscaled GWSA. Although a good improvement in the spatiotemporal variations of the GWSA was achieved, the limited seasonal groundwater observations and the lack of monthly groundwater data prevented this study from fully demonstrating the ability of the downscaled GRACE estimates to capture the basin-scale variations of the GWSA. The authors believe that better correlation values can be obtained in validation by using more groundwater observation data at different temporal resolutions.