Introduction

The prediction of groundwater hydraulic heads or groundwater levels is vital for the sustainable management of groundwater resources. In the temporal domain, the groundwater levels are predicted broadly by the deterministic physical models or stochastic time series models. Stochastic time series modeling methods are widely applied for predicting the groundwater levels as they require limited data, unlike physical-based models that require intensive data and parameterization process. The stochastic time series modeling approach can be broadly grouped under univariate and multivariate time series modeling methods. There have been many applications of univariate and multivariate time series modeling approaches for predicting the groundwater levels (Bierkens et al. 1999, 2001; Tankersley et al. 1993; Shirmohammadi et al. 2013; Mohanasundaram et al. 2017, 2019). In spatial domain, the kriging approach is widely used in environmental studies to interpolate the point-based discrete variables such as rainfall, soil moisture, groundwater levels, surface elevation, air temperatures, soil and water quality parameters at the unknown locations (Theodossiou and Latinopoulos 2006; Dash et al. 2010; Wagner et al. 2012; Arun 2013; Wu and Li 2013; Gong et al. 2014; Vereecken et al. 2014; Plouffe et al. 2015; Guekie simo et al. 2016; Landrum et al. 2016; Martínez-Murillo et al. 2017; Rizo-Decelis et al. 2017; Shtiliyanova et al. 2017; Theodoridou et al. 2017; Varouchakis et al. 2018; Yin et al. 2019; Amini et al. 2019).

The groundwater levels mostly follow a non-stationary process and exhibit a strong trend pattern in the spatial domain due to varying hydrogeological, hydrological and climatic conditions (Varouchakis and Hristopulos 2013). Therefore, the spatial trend component in the groundwater levels must be accounted for with a suitable trend function while interpolating the groundwater levels at unknown locations. The spatial trend component of the groundwater levels can be modeled with various mathematical functions by different kriging models. For example, the spatial trend component can be modeled with a known stationary constant mean by the simple kriging (SK) approach (Sun et al. 2009). Alternatively, if the mean of the variable is unknown, the data can be modeled by the ordinary kriging (OK) approach (Ahmadi and Sedghamiz 2008; Varouchakis et al. 2012; Daya and Bejari 2015; Adhikary and Dash 2017). The universal kriging (UK) approach is adopted if a regional gradient is present in the modeled datasets (Kumar 2007). Sometimes, the spatial trend in the groundwater levels can be explicitly modeled in terms of auxiliary variables such as a digital elevation model (DEM), distance to stream network (DS), and other explanatory variables. This group of kriging methods is called regression kriging (RK), where groundwater levels at the observation locations are first detrended with appropriate deterministic trend functions and the remaining random residuals are interpolated using the OK method (Rivest et al. 2008; Varouchakis and Hristopulos 2013; Zhu et al. 2013; Mhamad 2019). The overall accuracy of the groundwater level predictions is relatively better for RK methods because they separately filter the spatial trend component from the groundwater levels, which in turn reduces the variance of the residual component when compared to the SK and OK methods (Varouchakis and Hristopulos 2013).
Many studies have correlated the auxiliary variables such as spatial coordinates, topographic index, DS, and DEM with groundwater levels in the trend functions of RK models (Desbarats et al. 2002; Nikroo et al. 2010; Chung and Rogers 2012; Varouchakis and Hristopulos 2013; Zhu et al. 2013). However, the overall uncertainty which is arising from RK model interpolations can be effectively minimized by minimizing the uncertainty from the trend function of the model.

Kumar and Remadevi (2006) compared the OK method against the deterministic inverse distance weighting (IDW) interpolation method for groundwater level interpolation in the northwestern part of Rajasthan, India. In their study, independent variograms were fitted for pre-monsoon and post-monsoon groundwater levels for six consecutive years, and the levels were interpolated with the corresponding variogram model parameters. They found that the OK method outperformed the IDW method in the groundwater level interpolation process. A study by Ahmadi and Sedghamiz (2008) used the OK method to model the spatial distribution of groundwater levels in the southern parts of Iran. The study predicted groundwater level drops with reasonable accuracy at unvisited locations of the study area using cross-validation analysis. Furthermore, the study emphasized identifying locations with severe water level drops on a spatial scale, which is critical for water managers and land-use planners to optimally manage the groundwater resources and cropping systems, respectively. Chung and Rogers (2012) compared multiple linear regression (MLR), OK and co-kriging (CK) methods for interpolating groundwater levels at St. Louis metro area, New Madrid seismic zone, St. Louis County. The MLR coefficients corresponding to minimum water table elevation and minimum water table depth were estimated against the dependent variable of water table elevation data. The MLR method mimicked the groundwater table depths under highly undulating topography more closely than CK, whereas OK was little influenced under the same conditions. Kumar and Ahmed (2003) applied the UK method with a linear drift to interpolate groundwater levels in the Maheshwaram watershed located in Andhra Pradesh, India. Simple linear and quadratic trend equations with space coordinates were fitted to calculate the drift component of UK. The residual variogram can easily be modeled with less uncertainty when groundwater level data show a linear drift with space coordinates. In another similar study by Adhikary and Dash (2017), the pre-monsoon and post-monsoon groundwater table depths from 116 locations were interpolated using IDW, OK, UK, and radial basis function models. They reported that the UK method outperformed the OK and IDW methods. Various authors analyzed the accuracy of spatial modeling of groundwater levels by comparing different kriging methods with model performance indices such as mean error (ME), mean absolute error (MAE), mean square error (MSE), and root mean square error (RMSE) (Kumar 2007; Nikroo et al. 2010; Delbari et al. 2013; Adhikary and Dash 2017).

The RK models are useful in predicting groundwater levels when the trend function in RK captures most of the variance in the groundwater level datasets. More often the trend function is fitted between groundwater levels and the variable that is strongly associated with the groundwater levels. For example, the land surface elevations or DEM are often strongly correlated with the shallow unconfined aquifer groundwater levels. Therefore, the DEM is routinely used as a main auxiliary variable in RK methods for the prediction of groundwater levels in the spatial domain. The advantages of DEM-based trend surface modeling in groundwater level interpolation were demonstrated with a linear and two quadratic functions with and without considering DEM variables in the Fuyang River Basin, North China, by Zhu et al. (2013). Furthermore, the study justified that correlating DEM as an auxiliary variable with groundwater levels was more appropriate in predicting the groundwater levels as there was a strong dependency between groundwater levels at 23 observation wells and corresponding DEM values with a coefficient of determination (R2) of 0.95. However, the study also reported that the discrepancy in considering only DEM values in one of the trend surface functions generated uneven distribution of residuals from the trend models. This fact emphasizes that the DEM variable alone is not perfectly filtering the spatial trend from the groundwater level datasets.

A study by Desbarats et al. (2002) used collateral surface elevation information as external drifts in groundwater level predictions, where the trend was modeled with a linear deterministic function given by the TOPMODEL topographic index. A trend surface function with standalone DEM and groundwater levels was developed as another kriging with external drift (KED) model. A benchmark OK model without the surface collateral information was developed for comparison with the two KED models. The cross-validation MSE values were slightly larger than the nugget variance of the experimental variograms, which indicates that most of the error in the cross-validation statistics is attributable to noise in the data due to temporal fluctuations in the water levels. The study reported that the largest prediction error was observed at the deepest water table locations in the study area. It further emphasized that groundwater level predictions based on collateral surface elevation information alone may not capture the maximum variability in the predicted groundwater levels, especially when the groundwater level dataset shows considerable variation in the water levels.

The spatial prediction performance of the groundwater levels by standard kriging-based methods, including the RK method, is relatively better than that of deterministic IDW methods where the distribution of observation points is relatively uniform and dense (Kumar 2007). However, the spatial prediction of the groundwater levels under sparsely distributed datasets is also vital, as measurements in remote locations are often not available in many parts of the world. A study by Varouchakis and Hristopulos (2013) attempted to improve the spatial prediction of the groundwater levels with sparsely distributed datasets on the island of Crete, Greece, using the RK method, in which they used auxiliary trend variables based on DS and the physically based Thiem’s multiple-well equations in the trend functions to predict the groundwater levels. The study compared the prediction results across the OK, UK, DS, DEM, and Thiem’s equation-based RK models. The DS, DEM, and Thiem’s equation-based RK models were relatively superior to the OK and UK methods. In addition, the study reported that the groundwater levels at locations far from the measurement points were predicted with relatively higher uncertainty, although the RK model trend functions reduced the residual variance to a significant extent. The estimated standard deviations of the predicted groundwater levels varied up to 4.5 m above mean sea level (MSL). This indicates that an improvement in the groundwater level predictions is still possible with a novel trend function that captures relatively higher variability from a sparse distribution of groundwater level observations.

Against this background, the present study aims to introduce a mean groundwater level variable as an auxiliary variable in the trend function of the RK model to improve the overall groundwater level predictions under sparsely distributed observations. The study proposes a new trend function with the mean groundwater level, generated using long-term groundwater level datasets from the well locations used in the RK approach, to effectively capture the trend component in sparsely distributed well locations, thus minimizing the overall uncertainty in groundwater level predictions. The study hypothesizes that the mean value of the groundwater levels can be considered as a drift variable instead of the routinely used standalone DEM, as the mean groundwater levels are more highly correlated with groundwater levels at a specific time than the DEM variable is. Moreover, the mean groundwater level-based interpolation methods can be applied to all ranges of groundwater hydraulic heads or depth to water table data (shallow, medium and deep water levels), unlike DEM-based interpolation methods, which are most suitable for shallow groundwater level datasets (Desbarats et al. 2002). The present study uses this explanatory variable, the mean groundwater level, as a part of the trend function in the RK model to predict groundwater levels. The specific objectives of the present study are: (1) to develop a new trend function with mean groundwater level as an auxiliary variable in the RK method and (2) to compare the performance of the new trend function-based RK method with a deterministic IDW method and other kriging-based methods, namely OK and RK with standalone DEM and DS as the auxiliary variables in the trend functions.

Methodology

Inverse distance weighting method

The basic concept of the IDW method is that the closest points are more relevant in predicting a better value than farther away points. The IDW method uses the weight function, which is inversely related to distances from the prediction point to the known surrounding points. Furthermore, the distances are raised to the power value of \(p\). The value of \(p\) plays a major role in the final predicted values. As the value of \(p\) increases, the weights for the distant points decrease rapidly. The formulation of the IDW is as follows:

$$\hat{z}(x_{0} ) = \frac{{\sum\limits_{x = 1}^{N} {\frac{{z_{x} }}{{d_{x}^{p} }}} }}{{\sum\limits_{x = 1}^{N} {\frac{1}{{d_{x}^{p} }}} }}$$
(1)

where \(\hat{z}(x_{0} )\) is the predicted value at an unknown location \(x_{0}\) [L]; \(z_{x}\) is the observed value at sample point \(x\) [L]; \(d_{x}\) is the distance from the prediction location \(x_{0}\) to sample point \(x\) [L]; \(p\) is the power parameter (exponent); and \(N\) is the number of sample points.
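As an illustration of Eq. (1), the following is a minimal Python sketch rather than the MATLAB code used in the study. The function name `idw_predict`, the well coordinates, and the head values are hypothetical and chosen only for demonstration; returning the observed value when the target coincides with a sample point is an added assumption to avoid division by zero.

```python
import math

def idw_predict(points, values, target, p=2.0):
    """Inverse distance weighting estimate at `target`, following Eq. (1)."""
    num = 0.0
    den = 0.0
    for (x, y), z in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # Assumption: at a sample location, return the observation itself.
            return z
        w = 1.0 / d ** p      # weight decays with distance raised to the power p
        num += w * z
        den += w
    return num / den

# Hypothetical well coordinates (km) and hydraulic heads (m amsl).
wells = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
heads = [10.0, 14.0, 12.0]
estimate = idw_predict(wells, heads, (1.0, 1.0), p=2.0)
```

Because the target in this example is equidistant from all three wells, the weights are equal and the estimate reduces to the arithmetic mean of the heads. Larger values of `p` would concentrate the weight on the nearest wells.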

Spatial modeling of groundwater levels by kriging techniques

Semivariogram analysis

A semivariogram can model the spatial structure of groundwater levels. A semivariogram relates the semivariance calculated from pairs of observations to the lag distance \(h_{lag}\) separating them. The semivariance can be estimated as follows (Isaaks and Srivastava 1989):

$$\gamma (h_{lag} ) = \frac{1}{{2N(h_{lag} )}}\sum\limits_{i = 1}^{{N(h_{lag} )}} {\left[ {Z(x_{i} ) - Z(x_{i} + h_{lag} )} \right]^{2} }$$
(2)

where \(\gamma (h_{lag} )\) is the estimated value of the semivariance at lag \(h_{lag}\) [L²]; \(N(h_{lag} )\) is the number of experimental data pairs separated by a lag distance \(h_{lag}\); and \(Z(x_{i} )\) and \(Z(x_{i} + h_{lag} )\) are the values of the variable at locations \(x_{i}\) and \(x_{i} + h_{lag}\), respectively [L].
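A minimal Python sketch of the experimental semivariogram in Eq. (2) is given below. The binning of pairs into equal-width lag classes (`lag_width`, `n_lags`) is an assumption made for illustration, since the lag tolerance used in the study is not stated; the function name and sample data are hypothetical.

```python
import math

def experimental_semivariogram(points, values, lag_width, n_lags):
    """Return (mean lag distance, semivariance, pair count) for each non-empty lag bin."""
    sq_sums = [0.0] * n_lags
    dist_sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):          # each unordered pair is counted once
            h = math.dist(points[i], points[j])
            k = int(h // lag_width)         # index of the lag bin for this pair
            if k < n_lags:
                sq_sums[k] += (values[i] - values[j]) ** 2
                dist_sums[k] += h
                counts[k] += 1
    return [
        (dist_sums[k] / counts[k], sq_sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_lags)
        if counts[k] > 0
    ]

# Hypothetical transect of three wells with heads 1, 2 and 4 m.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
vals = [1.0, 2.0, 4.0]
bins = experimental_semivariogram(pts, vals, lag_width=1.5, n_lags=2)
```

In this small example, the two pairs separated by one unit fall in the first bin and the single pair separated by two units falls in the second bin.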

The experimental variogram data points can be fitted by many theoretical models such as spherical, exponential, and Gaussian models. However, the spherical model is the most efficient theoretical model for fitting experimental data for modeling groundwater level as compared to exponential and Gaussian models (Ma et al. 1999). Therefore, the spherical model was adopted in the present study to model the spatial structure of the groundwater levels. The spherical model formulation is given as follows (Clark 1979):

$$\gamma (h_{lag} ) = \left\{ {\begin{array}{*{20}l} {C_{0} + C_{1} \left[ {\frac{3}{2}\frac{{h_{lag} }}{a} - \frac{1}{2}\left( {\frac{{h_{lag} }}{a}} \right)^{3} } \right]} & \quad{h_{lag} \le a} \\ {C_{0} + C_{1} } &\quad {h_{lag} > a} \\ \end{array} } \right.$$
(3)

where \(C_{0}\) is the semivariogram intercept on the y-axis (nugget effect) [L²]; \(C_{0} + C_{1}\) is the sill variance of the semivariogram [L²]; and \(a\) is the range of the semivariogram [L].
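The spherical model of Eq. (3) can be sketched in Python as follows. The function name and parameter values are hypothetical. Setting the semivariance to zero at zero lag is a common convention that is assumed here; Eq. (3) read literally gives \(C_{0}\) at \(h_{lag} = 0\).

```python
def spherical(h, nugget, partial_sill, a):
    """Spherical semivariogram of Eq. (3): nugget C0, partial sill C1, range a."""
    if h <= 0.0:
        return 0.0                      # assumed convention: gamma(0) = 0
    if h <= a:
        r = h / a
        return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)
    return nugget + partial_sill        # sill reached beyond the range

# Hypothetical parameters: C0 = 1 m^2, C1 = 4 m^2, a = 10 km.
gamma_half_range = spherical(5.0, 1.0, 4.0, 10.0)
gamma_at_range = spherical(10.0, 1.0, 4.0, 10.0)
gamma_beyond = spherical(20.0, 1.0, 4.0, 10.0)
```

At the range the bracketed term equals one, so the model value equals the sill \(C_{0} + C_{1}\) and stays there for larger lags.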

The sill is the variance value at which semi-variance cloud points level off in the semivariogram plot. In other words, it is the maximum variance value to be modeled in the semivariogram analysis. The range is the lag distance up to which the autocorrelation process exists significantly and beyond which the autocorrelation decreases to zero. All the models, including IDW, kriging, and semivariogram models, were coded in MATLAB software (MATLAB 2014a version) in this study.

Ordinary kriging

Different kinds of kriging methods can be applied for predicting groundwater levels in the spatial domain depending on the nature of the groundwater level data and availability of additional auxiliary variable datasets such as DEM. However, the OK method uses only the variable of interest which, in this case, is groundwater level data. The OK model formulation is given below:

$$Z(x_{0} ) = \sum\limits_{i = 1}^{N} {\lambda_{i} } Z(x_{i} )$$
(4)
$$\left\{ \begin{gathered} \sum\limits_{i = 1}^{N} {\lambda_{i} \gamma (x_{i} ,x_{j} ) - \mu = \gamma (x_{j} ,x_{0} )} ,\quad j = 1, \ldots ,N \hfill \\ \sum\limits_{i = 1}^{N} {\lambda_{i} = 1} \hfill \\ \end{gathered} \right.$$
(5)

where \(Z(x_{0} )\) is the unbiased estimator of the prediction variable at the unvisited location \(x_{0}\) [L]; \(\lambda_{i}\) is the kriging weight assigned to the visited location \(x_{i}\) for predicting at \(x_{0}\), where \(i\) = 1, 2, 3, …, \(N\) and \(N\) is the number of observation locations; \(\mu\) is the Lagrange coefficient; \(\gamma (x_{i} ,x_{j} )\) is the semivariogram value between the observation locations \(x_{i}\) and \(x_{j}\) [L²]; and \(\gamma (x_{j} ,x_{0} )\) is the semivariogram value between the observation location \(x_{j}\) and the prediction location \(x_{0}\) [L²].
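To make the OK system concrete, the sketch below assembles the \(N + 1\) linear equations (one per observation location plus the unbiasedness constraint) and solves them for the weights and the Lagrange coefficient. It is a hedged illustration in pure Python, not the study's MATLAB implementation. The helper names (`solve`, `ok_weights`), the spherical parameters, and the well data are hypothetical, and the semivariance at zero lag is assumed to be zero.

```python
import math

def spherical(h, c0, c1, a):
    """Spherical semivariogram; gamma(0) = 0 is an assumed convention."""
    if h <= 0.0:
        return 0.0
    if h <= a:
        return c0 + c1 * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return c0 + c1

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ok_weights(points, target, gamma):
    """Return (weights, mu) for sum_i lambda_i*gamma_ij - mu = gamma_j0, sum_i lambda_i = 1."""
    n = len(points)
    A = [[gamma(math.dist(points[i], points[j])) for i in range(n)] + [-1.0]
         for j in range(n)]
    A.append([1.0] * n + [0.0])         # unbiasedness constraint
    b = [gamma(math.dist(pt, target)) for pt in points] + [1.0]
    sol = solve(A, b)
    return sol[:n], sol[n]

# Hypothetical wells (km), heads (m amsl) and variogram parameters.
wells = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
heads = [10.0, 14.0, 12.0]
gamma = lambda h: spherical(h, 0.0, 4.0, 10.0)
weights, mu = ok_weights(wells, (1.0, 1.0), gamma)
prediction = sum(w * z for w, z in zip(weights, heads))
```

The weights sum to one by construction, and the second and third wells receive equal weight because they are placed symmetrically about the target. Predicting at an observation location returns a weight of one for that well, which reflects the exact-interpolation property of kriging under the assumed zero-lag convention.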

An important advantage of the kriging methods over deterministic methods is that they can estimate the prediction error variance along with the prediction results. The prediction error variance of the OK method in matrix form can be written as follows:

$$\sigma_{OK}^{2} (x_{0} ) = (C_{0} + C_{1} ) - c_{0}^{T} .\,C^{ - 1} .\,c_{0}$$
(6)
$$c_{0} = \{ C(x_{0} ,x_{1} ),...,C(x_{0} ,x_{N} )\}^{T}$$
(7)
$$C = \left[ {\begin{array}{*{20}c} {C(x_{1} ,x_{1} )} & \cdots & {C(x_{1} ,x_{N} )} \\ \vdots & \ddots & \vdots \\ {C(x_{N} ,x_{1} )} & \cdots & {C(x_{N} ,x_{N} )} \\ \end{array} } \right]$$
(8)

where \(\sigma_{OK}^{2} (x_{0} )\) is the OK variance of the estimation error at \(x_{0}\); \(c_{0}\) is covariance vector at the unvisited locations; and C is a covariance matrix.

Regression kriging

The OK method assumes that the groundwater level data is stationary, and the trend is a constant, which is also an unknown process over the spatial domain. However, on many occasions, groundwater level data is non-stationary, and it shows a robust spatial trend with other attributes such as spatial coordinates and elevation values. This kind of deterministic spatial trend can be modeled separately along with the stochastic variation of the groundwater level data as a residual component using the RK approach. The general formulation of RK is given as follows (Hengl et al. 2003; Varouchakis and Hristopulos 2013):

$$\hat{Z}(x) = \hat{m}(x) + \hat{\varepsilon }(x)$$
(9)

where \(\hat{Z}(x)\) is predicted water level at location \(x\) [L]; \(\hat{m}(x)\) is fitted trend value at location \(x\) [L]; and \(\hat{\varepsilon }(x)\) is trend surface residual at location \(x\) [L].

Trend function with distance to stream network data

The correlation between the groundwater level surface and the distance from the stream network is often studied because the aquifer and surface water systems interact with each other. Therefore, in this study, we developed a simple linear function between groundwater levels and the distance to the stream network. As a first step in generating the stream network map from the DEM, the flow accumulation raster map was generated from sink-filled DEM data. The flow accumulation map was used to create the stream network map of the Adyar River Basin. We used the SAGA GIS module as a part of the QGIS package (https://qgis.osgeo.org) to create GIS layers such as the flow accumulation raster maps and stream network vector layers. The stream network layer and the vectorized DEM data were used to create the raster map of the closest distance from the stream network using the distance to nearest hub module in QGIS. A simple linear function was fitted between the groundwater level data from the well locations and their corresponding distance to stream network values. The formulation of the trend function between the groundwater level and distance to stream network (RK-DS) is as follows:

$$m(x)_{DS} = k_{DS} + a_{DS} ({\text{DS}}) + \varepsilon$$
(10)

where \(m(x)_{DS}\) is a deterministic trend value of groundwater hydraulic heads based on DS variable [L]; \(a_{DS} ,\,k_{DS}\) are least-square fitting parameters; and \(\varepsilon\) is the trend residual assumed to be normally distributed [L] with mean 0 and variance \(\sigma^{2}\).

Trend function with standalone DEM data

The groundwater level surface of shallow unconfined aquifer systems often reflects the surface topography (Desbarats et al. 2002). As there is a strong correlation between groundwater levels and surface elevation values, the DEM values are used as an auxiliary variable in the trend model of the RK method. Sometimes, the drift in the trend function is modeled with DEM together with spatial coordinates (Kumar and Ahmed 2003; Gundogdu and Guney 2007; Zhu et al. 2013). In the present study, a simple linear function between groundwater level data (dependent variable) and DEM values (independent variable) (RK-DEM) can be fitted as follows:

$$m(x)_{DEM} = k_{{_{DEM} }} + a_{{_{DEM} }} ({\text{DEM}}) + \varepsilon$$
(11)

where \(m(x)_{DEM}\) is a deterministic trend value based on DEM [L]; \(a_{{_{DEM} }} ,\,k_{{_{DEM} }}\) are least-square fitting parameters; and \(\varepsilon\) is the trend residual assumed to be normally distributed [L] with mean 0 and variance \(\sigma^{2}\).

The parameters of Eq. (11) were obtained by the least square method of fitting a function.
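The least-squares fit of a one-predictor linear trend, as in Eqs. (10) and (11), has a closed-form solution that can be sketched in Python as follows. The function name `fit_linear_trend` and the DEM and head values are hypothetical; the data were constructed to lie exactly on a line so that the fitted parameters are easy to check.

```python
def fit_linear_trend(aux, levels):
    """Ordinary least-squares fit of levels = k + a * aux; returns (k, a)."""
    n = len(aux)
    mean_aux = sum(aux) / n
    mean_lvl = sum(levels) / n
    sxx = sum((x - mean_aux) ** 2 for x in aux)
    sxz = sum((x - mean_aux) * (z - mean_lvl) for x, z in zip(aux, levels))
    a = sxz / sxx                   # slope
    k = mean_lvl - a * mean_aux     # intercept
    return k, a

# Hypothetical DEM values (m amsl) and groundwater heads (m amsl) at four wells.
dem = [5.0, 10.0, 20.0, 40.0]
heads = [3.0, 8.0, 18.0, 38.0]
k_dem, a_dem = fit_linear_trend(dem, heads)

# Trend residuals at the wells, which would subsequently be interpolated by OK.
residuals = [z - (k_dem + a_dem * x) for x, z in zip(dem, heads)]
```

The same function could be applied with distance-to-stream values in place of `dem` to obtain the parameters of the DS-based trend.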

Proposed trend function with mean groundwater level as a function of DEM

In the present study, a new trend function was proposed in which a long-term mean groundwater level (MGWL) variable serves as the predictor of the groundwater levels. The schematic diagram of the proposed trend function is shown in Fig. 1.

Fig. 1 The proposed trend model based on mean groundwater level as an auxiliary variable

The variable, MGWL, was calculated at the observation well locations by simple arithmetic averaging of the existing long-term monthly groundwater level datasets. As the groundwater level variation is influenced by the season, MGWL values were calculated separately for each calendar month (January through December) at the observation well locations (Fig. 1). This process estimates only point-based MGWL values at the well locations for each month. However, spatially continuous grid-based MGWL raster maps (like the DEM) for the corresponding months are needed to model the spatial trend in the corresponding month's groundwater level data over the entire spatial domain. This demands the development of spatially continuous, gridded MGWL maps for all the months in the study area. The development of the gridded MGWL maps from DEM grid values is explained in two steps as follows:

  • Step 1: Fitting a linear function for MGWL and DEM at well sites.

First, a simple linear function was established between MGWL and DEM (MGWL-DEM) values based on well sites data for the corresponding months as follows:

$$\sum\limits_{mi = 1}^{12} {\sum\limits_{i = 1}^{N} {{\text{MGWL}}_{i,mi} } } = \sum\limits_{mi = 1}^{12} {ks_{mi} } + \sum\limits_{mi = 1}^{12} {cs_{mi} } \left[ {\sum\limits_{i = 1}^{N} {{\text{DEM}}_{i} } } \right] + \varepsilon$$
(12)

where \({\text{MGWL}}_{i,mi}\) is the mean groundwater levels at the well locations \(i = 1...N\) and for the months \(mi\) = January through December [L]; \({\text{DEM}}_{i}\) is DEM values at the corresponding well locations \(i = 1...N\); \(ks_{mi}\) and \(cs_{mi}\) are generalized least-square fitting parameters of linear MGWL-DEM model for the months \(mi\) = January through December [L]; and \(N\) is number of observation well sites.

The coefficients of Eq. (12) were estimated using a generalized least square method of minimizing the errors. As a functional relationship between MGWL and DEM is established through Eq. (12), it can be used to estimate the gridded MGWL maps for the months January through December by extrapolation of gridded DEM values.
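Step 1 can be sketched in Python as below. Two assumptions are made and should be read as such: Eq. (12) is interpreted as a separate linear regression of MGWL on DEM for each calendar month, and ordinary least squares is used in place of the generalized least squares mentioned in the text. All names and data values are hypothetical.

```python
def monthly_mean(levels_by_year):
    """Long-term mean for each calendar month from a list of 12-value yearly records."""
    n_years = len(levels_by_year)
    return [sum(year[m] for year in levels_by_year) / n_years for m in range(12)]

def fit_line(x, z):
    """Ordinary least-squares intercept and slope of z = ks + cs * x."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    cs = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / sum((xi - mx) ** 2 for xi in x)
    return mz - cs * mx, cs

# Hypothetical records: DEM at each well (m amsl) and two years of monthly heads (m amsl).
well_dem = [10.0, 20.0, 30.0]
well_records = [
    [[7.0] * 12, [9.0] * 12],
    [[17.0] * 12, [19.0] * 12],
    [[27.0] * 12, [29.0] * 12],
]

# Point-based MGWL at each well for every month (index 0 = January).
mgwl = [monthly_mean(rec) for rec in well_records]

# MGWL-DEM relation for June (month index 5), then applied to DEM grid cells.
june = 5
ks_june, cs_june = fit_line(well_dem, [m[june] for m in mgwl])
dem_grid = [15.0, 25.0]
mgwl_grid_june = [ks_june + cs_june * d for d in dem_grid]
```

Repeating the last three statements for each month index would yield the twelve gridded MGWL maps that feed the trend function in Step 2.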

  • Step 2: Developing a new trend function with MGWL as a predictor variable in RK trend model.

A new trend equation based on spatially derived MGWL variable (Eq. 12), RK-MGWL, is expressed as follows:

$$m(x)_{MGWL} = k_{mi} + b_{mi} ({\text{MGWL}}_{mi} ) + \varepsilon$$
(13)

where \(m(x)_{MGWL}\) is a deterministic trend value filtered from groundwater level based on MGWL values [L]; \(k_{{_{mi} }} ,\,b_{{_{mi} }}\) are fitting parameters; and \(\varepsilon\) is the trend residual assumed to be normally distributed [L] with mean 0 and variance \(\sigma^{2}\).

The spatially continuous gridded trend surface was generated by extrapolating the values at the grid points with the corresponding gridded MGWL values using Eq. (13). It is assumed that the calibrated parameters \(k_{{_{mi} }} ,\,b_{{_{mi} }}\) based on well site data are the representative values for the whole study region of interest. Thus, they can be used for predicting the trend values at all unvisited grid locations in the study area. The same assumptions were applied for the other trend models such as RK-DS and RK-DEM (Eqs. 10 and 11) functions for generating the gridded trend surface maps.

The residual part of RK

The spatial trend in the groundwater level data can be filtered by the deterministic trend functions using Eqs. (10), (11), and (13). The trend component of the RK model was calculated after estimating the parameters of Eqs. (10), (11), and (13). Once the trend component of the groundwater level data was determined, the residual component was calculated as the difference between the observed groundwater hydraulic head values and the calculated linear trend model values (Eqs. 10, 11, and 13) at the well sites. These estimated residuals were then modeled over the entire spatial domain by adopting the OK method (similar to Eqs. 4 and 5) as follows:

$$\varepsilon (x_{0} ) = \sum\limits_{i = 1}^{N} {\lambda_{i} } \varepsilon (x_{i} )$$
(14)
$$\left\{ \begin{gathered} \sum\limits_{i = 1}^{N} {\lambda_{i} \gamma (x_{i} ,x_{j} ) - \mu = \gamma (x_{j} ,x_{0} )} ,\quad j = 1, \ldots ,N \hfill \\ \sum\limits_{i = 1}^{N} {\lambda_{i} = 1} \hfill \\ \end{gathered} \right.$$
(15)

Prediction equations and error variance of RK

The estimated deterministic gridded trend component and the interpolated stochastic residual component were added together to obtain the total predicted gridded groundwater levels according to Eq. (9). Therefore, the final prediction equations for the RK-DS, RK-DEM, and RK-MGWL models can be written based on Eq. (9) as follows:

$$\hat{Z}_{RK - DS} (x_{0} ) = \hat{k}_{DS} + \hat{a}_{DS} ({\text{DS}}) + \sum\limits_{i = 1}^{N} {\lambda_{i} } \hat{\varepsilon }(x_{i} )$$
(16)
$$\hat{Z}_{RK - DEM} (x_{0} ) = \hat{k}_{DEM} + \hat{a}_{DEM} ({\text{DEM}}) + \sum\limits_{i = 1}^{N} {\lambda_{i} } \hat{\varepsilon }(x_{i} )$$
(17)
$$\hat{Z}_{RK - MGWL} (x_{0} ) = \hat{k}_{mi} + \hat{b}_{mi} ({\text{MGWL}}_{mi} ) + \sum\limits_{i = 1}^{N} {\lambda_{i} } \hat{\varepsilon }(x_{i} )$$
(18)
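The additive structure shared by Eqs. (16) to (18) can be illustrated with a short Python sketch. The kriging weights are taken here as given inputs (in practice they would come from solving the OK system for the residuals), and the function name and all numerical values are hypothetical.

```python
def rk_predict(k_hat, slope_hat, aux_at_target, weights, residuals):
    """Regression kriging prediction: linear trend at the target plus kriged residual."""
    trend = k_hat + slope_hat * aux_at_target
    kriged_residual = sum(w * e for w, e in zip(weights, residuals))
    return trend + kriged_residual

# Hypothetical fitted trend parameters, auxiliary value at the grid cell (e.g., MGWL in m),
# OK weights for three wells, and trend residuals at those wells (m).
z_hat = rk_predict(
    k_hat=-2.0,
    slope_hat=1.0,
    aux_at_target=15.0,
    weights=[0.5, 0.3, 0.2],
    residuals=[0.4, -0.2, 0.1],
)
```

In this example the trend contributes 13.0 m and the kriged residual 0.16 m. Only the auxiliary variable and the fitted parameters change between the DS-, DEM-, and MGWL-based variants.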

In general, the prediction error variance of RK model can be written as follows (Hengl et al. 2003; Varouchakis and Hristopulos 2013):

$$\sigma_{RK}^{2} (x_{0} ) = \sigma_{RK}^{2} \{ \hat{m}(x_{0} )\} + \sigma_{RK}^{2} \{ \hat{\varepsilon }(x_{0} )\}$$
(19)

Written out in matrix form, the prediction error variance of the RK model can be calculated as follows (Hengl et al. 2003; Varouchakis and Hristopulos 2013):

$$\sigma_{RK}^{2} (x_{0} ) = \{ q_{0}^{T} \,.\,\,(q^{T} \,.\,\,C^{ - 1} \,.\,\,q)^{ - 1} \,.\,\,q_{0} \,\} + \{ (C_{0} + C_{1} ) - c_{0}^{T} .\,\,C^{ - 1} .\,\,c_{0} \}$$
(20)

where \(q\) is the matrix of independent (predictor) variable values at the well sites (visited locations); and \(q_{0}\) is the vector of independent variable values at the unknown (unvisited) location \(x_{0}\).

Model evaluation methods and performance indices

Cross-validation is a powerful technique to evaluate the model predictions over a given spatial domain (Cooper and Istok 1988; Isaaks and Srivastava 1989). Leave-one-out validation (LOOV) is a widely used cross-validation method for assessing model performance against the observations. The LOOV method was carried out by first withholding one observation and fitting the model with the remaining observation points. This process was repeated for all observations sequentially, one after another. In each iteration, the predicted value at the location of the withheld observation was compared with the corresponding observed value. The model performance indices ME, MAE, MSE, RMSE, and the coefficient of determination (R²) are formulated as follows (Delbari et al. 2013):

$$ME = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left[ {Z(\hat{x}_{i} ) - Z(x_{i} )} \right]}$$
(21)
$$MAE = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left| {Z(\hat{x}_{i} ) - Z(x_{i} )} \right|}$$
(22)
$$MSE = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left[ {Z(\hat{x}_{i} ) - Z(x_{i} )} \right]^{2} }$$
(23)
$$RMSE = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} {\left[ {Z(\hat{x}_{i} ) - Z(x_{i} )} \right]^{2} } }$$
(24)
$$R^{2} = 1 - \frac{{\sum\limits_{i = 1}^{N} {\left[ {Z(x_{i} ) - Z(\hat{x}_{i} )} \right]^{2} } }}{{\sum\limits_{i = 1}^{N} {\left[ {Z(x_{i} ) - Z(\overline{x}_{i} )} \right]^{2} } }}$$
(25)

where \(Z(\hat{x}_{i} )\) is estimated value of the variable [L]; \(Z(x_{i} )\) is observed value of the variable [L]; \(Z(\overline{x}_{i} )\) is mean value of the observations; and N is total number of observations.
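The leave-one-out procedure and the performance indices can be sketched together in Python. For brevity, a simple IDW predictor stands in for the kriging models; any function with the same signature could be passed instead. R² is computed here as one minus the ratio of the residual sum of squares to the total sum of squares, which is the usual coefficient of determination and is stated as an assumption. Function names and data are hypothetical.

```python
import math

def idw(points, values, target, p=2.0):
    """Stand-in predictor: inverse distance weighting (target assumed not in points)."""
    w = [1.0 / math.dist(pt, target) ** p for pt in points]
    return sum(wi * zi for wi, zi in zip(w, values)) / sum(w)

def leave_one_out(points, values, predict):
    """Predict each observation from all the others; returns the list of predictions."""
    preds = []
    for i in range(len(points)):
        train_pts = points[:i] + points[i + 1:]
        train_vals = values[:i] + values[i + 1:]
        preds.append(predict(train_pts, train_vals, points[i]))
    return preds

def scores(obs, pred):
    """ME, MAE, MSE, RMSE and R2 of predictions against observations."""
    n = len(obs)
    err = [p - o for p, o in zip(pred, obs)]
    mse = sum(e * e for e in err) / n
    mean_obs = sum(obs) / n
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return {
        "ME": sum(err) / n,
        "MAE": sum(abs(e) for e in err) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - sum(e * e for e in err) / sst,
    }

# Hypothetical transect of three wells with heads of 1, 2 and 3 m.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
obs = [1.0, 2.0, 3.0]
loo_pred = leave_one_out(pts, obs, idw)
loo_scores = scores(obs, loo_pred)
```

A mean error close to zero indicates little systematic bias, whereas MAE and RMSE summarize the typical magnitude of the cross-validation errors.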

Case study: Geography of Adyar basin

To test the effectiveness of the developed IDW and kriging methods, the Adyar basin, a sub-catchment of the Chennai Basin, was chosen in the current study. The Adyar basin is geographically located in the northeastern coastal part of Tamil Nadu (Fig. 2). The Adyar River is one of the major rivers in the Chennai Basin and drains rainwater to the Bay of Bengal during the monsoon season. Semiarid and humid climatic zones characterize the prevailing climate in the basin. The long-term annual average rainfall in the basin is about 1315 mm (WRO 2007). The elevation ranges from 183 m amsl in the western part of the basin to sea level in the eastern part of the basin.

The soils in the basin have been classified into clayey, black, red sandy, and alluvial soils. The black soils occur in the depressions adjacent to hilly areas in the western part. The alluvial soils occur along the river courses and in the eastern coastal areas. The major hydrogeological units in the basin have been classified as unconsolidated, semi-consolidated, and weathered fractured rock formations. The groundwater occurs under phreatic and semi-confined conditions in inter-granular pore spaces in sands and sandstones, and in bedding planes and thin fractures in shales. The depth to the groundwater table in the observation wells varies from near-surface to a maximum of 12 m below ground surface. The distribution of observation wells is more concentrated in the eastern part and sparse in the western parts of the basin. Also, as the surface elevations vary from the eastern to the western part with irregular topographical conditions, the occurrence and distribution of groundwater vary widely from the eastern to the western parts of the basin. As a result, there are considerable variations in the observed groundwater levels across the basin.

Data used in the analysis

Monthly groundwater level data of 29 observation wells for the period 1988–2007 (20 years) were acquired from the Institute of Water Studies (IWS), Taramani, Tamil Nadu, India. The Shuttle Radar Topography Mission (SRTM)-based DEM gridded data (90 m spatial resolution) for the study area was downloaded from the SRTM-Earth Models website (https://srtm.csi.cgiar.org/SELECTION/inputCoord.asp). The depth to groundwater table data was converted to groundwater level elevation or groundwater hydraulic heads with respect to MSL by subtracting depth to water table values from SRTM-DEM values at the observation well sites.

Pre-monsoon and post-monsoon season groundwater level data

For brevity, the pre-monsoon (June) and post-monsoon (December) groundwater level datasets of the year 2007 were chosen for predicting spatially continuous gridded groundwater levels using the developed IDW and kriging methods. Typically, the groundwater level is relatively deep during the pre-monsoon season and rises close to the surface during the post-monsoon season. As the variation in the water levels is higher during these two months (June and December), their datasets were used to demonstrate the performance of the developed models in this study.

Results and discussion

Groundwater hydraulic head data analysis

Most of the observation wells in the study area are concentrated in the eastern part of the basin, where the groundwater lies at shallow depth and the coastal region consists mostly of alluvial formations. Therefore, the groundwater level values with respect to MSL were also relatively low there. In some of the observation wells located close to the coastal boundary of the basin, the groundwater hydraulic heads vary from near the surface to a few meters below it (Fig. 3). In contrast, the groundwater levels varied by up to 10 m in some of the observation wells located in the central part of the basin, where hard rocks characterize the terrain. For example, well nos. 14 and 20 (Fig. 2) lie in the hard rock hilly region in the central part of the basin and show a relatively higher variation in the groundwater levels (Fig. 3).

Fig. 2

Location map of the Adyar basin with observation well numbers

Fig. 3

Variation of groundwater hydraulic heads at the well sites in the study area during 1988–2007

Furthermore, the temporal variations in the groundwater levels at three representative wells (well nos. 8, 20, and 9) from the eastern, central, and western parts of the study area, respectively, are shown in Fig. 4. As the elevation decreases from the western to the eastern part of the study area, a clear distinction in the hydraulic head values was noticed among the representative wells (Fig. 4). For example, the hydraulic head values at the western well ranged from 25 to 30 m amsl due to the relatively higher topography of the region. On the other hand, the hydraulic head values at the eastern well varied between 5 and 10 m due to the relatively flatter topography. A relatively higher variation in the groundwater hydraulic heads (12–22 m) was noticed at the central well, where the geological setting is characterized by hard rock formations with rugged hilly terrain (Fig. 4). No significant temporal trend was observed in any of the observed groundwater level records, including the three shown in Fig. 4. Therefore, the proposed trend function, which is based on the MGWL variable, is valid, as the long-term mean groundwater level is calculated from these stationary groundwater level records.

Fig. 4

Temporal variation of groundwater hydraulic head values at three unique locations in the basin

Correlation of trend variables with groundwater hydraulic heads

The distance to stream map generated from the SRTM DEM is shown in Fig. 5a. The distances from measurement sites to the nearest streams ranged from 0 to 6250 m. The largest distances from streams, more than 5000 m, were mostly located along the western and southern borders of the basin due to sparsely distributed measurement sites. Most of the observation wells, around 85 percent, fall within 1250 m of the stream network. However, a few wells were located relatively far from the stream network (Fig. 5a). The distance to stream values were extracted for all 29 observation wells. The correlation between the extracted distance to stream values and the groundwater hydraulic heads during the pre-monsoon and post-monsoon seasons of 2007 is shown in Fig. 5b, c. The correlation between groundwater hydraulic heads and distance to stream values was relatively weak during both periods (Fig. 5b, c). This could be because the streams in the basin are not perennial; thus, the groundwater contribution to the streams is very limited during non-monsoon seasons, when the groundwater table is relatively deep. Moreover, heavy groundwater pumping to meet the water demands of the population and industrial activities in the region is causing a decline in groundwater levels. These factors collectively cause the surface streamflow systems to be disconnected from the groundwater systems. Therefore, the groundwater hydraulic heads and distance to stream values were not highly correlated.

Fig. 5

a Distance to streams map and correlation of groundwater hydraulic heads and distance to streams data at the well sites during b pre-monsoon, and c post-monsoon season of the year 2007

The scatter plots with fitted linear models of observed groundwater hydraulic heads against DEM and MGWL values at the well locations during the pre-monsoon and post-monsoon seasons are shown in Fig. 6. Both DEM and MGWL values are strongly correlated with groundwater hydraulic heads during both seasons. The DEM is strongly correlated with groundwater hydraulic heads because the land surface topography closely resembles the hydraulic heads of the shallow unconfined aquifer in the Adyar basin (Fig. 6a, b). However, the MGWL variable shows an even higher correlation with groundwater hydraulic heads than the DEM, with correlation coefficients of 0.99 in both the pre-monsoon and post-monsoon seasons (Fig. 6c, d). This is because the MGWL values were calculated as the long-term monthly average of groundwater levels, and the groundwater levels of any specific month would naturally be highly correlated with the corresponding monthly average. Therefore, this study tested the hypothesis that the MGWL variable separates the spatial trend from groundwater level datasets more effectively than the other variables and thereby improves the prediction accuracy in spatial modeling of groundwater hydraulic heads.

Fig. 6

Correlations between groundwater hydraulic heads and DEM at well sites during a pre-monsoon and b post-monsoon; and correlations between groundwater hydraulic heads and MGWL at well sites during c pre-monsoon and d post-monsoon seasons of the year 2007
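The correlations in Figs. 5 and 6 can in principle be reproduced with the Pearson correlation coefficient. The sketch below assumes that the reported coefficients are Pearson coefficients, which the text does not state explicitly. All sample values are invented for illustration and are not the Adyar basin data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical well values: head (m amsl), DEM (m amsl), distance to stream (m)
head = [3.0, 9.0, 24.0, 47.0, 63.0]
dem = [5.0, 12.0, 30.0, 55.0, 70.0]
dist_stream = [200.0, 6000.0, 150.0, 900.0, 400.0]

r_dem = pearson_r(dem, head)          # strong positive correlation in this toy set
r_ds = pearson_r(dist_stream, head)   # weak correlation in this toy set
print(round(r_dem, 3), round(r_ds, 3))
```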

Estimation of trend model parameters at well sites

The parameters of the RK trend models (RK-DS, RK-DEM, and RK-MGWL) were estimated by the generalized least squares (GLS) method using the datasets of the 29 observation wells (n = 29). The t-statistics of the estimated model parameters are given in Table 1. At least one of the two parameters of the linear models (slope and intercept) was statistically significant at the 0.01 significance level (99% confidence level) or the 0.05 significance level (95% confidence level) for all RK methods (Table 1). Although the distance to stream values were only weakly correlated with groundwater hydraulic heads, the slope coefficient (\(k_{DS}\)) of the fitted linear regression model was significantly different from zero at the 0.01 significance level for both the pre-monsoon and post-monsoon seasons. The slope parameters for both RK-DEM and RK-MGWL were significant at the 0.01 significance level, as DEM and MGWL values were strongly correlated with groundwater hydraulic heads.

Table 1 Estimated parameters of RK trend models for pre-monsoon and post-monsoon groundwater level datasets of the year 2007
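The significance test behind Table 1 can be illustrated with a simple linear trend fit. The sketch below uses ordinary least squares, which is the special case of GLS with uncorrelated, equal-variance errors; it is therefore a simplification of the estimator used in the study, not a reproduction of it. The function name and the sample data are hypothetical.

```python
import math


def fit_linear_trend(x, y):
    """Ordinary least-squares fit y = intercept + slope * x with t-statistics."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)  # residual variance, n - 2 d.o.f.
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + mx * mx / sxx))
    return {
        "slope": slope,
        "intercept": intercept,
        "t_slope": slope / se_slope,
        "t_intercept": intercept / se_intercept,
        "residuals": residuals,
    }


# Hypothetical auxiliary variable (x) and hydraulic heads (y)
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1.1, 2.9, 5.2, 6.8, 9.0]
fit = fit_linear_trend(x_demo, y_demo)
print(fit["slope"], fit["intercept"], fit["t_slope"])
```

For n = 29 wells and a two-parameter linear model there are 27 degrees of freedom, for which the two-sided critical t-values are about 2.05 at the 0.05 level and about 2.77 at the 0.01 level. The returned residuals are the values that an RK method would pass on to the semivariogram and kriging steps.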

Unlike the DEM, the MGWL is not constant in time. As the MGWL is a function of various hydrogeologic factors, it varies over both the spatial and temporal dimensions. Therefore, it is also possible to establish separate functions for individual seasons or months if long observation records are available at the observation well sites. In the present study, as the groundwater hydraulic head observations are available at a monthly interval for 20 years, twelve different DEM-MGWL functions with unique parameters for January through December were estimated by the GLS approach. The estimated parameters for Eq. (12) are given in Table 2. The estimated coefficients of the DEM-MGWL models were statistically significant at the 0.01 significance level for all the months, as DEM and MGWL values were strongly correlated in the study area (Table 2). The spatially continuous MGWL maps over the study area were derived using the gridded DEM map as the independent variable and the optimized DEM-MGWL parameters for January through December. These 12 gridded MGWL maps were used to estimate the trend components for the corresponding months over the entire study domain in the RK-MGWL trend models.

Table 2 Estimated parameters of DEM-MGWL relationship for the months January through December
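The derivation of gridded MGWL maps from the DEM can be sketched as follows. The sketch assumes that the DEM-MGWL relationship of Eq. (12) is linear in the DEM value, which is an assumption made here for illustration. The coefficients and the tiny 2 × 2 grid are invented and are not the values of Table 2.

```python
def mgwl_grid(dem_grid, intercept, slope):
    """Apply an assumed linear DEM-MGWL relation to every cell of a DEM grid."""
    return [[intercept + slope * z for z in row] for row in dem_grid]


# Hypothetical month-specific parameters: month -> (intercept in m, slope)
monthly_params = {"Jun": (-2.0, 0.95), "Dec": (-0.5, 0.97)}

# Hypothetical DEM grid (m amsl)
dem_grid = [[5.0, 20.0], [60.0, 100.0]]

# One MGWL map per month, each with the same shape as the DEM grid
mgwl_maps = {
    month: mgwl_grid(dem_grid, a, b) for month, (a, b) in monthly_params.items()
}
print(mgwl_maps["Jun"])
```

Keeping one parameter pair per month lets the trend surface move up and down with the season while the DEM itself stays fixed.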

Pre-monsoon groundwater level predictions by IDW and kriging methods

The semivariogram models fitted for the developed kriging models using pre-monsoon groundwater level data are shown in Fig. 7. The number of bins (lags) used for calculating the experimental semivariance values was set to 20. The spherical model coefficients were initially assigned arbitrary values and were then optimized using the ‘fminsearch’ optimization algorithm in MATLAB. The fitted semivariogram parameters, range and sill, differed among the kriging methods because the data used to fit the semivariogram model depended on the type of kriging model (Fig. 7). The OK model used the actual groundwater hydraulic head data for constructing the semivariogram, whereas the RK-DS, RK-DEM, and RK-MGWL methods used the residuals obtained after removing the trend component from the groundwater hydraulic heads. In all cases, the raw groundwater hydraulic heads or the detrended residuals were checked for normality. The detrended residuals were normally distributed, while the raw groundwater hydraulic heads approximately followed a normal distribution. Therefore, no data transformation was performed to satisfy the normality condition. A nugget effect was observed in the semivariograms as a small semivariance value at short lag distances, because two observations sampled close together do not necessarily have the same value. A nugget effect can also arise from measurement error in the observations.

Fig. 7

Fitted spherical semivariogram models for a OK, b RK-DS, c RK-DEM, and d RK-MGWL using pre-monsoon season hydraulic head data for the year 2007
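The semivariogram construction described above can be sketched in a few functions. The sketch assumes equal-width distance bins and a spherical model parameterized by nugget, total sill, and range; the study's exact binning and parameterization are not shown in this section. The function names are hypothetical. The squared-error objective is the kind of function that a Nelder-Mead simplex routine such as MATLAB's `fminsearch` could minimize; the optimizer itself is not reimplemented here.

```python
import math


def spherical(h, nugget, sill, rng):
    """Spherical semivariogram; `sill` is the total sill including the nugget."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def experimental_semivariogram(points, values, n_bins=20):
    """Return (mean lag, mean semivariance) for each non-empty distance bin."""
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            pairs.append((h, 0.5 * (values[i] - values[j]) ** 2))
    width = max(h for h, _ in pairs) / n_bins
    lag_sum, gamma_sum, count = [0.0] * n_bins, [0.0] * n_bins, [0] * n_bins
    for h, g in pairs:
        k = min(int(h / width), n_bins - 1)  # the largest lag goes in the last bin
        lag_sum[k] += h
        gamma_sum[k] += g
        count[k] += 1
    return [
        (lag_sum[k] / count[k], gamma_sum[k] / count[k])
        for k in range(n_bins)
        if count[k] > 0
    ]


def fit_error(params, empirical):
    """Sum of squared differences between the model and the experimental values."""
    nugget, sill, rng = params
    if rng <= 0 or nugget < 0 or sill < nugget:
        return math.inf  # reject physically meaningless parameter sets
    return sum((g - spherical(h, nugget, sill, rng)) ** 2 for h, g in empirical)


# Hypothetical data: three collinear points with a linear increase in value
demo = experimental_semivariogram([(0, 0), (1, 0), (2, 0)], [0.0, 1.0, 2.0], n_bins=4)
print(demo, fit_error((0.0, 2.0, 2.0), demo))
```

For an OK model the `values` argument would hold the raw heads, whereas for an RK model it would hold the detrended residuals, which is why the fitted sill differs so strongly between the methods.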

The estimated range parameters for the OK, RK-DS, and RK-DEM methods (17.69 km, 18.11 km, and 19.33 km, respectively) were considerably higher than that of the RK-MGWL method (6.82 km). This indicates that the spatial autocorrelation declines more quickly for the proposed RK-MGWL method than for the OK, RK-DS, and RK-DEM methods (Fig. 7). Similarly, the estimated sill values of 39.93 m² and 36.27 m² for the OK and RK-DS models were much higher than the 5.14 m² and 1.35 m² obtained for the RK-DEM and RK-MGWL methods (Fig. 7). The sill variance for the OK method was high because of the high variance in the observed groundwater hydraulic head data, as the observation wells are sparsely distributed across the basin from low-altitude to high-altitude regions (Fig. 7a). In general, the sill variance of the RK-based models was lower because they used the residual data pairs for constructing the semivariograms. As the trend component in the RK model filters a major part of the variability in the groundwater level dataset, the remaining variability in the residuals was relatively small. This is why the RK-DEM and RK-MGWL methods show a lower sill variance than the OK method (Fig. 7c, d). However, unlike the other RK methods, the RK-DS method shows a sill variance (36.27 m²) almost equal to that of the OK method (Fig. 7b). This could be because of the poor performance of its trend function, which captured only a small part of the variability in the groundwater level data and left most of the variability in the residuals, which were subsequently modeled by the OK method. As the proposed trend surface function in RK-MGWL (Eq. 13) captures most of the variability in the trend component of the RK model, the remaining residual variance was minimal. This explains the significant reduction in the sill variance (1.35 m²) for the RK-MGWL method (Fig. 7d) as compared to the other RK methods.

The total groundwater hydraulic head predictions by the developed kriging and IDW models are shown in Fig. 8. The IDW method is deterministic and interpolates groundwater hydraulic heads based only on distance. As the IDW method does not model the spatial covariance, it fails to reproduce a realistic variation in the groundwater levels under the sparse distribution of measurement sites in the study area (Fig. 8a). As the OK method (Eq. 4) could not capture the spatial trend component accurately in the high-altitude regions, it severely underpredicts the groundwater levels in the western parts of the basin (Fig. 8b). For example, the groundwater heads predicted by the OK method in the western parts ranged from 30 to 35 m, which is far from the realistic variation of groundwater hydraulic heads, as the surface elevation in these regions varies from about 50 to 70 m amsl (Fig. 8b). The distance to stream-based RK model produced a groundwater level surface with maximum hydraulic head values of 40–45 m in the western parts of the basin, which are still underestimates for that part of the basin (Fig. 8c). The IDW, OK, and RK-DS methods thus estimate relatively low hydraulic head values even in the highly elevated regions of the basin. It is clear from Fig. 8a–c that the interpolated hydraulic head values near the boundary were lower than those in the central part of the basin. This would imply that groundwater flows from the central part towards the surrounding boundary regions of the basin, which is not realistic given the topography of the basin.

Fig. 8

Predicted groundwater hydraulic heads by a IDW, b OK, c RK-DS, d RK-DEM, and e RK-MGWL using pre-monsoon hydraulic head data
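The distance-only weighting of IDW discussed above can be illustrated with a short sketch. The power exponent of 2 is an assumption chosen for illustration, since the exponent used in the study is not given in this section. The sample points are hypothetical.

```python
import math


def idw(points, values, target, power=2.0):
    """Inverse distance weighted estimate at `target` from scattered observations."""
    num = den = 0.0
    for p, v in zip(points, values):
        d = math.dist(p, target)
        if d == 0.0:
            return v  # the target coincides with an observation
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den


# Hypothetical wells (x, y in km) and heads (m amsl)
pts = [(0.0, 0.0), (2.0, 0.0)]
vals = [10.0, 20.0]
print(idw(pts, vals, (1.0, 0.0)))    # midpoint between the two wells
print(idw(pts, vals, (100.0, 0.0)))  # far outside the well network
```

Because all IDW weights are positive and sum to one, every estimate is a weighted average of the observed values and can never exceed the highest observed head. This is consistent with the underprediction in sparsely monitored high-elevation areas noted above.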

Interestingly, both the RK-DEM and RK-MGWL methods (Eqs. (17) and (18)), which compute the trend component of the groundwater level data with respect to the DEM and MGWL variables, respectively, predict a reasonably realistic variation of groundwater hydraulic heads across the basin (Fig. 8d, e). Although the groundwater hydraulic head predictions by the RK-DEM and RK-MGWL methods show a similar pattern, there are distinct differences between the two models’ predictions (Fig. 8d, e). The groundwater hydraulic head predictions by both RK models ranged from a few meters in the eastern parts to 70 m in the western parts of the basin, which closely aligns with the topographic elevations of the basin. Unlike the IDW, OK, and RK-DS methods, the RK-DEM and RK-MGWL methods also predict a realistic variation of groundwater hydraulic heads in the central hard rock hilly regions of the basin (Fig. 8d, e).

The total prediction error variance for the OK (Eq. 6), RK-DS (Eq. 20), RK-DEM (Eq. 20), and RK-MGWL (Eq. 20) methods is shown in Fig. 9. The color scale of the total prediction variance maps ranges from 0 to 6 m² for the RK-DEM and RK-MGWL methods and from 0 to 30 m² for the OK and RK-DS methods (Fig. 9). The same color scale was not used for all the methods because the magnitude of the prediction error variance for the OK and RK-DS methods was significantly higher than that for the RK-DEM and RK-MGWL methods; a common scale could suppress the variation among the higher values in the OK and RK-DS-based total error variance maps. The prediction error variance was significantly higher (more than 50 m²) for the OK method, as the trend component was not modeled properly (Fig. 9a). The same is the case for the RK-DS method (Fig. 9b), whereas the total prediction error variance was significantly reduced for the RK-DEM method, for which it ranged from 0 to 6 m² (Fig. 9c). In general, the total prediction error variance was comparatively higher, up to 6 m², in the western parts of the basin due to the sparse distribution of observation wells in these regions. On the other hand, owing to the dense coverage of observation wells in the central and eastern parts of the basin, the total prediction error variance was reduced to less than 1 m² in these regions (Fig. 9c). Interestingly, the RK-MGWL method outperformed the OK, RK-DS, and RK-DEM models, as its maximum prediction error variance was less than 1 m² over the entire basin (Fig. 9d). The prediction error variance was substantially reduced for the proposed method because most of the variance in the groundwater hydraulic head data was effectively captured by the RK-MGWL trend function (Eq. 13).

Fig. 9

Total error variance created by a OK, b RK-DS, c RK-DEM, and d RK-MGWL models using pre-monsoon hydraulic head data
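The link between well spacing and prediction error variance can be illustrated with a textbook ordinary kriging system written in terms of the semivariogram. The sketch below is not a reproduction of Eqs. (4), (6), or (20), whose exact notation is not shown in this section. It assumes a spherical semivariogram with zero nugget, and the sample wells, sill, and range are hypothetical.

```python
import math


def spherical(h, sill, rng):
    """Spherical semivariogram with zero nugget (an assumption of this sketch)."""
    if h >= rng:
        return sill
    r = h / rng
    return sill * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(points, values, target, sill, rng):
    """Return the (prediction, prediction error variance) pair at `target`."""
    n = len(points)
    # Semivariances between observations, bordered by the unbiasedness constraint
    a = [
        [spherical(math.dist(points[i], points[j]), sill, rng) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    a.append([1.0] * n + [0.0])
    # Semivariances between each observation and the target
    b = [spherical(math.dist(p, target), sill, rng) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    prediction = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance


# Hypothetical wells (km), heads (m), sill (m^2), and range (km)
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vals = [10.0, 12.0, 14.0]
at_well = ordinary_kriging(pts, vals, (0.0, 0.0), sill=4.0, rng=5.0)
far_away = ordinary_kriging(pts, vals, (100.0, 100.0), sill=4.0, rng=5.0)
print(at_well, far_away)
```

With zero nugget the variance vanishes at a well and grows with distance from the wells, and it scales with the sill. In an RK setting the same system would be applied to the detrended residuals, so a trend function that leaves a small residual sill also yields a small kriging variance.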

Post-monsoon groundwater level predictions by kriging methods

The spherical semivariogram models fitted for the post-monsoon season groundwater level analysis were similar to those of the pre-monsoon season analysis (Fig. 10). As expected, the sill variance was highest for the OK method (50.41 m²), followed by the RK-DS (44.11 m²), RK-DEM (5.21 m²), and RK-MGWL (2.26 m²) methods (Fig. 10).

Fig. 10

Fitted semivariogram models for a OK, b RK-DS, c RK-DEM, and d RK-MGWL using post-monsoon season hydraulic head data

Similarly, the groundwater hydraulic head prediction surfaces obtained for the post-monsoon season were similar to the pre-monsoon predictions (Fig. 11). The IDW, OK, and RK-DS methods predicted relatively high groundwater hydraulic head values (40 m to 45 m) in the northwestern part of the basin and relatively low values (0 m to 30 m) in the remaining parts of the basin (Fig. 11a–c). The RK-DEM and RK-MGWL methods predicted the groundwater hydraulic heads realistically, such that groundwater flows from the higher-elevation regions in the western part of the basin towards the lower elevations at the eastern coastal boundary (Fig. 11d, e). In general, the post-monsoon season groundwater hydraulic head predictions were higher than the pre-monsoon predictions because the southwest and northeast monsoon seasons raise the groundwater table.

Fig. 11

Groundwater hydraulic head predictions by a IDW, b OK, c RK-DS, d RK-DEM, and e RK-MGWL using post-monsoon hydraulic head data

A significant reduction in the total prediction error variance (0 to 2.5 m²) was observed for the RK-MGWL method relative to the OK, RK-DS, and RK-DEM methods (Fig. 12). For the RK-DEM method, the post-monsoon prediction error variance was higher than the pre-monsoon one, ranging up to 14 m² against less than 6 m² in the pre-monsoon analysis (Figs. 9c and 12c). However, the significant reduction in the prediction error variance for the RK-MGWL method in both the pre-monsoon and post-monsoon analyses (Figs. 9d and 12d) shows that the proposed method outperforms the traditional methods across the range of groundwater level values, irrespective of the season.

Fig. 12

Total error variance created by a OK, b RK-DS, c RK-DEM and d RK-MGWL models using post-monsoon hydraulic head data

Cross-validation statistics for the developed models

The overall model accuracy was assessed by LOOV analysis, in which one observation was left out at each iteration, the respective model was used to predict the value at the omitted well from the remaining observations, and the process was repeated 29 times, once for each of the 29 well sites. Figure 13a, b shows the agreement between predictions and observations for the different interpolation methods, including IDW and the kriging methods. The predicted values of the proposed RK-MGWL method were in closer agreement with the observed values than those of the other methods (Fig. 13). The IDW method severely underpredicted the actual values at higher elevations, especially in the western part of the basin. This could be because IDW relies on a distance-based weighting function and very few points were available to interpolate most of the grid cells in the higher-elevation regions in the western parts of the basin. In general, all the developed methods performed better in regions with a relatively shallow groundwater table. At the same time, there was considerable uncertainty when the models predicted the groundwater levels in the deeper groundwater zones of the basin, except for the proposed RK-MGWL method (Fig. 13). The RK-MGWL method predicts the groundwater level surface with greater accuracy even in the higher-elevation regions of the basin, as its trend function effectively models the rising and falling trends in the groundwater level data through the season-specific trend model parameters.

Fig. 13

Observed and predicted groundwater hydraulic head values by different interpolation methods using a pre-monsoon and b post-monsoon datasets of the year 2007
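The leave-one-out procedure can be sketched as a loop around any interpolator. In the sketch below a simple IDW function with an assumed power of 2 stands in for the interpolation methods of the study; the function names and the three sample wells are hypothetical.

```python
import math


def idw(points, values, target, power=2.0):
    """Stand-in interpolator: inverse distance weighting with an assumed power of 2."""
    num = den = 0.0
    for p, v in zip(points, values):
        d = math.dist(p, target)
        if d == 0.0:
            return v
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den


def leave_one_out(points, values, predict):
    """Predict each observation from all the others and return the predictions."""
    predictions = []
    for i in range(len(points)):
        train_points = points[:i] + points[i + 1:]   # drop well i
        train_values = values[:i] + values[i + 1:]
        predictions.append(predict(train_points, train_values, points[i]))
    return predictions


# Hypothetical wells on a line (km) with heads (m)
wells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
heads = [0.0, 1.0, 2.0]
loo_predictions = leave_one_out(wells, heads, idw)
print(loo_predictions)
```

Passing the interpolator as a callable keeps the loop identical for every method, so that differences in the cross-validation statistics reflect the methods rather than the validation procedure.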

The predictions at the 29 observation well locations and the corresponding observed groundwater hydraulic head values were used to calculate model performance indices, namely ME, MSE, MAE, RMSE, and R². The LOOV statistics (n = 29) for the respective methods are given in Table 3. Table 3 shows that the proposed method (RK-MGWL) outperforms the other methods (IDW, OK, RK-DS, and RK-DEM) with respect to the ME, MSE, MAE, RMSE, and R² values. The IDW method performed the poorest among the methods, as it severely underestimated the groundwater hydraulic heads in both the pre-monsoon and post-monsoon seasons. The RK-DEM model performed better than the OK and RK-DS methods. However, it predicts the groundwater hydraulic heads with considerably more uncertainty than the RK-MGWL method. For example, the RMSE values of the RK-DEM method were 2.20 m and 2.51 m during the pre-monsoon and post-monsoon seasons, respectively. These RMSE values were about 61% and 43% higher than those of the proposed RK-MGWL method (1.37 m and 1.75 m) during the pre-monsoon and post-monsoon seasons, respectively (Table 3).

Table 3 Performance assessment of various spatial interpolation models by a cross-validation technique
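The performance indices of Table 3 can be computed as in the following sketch. Two conventions are assumptions made here because the defining equations are not shown in this section: errors are taken as predicted minus observed, and R² is computed as one minus the ratio of the residual to the total sum of squares. The short observed and predicted series are invented; only the four RMSE values in the last lines are taken from the text.

```python
import math


def performance_indices(observed, predicted):
    """Return ME, MSE, MAE, RMSE, and R2 for paired observations and predictions."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]  # assumed sign convention
    mse = sum(e * e for e in errors) / n
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "ME": sum(errors) / n,
        "MSE": mse,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - sum(e * e for e in errors) / sst,  # assumed definition of R2
    }


def percent_higher(value, reference):
    """Percentage by which `value` exceeds `reference`."""
    return 100.0 * (value - reference) / reference


# Hypothetical observed and predicted heads (m)
stats = performance_indices([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(stats)

# RMSE values reported in the text: RK-DEM against RK-MGWL
pre_gap = percent_higher(2.20, 1.37)   # pre-monsoon
post_gap = percent_higher(2.51, 1.75)  # post-monsoon
print(round(pre_gap, 1), round(post_gap, 1))
```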

Conclusions

Spatial modeling of groundwater hydraulic heads in arid and semiarid regions is crucial for the optimal and sustainable management of groundwater resources. Spatial modeling of groundwater levels is often attempted with methods such as physically based deterministic functions and stochastic kriging techniques. However, the traditional kriging interpolation methods carry considerable prediction uncertainty because they account poorly for the trend component in the groundwater level datasets. An improved RK method with an MGWL-based trend function has been proposed in the present study. The proposed trend function was formulated based on the MGWL variable, which was calculated by long-term season-specific averaging of the groundwater levels. The effectiveness of the proposed kriging model was demonstrated against the traditional kriging and IDW methods in the Adyar basin. The major conclusions of the present study are summarized below:

  • The sill variance was very high for the OK and RK-DS methods as compared to the RK-DEM and RK-MGWL methods. The OK and RK-DS methods did not effectively capture the spatial trend in the groundwater level data. The RK-DEM method modeled the trend surface with a linear model using DEM as an auxiliary variable. The RK-DEM-based trend function effectively filtered the trend component in the groundwater level data, as the shallow groundwater levels nearly align with the basin topography. Therefore, the remaining residual variance was lower for the RK-DEM method than for the RK-DS and OK methods. The residual variance from the RK-MGWL method was significantly lower than that of all other methods developed in the present study. The MGWL variable, a function of long-term monthly averaged values, represented the trend component in the groundwater level signals closely and therefore reduced the residual variance significantly as compared to the other methods.

  • Groundwater hydraulic head predictions by the RK-DEM and RK-MGWL methods showed a realistic groundwater level variation with respect to the ground surface, as these methods used DEM- and MGWL-based trend functions to filter the trend component of the groundwater hydraulic head data. However, the predictions of the two methods differed distinctly in the low-lying regions of the basin, which shows a clear difference between the two models. The IDW, OK, and RK-DS methods mostly underestimated the groundwater hydraulic head values in higher-altitude regions; thus, these methods are not capable of modeling the groundwater level surface in complex hydrogeological conditions.

  • The total error variance maps of the different interpolation methods showed that the RK-MGWL method yielded the lowest error variance in the spatial domain among all the models. For the OK and RK-DS methods, the total error variance was relatively low at the well locations but increased to more than 50 m² away from the wells. On the other hand, the RK-DEM and RK-MGWL methods produced error variance maps with maximum values of 6 m² and 1 m², respectively. Interestingly, the error variance of the RK-MGWL method was less than 1 m² and almost uniform over the entire spatial domain, which shows the ability of the proposed method to predict groundwater levels precisely even under a sparse distribution of measurement locations.

  • Cross-validation statistics for both the pre-monsoon and post-monsoon groundwater level predictions show that the proposed kriging method outperforms the traditional kriging and IDW methods in terms of ME, MSE, MAE, RMSE, and R² values. For example, the RMSE values for the proposed method were about 1.37 m and 1.75 m during the pre-monsoon and post-monsoon seasons, respectively. In comparison, the RMSE for the RK-DEM method was about 1.6 and 1.4 times that of the RK-MGWL method (2.20 m and 2.51 m), and the RMSE for the other methods was up to nearly four times higher.

The proposed method was applied to only one geographic region due to data limitations. There is scope to evaluate the effectiveness of the proposed method in other geographical regions with different climatic and hydrogeologic settings. Although the proposed method has been validated in the spatial domain only, it can also be used for spatiotemporal groundwater level modeling. In spatiotemporal modeling, the proposed method can be used to interpolate the groundwater levels in the spatial domain, in combination with temporal prediction of groundwater levels by time series models.