1 Introduction

Groundwater is a precious water resource on earth excluding the polar ice caps and glaciers [32]. From the perspective of utilization, groundwater is utilized mainly for irrigation, which accounts for 80%. Besides agriculture, groundwater is used for drinking water supply, domestic and industrial uses, etc. In order to effectively utilize the ground water resources, it is mandatory to predict the groundwater level. The groundwater level is subjected to variations resulting from differences in the recharge and release of groundwater, streamflow variations, tidal effects, meteorological impacts and also global climatic changes [48]. Computational modeling of flow and transport has become an important technique to understand the hydrology of the aquifers and other related aspects of the subsurface media [3, 22]. The drawbacks with the usage of such numerical models are difficulty in conversion of the physical processes into mathematical formulation and lack of sufficient data. To overcome these difficulties and shortcomings, artificial intelligence techniques have been proposed over the last two decades.

Artificial neural network (ANN) has been widely used in the past for prediction of groundwater levels [1, 7, 9,10,11, 14, 18, 23, 26, 29, 36, 54]. Coulibaly et al. [9], Mao et al. [26] and Coppola et al. [6] applied ANN to predict groundwater levels under variable weather conditions, whereas Daliakopoulos et al. [10], Lallahem et al. [23], Coppola et al. [7, 8] and Feng et al. [14] observed in particular for the effects of pumping. Affandi and Watanabe [1] used ANN to forecast groundwater level fluctuations using time-lagged water levels as input. Jalalkamali and Jalalkamali [18] improved the ANN method by incorporating a genetic algorithm to overcome partly the well-known problem with traditional ANN optimization of finding a global minimum of the error cost function. Ty et al. [49] predicted the groundwater level under different impact factors using ANN for the Tra Noc industrial zone in the Can Tho city of Vietnam for the period of 2000–2015. Kaya et al. [19] predicted the groundwater level for the Reyhanli region of Turkey using ANN and M5Tree models. They reported that the performance of the ANN and M5Tree models was found to be very close to each other. Wunsch et al. [52] used nonlinear autoregressive networks with exogenous inputs (NARX) to obtain the groundwater level forecasts for several wells in southwest Germany. They found that NARX is outstanding in predicting the groundwater level for very small set of inputs for the three aquifer types. Kouziokas et al. [20] investigated the application of multilayer feed-forward network models for forecasting groundwater levels in the region of Montgomery country in Pennsylvania and found that the multilayer feed-forward network shows good accuracy in predicting the groundwater levels. Lee et al. [24] used artificial neural network models for groundwater level forecasting with input variables composed of one natural factor and two anthropogenic factors in Yangpyeong riverside, South Korea.

Recently, genetic programming (GP) has gained importance and has been successfully used in many water management problems. Fallah-Mehdipour et al. [13] investigated the capability of an adaptive neural fuzzy inference system (ANFIS) and genetic programming (GP) artificial intelligence tools to predict and simulate groundwater levels in three observation wells in the Karaj plain of Iran. Sivapragasam et al. [39] analyzed the suitability of using GP modeling for groundwater level prediction. Researchers have concluded that GP simulated equations decrease the computational effort by using common simulation packages that can yield results with acceptable accuracy [13, 31, 34].

The basis for the support vector machines (SVM) was developed by Vapnik [50]. The solution provided by SVM is always unique and global as its implementation requires the solution of a convex quadratic constrained optimization problem [35]. Asefa et al. [4] used SVM for developing long-term groundwater head monitoring networks. Yoon et al. [55] compared ANN and SVM methods to predict the groundwater levels. Moreover, Shiri et al. [38] investigated the abilities of different data mining techniques including SVM for groundwater level forecasting. Tapak et al. [47] predicted the groundwater level of Hamadan–Bahar plain, west of Iran, using support vector machines. Mirzavand et al. [27] compared the abilities of two different data-driven methods, support vector regression (SVR) and an adaptive neuro-fuzzy inference system (ANFIS) in estimating the monthly groundwater level fluctuations in the Kashan plain, Isfahan province of Iran, using the inputs of stream flow, evaporation, spring discharge, aquifer discharge and rainfall. Although SVM has been successful in prediction, the output of SVM depends on the choice of the suitable kernel function and its parameters that are being adopted. Sattari et al. [33] predicted the monthly groundwater level in Ardebil plain using support vector regression (SVR) and M5 tree model based on the data collected during the period of 1997–2013 from 24 piezometers. They found that both the models performed well in predicting the groundwater level and the results obtained from the M5 decision tree model were straightforward and easy to interpret compared to the SVR results. Tang et al. [46] adopted a two phase data-driven framework to model the time series groundwater level with spatial–temporal analysis and least-square support vector machine (LS-SVM), and the LS-SVM model was found to outperform other machine learning models. Guzman et al. [15] compared the prediction capabilities of NARX and SVR trained with a radial basis function (RBF) algorithm for an irrigation well located in a highly productive agricultural region in the southeastern USA. They found that the SVR has a better modeling performance for both the seasons considered for their study.

Huang et al. [16] proposed a novel data-driven algorithm for a single layer feed-forward neural network popularly known as the extreme learning machine (ELM). The ELM reduces the computational time required for training a neural network. Moreover, since ELM simplifies the entire learning process, it yields faster learning with good generalization performance [25, 28, 37]. Nurhayati et al. [30] used extreme learning machine (ELM) to predict the groundwater level on tidal lowlands for the purpose of reclamation. Yadav et al. [53] have conducted a comparative study between SVM and ELM for the prediction of groundwater level. They have concluded that the performance of the ELM method is better than the performance of SVM for groundwater level forecasting. Alizamir et al. [2] investigated the ability of ELM model in modeling the groundwater level fluctuations using hydro-climatic data obtained for Hormozgan province in southern Iran. They found ELM model performance to be superior compared to the ANN and RBF models in modeling the 1-, 2- and 3- month ahead groundwater level.

Literature indicates that ELM performs better than SVM for groundwater level forecasting. In an attempt to improve the performance of SVM, the hybrid SVM models, namely SVM-PSO (support vector machine-quantum-behaved particle swarm optimization) and SVM-RBF (support vector machine-radial basis function), are being employed in this study to forecast the groundwater levels in the districts of Vizianagaram, Andhra Pradesh. Hybrid SVM models have been used in the past by various researchers. Sudheer and Mathur [44] used particle swarm optimization (PSO) to determine the optimal parameters of SVM. They applied the SVM-PSO model for estimating the groundwater level of Rentachintala region of Andhra Pradesh in India and concluded that the performance of SVM-PSO was better compared to autoregressive moving average (ARMA), ANN and ANFIS. Balavalikar et al. [5] used PSO-based ANN for forecasting the groundwater level of Udupi district. The monthly variations in groundwater level and rainfall data in three observation wells located in Brahmavar, Kundapur and Hebri were investigated and examined for 2000–2013. They concluded that PSO-ANN-based hybrid model gave a better accuracy than ANN alone. Sudheer et al. [43] adopted SVM-QPSO to estimate the groundwater level of Rentachintala region and found that SVM-QPSO performs far better compared to ANN in terms of accuracy and reliability. Moreover, Sudheer et al. [45] have employed SVM-QPSO for streamflow forecasting and have reported that SVM-QPSO performs much better compared to SVM-PSO, and thus, SVM-QPSO has been considered in this study instead of SVM-PSO. Sreedevi et al. [41] used SVM-RBF to predict the groundwater levels at Maheshwaram watershed, Hyderabad, Andhra Pradesh. Although SVM-QPSO and SVM-RBF models have been used in the past, the efficiency of these hybrid models in comparison with ELM for groundwater level prediction has not yet been explored. The objective of this study is to evaluate the efficiency of the SVM hybrid models in comparison with ELM for the prediction of the groundwater levels in Vizianagaram district, Andhra Pradesh, India. This paper is sub-divided into several sections: (1) description of the study area, (2) brief explanation of the various soft computing techniques adopted in this study, (3) comparison of ANN, GP, SVM and ELM, and (4) comparison of SVM hybrid models with ELM.

2 Study area

Vizianagaram district is a northern coastal district of Andhra Pradesh, India. The district is bounded on the east by the district of Srikakulam, southwest by the district of Visakhapatnam, southeast by the Bay of Bengal and northwest by the state of Odisha. The climate of this district is characterized by high humidity nearly all round the year with oppressive summer and good seasonal rainfall. The summer season from March to May is followed by South West monsoon season, which continues up to September. October and November constitute the retreating monsoon season. December to February is the season for fine weather. The climate of the hilly regions of the district received heavier rainfall and cooler than the plains. The district receives the benefit of both South West and North East monsoons. Figure 1 illustrates the topographical map of the Vizianagaram district. The six stations from which groundwater level data have been collected are highlighted in Fig. 1 for easy identification. Table 1 summarizes the geographical description of these stations.

Fig. 1
figure 1

Map of Vizianagaram district

Table 1 Geographical details of the selected stations

The monthly groundwater level data were collected from six locations, namely Cheepurupalli, Gantyada, Garugubilli, Jiyyammavalasa, Pachipenta and Vizianagaram for the period of 2007–2012.

3 Materials and methods

In the present study, the groundwater level in the six chosen stations has been predicted using four soft computing tools, namely ANN, GP, SVM, ELM, SVM-QPSO and SVM-RBF. The brief description of each soft computing method is given below.

3.1 Artificial neural network (ANN)

ANN is being used for prediction in a wide range of fields as a multilayer feed-forward network with backpropagation learning algorithm. A typical neural network consists of three layers: an input layer, a hidden layer and an output layer. Each layer is made up of a number of neurons. The number of neurons in the hidden layer is not fixed. When the number of neurons in the hidden layer is very high, the computational time for training the network is quite long [12, 40]. Therefore, the number of neurons in the hidden layer is arrived at by a trial and error procedure and the best model that is capable training the given input data is considered. The model with the best results is chosen as the final ANN model.

3.2 Genetic programming (GP)

Genetic programming is very much similar to genetic algorithm (GA), an evolutionary algorithm based on the Darwin’s theory of natural selection and survival of the fittest. But, GP is different in the sense that it operates on parse trees, rather than on bit strings as in GA, to approximate the equation that best describes how the output is related to the input. The algorithm considers an initial population of random generated equations, derived from a random combination of input variables, random numbers and functions. The selection of the function has to be done appropriately to ensure the comprehension of the process. The population of the possible solutions is dependent on an evolutionary process, and then, the ‘fitness’ of the evolved problems is evaluated. Fitness is a measure of how well they solve the problem. The programs that best fit the data are selected from the initial population. These programs then exchange a part of the information between them to produce better programs through the process known as ‘crossover’ followed by ‘mutation,’ which is random changing of programs to create new programs [21]. The user must decide the values of the GP parameters before applying the algorithm to model the data, namely the population size, number of generations, crossover probability, mutation probability, etc. Only the programs that best fitted the given data are considered, while the less fitted ones are discarded. This evolutionary process is repeated over successive generations and is driven toward finding symbolic expressions that describe the data.

3.3 Support vector machine (SVM)

The SVM equations are formulated as per Vapnik’s theory [50, 51], that if {(I1, T1), … (IN, TN)} are assumed as the given training data sets, where IkRn refers to the space of input variable, Tk ∈ R refers to the space of the target value, and N represents the length of the training data. The linear regression of SVM is estimated by solving the equation given below:

$${\text{Minimize}}\;\frac{1}{2}\left\| w \right\|^{2} + C\mathop \sum \limits_{k = 0}^{N} \left( {\xi + \xi^{*} } \right)$$
(1)

Subject to the conditions

$$T_{k} - \left\langle {w,I_{k} } \right\rangle - b \le \varepsilon_{k} + \xi$$
(2)
$$\left\langle {w,I_{k} } \right\rangle + b - \left\langle {w,I_{k} } \right\rangle - b \le \varepsilon_{k} + \xi^{*}$$
(3)
$$\xi_{k} \xi_{k}^{*} \ge 0,\quad k = 1, \ldots ,N$$
(4)

where w is the weight vector, b is a bias, C represents the regularization constant, ε is the error tolerance range of the function, and ξ, ξ* are the slack variables.

3.4 SVM-RBF

The Gaussian RBF kernel is the most successful kernel and has been widely used in many problems. It uses the Euclidean distance between two points in the original space to find the correlation in the augmented space [42].

$$K\left( {x,y} \right) = \exp ( - \gamma \left\| {x - y} \right\|^{2} )$$
(5)

3.5 SVM-QPSO

In quantum physics, the state of a particle with momentum and energy can be depicted by its wave function ψ (x,t). As per QPSO theory, each particle is in a quantum state and is formulated by its wave function ψ (x,t) instead of the position and velocity described by PSO. According to the statistical significance of the wave function, the probability of a particle’s appearing in a certain position can be obtained from the probability density function \(\left| {\psi \left( {x,t} \right)} \right|^{2}\). The probability distribution function of the particle’s position can be calculated through the probability density function. By employing the Monte Carlo method, the particle’s position is updated as per the equation given below:

$$X_{ij}^{t + !} = p_{ij}^{t} \pm 0.5L_{ij}^{t} .\ln \left( {1/u_{ij}^{t} } \right)$$
(6)

where \(u_{ij}^{t}\) is a random number uniformly distributed in (0,1): \(p_{ij}^{t}\) is the local attractor and defined as

$$p_{ij}^{t} = \varphi_{ij}^{t} P_{ij}^{t} \pm \left( {1 - \varphi_{ij}^{t} } \right). P_{gj}^{t}$$
(7)

where \(\varphi_{ij}^{t}\) is a random number uniformly distributed in (0,1), \(P_{gj}^{t}\) is the global best position. The parameter \(L_{ij}^{t}\) is evaluated by

$$L_{ij}^{t} = 2.\beta .\left| {p_{ij}^{t} - X_{ij}^{t} } \right|$$
(8)

where β is called the contraction–expansion coefficient which can be tuned to control the convergence speed of the algorithms. Then, we get the position update equation as

$$X_{ij}^{t + !} = p_{ij}^{t} \pm \beta .\left| {p_{ij}^{t} - X_{ij}^{t} } \right|.\ln \left( {1/u_{ij}^{t} } \right)$$
(9)

The PSO algorithm with position update equation (Eq. 9) is called as the quantum delta-potential-well-based PSO (QDPSO) algorithm. Keeping in view the vital position of L for convergence rate and performance of the algorithm, an improvement was proposed to evaluate parameters L. As per this algorithm, the mean best position (mbest) is defined as the center of pbest positions of the swarm.

$$mbest^{t} = mbest_{1}^{t} ,mbest_{2}^{t} , \ldots ,mbest_{D}^{t} = \left( {\frac{1}{M}\mathop \sum \limits_{i = 1}^{M} P_{i1 }^{t} ,\frac{1}{M}\mathop \sum \limits_{i = 1}^{M} P_{i2}^{t} ,\frac{1}{M}\mathop \sum \limits_{i = 1}^{M} P_{i3}^{t} , \ldots ,\frac{1}{M}\mathop \sum \limits_{i = 1}^{M} P_{ij }^{t} , \ldots ,\frac{1}{M}\mathop \sum \limits_{i = 1}^{M} P_{iD }^{t} } \right)$$
(10)

where M is the population size and Pi is the personal best position of particle i. Further, parameter L is given by

$$L_{ij}^{t} = 2.\beta .\left| {mbest_{ij}^{t} - X_{ij}^{t} } \right|$$
(11)

Hence, the particle’s position is updated according to the following equation:

$$X_{ij}^{t + !} = p_{ij}^{t} \pm \beta .\left| {mbest_{ij}^{t} - X_{ij}^{t} } \right| .\ln \left( {1/u_{ij}^{t} } \right)$$
(12)

The PSO algorithm with Eq. (12) is called as the quantum-behaved particle swarm optimization (QPSO). The most commonly used control strategy of β is to initially setting it to 1 and reducing it linearly to 0.5.

3.6 Extreme learning machine (ELM)

ELM is a three-layered structure algorithm. In the ELM structure, the input weight (connection between the input and hidden) and the bias values (in the hidden layer) are randomly generated. ELM analytically calculates the output weight matrix between hidden layers and output layers through a simple generalized inverse operation of the hidden layer output matrix. ELM can be formulated as a function with L hidden nodes and N training samples as follows:

$$\mathop \sum \limits_{i = 1}^{L} w_{i} g\left( {W_{in\left( i \right) } ,b_{i} ,x_{j} } \right) = \mathop \sum \limits_{i = 1}^{L} w_{i} g\left( {W_{in\left( i \right) } .x_{j} + b_{i} } \right) = y_{j} ,\quad j = 1, \ldots N$$
(13)

where xjRn is the input vector, Win(i)Rn is the input weight vector, Win(i)·xj represents the inner product of Win(i) and xj, biRn represents the bias of the ith hidden node, g(.) denotes the approximation function (sigmoid), wiRn is the output weight matrix, and yjR denotes the simulated output of ELM. In the ELM algorithm, the input weight and bias are randomly chosen at the initial stage. If the ELM model with L hidden nodes is able to learn these N training samples with no residuals, then w can be predicted, such that [17]:

$$\mathop \sum \limits_{i = 1}^{L} w_{i} g\left( {W_{in\left( i \right)} .x_{j} + b_{i} } \right) = t_{j} , \quad j = 1, \ldots , N$$
(14)

where tj represents the target output. The above equation can be further expressed as

$$Aw \, = \, T$$
(15)

where T = [t1, … tN]T is the target vector. The random selection of Win(i) and bi converts the above equation into a linear parameter system such that the minimum norm least-squares solution of the linear parameter system can be further written as

(16)

where is the Moore–Penrose generalized inverse of the hidden layer output matrix A. In practice, is calculated using the singular value decomposition (SVD), and then, the nonzero singular values construct the output weights. However, when L and N are large, the computational complexity of the SVD decomposition impacts the learning speed of the ELM immensely.

3.7 Model performance evaluation

The following statistical indicators were selected in the performance evaluation of ANN, GP, SVM and ELM models:

  1. 1.

    Root-mean-square error (RMSE)

    $${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {P_{i} - O_{i} } \right)^{2} }}{n}}$$
    (17)
  2. 2.

    Pearson correlation coefficient

    $$r = \frac{{n\left( {\mathop \sum \nolimits_{i = 1}^{n} O_{i} P_{i} } \right) - \left( {\mathop \sum \nolimits_{i = 1}^{n} O_{i} } \right).\left( {\mathop \sum \nolimits_{i = 1}^{n} P_{i} } \right)}}{{\sqrt {\left( {n\mathop \sum \nolimits_{i = 1}^{n} O_{i}^{2} - \left( {\mathop \sum \nolimits_{i = 1}^{n} O_{i} } \right)^{2} } \right).\left( {n\mathop \sum \nolimits_{i = 1}^{n} P_{i}^{2} - \left( {\mathop \sum \nolimits_{i = 1}^{n} P_{i} } \right)^{2} } \right)} }}$$
    (18)
  3. 3.

    Coefficient of determination (R2)

    $$R^{2} = \frac{{\left[ {\mathop \sum \nolimits_{i = 1}^{n} \left( {O_{i} - \overline{{O_{\iota } }} } \right).\left( {P_{i} - \overline{{P_{\iota } }} } \right)} \right]^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {O_{i} - \overline{{O_{\iota } }} } \right).\mathop \sum \nolimits_{i = 1}^{n} \left( {P_{i} - \overline{{P_{\iota } }} } \right)}}$$
    (19)
  4. 4.

    Mean absolute error (MAE)

    $${\text{MAE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {P_{i} - O_{i} } \right|$$
    (20)
  5. 5.

    Mean absolute percentage error (MAPE)

    $${\text{MAPE}} = \frac{100\% }{n}\mathop \sum \limits_{i = 1}^{n} \left| {\frac{{O_{i} - P_{i} }}{{O_{i} }}} \right|$$
    (21)

where n is the total number of test data, Oi and Pi are the observed and predicted groundwater level values.

The summary of the usage of data and the bifurcation of the available data into training and testing sets is provided in Table 2.

Table 2 Number of training and validation data sets used in training the models

4 Results and discussion

In this study, the groundwater level at six locations in the Vizianagaram district of Andhra Pradesh, India, has been predicted using different soft computing models.

4.1 Comparison of ANN, GP, SVM and ELM

The performance indices of the groundwater level using ANN, GP, SVM and ELM for the six locations are provided in Tables 3, 4, 5, 6, 7 and 8.

Table 3 Performance indices for groundwater level using ANN, GP, SVM and ELM for Cheepurupalli
Table 4 Performance indices for groundwater level using ANN, GP, SVM and ELM for Gantyada
Table 5 Performance indices for groundwater level using ANN, GP, SVM and ELM for Garugubilli
Table 6 Performance indices for groundwater level using ANN, GP, SVM and ELM for Jiyyammavalasa
Table 7 Performance indices for groundwater level using ANN, GP, SVM and ELM for Pachipenta
Table 8 Performance indices for groundwater level using ANN, GP, SVM and ELM for Vizianagaram

From Tables 3, 4, 5, 6, 7 and 8, it is observed that ELM has the least RMSE values of 0.594, 0.511, 0.537, 0.349, 0.583 and 0.277 for Cheepurupalli, Gantyada, Garugubilli, Jiyyammavalasa, Pachipenta and Vizianagaram, compared to SVM, GP and ANN models. ELM has produced the highest correlation coefficient of 0.981, 0.92, 0.953, 0.971, 0.966 and 0.953 compared to other methods for all the six locations. SVM has performed better compared to GP and ANN models. The performance of GP and ANN is very much similar. The MAPE is the least for ELM model for all the locations in this study, except at Garugubilli. The observed and predicted results obtained using ANN, GP, SVM and ELM have been plotted for each location (Figs. 2, 3, 4, 5, 6, 7) in the form of scatter plots.

Fig. 2
figure 2

Groundwater level prediction at Cheepurupalli using a ANN, b GP, c SVM and d ELM

Fig. 3
figure 3

Groundwater level prediction at Gantyada using a ANN, b GP, c SVM and d ELM

Fig. 4
figure 4

Groundwater level prediction at Garugubilli using a ANN, b GP, c SVM and d ELM

Fig. 5
figure 5

Groundwater level prediction at Jiyyammavalasa using a ANN, b GP, c SVM and d ELM

Fig. 6
figure 6

Groundwater level prediction at Pachipenta using a ANN, b GP, c SVM and d ELM

Fig. 7
figure 7

Groundwater level prediction at Vizianagaram using a ANN, b GP, c SVM and d ELM

From the above Figs. 2, 3, 4, 5, 6 and 7, it is observed that the value of R2 is highest for ELM model. The predicted values are lying much closer to the measured values in the ELM model, whereas in the other models, the predicted values are scattered. This shows that the groundwater level values have been more accurately predicted by the ELM model, while there are cases of under and over prediction in the other models. The GP and ANN model performance is quite similar to each other. The results also indicate the ELM model is capable of predicting the nonlinear behavior of the groundwater levels at various sites. This is the main advantage of using soft computing tools compared to traditional prediction models. The results obtained this study are in corroboration with the conclusions of Yadav et al. [53] that ELM performs better than SVM. We have also found that SVM performs better compared to GP and ANN. Moreover, GP and ANN prediction performance is similar.

4.2 Comparison of SVM hybrid models with ELM

The performance indices of the groundwater level using SVM, SVM-QPSO, SVM-RBF and ELM for the six locations are provided in Tables 9, 10, 11, 12, 13 and 14.

Table 9 Performance indices for groundwater level using SVM, SVM-QPSO, SVM-RBF and ELM for Cheepurupalli
Table 10 Performance indices for groundwater level using SVM, SVM-QPSO, SVM-RBF and ELM for Gantyada
Table 11 Performance indices for groundwater level using SVM, SVM-QPSO, SVM-RBF and ELM for Garugubilli
Table 12 Performance indices for groundwater level using SVM, SVM-QPSO, SVM-RBF and ELM for Jiyyammavalasa
Table 13 Performance indices for groundwater level using SVM, SVM-QPSO, SVM-RBF and ELM for Pachipenta
Table 14 Performance indices for groundwater level using SVM, SVM-QPSO, SVM-RBF and ELM for Vizianagaram

From Tables 9, 10, 11, 12, 13 and 14; it is observed that SVM-QPSO performance is far better when compared to SVM and SVM-RBF in terms of the RMSE values. This is in corroboration with Sudheer et al. [45]. Even though its performance is exceptional when compared to SVM and SVM-RBF, it is marginally under par when compared to ELM in predicting the groundwater levels at different locations in the district. ELM has the least RMSE values of 0.594, 0.511, 0.537, 0.349, 0.583 and 0.277 for Cheepurupalli, Gantyada, Garugubilli, Jiyyammavalasa, Pachipenta and Vizianagaram. Based on MAPE, SVM-QPSO has performed marginally better in predicting the groundwater levels at Garugubilli, Jiyyammavalasa and Pachipenta. Therefore, ELM cannot be always considered superior when compared to SVM-QPSO. The observed and predicted results obtained using SVM-QPSO, SVM-RBF, SVM and ELM has been plotted for each location (Figs. 8, 9, 10, 11, 12, 13).

Fig. 8
figure 8

Groundwater level prediction at Cheepurupalli using a SVM, b SVM-QPSO, c ELM and d SVM-RBF

Fig. 9
figure 9

Groundwater level prediction at Gantyada using a SVM, b SVM-QPSO, c ELM and d SVM-RBF

Fig. 10
figure 10

Groundwater level prediction at Garugubilli using a SVM, b SVM-QPSO, c ELM and d SVM-RBF

Fig. 11
figure 11

Groundwater level prediction at Jiyyammavalasa using a SVM, b SVM-QPSO, c ELM and d SVM-RBF

Fig. 12
figure 12

Groundwater level prediction at Pachipenta using a SVM, b SVM-QPSO, c ELM and d SVM-RBF

Fig. 13
figure 13

Groundwater level prediction at Vizianagaram using a SVM, b SVM-QPSO, c ELM and d SVM-RBF

From the above Figs. 8, 9, 10, 11, 12 and 13, it is observed that the value of R2 is highest for ELM model except at Garugubilli, Jiyyammavalasa and Pachipenta where the R2 values of SVM-QPSO are on par with the R2 values of ELM. The predicted values are lying much closer to the measured values in the ELM and SVM-QPSO models unlike other models. This shows that ELM and SVM-QPSO have the capability of predicting the groundwater levels quite accurately compared to other models considered in this study. Thus, ELM and SVM-QPSO models will enable the hydrologists to predict the groundwater level in the future based on the observations recorded in the past.

5 Conclusions

Groundwater level prediction is mandatory to understand and resolve various water management issues. In this study, groundwater level has been predicted for six locations in the Vizianagaram district of the state of Andhra Pradesh, India, using six soft computing tools, namely ANN, GP, SVM, ELM, SVM-QPSO and SVM-RBF. Five different statistical indicators have been used for the comparison of the models, namely RMSE, r, R2, MAE and MAPE. In the first phase of this study, only ANN, GP, SVM and ELM were compared. The performance of ELM is found to be the best compared SVM, ANN and GP models as RMSE is the least for the ELM model. SVM performed better than GP and ANN. The performance of GP and ANN is similar. The second phase of this study is aimed at improving the performance of SVM by using hybrid SVM models, namely SVM-QPSO and SVM-RBF. The performance of SVM-QPSO is found to far better when compared to SVM and SVM-RBF. Moreover, the RMSE values have been found to be the least for ELM compared to other models. On the basis of MAPE, SVM-QPSO is found to perform on par with ELM in predicting the groundwater level for some locations in this study.