Introduction

Field capacity (FC) and permanent wilting point (PWP) are two vital parameters in irrigation, agriculture, and the study of water and minerals within the soil (Rab et al. 2011). The definition of FC, as slightly modified in the glossary of soil science (SSSA 1984), is the amount of water remaining in a soil 2–3 days after the excess water has drained away, or the water content at a soil suction of − 33 kPa. This state is usually reached several days after precipitation or irrigation in a uniformly structured soil. PWP, on the other hand, is defined as the water content below which plants cannot extract water from the soil profile. It represents the lower limit of plant-available water, which is retained by the soil particles under a tension of 1500 kPa (Slatyer 1967). Thereby, FC and PWP are the two moisture parameters needed to calculate the water available for irrigation.

Hence, for a relatively small area with an acceptable homogeneity, in terms of soil physicochemical properties, it would be possible to gain a good approximation of the moisture by performing an adequate number of costly and time-consuming field and lab experiments (Veihmeyer and Hendrickson 1949; Keshavarzi et al. 2012). On the other hand, other properties related to textural characteristics of the soil are valuable in defining hydraulic properties, and simulation of the deep and subsurface flow in modeling the movement of the water in the soil.

In this respect, characteristics such as the available water in a soil sample, i.e., the difference between FC and PWP, can be used to describe the water retention capacity, which is essential information in irrigation management, modeling of water movement in the soil, rainfall-runoff simulation, and environmental management (Pachepsky and Rawls 2003; Wosten et al. 2001). In practice, FC and PWP are physically measured together with other soil properties such as dry bulk density, the sand, silt, and clay content, and organic matter (OM). However, measuring FC, PWP, and several soil properties such as dry bulk density at large scale is very costly (Mohanty et al. 2015) and time-consuming, although many researchers consider them vital in evaluating soil properties. Therefore, researchers usually prefer to estimate FC and PWP with simpler techniques, e.g., pedotransfer functions, that are accurate enough to bypass the need for costly measurements.

For instance, since measuring silt, sand, clay, and OM is less costly and more convenient, Ghorbani et al. (2017) suggested using artificial intelligence methods with silt, sand, clay, and OM data to estimate the FC and PWP of soils. Likewise, scientists are trying to provide alternative ways to evaluate FC and PWP using other optimization or machine learning techniques. Notably, some studies reported that clay, sand, and silt content together with OM are effective in predicting FC and PWP with either parametric or nonparametric modeling techniques (e.g., Bishop and McBratney 2001; Khosla et al. 2002; Mzuku et al. 2005; Liu et al. 2006; Merdun et al. 2006).

The nonparametric nature of artificial intelligence techniques is a significant advantage in this respect, since they do not require a conceptual approach (Moazenzadeh et al. 2018; Moazenzadeh and Mohammadi 2019; Mohammadi 2019a, b; Aghelpour et al. 2019). Such developments were found to be successful by Minasny and McBratney (2002), Sarmadian et al. (2009), Keshavarzi et al. (2010), Rab et al. (2011), Jafarnejadi et al. (2012), and Mohanty et al. (2015), while Sarmadian and Taghizadeh (2008), Moazenzadeh et al. (2019), Jahani and Mohammadi (2018), and Moazenzadeh and Mohammadi (2019) used these techniques in studying FC and PWP.

Despite their widespread application, these models also have significant drawbacks. Their primary disadvantage is the dependence on the tuning parameters of the learning process, while the main concern is their predictability and performance in action (Chen et al. 2017). Trial and error is usually used in parameter estimation, but it can be time-consuming and sometimes gives unrealistic estimates (Ghorbani et al. 2017). Most recently, meta-heuristic optimization algorithms have proved a considerable help in alleviating the difficulties in parameterization of these models (Kisi et al. 2015). These algorithms also automate the parameter estimation and improve the model performance. Hence, various biologically inspired meta-heuristic algorithms, which imitate biological phenomena, have been invented to cope with the optimization issue (Mirjalili and Lewis 2016).

Along with the development of soil moisture models, the invention and spread of more efficient computers and machine learning techniques accelerated the application of these approaches, although drawbacks remained in practice. For instance, linear models have to face nonlinearity, while dynamical approaches have to deal with the curse of dimensionality and state-space discretization. On the other hand, nonlinear solutions tend to become trapped in local extremes, while stochastic methods have to cope with large-scale changes and randomness. Thereby, real-world problems such as defining FC and PWP started to benefit from nonparametric and rank-based approaches (Ghorbani et al. 2017).

Herein, the whale optimization algorithm (WOA) is proposed as an optimizing method at the core of an artificial neural network (ANN) model. The aim of this study is to develop a hybrid model by coupling the whale optimization algorithm with an ANN (ANN-WOA). The portions of clay, silt, and sand together with the measured OM of the soil samples were used to predict FC and PWP. In addition, the performance of the suggested model was evaluated against a basic ANN and multilinear regression (MLR) in predicting FC and PWP, to validate the applicability of the ANN-WOA in practice.

Materials and methods

Study area

The performance of the models was tested in a real-life case using samples taken from soil profiles across the East and West Azerbaijan provinces in the northwest of Iran, covering more than 50,000 km2 (the East Azerbaijan province covers 45,650 km2 and the West Azerbaijan province 34,437 km2). The study area partially covers four important water basins in Iran: the Lake Urmia, Caspian Sea, Persian Gulf-Oman Sea, and Central Iran basins. The study area has a cold semi-arid climate, and the average annual precipitation, particularly around Lake Urmia, is about 300 mm/year (Vaheddoost and Aksoy 2017). The climate of this area is largely influenced by air fronts coming from the Atlantic Ocean and the Mediterranean, while the highest and lowest temperatures are about 35 °C (July) and − 17 °C (January), respectively.

Figure 1 depicts the study area as well as the locations of the samples taken from its basins and sub-basins.

Fig. 1 Study area including the East and West Azerbaijan provinces in northwest of Iran

Data used

The data used in this study were obtained between October and November 2016 from 217 soil samples of nonirrigated lands scattered across the East Azerbaijan and West Azerbaijan provinces in the northwest of Iran (Fig. 1). The portions of clay (< 0.002 mm), silt (0.002–0.05 mm), and sand (0.05–2 mm) in the samples were then determined using the hydrometer method. The organic matter (OM) of the soil samples was measured by the Walkley–Black method (Nelson and Sommers 1982), while the soil moisture was determined at − 10 kPa (FC) for undisturbed samples and at − 1500 kPa (PWP) for disturbed samples using ceramic plate bubble-tower suction tables. The soil samples were taken from the depths used for agriculture (10–30 cm). The interested reader may refer to Romano et al. (2002) for more details.

In this respect, the statistical characteristics of these samples are given in Table 1.

Table 1 Statistical characteristics of variables

For a better overview of the relationship of clay (< 0.002 mm), silt (0.002–0.05 mm), sand (0.05–2 mm), and OM, as independent variables, with FC and PWP, scatter plots (Figs. 2 and 3) are used together with the correlation matrix of the paired observations (Table 2). Although the correlation matrix could be used to select the most reliable variable in terms of linear relationships, the possibly nonlinear or dynamic nature of the relations was also taken into account through curve fitting. The analysis in Figs. 2 and 3 shows that although sand, silt, and clay have poor linear relationships with FC and PWP (Table 2), a nonlinear relationship between these variables remains quite possible. This relationship, particularly between sand and FC or PWP, converges as the amount of clay increases. The other panels of Figs. 2 and 3 also show deviations from the mean in all samples, which indicates a large variety in the soil profiles.

Fig. 2 Relationship of FC with clay, silt, sand, and OM

Fig. 3 Relationship of PWP with clay, silt, sand, and OM

Table 2 Correlation matrix of variables

Based on Table 2, the independent parameter most correlated with FC and PWP is OM. The same result is confirmed by Figs. 2 and 3, in which second-degree curve fittings are used to evaluate the relationship of OM with FC and PWP, with R2 of 0.47 and 0.57, respectively. The other parameters, i.e., clay, sand, and silt, showed a more random relationship, and neither a linear (Table 2) nor a second-degree nonlinear (Figs. 2 and 3) relation could fully explain the relationship between the dependent and independent variables. This shortcoming is expected to be addressed by the ANN-WOA, which is the main motivation for the present study.
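The second-degree curve fitting mentioned above can be illustrated with a short sketch. It assumes NumPy and uses synthetic OM and FC values generated only for illustration; they are not the study's samples, and the function name `quadratic_r2` is hypothetical.

```python
import numpy as np

def quadratic_r2(x, y):
    """Fit y = c2*x^2 + c1*x + c0 by least squares; return (coeffs, R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, deg=2)        # coefficients, highest power first
    y_hat = np.polyval(coeffs, x)           # fitted values
    ss_res = np.sum((y - y_hat) ** 2)       # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return coeffs, 1.0 - ss_res / ss_tot

# Synthetic data for illustration only (NOT the measured samples).
rng = np.random.default_rng(0)
om = rng.uniform(0.2, 4.0, size=50)                          # OM (%)
fc = 15.0 + 6.0 * om - 0.5 * om**2 + rng.normal(0.0, 2.0, 50)  # FC (%)

coeffs, r2 = quadratic_r2(om, fc)
print(coeffs, r2)
```

The same function could be applied to each input in turn (clay, silt, sand, OM) to compare how much variance a second-degree curve explains for FC and for PWP.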

Since OM is the footprint of living organisms, it is expected to be found within the soil structure of the study area, which comprises wetland and agricultural zones. OM, together with micro-organisms, participates in binding soil particles, resulting in a more aggregated, well-structured soil that performs better in aeration, water infiltration, and resistance to erosion. Hence, OM, which has the strongest linear and nonlinear relationship with the outputs, can be recognized for its availability and importance in the study and prediction of FC and PWP.

Since the data analysis indicated a potential lack of fit in parameter estimation (i.e., the low R2 values in Figs. 2 and 3), nonparametric methods were also used in this study to deal with the rank and the presence of outliers in the sample data. To this end, the 217 sample sets were divided randomly into two subsets, training and validation: 174 of the 217 observation sets (80%) were used as training samples, while the remaining 43 (20%) were used to validate the models. The random selection was intended to avoid overfitting and undefined conditions. Then, all of the observation data sets were normalized using

$$ {X}_n=\frac{X_i-{X}_{\mathrm{min}}}{X_{\mathrm{max}}-{X}_{\mathrm{min}}} $$
(1)

where the normalized value, Xn, is calculated from the observed value, Xi, together with the maximum (Xmax) and minimum (Xmin) observed values of each data set, to reduce the effect of dimensionality and outliers.
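A minimal sketch of the random 174/43 split and the min-max normalization of Eq. 1 is given below. It assumes NumPy; the array `data` is a random placeholder with one column per variable (clay, silt, sand, OM, FC, PWP), and the function names are hypothetical rather than taken from the study's Matlab code.

```python
import numpy as np

def min_max_normalize(X):
    """Apply Eq. 1 column by column: Xn = (Xi - Xmin) / (Xmax - Xmin)."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # Assumes no column is constant (otherwise Xmax - Xmin would be zero).
    return (X - x_min) / (x_max - x_min), x_min, x_max

def random_split(n_samples, n_train, seed=0):
    """Return disjoint, randomly chosen training and validation indices."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    return idx[:n_train], idx[n_train:]

# Placeholder data for illustration only (NOT the measured samples):
# 217 rows, columns = clay, silt, sand, OM, FC, PWP.
rng = np.random.default_rng(1)
data = rng.uniform(0.0, 60.0, size=(217, 6))

train_idx, valid_idx = random_split(217, 174)   # 174 training, 43 validation
data_n, x_min, x_max = min_max_normalize(data)  # all observations normalized
train, valid = data_n[train_idx], data_n[valid_idx]
print(train.shape, valid.shape)
```

Keeping `x_min` and `x_max` allows model outputs to be mapped back to physical units by inverting Eq. 1.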

Several nonparametric models are used, while the clay, sand, silt, and OM data were selected as independent variables (i.e., inputs of the model) in estimation of FC and PWP separately (i.e., outputs). For this aim, Matlab codes were developed and the ANN, ANN-WOA, and MLR models were obtained.

Multi linear regression

Linear regression, as a parametric approach, was found to be useful in previous studies. In this method, the goal is to determine the coefficients αi and an intercept c that define the relationship between the dependent and independent variables as

$$ y=\sum \limits_{i=1}^n{\alpha}_i{x}_i+c $$
(2)

where n is the number of independent variables. The coefficients of the MLR are obtained by minimizing the difference between the observed values and the model outputs using the ordinary least squares approach.
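As an illustration of Eq. 2, the following sketch estimates the coefficients and the intercept by ordinary least squares with NumPy's `lstsq`. The data are synthetic and the function names `fit_mlr` and `predict_mlr` are hypothetical; this is one possible implementation, not the study's Matlab code.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares for y = sum_i(alpha_i * x_i) + c (Eq. 2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([X, np.ones(len(X))])      # last column -> intercept c
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes ||A @ beta - y||
    return beta[:-1], beta[-1]                     # (alpha, c)

def predict_mlr(X, alpha, c):
    """Evaluate Eq. 2 for every row of X."""
    return np.asarray(X, dtype=float) @ alpha + c

# Synthetic, noise-free data for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(20, 4))       # four inputs
true_alpha = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = X_demo @ true_alpha + 5.0

alpha, c = fit_mlr(X_demo, y_demo)
print(alpha, c)
```

If the clay, silt, and sand fractions of every sample sum to a constant, their columns are collinear with the intercept column; `lstsq` then returns the minimum-norm least-squares solution instead of failing, so individual coefficients should be interpreted with care in that case.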

Artificial neural networks

ANN models are robust nonlinear modeling techniques that establish links between input and output variables via weights and activation functions (Mohammadi 2020). In this study, Matlab is used to implement and train feed-forward back-propagation neural networks with a variety of activation functions, numbers of neurons, and hidden layers. A multi-layer perceptron (MLP) approach is used in parameter optimization of the models. Hence, a three-layered MLP trained with the Levenberg–Marquardt back-propagation algorithm was used, with tangent sigmoid and linear transfer functions in the hidden and output layers, respectively. The interested reader may also refer to the recent studies of Jahani and Mohammadi (2018) and Moazenzadeh and Mohammadi (2019) for more information about the application of ANN models.

Whale optimization algorithm

The whale optimization algorithm was introduced by Mirjalili and Lewis (2016) to solve optimization problems using an evolutionary method. The WOA is inspired by the bubble-net feeding behavior of humpback whales, which hunt small fish and other marine creatures by creating bubbles along circles. In the WOA, the target prey is considered the best solution, and the position of the humpback whales around the prey is formulated as (Mirjalili and Lewis 2016):

$$ \overrightarrow{X}\left(t+1\right)={\overrightarrow{X}}^{\ast }(t)-\overrightarrow{A}.\left|\overrightarrow{C}.{\overrightarrow{X}}^{\ast }(t)-\overrightarrow{X}(t)\right| $$
(3)

where t is the running iteration, \( \overrightarrow{X} \) is the location vector of the whale, and X* is the location vector of the best solution and updated if there is a better solution, while the \( \overrightarrow{A}=2\overrightarrow{a}.\overrightarrow{r}-\overrightarrow{a} \) and \( \overrightarrow{C}=2.\overrightarrow{r} \) are coefficient vectors to be estimated. In this respect, \( \overrightarrow{a} \) is linearly reduced from 2 to 0 as iteration proceeds, while \( \overrightarrow{r} \) is a randomly selected vector ∈ [0,1].

Bubble-net attacking approach includes (i) a shrinking encircling which is represented by reduction in \( \overrightarrow{a} \) and \( \overrightarrow{A} \), together with (ii) the spiral updating position that is employed to imitate the spiral motion of the whales in periphery of the hunt by calculating the space between the hunt (X*, Y*) and hunter (X, Y):

$$ \overrightarrow{X}\left(t+1\right)={\overrightarrow{D}}^{\prime }.{e}^{bl}.\cos \left(2\pi l\right)+{\overrightarrow{X}}^{\ast }(t) $$
(4)

In this respect, \( {\overrightarrow{D}}^{\prime }=\left|{\overrightarrow{X}}^{\ast }(t)-\overrightarrow{X}(t)\right| \) defines the space between the hunt and the ith whale, b is a constant that determines the shape of the logarithmic helix-shaped motion, and l is a random number ∈ [− 1,1]. Thereby, the motion of the whales around the hunt is conceptualized along spiral-shaped paths by shrinking the circles towards the prey (i.e., the goal). The following mathematical model is used to conceptualize the whale behavior,

$$ \overrightarrow{X}\left(t+1\right)=\left\{\begin{array}{ll}{\overrightarrow{X}}^{\ast }(t)-\overrightarrow{A}.\overrightarrow{D} & \mathrm{if}\ p<0.5\\ {\overrightarrow{D}}^{\prime }.{e}^{bl}.\cos \left(2\pi l\right)+{\overrightarrow{X}}^{\ast }(t) & \mathrm{if}\ p\ge 0.5\end{array}\right. $$
(5)

where \( \overrightarrow{D}=\left|\overrightarrow{C}.{\overrightarrow{X}}^{\ast }(t)-\overrightarrow{X}(t)\right| \) is the distance term of Eq. 3, and p is a random number ∈ [0,1] that determines whether a whale updates its location by shrinking encircling or by the spiral motion. In the searching phase (exploration), the humpback whales search for a hunt randomly, according to the location of the other whales (Kaveh and Ghazaan 2017). Hence, the whales update their location with respect to a randomly selected search agent, instead of the best search agent, as

$$ \overrightarrow{D}=\left|\overrightarrow{C}.{\overrightarrow{X}}_{rand}-\overrightarrow{X}\right| $$
(6)
$$ \overset{\rightharpoonup }{X}\left(t+1\right)={\overrightarrow{X}}_{rand}-\overrightarrow{A}.\overrightarrow{D} $$
(7)

where \( {\overrightarrow{X}}_{rand} \) signifies a random position determined from the current population. Some of the most important parameters in WOA algorithm are maximum number of iterations (MaxIt), number of whales (nPop0), minimum limit for generating unit (Pi), total losses (r), total load demand (Mutation rate), up coefficient vector (A), and down coefficient vector (C). The interested reader may also refer to Mirjalili and Lewis (2016) and Aljarah et al. (2018) for more details.
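Equations 3 to 7 can be combined into a compact sketch of the WOA. The version below assumes NumPy and minimizes a simple test function; the switch between encircling the best solution (|A| < 1) and random exploration (|A| ≥ 1) follows the formulation of Mirjalili and Lewis (2016), while the function name, default settings, bound clipping, and the immediate update of the best solution are illustrative assumptions, not the settings of this study.

```python
import numpy as np

def woa_minimize(f, dim, bounds, n_whales=20, max_iter=200, b=1.0, seed=0):
    """Minimal WOA sketch: returns (best position, best objective value)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(n_whales, dim))   # initial whale positions
    values = np.array([f(x) for x in X])
    best = X[values.argmin()].copy()                # X* in Eqs. 3-5
    best_val = float(values.min())
    for t in range(max_iter):
        a = 2.0 * (1.0 - t / max_iter)              # decreases linearly from 2
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a          # A = 2a.r - a
            C = 2.0 * rng.random()                  # C = 2.r
            p = rng.random()
            l = rng.uniform(-1.0, 1.0)
            if p < 0.5:
                if abs(A) < 1.0:                    # shrinking encircling (Eq. 3)
                    D = np.abs(C * best - X[i])
                    X[i] = best - A * D
                else:                               # exploration (Eqs. 6 and 7)
                    x_rand = X[rng.integers(n_whales)].copy()
                    D = np.abs(C * x_rand - X[i])
                    X[i] = x_rand - A * D
            else:                                   # spiral update (Eq. 4)
                D_prime = np.abs(best - X[i])
                X[i] = D_prime * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best
            X[i] = np.clip(X[i], lo, hi)            # keep inside the search box
            val = f(X[i])
            if val < best_val:                      # update X* when improved
                best_val = float(val)
                best = X[i].copy()
    return best, best_val

# Illustration on the sphere function, whose minimum is 0 at the origin.
best, best_val = woa_minimize(lambda x: float(np.sum(x ** 2)),
                              dim=3, bounds=(-5.0, 5.0))
print(best, best_val)
```

Scalar values of A and C are drawn once per whale and iteration here; drawing them per dimension is an equally plausible reading of the vector notation in Eqs. 3 to 7.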

The hybrid model

The ANN model does not require complicated calculations, but its network weights need to be adjusted and its neurons coordinated, and the optimization may converge locally. One of the novelties of this study is to apply the recently developed WOA in a hybrid ANN-WOA model to achieve rapid and efficient weight estimation for PWP and FC prediction at the basin scale.

In the hybrid model, the weights and the bias of each neuron in the ANN are optimized using the WOA, and the performance of each candidate solution is determined from the resulting ANN output. The ANN-WOA stops when a satisfactory fit is reached or the maximum number of iterations is exceeded. Hence, the hybrid can be regarded as an estimation technique that utilizes the capabilities of both the ANN and the optimization algorithm. The flow chart of the ANN-WOA is given in Fig. 4.

Fig. 4 Flow chart of the ANN-WOA model used
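One way to couple an optimizer such as the WOA with an ANN is to treat all weights and biases as a single position vector and to use the training error as the objective. The sketch below assumes NumPy, a three-layered MLP with a hyperbolic tangent hidden layer and a linear output, and the mean squared error as objective; the vector layout, the objective, and all function names are assumptions for illustration, since the text does not specify them.

```python
import numpy as np

def n_params(n_in, n_hidden):
    """Length of the position vector: hidden weights and biases, output weights and bias."""
    return n_in * n_hidden + n_hidden + n_hidden + 1

def decode(theta, n_in, n_hidden):
    """Split a flat position vector into the MLP weights and biases."""
    theta = np.asarray(theta, dtype=float)
    i = 0
    W1 = theta[i:i + n_in * n_hidden].reshape(n_in, n_hidden)
    i += n_in * n_hidden
    b1 = theta[i:i + n_hidden]
    i += n_hidden
    W2 = theta[i:i + n_hidden]
    i += n_hidden
    b2 = theta[i]
    return W1, b1, W2, b2

def mlp_predict(theta, X, n_hidden):
    """Forward pass: tanh hidden layer followed by a linear output neuron."""
    X = np.asarray(X, dtype=float)
    W1, b1, W2, b2 = decode(theta, X.shape[1], n_hidden)
    hidden = np.tanh(X @ W1 + b1)
    return hidden @ W2 + b2

def fitness(theta, X, y, n_hidden):
    """Objective an optimizer would minimize: mean squared error (assumed)."""
    return float(np.mean((mlp_predict(theta, X, n_hidden) - y) ** 2))

# Placeholder data for illustration only: 4 inputs (clay, silt, sand, OM).
rng = np.random.default_rng(0)
X_demo = rng.random((10, 4))
y_demo = rng.random(10)
theta0 = np.zeros(n_params(4, 5))       # 4 inputs, 5 hidden neurons -> 31 values
print(fitness(theta0, X_demo, y_demo, n_hidden=5))
```

A population-based optimizer would then evaluate `fitness` at each whale's position and keep the position with the lowest error as the trained network.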

Performance criteria and evaluation methods

Several performance criteria are used to evaluate and compare the results of the models. The determination coefficient (R2), root mean square error (RMSE), and root relative mean square error (RRMSE) are used to identify the model with the best fit and the lowest errors. The determination coefficient evaluates the goodness of fit between the observations (validation set) and the model results as

$$ {R}^2={\left(\frac{1}{n}\times \frac{\sum \left({x}_i-\overline{x}\right)\left({y}_i-\overline{y}\right)}{\left({\sigma}_x\right)\left({\sigma}_y\right)}\right)}^2 $$
(8)

where n is the number of data, x and y are observed and estimated values, and σx and σy are the standard deviation of the observed and estimated data.

Other performance criteria of RMSE and RRMSE can also be evaluated as

$$ \mathrm{RMSE}=\sqrt{\frac{\sum {\left({x}_i-{y}_i\right)}^2}{n}} $$
(9)
$$ \mathrm{RRMSE}=\frac{\sqrt{\frac{1}{n}{\sum}_{i=1}^n{\left({x}_i-{y}_i\right)}^2}}{\sum_{i=1}^n{x}_i}\times 100 $$
(10)
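The three criteria in Eqs. 8 to 10 can be computed as in the following sketch, which assumes NumPy and uses hypothetical function names. The `rrmse` function follows Eq. 10 exactly as printed, i.e., the RMSE divided by the sum of the observations; this is pointed out because other definitions of the relative error normalize by the mean of the observations instead.

```python
import numpy as np

def r_squared(x, y):
    """Eq. 8: squared Pearson correlation between observed x and estimated y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.sum((x - x.mean()) * (y - y.mean())) / x.size
    return (cov / (x.std() * y.std())) ** 2     # population std (ddof=0)

def rmse(x, y):
    """Eq. 9: root mean square error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))

def rrmse(x, y):
    """Eq. 10 as printed: RMSE divided by the sum of observations, times 100."""
    return rmse(x, y) / float(np.sum(np.asarray(x, dtype=float))) * 100.0

# Small illustrative vectors (not study data): every estimate is off by +1.
observed = np.array([1.0, 2.0, 3.0, 4.0])
estimated = observed + 1.0
print(r_squared(observed, estimated), rmse(observed, estimated),
      rrmse(observed, estimated))
```

In this example the estimates are perfectly correlated with the observations, so R2 equals 1 although the RMSE is not zero, which is why the criteria are reported together.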

Results and discussion

Selecting the input variables is a crucial step in building a strong model. Here, a new hybrid model called ANN-WOA is evaluated against ANN and MLR models in estimating FC and PWP at the basin scale. Feed-forward back-propagation neural networks with the Levenberg–Marquardt training algorithm are employed as the ANN models, with tangent sigmoid and linear activation functions in the hidden and output layers, respectively. A trial and error procedure has been used to determine the number of hidden neurons and to obtain the most accurate model with the least possible error (Deo and Şahin 2016). Then, the 174 training sample sets are used to train the models.

The best MLR models for FC and PWP were respectively obtained as

$$ FC\ \left(\%\right)=-5.666+\left(0.308\ast Clay\ \left(\%\right)\right)+\left(0.295\ast Silt\left(\%\right)\right)+\left(0.158\ast Sand\left(\%\right)\right)+\left(3.212\ast OM\left(\%\right)\right) $$
(11)
$$ PWP\ \left(\%\right)=-62.143+\left(0.761\ast Clay\ \left(\%\right)\right)+\left(0.707\ast Silt\left(\%\right)\right)+\left(0.723\ast Sand\left(\%\right)\right)+\left(1.791\ast OM\left(\%\right)\right) $$
(12)
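As a worked illustration of Eqs. 11 and 12, the sketch below evaluates both regressions for a single hypothetical soil sample (30% clay, 40% silt, 30% sand, and 1.5% OM). These input values are invented for the example and are not taken from the data set; the coefficients are those printed in Eqs. 11 and 12.

```python
def fc_mlr(clay, silt, sand, om):
    """Eq. 11: FC (%) from clay, silt, sand, and OM (all in %)."""
    return -5.666 + 0.308 * clay + 0.295 * silt + 0.158 * sand + 3.212 * om

def pwp_mlr(clay, silt, sand, om):
    """Eq. 12: PWP (%) from clay, silt, sand, and OM (all in %)."""
    return -62.143 + 0.761 * clay + 0.707 * silt + 0.723 * sand + 1.791 * om

# Hypothetical sample, chosen only to illustrate the arithmetic.
sample = dict(clay=30.0, silt=40.0, sand=30.0, om=1.5)

fc = fc_mlr(**sample)      # -5.666 + 9.24 + 11.8 + 4.74 + 4.818
pwp = pwp_mlr(**sample)    # -62.143 + 22.83 + 28.28 + 21.69 + 2.6865
available_water = fc - pwp # difference between FC and PWP

print(round(fc, 4), round(pwp, 4), round(available_water, 4))
# prints 24.932 13.3435 11.5885
```

For this sample the MLR models give an FC of about 24.9% and a PWP of about 13.3%, so the available water, taken as the difference between FC and PWP, is about 11.6%.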

The determination coefficients of the MLR models in Eqs. 11 and 12 are 0.66 and 0.59, respectively, which are mediocre results and agree with the analysis given in Figs. 2 and 3. Parameter setting is an important part of the machine learning modeling process (Mohammadi 2019c, d), so the best ANN model was obtained using a three-layered MLP with tangent sigmoid and linear activation functions in the hidden and output layers, respectively. Up to 1000 iterations were used in optimization, while the optimum number of neurons was found by trial and error, adding neurons incrementally up to a maximum of 30. The hybrid ANN-WOA model was then optimized and calibrated using the details given in Table 3.

Table 3 Parameters of WOA algorithm in calibration of hybrid ANN-WOA model

In this respect, the results in Tables 3 and 4 show that the ANN-WOA is the best model in both the training and validation stages, owing to the WOA core, which helps the model converge faster and more accurately while preventing it from being trapped in local extrema. Similar results are obtained in estimation of the PWP (Table 4).

Table 4 Results of the best model fitting in estimation of FC and PWP

The scatter plots of all models with 95% and 90% confidence bands are also given (Fig. 5). Again, the lowest discrepancy and the highest likelihood are associated with the ANN-WOA (Fig. 5): fewer values fall outside the confidence limits, while most of the points are located near the perfect fit line, y = x. The shrinking encircling mechanism, together with the spiral updating of the position vectors towards the prey (i.e., the best result), appears to be an efficient way of obtaining the weights of the ANN model. Compared with back propagation (in ANN) or least squares (in MLR), the WOA results suggest that reducing the distance by encircling and then settling on the global extremum is an efficient approach to function approximation.

Fig. 5 Scatter plot of the best models for ANN, ANN-WOA, and MLR in estimating (a) FC and (b) PWP

In general, all models performed well in recognizing the pattern of the relationship between the independent variables and FC or PWP. It is concluded that the hybrid ANN-WOA has the upper hand, which makes it more satisfactory in practice. Since ANN and ANN-WOA are nonparametric models, it is essential to update the core algorithm of the models regularly to make sure that the best performance is obtained each time.

Figures 6 and 7 show the distribution of the estimated and observed data in the validation sample set. These figures illustrate the probability mass function of the ANN-WOA, which is located within the 25% and 75% quartiles of the observation set. However, the ANN-WOA gives a more accurate estimation for FC than for PWP. Based on Fig. 6, the minima, maxima, and standard deviations of the models vary differently, so they cannot be used with confidence in assessing the models. The most dependable criterion is the median of the samples. As shown in the box plots of Fig. 6 and the histograms of Fig. 7, the median of the samples predicted by the ANN-WOA shows the smallest discrepancy from the observed values in the validation data set. Thereby, in line with the concept of a nonparametric approach, the median of the samples seems to be more dependable than the other statistics of the distribution. Figure 7 also shows that the MLR and ANN overestimate FC (Fig. 7a) and PWP (Fig. 7b) in practice. Similar results can be obtained from Fig. 5, where points fall outside the 90% or 95% confidence limits and overestimation by the MLR and ANN is evident. Based on the convergence of the WOA, the circular arcs and displaced centers used in its function approximation give more promising results.

Fig. 6 Box plot of the validation phase of the models in estimation of FC and PWP

Fig. 7 Histograms of the predicted FC and PWP for all models

In brief, the relationships of sand, silt, clay, and OM with FC given in Fig. 2 depict OM as the most important variable in prediction of FC. These results were also confirmed in Table 2, where the Pearson's correlation coefficients of sand, silt, clay, and OM with FC are 0.07, 0.03, − 0.11, and 0.77, respectively. The evaluation for PWP in Fig. 3 showed similar results. In particular, none of the variables except OM could predict the PWP effectively. The Pearson's correlation coefficients with PWP in Table 2 also showed that OM had the upper hand in prediction of the water content in the soil samples.

Results of the modeling were also in favor of the ANN-WOA hybrid model which could predict the amount of FC and PWP satisfactorily. In this respect, the results obtained for FC were more accurate and it was concluded that the spiral goal seeking provided by the whale optimization algorithm (Fig. 4) effectively depicts the relationship between sand, silt, clay, and OM against FC or PWP.

Conclusions and recommendations

In this study, a newly developed ANN-WOA method is used to estimate FC and PWP. The study is based on soil samples taken from a study area covering the West Azerbaijan and East Azerbaijan provinces in the northwest of Iran. In modeling, classical MLR and ANN models were used together with the hybrid ANN-WOA. The portions of clay, sand, silt, and OM in the soil samples were used as the independent variables, while FC and PWP were evaluated as the dependent variables of the models. It is found that OM has the strongest linear and nonlinear relationship with the outputs, i.e., FC and PWP. This shows the importance of OM at the basin scale, which should be detailed in further studies. The nonlinear relationship is also considered in the analysis by taking into account the relationship of PWP and FC with sand, silt, and clay. Normalized data of randomly selected sets were then used in training and validation of the models separately. The results of the models were evaluated using several performance criteria, and the best and second-best models were found to be the hybrid ANN-WOA and the ANN, respectively. The superior results of the ANN-WOA model were found to be linked to its fast and proper convergence to the optimum solution, whereas the other models could easily become trapped in local extrema.

Overall, the results of this study showed that the WOA is a useful add-on tool for enhancing the predictive accuracy of ANN models. The ANN-WOA can be considered a global optimizer since it has both exploration and exploitation abilities and can search the neighboring space for the best solution. Based on the high accuracy of the proposed ANN-WOA model, short-term forecasting scenarios using hydrological variables (e.g., soil parameters, evaporation, groundwater levels, rainfall, flood, and drought forecasting) could be an interesting topic for future studies. Broader application is warranted, and the effectiveness of the newly evaluated ANN-WOA model should be explored further in soil and water studies.