1 Introduction

Evaporation is a type of vaporization that occurs on the surface of a liquid as it converts into the gas phase [1,2,3,4]. As a significant environmental and climatic concern, this water-loss step of the water cycle has been extensively discussed with a view to minimizing its negative effects on the environment [5,6,7,8,9,10,11,12,13,14]. Many case studies have addressed current worldwide environmental concerns [15,16,17,18,19]. In this sense, during recent decades, artificial intelligence has proved to be one of the most popular methods for the indirect analysis of environmental engineering factors [20,21,22,23,24,25,26,27], and more specifically of pan evaporation (EP) [28, 29]. Fuzzy-based tools [30,31,32], regression-based methods [33, 34], neural learning [35,36,37], and support vector approaches [38,39,40] are well-known techniques that have been highly regarded by experts. Further applications of intelligent tools can be found in fields such as engineering [41,42,43,44,45,46], structural health monitoring [47,48,49,50,51,52], reinforced concrete structure performance [53,54,56], computer vision techniques such as machine vision [57], moving object detection [58, 59], image enhancement [60, 61], air quality [62, 63], energy [64,65,66,67,68,69,70,71], computational image processing [72,73,74,75,76,77,78], groundwater remediation strategies [79], big data in traffic management [80], prefabricated walls [81,82,83], socially aware networks [84], climate change [85, 86], and even the medical sciences [87,88,89,90,91,92,93,94,95]. Recently, programmers have used various metaheuristic algorithms for optimization purposes [96, 97]. A particular application of these techniques is hybridizing existing predictive models. In this sense, the hybrid kernel extreme learning machine [98], fruit fly optimization [99], bacterial foraging optimization [100], many-objective sizing optimization [101,102,103,104], the Harris hawks optimizer [105], data-driven robust optimization [106], multi-objective 3-D topology optimization [107], global numerical optimization [108], and moth-flame optimization [109] are good examples of machine learning, conventional neural network, and hybridized optimization algorithms. Different studies have sought to extend the capability of prediction techniques such as deep learning [110,111,112,113,114], feature selection [115, 116], and feature extraction [117,118,119]. Arunkumar et al. [120] employed three data mining approaches, namely artificial neural network (ANN), model tree (MT), and genetic programming (GP), to develop EP evaluative tools. With a correlation of 0.959, the GP was stronger than the other methods, although the ANN was found to be more suitable for cause–effect mapping. Alsumaiei [121] applied an ANN to the prediction of the daily EP rate in the hyper-arid climate of Kuwait. With reference to the obtained Nash–Sutcliffe coefficients (varying in [0.405, 0.755]), the ANN could satisfactorily handle this task. The applicability of multivariate adaptive regression splines (MARS) and MT incorporated with the maximum overlap discrete wavelet transform was examined by Ghaemi et al. [122]. As a result of this hybridization, significant decreases were observed in the errors of both the standard MARS and MT. Likewise, a combination of the response surface method (RSM) and support vector regression (SVR) was proposed by Keshtegar et al. [123] for EP modeling. According to the accuracy measures, this model outperformed the single methods as well as a capable ANN, namely the multilayer perceptron (MLP).

By combining nature-inspired search algorithms with the intended models, optimal configurations of those models can be achieved. This measure also helps to avoid computational pitfalls such as entrapment in local minima [124]. Roy et al. [125], for example, used biogeography-based optimization (BBO), teaching–learning-based optimization (TLBO), the firefly algorithm (FFA), and particle swarm optimization (PSO) to optimize an adaptive neuro-fuzzy inference system (ANFIS) applied to evapotranspiration prediction. Gocić et al. [126] obtained two powerful methodologies for reference evapotranspiration modeling by combining the support vector machine (SVM) with the FFA and the wavelet technique. They compared the performance of these two hybrids with an ANN and GP and observed the better performance of the FFA-SVM and SVM-wavelet. Mohammadi and Mehdizadeh [127] created a hybrid of the whale optimization algorithm (WOA) and SVR for the same purpose in Iran. They also showed that the random forest (RF) is a good tool for input evaluation. With normalized root-mean-square errors (RMSEs) of 5.466, 9.958, and 5.412% calculated for the Isfahan, Urmia, and Yazd stations, respectively, the proposed method outperformed seven other predictors, including a typical SVR. In a different study, Liu et al. [128] proposed a search algorithm to solve box-constrained global optimization problems, using an example of structural design.

As for ANNs, many scholars have coupled these tools with nature-inspired optimizers for hydrological simulations [129,130,131]. A hybrid FFA-ANN was proposed by Ashrafzadeh et al. [132] for EP approximation. Given the better performance of the ensemble model relative to the conventional ANN, they concluded that retrofitting this processor with the FFA is a promising way toward accuracy enhancement. Tikhamarine et al. [133] predicted reference evapotranspiration in India and Algeria using five ensembles of an ANN with PSO, the WOA, the grey wolf optimizer (GWO), the ant lion optimizer (ALO), and the multi-verse optimizer (MVO). A comparative assessment of the results pointed to the superiority of the GWO for training the ANN. A similar application of the GWO and WOA, as well as the genetic algorithm (GA), was presented by Seifi and Soroush [134]. They coupled these optimizers with an ANN to predict the EP in different parts of Iran. Among the metaheuristic-based ANNs, those trained by the GA outperformed those trained by the WOA and GWO.

Despite the adequate competency shown by metaheuristic approaches in optimizing standard predictors, they are mostly time-consuming. Given the importance of time efficiency in engineering assessments, this study offers optimal hybrids for EP prediction. To this end, the optimization of an ANN is assigned to two fast metaheuristic algorithms, namely shuffled complex evolution (SCE) and electromagnetic field optimization (EFO). While these algorithms have been effectively employed for optimization objectives [135], no prior effort can be found regarding their application to EP estimation.

2 Data acquisition and statistics

As is known, data provision is a crucial step in machine learning implementations for predicting any parameter [136]. Therefore, the data should be obtained from a valid source. Generally, besides the intended parameter(s) to be predicted, several factors are required for creating a database. These factors are selected based on a logical dependent–independent relationship and act as influencing factors for the intended parameter in the real world. In the case of this study, the EP is the dependent parameter influenced by wind speed (SW), air temperature (TA), daylight pressure (PD), solar radiation (RS), and daylight humidity (HD). These independent factors are called inputs hereafter.

Climatic records belonging to a 5-year period are used in this work. Specifically, the values of EP, SW, TA, PD, RS, and HD from January 01, 1986 to December 31, 1990 were downloaded from the website of the US Environmental Protection Agency (http://www.epa.gov). Figure 1 shows the variation of the target parameter (i.e., the EP) in this period. As can be seen, the figure is divided into two separate parts labeled training and testing. The training data cover the first 4 years, and the fifth year is dedicated to the testing data. This split makes it possible to evaluate the generalizability of the models under new climatic conditions: once the models capture the EP pattern by analyzing the training data (1986–1989), they are asked to predict the EP for the year 1990. Table 1 gives the statistical description of both datasets.
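A minimal sketch of this data preparation is given below, assuming the records were exported to a CSV file; the file name and column labels are illustrative assumptions, since the exact export format is not specified here.

```python
import pandas as pd

# Illustrative file name and column labels; the exact export format of the
# downloaded records is not specified in the text.
df = pd.read_csv("bakersfield_climate_1986_1990.csv", parse_dates=["date"])

inputs = ["SW", "TA", "PD", "RS", "HD"]  # wind speed, air temperature,
                                         # daylight pressure, solar radiation,
                                         # daylight humidity
target = "EP"                            # pan evaporation (mm)

train = df[df["date"].dt.year <= 1989]   # training period: 1986-1989
test = df[df["date"].dt.year == 1990]    # testing period: the unseen year 1990

X_train, y_train = train[inputs].to_numpy(), train[target].to_numpy()
X_test, y_test = test[inputs].to_numpy(), test[target].to_numpy()
```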

Fig. 1 The EP variation in the intended period

Table 1 Statistical indices of data in both periods

Finally, Fig. 2 shows the location of the studied station: the Bakersfield station, located in the central part of Kern County, California, with a warm and semi-arid climate [137]. The longitude and latitude are 119° 03′ W and 35° 25′ N, respectively, and the elevation of the area is around 151 m above sea level. According to Table 1, the average temperature was around 18.3 °C over the selected time span. Also, the EP ranged in [0.3, 20.5] mm, with averages of 7.8 and 7.9 mm in the training and testing periods, respectively.

Fig. 2 Location of the Bakersfield station (WBAN Number: 23155)

3 Methodology

3.1 The SCE

The SCE was developed by Duan et al. [138] as an efficient and simple metaheuristic optimizer. The algorithm relies on the synthesis of four concepts: (a) combining probabilistic and deterministic techniques, (b) evolving a set of points (called a complex) that span the space toward an optimal situation, (c) performing a competitive evolution strategy, and (d) shuffling the complexes [139]. As in other algorithms, each member represents a possible solution to the given problem. To perform the optimization, the initial individuals are first divided equally into a number of complexes. Each complex discovers a locally optimal solution by executing the downhill simplex method. After this process is repeated for new points, the solutions are processed and collected to attain a global response.

In the first step, the problem and its parameters are initialized. Given \({\text{OF}}\left( H \right)\) as the objective function, Eq. 1 expresses how the problem is defined.

$${\text{Min}}\, {\text{OF}}\,\left( H \right) = \mathop \sum \limits_{x} \frac{{\left| {H_{i} + h_{px} - h_{Lx} - H_{j} } \right|^{{\left( \frac{1}{n} \right) + 1}} }}{{R_{x}^{\frac{1}{n}} }} + \left( {\frac{1}{n} + 1} \right) \mathop \sum \limits_{j} q_{oj} \left( {H_{j} } \right),$$
(1)

where H symbolizes the set of decision parameters. In the second step, the initial population is generated as follows:

$$H\left( {i, j} \right) = H_{{{\text{min}}}} \left( {i, j} \right) + {\text{rand}}\left( {H_{{{\text{max}}}} \left( {i, j} \right) - H_{{{\text{min}}}} \left( {i, j} \right)} \right),$$
(2)

in which rand is a random number uniformly distributed between 0 and 1, and \(H_{{{\text{min}}}} \left( {i, j} \right)\) and \(H_{{{\text{max}}}} \left( {i, j} \right)\) denote the lower and upper bounds of parameter j at the ith node. In the following, \({\text{OF}}\left( H \right)\) is calculated for all individuals. Given N as the number of unknown nodes, the population matrix can be expressed as follows:

$$P = \left[ {\begin{array}{*{20}c} {C_{1} } \\ {C_{2} } \\ \vdots \\ {C_{{{\text{NP}}}} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {H_{1}^{1} } & {H_{2}^{1} } & \ldots & {H_{N}^{1} } \\ {H_{1}^{2} } & {H_{2}^{2} } & \ldots & {H_{N}^{2} } \\ \vdots & \vdots & \ddots & \vdots \\ {H_{1}^{{{\text{NP}}}} } & {H_{2}^{{{\text{NP}}}} } & \ldots & {H_{N}^{{{\text{NP}}}} } \\ \end{array} } \right].$$
(3)

In the third step, the algorithm sorts the S solutions with respect to their objective function values. In the fourth step, it partitions these solutions into M complexes, each containing m points; point M(k − 1) + 1 goes to the first complex, point M(k − 1) + 2 goes to the second complex, and so on (k = 1, 2, …, m). After implementing the competitive complex evolution as the fifth step, the sixth step is dedicated to shuffling the complexes. Finally, the termination criteria are checked to decide whether to stop repeating steps 3, 4, and 5 [140]. The SCE is also explained in earlier studies [141, 142].
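As a minimal sketch, the sorting, dealing, and shuffling steps (3, 4, and 6) can be expressed as follows; the competitive complex evolution (step 5) is omitted, and the function names are illustrative.

```python
import numpy as np

def partition_into_complexes(population, objective, M):
    """Steps 3-4 of the SCE: sort the solutions by objective value and deal
    them into M complexes, so that point M(k - 1) + 1 joins the first complex,
    point M(k - 1) + 2 the second, and so on. `population` is an (S, dim) array."""
    order = np.argsort([objective(p) for p in population])
    ranked = population[order]                  # best solution first
    return [ranked[j::M] for j in range(M)]     # stride-M dealing

def shuffle_complexes(complexes):
    """Step 6: pool the evolved complexes back into a single population,
    ready for re-sorting in the next cycle."""
    return np.concatenate(complexes, axis=0)
```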

3.2 The EFO

The EFO draws on the rules of electromagnetism to provide a fast and capable optimizer. Abedinpourshotorban et al. [143] developed the EFO in 2016. In a cooperative process, the population, which is composed of electromagnet particles (EMPs), improves its positions to replace poor solutions with promising ones. The interaction between the EMPs is based on the attraction–repulsion rule.

The optimization strategy of the EFO can be expressed in four major stages. First, a certain number of EMPs are randomly generated and then sorted according to their fitness. The next stage is dedicated to the classification of the individuals: three groups of EMPs are formed, such that the first group, called the positive field, contains the best-fitted individuals; the second group, called the negative field, contains the worst individuals; and the third group, called the neutral field, contains the individuals with low negative polarity. Producing and organizing new EMPs are crucial steps of the EFO. The production process is illustrated in Fig. 3.

Fig. 3 Generating new members in the EFO

The new member must lie within the existing search space; if it does not, another EMP is produced. Given n as the number of EMPs and GR as the golden ratio, the production process can be mathematically expressed as follows:

$${\text{EMP}}_{n}^{{{\text{new}}}} = {\text{EMP}}_{n}^{{K_{n} }} + \left( {{\text{GR}} \cdot {\text{rand}} \cdot D_{n}^{{P_{n} K_{n} }} } \right) - \left( {{\text{rand}} \cdot D_{n}^{{N_{n} K_{n} }} } \right),$$
(4)

where rand is a random value ranging in [0, 1]. Also, \(D_{n}^{{N_{n} K_{n} }}\) and \(D_{n}^{{P_{n} K_{n} }}\) are obtained by the following equations:

$$D_{n}^{{N_{n} K_{n} }} = {\text{EMP}}_{n}^{{N_{n} }} - {\text{EMP}}_{n}^{{K_{n} }},$$
(5)
$$D_{n}^{{P_{n} K_{n} }} = {\text{EMP}}_{n}^{{P_{n} }} - {\text{EMP}}_{n}^{{K_{n} }},$$
(6)

where \({\text{EMP}}_{n}^{{N_{n} }}\), \({\text{EMP}}_{n}^{{P_{n} }}\), and \({\text{EMP}}_{n}^{{K_{n} }}\) stand for the negative, positive, and neutral electromagnet, respectively.

In the last step, a random number is generated and compared to the parameter \(R_{{{\text{Rate}}}}\) to decide on the replacement process. Note that \(R_{{{\text{Rate}}}}\) is the probability of replacing one electromagnet of the produced EMP with a random one; if \(R_{{{\text{Rate}}}}\) is larger than the random value, the replacement occurs [144]. Further explanations of the EFO mechanism can be found in Refs. [145, 146].
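The sketch below illustrates one production step per Eqs. 4–6 under stated assumptions: the \(R_{{{\text{Rate}}}}\) value and the clipping of out-of-bounds members are simplifications for illustration, not the authors' exact implementation (the text regenerates out-of-space members instead).

```python
import numpy as np

GR = (1 + 5 ** 0.5) / 2   # golden ratio appearing in Eq. 4

def generate_emp(emp_pos, emp_neg, emp_neu, lower, upper, r_rate=0.3):
    """One EFO production step per Eqs. 4-6: the new electromagnet particle
    starts from a neutral-field member, is attracted toward the positive
    field, and is repelled from the negative field. The r_rate value and
    the boundary handling are illustrative assumptions."""
    d_pos = emp_pos - emp_neu                 # Eq. 6
    d_neg = emp_neg - emp_neu                 # Eq. 5
    new = emp_neu + GR * np.random.rand() * d_pos - np.random.rand() * d_neg  # Eq. 4
    # R_Rate rule: with probability r_rate, one electromagnet of the produced
    # EMP is replaced by a random value within its bounds.
    if np.random.rand() < r_rate:
        i = np.random.randint(new.size)
        new[i] = lower[i] + np.random.rand() * (upper[i] - lower[i])
    # The text regenerates out-of-space members; clipping is a simplification.
    return np.clip(new, lower, upper)
```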

3.3 Hybridization

The SCE and EFO aim to train the ANN for predicting the EP. For this purpose, a valid ANN structure should be determined as the skeleton of the hybrid models. The structure suggested for this work is shown in Fig. 4. It is an MLP network with five inputs and one output, and the five neurons in the middle layer were determined through trial and error. The suitability of this structure has also been confirmed in earlier studies [147]. In such networks, each neuron performs its calculation by applying an activation function to a linear combination of weights, biases, and input values, and the subsequent neurons execute the same process to produce the overall response (i.e., the EP) [148,149,150]. According to Table 2, 36 parameters are involved in the prediction procedure. Therefore, each of the SCE and EFO should tune 36 parameters to attain an optimal ANN.

Fig. 4 The suggested ANN for hybridization

Table 2 The number of parameters in the suggested ANN

Metaheuristic algorithms take an iterative strategy for improving the quality of the results. The goodness of the response in each iteration (i.e., the weights and biases) is evaluated by measuring the training accuracy; note that only the training data are used for this process. The algorithm tries to increase the accuracy by finding a more promising solution to the ANN problem and is eventually terminated. Its last response builds the optimal ANN. The trained hybrids, called ANN-SCE and ANN-EFO, then predict the EP for the testing period.
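As an illustration, the objective function handed to the SCE or EFO might look like the following sketch; the ordering of the 36 parameters inside the candidate vector is an arbitrary assumption.

```python
import numpy as np

def unpack(theta):
    """Split a 36-parameter candidate vector into the weights and biases of
    the Fig. 4 network: 5 x 5 input-hidden weights, 5 hidden biases,
    5 hidden-output weights, and 1 output bias. The ordering is arbitrary."""
    IW = theta[:25].reshape(5, 5)
    b1 = theta[25:30]
    LW = theta[30:35]
    b2 = theta[35]
    return IW, b1, LW, b2

def training_rmse(theta, X_train, y_train):
    """Objective evaluated per iteration: the RMSE of the MLP over the
    training data (the tansig of Eq. 17 is identical to tanh)."""
    IW, b1, LW, b2 = unpack(theta)
    hidden = np.tanh(X_train @ IW.T + b1)   # hidden-layer responses
    pred = hidden @ LW + b2                 # network output (EP estimate)
    return np.sqrt(np.mean((pred - y_train) ** 2))
```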

3.4 Accuracy assessment indices

To have a valid assessment of the prediction results, different accuracy criteria can be defined. In this work, the error of prediction is measured by two standard criteria, namely root-mean-square error (RMSE) and mean absolute error (MAE). A percentage form of the MAE, called mean absolute percentage error (MAPE), is also used to give the relative error. Given Error as the difference between the recorded EP (\(E_{{P_{{{\text{Record}}}} }}\)) and modeled EP (\(E_{{P_{{{\text{Model}}}} }}\)), these criteria are expressed by the following equations:

$${\text{MAE}} = \frac{1}{Z}\sum\limits_{i = 1}^{Z} {|{\text{Error}}_{i} |}$$
(7)
$${\text{RMSE}} = \sqrt {\frac{1}{Z}\sum\limits_{i = 1}^{Z} {[{\text{Error}}_{i} ]}^{2} }$$
(8)
$${\text{MAPE}} = \frac{1}{Z}\sum\limits_{i = 1}^{Z} {\Big|\frac{{{\text{Error}}_{i} }}{{E_{{{\text{P}}_{{{\text{Record}}_{i} }} }} }}\Big|} \times 100,$$
(9)

where Z stands for the number of days.

Moreover, Eq. 10 expresses the Pearson correlation coefficient (PCC), which is used to assess the agreement between \(E_{{P_{{{\text{Record}}}} }}\) and \(E_{{P_{{{\text{Model}}}} }}\):

$${\text{PCC}} = \frac{{\sum\nolimits_{i = 1}^{Z} {\left( {E_{{{\text{P}}_{{{\text{Model}}_{i} }} }} - \overline{{E_{{{\text{P}}_{{{\text{Model}}}} }} }} } \right)\left( {E_{{{\text{P}}_{{{\text{Record}}_{i} }} }} - \overline{{E_{{{\text{P}}_{{{\text{Record}}}} }} }} } \right)} }}{{\sqrt {\sum\nolimits_{i = 1}^{Z} {\left( {E_{{{\text{P}}_{{{\text{Model}}_{i} }} }} - \overline{{E_{{{\text{P}}_{{{\text{Model}}}} }} }} } \right)^{2} } } \sqrt {\sum\nolimits_{i = 1}^{Z} {\left( {E_{{{\text{P}}_{{{\text{Record}}_{i} }} }} - \overline{{E_{{{\text{P}}_{{{\text{Record}}}} }} }} } \right)^{2} } } }}.$$
(10)
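For reference, Eqs. 7–10 can be computed directly; a minimal NumPy sketch follows, with `ep_record` and `ep_model` as arrays of the Z daily values.

```python
import numpy as np

def accuracy_indices(ep_record, ep_model):
    """Compute the four criteria of Eqs. 7-10 over Z daily values."""
    error = ep_record - ep_model
    mae = np.mean(np.abs(error))                        # Eq. 7
    rmse = np.sqrt(np.mean(error ** 2))                 # Eq. 8
    mape = np.mean(np.abs(error / ep_record)) * 100     # Eq. 9
    pcc = np.corrcoef(ep_model, ep_record)[0, 1]        # Eq. 10
    return mae, rmse, mape, pcc
```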

4 Results and discussion

4.1 Training results

The optimization mechanism was explained in Sect. 3.3. Based on the behavior of the used algorithms, different measures are taken for the implementation. As Fig. 5a shows, for the ANN-SCE, eight values, namely 2, 5, 10, 15, 20, 50, 100, and 200, are considered for the population size (NP). It is observed that all tested networks reach a relatively stable convergence after 600 iterations. The magnified section illustrates that the network with NP = 20 achieved the lowest objective function (i.e., the RMSE in this case); therefore, this network represents the proposed ANN-SCE. As for the ANN-EFO, Fig. 5b shows the convergence curves for NPs of 25, 30, 35, 40, 50, 80, 100, and 200. For the same reason, the network with NP = 35 is selected to represent the ANN-EFO. A distinction between the implementations of these two algorithms is the number of iterations, which, based on their behaviors, is selected to be 1000 for the SCE and 30,000 for the EFO. A single ANN trained by the Levenberg–Marquardt algorithm [151] is also considered as a benchmark to validate the performance of the used metaheuristic algorithms.
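Purely as an illustration of this sweep, a driver loop might look as follows; `run_sce` is a hypothetical wrapper around an SCE implementation and is not defined here.

```python
# Hypothetical driver for the NP sweep; run_sce stands in for an SCE
# implementation and is not defined here.
best_np, best_theta, best_rmse = None, None, float("inf")
for np_size in (2, 5, 10, 15, 20, 50, 100, 200):      # NPs tested for the SCE
    theta, rmse = run_sce(objective=training_rmse, n_params=36,
                          population=np_size, iterations=1000)
    if rmse < best_rmse:
        best_np, best_theta, best_rmse = np_size, theta, rmse
# The same sweep is repeated for the EFO with NPs of 25-200 and 30,000 iterations.
```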

Fig. 5 Optimization of the ANN using different NPs of the a SCE and b EFO

Figure 6 shows the training results in terms of error values (the same Error as explained above). In this phase, the RMSEs calculated for the single ANN, ANN-SCE, and ANN-EFO were 0.6958, 0.6802, and 0.6749, respectively. Likewise, the MAEs were 0.6080, 0.5954, and 0.5901. This indicates excellent training provided by all three strategies (i.e., LM, SCE, and EFO).

Fig. 6 Training errors for a, b the single ANN, c, d the ANN-SCE, and e, f the ANN-EFO

The high quality of the training can also be confirmed by the PCCs of 0.98888, 0.98917, and 0.98934. By comparison, however, it can be concluded that the SCE and EFO have produced stronger ANNs, implying that the ANN can be properly improved by metaheuristic techniques. This is due to the more suitable responses found by the SCE and EFO; these responses, as explained, comprise the neural parameters that capture the intricate relationship between the EP and SW, TA, PD, RS, and HD (Fig. 4).

4.2 Testing results

The testing inputs were then given to the built networks to estimate the EP for the year 1990. Since the networks had not come across these data, their performance in this phase represents the generalizability of their knowledge. The results are assessed in the same way as in the training phase. Figure 7 depicts the obtained errors. Comparing the actual range of testing EPs in Table 1 (i.e., [0.5, 18.7] mm) with the calculated errors demonstrates that all three models can elegantly estimate the EP pattern in the testing period. In this regard, the RMSEs were 1.5647, 1.4764, and 1.4239. The low range of MAEs, i.e., 1.4947, 1.4047, and 1.3561, is another indicator of the high accuracy of all applied models. Moreover, the histogram charts show a suitable frequency distribution of the error values.

Fig. 7 Testing errors for a, b the single ANN, c, d the ANN-SCE, and e, f the ANN-EFO

Furthermore, the agreement between the EPs recorded in the testing period and those estimated by the used models is shown graphically in Fig. 8. As can be seen, the products are strongly correlated with the real-world data. Referring to the obtained PCCs of 0.99838, 0.99824, and 0.99802, all three models enjoy a very good potential for prediction.

Fig. 8 Regression between the recorded and modeled EPs in the testing period after the prediction of the a single ANN, b ANN-SCE, and c ANN-EFO

Moreover, Table 3 gives a correlation-based assessment of the testing results from a seasonal point of view. Based on the calculated coefficients of determination (R2), the models have produced very good predictions for all seasons.

Table 3 Seasonal assessment of the results

Comparing the testing results of the ANN with those of the hybrid models (i.e., ANN-SCE and ANN-EFO) reveals that, although the obtained PCCs differ only slightly, both error criteria indicate a significantly more accurate prediction for the ANNs trained by the SCE and EFO. This is even more pronounced when the single ANN is evaluated against the ANN-EFO. In detail, letting the ANN be trained by the SCE and EFO resulted in around 5.64 and 9.00% reductions in the RMSE and nearly 6.02 and 9.27% reductions in the MAE, respectively. This reflects the higher capability of metaheuristic-trained ANNs in predicting the daily EP, a finding that becomes even more noticeable given that the SCE and EFO are two of the fastest optimizers. The matter of time-effectiveness is discussed in the next section.

4.3 Comparison

The objective of the study was met by the above assessments. The ANN, which is a popular predictive model for EP modeling, experienced appreciable improvements in prediction accuracy after being incorporated with the SCE and EFO metaheuristic techniques. Depending on parameters such as the type of problem, the number of variables, and the size of the data, such algorithms mostly take considerable time to attain optimal solutions [152]. Long calculations regarding ANN optimization have been reported for teaching–learning-based optimization and the cuckoo optimization algorithm [65], the spotted hyena optimizer [153], wind-driven optimization [154], etc. In contrast, scholars like Zheng et al. [155] have reached their desired optimization using the SCE in a shorter time.

In this work, the time taken by the SCE and EFO was about 479.0 and 281.9 s, respectively (on an Intel Core i7 64-bit system with 16 GB of RAM). This means that the EFO is a faster algorithm than the SCE. Moreover, based on the lower RMSE and MAE values obtained for the ANN-EFO, the EFO can be pointed out as the more capable algorithm, too.

4.4 Importance assessment

To investigate the effect of each input factor, a bagged ensemble of 200 regression trees is executed. The results are shown in Fig. 9. As can be seen, SW plays the most important role in the EP simulation, while the lowest effect is exerted by the PD factor. The effects of the three other factors can be considered relatively moderate.
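A sketch of this assessment using scikit-learn is given below; treating the bagged ensemble as a random forest with all features considered at each split (i.e., plain bagging), as well as the remaining settings, is an implementation assumption.

```python
from sklearn.ensemble import RandomForestRegressor

# Bagged ensemble of 200 regression trees; max_features=None makes every
# split consider all five inputs, i.e., plain bagging rather than a random
# subspace forest.
bagged = RandomForestRegressor(n_estimators=200, max_features=None, random_state=0)
bagged.fit(X_train, y_train)

for name, score in zip(["SW", "TA", "PD", "RS", "HD"], bagged.feature_importances_):
    print(f"{name}: {score:.3f}")
```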

Fig. 9 Importance assessment of the input factors

4.5 The EP equation

This section gives the formula for the EP created by the ANN-EFO. As shown in Fig. 4, the contribution of the inputs to the output (i.e., the EP) passes through the neural network, in which, as explained, the 36 parameters optimized by the EFO are involved. Equation 11 calculates the EP:

$$E_{{\text{P}}} = [{\text{LW}}] \cdot f\left( {[{\text{IW}}] \cdot [{\text{Input}}] + [b1]} \right) + [b2].$$
(11)

In the above relationship, [LW] is the vector of the hidden-output weights given in Eq. 12, [IW] is the vector of the input-hidden weights given in Eq. 13, [Input] is the vector of input factors given in Eq. 14, [b1] is the vector of hidden biases given in Eq. 15, and [b2] is the vector of output bias given in Eq. 16. Also, f denotes the activation function expressed in Eq. 17:

$${\text{LW}} = \left[ {\begin{array}{*{20}c} { - 0.9935} & {0.3151} & { - 0.7192} & { - 0.9814} & { - 0.1872} \\ \end{array} } \right]$$
(12)
$${\text{IW}} = \left[ {\begin{array}{*{20}c} {0.6170} & { - 0.1241} & {1.1595} & { - 0.6882} & { - 1.2317} \\ { - 1.0706} & { - 0.2674} & {0.2105} & {0.7013} & { - 1.4062} \\ {0.1757} & { - 1.1247} & { - 0.7690} & {0.9611} & {0.9593} \\ {1.2087} & {0.3759} & {1.0580} & {0.9998} & { - 0.0992} \\ { - 1.2847} & { - 0.6414} & {0.1228} & { - 1.2334} & { - 0.3647} \\ \end{array} } \right]$$
(13)
$${\text{Input}} = \left[ {\begin{array}{*{20}c} {S_{{\text{W}}} } \\ {T_{{\text{A}}} } \\ {P_{{\text{D}}} } \\ {R_{{\text{S}}} } \\ {H_{{\text{D}}} } \\ \end{array} } \right]$$
(14)
$$b1 = \left[ {\begin{array}{*{20}c} { - 1.9316} \\ {0.9658} \\ {0.0000} \\ {0.9658} \\ { - 1.9316} \\ \end{array} } \right]$$
(15)
$$b2 = \left[ {\begin{array}{*{20}c} { - 0.1385} \\ \end{array} } \right]$$
(16)
$$f\left( x \right) = \frac{2}{{1 + {\text{e}}^{ - 2x} }} - 1.$$
(17)

According to the above formula, the network first applies the optimized input-hidden weights and biases to produce the middle parameters in the hidden layer. Next, these outcomes are treated as inputs to produce the final output in the last layer.
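Using the weights and biases of Eqs. 12–16, Eq. 11 can be evaluated directly, as in the sketch below; note that any input scaling applied before training is not reported in this section, so raw input values are assumed here.

```python
import numpy as np

# Weights and biases reported in Eqs. 12-16.
LW = np.array([-0.9935, 0.3151, -0.7192, -0.9814, -0.1872])
IW = np.array([
    [ 0.6170, -0.1241,  1.1595, -0.6882, -1.2317],
    [-1.0706, -0.2674,  0.2105,  0.7013, -1.4062],
    [ 0.1757, -1.1247, -0.7690,  0.9611,  0.9593],
    [ 1.2087,  0.3759,  1.0580,  0.9998, -0.0992],
    [-1.2847, -0.6414,  0.1228, -1.2334, -0.3647],
])
b1 = np.array([-1.9316, 0.9658, 0.0000, 0.9658, -1.9316])
b2 = -0.1385

def f(x):
    """Tansig activation of Eq. 17 (identical to tanh)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def ep_equation(sw, ta, p_d, rs, hd):
    """Evaluate Eq. 11 for one day of inputs [SW, TA, PD, RS, HD]."""
    x = np.array([sw, ta, p_d, rs, hd])
    return LW @ f(IW @ x + b1) + b2
```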

4.6 Further discussion

The findings of this paper revealed that the suggested optimizers can be favorably used for pan evaporation modeling through a neural network. Apart from accuracy, a significant strength of the used algorithms was their low computation time, meaning that both the SCE and EFO are able to find an optimal configuration for EP prediction efficiently. Referring back to Fig. 5, while most metaheuristic optimizers give their best solutions with large populations, the SCE and EFO performed better with small NPs. Another appreciable point was the optimization behavior of the EFO, which reached a relatively steady state after 15,000 iterations for this problem. In other words, the EFO has a good potential to keep improving the established contribution of the input factors over many iterations; therefore, the number of iterations should be properly considered in further applications of this algorithm.

5 Conclusions

The performance of an artificial neural network was supervised by shuffled complex evolution and electromagnetic field optimization toward the optimal prediction of pan evaporation. These algorithms found suitable biases and weights for the ANN in a short time. The quality of their performance was compared to that of the Levenberg–Marquardt algorithm, which is a default trainer for the ANN. The comparison showed that the hybridized ANN can predict the EP pattern with higher accuracy. For example, the testing RMSE fell from 1.5647 to 1.4764 owing to the SCE. The EFO was even more capable and reduced this value to 1.4239. This advantage, as well as the shorter computation time, made the EFO superior to the SCE; it is also the newer strategy. All in all, given the crucial role of time-efficient accuracy enhancement in engineering simulations, the findings of this study are of practical interest. Accordingly, testing different metaheuristic strategies on other leading predictive models (e.g., the ANFIS and SVM) in future efforts could improve EP modeling in new ways.