1 Introduction

Sediment load information is useful for the design of reservoirs and dams, the transport of sediment and pollutants in rivers, lakes and estuaries, the design of stable channels, the protection of fish and wildlife habitats, the determination of the effects of watershed management, and environmental impact assessment (Cigizoglu 2004). Water quality and sediment modeling has been a challenging task in the field of computational hydrology (Kişi 2009). Traditional methods for determining runoff (e.g., Ahmad et al. 2009, 2010) often do not take sediment load into account. Sediment load has been estimated through empirical relationships, numerical simulations, physically based models, and remote sensing and Geographic Information Systems (GIS) techniques.

Precise simulation of sediment load is important for sustainable water supplies and environmental systems because it plays a major role in decision-making on water availability. In recent years, the use of data-driven modeling techniques to produce improved sediment yield rating curves has attracted substantial attention (Lohani et al. 2007; Boukhrissa et al. 2013; Yadav et al. 2018; Ampomah et al. 2010). Hydrology researchers have previously developed many sediment prediction models, ranging from empirical models, e.g., USLE/RUSLE (Wischmeier and Smith 1978; Fu et al. 2006; Arekhi et al. 2012; Borrelli et al. 2017), and mathematical models, e.g., kinematic/diffusion wave theory (Liu et al. 2004; Singh and Tayfur 2008; Schneider 2018) or linear/nonlinear programming optimization (Nicklow and Mays 2000; Dutta 2015; Wang et al. 2020), to physically based models. Physically based process models such as SWAT (Asres and Awulachew 2010; Chandra et al. 2014; Dutta and Sen 2018; Liu and Jiang 2019), WEPP (Yuksel et al. 2008; Saghafian et al. 2014; Singh et al. 2017; Ahmadi et al. 2020), and many others have improved the understanding of sediment yield modeling, yet their data requirements are often very large, and even intensively monitored watersheds lack adequate input data for them. Alternative methods for forecasting runoff and sediment yield are therefore needed. Soft computing is one such alternative.

Soft computing (SC) techniques such as the artificial neural network (ANN) model have been used extensively and successfully for the prediction of suspended sediment load. The major advantage of such SC techniques is that the models are fully nonparametric and do not require an a priori concept of the relations between the input variables and the output data (Gocić et al. 2015; Fahimi et al. 2017). Various researchers have used ANNs for hydrologic studies, including time series predictions of runoff or streamflow (Hsu et al. 1995; Govindaraju 2000; Rajaee et al. 2009, 2010; Melesse et al. 2011; Lafdani et al. 2013; Khan et al. 2018; Meshram et al. 2019a, 2019b, 2020, 2021a, 2021b, 2021c; Iraji et al. 2020). Sudheer et al. (2003) used radial basis neural networks with partial weather data; Trajkovic (2005) used radial basis neural networks with temperature-based models; Kisi (2007) applied a neural computing technique using climatic data; and Aytek (2008) applied a co-active neuro-fuzzy inference system. Cobaner et al. (2009) used neural networks and adaptive neuro-fuzzy inference system techniques.

The rainfall–runoff relationship has been modeled successfully using ANNs (Raid and Mania 2004; Maier et al. 2010; Patel and Joshi 2017). ANNs have also been regarded as a powerful tool for monthly river flow prediction and various groundwater problems (Coulibaly et al. 2001; Singh et al. 2013). Other applications of ANNs include the unit hydrograph (Bhunya et al. 2011), regional drought analysis/flood frequency analysis (Adamowski et al. 2012; Adamowski et al. 2012; Belayneh et al. 2016), estimation of sanitary flows (Donovan et al. 2016), river basin classification (Fang et al. 2017), assessment of agricultural vulnerability (Ettinger et al. 2016), and modeling of the hydraulic characteristics of severe contractions (Qiwei et al. 2016). Throughout this work, the term MLP (multi-layer perceptron) is preferred over the general term ANN because there are many different ANN algorithms, and the MLP is only one of them.

In water resources applications, the MLP is the most popular ANN algorithm. Other algorithms such as RBFs (radial basis functions) (Heddam 2016; Nourani et al. 2017), conjugate gradient algorithms (Yu-hong and Cai-xia 2013), the cascade correlation algorithm (Schetinin 2003; Kaladhar et al. 2011) and recurrent neural networks (Graves et al. 2006) have also been used in some studies. However, these algorithms have limited capability to find optimal parameters. Therefore, different optimization algorithms are incorporated into the model to improve prediction accuracy. One popular optimization algorithm is the Firefly Algorithm (FFA) (Kayarvizhy et al. 2014). The FFA is a multimodal, nature-inspired metaheuristic optimization algorithm based on the flashing behavior of fireflies (Yang 2009). A model that integrates the MLP with the FFA is developed to predict sediment load (Yang 2010). In another study, Ghorbani et al. (2017) developed a hybrid SVM-FFA model to forecast the field capacity and permanent wilting point of soils in East Azerbaijan province, north-west Iran.

In this study, a hybrid model incorporating the firefly algorithm (FFA) into the MLP is developed to forecast sediment load. An ANN-based method is developed to forecast the daily sediment load of the Mahabad River in Iran. To find the optimal values of the MLP parameters, the FFA is incorporated into the model architecture. The feasibility of the model is investigated further by comparing the hybrid MLP-FFA approach with the standalone MLP technique. The purpose of this study is to examine, for the first time, the application of the MLP-FFA algorithm to the prediction of sediment load in the Mahabad River, Iran.

2 Material and methods

2.1 Study area

The Mahabad River passes through the city of Mahabad and is composed of three branches. After passing through various villages and irrigating farmland, the river makes its way through the swamps south of Lake Urmia and drains into the lake. The Mahabad river basin is located south of Lake Urmia in West Azarbaijan province. The basin covers an area of 1524.53 km2, about 3% of the total catchment area of the lake. The Mahabad River and the recording station used in this study lie between 45°25′9″ and 46°45′51″ East longitude and between 36°23′51″ and 37°03′11″ North latitude (Fig. 1). The basin is roughly oval, with its long axis oriented North–South and its short axis East–West. The watershed borders the Little Zab basin in the southwest, the Gadr basin in the west, the Siminehrood basin in the southeast and Lake Urmia in the north.

Fig. 1 Location map of the study area (Mahabad River)

2.2 Statistical specifications of data

In this study, daily suspended sediment load (SSL) and streamflow data from the Mahabad River were used. The observed streamflow and sediment data cover 5 years (60 months), from 2011 to 2016 (Fig. 2). The statistical properties of the streamflow and sediment load data, namely the maximum, mean and minimum values (Xmax, Xmean and Xmin), the standard deviation (σx) and the coefficient of variation (Cv), are given in Table 1. The sediment load has a high standard deviation (3895.65). The statistics indicate a highly stochastic relationship between the streamflow and the sediment load. For both analyses (MLP and MLP-FFA), the first 70% of the data set is used for training and the remaining 30% for testing. The aim of the current study is to model the daily SSL using the streamflow data based on the scenarios listed in Table 2. Streamflow and SSL lags of one, two, three and four days are considered. This evaluates the dynamic memory of streamflow for SSL estimation and also helps identify the input variables that give the best results.
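The 70/30 division described above is chronological rather than random. A minimal Python sketch of such a split is given below; the function name `chronological_split` and the ten-value series are illustrative assumptions and are not taken from the study's MATLAB code.

```python
import numpy as np

def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered array into a leading training part and a trailing testing part."""
    series = np.asarray(series)
    n_train = int(round(train_fraction * len(series)))
    return series[:n_train], series[n_train:]

# Hypothetical ten-day record, for illustration only (not the Mahabad data).
flow = np.arange(10.0)
train, test = chronological_split(flow)
print(len(train), len(test))
```

Keeping the original order means the testing period always follows the training period in time, which is the usual choice when lagged values of the series are used as inputs.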

Fig. 2 Observed streamflow and sediment load in Mahabad River

Table 1 Statistical characteristics of the data
Table 2 Performance of MLP and MLP-FFA model for daily sediment load prediction

2.3 Model descriptions

2.3.1 Multi-layer perceptron neural networks (MLP)

The multi-layer feed-forward perceptron (MLP) is a multi-layered neural network architecture that includes a hidden layer in addition to the input and output layers and is trained with the Levenberg–Marquardt back-propagation learning algorithm (Fig. 3). The neurons in each layer are linked to the neurons in the following layer by weights, which are adjusted during training. The activation functions were chosen to be sigmoid for the hidden layer and linear for the output layer. A more detailed description of the MLP structure is given in Ghorbani et al. (2013).
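As an illustration of the architecture just described, the forward pass of a single-hidden-layer MLP with a sigmoid hidden layer and a linear output layer can be sketched in Python as follows. This is only a sketch: the function names, the random weights and the layer sizes (8 inputs, 20 hidden neurons, 1 output) are assumptions for the example, and the Levenberg–Marquardt training step is not implemented here.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid activation used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(X, W1, b1, W2, b2):
    """Forward pass: sigmoid hidden layer followed by a linear output layer."""
    hidden = sigmoid(X @ W1 + b1)   # shape (n_samples, n_hidden)
    return hidden @ W2 + b2         # shape (n_samples, n_outputs)

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 8, 20, 1
W1 = rng.normal(size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_out))
b2 = np.zeros(n_out)

X = rng.uniform(size=(5, n_in))      # five samples scaled to the 0-1 range
y_hat = mlp_forward(X, W1, b1, W2, b2)
print(y_hat.shape)
```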

Fig. 3 Typical arrangement of multi-layer perceptron neural network

2.3.2 Firefly algorithm (FFA)

The FFA is a bio-inspired swarm intelligence optimization technique motivated by the flashing behavior of fireflies, first introduced by Yang (2010). In this technique, each candidate solution of an optimization problem is represented as an agent, i.e., a firefly, which glows in proportion to its quality. Because every brighter firefly attracts its partners regardless of their gender, the search space is explored more efficiently (Lukasik and Zak 2009; Hemalatha et al. 2016; Al-shammari et al. 2016).

Fireflies are attracted to brightness, and the whole swarm moves toward the brightest firefly. Thus, the attractiveness of a firefly is determined by its brightness (Kayarvizhy et al. 2014; Fateen et al. 2012; Sudheer et al. 2014). Moreover, the brightness depends on the light intensity of the agent. In developing the FFA, the major issues are the formulation of the objective function and the variation of the light intensity. The light intensity I(L), the attraction α(L) and the Cartesian distance between any two fireflies j and k can be represented as:

$$ I\left( L \right) = I_{{\text{O}}} \exp \left( { - \gamma L^{2} } \right) $$
(1)
$$ \alpha \left( L \right) = \alpha_{{\text{O}}} \exp \left( { - \gamma L^{2} } \right) $$
(2)
$$ L_{jk} = \left\| x_{j} - x_{k} \right\| = \sqrt {\sum\limits_{i = 1}^{d} \left( {x_{j,i} - x_{k,i} } \right)^{2} } $$
(3)

where I(L) and Io are the light intensity at distance L and the initial light intensity of a firefly, respectively; γ is the light absorption coefficient; α(L) and αo are the attraction at distance L and at L = 0, respectively; d is the number of dimensions; and xj,i is the i-th component of the position of firefly j. The subsequent movement of firefly j is expressed as:

$$ x_{j}^{t + 1} = x_{j}^{t} + \Delta x_{j} $$
(4)
$$ \Delta x_{j} = \alpha_{{\text{O}}} e^{{ - \gamma L_{jk}^{2} }} \left( {x_{k}^{t} - x_{j}^{t} } \right) + \mu \epsilon_{j} $$
(5)

where t is the iteration index. The first term in Eq. (5) corresponds to the attraction between fireflies, and the second term is the randomization term, in which µ is the randomization coefficient, varying from 0 to 1, and \(\epsilon_{j}\) is a vector of random numbers drawn from a Gaussian distribution.
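The attraction and movement rules can be illustrated with a short Python sketch of a single pairwise move. The distance is taken as the squared Euclidean distance between the two positions. The function names and the default parameter values are assumptions chosen for the example and are not values reported in this study.

```python
import numpy as np

def attraction(dist_sq, alpha0=1.0, gamma=1.0):
    """Eq. (2): attraction decays exponentially with the squared distance."""
    return alpha0 * np.exp(-gamma * dist_sq)

def move_firefly(x_j, x_k, alpha0=1.0, gamma=1.0, mu=0.0, rng=None):
    """Eqs. (4)-(5): move firefly j toward a brighter firefly k, plus a random step."""
    x_j = np.asarray(x_j, dtype=float)
    x_k = np.asarray(x_k, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    dist_sq = np.sum((x_j - x_k) ** 2)        # squared Euclidean distance
    epsilon = rng.standard_normal(x_j.shape)  # Gaussian random vector
    return x_j + attraction(dist_sq, alpha0, gamma) * (x_k - x_j) + mu * epsilon

# With mu = 0 the move is deterministic: firefly j covers a fraction exp(-gamma * L^2) of the gap.
x_new = move_firefly([0.0, 0.0], [1.0, 0.0], gamma=1.0, mu=0.0)
print(x_new)
```

Setting γ = 0 removes the distance decay, so firefly j jumps directly onto firefly k, whereas a large γ makes distant fireflies nearly invisible to each other and the random term dominates.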

In this research, the FFA was used to compute the optimum values of γ, ε and C for the weights of the MLP architecture. First, the data were divided into 70% for training and 30% for testing in both the FFA and ANN models. Then, the input data for the ANN model were normalized to the range 0–1. Figure 4 shows the structure of the MLP-FFA (Lukasik and Zak 2009).
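One possible way to couple the two components is sketched below in Python: the MLP weights and biases are flattened into a single vector, each firefly holds one such vector, and the training RMSE serves as the (inverse) brightness. This is a simplified sketch under stated assumptions, not the implementation used in this study. The function names, the swarm settings, the choice to optimize all weights directly, the use of training-set minima and maxima for the 0–1 scaling, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def min_max_scale(train, test):
    """Scale to the 0-1 range using the training minimum and maximum (assumed choice)."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (train - lo) / span, (test - lo) / span

def unpack(theta, n_in, n_hidden):
    """Split a flat parameter vector into the MLP weights and biases."""
    i = n_in * n_hidden
    W1 = theta[:i].reshape(n_in, n_hidden)
    b1 = theta[i:i + n_hidden]
    W2 = theta[i + n_hidden:i + 2 * n_hidden]
    b2 = theta[-1]
    return W1, b1, W2, b2

def rmse_cost(theta, X, y, n_hidden):
    """Training RMSE of the MLP defined by theta (lower cost = brighter firefly)."""
    W1, b1, W2, b2 = unpack(theta, X.shape[1], n_hidden)
    hidden = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))  # sigmoid hidden layer
    pred = hidden @ W2 + b2                         # linear output layer
    return np.sqrt(np.mean((y - pred) ** 2))

def ffa_train(X, y, n_hidden=3, n_fireflies=10, n_iter=30,
              alpha0=1.0, gamma=0.1, mu=0.1, seed=0):
    """Search the MLP parameter space with a basic firefly loop; returns best vector and cost history."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + 2 * n_hidden + 1
    swarm = rng.uniform(-1.0, 1.0, size=(n_fireflies, dim))
    cost = np.array([rmse_cost(f, X, y, n_hidden) for f in swarm])
    best, best_cost = swarm[cost.argmin()].copy(), cost.min()
    history = [best_cost]
    for _ in range(n_iter):
        for j in range(n_fireflies):
            for k in range(n_fireflies):
                if cost[k] < cost[j]:  # firefly k is brighter than firefly j
                    dist_sq = np.sum((swarm[j] - swarm[k]) ** 2)
                    step = alpha0 * np.exp(-gamma * dist_sq) * (swarm[k] - swarm[j])
                    swarm[j] = swarm[j] + step + mu * rng.standard_normal(dim)
                    cost[j] = rmse_cost(swarm[j], X, y, n_hidden)
                    if cost[j] < best_cost:
                        best, best_cost = swarm[j].copy(), cost[j]
        history.append(best_cost)
    return best, history

# Synthetic data for illustration only (not the Mahabad River record).
rng = np.random.default_rng(1)
X_raw = rng.uniform(0.0, 100.0, size=(40, 2))
y_raw = 3.0 * X_raw[:, 0] + X_raw[:, 1]
X_train, X_test = min_max_scale(X_raw[:28], X_raw[28:])
y_train, y_test = min_max_scale(y_raw[:28], y_raw[28:])
best_theta, history = ffa_train(X_train, y_train)
print(history[0], history[-1])
```

Because the best vector found so far is stored separately from the swarm, the recorded training RMSE cannot increase from one iteration to the next. Note that testing values scaled with the training extremes may fall slightly outside the 0–1 range.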

Fig. 4 Structure of the MLP-FFA

2.4 Performance criteria

In this study, three statistical criteria, namely the coefficient of determination (R2), root mean square error (RMSE) and mean absolute error (MAE), were applied to evaluate model performance. The criteria are defined as follows:

$$ R^{2} = \left[ {\frac{{\sum\nolimits_{i = 1}^{N} \left( {x_{i} - \overline{x}} \right)\left( {y_{i} - \overline{y}} \right)}}{{\sqrt {\sum\nolimits_{i = 1}^{N} \left( {x_{i} - \overline{x}} \right)^{2} \cdot \sum\nolimits_{i = 1}^{N} \left( {y_{i} - \overline{y}} \right)^{2} } }}} \right]^{2}, \quad 0 \le R^{2} \le 1 $$
(6)
$$ {\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {x_{i} - y_{i} } \right)^{2} }}{N}} $$
(7)
$$ {\text{MAE}} = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left| {\left( {x_{i} - y_{i} } \right)} \right|} $$
(8)

where N is the number of observations, xi and yi are the observed and predicted suspended sediment loads, respectively, and \(\overline{x}\) and \(\overline{y}\) are the means of the observed and predicted suspended sediment loads, respectively.
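For reference, Eqs. (6)–(8) can be computed with a few lines of Python. The sketch below uses NumPy, and the four observed and predicted values are made up for illustration only.

```python
import numpy as np

def r_squared(x, y):
    """Eq. (6): squared Pearson correlation between observed x and predicted y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return (np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))) ** 2

def rmse(x, y):
    """Eq. (7): root mean square error."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.sqrt(np.mean((x - y) ** 2))

def mae(x, y):
    """Eq. (8): mean absolute error."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.mean(np.abs(x - y))

# Made-up values for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]
print(r_squared(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

Because Eq. (6) is a squared correlation, it equals 1 for any perfectly linear relation between predictions and observations, even a biased one, which is one reason to report RMSE and MAE alongside it.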

3 Results and discussion

To model a river's suspended sediment load, historical streamflow and suspended sediment load data are essential. The seasonality of rainfall has an impact on discharge and suspended sediment load. In this study, a multi-layer perceptron (MLP) model was employed to calculate the suspended sediment load. To enhance the robustness of the MLP model, a hybrid algorithm was developed by combining the MLP with the FFA optimization technique. The two models were then compared in terms of the accuracy of their suspended sediment load predictions.

In the current study, the daily streamflow and sediment load data of the current and prior days are used as inputs to the MLP and MLP-FFA models to estimate the current sediment load. The models are evaluated by the RMSE, MAE and R2 criteria. Two steps are employed. In the first step, various input combinations consisting of different numbers of antecedent and current sediment and streamflow values are tried with the MLP and MLP-FFA models, and the best input combination is selected according to the performance criteria. In the second step, the MLP model is compared with the MLP-FFA model.

The input combinations used in the MLP and MLP-FFA models to estimate the suspended sediment load for the Mahabad River are: (i) St−1; (ii) Qt, St−1; (iii) St−1, St−2; (iv) Qt, St−1, St−2; (v) Qt, Qt−1, St−1; (vi) Qt, Qt−1, St−1, St−2; (vii) Qt, Qt−1, Qt−3, St−1, St−2; (viii) Qt, Qt−1, Qt−3, St−1, St−2, St−3; and (ix) Qt, Qt−1, Qt−3, Qt−4, St−1, St−2, St−3, St−4, where Qt is the streamflow at day t and St is the sediment load at day t. In all cases, the output layer had only one neuron, i.e., the sediment load (St).
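To make the construction of these input combinations concrete, the following Python sketch builds the input matrix and target vector for a given set of streamflow and sediment lags. The function name `build_lagged_inputs` and the five-day synthetic series are assumptions for illustration; a lag of 0 denotes the current day.

```python
import numpy as np

def build_lagged_inputs(q, s, q_lags, s_lags):
    """Return (X, y): columns of X are Q_{t-lag} and S_{t-lag}; y is S_t."""
    q = np.asarray(q, dtype=float)
    s = np.asarray(s, dtype=float)
    max_lag = max(list(q_lags) + list(s_lags))
    t = np.arange(max_lag, len(s))   # days for which every lagged value exists
    columns = [q[t - lag] for lag in q_lags] + [s[t - lag] for lag in s_lags]
    return np.column_stack(columns), s[t]

# Combination (iv): Q_t, S_{t-1}, S_{t-2}; synthetic numbers for illustration only.
q = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
s = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
X, y = build_lagged_inputs(q, s, q_lags=[0], s_lags=[1, 2])
print(X)
print(y)
```

The first `max_lag` days are dropped because their lagged values are not available, so combination (ix), which uses lags of up to four days, yields four fewer samples than the raw record.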

A program code was written in MATLAB for the MLP and MLP-FFA model simulations. Different architectures were tried using this code, and the appropriate model structures were determined for each input combination. Then, the MLP and MLP-FFA models were tested, and the results were compared by means of RMSE, MAE and R2 statistics (Table 2).

The number of neurons in the hidden layer was determined by trial and error, and for each scenario the best network architecture was selected based on the three performance criteria (RMSE, MAE, R2). Table 2 shows the best architectures and their performance criteria for each scenario. It also shows that adding the streamflow and sediment load of previous days has a significant effect on the results. Hence, the MLP-9 and MLP-FFA-9 models, with 8 input, 20 hidden and 1 output neurons, were selected as the optimum models.

The MLP results for the training and testing stages in Table 2 show that, over the test period, the MLP gives the best results in the 9th scenario, where Qt, Qt−1, Qt−3, Qt−4, St−1, St−2, St−3, St−4 are used as model inputs to estimate SSL. With these 8 inputs, the model error is lower than in the other scenarios, with RMSE, MAE and R2 values of 3044 ton/day, 2481 ton/day and 0.901, respectively. The model error is also lower for the training data set, with RMSE, MAE and R2 values of 440 ton/day, 102 ton/day and 0.987, respectively. The most accurate estimation is therefore obtained with the 9th scenario, called MLP9. Figures 5 and 6 show the observed and predicted daily sediment load for the training and testing periods, respectively.

Fig. 5 Observed and predicted daily sediment load by MLP and MLP-FFA models for training period

The performance of the proposed MLP-FFA models is examined in terms of RMSE, MAE and R2 in Table 2. The hybrid MLP-FFA models perform better than the MLP models and estimate SSL with better accuracy. The MLP-FFA hybrid model performs best with the input combination Qt, Qt−1, Qt−3, Qt−4, St−1, St−2, St−3, St−4, giving RMSE, MAE and R2 values of 72 ton/day, 25 ton/day and 0.989, respectively, for the training period. The most accurate estimation with the least error is obtained with the 9th scenario, called MLP-FFA9. This scenario has the best correlation between the observed and estimated SSL values, which indicates that the FFA functions properly in optimizing the MLP to estimate SSL values.

Figures 5 and 6 depict the observed and predicted sediment load values for the training and testing phases. The estimates of the hybrid MLP-FFA9 model are closer to the observed data than those of the MLP9 model. It is therefore concluded that the hybrid MLP-FFA model provides a more accurate simulation than the standalone MLP model. These outcomes are consistent with the findings of Olatomiwa et al. (2015), Ghorbani et al. (2017), Moazenzadeh et al. (2018), Mohammadi et al. (2021) and Darabi et al. (2021).

Fig. 6 Observed and predicted daily sediment load by MLP and MLP-FFA models for testing period

4 Conclusion

The prime aim of this research was to predict the sediment load of the Mahabad River by employing two soft computing techniques, i.e., MLP and MLP-FFA. The prediction accuracy of these models was assessed using statistical measures (RMSE, MAE and R2) and graphical examination. Daily streamflow and sediment load data with one to four antecedent records were used for the modeling. Different input combinations were examined for all studied models to select the best scenario for further analysis. According to the results, the MLP9 and MLP-FFA9 models, which use four antecedent values of streamflow and sediment load, were selected as the best-fit forecasting models. Comparison of the developed models based on a variety of statistical error indices showed that the MLP-FFA9 model performs better than the MLP9 model in estimating the daily sediment load. Predicting the sediment yield is necessary for implementing appropriate soil conservation measures in the watershed, which reduce the sediment load in the river and maximize the life of structures.