Introduction

Prediction of sediment load is essential for a broad range of problems in hydrological, agricultural, and environmental engineering, such as water quality assessment, soil erosion, dam design, sediment transport (which may aid the spread of pollutants in rivers), non-point pollution of water resources, degradation of aquatic environments, watershed management, and reservoir damage caused by heavy sediment discharge from natural streams. Sediment load depends not only on flow, discharge, and rainfall, but also on other parameters and characteristics of the drainage basin. Because these dependencies are difficult to describe explicitly, an approximate estimate of sediment load can be obtained with data-mining algorithms such as artificial neural networks (ANNs).

A prime objective in designing any reservoir is that it must be able to accommodate a large volume of water and sediment, the latter in what is generally referred to as “dead storage”. Design that duly accounts for the sediment load expected to accumulate in a reservoir over a specific period of time would considerably reduce the damage caused by natural calamities. A good estimate of sediment load would further help designers deal with stochastic natural events and prevent the loss of human life from the abrupt collapse of an improperly designed reservoir. In addition, the estimation of sediment load plays a key role in environmental engineering, where it helps in analyzing the pollutants transported by streams in the form of suspended sediment load. The major cause of contamination of bottom sediments with toxic materials is the release of agricultural and industrial wastes into rivers. These polluted sediments subsequently reach the downstream reaches when the river changes its course. Therefore, predicting the distribution of these contaminated sediments is the primary step in preventing water pollution and subsequently improving water quality.

The conventional hydromechanical approach has not fully succeeded in modelling the complete course of sediment transport in rivers, because of the chaotic nature of river flow and the random movement of particles within the flow regime. In recent times, neural networks have been successfully applied in numerous branches of science. This modelling technique is becoming an effective tool for providing hydrological and environmental engineers with adequate information for management practices and design purposes. The ANN technique may be especially useful in situations where it is difficult to formulate a mathematical relationship between the dependent and independent variables of a physical phenomenon. An ANN is a non-linear model that is easy to use and understand compared with statistical methods. It is also non-parametric, whereas most statistical methods are parametric and require a stronger background in statistics. ANNs with the back-propagation (BP) learning algorithm are widely used for solving various classification and forecasting problems. Although BP converges slowly, it generally does converge, though possibly to a local minimum. ANNs are suitable for inverse modelling when the numerical relations between input and output variables are unknown and cannot be established. ANNs can provide quite good estimates of the modelled parameters from past data. Sediment load estimation and forecasting may provide important information on the uncertainty in the estimation of some major variables of river systems.

ANNs have gained wide acceptance among researchers working in the area of river flow modelling. Useful contributions have been made in predicting the rainfall–runoff relationship (Minns and Hall 1996; Fernando and Jayawardena 1998; Rajurkar et al. 2002; Panwar et al. 2016). Rainfall forecasting based on past data using ANNs demonstrated that an ANN-based methodology may be effectively utilized for river runoff forecasting (Cigizoglu 2002a, b). Mishra et al. (2007) employed an ANN model for drought forecasting in the Cansabati river basin in India, and Tokar and Markus (2000) applied ANN models to three rivers in the USA: the Fraser in Colorado, Raccoon Creek in Iowa, and the Little Patuxent in Maryland. Hidayat et al. (2014) used an ANN for discharge prediction in the tidally affected Mahakam River, Indonesia, and found that the ANN model can be used as a tool for filling data gaps in a disrupted discharge time series. Pektas and Cigizoglu (2017) examined two methods, multi-linear regression (MLR) and ANN, for multi-step-ahead forecasting of suspended sediment. In that study, the ANN model outperformed the MLR model in terms of both mean square error (MSE) and R² statistics, although for longer ranges the MLR models provided better accuracy. Khan et al. (2016a) predicted water level in the Ramganga River using time series data, and, for the same river, Khan et al. (2018) estimated suspended sediment concentration (SSC) using daily water discharge as an input.

Govindaraju (2000) suggested that ANNs can be successfully employed in many hydrological processes that exhibit a high degree of non-linearity, chaos, and conflicting spatial and temporal scales, as well as in stream flow estimation under random events governed by a high degree of uncertainty. Sari et al. (2017) investigated the use of an ANN model for forecasting SSC using turbidity and water level, and showed that SSC can be estimated from these two variables with high efficiency using ANN-based models of ideal complexity, even when few data records are available for training and verification. Alok et al. (2013) used neural networks with Elman and cascade characteristics for predicting the discharge of the river Brahmani flowing through the Indian subcontinent. ANN modelling of rainfall–runoff for the rivers Amber and Mole, UK, by Dawson and Wilby (1998) showed ANNs to be an appealing alternative to conventional lumped or semi-distributed flood forecasting models. Other important contributions to prediction problems include stage–discharge relation modelling (Sudheer and Jain 2003; Bhattacharya et al. 2005), groundwater level forecasting (Daliakopoulos et al. 2005), tidal prediction (Lee 2004; Liang et al. 2008), sediment transport modelling (Yitian and Gu 2003), and water-level prediction (Chang et al. 2010).

Other ANN algorithms commonly employed for prediction problems include the radial basis function (RBF) network and the generalized regression neural network (GRNN). These algorithms have recently been employed in many prediction problems, and many researchers have compared their accuracy with that of feed-forward back-propagation (FFBP) (Alp and Cigizoglu 2007; Mehr et al. 2014). The majority of studies on feature extraction and subsequent prediction use FFBP for training the proposed network architecture. According to Brikundavyi et al. (2002) and Cigizoglu (2003a, b), the performance of FFBP was superior in problems dealing with continuous flow series prediction. As reported in the literature, a major drawback of FFBP is the local minima problem.

Nevertheless, the most common method for supervised learning with feed-forward neural networks (FNNs) is still the back-propagation (BP) algorithm. The BP algorithm calculates the gradient of the network’s error with respect to the network’s modifiable weights, and it may therefore move towards a local minimum. Many methods have been proposed to overcome the local minimum problem. A widely used one is to train a neural network more than once, each time starting with a different random set of weights (Park et al. 1996; Iyer and Rhinehart 1999). An advantage of this approach is that it is simple to use and can be applied to other learning algorithms. However, it requires more time to train the networks.

The ASCE task committee (2000a, b) thoroughly reviewed other ANN algorithms that may perform better than conventional FFBP; these more advanced algorithms for ANN training include the RBF, GRNN, and recurrent neural network (RNN).

Among the most cited works on prediction problems using RBF and GRNN are those of Specht (1991), Yingwei et al. (1998), Cigizoglu and Alp (2006), Cigizoglu (2005), Alp and Cigizoglu (2007), Tukuda et al. (2013), Mehr et al. (2014), Li et al. (2014), Lu et al. (2014), Chen and Wang (2014), Bayram et al. (2014), and Singh et al. (2014). Cigizoglu and Alp (2006) studied the performance of two ANN algorithms, GRNN and FFBP, for estimating river suspended sediment in the Juniata River, Pennsylvania, USA. Their results revealed that GRNN performs well, providing results close to, and sometimes better than, those of FFBP in sediment estimation. In another study, Cigizoglu (2005) used a GRNN for forecasting daily mean flow and estimating river parameters, and found the GRNN method superior to conventional FFBP, regression, and stochastic methods in both prediction and estimation of the selected river parameters. Alp and Cigizoglu (2007) predicted suspended sediment load from hydrometeorological data using FFBP, RBF, and MLR. Their results showed that RBF and FFBP performed very similarly. They further showed that, for the same network configuration, different FFBP simulations yielded different performance criteria because of the random assignment of initial weights during training. Thus, when using FFBP algorithms, many simulations must be conducted to optimize performance. In contrast, RBF provides results with a single simulation. Recently, Mehr et al. (2014) studied eight different stream flow prediction models based on monthly data from two successive stations on the Coruh River in Turkey.
In the first phase, the FFBP algorithm was employed for prediction of the required parameters, and the results showed that a 1-month-lagged record of successive stations is sufficient to achieve accurate monthly stream flow prediction, with a Nash–Sutcliffe coefficient (NS) above 0.97. In the second phase, GRNN and RBF were applied to predict 1-month-ahead stream flow at the successive stations. The study revealed that the RBF network was much superior to the GRNN and FFBP techniques for prediction, in line with the findings of Kisi and Cigizoglu (2007). Owing to the superiority of the RBF algorithm over FFBP, Chen and Wang (2014) used it for prediction of urban built-up area. In the Indian context, very few studies have been reported on the Himalayan rivers. ANN modelling for the Ganga River was carried out for landslide hazard zonation (Arora et al. 2004). In view of the scarcity of research on Himalayan rivers, the present study was undertaken to examine different ANN models (FFBP, RBF, and GRNN) for sediment load prediction using hydrometeorological data collected from the Ramganga River.

This study explores three ANN algorithms for the prediction of monthly mean total sediment load at the Bareilly gauging site of the Ramganga River, the first important tributary of the Ganga River in the Ganga Foreland Basin (GFB). For this purpose, ANN models are developed to predict sediment load from rainfall and water discharge data collected in the study area. The modelling comprised three parts: simulation of the total sediment load with rainfall data as input, simulation of the total sediment load with water discharge as input, and simulation of the total sediment load with both rainfall and water discharge as inputs. FFBP, GRNN, and RBF are employed to estimate the expected sediment load. For FFBP, the model was optimized for the number of hidden-layer neurons along with the training parameters, while for GRNN and RBF, optimized networks were obtained by varying the spread parameter. Finally, the values predicted by the three algorithms were compared with the observed (experimental) values, and the findings are presented and discussed in light of previous work on the topic.

Study area

The Ramganga River basin (Fig. 1) has a catchment area of 22,685 km² (Khan et al. 2016b, c; Khan and Chakrapani 2016; Daityari and Khan 2017; Khan and Tian 2018), which is approximately 8% of the total catchment area of the Ganga Basin (Ray 1998). The Ramganga is the first major tributary of the Ganga River, with a mean elevation of 1494 m above sea level (a.s.l). It originates from Dudhotali Mountain in Gairsain village, Chamoli district, Uttarakhand. The basin lies between 30°06′02.22″N and 27°10′42.11″N and between 79°16′59.22″E and 79°50′16″E. The total length of the river is 642 km from its origin to its confluence with the Ganga River. After flowing 158 km from its source, the river emerges from the mountains into the GFB, where the Ramganga dam was constructed at an elevation of 363 m a.s.l. During its course in the GFB, the river crosses major districts of Uttar Pradesh such as Bijnor, Moradabad, Rampur, Bareilly, Badaun, Shahjahanpur, Hardoi, and Farrukhabad (Khan et al. 2016d; Khan 2018). After covering 484 km in the GFB, it finally meets the Ganga River in Farrukhabad district, Uttar Pradesh (CWC 2012; Khan et al. 2017).

Fig. 1 Map of the Ramganga River Basin

Physiography and relief

Owing to contrasts in geomorphology, slope, and elevation, the catchment area shows a large diversity in weather and climate (Fig. 2). Precipitation ranges from 2000 mm at higher altitudes to 700 mm at the lowest elevations; the average annual precipitation of the catchment is 1300 mm, and it is controlled by the Indian monsoon (http://www.indiawaterportal.org/met_data/). The Ramganga River shows large variation in elevation and slope (Fig. 2a, b).

Fig. 2 Characteristics of the study area: a ground elevation; b ground slope; c geological map; and d drainage map

The river varies in elevation from less than 305 m a.s.l in the GFB to more than 2438 m a.s.l in the Himalayas (Fig. 2a). The slope shows a similar variation: it is lowest in the GFB, at around 0–5%, while in the Himalayas it exceeds 30% (Fig. 2b). The drainage of the catchment area is sub-dendritic, and the high rainfall at higher elevations during the winter and rainy seasons makes the river perennial. The drainage network is very complex, with stream orders ranging from 1 to 6 (Fig. 2d). The map was prepared in ArcGIS using 90 m resolution Shuttle Radar Topography Mission–Digital Elevation Model (SRTM–DEM) data, and the streams were ordered according to the Strahler number (Strahler 1952).

Geology

In the mountains, the catchment area comprises two major lithotectonic zones, namely, the Sub-Himalayas and the Lesser Himalayas (Fig. 2c). The Sub-Himalayas comprise siltstones, clays, sandstones, and boulders, which show the characteristics of molasse sediments of Mid-Miocene to Pleistocene age. The Lesser Himalayas, on the other hand, consist mainly of unfossiliferous sequences of low-to-high-grade meta-sediments of Paleozoic to Mesozoic age (Gupta and Joshi 1990).

In the Ganga alluvial plain, which is closely associated with the extension of the Himalayan orogenic belt, the catchment area shows a Quaternary lithostratigraphic sequence comprising (1) Varanasi Older Alluvium with two facies, i.e., a sandy facies and a silt–clay facies, (2) Ganga/Ramganga Terrace Alluvium, and (3) Ganga/Ramganga Recent Alluvium; the latter two constitute the Newer Alluvium.

NN algorithms

Feed-Forward Back-Propagation (FFBP) algorithm

The most widespread learning rule for multi-layer perceptrons is the back-propagation algorithm (BPA), which is trained on input–output data. The BPA consists of two phases, a feed-forward phase and a backward phase. In the feed-forward phase, the external input signal at the input nodes is propagated forward to compute the output signal at the output units, whereas in the backward phase, the connection strengths are modified based on the differences between the observed and the computed signals at the output units (Eberhart and Dobbins 1990). In the present study, the neural network had a three-layer structure comprising an input layer, a hidden layer, and an output layer. The Levenberg–Marquardt technique was employed for optimization. According to Hagan and Menhaj (1994), El-Bakyr (2003), and Cigizoglu and Kisi (2005a), the Levenberg–Marquardt technique is more powerful than the conventional gradient descent technique, which in the case of back-propagation is a steepest descent algorithm. The Marquardt algorithm is very efficient when the network has up to a few hundred weights (Hagan and Menhaj 1994). Each iteration of the Marquardt algorithm requires more computation, but this is compensated by its greater efficiency, particularly when high precision is required (Hagan and Menhaj 1994). Hagan and Menhaj (1994) also showed that in many cases in which back-propagation failed to converge, the Marquardt algorithm converged successfully.
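For reference, the two weight-update rules contrasted above can be written as follows. This is a sketch in the general notation of Hagan and Menhaj (1994), not a statement of the exact settings used in this study; the symbols w (weight vector), e (vector of output errors), J (Jacobian of e with respect to w), α (learning rate), and μ (damping parameter) are introduced here for illustration only.

$$\Delta \mathbf{w}_{\text{GD}} = -\alpha\, \mathbf{J}^{T}\mathbf{e}, \qquad \Delta \mathbf{w}_{\text{LM}} = -\left(\mathbf{J}^{T}\mathbf{J} + \mu \mathbf{I}\right)^{-1}\mathbf{J}^{T}\mathbf{e}$$

For large μ the Levenberg–Marquardt step approaches a small gradient descent step, while for small μ it approaches the Gauss–Newton step. The matrix inversion required in every iteration is the source of the higher computational cost per iteration.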

The radial basis function (RBF)

Broomhead and Lowe (1988) introduced RBF networks into the neural network literature. The RBF network is motivated by the locally tuned response observed in biological neurons. Poggio and Girosi (1990) showed that neurons with a locally tuned response characteristic can be found in several parts of the nervous system, for example, cells in the visual cortex that are sensitive to bars oriented in a certain direction or to other visual features within a small region of the visual field. These locally tuned neurons respond only within a small range of the input space. The theoretical basis of RBF lies in the interpolation of multivariate functions. The solution of the exact interpolation problem is an RBF mapping that passes through every data point (xs, ys). In the presence of noise, the exact solution of the interpolation problem is typically a function that oscillates between the given data points. A further problem with the exact interpolation procedure is that the number of basis functions equals the number of data points, so that calculating the inverse of the N × N matrix Φ becomes intractable in practice. According to Taurino et al. (2003), an ANN of the RBF type consists of three layers: an input neuron layer, which feeds the feature vectors to the network; a layer of hidden RBF neurons; and an output neuron layer. Different spread constants and different numbers of hidden-layer neurons were tried in this study.
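The exact interpolation problem described above can be written compactly as follows. This is a sketch under stated assumptions: the weights wj, the basis function φ, and the Gaussian form with spread σ are notation introduced here for illustration and do not come from the original text.

$$h(\mathbf{x}) = \sum_{j=1}^{N} w_j\, \phi\left(\lVert \mathbf{x} - \mathbf{x}_j \rVert\right), \qquad h(\mathbf{x}_s) = y_s, \quad s = 1, \ldots, N$$

$$\Phi\,\mathbf{w} = \mathbf{y}, \qquad \Phi_{sj} = \phi\left(\lVert \mathbf{x}_s - \mathbf{x}_j \rVert\right), \qquad \mathbf{w} = \Phi^{-1}\mathbf{y}$$

A common choice is the Gaussian basis function, φ(r) = exp(−r²/(2σ²)). Because Φ has one row and one column per data point, the cost of solving for w grows quickly with N, which is why practical RBF networks use fewer hidden neurons than data points.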

Generalized Regression Neural Network (GRNN)

The GRNN was proposed by Specht (1991). Unlike the FFBP method, the GRNN does not require an iterative training procedure. It can approximate any arbitrary function between input and output vectors, drawing the function estimate directly from the training data. Moreover, it is consistent; that is, as the training set becomes large, the estimation error approaches zero, with only mild restrictions on the function. Specht (1991) presented full details of the GRNN together with a schematic diagram of its structure. The GRNN consists of four layers: the first layer contains the input units, the second the pattern units, the third the summation units, to which the outputs of the second layer are passed, and the fourth the output units.
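The four-layer computation described above can be illustrated with a minimal Python sketch of the GRNN estimate in the form given by Specht (1991): a Gaussian-weighted average of the training outputs. This is not the MATLAB implementation used in the study; the function name `grnn_predict`, the toy data, and the convention that `spread` is the Gaussian width σ are assumptions made for illustration, and a particular toolbox may scale its spread parameter differently.

```python
import math

def grnn_predict(train_x, train_y, query, spread):
    """Estimate the output at `query` as a kernel-weighted mean of train_y."""
    # Pattern layer: one Gaussian unit per training sample.
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * spread ** 2)))
    # Summation layer: weighted sum of targets and plain sum of weights.
    numerator = sum(w * y for w, y in zip(weights, train_y))
    denominator = sum(weights)
    if denominator == 0.0:
        raise ValueError("spread too small: all kernel weights underflowed")
    # Output layer: ratio of the two sums.
    return numerator / denominator

# Hypothetical normalized samples (not data from the study).
train_x = [[0.0], [0.5], [1.0]]
train_y = [0.0, 1.0, 0.0]
at_sample = grnn_predict(train_x, train_y, [0.5], spread=0.1)
between = grnn_predict(train_x, train_y, [0.25], spread=0.1)
```

No iterative training takes place: the training samples themselves define the network, and the spread is the only quantity to tune, which is consistent with the spread-based optimization described for GRNN in this study.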

Statistical analysis of the data

Monthly means of water discharge and total sediment load data at the Bareilly gauging site of the Ramganga River, Uttar Pradesh, India, were used in this study. The data were taken from Central Water Commission (CWC), Government of India. The rainfall data of Ramganga River were downloaded from the website of India water portal (http://www.indiawaterportal.org/met_data/).

The observation period for the monthly mean water discharge, rainfall, and sediment load data was 15 years (from 1 January 1988 to 31 December 2002). Table 1 shows the statistical parameters (mean X, standard deviation Sx, skewness coefficient Csx, overall minimum Xmin, and maximum Xmax) of the data. Table 1 indicates that the whole record of suspended sediment load has a strongly skewed distribution (Csx = 4.84).

Table 1 Statistical parameters of the data for training, testing, and the whole data
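The statistical parameters listed for Table 1 could be computed along the following lines. This is a minimal sketch: the text does not state which estimators were used, so the sample standard deviation (n − 1 denominator) and the moment-based skewness coefficient are assumptions, and the input values are hypothetical rather than data from the study.

```python
import math

def summarize(values):
    """Return mean, standard deviation, skewness, minimum, and maximum."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: sample standard deviation with n - 1 in the denominator.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Assumption: moment-based skewness m3 / m2**1.5.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skew = m3 / m2 ** 1.5
    return {"mean": mean, "std": std, "skew": skew,
            "min": min(values), "max": max(values)}

# Hypothetical series with one large value, which produces positive skewness.
loads = [1.0, 2.0, 3.0, 4.0, 10.0]
stats = summarize(loads)
```

A single large value pulls the skewness coefficient well above zero, which is the pattern expected for a sediment load series dominated by a few high-flow months.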

Methodology

Three algorithms, namely, FFBP, RBF, and GRNN, were implemented in MATLAB for the simulation. An uninterrupted 15-year time series (1988–2002) of monthly mean water discharge, rainfall, and total suspended sediment load was used. The ANN structure comprises three layers, i.e., an input layer, a hidden layer, and an output layer. Different hydrometeorological data were used to form the input layer. The ANN approach for time series data consisted of two steps. The first step was the training of the neural networks, in which the monthly mean water discharge and rainfall data were taken as inputs and the monthly mean total suspended sediment load data as output to obtain the interconnection weights. After the training stage was completed, the ANNs were applied to the testing data. Thirteen years of data were used for training the network and the remaining 2 years for testing (Table 2).

Table 2 Training and testing period for different ANN algorithms

Normalization of data

Normalization of the data is important because it transforms each variable to a common input range. Therefore, after the hydrometeorological data were collected, each variable, viz., monthly mean water discharge, rainfall, and total suspended sediment load, was normalized. All the variables were scaled to the interval [0, 1] using the following equation:

$$X_{\text{norm}} = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}}, \tag{1}$$

where Xnorm is the normalized value of the observed variable Xi, Xmin is the minimum value of the variable, and Xmax is the maximum value of the variable.
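Equation (1) can be illustrated with a short Python sketch. The function name `min_max_normalize` and the sample values are hypothetical and serve only to show the arithmetic; the study itself was carried out in MATLAB.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers to [0, 1] following Eq. (1)."""
    x_min = min(values)
    x_max = max(values)
    if x_max == x_min:
        # Eq. (1) is undefined when all values are identical.
        raise ValueError("cannot normalize a constant series")
    return [(x - x_min) / (x_max - x_min) for x in values]

# Hypothetical monthly discharge values (not data from the study).
discharge = [120.0, 80.0, 400.0, 240.0]
normalized = min_max_normalize(discharge)
```

The smallest value maps to 0 and the largest to 1. If the minimum and maximum are taken from the training period only, values in the testing period may fall slightly outside this interval.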

Data sets

The first 156 values were used for the training stage and the last 24 values for the testing stage. The statistical parameters of the training and testing data sets differ considerably from each other (Table 1), which is considered acceptable here. Taking the middle values of the whole series as the testing data set would not make sense, because the training data set would then consist of the first and last parts of the series, and this discontinuity would affect the training of the ANN. Since the computational complexity and the generalization capacity of a network are directly affected by its topology, it is necessary to determine a suitable network design for the particular problem.
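The chronological split described above (180 monthly values, the first 156 for training and the last 24 for testing) can be sketched as follows. The function name `chronological_split` is hypothetical, and the month indices stand in for the actual data.

```python
def chronological_split(series, n_train):
    """Split a time series into a leading training part and a trailing test part."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must be between 1 and len(series) - 1")
    # Slicing keeps both parts contiguous in time.
    return series[:n_train], series[n_train:]

# Month indices 1..180 stand in for the 15 years of monthly data.
months = list(range(1, 181))
train, test = chronological_split(months, 156)
```

Keeping the test block at the end of the record leaves the training series uninterrupted, which is the reason given above for not testing on the middle of the series.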

Modelled NN architectures

For the present problem, the FFBP architecture consisted of three layers: input, hidden, and output. Three simulations were carried out, taking rainfall as a single input variable (simulation I), water discharge as a single input variable (simulation II), and both rainfall and water discharge as input variables (simulation III). Consequently, the input layers consisted of 1, 1, and 2 neurons for simulations I, II, and III, respectively. The FFBP algorithm was tested with varying numbers of hidden-layer neurons and learning rate values to optimize the network topology. The number of hidden neurons was varied from 1 to 10, while learning rates of 0.01, 0.05, and 0.06 were considered. The stopping criterion for training was a target MSE of 0.0009. Tangent sigmoid functions were used as transfer functions. The learning and momentum rate parameters were adaptive, i.e., their values changed dynamically during the simulation. The performance of the algorithm was sensitive to the setting of the learning rate. If the learning rate is set too high, the algorithm may fail to settle in the global minimum or may oscillate and become unstable. On the other hand, if the learning rate is set too low, training may take a long time. In practice, the optimal learning rate cannot be determined in advance, so each topology was tested with several learning rates. Thus, an enumeration technique was applied to determine the optimal network topology for each simulation; the resulting topologies are presented and discussed in “Results and discussion”.
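The enumeration over hidden-layer sizes and learning rates described above amounts to an exhaustive grid search, which can be sketched as follows. This is an illustration only: `enumerate_topologies` is a hypothetical helper, and `toy_error` is a stand-in for training a network and returning its test MSE, not an actual FFBP model.

```python
from itertools import product

def enumerate_topologies(evaluate, hidden_sizes, learning_rates):
    """Try every (hidden size, learning rate) pair and keep the lowest error."""
    best = None
    for n_hidden, rate in product(hidden_sizes, learning_rates):
        error = evaluate(n_hidden, rate)
        if best is None or error < best[0]:
            best = (error, n_hidden, rate)
    return best

def toy_error(n_hidden, rate):
    # Stand-in for "train the network and return its MSE" (assumption).
    return (n_hidden - 5) ** 2 + abs(rate - 0.05)

# Search ranges taken from the text: 1 to 10 neurons, three learning rates.
best_error, best_hidden, best_rate = enumerate_topologies(
    toy_error, range(1, 11), [0.01, 0.05, 0.06]
)
```

With 10 hidden-layer sizes and 3 learning rates, the search evaluates 30 configurations per input combination; because FFBP results also depend on the random initial weights, each configuration may in practice need to be trained more than once.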

For RBF, the same input layer structure as for FFBP was employed. In this case, the number of hidden-layer neurons is adjusted automatically depending on the error criterion. Various spread values between 0 and 1 were considered for the RBF simulations. The MSE values obtained for the different simulations with varying spread constants are presented and discussed in “Results and discussion”. The performance evaluation measures are the MSE and the coefficient of determination (R²) between the simulated and observed suspended sediment loads.
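The two performance measures named above can be computed as in the following sketch. The text does not give the formula used for R²; the definition 1 − SSres/SStot is assumed here, whereas some studies report the squared correlation coefficient instead. The function names and sample values are hypothetical.

```python
def mse(observed, simulated):
    """Mean square error between observed and simulated values."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)

def r_squared(observed, simulated):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and simulated loads (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.0, 2.5, 4.0]
error = mse(observed, simulated)
fit = r_squared(observed, simulated)
```

Because the MSE carries the squared units of the variable, it is only comparable between models evaluated on the same series, whereas R² is dimensionless.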

The GRNN structure is similar to that of RBF and FFBP, the only difference being that no stopping criterion needs to be set before training. The results obtained for the different simulations are presented and discussed in “Results and discussion”.

Results and discussion

Simulation of suspended load with rainfall data (simulation I)

In the first step of the experiments, the suspended sediment load was simulated using rainfall data only. The rainfall series used to simulate suspended sediment load in this study were downloaded from the website of the Indian Meteorological Department (IMD), Ministry of Earth Sciences, Government of India (http://www.indiawaterportal.org/met_data/), and it could not be confirmed whether the downloaded data represent effective rainfall. Effective rainfall is the rainfall remaining after losses from infiltration, depression storage, and uptake by plants; because data on these losses were not available, the effective rainfall could not be computed. It was therefore a challenge to test the simulation capability of the neural networks using untreated precipitation data. Table 2 (Column: simulation I) shows the time periods of the data used in the training and testing stages.

The combinations of inputs and hidden-layer configurations tested for this simulation are presented in Table 3, and the corresponding plots are shown in Figs. 3 and 4. The numbers of hidden-layer nodes in Column II of Table 3 and the spread parameters in Columns V and VIII of Table 3 were obtained by trying several hidden-layer sizes and spread values for a given number of input nodes (Fig. 3). The optimum set of inputs in FFBP, RBF, and GRNN seemed to be the same.

Table 3 Performance criteria values (MSE and R²) for ANNs obtained for testing period

Fig. 3 Plots obtained for simulation I. a Error values obtained with varying hidden layer neurons for different input parameters using FFBP. b Minimum error values obtained for each input parameter for FFBP. c Error values obtained with varying spread constant for different input parameters using RBF. d Minimum error values obtained for each input parameter for RBF. e Error values obtained with varying spread constant for different input parameters using GRNN. f Minimum error values obtained for each input parameter for GRNN

Fig. 4 Plots obtained for simulation I. a Suspended sediments load obtained using FFBP vs experimental. b Suspended sediments load obtained using RBF vs experimental. c Suspended sediments load obtained using GRNN vs experimental

In the case of FFBP, the best performance, i.e., the lowest MSE (176 ton²/day²) with an R² value of 0.48 for the testing period, was obtained with 5 nodes in the hidden layer and only one input in the input layer. For RBF, the best values (MSE = 6126 ton²/day², R² = 0.24) for the testing period were obtained when the spread constant was set to 0.1 and the input layer consisted of 5 inputs. The 5 input nodes represent the monthly mean rainfall of 5 consecutive months (Rt, Rt−1, Rt−2, Rt−3, and Rt−4), whereas the single output node corresponds to the monthly mean suspended sediment load at month t.

For GRNN, the best predictions were obtained with a spread constant of 0.4 (MSE = 280, R² = 0.54). These optimum values were obtained when the GRNN was fed with three inputs representing the rainfall of months Rt, Rt−1, and Rt−2.
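The lagged input vectors described above, such as (Rt, Rt−1, Rt−2) paired with the sediment load at month t, can be assembled as in the following sketch. The function name `make_lagged_samples` and the toy series are hypothetical; how the study handled the first months of the record, for which not all lags exist, is not stated, so those months are simply skipped here.

```python
def make_lagged_samples(inputs, targets, n_lags):
    """Pair [x_t, x_(t-1), ..., x_(t-n_lags+1)] with the target y_t."""
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    samples = []
    # Start at the first month for which all required lags are available.
    for t in range(n_lags - 1, len(inputs)):
        features = [inputs[t - k] for k in range(n_lags)]
        samples.append((features, targets[t]))
    return samples

# Hypothetical monthly rainfall and sediment load series.
rain = [10, 20, 30, 40, 50, 60]
load = [1, 2, 3, 4, 5, 6]
samples = make_lagged_samples(rain, load, 5)
```

Each additional lag removes one usable month from the start of the series, so networks with more inputs are trained on slightly fewer samples.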

The monthly mean suspended sediment load for each month presented in Table 4 is consistent with the MSE values presented in Table 3, except for the RBF algorithm. The average values of suspended sediment load obtained through simulation agree well with the observed values, as shown in Fig. 4a–c for FFBP, RBF, and GRNN, respectively. As these figures show, the simulated trend matches the observed values better for FFBP and GRNN than for RBF.

Table 4 Values of total sediment load for testing data set

The simulations on the testing data, and the resulting average suspended sediment loads, revealed that rainfall values alone are not sufficient as inputs to capture the features of the suspended sediment load series over time. The low R² values between the observed and simulated suspended sediment loads arise because the predicted monthly mean suspended sediment loads are either under- or overestimated relative to the experimental values, for both the training and the test data sets. These findings are in line with those of Alp and Cigizoglu (2007), who also showed that rainfall alone is not enough for the prediction of suspended sediment load.

Simulation of suspended load with water discharge data (simulation II)

This simulation was carried out using the monthly mean water discharge data as input to all three ANN algorithms. Water discharge was chosen as input because improved simulation performance might be expected, since water discharge is measured together with suspended sediment load at the gauging sites.

ANN modelling with water discharge data was adopted because the relationship between water discharge and suspended sediment is non-linear and highly complex, so that a mathematical relationship is hard to formulate (Alp and Cigizoglu 2007).

The combinations of inputs and hidden-layer configurations tested for this simulation are presented in Table 3, and the corresponding plots are shown in Figs. 5 and 6. The numbers of hidden-layer nodes in Column II of Table 3 and the spread parameters in Columns V and VIII of Table 3 were obtained by trying several hidden-layer sizes and spread values for a given number of input nodes (Fig. 5). The optimum set of inputs in FFBP, RBF, and GRNN seemed to be the same.

Fig. 5 Plots obtained for simulation II. a Error values obtained with varying hidden layer neurons for different input parameters using FFBP. b Minimum error values obtained for each input parameter for FFBP. c Error values obtained with varying spread constant for different input parameters using RBF. d Minimum error values obtained for each input parameter for RBF. e Error values obtained with varying spread constant for different input parameters using GRNN. f Minimum error values obtained for each input parameter for GRNN

Fig. 6

Plots obtained for simulation II. a Suspended sediments load obtained using FFBP vs experimental. b Suspended sediments load obtained using RBF vs experimental. c Suspended sediments load obtained using GRNN vs experimental

In the case of FFBP, the best performance, i.e., the lowest MSE (78 ton²/day²) with an R² value of 0.85 for the testing period, was obtained when the network had 9 nodes in the hidden layer and the input layer contained only three inputs representing monthly mean water discharge over a time period of 3 months (Qt, Qt−1, Qt−3). For RBF, the closest values (MSE = 28 ton²/day², R² = 0.80) for the testing period were obtained when the spread constant was set at 0.2 and the input layer consisted of 1 input representing the monthly mean water discharge of 1 month (Qt).

For GRNN, the best prediction values were obtained when the spread constant was kept at 0.1 (MSE = 64, R² = 0.06). These optimum values were obtained when the GRNN network was fed with 5 inputs representing the monthly mean water discharge of 5 months (Qt, Qt−1, Qt−2, Qt−3, and Qt−4). The average values of suspended sediment load obtained through simulation agree well with the observed values, as shown in Fig. 6a–c for FFBP, RBF, and GRNN, respectively. As observed from these figures, the RBF and GRNN algorithms performed better than FFBP.
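
The lagged input sets quoted above, such as (Qt, Qt−1, Qt−3), can be assembled from a monthly series as sketched below. The sketch assumes that the sediment load of month t is predicted from discharge values at month t and at earlier months; the helper name and the numbers are hypothetical and are not data from the study.

```python
def make_lagged(series, lags, target):
    """Build input rows and targets; lags (0, 1, 3) mean Qt, Qt-1, Qt-3."""
    if len(series) != len(target):
        raise ValueError("series and target must have the same length")
    start = max(lags)  # first month for which every lagged value exists
    rows, targets = [], []
    for t in range(start, len(series)):
        rows.append([series[t - k] for k in lags])
        targets.append(target[t])
    return rows, targets


# Made-up monthly values for illustration only.
discharge = [5.0, 7.0, 9.0, 14.0, 11.0, 8.0]
sediment = [1.0, 1.5, 2.2, 4.0, 3.1, 1.9]

X, y = make_lagged(discharge, (0, 1, 3), sediment)
for row, load in zip(X, y):
    print(row, load)
```

Each additional lag shortens the usable record by the largest lag, so input sets with deeper lags are fitted and tested on slightly fewer months.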

Simulation of suspended load with water discharge and rainfall data (simulation III)

The third simulation was carried out by taking both rainfall and water discharge data as input. The combinations of inputs and hidden-layer sizes tested for this simulation are presented in Table 3, and the plots obtained are shown in Figs. 7 and 8. The number of hidden-layer nodes shown in Column II of Table 3 and the spread parameters in Columns V and VIII of Table 3 were obtained by trying several values of the hidden-layer size and spread parameter for a given set of input nodes (Fig. 7). The optimum set of inputs in FFBP, RBF, and GRNN seemed to be the same.

Fig. 7

Plots obtained for simulation III. a Error values obtained with varying hidden layer neurons for different input parameters using FFBP. b Minimum error values obtained for each input parameter for FFBP. c Error values obtained with varying spread constant for different input parameters using RBF. d Minimum error values obtained for each input parameter for RBF. e Error values obtained with varying spread constant for different input parameters using GRNN. f Minimum error values obtained for each input parameter for GRNN

Fig. 8

Plots obtained for simulation III. a Suspended sediments load obtained using FFBP vs experimental. b Suspended sediments load obtained using RBF vs experimental. c Suspended sediments load obtained using GRNN vs experimental

In the case of FFBP, the best performance, i.e., the lowest MSE (43 ton²/day²) with an R² value of 0.92 for the testing period, was obtained when the network had 3 nodes in the hidden layer and the input layer contained 4 inputs representing monthly mean rainfall and water discharge of four months (Rt, Rt−1, Rt−2, Qt).

For RBF, the closest values (MSE = 34 ton²/day², R² = 0.85) for the testing period were obtained when the spread constant was set at 0.4 and the input layer consisted of 4 inputs representing monthly mean rainfall and water discharge of two months (Rt, Rt−1, Qt, Qt−1). For GRNN, the best prediction values were obtained when the spread constant was kept at 0.2 (MSE = 67, R² = 0.94). These optimum values were obtained when the GRNN network was fed with 5 inputs representing monthly mean rainfall and water discharge of three months (Rt, Rt−1, Rt−2, Qt, and Qt−1). The average values of suspended sediment load obtained through simulation agree well with the observed values, as shown in Fig. 8a–c for FFBP, RBF, and GRNN, respectively.

The best trend is obtained for all three algorithms when only water discharge data are used as input. However, the correlation values improve significantly when both water discharge and rainfall data are used for modelling the ANN.

The trends for all three algorithms are almost similar to the experimental trend for both simulations I and II. However, for simulation I, the trends for FFBP and GRNN match the experimental trend, whereas the trend obtained using RBF deviates somewhat from the experimental values. The reason might be that, in this case, the RBF is capturing other features of the time series data, such as trend, cycle, and seasonal variations.

Conclusion

In this study, the hydrometeorological parameters of monthly mean rainfall and water discharge are used to predict the monthly mean suspended sediment load using three ANN algorithms, namely, FFBP, RBF, and GRNN. It was found that rainfall values alone were not sufficient to predict the suspended sediment load correctly. However, using water discharge values as input significantly improves the performance of all three algorithms, which is also evident from the correlation values obtained for the simulation runs. It is also to be noted that the best correlation is obtained when both water discharge and rainfall data are used for modelling the ANN. The reason might be that rainfall data ignore the watershed features which are central to the erosion and suspended sediment transport process. When rainfall was combined with water discharge data, the performance of the three algorithms improved in comparison with the simulation in which rainfall alone was used as input, but it was not superior to the result obtained using water discharge alone as input.

The importance of this study is that the ANN models developed successfully capture the features of the input data for the prediction of suspended sediment values. Furthermore, the non-linear dynamics among the variables contributing to the suspended sediment load were also accounted for by the ANN methods.

The prediction of suspended sediment load for important rivers like the Ganga in India carries important design information for water resource projects such as dam and reservoir construction. Furthermore, if the ANNs were trained on exceptional data obtained during natural calamities such as floods, torrential rain, or cloud bursts, the suspended sediment values could be predicted to some extent, so that the design of water projects can take these exceptional predicted values into account and thereby limit the damage that may occur in the future.