Abstract
Accurate and reliable suspended sediment load (SSL) prediction models are necessary for the planning and management of water resource structures. In this study, four machine learning techniques, namely gradient boosted tree (GBT), random forest (RF), support vector machine (SVM) and artificial neural network (ANN), were developed to predict SSL at the Rantau Panjang station on the Johor River basin (JRB), Malaysia. Four evaluation criteria, the correlation coefficient (R), root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE) and scatter index (SI), were used to evaluate the performance of the proposed models. The results revealed that all the proposed machine learning (ML) models performed well in predicting daily SSL. The models were compared using the Taylor diagram. The ANN model gave more reliable results than the other models, with R of 0.989, SI of 0.199, RMSE of 0.011053 and NSE of 0.979. A sensitivity analysis of the models to the input variables revealed that the absence of the previous-day suspended sediment load (SSLt-1) had the greatest effect on the SSL prediction. Moreover, to validate the most accurate model, the data were divided into 50% training, 25% testing and 25% validation sets, and the ANN again provided superior performance. Therefore, the proposed ANN approach is recommended as the most accurate model for SSL prediction.
Introduction
Watershed sediment loads are an ecological risk, and their assessment is important for the development of protection measures, reservoir sustainability, hydropower generation, flood control and water supply (Lin et al. 2010; Ziyan 2012). In many rivers, sediment is transported in suspension, and the suspended sediment load (SSL) is a key quantity in the design of channels, culverts and dams (Targhi et al. 2017). For water resource management and environmental protection programmes, understanding potential sediment loads is critical (Melesse et al. 2011). Indeed, SSL models are seen as a key component of the planning and management activities carried out by resource managers in a watershed. Soil erosion, which is directly connected to sediment transport, remains a significant global environmental concern. Monitoring soil erosion and sediment transport can be repetitive and challenging; thus, detailed prediction techniques are important for decision-making (Gajbhiye et al. 2015; Gajbhiye and Ashish 2014). Estimating SSL is therefore a vital step in addressing water quality problems and managing water resources. In recent decades, machine learning (ML) approaches have been employed successfully in water resources management, particularly for modelling processes about which knowledge is limited (Yoon et al. 2011; Wu 2020).
Literature review
Artificial intelligence approaches have proved to be robust replacements for popular conventional approaches, and several studies, reviewed in this section, have demonstrated the superiority of these newer methods. Shadkani et al. (2020) applied various ML techniques, namely the multi-layer perceptron (MLP), GBT and the multi-layer perceptron with stochastic gradient descent (MLPSGD), to evaluate SSL in the U.S. The results of the MLP were enhanced by optimization with SGD, and the MLPSGD model was therefore recommended as a reliable approach to predicting SSL. Nevertheless, this algorithm cannot always guarantee an optimal solution, and it is time-consuming. Also in the USA, Melesse et al. (2011) assessed the artificial neural network modelling technique. An MLP-ANN with the error backpropagation algorithm was used to predict the sediment load SL(t) from past 24-hour and 7-day hydroclimatological data (precipitation, current discharge Q(t), antecedent discharge Q(t-1) and antecedent sediment load SL(t − 1)). The artificial neural network model was compared with other models and gave better results, and daily predictions were better than weekly predictions. This motivates us to use an ANN in this work and to compare it with other ML models; it may also support our input design, which uses combinations of daily inputs. Kisi (2012) investigated the capability of SVM for modelling the relation between Q(t) and SSL. The SVM was compared with an ANN and the sediment rating curve (SRC), and the comparison showed that the SVM model was able to produce better results than the other models. We therefore continue the investigation into which approach is better for SSL prediction. Pham et al. (2018) used a hybrid ML ensemble method, the Rotation Forest-based Radial Basis Function neural network (RF-RBF), for spatial prediction in India, and the results showed that the RF-RBF model had better accuracy and ability than the other techniques considered. Nevertheless, hybrid methods may have complicated architectures and give only frequency resolution. Ehteram (2021) proposed combining an advanced optimizer, the whale algorithm (WA), with an ANN, and showed that hybridizing a multi-objective algorithm with the whale algorithm and the ANN model drastically enhances the accuracy of daily SSL prediction. Rashidi et al. (2016) determined a proper input combination for developing an SSL prediction model and examined the effect of pre-processing the input parameters with the Gamma Test (GT) on the performance of SVM with polynomial and RBF kernels. These results could help us to reduce the uncertainty in the SVM parameters. That study also demonstrated the ability to identify the significance of the input variables, making the gamma test a valuable technical test for pre-processing the input parameters used to predict SSL. For estimating daily suspended sediment concentration (SSC), Taşar et al. (2017) implemented ANN and M5 tree (M5T) methods together with statistical methods such as SRC and multi-linear regression (MLR). When these methods were compared, the ANN model predicted suspended sediment more accurately than the other approaches. This further evidence encourages us to use the ANN technique. Al-Mukhtar (2019) deployed three artificial intelligence approaches, RF, SVM and ANN, to model and predict the SSL at Sarai Station in Baghdad. The results showed that RF performed best, which motivates us to investigate further by comparing these methods in our research. Daily and monthly SSL prediction was studied by Nourani and Andalib (2015).
The effectiveness of the wavelet-based least square support vector machine (WLSSVM) model was compared with that of other models in multistep-ahead SSL prediction. The results showed that in the daily scenario the WLSSVM model gave better results than the ANN, while in monthly modelling the ANN showed slightly better accuracy than the WLSSVM. The wavelet-based ANN (WANN) model gave similar values in daily prediction and dissimilar values in monthly prediction, and adding the wavelet transform increased the accuracy of both models. In addition, Olyaie et al. (2015) compared various approaches, namely the adaptive neuro-fuzzy inference system (ANFIS), SRC, ANN and the coupled wavelet and neural network (WANN), for assessing the daily SSL at stations in the USA. Overall, the outcomes showed that the WANN method was able to predict SSL satisfactorily, to estimate cumulative SSL suitably and to predict peak SSL values reasonably. However, although use of the wavelet transform has grown in recent years since its beginnings in the early 1980s, it is still not as widespread as the Fourier transform. Fourier analysis, for its part, has a significant disadvantage: when a signal is transformed to the frequency domain, the time information is lost. Chen and Chau (2016) modelled daily SSL using hybrid double feed forward neural networks (HDFNNs), which unite fuzzy pattern recognition and the continuity equation in a double neural network structure. The results showed that the HDFNN is suitable for modelling sediment transport processes with non-linear, fuzzy and time-varying characteristics, and that it could be recommended as an effective method for predicting SSL. As mentioned above, the drawback of such hybrid models is their complicated structure. Sharafati et al. (2018) developed different ML approaches, gradient boosting regression (GBR), AdaBoost regression (ABR) and random forest regression (RFR), for predicting SSL.
The forecasts depended on daily values of Q(t) and SSC, and the RFR approach showed a slight lead in predictive performance. Nourani et al. (2021) developed artificial intelligence-based ensemble approaches to SSL modelling, using several single models: SVM, feed forward neural networks (FFNNs), ANFIS and one conventional multi-linear regression (MLR) model. Overall, the ML-based ensemble performed very well in SSL prediction. Singh et al. (2016) developed novel ANN approaches for evaluating best management practices (BMPs) for controlling sediment yield and runoff from a small agricultural watershed. These approaches integrated the complicated non-linear effects of the main climate, topography, drainage and management features and could evaluate BMP efficiency without assumptions about their performance or physical mechanisms. The statistical parameters indicated that the ANN models performed well. We conclude from the above review that this research area is still very active and that many optimization methods and modelling approaches are being tested for predicting SSL. Our review indicates that ML models are the most advantageous choice, since they provide the best representation of SSL forecasts.
Problem statement
Predicting SSL is a nonlinear and complicated problem that involves the interaction of numerous physical and hydrological variables that change in time and space, and it is therefore not an easy task. Because of these characteristics, and because it is one of the most important tasks in river basin investigation, establishing a relationship between SSL and discharge (Qt) has motivated researchers all over the world (Salih et al. 2019). Recently, with the development of soft computing and data-driven methods, hydraulic engineers have tended to predict SSL with regression and ML approaches. Conventional approaches such as the multilinear regression model, the autoregressive model and sediment rating curves have problems owing to their limited ability to account for nonlinearity and non-stationarity in environmental and hydrological data, problems that have been resolved in the newer computing methods mentioned above (Jain 2001). Lately, artificial intelligence methods have been recognised as an effective substitute for modelling complex nonlinear systems. Usually, such models do not consider the internal processes but build models from the input-output relationship. ML models attempt to represent all the chemical and physical processes involved in statistical terms, with variables derived from historical data. In general, the equations involved are simplified in order to obtain solutions suitable for the models, and solving them may require assumptions and simplifications that are derived from the performance of the models. It is therefore important for investigators to enhance the ability of modelling methods to adequately capture the nonlinearity and stochastic nature of the SSL and Q(t) data. More attention to developing more flexible SSL prediction systems is thus necessary, and ML models allow us to develop software solutions for all these problems much faster than conventional models.
Objective of study
The present work aims to attain good accuracy and predictive reliability for artificial intelligence models of SSL. To this end, we focus on possible improvements that could contribute to the quality of the prediction method. The specific objectives are: to develop four algorithms, namely gradient boosted tree (GBT), random forest (RF), support vector machine (SVM) and artificial neural network (ANN), for predicting SSL at the Rantau Panjang station on the Johor River basin (JRB), Malaysia, using data from 1988–1998; to examine the sensitivity of the proposed models to different input combinations; to investigate and compare the accuracy of the four ML models in predicting SSL; and to validate the accuracy of the proposed algorithms by predicting unseen data.
Material and methods
Case study and data
This work was carried out in the Johor River basin (JRB) in the southern part of Peninsular Malaysia (Fig. 1). The entire region of the basin is 1842.8 square kilometres, with annual precipitation of 2500 mm (Tan 2014). The river is 122.7 km long and originates from Gunung Belumut (the second highest mountain of Johor) in the north of the basin. It flows north–south and then southwest to the Johor Strait. The two largest tributaries in the northern basin are the Linggiu and Sayong rivers. The JRB is situated between 1°30 – 10 – N and 103°20 – 10 – E and has a surface area of approximately 1652 sq. km. The JRB ranges in elevation from 3 to 977 m above mean sea level. The JRB's main land uses are perennial farming (oil palm and rubber) and forest. The basin's major soil type is the Ultisols (Rengam-Jerangau) series. It is a well-drained, moderately permeable, yellowish-brown sandy clay, ideal for oil palm and rubber planting (Tan et al. 2015). There are four control stations in the area: Tanah Jengli, Felda Inas, Johor Tenggara and Rantau Panjang. Data from the Rantau Panjang station, located on the Johor River (1°47′07.4"N 103°44′14.2"E), are used for sediment prediction in the present study. The basin consists primarily of oil palm plantation, supplemented by oak, rainforest, secondary forest, swamp forest and water bodies. The JRB has a tropical monsoon climate, with the north-east monsoon from November to February and the south-west monsoon from May to August (Tangang et al. 2012). Flooding frequently occurs in December, when the maximum precipitation and peak flow are reported. The Q(t) and SSL data for the Rantau Panjang station (middle basin) are daily records covering ten years, from 1988 to 1998 (source: Department of Irrigation and Drainage, Ministry of Environment and Water, Malaysia). Fig. 2 presents the monthly average SSL and shows how the sediment load for a given month fluctuated over the ten years. Fig. 3 illustrates the discharge, which is quite random and extremely nonlinear, with a mean of 26.72 m3/s and a maximum of 650.66 m3/s. In addition, a simple statistical analysis of the data was conducted, as shown in Table 1; this is an important step that should be carried out before such a historical dataset is used, to ensure that it is reliable.
Machine learning methods
One of the machine learning models is the Gradient Boosted Tree (GBT) model, whose algorithm rests on the idea that combining the decisions of many specialists is superior to relying on a single specialist when determining an intricate function. With such models, the accuracy of weak classification algorithms can be enhanced through boosting (Friedman 2001). In gradient boosting, the model is built as a combination of simple models, and each simple model is trained to correct the errors of the previous one. Gradient boosted trees are commonly used by data scientists to achieve better results in various machine learning tasks (Chen 2016). Unlike linear models such as logistic regression, gradient boosted decision trees are flexible and better able to capture nonlinearity and interactions between inputs. They scale well, are robust to outliers and can naturally model nonlinear decision boundaries because of their hierarchical structure. This type of model is well suited to handling numerical features and categorical features with ten different classes. More details can be found in the supplementary materials (Shadkani et al. 2020).
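The idea that each simple model is trained on the errors of the previous one can be sketched in a few lines. The following is a minimal illustration only, assuming squared-error loss, depth-one regression trees (stumps) on a single input, and a hypothetical learning rate of 0.5; the function names and all settings are assumptions for the example and are not taken from the study.

```python
def fit_stump(x, residuals):
    """Fit a depth-one regression tree: one threshold, one mean per side."""
    best = None
    # Try every split that leaves both sides non-empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def gradient_boost(x, y, n_rounds=50, learning_rate=0.5):
    """Build an additive model; each stump is fitted to the current residuals."""
    base = sum(y) / len(y)              # initial prediction: the mean of y
    stumps = []
    predictions = [base] * len(y)
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * stump(xi)
                       for xi, pi in zip(x, predictions)]

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    return predict


def mean_squared_error(model, x, y):
    return sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)


# Synthetic toy data (not from the Johor River records).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [xi ** 2 for xi in x]
model = gradient_boost(x, y)
```

With squared-error loss, the residuals are the negative gradient of the loss, which is why fitting stumps to residuals is a form of gradient descent in function space. On the toy data the training error shrinks as rounds are added.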
The Random Forest (RF) model consists of a number of simple trees built with randomization. For classification, the trees vote for the most common class; for regression, their responses are averaged to estimate the dependent variable. Using an ensemble of trees can increase prediction accuracy considerably, that is, it improves average predictive performance on new data. The classification form of RF is not used in this study. RF regression is a tree-based algorithm that is commonly used in different fields of artificial intelligence. Random Forest produces many predictor trees simultaneously and trains them individually. The final result is obtained as the mode of the classes predicted by the individual trees in classification, and as the average of the individual tree predictions in regression. The key component of the Random Forest regression algorithm is the use of bootstrap aggregation (bagging) for the tree learners. Bagging improves model output by lowering variance without increasing bias. Additional information regarding RF can be found in Sharafati et al. (2018).
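The bootstrap-and-average step can be illustrated with a short sketch. To keep the example small, a one-nearest-neighbour predictor stands in for the regression trees that a real random forest would use; that substitution, the function names, the number of learners and the seed are all assumptions made for illustration.

```python
import random


def bootstrap_sample(x, y, rng):
    """Draw len(x) training pairs with replacement."""
    idx = [rng.randrange(len(x)) for _ in range(len(x))]
    return [x[i] for i in idx], [y[i] for i in idx]


def nearest_neighbour_predict(x_train, y_train, query):
    """Stand-in base learner (a real random forest would fit a tree here)."""
    nearest = min(range(len(x_train)), key=lambda i: abs(x_train[i] - query))
    return y_train[nearest]


def bagged_predict(x, y, query, n_learners=100, seed=0):
    """Average the predictions of learners trained on bootstrap samples."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_learners):
        xb, yb = bootstrap_sample(x, y, rng)
        predictions.append(nearest_neighbour_predict(xb, yb, query))
    return sum(predictions) / len(predictions)


# Synthetic toy data (not from the study).
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
estimate = bagged_predict(x, y, query=2.5)
```

Each learner sees a slightly different resampled dataset, so individual predictions vary; averaging them is what lowers the variance of the ensemble.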
Another important ML model is the support vector machine (SVM). Support vectors are the training points closest to the separating hyperplane. The decision function is given by the hyperplane that separates the positive and negative data with the maximum margin, that is, the hyperplane that maximises both the distance to the closest positive point and the distance to the closest negative point; more information is available in Aljanabi et al. (2018). There are two kinds of SVM regression model: the first is called epsilon-SVM regression and the second nu-SVM regression. In addition, four forms of kernel function are used with SVM:
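Linear Kernel: \(k({y}_{i},{y}_{j})={y}_{i}^{T}{y}_{j}\)
Polynomial Kernel: \(k({y}_{i},{y}_{j})={\left(\gamma {y}_{i}^{T}{y}_{j}+r\right)}^{d},\ \gamma >0\)
Radial Basis Kernel: \(k({y}_{i},{y}_{j})=\mathrm{exp}\left(-\gamma {\Vert {y}_{i}-{y}_{j}\Vert }^{2}\right),\ \gamma >0\)
Sigmoid Kernel: \(k({y}_{i},{y}_{j})=\mathrm{tanh}\left(\gamma {y}_{i}^{T}{y}_{j}+r\right)\)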
where \(k({y}_{i},{y}_{j})\) is the kernel function, whose value equals the inner product of the vectors \({y}_{i}\) and \({y}_{j}\) in the feature space, \(\phi \left({y}_{i}\right)\) and \(\phi \left({y}_{j}\right)\), that is, \(k({y}_{i},{y}_{j}) = \phi \left({y}_{i}\right) \cdot \phi \left({y}_{j}\right).\) Here r, γ and d are the kernel parameters.
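For concreteness, the four kernel forms can be written as plain functions. This is a sketch using the standard textbook definitions of the linear, polynomial, radial basis and sigmoid kernels; the default values of `gamma`, `r` and `d` are placeholders chosen for the example, not the settings used in the study.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, r=0.0, d=2):
    return (gamma * dot(u, v) + r) ** d


def rbf_kernel(u, v, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=1.0, r=0.0):
    return math.tanh(gamma * dot(u, v) + r)
```

Note that the radial basis kernel has a single shape parameter (`gamma`), whereas the polynomial kernel has three (`gamma`, `r`, `d`), so the radial basis kernel has fewer values to tune.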
The concept of the ANN relates to the learning capability of the human brain and its application through computer algorithms. Once the model is trained, applying a certain input to it produces a certain output. To match the output with the targets, the network is adapted according to the difference between outputs and targets. Typically, many such input-output pairs are used to train the network, a procedure known as supervised learning. The model is specified by: the data processing unit of the neural network (the artificial neuron); the structure of the neural network, that is, the group of neurons, the pattern of connections between them and the connection weights or strengths; and the training or learning algorithm or algorithms used to alter the connection weights so that the network models the problem. A neuron can be regarded as a function of two or more inputs, all of which accept data of a single numerical type within a predetermined range. Such a function is typically written as \(y = {Neuron}_{w} (x),\) where x is the set of input connections and w is the set of neuron parameters (Aliev and Guirimov 2014). The simple model is realized as:
\(y=f\left(\sum_{\iota =1}^{a}{w}_{\iota }{x}_{\iota }\right)\)
Here a is the number of inputs to the neuron, \({x}_{\iota }\) is the ι-th input to the neuron and \({w}_{\iota }\) is the weight of the ι-th connection, for \(\iota = 1, \dots , a\). f is the activation function; it is typically nonlinear for neurons that are not input neurons, that is, for the hidden neurons, and it may be linear for output neurons. One ANN structure that is commonly used and falls under supervised learning is the feedforward ANN, which is trained with the backpropagation rule. The multi-layer perceptron ANN topology is trained with the standard backpropagation (SBP) learning rule (Anderson 1995).
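The neuron model described above, an activation function applied to a weighted sum of inputs, can be sketched as follows, together with a forward pass through one hidden layer. The sigmoid hidden activation, the linear output neuron, the absence of bias terms and the weight values are assumptions made for illustration, not the configuration used in the study.

```python
import math


def neuron(inputs, weights, activation):
    """y = f(sum of w_i * x_i): weighted sum followed by an activation."""
    return activation(sum(w * x for w, x in zip(weights, inputs)))


def sigmoid(z):
    """Nonlinear activation, used here for the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    """Linear activation, used here for the output neuron."""
    return z


def mlp_forward(inputs, hidden_weights, output_weights):
    """Feed the inputs through one hidden layer and a single output neuron."""
    hidden = [neuron(inputs, w, sigmoid) for w in hidden_weights]
    return neuron(hidden, output_weights, identity)


# Hypothetical weights for a network with 2 inputs and 2 hidden neurons.
output = mlp_forward([1.0, 2.0], [[0.5, -0.25], [0.0, 0.0]], [2.0, 2.0])
```

Training by backpropagation would adjust `hidden_weights` and `output_weights` to reduce the difference between `output` and the target; only the forward computation is shown here.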
Performance metrics
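Several performance indicators are used to evaluate the efficiency and accuracy of the implemented approaches. In this paper, the correlation coefficient (R), scatter index (SI), root mean square error (RMSE) and Nash-Sutcliffe efficiency (NSE) are used. The SSL prediction is more accurate the closer R and NSE are to 1 and the closer SI and RMSE are to zero. These measures are defined as follows (Samadianfard et al. 2020):
\(R=\frac{\sum_{i=1}^{N}\left({O}_{i}-\bar{O}\right)\left({P}_{i}-\bar{P}\right)}{\sqrt{\sum_{i=1}^{N}{\left({O}_{i}-\bar{O}\right)}^{2}\sum_{i=1}^{N}{\left({P}_{i}-\bar{P}\right)}^{2}}}\)
\(RMSE=\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left({P}_{i}-{O}_{i}\right)}^{2}}\)
\(NSE=1-\frac{\sum_{i=1}^{N}{\left({O}_{i}-{P}_{i}\right)}^{2}}{\sum_{i=1}^{N}{\left({O}_{i}-\bar{O}\right)}^{2}}\)
\(SI=\frac{RMSE}{\bar{O}}\)
where \({P}_{i}\) is the estimated value and \({O}_{i}\) is the observed value at time i, \(\bar{O}\) and \(\bar{P}\) are the means of the observed and estimated values, and N is the number of observations.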
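The four evaluation measures can be computed directly from paired observed and estimated series. The sketch below uses the usual definitions (Pearson correlation for R, RMSE, NSE as one minus the ratio of squared errors to the variance of the observations about their mean, and SI as RMSE divided by the mean observation); the function names are illustrative, and the definition of SI in particular is an assumption that should be checked against the cited source.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation_coefficient(obs, pred):
    o_bar, p_bar = mean(obs), mean(pred)
    covariance = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    spread_obs = sum((o - o_bar) ** 2 for o in obs)
    spread_pred = sum((p - p_bar) ** 2 for p in pred)
    return covariance / math.sqrt(spread_obs * spread_pred)


def rmse(obs, pred):
    return math.sqrt(mean([(p - o) ** 2 for o, p in zip(obs, pred)]))


def nse(obs, pred):
    o_bar = mean(obs)
    squared_error = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread_obs = sum((o - o_bar) ** 2 for o in obs)
    return 1.0 - squared_error / spread_obs


def scatter_index(obs, pred):
    return rmse(obs, pred) / mean(obs)
```

A useful sanity check on NSE is that predicting the mean of the observations everywhere gives NSE = 0, so any model with NSE above zero improves on that trivial baseline.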
Normalizing the data is important for modelling applications. Commonly used normalization methods include the min-max procedure, the value process and the peak process. In general, the inputs vary widely in scale, and normalization can shorten the training period by starting the training process from a similar scale for every feature. Normalizing the data to the range [0,1] has been adopted by researchers such as Cigizoglu (2004), and it is therefore used in this study. All the data in the training and testing sets were scaled to [0,1] according to the equation below; scaling to a uniform range is recommended by Masters (“Practical Neural Network Recipes in C++” 2020). The calculation is as follows:
\({\chi }_{n}=\frac{{\chi }_{\iota }^{i}-{\chi }_{\iota min}^{i}}{{\chi }_{\iota max}^{i}-{\chi }_{\iota min}^{i}}\)
where \({\chi }_{n}\) is the normalized value of the parameter, \({\chi }_{\iota }^{i}\) is the measured value of the specific variable, \({\chi }_{\iota max}^{i}\) is the maximum value of this variable in the data and \({\chi }_{\iota min}^{i}\) is its minimum value.
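Min-max scaling to [0,1] and its inverse can be sketched as below. The helper names are illustrative, and the sample values are placeholders rather than station records. Computing the minimum and maximum once and reusing them, so that predictions can be mapped back to physical units, is a design choice assumed here and not something the text specifies.

```python
def fit_min_max(values):
    """Return the (minimum, maximum) pair used for scaling."""
    return min(values), max(values)


def min_max_scale(values, lo, hi):
    """Map each value to (value - lo) / (hi - lo)."""
    return [(v - lo) / (hi - lo) for v in values]


def inverse_min_max(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


# Placeholder discharge-like values (illustrative only).
flows = [5.0, 26.72, 650.66]
lo, hi = fit_min_max(flows)
scaled = min_max_scale(flows, lo, hi)
restored = inverse_min_max(scaled, lo, hi)
```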
Sensitivity analysis is a method for identifying the input variables that most affect the output variable. In this work it is used to estimate the effects of various input designs of the parameters SSL and Q(t) on the ML models. The performance of the different input combinations is evaluated using the scatter index (SI). In general, we compare networks with various input combinations in which some parameters are removed, and the resulting change in accuracy indicates the importance of each input for predicting SSL and helps to establish a more effective model. Next, to select the best model, the observed SSL values and those predicted by the most accurate combination of each of the four ML models are compared for the Rantau Panjang station on the Johor River basin (JRB), Malaysia, using Taylor diagrams (TDs).
Results and discussion
The full historical dataset of daily SSL and Q covers the ten-year interval between 1988 and 1998. Following the Pareto principle (the 80–20 rule), the data were divided into 80% for the training set and 20% for the testing set. The training set is used to find the relationship between the dependent and independent variables, whereas the test set is used to evaluate the models' performance. This division is adopted for the SSL prediction models in the present study. Moreover, choosing proper input variables is an essential step in implementing models that predict SSL accurately. To this end, various methods were considered, for instance Pearson correlation (Buyukyildiz and Kumcu 2017), autocorrelation (Himanshu et al. 2017) and the partial autocorrelation function (Kisi and Yaseen 2019), and autocorrelation was finally selected because of its simplicity and efficiency in estimating the optimum set of input parameters for artificial intelligence models (Shadkani et al. 2020). Table 2 presents the relation between SSL and Q(t) for the Rantau Panjang station on the Johor River basin (JRB). Table 2 shows that \({SSL}_{t}\) has the highest correlation with SSLt-1, SSLt-2, SSLt-3 and Qt, Qt-1, Qt-2, Qt-3 (where t denotes the daily time step). The correlation decreases as the daily lag increases, so the sensitivity of the models to including these lags is examined further, as in Table 3.
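The input design described above, lagged SSL and discharge values selected by autocorrelation and followed by an 80/20 split, can be sketched as follows. The function names are illustrative. Splitting in chronological order is an assumption made here, since the text does not state how records were assigned to the two sets, and the short series are synthetic.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    m = sum(series) / n
    numerator = sum((series[t] - m) * (series[t - lag] - m)
                    for t in range(lag, n))
    denominator = sum((v - m) ** 2 for v in series)
    return numerator / denominator


def make_lagged_samples(ssl, q, n_lags):
    """Build (features, target) pairs.

    Features: SSL(t-1)..SSL(t-n_lags), then Q(t), Q(t-1)..Q(t-n_lags).
    Target:   SSL(t).
    """
    samples = []
    for t in range(n_lags, len(ssl)):
        features = ([ssl[t - k] for k in range(1, n_lags + 1)]
                    + [q[t - k] for k in range(0, n_lags + 1)])
        samples.append((features, ssl[t]))
    return samples


def chronological_split(samples, train_fraction=0.8):
    """First train_fraction of the samples for training, the rest for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Synthetic daily series (illustrative only).
ssl = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
q = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
samples = make_lagged_samples(ssl, q, n_lags=3)
train, test = chronological_split(samples)
```

Each additional lag costs one usable sample at the start of the record, which is negligible for ten years of daily data but visible in a ten-point toy series.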
Goodness of fit of the machine learning models
The comparison between the developed models in Table 4 shows that Comb. 9 to Comb. 12 have the highest R, at 0.97, and the lowest RMSE, at 0.01. Table 4 also depicts the variable importance for Comb. 12, whose inputs are the most important predictors for the RF model. All models nevertheless perform well and achieve a high level of accuracy, with NSE values of around 0.9. The random forest method generates a sequence of simple trees, and the complexity of each tree in the forest is determined by the settings described below. Predicted values and other statistics were computed for all observations (samples), as in Table 4. An advanced stopping condition was used to enable early stopping of the Random Forest training algorithm, that is, to stop adding trees before the full number of trees has been added to the model, with the number of cycles used to calculate the mean error set to 10 and the percentage decrease in training error set to 5. The number of predictors for the tree models is 1, and the number of simple regression trees computed in the sequential forest-building steps is 100. Table 4 indicates the robustness of the RF, which performs well across the different input structures, with correlation coefficients ranging from 0.8 to 0.9. Comb. 7 delivers the best performance compared with the other combinations; for this combination the RF has its lowest RMSE and highest NSE, 0.031 and 0.832, respectively. Figure 4 is the scatter diagram of actual and predicted SSL.
The proposed SVM method was developed in three different stages. The first stage is a training session in which the SVM parameters are adjusted; the model then switches to a validation stage, using data not seen in the training session, to ensure that it has been trained effectively. The goal of the validation stage is to ensure that the model generalizes to input data it was not trained on and has not merely memorized the limited range of input–output relationships encountered in the training data (Shahin et al. 2002). Ten-fold cross-validation, which is typically applied before switching the model between the training and testing sessions and which is considered as the subtree of the entire tree, is used here (Aljanabi et al. 2018). The SVM method with the RBF kernel is used for one-step-ahead SSL prediction with the raw Q-SSL dataset. This type of SVM is more attractive than other SVM versions because (1) unlike the linear kernel, the RBF kernel can handle the case where the relationship between class labels and attributes is nonlinear; (2) it tends to perform best under general smoothness assumptions; and (3) it has fewer tuning parameters than the polynomial and sigmoid kernels (Noori 2011). Because the search for model parameter sets is vital to obtaining reliable predictive performance from SVM, we searched for the best SVM design by testing two types with the RBF kernel function, Epsilon (ε) and Nu. The Epsilon-RBF type is used for SSL prediction. The Nu-RBF type slightly outperformed it for most input models, with an R-square of 0.94 compared with 0.92 for the Epsilon-RBF type, as in Fig. 5. Results for model performance in the testing and training phases are shown in Table 4, which shows that RMSE decreases and NSE increases from Comb. 4 until the best values, 0.0149 and 0.962, respectively, are reached at Comb. 11. Comb. 11 therefore provides the most accurate estimate, with R of 0.984 and SI of 0.27. The models thus performed acceptably in predicting SSL in the JRB.
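Ten-fold cross-validation partitions the training records into ten folds and holds each fold out once. A minimal index-based sketch is given below; the function name and the use of contiguous, unshuffled folds are assumptions for illustration and do not describe the software used in the study.

```python
def k_fold_indices(n_samples, k=10):
    """Return k (training_indices, held_out_indices) pairs covering all samples."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        held_out = list(range(start, start + size))
        training = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((training, held_out))
        start += size
    return folds


folds = k_fold_indices(25, k=10)
```

In a parameter search, a candidate setting (for example a value of the RBF parameter γ) would be scored on each held-out fold in turn and the scores averaged, so that every record contributes to both fitting and checking.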
For the ANN model, the MLP architecture is used in this study. Three hidden layers are used, even though prior investigations have shown that a single hidden layer is enough for an ANN to approximate any complicated nonlinear function (Cybenko 1989). The process of establishing the model weights is known as training and is analogous to calibrating a mathematical model. The ANN is trained on samples of inputs paired with known outputs. At the start of training, the weights are initialized to a set of random values or on the basis of some prior knowledge. The weights are then systematically adjusted by the learning algorithm so that, for a given input, the difference between the model response and the observed data is small. Learning examples are presented to the network repeatedly, and the process terminates when this difference is smaller than a specified value, at which point the ANN model is considered trained. The best network performance was obtained by training the ANN with the gradient descent algorithm and 5 neurons in the hidden layers. All the input combinations performed well, as shown in Table 4. The accuracy of the network increases gradually with the combination number. Thus, the largest RMSE, 0.0243, was obtained for Comb. 1, while the lowest (best) RMSE, 0.0115, was found for Comb. 12. The best NSE values, 0.97, were obtained for Comb. 9, 10 and 11.
Models comparison
In this section, the Taylor diagram is used to compare, across the four machine learning models, the most accurate combinations selected in the previous sections. As shown in Fig. 6, the ANN model ranked as the most accurate, with a predicted standard deviation of 255.71, which is close to the actual standard deviation of 259.38. Although the GBT model also has a predicted standard deviation close to the actual value, at 249.679, SVM ranked as the second-best model for daily SSL modelling, performing better than GBT and RF with a higher R of 0.98. The ANN performs slightly better than SVM, consistent with the earlier assessment in which the ANN was the best of the models. The RF model has a predicted standard deviation of 217.42, far from the actual value.
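A Taylor diagram places each model using three statistics: the standard deviation of its predictions, its correlation with the observations, and the centred root-mean-square difference, which are linked by a law-of-cosines relation. The sketch below computes these quantities for one pair of series; the function name and the sample values are illustrative, and population (divide-by-N) standard deviations are an assumption of the example.

```python
import math


def taylor_statistics(obs, pred):
    """Return (sd_obs, sd_pred, correlation, centred_rms_difference)."""
    n = len(obs)
    o_bar = sum(obs) / n
    p_bar = sum(pred) / n
    sd_obs = math.sqrt(sum((o - o_bar) ** 2 for o in obs) / n)
    sd_pred = math.sqrt(sum((p - p_bar) ** 2 for p in pred) / n)
    covariance = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred)) / n
    correlation = covariance / (sd_obs * sd_pred)
    centred_rms = math.sqrt(
        sum(((p - p_bar) - (o - o_bar)) ** 2 for o, p in zip(obs, pred)) / n)
    return sd_obs, sd_pred, correlation, centred_rms


# Synthetic values (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.4, 3.8, 5.1]
sd_o, sd_p, r, crms = taylor_statistics(observed, predicted)
```

The identity crms² = sd_o² + sd_p² − 2·sd_o·sd_p·r is what allows all three statistics to be read from a single point on the diagram, so a model plotted near the observation point has both a similar standard deviation and a high correlation.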
Sensitivity analysis
To check the effects of the input variables on SSL modelling, the scatter index is evaluated for several sets of input parameters. For this purpose, Comb. 12 of the ANN model was chosen for the sensitivity analysis as the most effective model at the Rantau Panjang station on the Johor River. Table 5 displays the effect of eliminating each input parameter on the accuracy of the ANN. As shown in Table 5, the accuracy of the ANN model increases if any of the parameters Qt-6, Qt-4 and SSLt-4 is removed. Removing the variable SSLt-1 has the most significant impact on the accuracy of the ANN: the accuracy is reduced, with a scatter index error value of 0.28. Similarly, SSLt-2 and Qt-5 contribute to estimation efficiency, and removing them increases the SI error.
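The leave-one-input-out procedure can be expressed generically: evaluate the full input set, then re-evaluate with each input removed and record the change in error. In the sketch below, `evaluate` is a hypothetical callable that would retrain the model on the given inputs and return its scatter index. The toy scoring function and its importance weights are placeholders invented so that the example runs; they are not results from the study.

```python
def leave_one_out_sensitivity(input_names, evaluate):
    """Return the baseline error and the error change when each input is dropped."""
    baseline = evaluate(input_names)
    changes = {}
    for name in input_names:
        reduced = [n for n in input_names if n != name]
        # Positive change: error rose without this input, so the input matters.
        changes[name] = evaluate(reduced) - baseline
    return baseline, changes


# Placeholder scoring function (hypothetical weights, NOT study results):
# the error falls as the summed importance of the retained inputs rises.
hypothetical_importance = {"SSL_t-1": 3.0, "Q_t": 1.0, "Q_t-6": 0.0}


def toy_evaluate(names):
    return 1.0 / (1.0 + sum(hypothetical_importance[n] for n in names))


baseline, changes = leave_one_out_sensitivity(
    list(hypothetical_importance), toy_evaluate)
most_influential = max(changes, key=changes.get)
```

With a real model, a negative change for some input would correspond to the case reported above in which removing a variable improves accuracy.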
Artificial neural network (ANN) validation
Finally, the most accurate model (ANN) is validated in order to assess its reliability in predicting SSL. The main concern associated with validation experiments is how to divide the dataset fairly. In this study, three different dataset splitting groups were investigated in order to assess the range of possibilities: Group 1 (50% training, 25% testing, 25% validation), Group 2 (75% training, 15% testing, 10% validation) and Group 3 (80% training, 10% testing, 10% validation). The random sub-sampling method was applied 1,000 times to each of the three splitting groups. The concern inherent in random sub-sampling is that some observations might never be chosen for the validation subsample, whereas others might be chosen several times. Performing the sampling 1,000 times keeps any such disparities to a minimum. With the first proposed split (50% training and 25% each for testing and validation), the NSE for the validation sample was 0.941. Figure 7a shows how close the predicted values are to the actual values, with R = 0.974. With the Group 2 split (75% training, 15% testing and 10% validation), the results were NSE = 0.899 and R = 0.948, as in Fig. 7b. Lastly, the results of dividing the dataset into 80% training and 10% each for testing and validation can be seen in Fig. 7c, where the NSE over the validation sample was 0.937 and R was 0.968. Fig. 7 shows that the Group 1 split (50% training, 25% testing and 25% validation) provides the most suitable outcome for validating the ANN model. Consequently, all the results indicate that the ANN has superior performance and could be a promising tool for predicting daily SSL in the Johor River basin (JRB) in Malaysia.
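Repeated random sub-sampling can be sketched as shuffling the record indices and cutting them into training, testing and validation parts, repeated many times. The function names, the seed and the toy sample count below are assumptions for illustration; only the split fractions and the 1,000 repetitions come from the text.

```python
import random


def random_subsample_split(n_samples, fractions, rng):
    """Shuffle indices and cut them into (training, testing, validation) parts."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    return (indices[:n_train],
            indices[n_train:n_train + n_test],
            indices[n_train + n_test:])


def repeated_splits(n_samples, fractions, n_repeats=1000, seed=0):
    """Generate n_repeats independent random splits from one seeded generator."""
    rng = random.Random(seed)
    return [random_subsample_split(n_samples, fractions, rng)
            for _ in range(n_repeats)]


# Group 1 proportions: 50% training, 25% testing, 25% validation.
splits = repeated_splits(100, (0.5, 0.25, 0.25))
```

Counting how often each index lands in the validation part across the repetitions is a direct way to confirm that no observation is systematically left out.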
Conclusion
In this study, the ability of GBT, SVR, RF and ANN models to predict the daily SSL of the Johor River basin (JRB) in Malaysia was investigated. For this purpose, data from the period 1988–1999 were split into two samples (80% for training and 20% for testing) and used to develop the four proposed models. The autocorrelation technique was used to select relevant inputs. Irrelevant and less significant inputs were then eliminated, and only the dominant input variables were applied in various combinations to predict SSL. Each model was evaluated using error measures (RMSE, MAE, NSE and R); in general, all models demonstrated good performance and achieved the main objective of this research. However, comparing the ML models with each other showed that the ANN model outperformed all other models, with R = 0.989, SI = 0.199, RMSE = 0.011053 and NSE = 0.979. A sensitivity analysis was conducted to determine which input variable has the greatest impact on SSL prediction. The best input combination was found to be SSLt-1, SSLt-2, SSLt-3, SSLt-4, SSLt-5, SSLt-6, Qt, Qt-1, Qt-2, Qt-3, Qt-4, Qt-5, Qt-6. To validate the ANN model, three data-splitting groups were examined, and the most appropriate group was selected. The ANN model displayed reliability in predicting daily SSL on unseen data. Future work could evaluate the accuracy of the GBT, SVM, RF and ANN models in predicting SSL in different rivers and climates, and apply other data-driven techniques to the same data for the Johor River basin (JRB).
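For readers who wish to reproduce the best input combination listed above, the lagged inputs can be assembled from the daily SSL and discharge (Q) series along the following lines. This is a sketch under assumptions: the function name, the column labels and the short invented series are hypothetical, and the two series are taken to be aligned day by day with no gaps.

```python
from typing import Dict, List, Sequence, Tuple


def build_lagged_inputs(ssl: Sequence[float], q: Sequence[float],
                        max_lag: int = 6) -> Tuple[List[Dict[str, float]], List[float]]:
    """Build one row per day t holding SSL(t-1..t-max_lag), Q(t) and Q(t-1..t-max_lag).

    The target for each row is SSL(t). The first max_lag days are skipped
    because their lagged values are not available.
    """
    rows: List[Dict[str, float]] = []
    targets: List[float] = []
    for t in range(max_lag, len(ssl)):
        row: Dict[str, float] = {}
        for k in range(1, max_lag + 1):
            row[f"SSL_t-{k}"] = ssl[t - k]
        row["Q_t"] = q[t]
        for k in range(1, max_lag + 1):
            row[f"Q_t-{k}"] = q[t - k]
        rows.append(row)
        targets.append(ssl[t])
    return rows, targets


# Invented ten-day series, used only to show the shape of the output.
ssl_series = [float(v) for v in range(10)]
q_series = [float(v) for v in range(10, 20)]

rows, targets = build_lagged_inputs(ssl_series, q_series)
print(len(rows), "rows with", len(rows[0]), "inputs each")
print(list(rows[0]))
```

With six lags of SSL, the current discharge and six lags of discharge, each row carries the thirteen inputs named in the combination above.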
References
Aliev RA, Guirimov BG (2014) Type-2 fuzzy neural networks and their applications. Springer
Aljanabi QA, Chik Z, Allawi MF, El-Shafie AH, Ahmed AN, El-Shafie A (2018) Support vector regression-based model for prediction of behavior stone column parameters in soft clay under highway embankment. Neural Comput Appl 30(8):2459–2469. https://doi.org/10.1007/s00521-016-2807-5
Al-Mukhtar M (2019) Random forest, support vector machine, and neural networks to modelling suspended sediment in Tigris River-Baghdad. Environ Monit Assess191(11). https://doi.org/10.1007/s10661-019-7821-5
Anderson (1995) An introduction to neural networks
Buyukyildiz M, Kumcu SY (2017) An estimation of the suspended sediment load using adaptive network based fuzzy inference system, support vector machine and artificial neural network models. Water Resour Manag 31(4):1343–1359
Chen T (2016) “XGBoost : A Scalable Tree Boosting System,”
Chen XY, Chau KW (2016) “A Hybrid Double Feedforward Neural Network for Suspended Sediment Load Estimation”. https://doi.org/10.1007/s11269-016-1281-2
Cigizoglu HK (2004) Estimation and forecasting of daily suspended sediment data by multi-layer perceptrons. Adv Water Resour 27(2):185–195. https://doi.org/10.1016/j.advwatres.2003.10.003
Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst 2(4):303–314. https://doi.org/10.1007/BF02551274
Ehteram M et al (2021) Design of a hybrid ANN multi-objective whale algorithm for suspended sediment load prediction. Environ Sci Pollut Res 28(2):1596–1611. https://doi.org/10.1007/s11356-020-10421-y
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Statist 29(5):1189–1232
Gajbhiye S, Mishra SK, Pandey A (2015) Simplified sediment yield index model incorporating parameter curve number. Arab J Geosci 8(4):1993–2004
Gajbhiye S, Ashish SKM (2014) Relationship between SCS-CN and sediment yield. https://doi.org/10.1007/s13201-013-0152-8
Himanshu SK, Pandey A, Yadav B (2017) Ensemble wavelet-support vector machine approach for prediction of suspended sediment load using hydrometeorological data. J Hydrol Eng 22(7):5017006
Jain SK (2001) Development of integrated sediment rating curves using ANNS. J Hydraul Eng 127(1):30–37
Kisi O (2012) Modeling discharge-suspended sediment relationship using least square support vector machine. J Hydrol 456–457:110–120. https://doi.org/10.1016/j.jhydrol.2012.06.019
Kisi O, Yaseen ZM (2019) The potential of hybrid evolutionary fuzzy intelligence model for suspended sediment concentration prediction. CATENA 174:11–23
Lin J, Cheng C, Chau K (2010) Using support vector machines for long-term discharge prediction. Hydrol Sci J 51:599–612. https://doi.org/10.1623/hysj.51.4.599
Melesse AM, Ahmad S, Mcclain ME, Wang X, Lim YH (2011) Suspended sediment load prediction of river systems: An artificial neural network approach. Agric Water Manag 98(5):855–866. https://doi.org/10.1016/j.agwat.2010.12.012
Noori R et al (2011) Assessment of input variables determination on the SVM model performance using PCA, Gamma test, and forward selection techniques for monthly stream flow prediction. J Hydrol 401(3–4):177–189. https://doi.org/10.1016/j.jhydrol.2011.02.021
Nourani V, Andalib G (2015) Daily and monthly suspended sediment load predictions using wavelet based artificial intelligence approaches. J Mt Sci 12(1):85–100. https://doi.org/10.1007/s11629-014-3121-2
Nourani V, Gokcekus H, Gelete G (2021) Estimation of suspended sediment load using artificial intelligence-based ensemble model. Complexity 2021. https://doi.org/10.1155/2021/6633760
Olyaie E, Banejad H, Chau K (2015) A comparison of various artificial intelligence approaches performance for estimating suspended sediment load of river systems : a case study in United States. https://doi.org/10.1007/s10661-015-4381-1
Pham BT, Shirzadi A, Tien Bui D, Prakash I, Dholakia MB (2018) A hybrid machine learning ensemble approach based on a Radial Basis Function neural network and Rotation Forest for landslide susceptibility modeling: a case study in the Himalayan area, India. Int J Sediment Res 33(2):157–170. https://doi.org/10.1016/j.ijsrc.2017.09.008
Practical Neural Network Recipes in C++ - Timothy Masters - Google Books. https://books.google.com.my/books?hl=en&lr=&id=7Ez_Pq0sp2EC&oi=fnd&pg=PR17&dq=Masters+1993&ots=e16ziwUjhS&sig=ce9nZTd79_wTL1ZD2nUfnAe1lIs&redir_esc=y#v=onepage&q=Masters1993&f=false. Accessed 18 Feb 2020
Rashidi S, Vafakhah M, Lafdani EK, Javadi MR (2016) Evaluating the support vector machine for suspended sediment load forecasting based on gamma test. Arab J Geosci 9(11). https://doi.org/10.1007/s12517-016-2601-9
Salih ZM, Sharafati SQ, Khosravi A, Faris K, Kisi F, Tao O, Ali H, Yaseen M (2019) River suspended sediment load prediction based on river discharge information: application of newly developed data mining models. Hydrol Sci J. https://doi.org/10.1080/02626667.2019.1703186
Samadianfard S, Hashemi S, Kargar K, Izadyar M (2020) Wind speed prediction using a hybrid model of the multi-layer perceptron and whale optimization algorithm. Energy Rep 6:1147–1159. https://doi.org/10.1016/j.egyr.2020.05.001
Shadkani S, Abbaspour A, Samadianfard S, Hashemi S, Mosavi A, Band SS (2021) Comparative study of multilayer perceptron-stochastic gradient descent and gradient boosted trees for predicting daily suspended sediment load: the case study of the Mississippi River, U.S. Int J Sediment Res 36(4):512–523. https://doi.org/10.1016/j.ijsrc.2020.10.001
Shahin MA, Maier HR, Jaksa MB (2002) Predicting Settlement of Shallow Foundations using Neural Networks. J Geotech Geoenvironmental Eng 128(9):785–793. https://doi.org/10.1061/(asce)1090-0241(2002)128:9(785)
Sharafati A, Haji Seyed Asadollah SB, Motta D, Yaseen ZM (2018) Application of newly developed ensemble machine learning models for daily suspended sediment load prediction and related uncertainty analysis. Hydrol Sci J 0(0):1–21. https://doi.org/10.1080/02626667.2020.1786571
Singh HV, Thompson AM, Gharabaghi B (2016) Event runoff and sediment-yield neural network models for assessment and design of management practices for small agricultural watersheds. J Hydrol Eng 1–12. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001457
Tan ML (2014) Free internet datasets for streamflow modelling using SWAT in the Johor river basin, Malaysia. IOP Conf Ser Earth Environ Sci 18(1). https://doi.org/10.1088/1755-1315/18/1/012193
Tan ML, Ibrahim AL, Yusop Z, Duan Z, Ling L (2015) Impacts de l’utilisation des sols et de la variabilité climatique sur les composantes hydrologiques dans le bassin du fleuve Johor, en Malaisie. Hydrol Sci J 60(5):873–889. https://doi.org/10.1080/02626667.2014.967246
Tangang FT, Juneng L, Salimun E, Sei K, Halimatun M (2012) Climate change and variability over Malaysia: gaps in science and research information. Sains Malaysiana 41(11):1355–1366
Targhi AT, Abbaszadeh S, Arabasadi Z (2017) A hybrid method for forecasting river suspended sediments in Iran. Int J River Basin Manag 0:1–26. https://doi.org/10.1080/15715124.2017.1315815
Taşar B, Kaya YZ, Varçin H, Üneş F, Demirci M (2017) Forecasting of Suspended Sediment in Rivers Using Artificial Neural Networks Approach. Int J Adv Eng Res Sci 4(12):79–84. https://doi.org/10.22161/ijaers.4.12.14
Wu C (2020) “A hybrid model coupled with singular spectrum analysis for daily rainfall prediction A hybrid model coupled with singular spectrum analysis for daily rainfall prediction,” no. April. https://doi.org/10.2166/hydro.2010.032
Yoon H, Jun S, Hyun Y, Bae G, Lee K (2011) A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coast. J Hydrol 396(1–2):128–138. https://doi.org/10.1016/j.jhydrol.2010.11.002
Ziyan Z (2012) “Early Flood Warning for Linyi Watershed by the GRAPES / XXT Model Using TIGGE Data,” no. 973, pp. 103–111. https://doi.org/10.1007/s13351-012-0110-7.1.
Acknowledgements
This research was supported by the Ministry of Education (MOE) through Fundamental Research Grant Scheme (FRGS/1/2020/TK0/UNITEN/02/16). The authors would like to acknowledge the access of data from Department of Irrigation and Drainage Malaysia (JPS).
Additional information
Communicated by H. Babaie.
Cite this article
Hanoon, M.S., Abdullatif B, A.A., Ahmed, A.N. et al. A comparison of various machine learning approaches performance for prediction suspended sediment load of river systems: a case study in Malaysia. Earth Sci Inform 15, 91–104 (2022). https://doi.org/10.1007/s12145-021-00689-0
DOI: https://doi.org/10.1007/s12145-021-00689-0