Introduction

Watershed sediment loads pose an ecological risk, and their assessment is important for developing protection measures and for reservoir sustainability, hydropower generation, flood control and water supply (Lin et al. 2010; Ziyan 2012). In many rivers, sediment is transported in suspension, and the suspended sediment load (SSL) is a key quantity in the design of channels, culverts and dams (Targhi et al. 2017). Understanding potential sediment loads is critical for water resource management and environmental protection programmes (Melesse et al. 2011). Indeed, SSL models are seen as a key component of the planning and management activities carried out by resource managers in a watershed. Soil erosion, which is directly connected to sediment transport, remains a significant global environmental concern. Continuous monitoring of soil erosion and sediment transport can be repetitive and challenging; thus, reliable prediction techniques are important for decision-making (Gajbhiye et al. 2015; Gajbhiye and Ashish 2014). Estimating SSL is therefore a vital step in addressing water quality problems and in managing water resources. In recent decades, machine learning (ML) approaches have been employed successfully in water resources management, particularly for modelling processes about which knowledge is limited (Yoon et al. 2011; Wu 2020).

Literature review

Artificial intelligence approaches have proved to be robust replacements for conventional approaches, and several studies, reviewed in this section, have demonstrated the superiority of these newer methods. Shadkani et al. (2020) applied various ML techniques, including the multi-layer perceptron (MLP), gradient boosted tree (GBT) and multi-layer perceptron with stochastic gradient descent (MLPSGD), to evaluate SSL in the U.S. The results of the MLP method were enhanced by SGD optimisation, and the MLPSGD model was recommended as a reliable approach for predicting SSL. Nevertheless, this algorithm cannot always guarantee an optimal solution and is time-consuming. Also in the USA, Melesse et al. (2011) carried out an assessment using an artificial neural network (ANN) modelling technique. An MLP-ANN with an error backpropagation algorithm was used to predict the sediment load SL(t) from hydroclimatological data for the past 24 hours and 7 days: precipitation, current discharge Q(t), antecedent discharge Q(t-1) and antecedent sediment load SL(t − 1). The ANN model was compared with other models and gave better results. In addition, daily prediction gave better results than weekly prediction. This motivates us to use ANN in this work and to compare it with other ML models. It may also support our input design, which is based on daily input combinations. Kisi (2012) investigated the capability of SVM for modelling the relation between Q(t) and SSL, comparing it with ANN and the sediment rating curve (SRC). The comparison showed that the SVM model was able to generate better results than the other models. We therefore continue the investigation into which approach is better for SSL prediction. Pham et al. (2018) used a hybrid ML ensemble method, the Rotation Forest-based Radial Basis Function neural network (RF-RBF), for spatial prediction in India, and the results showed that the RF-RBF model had better accuracy and ability than the other techniques. Nevertheless, hybrid methods may have a complicated architecture and give only frequency resolution. Ehteram (2021) proposed an advanced optimizer, the whale algorithm (WA), with ANN, and showed that hybridizing a multi-objective algorithm with the whale algorithm and the ANN model drastically enhances the accuracy of daily SSL prediction. Rashidi et al. (2016) determined a proper input combination for developing an SSL prediction model and examined the effect of pre-processing the input parameters with the Gamma Test (GT) on the performance of SVM with polynomial and RBF kernels. These results could help us to reduce the uncertainty in the SVM parameters. That study also demonstrated the ability of the gamma test to identify the significance of the input variables, making it a valuable technical test for pre-processing input parameters when predicting SSL. To estimate daily suspended sediment concentration (SSC), Taşar et al. (2017) implemented ANN and M5tree (M5T) methods together with statistical methods such as SRC and MLR. These methods were compared, and the ANN model gave better predictions than the other approaches. This further evidence encourages us to use the ANN technique. Al-Mukhtar (2019) deployed three artificial intelligence approaches, RF, SVM and ANN, to model and predict the SSL at Sarai Station in Baghdad. The results showed that RF performed best among them, which motivates us to investigate further by comparing these methods in our research. Daily and monthly SSL prediction was studied in Nourani and Andalib (2015). 
The effectiveness of the wavelet-based least square support vector machine (WLSSVM) model was compared with that of other models in multistep-ahead SSL prediction. The results showed that, in the daily scenario, the model gave the best results compared with ANN, while in monthly modelling the ANN model was slightly more accurate than WLSSVM. The wavelet-based ANN (WANN) model gave similar values in daily prediction and dissimilar values in monthly prediction, and adding the wavelet transform increased the accuracy of both models. In addition, Olyaie et al. (2015) compared various approaches, namely the adaptive neuro-fuzzy inference system (ANFIS), SRC, ANN and the coupled wavelet and neural network (WANN), for assessing daily SSL at stations in the USA. In general, the outcomes showed that the WANN method was able to predict SSL satisfactorily, to estimate cumulative SSL suitably and to predict peak SSL amounts reasonably. However, although use of the wavelet transform has grown in recent years since its introduction in the early 1980s, it is still not as widely used as the Fourier transform. Fourier analysis has a significant disadvantage: when a signal is transformed to the frequency domain, time information is lost. Chen and Chau (2016) modelled daily SSL with hybrid double feed forward neural networks (HDFNNs), which unite fuzzy pattern recognition and a continuity equation in the structure of a double neural network. The results showed that the HDFNN is suitable for modelling sediment transport processes with nonlinear, fuzzy and time-varying characteristics, and that it could be recommended as an effective method for predicting SSL. As mentioned above, the drawback of such hybrid models is their complicated structure. Sharafati et al. (2018) developed different ML approaches, gradient boosting regression (GBR), AdaBoost regression (ABR) and RF, for predicting SSL. 
The forecasts were based on daily values of Q(t) and SSC, and the RF regression (RFR) approach showed a slight lead in predictive performance. Nourani et al. (2021) developed artificial intelligence-based ensemble approaches to SSL modelling that used several single models: SVM, feed forward neural networks (FFNNs), ANFIS and one conventional multi-linear regression (MLR) model. In general, the ML-based ensemble provided excellent performance in SSL prediction. Singh et al. (2016) developed novel ANN approaches for evaluating best management practices (BMPs) for controlling sediment yield and runoff from a small agricultural watershed. These approaches integrated the complicated nonlinear effects of the main climate, topography, drainage and management features and could evaluate BMP efficiency without assumptions about their performance or physical mechanisms. The statistical parameters indicated that the ANN models performed well. We conclude from the review above that this research area is still very active and that many optimization methods and modelling approaches are being tested for predicting SSL. Our review indicates that ML models are the most advantageous choice, since they provide the best representation of SSL in forecasting.

Problem statement

Predicting SSL is a nonlinear and complicated problem that involves the interaction of numerous physical and hydrological variables that change in time and space, and it is therefore not an easy task. Given these characteristics, and because it is the most substantial task in river basin investigation, establishing a relationship between SSL and discharge (Qt) has motivated researchers all over the world (Salih et al. 2019). Recently, with the development of soft computing and data-driven methods, hydraulic engineers have tended to predict SSL with regression and ML approaches. Conventional approaches such as the multilinear regression model, the autoregressive model and sediment rating curves have problems because of their limited ability to account for nonlinearity and non-stationarity in environmental and hydrological datasets, problems that have been resolved in the newer computing methods mentioned above (Jain 2001). Lately, artificial intelligence methods have been recognised as an effective substitute for modelling complex nonlinear systems. Usually, such models do not consider the internal processes but build the model from the input-output relationship. ML models try to represent all the chemical and physical processes involved in statistical terms, with variables derived from historical data. In general, the difference equations are simplified so that solutions appropriate for the models can be obtained. Solving the equations involved may require assumptions and simplifications that are derived from the performance of the models. It is therefore important for investigators to enhance the ability of modelling methods to capture adequately the nonlinearity and the stochastic features of the SSL and Q(t) data. More attention should thus be paid to developing a more flexible SSL prediction system; ML models allow software solutions to all these problems to be developed much faster than with conventional models.

Objective of study

The present work aims to attain good accuracy and predictive reliability for artificial intelligence models of SSL. To this end, we focus on possible improvements that could contribute to the quality of the prediction method. Accordingly, the objectives of this work are: (1) to develop four algorithms, namely gradient boosted tree (GBT), random forest (RF), support vector machine (SVM) and artificial neural network (ANN), for predicting SSL at the Rantau Panjang station on the Johor River basin (JRB), Malaysia, using data from 1988–1998; (2) to examine the sensitivity of the proposed models to different input combinations; (3) to investigate and compare the accuracy of the four ML models in predicting SSL; and (4) to validate the accuracy of the proposed algorithms by predicting unseen data.

Material and methods

Case study and data

This work was carried out in the Johor River basin (JRB) in the southern part of Peninsular Malaysia (Fig. 1). The entire region of the basin is 1842.8 square kilometres, with annual precipitation of 2500 mm (Tan 2014). The river is 122.7 km long and originates at Gunung Belumut (the second highest mountain in Johor) in the north of the basin. It flows north–south and then southwest to the Johor Strait. The two largest tributaries in the northern basin are the Linggiu and Sayong rivers. The JRB is situated between 1°30 – 10 – N and 103°20 – 10 – E and has a surface area of approximately 1652 sq. km. Elevation in the JRB ranges from 3 to 977 m above sea level. The main forms of land use in the JRB are perennial farming (oil palm and rubber) and forest. The major soil type in the basin is the Ultisols (Rengam-Jerangau) series, a well-drained, moderately permeable, yellowish-brown sandy clay ideal for oil palm and rubber planting (Tan et al. 2015). There are four control stations in this area: Tanah Jengli, Felda Inas, Johor Tenggara and Rantau Panjang. Data from the Rantau Panjang station, situated on the Johor River (1°47′07.4"N 103°44′14.2"E), are used for sediment prediction in the present study. The basin consists primarily of oil palm plantation, supplemented by oak, rainforest, secondary forest, water bodies and swamp forest. The JRB has a tropical monsoon climate, with the north-east monsoon from November to February and the southwest monsoon from May to August (Tangang et al. 2012). Flooding frequently occurs in December, when the maximum precipitation and peak flow are reported. The Q(t) and SSL data for the Rantau Panjang station (middle basin) are daily records covering ten years, from 1988 to 1998 (source: Department of Irrigation and Drainage, Ministry of Environment and Water, Malaysia). Fig. 2, which presents the monthly average SSL, shows how the sediment load for the same month fluctuated over the ten years. Fig. 3 illustrates the discharge, which is quite random and extremely nonlinear, with a mean of 26.72 m3/s and a maximum of 650.66 m3/s. In addition, a simple statistical analysis of the data, which is an important step in ensuring that a historical dataset is reliable before it is used, is summarised in Table 1.

Fig. 1 Site of the study area in Peninsular Malaysia

Fig. 2 Monthly average suspended sediment load (SSL)

Fig. 3 Daily discharge at the Rantau Panjang station

Table 1 Simple statistics for measured Suspended sediment load (SSL) and Discharge Q(t)

Machine learning methods

One of the machine learning models is the Gradient Boosted Tree (GBT) model, whose algorithm is based on the idea that the combined decisions of many experts are superior to the choice of a single expert when determining an intricate function. With such models, the accuracy of weak classification algorithms can be enhanced by boosting (Friedman 2001). In gradient-boosted methods, the model is built as a combination of simple models, and each simple model is trained to correct the errors of the previous simple model. Gradient boosted trees are commonly used by data scientists to achieve better results in various machine learning tasks (Chen 2016). Unlike linear models such as logistic regression, gradient boosted decision trees are flexible and well suited to nonlinear effects and interactions between inputs. They scale well, are robust to outliers and, because of their hierarchical structure, can naturally represent nonlinear decision boundaries. This type of model is ideal for handling numerical features and categorical features with ten different classes. More details can be found in the supplementary materials of Shadkani et al. (2020).
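The residual-fitting idea described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the one-split regression stumps, the learning rate, the number of rounds and the sample values are all assumptions chosen to keep the example short.

```python
# Minimal gradient boosting for squared-error regression on a single input.
# Every weak learner is a one-split "stump" fitted to the current residuals.
# All data and hyperparameters below are hypothetical, for illustration only.

def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    # The largest value is excluded so the right-hand side is never empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_gbt(x, y, n_rounds=50, learning_rate=0.1):
    """Start from the mean of y, then add stumps fitted to the residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * predict_stump(stump, xi)
                       for xi, pi in zip(x, predictions)]
    return base, stumps, learning_rate


def predict_gbt(model, xi):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


def mse(model, x, y):
    return sum((predict_gbt(model, xi) - yi) ** 2
               for xi, yi in zip(x, y)) / len(y)


# Hypothetical normalized discharge (x) and SSL (y) values.
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
y = [0.02, 0.03, 0.05, 0.10, 0.18, 0.30, 0.45, 0.70]
few = fit_gbt(x, y, n_rounds=5)
many = fit_gbt(x, y, n_rounds=50)
print(mse(few, x, y), mse(many, x, y))
```

In this sketch the training error falls as more stumps are added, because each round removes part of the residual left by the previous rounds.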

The Random Forest (RF) model consists of a number of simple trees built on random samples. In classification, the trees vote for the most common class; in regression, their responses are averaged to estimate the dependent variable. Using an ensemble of trees can increase prediction accuracy considerably, that is, it improves the average predictability of new data cases. The RF classification type is not used in this study. RF regression is a tree-based algorithm that is commonly used in different fields of artificial intelligence. Random Forest builds many predictor trees simultaneously and trains them individually. The final result is obtained, in classification, by taking the mode of the classes predicted by all the trees and, in regression, by averaging the predictions of the individual trees. The key component of the Random Forest regression algorithm is the use of bootstrap aggregating (bagging) for the tree learners. Bagging improves model output by lowering the variance without increasing the bias. Additional information regarding RF can be found in Sharafati et al. (2018).

Another important ML model is the support vector machine (SVM). The support vectors are the training points closest to the separating hyperplane. Among the hyperplanes that are able to separate the positive and negative data, the one with the largest margin defines the decision function. That is, the SVM maximises the distance between the hyperplane and the closest positive example and between the hyperplane and the closest negative example; more information is available in Aljanabi et al. (2018). There are two types of SVM regression model: type 1, known as epsilon-SVM regression, and type 2, known as nu-SVM regression. In addition, four forms of kernel function are used with SVM:

Linear Kernel:

$$k= k({y}_{\mathrm{\i} },{y}_{j})= {y}_{\mathrm{\i} }^{T}. {y}_{j}$$
(1)

Polynomial Kernel:

$$k= k\left({y}_{\mathrm{\i} },{y}_{j}\right)={({\gamma y}_{\mathrm{\i} }^{T}. {y}_{j}+r)}^{d} , \gamma >0$$
(2)

Radial Basis Kernel:

$$k=k\left({y}_{\mathrm{\i} },{y}_{j}\right)=\exp\left(-\gamma {\Vert {y}_{\mathrm{\i} }-{y}_{j}\Vert }^{2}\right), \gamma >0$$
(3)

Sigmoid Kernel:

$$k= k\left({y}_{\mathrm{\i} },{y}_{j}\right)=\tanh ({y}_{\mathrm{\i} }^{T}. {y}_{j}+r)$$
(4)

where \(k({y}_{\mathrm{\i} },{y}_{j})\) is the kernel function, whose value equals the inner product of the images of the vectors \({y}_{\mathrm{\i} }\) and \({y}_{j}\) in the feature space, \(\phi \left({y}_{\mathrm{\i} }\right)\) and \(\phi \left({y}_{j}\right)\); that is, \(k({y}_{\mathrm{\i} },{y}_{j}) = \phi \left({y}_{\mathrm{\i} }\right) \cdot \phi \left({y}_{j}\right).\) Here r, γ and d are the kernel parameters.
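For illustration, the four kernels can be written as plain functions. This is a sketch under stated assumptions: the function names and the sample vectors are hypothetical, the RBF kernel is taken in its standard form \(\exp(-\gamma \Vert u-v\Vert^{2})\), and the sigmoid kernel follows Eq. 4 as written, whereas some libraries also scale the inner product by γ.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    # Eq. 1: plain inner product.
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, r=0.0, d=2):
    # Eq. 2: (gamma * u.v + r) ** d, with gamma > 0.
    return (gamma * dot(u, v) + r) ** d


def rbf_kernel(u, v, gamma=1.0):
    # Eq. 3 in its standard form: exp(-gamma * ||u - v||^2), with gamma > 0.
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, r=0.0):
    # Eq. 4 as written in the text: tanh(u.v + r).
    return math.tanh(dot(u, v) + r)


# Hypothetical two-dimensional inputs.
u = [1.0, 2.0]
v = [3.0, 4.0]
print(linear_kernel(u, v))                            # 11.0
print(polynomial_kernel(u, v, gamma=0.5, r=1.0, d=2)) # 42.25
print(rbf_kernel(u, v, gamma=0.5))                    # exp(-4)
print(sigmoid_kernel(u, v))
```

The RBF kernel equals 1 when the two vectors coincide and decays towards 0 as they move apart, with γ controlling how quickly.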

The concept of ANN derives from the learning capability of the human brain, which is reproduced in a computer algorithm. Once the model has been trained, applying a certain input to it produces a certain output. To match the outputs to the targets, the network is adapted according to the difference between the outputs and the targets. Typically, many such input-output pairs are used to train a network in this way, a procedure known as supervised learning. An ANN model is specified by: the model of the data-processing units of the network (the artificial neurons); the structure of the network, that is, the group of neurons, the pattern of connections between them and the connection weights or strengths; and one or more training (learning) algorithms that alter the connection weights so that the network models the problem. A neuron can be regarded as a function of two or more inputs, all of which accept data of a single numerical form within a predetermined range. Such functions are called activation functions and are typically written as \(y = {Neuron}_{w} (x),\) where x is the set of input connections and w is the set of neuron parameters (Aliev and Guirimov 2014). The simple model is expressed as:

$$y=f\left(\theta +\sum_{\iota =1}^{a}{w}_{\iota } {x}_{\iota }\right)$$
(5)

Here a is the number of inputs to the neuron, \({x}_{\iota }\) is the ι-th input to the neuron and \({w}_{\iota }\) is the weight of the ι-th connection, for \(\iota = 1, \ldots , a\). f is the activation function, which is typically nonlinear for neurons that are not input neurons, such as hidden neurons; it can be linear for output neurons. One ANN architecture that is commonly used and is categorised under supervised learning is the feedforward ANN, which is trained with the backpropagation rule. The multi-layer perceptron ANN topology is trained with the standard backpropagation (SBP) learning rule (Anderson 1995).
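As a minimal sketch of Eq. 5, a single neuron can be evaluated as follows. The choice of tanh as the nonlinear activation, the identity activation for a linear output neuron and the numerical values are assumptions made for this example only.

```python
import math


def neuron(inputs, weights, theta, activation=math.tanh):
    """Eq. 5: y = f(theta + sum of w_i * x_i) for one artificial neuron."""
    weighted_sum = theta + sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)


def identity(z):
    """Linear activation, as may be used for an output neuron."""
    return z


# Hypothetical inputs, weights and bias term theta.
x = [0.2, 0.5]
w = [0.4, -0.6]
hidden_output = neuron(x, w, theta=0.1)                       # nonlinear (tanh)
linear_output = neuron(x, w, theta=0.1, activation=identity)  # linear
print(hidden_output, linear_output)
```

A multi-layer perceptron chains such neurons in layers, and training adjusts the weights and θ of every neuron.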

Performance metrics

Several performance indicators are used to evaluate the efficiency and accuracy of the implemented approaches. In this paper, the correlation coefficient (R), scatter index (SI), root mean square error (RMSE) and Nash-Sutcliffe efficiency (NSE) are used. The prediction of SSL is more accurate the closer R and NSE are to 1 and the closer SI and RMSE are to zero. These measures are defined as follows (Samadianfard et al. 2020):

$$R=\frac{\sum_{i=1}^{n}\left({O}_{i}-\bar{O}\right)\left({P}_{i}-\bar{P}\right)}{\sqrt{\sum_{i=1}^{n}{\left({O}_{i}-\bar{O}\right)}^{2}\sum_{i=1}^{n}{\left({P}_{i}-\bar{P}\right)}^{2}}}$$
(6)
$$RMSE=\sqrt{\frac{1}{\mathrm{n}}\sum_{\mathrm{i}=1}^{\mathrm{n}}{\left({P}_{i}-{O}_{i}\right)}^{2}}$$
(7)
$$\text{SI}=\frac{\sqrt{\frac{1}{\mathrm{n}}\sum_{\mathrm{i}=1}^{\mathrm{n}}{\left({P}_{i}-{O}_{i}\right)}^{2}}}{\bar{O} }$$
(8)
$$NSE=1-\frac{\sum_{i=1}^{n}{\left({P}_{i}-{O}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left({O}_{i}-\bar{O}\right)}^{2}}$$
(9)

where \({P}_{i}\) is the estimated value and \({O}_{i}\) is the observed value at time i, \(\bar{O}\) and \(\bar{P}\) are the means of the observed and estimated values, respectively, and n is the number of observations.
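The four measures in Eqs. 6–9 can be computed as in the following sketch, in which R is the standard Pearson correlation coefficient. The function names and the sample values are assumptions made for illustration and are not taken from the study.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation(obs, pred):
    """Eq. 6: Pearson correlation coefficient R."""
    o_bar, p_bar = mean(obs), mean(pred)
    numerator = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    denominator = math.sqrt(sum((o - o_bar) ** 2 for o in obs)
                            * sum((p - p_bar) ** 2 for p in pred))
    return numerator / denominator


def rmse(obs, pred):
    """Eq. 7: root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def scatter_index(obs, pred):
    """Eq. 8: RMSE divided by the mean of the observations."""
    return rmse(obs, pred) / mean(obs)


def nse(obs, pred):
    """Eq. 9: Nash-Sutcliffe efficiency."""
    o_bar = mean(obs)
    return 1 - (sum((p - o) ** 2 for o, p in zip(obs, pred))
                / sum((o - o_bar) ** 2 for o in obs))


# Hypothetical observed and predicted values (constant bias of 0.5).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
print(correlation(observed, predicted))    # 1.0
print(rmse(observed, predicted))           # 0.5
print(scatter_index(observed, predicted))  # 0.2
print(nse(observed, predicted))            # 0.8
```

The example shows why several measures are reported together: a constant bias leaves R at 1 while RMSE, SI and NSE all register the error.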

Data normalization is important in modelling applications. Commonly used normalization methods include the maximum-minimum procedure, the value process and the peak process. In general, the scales of the inputs differ widely. Normalization can also shorten the training period by starting the training process with every feature on a similar scale. Normalizing the data to the range [0,1] has been applied by researchers such as Cigizoglu (2004) and is therefore used in this study. All the data in the training and testing sets were scaled to this range with the equation below; scaling to uniform ranges is recommended in “Practical Neural Network Recipes in C++” (Timothy Masters, 2020). The calculation is as follows:

$${\upchi }_{\mathrm{n}}=\frac{{{\chi }_{\iota }^{i}} - {\chi }_{\iota min}^{i}}{{\chi }_{\iota max}^{i}- {\chi }_{\iota min}^{i}}$$
(10)

where \({\chi }_{n}\) is the normalized value of the variable, \({\chi }_{\iota }^{i}\) is the measured value of the variable, \({\chi }_{\iota max}^{i}\) is the maximum value of the variable in the dataset and \({\chi }_{\iota min}^{i}\) is the minimum value of the variable in the dataset.
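A minimal sketch of Eq. 10 is given below. The function names and the sample values are hypothetical; the inverse transform is included only to show how normalized predictions can be mapped back to the original units.

```python
def min_max_normalize(values):
    """Eq. 10: scale values to [0, 1] using their minimum and maximum."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("cannot normalize a constant series")
    return [(v - lowest) / (highest - lowest) for v in values]


def denormalize(scaled, lowest, highest):
    """Inverse of Eq. 10: map values in [0, 1] back to the original range."""
    return [s * (highest - lowest) + lowest for s in scaled]


# Hypothetical discharge values.
discharge = [10.0, 20.0, 30.0, 50.0]
scaled = min_max_normalize(discharge)
print(scaled)                           # [0.0, 0.25, 0.5, 1.0]
print(denormalize(scaled, 10.0, 50.0))  # [10.0, 20.0, 30.0, 50.0]
```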

Sensitivity analysis is a method for identifying the input variables that have the greatest effect on the output variable. In this work it is used to estimate the effect of various input designs, built from SSL and Q(t), on the ML models. The performance of the different input combinations is evaluated using the scatter index (SI). In general, we compare networks with various input combinations in which some parameters are removed, and the resulting effect on network accuracy indicates the importance of each input for predicting SSL and helps to establish a more effective model. Next, to select the best model, the observed and predicted SSL values produced by the most accurate models among the four ML models are evaluated separately for the Rantau Panjang station on the Johor River basin (JRB), Malaysia, using Taylor diagrams (TDs).

Results and discussion

Daily SSL and Q data were recorded for the ten-year interval from 1988 to 1998. Following the 80–20 rule of the Pareto principle, the data were divided into 80% for the training set and 20% for the testing set. The training set is used to find the relationship between the dependent and independent variables, whereas the test set is used to evaluate the models' performance. This division is therefore adopted for the SSL prediction models in the present study. Choosing proper input variables is also an essential step in building models that predict SSL accurately. To this end, several methods were considered, for instance Pearson correlation (Buyukyildiz and Kumcu 2017), autocorrelation (Himanshu et al. 2017) and the partial autocorrelation function (Kisi and Yaseen 2019); autocorrelation was finally selected because of its simplicity and efficiency in estimating the optimum set of input parameters for artificial intelligence models (Shadkani et al. 2020). Table 2 presents the relation between SSL and Q(t) for the Rantau Panjang station on the Johor River basin (JRB). It can be seen from Table 2 that \({SSL}_{t}\) has the highest correlation with SSLt-1, SSLt-2, SSLt-3 and Qt, Qt-1, Qt-2, Qt-3 (where t is the daily time step). The correlation decreases as the daily lag increases; the sensitivity of the models to including these lags is therefore examined further using the input combinations in Table 3.

Table 2 The autocorrelation between SSL and Q(t)
Table 3 Input combinations for daily prediction SSL
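The construction of lagged inputs and the 80/20 division can be sketched as follows. The function names, the chosen lags and the series values are hypothetical, and the sketch assumes a chronological split (earliest 80% for training), which the text does not specify.

```python
def make_lagged_samples(ssl, q, ssl_lags, q_lags):
    """Build one sample per day t: inputs SSL(t-lag) and Q(t-lag), target SSL(t)."""
    max_lag = max(list(ssl_lags) + list(q_lags))
    features, targets = [], []
    # Days earlier than max_lag are skipped because their lags are unavailable.
    for t in range(max_lag, len(ssl)):
        row = [ssl[t - lag] for lag in ssl_lags] + [q[t - lag] for lag in q_lags]
        features.append(row)
        targets.append(ssl[t])
    return features, targets


def chronological_split(features, targets, train_fraction=0.8):
    """Assumed split: the earliest samples train the model, the rest test it."""
    cut = int(len(features) * train_fraction)
    return (features[:cut], targets[:cut]), (features[cut:], targets[cut:])


# Hypothetical daily series of 12 values each.
ssl = [float(i) for i in range(1, 13)]
q = [10.0 * value for value in ssl]

# Example combination: SSL(t-1), SSL(t-2), Q(t), Q(t-1) -> SSL(t).
features, targets = make_lagged_samples(ssl, q, ssl_lags=(1, 2), q_lags=(0, 1))
(train_x, train_y), (test_x, test_y) = chronological_split(features, targets)
print(features[0], targets[0])
print(len(train_x), len(test_x))
```

Each input combination in Table 3 corresponds to a different choice of the two lag lists.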

Machine learning Models goodness of fit

It can be seen from the comparison of the developed models in Table 4 that Comb. 9 to Comb. 12 have the highest R, at 0.97, and the lowest RMSE, at 0.01. Table 4 also depicts the variable importance for Comb. 12, whose inputs are the most important predictors for the RF model. However, all models perform well and achieve a high level of accuracy, with NSE around 0.9. The random forest method generates a sequence of simple trees, and the complexity of each tree in the forest is determined as described below. Predicted values and other statistics were computed for all observations (samples), as in Table 4. An advanced stopping condition was used to enable early stopping of the RF training algorithm, that is, to stop adding trees before the full number of trees has been added to the model; the number of cycles used to calculate the mean error was 10 and the percentage decrease in training error was 5. The number of predictors for the tree models was 1, and the number of simple regression trees computed in the sequential forest-building steps was 100. Table 4 indicates the high robustness of RF, which performs well for the different input structures, with correlation coefficients ranging from 0.8 to 0.9. Comb. 7 delivers the best performance compared with the other combinations, with the lowest RMSE (0.031) and the largest NSE (0.832). Figure 4 is the scatter diagram of actual and predicted SSL.

Table 4 Summary of results from the four ML models
Fig. 4 Scatter diagram of actual and predicted SSL for the most accurate combination using four machine learning models: A) GBT, B) RF, C) SVM, and D) ANN

The proposed SVM method was developed in three stages. The first stage is a training session in which the SVM parameters are adjusted; the model then switches to a validation stage that uses data unseen during training to confirm that the model has been trained effectively. The goal of the validation stage is to ensure that the model generalizes to input data on which it was not trained and has not merely memorized the limited range of input–output relationships encountered in the training data (Shahin et al. 2002). Ten-fold cross-validation, which is typically applied before the model is switched between the training and testing sessions and is regarded as a subtree of the entire tree, was used here (Aljanabi et al. 2018). The SVM with the RBF kernel is used for one-step-ahead SSL prediction with the raw Q-SSL dataset. This type of SVM is more attractive than the other SVM versions for three reasons: (1) unlike the linear kernel, the RBF kernel can handle cases in which the relationship between class labels and attributes is nonlinear; (2) it tends to perform best under general smoothness assumptions; and (3) the RBF kernel has fewer tuning parameters than the polynomial and sigmoid kernels (Noori 2011). Because the search for the model parameter set is vital to obtaining reliable predictive performance from SVM, we searched for the best SVM design by testing two types with the RBF kernel function, Epsilon (ε) and Nu. Epsilon-RBF is used for predicting SSL. The Nu-RBF type slightly outperformed it for most input models, with an R-square of 0.94 compared with 0.92 for the Epsilon-RBF type, as in Fig. 5. The results for model performance in the testing and training phases are shown in Table 4, which shows that RMSE decreases and NSE increases from Comb. 4 until the best values are reached at Comb. 11, with 0.0149 and 0.962, respectively. Comb. 11 therefore provides the most accurate estimation, with R of 0.984 and SI of 0.27. It can be observed that the models performed acceptably in predicting SSL at the JRB.

Fig. 5 Comparison between the Epsilon and Nu types of SVM

For the ANN model, the MLP architecture is used in this study. Three hidden layers are used, even though prior investigations have shown that a single hidden layer is enough for an ANN to approximate any complicated nonlinear function (Cybenko 1989). The process of establishing the model weights is known as training and is equivalent to calibrating a mathematical model. The ANN is trained with training samples of inputs and their known outputs. At the start of training, the weights are initialized with a set of random values or on the basis of prior knowledge. The weights are then altered systematically by the learning algorithm so that, for a given input, the difference between the model response and the observed data is small. Training examples are presented to the network repeatedly, and the process terminates when this difference falls below a specified value, at which point the ANN is considered trained. The best network performance was obtained by training the ANN with the gradient descent algorithm and 5 neurons in the hidden layers. All input combinations performed well, as shown in Table 4. It can be observed that the accuracy of the networks increases gradually with the combination number. The largest RMSE, 0.0243, was obtained at Comb. 1, while the lowest RMSE, 0.0115, was found at Comb. 12. The best NSE values, 0.97, were at Comb. 9, 10 and 11.

Models comparison

In this section, the Taylor diagram is used to compare, across the four machine learning models, the most accurate combinations selected in the previous sections. As shown in Fig. 6, the ANN model ranked as the most accurate of the techniques, with a predicted standard deviation of 255.71, which is closest to the actual standard deviation of 259.38. Although the GBT model has a predicted standard deviation (249.679) close to the actual value, SVM ranked second in daily SSL modelling, performing better than GBT and RF with the highest R = 0.98. ANN performs slightly better than SVM, which is consistent with the previous assessment showing ANN to be the best of the models. The predicted standard deviation of the RF model, 217.42, is far from the actual value.

Fig. 6 Taylor diagram of predicted SSL values using the most accurate combinations
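A Taylor diagram summarizes three statistics for each model: the standard deviation of the predictions, their correlation with the observations and the centred root mean square difference, which are linked by the law-of-cosines relation \(E'^{2}=\sigma_{o}^{2}+\sigma_{p}^{2}-2\sigma_{o}\sigma_{p}R\). The sketch below computes these quantities; the function name and the sample values are hypothetical and do not come from the study.

```python
import math


def taylor_statistics(obs, pred):
    """Return (sd_obs, sd_pred, R, centred RMS difference) for a Taylor diagram."""
    n = len(obs)
    o_bar = sum(obs) / n
    p_bar = sum(pred) / n
    # Population standard deviations (divide by n), as used in the identity.
    sd_obs = math.sqrt(sum((o - o_bar) ** 2 for o in obs) / n)
    sd_pred = math.sqrt(sum((p - p_bar) ** 2 for p in pred) / n)
    r = (sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
         / (n * sd_obs * sd_pred))
    # Centred RMS difference: both series have their means removed first.
    crmsd = math.sqrt(sum(((p - p_bar) - (o - o_bar)) ** 2
                          for o, p in zip(obs, pred)) / n)
    return sd_obs, sd_pred, r, crmsd


# Hypothetical observed and predicted SSL values.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 5.0, 5.0, 9.0]
sd_obs, sd_pred, r, crmsd = taylor_statistics(observed, predicted)
print(sd_obs, sd_pred, r, crmsd)
```

A model plots close to the observation point when its standard deviation is close to the observed one and its correlation is high, which is the basis of the ranking discussed above.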

Sensitivity analysis

To check the effects of the input variables on SSL modelling, the scatter index is evaluated for several sets of input parameters. For this purpose, Comb. 12 of the ANN model was chosen for sensitivity analysis as the most effective model at the Rantau Panjang station on the JRB. Table 5 displays the effect of eliminating each input parameter on the accuracy of the ANN. As shown in Table 5, the accuracy of the ANN model increases if any of the parameters Qt-6, Qt-4 and SSLt-4 is removed. Removing the variable SSLt-1 has the most significant impact on the accuracy of the ANN: the accuracy was reduced, with a scatter index of 0.28. Similarly, SSLt-2 and Qt-5 contribute to estimation efficiency, and removing these variables increased the SI error.

Table 5 Effect of elimination input parameters on ANN-12 model efficiency for SSL prediction

Artificial neural network (ANN) validation

Finally, the most accurate model (ANN) is validated in order to assess its reliability in predicting SSL. The main concern associated with validation experiments is how to divide the dataset fairly. To assess the range of possibilities, three different dataset splitting groups were investigated in this study: Group 1 (50% training, 25% validation, 25% testing), Group 2 (75% training, 15% testing, 10% validation) and Group 3 (80% training, 10% testing, 10% validation). The random sub-sampling method was applied 1,000 times to each of the three splitting groups. The concern inherent in random sub-sampling is that some observations might never be chosen for the validation subsample, whereas others might be chosen several times. Performing the sampling 1,000 times ensures that any such disparities are kept to a minimum. With the first proposed split (50% training and 25% each for testing and validation), the NSE for the validation sample was 0.941. Figure 7a) shows how close the predicted values are to the actual values, with R = 0.974. The data were then split according to Group 2 (75% training, 15% testing and 10% validation), and the results obtained (NSE = 0.899 and R = 0.948) are shown in Fig. 7b). Lastly, the results of dividing the dataset into 80% training and 10% each for testing and validation can be seen in Fig. 7c), where the NSE over the validation sample was 0.937 and R was 0.968. It can be seen from Fig. 7 that the Group 1 split (50% training, 25% testing and 25% validation) provides the most suitable outcome for validating the ANN model. Consequently, all the results showed that ANN has superior performance and could be a promising tool for predicting daily SSL in the Johor River basin (JRB) in Malaysia.

Fig. 7 ANN model using different data split groups
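The repeated random sub-sampling described above can be sketched as follows. The function names, the sample size of 100 and the fixed seed are assumptions for illustration; the sketch only counts how often each observation lands in the validation subset and does not train any model.

```python
import random


def random_subsample_split(n_samples, train_frac, val_frac, rng):
    """Shuffle the indices and cut them into training, validation and test sets."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = indices[:n_train]
    validation = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains is used for testing
    return train, validation, test


def validation_counts(n_samples, train_frac, val_frac, repeats, seed=0):
    """Count how many times each observation is drawn into the validation set."""
    rng = random.Random(seed)
    counts = [0] * n_samples
    for _ in range(repeats):
        _, validation, _ = random_subsample_split(n_samples, train_frac, val_frac, rng)
        for index in validation:
            counts[index] += 1
    return counts


# Group 1 proportions (50% training, 25% validation, 25% testing) on a
# hypothetical dataset of 100 observations, repeated 1,000 times.
counts = validation_counts(100, train_frac=0.5, val_frac=0.25, repeats=1000)
print(min(counts), max(counts))
```

With a single repetition, three quarters of the observations would never be used for validation; with many repetitions every observation is drawn a comparable number of times.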

Conclusion

In this study, the ability of the GBT, SVM, RF and ANN models to predict the daily SSL of the Johor River basin (JRB) in Malaysia was investigated. For this purpose, data from the period 1988–1998 were split into two samples (80% for training and 20% for testing) and used to develop the four proposed models (GBT, SVM, RF and ANN). The autocorrelation technique was used to select relevant inputs. A few irrelevant and less significant inputs were then eliminated, and only the dominant input variables were applied in various combinations for predicting SSL. Each model was evaluated using the performance measures RMSE, SI, NSE and R; in general, the models performed well and achieved the main objective of this research. However, when the findings of the ML models were compared with each other, the ANN model outperformed all the other models, with R = 0.989, SI = 0.199, RMSE = 0.011053 and NSE = 0.979. Sensitivity analysis was conducted to determine the impact of the most significant variables on SSL prediction. The best input combination was found to be SSLt-1, SSLt-2, SSLt-3, SSLt-4, SSLt-5, SSLt-6, Qt, Qt-1, Qt-2, Qt-3, Qt-4, Qt-5, Qt-6. To validate the ANN model, three groups of data splits were examined, and the most appropriate group was selected. The ANN model proved reliable in predicting unseen data for daily SSL. Future work could evaluate the accuracy of the GBT, SVM, RF and ANN models studied here in predicting SSL in different rivers and climates, and could apply other data-driven techniques to the same data for the Johor River basin (JRB).