Introduction

Suspended sediment load (SSL) prediction is a complicated task in river engineering practice. The movement of soil material by erosive agents is referred to as sediment transport (Aksoy et al. 2019). Sediment load data are extremely useful for dam construction, measuring pollutants in rivers, forecasting territorial risks, planning stable channels and estuaries, and so on (Melesse et al. 2011; Khan et al. 2019). Monitoring and evaluating SSL is also important in determining water quality and the associated hydrologic functions (Peterson et al. 2018a, b). A rise in SSL decreases water clarity and light penetration, limiting the growth of plants and algae at the primary trophic level (Henley et al. 2000). Moreover, the deposition of suspended sediments decreases the flow area, restricting the movement of aquatic life and eventually contributing to a shift in the river course. Hence, it is important to estimate SSL precisely, and an effective prediction model may play a critical role in improving sediment load modeling in rivers. To address the issue, several traditional and artificial intelligence-based models have been developed to predict river SSL. In general, time series methods assume linear relationships between variables; however, such relationships do not carry over easily to real hydrological data, so the analysis can be enhanced by artificial intelligence (AI) methods (Babanehzad et al. 2020). Compared with conventional methods and other AI methods, single-layer feed-forward networks (SLFNs) produce appropriate results (Wang et al. 2009). However, a single SLFN model is not enough to handle the non-stationarity of river SSL datasets. Wavelets are powerful tools that can handle nonlinearity as well as non-stationarity in datasets. Therefore, hybrid wavelet-based models need to be developed that can handle the nonlinearity and non-stationarity in datasets while also delivering promising prediction performance.

Literature review

There are several AI-based models for SSL prediction. Table 1 lists a few recent significant contributions on the estimation of SSL using AI techniques. Banadkooki et al. (2020) hybridized the ant-lion optimizer with an artificial neural network (ANN) for estimating river SSL. Hazarika et al. (2020a) compared the performance of the support vector machine (SVM) and ANN for SSL estimation. Gupta et al. (2020) proposed Huber loss function-based extreme learning machines for SSL prediction. Ghanbarynamin et al. (2020) applied several soft computing models for SSL prediction. Salih et al. (2020) explored several data-mining models for SSL estimation. Ehteram et al. (2021) developed a hybrid multi-objective whale algorithm for estimating river SSL. Meshram et al. (2021) developed an iterative optimizer-based random forest model for river SSL prediction. For the same purpose, Panahi et al. (2021) developed black widow optimization-based adaptive neuro-fuzzy inference system (ANFIS) and SVM models. Sharghi et al. (2021) proposed a prediction interval-based emotional ANN (EANN) with the bootstrap technique for SSL estimation. Mohammadi et al. (2021) estimated SSL using a multilayer perceptron (MLP) hybridized with particle swarm optimization (PSO) and differential evolution (DE). Mohanta et al. (2021) used the ANFIS model for river SSL estimation. Nourani et al. (2021) applied SVM, ANFIS, a feed-forward neural network (FFNN), and multilinear regression (MLR) for SSL prediction. Sahoo et al. (2021) applied a recurrent neural network as well as the conventional SVM for river SSL estimation. Anand et al. (2021) prepared a review on the deployment of cohesionless sediments over alluvial channels. Talebkeikhan et al. (2021) carried out a comparative analysis of machine learning (ML)-based models for the prediction of permeability. Gumgum and Guney (2021) studied the effect of sediment feeding on live-bed scour around circular bridge piers.
A comprehensive review of AI-based models for SSL estimation was presented in Rajaee and Jafari (2020) and Gupta et al. (2021).

Table 1 Few recent prominent contributions for SSL estimation using AI models

Drucker et al. (1997) proposed an SVM model called support vector regression (SVR) to solve regression-type problems. SVR and its variants have been successfully applied to various regression problems, including SSL prediction (Lafdani et al. 2013; Hazarika et al. 2021). Despite its high prediction performance, SVR is computationally expensive because it solves a quadratic programming problem (QPP) for error minimization. In addition, its prediction performance degrades in the presence of noisy data. To improve the computational speed of SVR, the least squares SVR (LSSVR) model was suggested by Suykens and Vandewalle (1999). LSSVR solves a set of linear equations rather than a QPP. Although its computational cost is reduced, it still lacks efficiency when dealing with noisy datasets. LSSVR has been explored by researchers in various application areas, including SSL estimation. To enhance the prediction performance of SVR on noisy datasets, the AHSVR model was suggested by Balasundaram and Meena (2019). More recently, the extreme learning machine (ELM) (Huang et al. 2004, 2011; Liu et al. 2008) has grown in popularity because of its high generalization performance at low computational cost (Huang et al. 2006; Balasundaram and Gupta 2016; Hazarika et al. 2021). ELM has been successfully applied by several researchers to SSL estimation; Hazarika et al. (2020b), Gupta et al. (2020), and Peterson et al. (2018a) are a few recent examples. One widespread type of ANN is the feed-forward network with random weights, popularized by Pao and Takefuji (1992), who proposed the random vector functional link (RVFL) network (Pao and Takefuji 1992; Cao et al. 2015; Dai et al. 2017). In RVFL, the inputs and outputs can be directly connected, which leads to exceptional generalization ability. The weights between the input and hidden layers can also be generated randomly (Zhang and Suganthan 2016a).
The RVFL model has been extensively investigated in a wide range of applications, including load demand forecasting (Ren et al. 2016), time-series analysis (Gao et al. 2021; Hazarika and Gupta 2020), visual tracking (Zhang and Suganthan 2016b), and others.

The AI-based models that have been developed for SSL prediction illustrate how different models and datasets can be selected when developing a forecasting technique. It is well known that river SSL datasets contain non-stationary components; hence, it is difficult to reach a decision using a single method. This inspired us to suggest a hybrid prediction model. By combining the advantages of two distinct models, i.e., the maximal overlap discrete wavelet transform (MODWT) and the powerful RVFL, the newly suggested algorithm addresses the limitations of traditional prediction models. The high generalization capability and fast training speed of RVFL are well known. Moreover, to the best of our knowledge, RVFL’s prediction performance has never been tested for river SSL prediction despite its numerous applications. Wavelets, in turn, are powerful tools that can handle nonlinearity as well as non-stationary trends in datasets (Hazarika and Gupta 2020). Researchers have suggested various wavelet-based (WB) hybrid models for sediment load prediction. However, a recent study by Quilty and Adamowski (2018) showed that prior wavelet-embedded forecasting studies generally do not follow correct practices for real-world WB prediction. They comprehensively examined correct and incorrect uses of wavelets in forecasting studies, explored several boundary conditions that need to be adequately addressed to properly use a WB prediction technique on real-world problems, and proposed a general WB data-driven forecasting framework (WDDFF) using MODWT and the à trous (AT) algorithm. That was also the first study that directly used the MODWT wavelet and scaling coefficients for prediction (Quilty and Adamowski 2018). Motivated by the idea of Quilty and Adamowski (2018), we have developed a new framework by hybridizing the boundary-corrected MODWT algorithm with the fast and efficient RVFL model.
The major contributions of this work are the following:

  1. The prediction capacity of the RVFL model for SSL estimation has been tested.

  2. Inspired by the work of Quilty and Adamowski (2018), a boundary-corrected MODWT-based approach has been adopted and a new boundary-corrected MODWT-based RVFL model is proposed.

  3. Two different activation functions have been used for the proposed boundary-corrected RVFL model.

  4. A comparative analysis is shown with the SVR, AHSVR, LSSVR, WTSVR, ELM, and RVFL models.

  5. Additionally, autocorrelation plots are presented for the SSL datasets.

The next section reviews the related studies. The third section describes the new boundary-corrected wavelet RVFL (BCWRVFL) model. The fourth section presents the experimental analyses, and the last section briefly concludes the work.

Materials and methods

The RVFL

RVFL (Pao et al. 1994; Zhang and Suganthan 2016a) is a widely accepted single-layer feed-forward network (SLFN) in which only the output weights are adaptable parameters (Tang et al. 2018). In RVFL networks the input and output layers can be linked directly. In addition to the input nodes, the network contains enhancement nodes (ENs), which constitute the hidden layer of the RVFL network (Shi et al. 2021).

Consider an SLFN with \(N\) training samples \(X={\left\{\left({x}_{i},{y}_{i}\right)\right\}}_{i=1}^{N}\), where \({x}_{i}\in {\mathfrak{R}}^{d}\) and \({y}_{i}\in {\mathfrak{R}}^{e}\) are the input and output vectors with d and e dimensions, respectively. Let \(\beta \in {\mathfrak{R}}^{\left({N}_{g}+d\right)\times e}\) denote the output weight matrix and \(Y\in {\mathfrak{R}}^{N\times e}\) the target output matrix, where \({N}_{g}\) is the number of hidden nodes and \(l\) is the output of the hidden layer. The regularized form of the RVFL model can be expressed as

$$\underset{\beta }{\mathrm{min}}\;{\left\|Z\beta -Y\right\|}^{2}+C{\left\|\beta \right\|}^{2}$$
(1)

where \(Z=\left[L\;X\right]\) is the augmented matrix that concatenates the hidden layer output matrix \(L\) with the matrix of input vectors, and \(C\) is the tradeoff parameter. The hidden layer output matrix \(L\in {\mathfrak{R}}^{N\times {N}_{g}}\) can be expressed as

$$L=\left[\begin{array}{ccc}{l}_{1}\left({x}_{1}\right)& \dots & {l}_{{N}_{g}}\left({x}_{1}\right)\\ {l}_{1}\left({x}_{2}\right)& \dots & {l}_{{N}_{g}}\left({x}_{2}\right)\\ \vdots & \ddots & \vdots \\ {l}_{1}\left({x}_{N}\right)& \dots & {l}_{{N}_{g}}\left({x}_{N}\right)\end{array}\right]$$
(2)

The weights of the hidden layer are generated at random, so only the output weight matrix \(\beta\) must be learned. Differentiating (1) with respect to \(\beta\) and equating the result to zero, we obtain

$$\beta ={\left({Z}^{t}Z+CI\right)}^{-1}{Z}^{t}Y$$
(3)

Here, \(I\) is an identity matrix of appropriate dimension.

For any new input sample \(x\in {\mathfrak{R}}^{d}\), the regression function of RVFL can be obtained as

$$f\left(x\right)=\left[l\left(x\right)\;x\right]\beta$$
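
The closed-form training described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function names `rvfl_fit` and `rvfl_predict`, the Gaussian-type hidden activation, the uniform range of the random weights, the number of hidden nodes, the value of \(C\), and the toy data are all assumptions made for the example. The output weights are obtained by solving the normal equations of the regularized least-squares problem (1).

```python
import numpy as np

def radbas(s):
    # Gaussian-type activation exp(-s^2); an assumed choice for the hidden layer.
    return np.exp(-s ** 2)

def rvfl_fit(X, Y, n_hidden=20, C=1e-2, seed=0):
    """Return random hidden parameters (W, b) and the learned output weights beta."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(d, n_hidden))  # random input-to-hidden weights, never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)       # random hidden biases, never trained
    L = radbas(X @ W + b)                           # hidden layer output matrix, N x n_hidden
    Z = np.hstack([L, X])                           # augmented matrix [L X] (direct links)
    A = Z.T @ Z + C * np.eye(Z.shape[1])            # regularized normal-equation matrix
    beta = np.linalg.solve(A, Z.T @ Y)              # only beta is learned
    return W, b, beta

def rvfl_predict(X, W, b, beta):
    # f(x) = [l(x) x] beta, applied row by row
    return np.hstack([radbas(X @ W + b), X]) @ beta

# Toy regression data (hypothetical, for illustration only)
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 3))
Y = (np.sin(2 * X[:, 0]) + 0.5 * X[:, 1] - X[:, 2]).reshape(-1, 1)

W, b, beta = rvfl_fit(X, Y)
pred = rvfl_predict(X, W, b, beta)
print(beta.shape, pred.shape)
```

Because the hidden weights are fixed after random generation, training reduces to a single linear solve, which is why the training cost of RVFL is low.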

The MODWT

The MODWT serves as a preprocessing step. Its general advantage is that it can handle non-stationarity in time-series (TS) data. The non-stationarity is handled by passing the input samples through high pass filters (HPF) and low pass filters (LPF), which yield the wavelet coefficients (\({V}_{j,i}\)) and the scaling coefficients (\({U}_{j,i}\)), respectively. The HPF and LPF are shown in Fig. 1 and can be written as (Al-Musaylh et al. 2020; Percival and Walden 2000)

$${V}_{j,i}=\sum_{l=0}^{{L}_{j}-1}{t}_{j,l}{Z}_{\left(i-l\right) \bmod N}$$
(4)
$${U}_{j,i}=\sum_{l=0}^{{L}_{j}-1}{s}_{j,l}{Z}_{\left(i-l\right) \bmod N}$$
(5)

where \(Z\) is the input series of length \(N\), \(i\) is the time index, and \(j=1,2,\dots ,J\), where \(J\) indicates the level of decomposition. The \({j}^{th}\)-level wavelet and scaling filters are denoted by \({t}_{j,l}\) and \({s}_{j,l}\), and \({L}_{j}\) is the width of the \({j}^{th}\)-level filter.

Fig. 1 Diagram of MODWT decomposition

Finally, the additive reconstruction property can be used for reconstruction as (Maheswaran and Khosa 2012)

$${Z}_{i}={U}_{J,i}+\sum_{j=1}^{J}{V}_{j,i}$$
(6)
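
As a small worked example of the decomposition and the additive reconstruction, the sketch below applies a single-level MODWT with the Haar filter to a short series. The choice of the Haar filter, the function name `modwt_level1_haar`, and the sample values are assumptions made purely for illustration; the wavelet and level used in the experiments are not implied.

```python
def modwt_level1_haar(z):
    """Level-1 MODWT with the Haar filter, using circular indexing (i - l) mod N."""
    t = [0.5, -0.5]  # wavelet (high-pass) filter t_{1,l}
    s = [0.5, 0.5]   # scaling (low-pass) filter s_{1,l}
    n = len(z)
    V = [sum(t[l] * z[(i - l) % n] for l in range(2)) for i in range(n)]
    U = [sum(s[l] * z[(i - l) % n] for l in range(2)) for i in range(n)]
    return V, U

z = [2.0, 4.0, 6.0, 5.0, 1.0]          # hypothetical sample values
V, U = modwt_level1_haar(z)
recon = [u + v for u, v in zip(U, V)]  # additive reconstruction at level 1

print(V)      # wavelet coefficients
print(U)      # scaling coefficients
print(recon)  # equals the original series
```

Note that the first coefficient is computed from the last sample of the series because of the circular indexing, which is exactly the kind of boundary effect that has to be removed before forecasting.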

Proposed boundary-corrected wavelet random vector functional link (BCWRVFL)

It is essential to use the \({V}_{j,i}\) and \({U}_{j,i}\) correctly; hence, they should be boundary corrected (BC). BC means that the \({V}_{j,i}\) and \({U}_{j,i}\) used for prediction are free of boundary effects. To achieve this, the future data problem must first be handled properly (Quilty and Adamowski 2018).

The future data problem and its solution

The future data problem arises when a wavelet transform (WT) (e.g., AT-multiresolution analysis (MRA) and MODWT-MRA) needs TS observations from beyond time \(t\) to compute the transform of the TS at time \(t\). In real-world TS forecasting, a WT must not use future data. As per Quilty and Adamowski (2018), the solution is simple: one should use the causal MODWT algorithm rather than the non-causal MODWT-MRA. In addition, in real-world forecasting problems the decomposition level (DL), the wavelet filters (WF), and the training and testing sets should be chosen properly. The width of the \({j}^{th}\)-level filter \({L}_{j}\) is given by (Bašta 2014; Maslova et al. 2016)

$${L}_{j}=\left({2}^{j}-1\right)\left(L-1\right)+1$$
(7)

where \(L\) is the width of the level-1 filter.

Additionally, the DL and WF selection is a 3-step procedure:

  1) select MODWT or AT for wavelet decomposition,

  2) choose the DL and WF,

  3) remove the boundary-affected coefficients at the start of each \({V}_{j,i}\) and \({U}_{j,i}\) series, whose number follows from (7), which yields the BC \({V}_{j,i}\) and BC \({U}_{j,i}\).
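
The boundary-correction step can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the helper names `filter_width` and `boundary_correct` are hypothetical, the filter width of (7) is evaluated at the deepest level \(J\), and the first \({L}_{J}-1\) entries of every coefficient series are treated as boundary-affected (some authors discard \({L}_{J}\) entries instead, which would change only the count).

```python
def filter_width(level, base_width):
    """Width of the MODWT filter at the given level: (2^level - 1)(L - 1) + 1."""
    return (2 ** level - 1) * (base_width - 1) + 1

def boundary_correct(coeffs, J, base_width):
    """Drop the leading coefficients that were computed from wrapped-around samples.

    Assumption: the first L_J - 1 entries are boundary-affected, and the same
    number is dropped from every series so that all series stay aligned in time.
    """
    n_drop = filter_width(J, base_width) - 1
    return [series[n_drop:] for series in coeffs]

# Example: a Haar filter (L = 2) at J = 3 has width 8, so 7 values are dropped.
series = [list(range(20)), list(range(100, 120))]  # stand-ins for coefficient series
corrected = boundary_correct(series, J=3, base_width=2)

print(filter_width(3, 2))  # 8
print(len(corrected[0]))   # 13
```

Dropping the same number of leading values from every series also means that the corresponding target values must be dropped before training, so that inputs and targets remain aligned.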

The model development stages of the proposed BCWRVFL are portrayed in Fig. 2. The normalized SSL data are given as input to the MODWT. The MODWT decomposes the data into \({V}_{j,i}\) and \({U}_{j,i}\) using the HPF and LPF, respectively. The \({V}_{j,i}\) and \({U}_{j,i}\) are boundary corrected in the next stage. The BC-MODWT data are then given as input to RVFL. Finally, the output is evaluated using five different performance measures.

Fig. 2 BCWRVFL model development stages

Experimental setup and dataset description

The simulations were run on a Windows 7 system with an Intel i5 processor, 8 GB of RAM, and 1 TB of storage. MATLAB 2019 was used for conducting the simulations. A 70–30 training–testing split has been used. Moreover, tenfold cross-validation is applied for the selection of the optimal parameters. The datasets are also normalized by taking \({\overline{x} }_{lm}=\frac{{x}_{lm}-{x}_{m}^{min}}{{x}_{m}^{max}-{x}_{m}^{min}}\), where \({x}_{lm}\) is the input value and \({\overline{x} }_{lm}\) is its normalized value; \({x}_{m}^{max}\) and \({x}_{m}^{min}\) are the maximum and minimum values, respectively. Zhang and Suganthan (2016a) found that the hardlim and sign activation functions degrade the overall performance of the RVFL model, while the radbas activation function always leads to good generalization performance. Therefore, the radbas activation function and the popular multiquadric activation function have been selected for the ELM, RVFL, and BCWRVFL models. The radbas and multiquadric activation functions can be written as:

  a) Radbas: \(f\left(a,x\right)=exp\left(-{\left(x-a\right)}^{2}\right),\)

  b) Multiquadric (multiquard): \(f\left(a,x\right)=\sqrt{\Vert {x}^{2}-{a}^{2}\Vert },\)

where \(f\left(a,x\right)\) indicates the output for \(x\) and \(a\), and \(\Vert .\Vert\) refers to the Euclidean norm. The popular Gaussian kernel has been used for the SVR, AHSVR, WTSVR, and LSSVR models. To assess the efficiency of the proposed BCWRVFL model, five performance evaluators are used, i.e., root mean square error (RMSE), correlation coefficient (R), mean absolute error (MAE), normalized absolute error (NAE), and the ratio of the sum of squared errors to the total sum of squares (SSE/SST). They are defined as:

  • \(R=\frac{\sum\limits_{i=1}^{N}\left({z}_{i}-\overline{z}\right)\left({e}_{i}-\overline{e}\right)}{\sqrt{\sum\limits_{i=1}^{N}{\left({z}_{i}-\overline{z}\right)}^{2}} \sqrt{\sum\limits_{i=1}^{N}{\left({e}_{i}-\overline{e}\right)}^{2}}}\)

  • \(RMSE=\sqrt{\frac{1}{N}\sum\limits_{i=1}^{N}{\left({z}_{i}-{e}_{i}\right)}^{2}}\)

  • \(MAE=\frac{1}{N}\sum\limits_{i=1}^{N}\left|{z}_{i}-{e}_{i}\right|\)

  • \(NAE=\frac{\frac{1}{N}\sum\limits_{i=1}^{N}\left|{z}_{i}-{e}_{i}\right|}{\frac{1}{N}\sum\limits_{i=1}^{N}{z}_{i}}\)

  • \(SSE/SST=\frac{\frac{1}{N}\sum\limits_{i=1}^{N}{\left({z}_{i}-{\widehat{z}}_{i}\right)}^{2}}{\frac{1}{N}\sum\limits_{i=1}^{N}{\left({z}_{i}-\overline{z}\right)}^{2}}\)

where

  • \(e\): estimated values

  • \(\widehat{e}\): predicted value of e

  • \(\overline{e}\): mean of e

  • \(z\): original values

  • \(\overline{z}\): mean of z

  • \(\widehat{z}\): predicted value of z

  • max: peak value

  • \(N\): total samples
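
For concreteness, the five evaluators can be computed as in the following Python sketch. The function name `metrics` and the sample values are hypothetical. Two readings are assumed here and should be checked against the intended definitions: NAE is computed from absolute errors, as its name suggests, and SSE/SST uses squared deviations in both the numerator and the denominator.

```python
import math

def metrics(z, e):
    """Return R, RMSE, MAE, NAE and SSE/SST for observed z and estimated e."""
    n = len(z)
    z_bar = sum(z) / n
    e_bar = sum(e) / n
    sse = sum((zi - ei) ** 2 for zi, ei in zip(z, e))   # sum of squared errors
    sst = sum((zi - z_bar) ** 2 for zi in z)            # total sum of squares
    cov = sum((zi - z_bar) * (ei - e_bar) for zi, ei in zip(z, e))
    var_e = sum((ei - e_bar) ** 2 for ei in e)
    abs_err = sum(abs(zi - ei) for zi, ei in zip(z, e))
    return {
        "R": cov / math.sqrt(sst * var_e),
        "RMSE": math.sqrt(sse / n),
        "MAE": abs_err / n,
        "NAE": abs_err / sum(z),  # assumes absolute errors in the numerator
        "SSE/SST": sse / sst,     # assumes squared deviations
    }

observed = [1.0, 2.0, 3.0, 4.0]   # hypothetical values
predicted = [1.5, 2.0, 2.5, 4.0]
m = metrics(observed, predicted)
print(m)
```

For these sample values, MAE is 0.25 and both NAE and SSE/SST are 0.1; lower values of RMSE, MAE, NAE, and SSE/SST and higher values of R indicate a better fit.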

The SSL dataset was collected from the Tawang Chu river, which has a catchment area of 2737 sq km lying between latitudes 27°30ʹ00ʺ and 28°24ʹ00ʺ and longitudes 91°47ʹ00ʺ and 92°28ʹ00ʺ (Panda et al. 2014). The monsoon season lasts from May to September or early October. A detailed description of the datasets is presented in Hazarika et al. (2020b) and Gupta et al. (2020). The study area is shown in Fig. 3. We have named the datasets from 2013 and 2015 SSLD1 and SSLD2, respectively. In addition, to verify the efficiency of the proposed models, we have also used a dataset collected from the Pare river, India, which has a catchment area of 824 sq km. We name this dataset Pare SSLD. The statistics of the datasets are presented in Table 2.

Fig. 3 Study area for SSLD1 and SSLD2 (Panda et al. 2014)

Table 2 Details of the SSL datasets (g/L)

Results and analysis

Experiment on SSL datasets

The daily observed SSL data for SSLD1 and SSLD2 are shown in Figs. 4 and 5, respectively. For SSLD1, very low SSL rates can be observed from January 2013 to mid-March 2013. An increasing trend can be observed from May 2013 to September 2013 because of the monsoon season. The trend gradually decreases from mid-September 2013 until December 2013 because of the decrease in rainfall during the winter season.

Fig. 4 Observed daily sediment (g/L) for the SSLD1 dataset

Fig. 5 Observed daily sediment (g/L) for the SSLD2 dataset

In the case of the SSLD2 dataset, high SSL rates can be observed from May to September 2015. Several spikes can be observed during this period, which are due to irregularity in rain and wind speed. However, negligible SSL can be noticed from January to April and from October to December (winter season).

The experimental outcomes of SVR, LSSVR, AHSVR, WTSVR, ELM multiquard (ELM M), ELM radbas (ELM R), RVFL multiquard (RVFL M), RVFL radbas (RVFL R), and the proposed BCWRVFL multiquard (BCWRVFL M) and BCWRVFL radbas (BCWRVFL R) are presented in Table 3. Various performance indicators, viz., RMSE, MAE, SSE/SST, NAE, and R, have been used to evaluate the models. Generally, the R index directly compares the observed and the predicted values. It is observed that (a) for SSLD1, there is a 5.3443%, 6.4344%, 5.7246%, 6.3415%, 11.1111%, 5.4934%, 9.6843%, and 9.4501% increase in R value for the proposed BCWRVFL M model compared to SVR, LSSVR, AHSVR, WTSVR, ELM R, ELM M, RVFL R, and RVFL M, respectively.

Table 3 Performance of the models on the SSL time-series datasets

(b) For SSLD2, there is a 10.4244%, 12.8325%, 13.7213%, 11.1511%, 18.9818%, 8.9612%, 5.7537%, and 26.1948% increase in R value for the proposed BCWRVFL R model compared to SVR, LSSVR, AHSVR, WTSVR, ELM R, ELM M, RVFL R, and RVFL M.

(c) For Pare SSLD, there is a 20.1167%, 29.0193%, 29.6871%, 26.5539%, 17.7851%, 16.4982%, 84.7979%, and 28.4154% increase in R value for the proposed BCWRVFL R model compared to SVR, LSSVR, AHSVR, WTSVR, ELM R, ELM M, RVFL R, and RVFL M, respectively.

Further, the average rank based on the performance indicators is tabulated in Table 4. One can notice from Tables 3 and 4 that the proposed BCWRVFL model shows better or comparable prediction performance. To portray the relationship between the observed and the predicted values, the observed versus predicted values of the models along with their \({R}^{2}\) values are shown in Figs. 6 and 7 for the SSLD1 and SSLD2 datasets, respectively. From Fig. 6 it is observed that the proposed BCWRVFL M achieved a higher \({R}^{2}\) value (0.6724) than the SVR, LSSVR, AHSVR, WTSVR, ELM R, ELM M, RVFL R, and RVFL M models. Moreover, from Fig. 7, one can notice that the proposed BCWRVFL R achieved a higher \({R}^{2}\) value (0.8013) than the SVR, LSSVR, AHSVR, WTSVR, ELM R, ELM M, RVFL R, and RVFL M models. The following observations can be derived from Table 2, Table 3, Fig. 6, and Fig. 7:

  a) It is noticeable from Table 3 that the BCWRVFL multiquard model shows the best MAE, SSE/SST, NAE, and R values for the SSLD1 dataset.

  b) For the SSLD2 and Pare SSLD (SSLD3) datasets, the BCWRVFL multiquard model shows the best NAE and R values, respectively.

  c) Moreover, the proposed BCWRVFL radbas model shows the best RMSE values for all datasets.

  d) In addition, the BCWRVFL with the radbas activation function shows the best SSE/SST and R values for SSLD2 and the best NAE value for Pare SSLD (SSLD3).

  e) Fig. 6 shows the observed versus predicted plots of the reported models for SSLD1. One can notice that the predictions of the proposed BCWRVFL multiquard model are highly correlated with the observations.

Table 4 Ranks based on performance indicators for the reported models

Fig. 6 Observed versus predicted SSL of SVR, LSSVR, AHSVR, ELM radbas, ELM multiquard, RVFL radbas, RVFL multiquard, BCWRVFL radbas, and BCWRVFL multiquard in SSLD1 dataset

Fig. 7 Observed versus predicted SSL of SVR, LSSVR, AHSVR, ELM radbas, ELM multiquard, RVFL radbas, RVFL multiquard, BCWRVFL radbas, and BCWRVFL multiquard in SSLD2 dataset

As in Fig. 6, the proposed models show high correlation between the observed and predicted values in Fig. 7 for SSLD2.

Moreover, the autocorrelation function (ACF) and the partial ACF are presented in Figs. 8 and 9 for SSLD1 and SSLD2, respectively. The partial ACF removes the dependence on intermediate lags and thus indicates how strongly the SSL at a given lag is directly correlated with the current value.
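
To make the two quantities concrete, the sketch below computes a sample ACF and derives the partial ACF from it with the Durbin-Levinson recursion. It is an illustrative sketch only: the function names `acf` and `pacf_from_acf` and the sample values are hypothetical, and the recursion is one standard way of obtaining the partial ACF, not necessarily the routine used to produce the figures.

```python
def acf(x, max_lag):
    """Sample autocorrelation r_0..r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

def pacf_from_acf(r):
    """Partial autocorrelations from autocorrelations r (with r[0] == 1)
    using the Durbin-Levinson recursion."""
    pacf = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(r)):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

r = acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)   # hypothetical series
p = pacf_from_acf([1.0, 0.5, 0.25, 0.125])      # geometrically decaying ACF
print(r)
print(p)
```

In the second call the ACF decays geometrically, so the partial ACF is zero beyond lag 1: once the lag-1 dependence is accounted for, the higher lags carry no additional direct correlation.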

Fig. 8 Autocorrelation and partial autocorrelation function for SSLD1

Fig. 9 Autocorrelation and partial autocorrelation function for SSLD2

The model performances based on different indicators are plotted in Figs. 10 and 11 for SSLD1 and SSLD2 datasets, respectively. It can be noticed from Fig. 10 that the proposed BCWRVFL M shows the best NAE, SSE/SST, MAE, and R values while the BCWRVFL R shows the best RMSE value. Moreover, from Fig. 11 one can conclude that the proposed BCWRVFL R shows the best NAE score, while the BCWRVFL M shows the best RMSE, SSE/SST, and R values.

Fig. 10 Visualization of the evaluators for the SSLD1 dataset

Fig. 11 Visualization of the evaluators for the SSLD2 dataset

Experimental analyses on a real-world time-series dataset

To further check the applicability of the proposed BCWRVFL model on real-world TS datasets, we have conducted an experiment on a TS dataset named “POPULATION,” which contains the total population of India from 1961 to 2019. The dataset was downloaded from https://data.worldbank.org/ and has recently been used by Hazarika and Gupta (2020). The experimental results of BCWRVFL are compared with the traditional SVR, LSSVR, AHSVR, WTSVR, ELM, and RVFL models. The results are presented in Table 5. It can be noted that the BCWRVFL shows excellent prediction performance for the “POPULATION” dataset.

Table 5 Prediction performance of the reported models on “POPULATION” dataset (best result is bolded)

Conclusion

A novel hybrid model was developed and used to predict SSL in this study. It is well known that river SSL datasets contain non-stationary components, making it difficult to reach a decision using a single method. This prompted us to propose a hybrid prediction model. The newly proposed algorithm addresses the limitations of traditional prediction models by incorporating the benefits of two distinct models, namely, the maximal overlap discrete wavelet transform (MODWT) and the powerful RVFL. The boundary-corrected MODWT is combined with RVFL to create the hybrid model known as BCWRVFL. Experiments are performed on two SSL datasets collected from the Tawang Chu river, India, and an SSL dataset collected from the Pare river, India. The proposed BCWRVFL models are compared with the SVR, LSSVR, AHSVR, WTSVR, ELM, and RVFL models and evaluated using five performance indicators. The experimental outcomes reveal the potential of the BCWRVFL model for SSL estimation, as it shows close agreement with the observed records. The proposed BCWRVFL model can be applied to several real-world time-series applications such as wind speed prediction, price forecasting, energy consumption prediction, and others. However, the main limitation of the study is that only SSL data are considered. In the future, other climatological parameters such as rainfall intensity, wind speed, and evaporation need to be considered along with the SSL values. It is also suggested that the model be tested in areas/countries with more seasons and greater variability in weather conditions to determine its predictive strength. Moreover, it would be interesting to develop a wavelet-based deep RVFL network for SSL prediction in the future.