1 Introduction

Rivers are among the major water supply sources in many parts of the world. They undergo a variety of changes due to erosion and sedimentation. Studying changes in the geometrical and morphological characteristics of rivers under the influence of hydrodynamic phenomena such as sediment transport, tides, and salinity changes requires an in-depth understanding of these phenomena (Sharifi et al. 2021). Careful examination and prediction of these changes can significantly increase the safety of structures built along rivers. Conversely, a lack of knowledge about these phenomena can impose heavy damage and costs on projects (Li and Heap 2011). Therefore, characterizing the hydraulic properties of flows through hydrodynamic analysis is vital in designing and operating the hydraulic structures located around and along rivers, especially in times of floods (Hardy et al. 1999; Raber et al. 2007; Podhornyi et al. 2013).

Numerical models have several advantages over physical models when it comes to simulating the behavior of coastal systems. They are easier to implement and modify, as they only require software and hardware resources, while physical models need laboratory facilities and equipment. Numerical models can also handle a wide range of flow parameters and scenarios, such as different wave conditions, water levels, and sediment characteristics, while physical models are limited by scaling issues and experimental constraints. Furthermore, numerical models are more cost-effective and time-efficient than physical models, as they do not involve physical construction, maintenance, and measurement of the model domain. Therefore, numerical models are preferred for studying complex and dynamic coastal processes (Larson 2005; Quarteroni and Quarteroni 2009).

The accuracy of numerical hydrodynamic flow modeling depends on the correctness of the riverbed’s topographic data (Lai et al. 2018). The most common method for surveying riverbed topography is bathymetry using SONAR (Sound Navigation and Ranging) (Intelmann 2006; Nittrouer et al. 2008; Colbo et al. 2014). This approach is very accurate and capable of producing topographic data of the riverbed with high quality and clarity. Nevertheless, the SONAR method is costly and lacks accuracy in shallow water conditions (Lai et al. 2018). Thus, various studies have been conducted during the past few years to introduce inexpensive and accurate methods for bathymetry operations. Merwade et al. (2005), for example, introduced a model based on establishing relationships between geometric properties, bottom-line position, and cross-sections of rivers. Lai et al. (2018) proposed an efficient model using Laplace’s theory and network generation based on elliptical theory for flow lines to reconstruct the geometric sections of rivers. Data from a river in China were used to validate the model. The results revealed that the model could estimate the longitudinal section of the river based on the information on cross-sections. Other methods include interpolation techniques such as the kriging method (KM), the inverse distance weighting technique (IDW), and the nearest neighbor model (NN), which are based on historical data and interpolation of other points in the riverbed. Interpolation methods outperformed the traditional ones (Vogel and Märker 2010; Li and Heap 2011; Bailly du Bois 2012). Since the historical data on a river’s geometry are scarce in some cases, and the available data are only related to cross-sections, it is challenging to find new points for river geometry, and therefore, accurate results cannot be expected. This leads to an urgent need to develop methods that can model river bathymetry in different conditions with high accuracy. In a related application, Fu et al. (2022) used a fuzzy comprehensive evaluation model to assess a multifunctional river, the Xiaoqing River in eastern China.

In recent years, methods based on soft computing and artificial intelligence have overcome the weaknesses of traditional methods in estimating and modeling many complex phenomena (Riahi-Madvar et al. 2021; Mai et al. 2020; Ben Seghier et al. 2021; Rad et al. 2022; Jafari-Asl et al. 2021a; Rahmanshahi et al. 2023). Artificial neural networks (ANNs), support vector machines (SVM), the response surface method (RSM), gene expression programming (GEP), the adaptive network-based fuzzy inference system (ANFIS), and multivariate adaptive regression splines (MARS) are among these approaches. Among them, ANN and ANFIS are very popular thanks to their ease of use, flexibility, and high accuracy. Hence, many attempts have been made to apply ANN and ANFIS in hydraulic engineering. For instance, they have been used for flow estimation in rivers, design of hydraulic structures, modeling evaporation from the surfaces of rivers and lakes, groundwater level estimation, and water quality modeling (Afzali Ahmadabadi et al. 2023; Jafari-Asl et al. 2021b; Seo et al. 2015; Dikshit et al. 2020; Modaresi et al. 2018; Ebtehaj et al. 2020; Babaei et al. 2018; Ohadi and Jafari-Asl 2021; Walton et al. 2019; Langridge et al. 2020).

However, no comprehensive study has yet been conducted using these two methods to solve the problem of bathymetry in rivers. Thus, the present study was undertaken to present a new approach based on AI models, including ANFIS and ANN, for the accurate estimation of river geometry. To the best of the authors’ knowledge, the capability of ANN and ANFIS models has not been previously investigated for modeling the bathymetry of rivers. The rest of this paper is organized as follows: The theory of ANFIS, ANN, and the River Channel Morphology Model (RCMM), together with the data collection procedure, is presented in Section 2. The results of the proposed framework for the two case studies are presented and analyzed in Section 3. A comparative study is carried out in Section 4, and the findings are discussed in Section 5. Finally, conclusions and future work are given in Section 6.

2 Methodology

2.1 Artificial Neural Network (ANN)

Typically, an ANN consists of three types of layers: input, hidden, and output layers. The outputs of one layer are the inputs of the next, and the layers between the input and output layers are called hidden layers. Each hidden layer is made up of a set of neurons responsible for performing calculations (Noori and Kalin 2016). ANN utilizes a supervised learning technique for training. Network training aims to learn the relationship between inputs and outputs so that the network provides a suitable output for each input, which requires adequate adjustment of the network weights.

In the training process, the predicted values are compared with the actual values in each iteration in order to minimize the error. If an error occurs, the weighting coefficients of the input vectors are corrected. Each neuron computes a weighted sum of its inputs plus a bias, according to Eq. (1).

$${y}_{0}=\sum_{i=1}^{n}{W}_{i}{X}_{i}+b$$
(1)

where \({W}_{i}\) is the weight of the connection branch for input \({X}_{i}\) and b is the bias of the network.

With a nonlinear transfer function (also known as the activation function), such as the sigmoid function in Eq. (2), the output value is calculated according to Eq. (3).

$$f(X)=\frac{1}{1+{e}^{-X}}$$
(2)
$${y}_{0}=f\left(\sum_{i=1}^{n}{W}_{i}{X}_{i}+b\right)$$
(3)

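Finally, an ANN with one hidden layer is represented by the following equation:

$${y}_{0}=f\left[\sum W{O}_{kj}\sum W{I}_{ij}{X}_{i}+{b}_{1}\right]+{b}_{2}$$
(4)

where \(W{I}_{ij}\) and \(W{O}_{kj}\) are the weights of the input-side and output-side connections, respectively, and \({b}_{1}\) and \({b}_{2}\) are the corresponding biases.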
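
As a minimal sketch of Eqs. (1) to (4), the forward pass of a network with one hidden layer can be written in plain Python. The function names, the example weights, and the choice of sigmoid hidden neurons followed by a linear output neuron are illustrative assumptions (one common reading of Eq. 4), not the configuration used in this study.

```python
import math

def sigmoid(x):
    """Logistic activation function, Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias, activation=None):
    """Weighted sum of inputs plus bias, Eq. (1); optionally activated, Eq. (3)."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(s) if activation else s

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer of sigmoid neurons followed by a linear output neuron."""
    hidden = [neuron(inputs, w, b, sigmoid)
              for w, b in zip(hidden_weights, hidden_biases)]
    return neuron(hidden, output_weights, output_bias)

# Hypothetical example: two inputs (e.g. normalized left- and right-bank
# coordinates) and two hidden neurons. All numbers are made up.
x = [0.2, 0.8]
WI = [[0.5, -0.3], [0.1, 0.9]]   # input-to-hidden weights
b1 = [0.0, 0.1]                  # hidden biases
WO = [1.0, -1.0]                 # hidden-to-output weights
b2 = 0.05                        # output bias
y0 = forward(x, WI, b1, WO, b2)
```

In practice the weights and biases are not set by hand but adjusted during training so that the prediction error is minimized.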

Determining the optimal number of neurons is of particular importance in network training (Ben Seghier et al. 2021). Too many neurons cause overfitting, while too few result in underfitting. The optimal number is usually found through sensitivity analysis. It is noteworthy that an activation function (often sigmoid, hyperbolic tangent, or linear) is used in an ANN to create a connection between inputs and outputs. Figure 1 shows the overall structure of an ANN.

Fig. 1 A schematic structure of ANN

2.2 Adaptive Neuro-Fuzzy Inference System (ANFIS)

ANFIS is a neural network based on fuzzy inference systems (FIS) of the Mamdani and Sugeno types. Mamdani fuzzy inference was first introduced as a method to create a control system by synthesizing a set of linguistic control rules obtained from experienced human operators. Since Mamdani systems have more intuitive and easier-to-understand rule bases, they are well suited to expert system applications where the rules are created from human expert knowledge, such as medical diagnostics. In a Mamdani system, the output of each rule is a fuzzy set derived from the output membership function and the implication method of the FIS. These output fuzzy sets are combined into a single fuzzy set using the aggregation method of the FIS. Then, to compute a final crisp output value, the combined output fuzzy set is defuzzified.

Sugeno fuzzy inference, also referred to as Takagi-Sugeno-Kang fuzzy inference, uses singleton output membership functions that are either constant or a linear function of the input values. The defuzzification process for a Sugeno system is more computationally efficient than that of a Mamdani system, since it uses a weighted average or weighted sum of a few data points rather than computing the centroid of a two-dimensional area.

In the current research, we use the Sugeno fuzzy inference system, which combines ANN and FIS in a coordinated structure. The ANFIS architecture consists of 5 layers, as shown in Fig. 2. The nodes of the first and fourth layers are adaptive, while the other nodes are fixed. If x is A1 and y is B1, the output of the first rule is as follows (Riahi-Madvar et al. 2021):

Fig. 2 A schematic structure of ANFIS

$$f={p}_{1}x+{q}_{1}y+{r}_{1}$$
(5)

If x is A2 and y is B2, the output of the second rule is as follows:

$$f={p}_{2}x+{q}_{2}y+{r}_{2}$$
(6)

where \({p}_{i}\), \({q}_{i}\), and \({r}_{i}\) are the consequent parameters that must be determined during the ANFIS training process. The outputs of the first layer are given by Eqs. (7) and (8).

$${Q}_{1,i}={\sigma }_{{A}_{i}}(x)$$
(7)
$${Q}_{1,i}={\sigma }_{{B}_{i}}(y)$$
(8)

where \({A}_{i}\) and \({B}_{i}\) are the linguistic labels, x and y are the inputs of node i, and \({Q}_{1,i}\) is the membership grade of the input in \({A}_{i}\) or \({B}_{i}\). The membership function is usually defined as the Gaussian function (Eq. 9).

$${\sigma }_{{A}_{i}}(x)=\mathrm{exp}\left(\frac{-{(x-c)}^{2}}{{\sigma }^{2}}\right)$$
(9)

where σ on the right-hand side is the standard deviation and c is the center of the Gaussian membership function. In the second layer, the firing strength of each rule is calculated according to Eq. (10).

$${w}_{i}={\sigma }_{{A}_{i}}(x)\times {\sigma }_{{B}_{i}}(y), \quad i=1,2$$
(10)

In the third layer, the firing strength of each rule is normalized as follows:

$${\overline{w} }_{i}=\frac{{w}_{i}}{{w}_{1}+{w}_{2}}$$
(11)

In the fourth layer, the weighted output of each rule is calculated according to Eq. (12):

$${\overline{w} }_{i}{f}_{i}={\overline{w} }_{i}({p}_{i}x+{q}_{i}y+{r}_{i})$$
(12)

In the fifth layer, all outputs of the fourth layer are summed as follows:

$$\sum_{i}{\overline{w} }_{i}{f}_{i}=\frac{\sum_{i}{w}_{i}{f}_{i}}{\sum_{i}{w}_{i}}$$
(13)
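
The five layers can be traced in a short sketch of a two-rule, first-order Sugeno system with Gaussian membership functions. The function names and all parameter values below are hypothetical and chosen only for illustration; in ANFIS the premise parameters (c, σ) and the consequent parameters (p, q, r) are fitted during training.

```python
import math

def gaussian_mf(x, c, sigma):
    """Gaussian membership function as written in Eq. (9)."""
    return math.exp(-((x - c) ** 2) / sigma ** 2)

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input Sugeno ANFIS, following layers 1 to 5.

    premise:    per rule, ((c_A, sigma_A), (c_B, sigma_B))
    consequent: per rule, (p, q, r)
    """
    # Layers 1 and 2: membership grades and firing strengths, Eqs. (7), (8), (10)
    w = [gaussian_mf(x, *a) * gaussian_mf(y, *b) for a, b in premise]
    # Layer 3: normalized firing strengths, Eq. (11)
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted rule outputs, Eq. (12)
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output, Eq. (13)
    return sum(rule_out)

# Hypothetical parameters for two rules.
premise = [((0.0, 1.0), (0.0, 1.0)), ((1.0, 1.0), (1.0, 1.0))]
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 0.5)]
z = anfis_forward(0.5, 0.5, premise, consequent)
```

Because the output is a weighted average of the rule outputs, no centroid computation is needed, which is the efficiency advantage of Sugeno inference noted earlier.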

In the proposed thalweg modeling approach, as shown in Fig. 3, the geographic coordinates of the right and left banks of a river (Xs, Ys, and Zs) are used to estimate the corresponding geographic coordinates of the thalweg (X, Y, and Z).

Fig. 3 Flowchart of proposed AI-based model

2.3 River Channel Morphology Model (RCMM)

RCMM is an efficient conceptual model for estimating the thalweg of rivers. This model uses the information of a river’s planform to simulate the topography (Merwade et al. 2005). In brief, erosion and deposition processes produce an asymmetric channel cross-section in meandering rivers. RCMM conceptualizes this physical process to build an empirical channel cross-section. The main steps of RCMM are presented below (Dey et al. 2019):

  • Stage 1: In this stage, the configuration of a river/channel (i.e., channel depth and width) is transformed into a non-dimensional space.

  • Stage 2: A power law function (Eq. 14) is used to locate the thalweg by utilizing the radius of curvature of the channel centerline.

    $${t}^{*}=\left\{\begin{array}{cc}a{\left({r}^{*}\right)}^{-b}-0.5 &{r}^{*}>2\\ 0,& {r}^{*}\le 2\end{array}\right.$$
    (14)

    where \({t}^{*}\) denotes the location of the river thalweg in the normalized coordinate system, and \({r}^{*}\) is the normalized radius of curvature of the centerline segment. The coefficients of \(a\) and \(b\) are calibrated for the river based on the observed bathymetry data.

  • Stage 3: In this stage, a composite beta function is used to create a cross-sectional shape from three points: the left-bank location, the right-bank location, and the thalweg.

    $$\widehat{{\mathcal{Z}}^{*}}=\left\{f\left({n}^{*}|{\alpha }_{1}, {\beta }_{1}\right)+f\left({n}^{*}|{\alpha }_{2}, {\beta }_{2}\right)\right\}\times k$$
    (15)

    where \(\widehat{{\mathcal{Z}}^{*}}\) represents the predicted depth, \(f\left({n}^{*}|{\alpha }_{1}, {\beta }_{1}\right)\) and \(f\left({n}^{*}|{\alpha }_{2}, {\beta }_{2}\right)\) are the two beta functions, and \(k\), \({\alpha }_{1}\), \({\beta }_{1}\), \({\alpha }_{2}\) and \({\beta }_{2}\) are coefficients depending on the thalweg location.

This leads to an asymmetric channel cross-section with the thalweg being closer to the outer bank in a meander.
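
Stages 2 and 3 can be sketched as follows. This is only an illustration under stated assumptions: the beta function f is taken to be the standard beta probability density on (0, 1), n* is treated as a normalized cross-channel position, and the coefficient values are made up rather than calibrated.

```python
import math

def thalweg_location(r_star, a, b):
    """Normalized thalweg location t* from the power law in Eq. (14)."""
    if r_star > 2:
        return a * r_star ** (-b) - 0.5
    return 0.0

def beta_pdf(n_star, alpha, beta):
    """Beta density on (0, 1); assumed here to be f(n*|alpha, beta) of Eq. (15)."""
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    return n_star ** (alpha - 1) * (1 - n_star) ** (beta - 1) / norm

def normalized_depth(n_star, alpha1, beta1, alpha2, beta2, k):
    """Sum of two beta functions scaled by k, Eq. (15)."""
    return (beta_pdf(n_star, alpha1, beta1) + beta_pdf(n_star, alpha2, beta2)) * k

# Hypothetical coefficients; in RCMM, a and b are calibrated from observed
# bathymetry and the beta parameters depend on the thalweg location.
t_star = thalweg_location(r_star=5.0, a=1.0, b=1.0)
depth_profile = [normalized_depth(i / 10, 2.0, 3.0, 3.0, 2.0, 0.5)
                 for i in range(1, 10)]
```

Choosing unequal weights or shape parameters for the two beta functions shifts the deepest point toward one bank, which is how an asymmetric section would be obtained in this sketch.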

2.4 Data Collection

To determine the thalweg of a river using the proposed framework, it is first necessary to gather the topographic data of the river. Therefore, in the present study, the geographical position of the river was first identified on Google Earth, and an image of the desired area was captured. In the next step, the image was uploaded into Global Mapper software and georeferenced using the available coordinates. Afterward, the digital elevation model (DEM) file for the desired coordinates was downloaded and used to generate contour lines. The map was then digitized in AutoCAD Civil 3D using the “Pline” command on the borders and any important lines in the river environment. Next, the extracted contour lines were imported into Civil 3D, and the desired lines were drawn, including the cross-sections perpendicular to the left and right banks. In addition, the elevation of the points on the constructed surface can be calculated by applying the “Point-divide obj” command to the desired points on the drawn lines and interpolating them. Finally, the position of the thalweg is determined from the points with the lowest elevation on the cross-sections.
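
The last step, picking the lowest point on each cross-section, can be illustrated with a short sketch. The data layout (a list of cross-sections, each a list of (x, y, z) tuples) and the sample values are assumptions made for this example; they do not come from the surveyed rivers.

```python
def thalweg_from_cross_sections(cross_sections):
    """Return the lowest point (x, y, z) of each cross-section.

    cross_sections: list of cross-sections, each a list of (x, y, z) tuples.
    """
    return [min(section, key=lambda point: point[2]) for section in cross_sections]

# Hypothetical interpolated points for two cross-sections (meters).
sections = [
    [(0.0, 0.0, 12.1), (5.0, 0.0, 10.4), (10.0, 0.0, 11.8)],
    [(0.0, 20.0, 12.0), (5.0, 20.0, 10.9), (10.0, 20.0, 10.2)],
]
thalweg = thalweg_from_cross_sections(sections)
```

Connecting the selected points from upstream to downstream gives the thalweg line used as the prediction target.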

3 Results and Implementation

3.1 Case Study I

The performance of the proposed approach, using Adaptive Neuro-Fuzzy Inference System (ANFIS) and Artificial Neural Network (ANN) models to reproduce the geometry of rivers, was evaluated in two different case studies. The first case is the Shuibeicun reach of the Qinhe River in China (Fig. 4). The X-, Y-, and Z-coordinates of the river’s left bank, right bank, and thalweg are available.

Fig. 4 Location of the Shuibeicun reach (Liu et al. 2018)

After preparing the required data (i.e., normalization and splitting into training and testing sets) and adjusting the parameters of ANFIS and ANN based on sensitivity analysis, the transverse and longitudinal sections of the river were modeled. The ANFIS and ANN parameter settings are presented in Table 1.

Table 1 Parameters setting for two methods

It is worth noting that the normalization of input data to AI-based models leads to data homogenization and increases prediction accuracy. In this study, Eq. (16) was used to normalize the collected data.

$${x}_{n}=\frac{{x}_{max}-x}{{x}_{max}-{x}_{min}}$$
(16)

where \({x}_{n}\) is the normalized value of variable \(x\), and \({x}_{max}\) and \({x}_{min}\) are the maximum and minimum values, respectively.
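
A minimal sketch of Eq. (16) is given below, together with its inverse for converting predictions back to original units. The function names and sample values are illustrative. Note that, as written, Eq. (16) maps the maximum value to 0 and the minimum to 1.

```python
def normalize(values):
    """Scale values to [0, 1] using Eq. (16): (x_max - x) / (x_max - x_min)."""
    x_max, x_min = max(values), min(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("all values are equal; Eq. (16) is undefined")
    return [(x_max - x) / span for x in values]

def denormalize(normalized, x_max, x_min):
    """Invert Eq. (16) to recover values in the original units."""
    return [x_max - xn * (x_max - x_min) for xn in normalized]

# Hypothetical bank elevations in meters.
z = [10.0, 12.0, 14.0]
z_n = normalize(z)
```

The maximum and minimum should be taken from the training data and reused for the test data so that both sets share the same scale.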

To evaluate the performance of the ANFIS and ANN models and to examine their prediction accuracy, the Root Mean Square Error (RMSE), a common evaluation criterion in the technical literature, was used (Eq. 17). The lower the RMSE value, the higher the model’s prediction accuracy.

$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({y}^{i}-{\widehat{y}}^{i}\right)}^{2}}$$
(17)

where \({y}^{i}\) and \({\widehat{y}}^{i}\) are the observed and predicted values, respectively, and n indicates the number of samples. It should be noted that each of the coordinates, including X, Y, and Z, is modeled separately for the river's thalweg. For example, only the right- and left-bank X values are used to predict the X-coordinates of the thalweg. Likewise, for Y and Z, only the left- and right-bank Y and Z values are used. Therefore, the parameter n in the above equation is the number of data points related to each coordinate in each stage of modeling.
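
Eq. (17) and the per-coordinate evaluation can be sketched as follows. The dictionary layout and the numbers are hypothetical and serve only to show that each coordinate receives its own RMSE.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error, Eq. (17)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / n)

# Hypothetical thalweg coordinates; X, Y, and Z are scored separately.
observed = {"X": [1.0, 2.0, 3.0], "Y": [4.0, 5.0, 6.0], "Z": [0.5, 0.4, 0.3]}
predicted = {"X": [1.0, 2.0, 5.0], "Y": [4.0, 5.0, 6.0], "Z": [0.5, 0.4, 0.3]}
scores = {axis: rmse(observed[axis], predicted[axis]) for axis in observed}
```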

The data collected for the geometry of the studied rivers was modeled in the following combinations:

  (a) Train (50%), Test (50%).

  (b) Train (30%), Test (70%).

To eliminate the impacts of the selected data on the models’ performance, we ran each model 10 times and randomly selected and modeled new data in each iteration. The best results for the first case study geometry by using ANFIS and ANN are reported in Table 2.
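
The repeated random splitting can be sketched as below. A simple least-squares line stands in for the ANN and ANFIS models, and the data are synthetic; both are assumptions made so that the example stays self-contained. Only the procedure (10 random splits per combination, keeping the best run) follows the text.

```python
import math
import random

def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept (stand-in for ANN/ANFIS)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(observed, predicted):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def repeated_split_rmse(features, targets, train_fraction, runs=10, seed=0):
    """Test RMSE for each of `runs` random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(features)))
    n_train = int(round(train_fraction * len(indices)))
    results = []
    for _ in range(runs):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        slope, intercept = fit_line([features[i] for i in train],
                                    [targets[i] for i in train])
        predicted = [slope * features[i] + intercept for i in test]
        results.append(rmse([targets[i] for i in test], predicted))
    return results

# Synthetic data: the feature is the midpoint of the two bank coordinates
# (an assumption), and the thalweg is that midpoint plus bounded noise.
data_rng = random.Random(42)
left = [float(i) for i in range(40)]
right = [value + 10.0 for value in left]
midpoint = [(l + r) / 2 for l, r in zip(left, right)]
thalweg = [m + data_rng.uniform(-1.0, 1.0) for m in midpoint]

for fraction in (0.5, 0.3):  # combinations (a) and (b)
    errors = repeated_split_rmse(midpoint, thalweg, fraction)
    print(f"train {fraction:.0%}: best test RMSE = {min(errors):.3f}")
```

Reporting the spread of the 10 values alongside the best one would give a fuller picture of how sensitive each model is to the data selection.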

Table 2 The values of statistical indicators for the AI-based proposed models for predicting the thalweg of rivers

As can be seen from Table 2, the two methods produced similar results, but based on the average RMSE, the accuracy of the ANFIS estimates is generally higher than that of the ANN. The table also shows that both models (ANN and ANFIS) are more accurate with the first combination (Train 50%, Test 50%) than with the second (Train 30%, Test 70%). The predicted and measured data sets are presented in the Appendix.

Figure 5 plots the predicted coordinates of the river thalweg using both models. This figure shows the best results of each model in 10 runs. With the ANFIS model, the best test-phase RMSE values out of all the runs were 0.32 for the X-axis, 0.62 for the Y-axis, and 0.51 for the Z-axis.

Fig. 5 Results of the predicted thalweg of the Qinhe River (China) using ANFIS and ANN

Likewise, for the ANN model, the best RMSE values out of all the runs were 4.48 for the X-axis, 4.9 for the Y-axis, and 5.36 for the Z-axis.

As shown in Fig. 5, the modeled thalweg agrees well with the measured thalweg. Figure 6 also shows the correlation between the predicted and measured values for both the ANFIS and ANN models along the X- and Y-axes for all samples, including the training and testing phases. The AI-based methods estimate the geometry of rivers with high accuracy: the correlation coefficients of both models are 0.99 and 0.98 for the X-axis and Y-axis values, respectively.
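
For reference, a correlation coefficient between measured and predicted coordinates can be computed as in the sketch below. It assumes the Pearson coefficient, which the text does not state explicitly, and uses made-up values.

```python
import math

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical observed and predicted thalweg X-coordinates.
observed_x = [100.0, 110.0, 125.0, 140.0, 160.0]
predicted_x = [101.0, 109.0, 126.0, 138.0, 161.0]
r = pearson_r(observed_x, predicted_x)
```

A correlation close to 1 does not by itself imply a small RMSE, since a constant offset leaves the correlation unchanged, so the two metrics are best read together.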

Fig. 6 The overall scatterplots of the proposed approach of ANFIS and ANN models (training and testing phases)

To show the accuracy of the AI-based models in estimating the coordinates of the thalweg, Fig. 7 overlays the estimated and observed Z values. As can be seen, the models have accurately estimated the thalweg elevation.

Fig. 7 The overall time series plots using the ANN and ANFIS models (training and testing performance)

3.2 Case Study II

The second case study is a 22 km reach of the Gaz River, located in Khuzestan Province, Iran. The Gaz is 55 km long and is the main drainage channel of the Gaz drainage basin; it originates in the Bashagard Mountains and flows into the Oman Sea. The topographic information of the river used in this study was collected through a field survey (camera mapping). Figure 8 shows the Gaz basin and river.

Fig. 8 Location of the second case study

Following the same steps used to predict the thalweg of the Qinhe River, the thalweg of the Gaz was modeled using ANFIS and ANN. To this end, the same parameter settings used in the previous case study were applied to the Gaz River.

Table 3 reports the best results among 10 runs for the ANFIS and ANN models. According to this table, both models have approximately the same accuracy. However, as in the previous case study, the ANFIS model was generally more accurate and robust than the ANN. Moreover, the thalweg obtained through the ANFIS and ANN models is plotted in Fig. 9 to show the accuracy of the AI-based models.

Table 3 The values of statistical indicators for the proposed AI-based models for predicting the thalweg of rivers in the overall phases

Fig. 9 Results of the predicted thalweg of Gaz River using ANFIS and ANN

Figure 9 illustrates that AI-based models simulate the geometry of rivers with high accuracy, which makes them a good alternative to field calculations and analytical models. As in the first case study, Fig. 9 shows the best results of each model in 10 runs.

With the ANFIS model, the selected RMSE values were 1.69, 2, and 0.34 for the X-, Y-, and Z-axes, respectively. Similarly, with the ANN model, the RMSE values of 1.29, 1.38, and 0.52 were selected for the X-, Y-, and Z-axes out of 10 runs. Also, Fig. 12 shows the error of the ANFIS and ANN models for the Z-axis of the thalweg.

Figures 10, 11, and 12 indicate that both models could estimate the coordinates of the thalweg with acceptable accuracy.

Fig. 10 The curve fitting of the proposed framework to predict the coordinate X using ANFIS and ANN for Gaz river

Fig. 11 The curve fitting of the proposed framework to predict the normalized coordinate Y using ANFIS and ANN in Gaz river

Fig. 12 The curve fitting of the proposed framework to predict the normalized coordinate Z using ANFIS and ANN in Gaz river

4 Comparative Performance

As mentioned in Section 2.3, RCMM is a conceptual model that has been widely used to estimate the thalweg of rivers. In this section, the AI-based framework is compared against RCMM to show the efficiency of the AI-based models. To this end, the geographic coordinates X and Y of the thalweg in Case Study I were predicted from the geographic coordinates of the right and left banks of the river using RCMM, and the outcomes were compared with those of the two AI-based models.

Table 4 shows the values of the statistical metrics for the three models: ANFIS, ANN, and RCMM. According to this table, the AI-based models outperform the conceptual model in terms of RMSE.

Table 4 Comparison of prediction results of AI-based methods and RCMM

Moreover, the best model is the ANFIS, with RMSE values of 1.69 m, 2 m, and 0.34 m, while the accuracy of the RCMM model is lower than that of the two AI-based models.

Overall, the conceptual model does not simulate the thalweg of rivers as accurately as the AI-based models, since it shows considerable errors at the measured points. In contrast, the AI-based models simulated the thalweg accurately, with the ANFIS method showing the best agreement with the measurements. Thus, among the models considered, the ANFIS is the best for predicting the thalweg of rivers. It can be concluded that, using AI methods, the thalweg of rivers can be accurately estimated at low computational cost.

5 Discussion

In this study, the performance of the AI-based models was compared with that of a conceptual model, namely RCMM, using data from two rivers in China and Iran. The results showed that the AI-based models performed better than RCMM in terms of accuracy and efficiency. The ANFIS model also outperformed the ANN model, especially when data availability was limited. This raises several questions: why do the AI-based models work, why do they give better results than the conceptual model, and what are the advantages and disadvantages of both types of models? AI-based models are data-driven methods that learn patterns and relationships from the input and output data, without requiring prior knowledge or assumptions about the system or process being modeled. Therefore, they can adapt to different situations and conditions, as long as enough data are available to train and test them. On the other hand, AI-based models are highly dependent on the quality and quantity of the data they use, and they may not generalize to new cases or scenarios if the data are not representative of the system or process being modeled. AI-based models are also often considered black-box methods that do not reveal how they make their predictions or which features drive their decisions.

6 Conclusion

Given the importance of river topography in hydrodynamic modeling, this study proposed a new approach based on ANFIS and ANN methods for modeling and estimating a river’s thalweg. The efficiency of these models was evaluated in two cases, the Qinhe River in China and the Gaz River in Iran. The stability of the models was also evaluated, and the influence of the selection of training and testing data was reduced by repeating the modeling 10 times with randomly selected data. The modeling results for the first case, which had a smaller database, showed that the ANFIS had better accuracy than the ANN in modeling the river’s thalweg. For the second case, which had a larger database, the ANFIS also simulated the river coordinates more accurately overall. In general, comparing these two methods shows that the ANFIS model is more robust than the ANN model because its error metrics (RMSE) were generally lower than those of the ANN. Nevertheless, both models can be used as effective tools for estimating the thalweg of rivers. Moreover, comparing the results of the AI-based models with those of the RCMM model showed that the ANFIS yielded the lowest prediction error. Since the weight and bias values in ANN and the parameters of ANFIS play a significant role in prediction accuracy, it is suggested that future studies use meta-heuristic optimization methods to determine these parameters. It is also recommended to develop a robust model for estimating the topography of rivers by combining an uncertainty-based method such as MCS with the ANFIS and ANN models, taking into account the effect of existing uncertainties. Finally, in order to better evaluate the effectiveness of artificial intelligence models, it is recommended that the models be trained using data from 10 rivers and then applied to another river to determine their accuracy.