1 Introduction

Human beings have endeavored to determine the features of water, bring it under control, define the laws governing its movement, prevent possible threats, and make the best use of water to survive in every phase of their lives (Kuru and Tezer 2020). The increase in water demand owing to global warming, drought, climate change, unplanned consumption, industrialization and agricultural use puts pressure on clean water resources. The way to reduce the threat is the efficacious use of available resources and water management. Sustainable water resources planning and effective river basin management present significant benefits for engineers and decision-makers. One of the critical points to ensure sustainability is the forward estimation of river flows (Kilinc and Haznedar 2022).

Long-term flow forecasting is required for optimal management of hydrological resources, ecological restoration, flood mitigation, and power generation (Ni et al. 2020). Numerous models are utilized to enhance the prediction accuracy of hydrological time series. Generally, these models are divided into physical-based and data-based models. Physical-based models require multi-source knowledge, powerful mathematical calculation tools, and a complex parameter optimization process. On the other hand, data-based models are an efficient alternative that establishes direct connections between input and output data without comprehending complex physical mechanisms (Chau et al. 2005).

Nowadays, computational issues have risen exponentially compared to previous years. Those growing issues have led to the emergence of too much data, which enabled the process of a large amount of data. In addition, it has become possible to predict a future situation or event from old data stacks. Machine learning emerged from these large data piles to solve certain problems or make predictions for certain situations (Dehghani et al. 2018).

Hydrological systems have a complex nature and are quite difficult to implement over large areas. Thus, researchers have turned to the use of relatively faster and more flexible artificial intelligence techniques based on simplifications to estimate the hydrological flow. Artificial intelligence techniques have added a new dimension to the theoretical modeling approach as they can solve various issues and provide successful results related to water resources and hydrological engineering. The hybrid and updated designs of the models mentioned in Artificial Neural Networks (ANN), Adaptive Neuro-Fuzzy Inference System (ANFIS), Gene Expression Programming (GEP), Long Short Term Memory (LSTM), and Gated Recurrent Units (GRU) are the main topics in the relevant literature (Gerger et al. 2021).

In recent years, the performance of models has been analyzed in scientific studies by integrating artificial intelligence techniques with various parameters such as temperature. Gerger et al. (2021) evaluated the relationship between estimation and flow measurement with the help of ANFIS, ANN and GEP methods by using the monthly average current values of two stations within the boundaries of the Dicle Basin. Kisi et al. (2014) modeled the rainfall-runoff relationship for a small basin in Turkey by conducting ANFIS, ANN and GEP methods with 4 years of data. These models were compared to the multi-liner regression (CDR) method. As a result, it has emerged that the gap method was successful in modeling the estimation-flow process. Moreover, it was thought the model mentioned could be considered as an alternative to other applied artificial intelligence models. Akrami et al. (2014) conducted a hybrid study on the connection of forecasting the flow in the Klang River basin with ANFIS and enhanced MANFIS. The outcomes of the study demonstrated that the MANFIS method contains a lower error and calculation complexity and higher precipitation prediction when MANFIS compared to the traditional ANFIS model. Yaseen et al. (2017) determined the monthly flow estimation of the Pahang River in favor of a new approach of ANFIS, called ANFIS-FFA. The study displayed that FFA could improve the prediction accuracy of the model in a situation where the hybrid model was compared with traditional ANFIS. In addition to all these studies, Calp (2019) analyzed the amount of local precipitation with ANFIS-GA and ANN models which were optimized with the Genetic Algorithm (GA) by using 10 years of meteorological data from the Basel region of Switzerland. It has been revealed that the model could be efficiently used in the prediction of meteorological events since the predictive success of the results obtained was quite high. Zare and Koch (2018) have studied groundwater level fluctuations in the Miandarband plain of Iran. They applied the new hybrid Wavelet-ANFIS model with various combinations of input and main wavelets for the estimation of the ripple. The hybrid model performed better than other model variants. Seifi and Riahi-Madvar (2019) developed a hybrid digital intelligence model for distribution in channel flows. ANFIS, which was optimized by GA and intelligence based on ANN operated. The study revealed that the model presented better results than other classical methods. According to Azad et al. (2018), Particle Swarm Optimization (PSO) exhibited an improvement in the performance of classical ANFIS for all measurements in river flow estimation. In order to optimize the prediction performance, Qasem et al. (2017) analyzed the models of ANFIS-GA, ANFIS-DEC, and ANFIS-PSO. As a result of his studying, ANFIS-PSO was found to be the one that provided the best performance among evolutionary algorithms. Xu et al. (2020) applied the LSTM network, which targeted the time series data field in the process of predicting the river flow estimation. LSTM's performance was compared with Support Vector Regression (SVR) and Multilayer Perceptron Method (MLP). As a result of the performance, it turned out that LSTM provided more promising results. Likewise, Latifoglu et al. (2015) precipitated river flow estimation by using LSTM networks. He presented the performance analysis of the new model developed by Single Spectrum Analysis (TSA) for the forecast data. On the other hand, Kara (2019) proposed the LSTM network to create an efficient and accurate model for daily solar radiation estimation. The effectiveness of the proposed method was compared with machine learning algorithms. It has been suggested that the mentioned model has more pleasing performance than other machine learning models. Lian et al. (2022a, b) proposed a cause-driven runoff forecasting framework based on the LSTM hybrid model. The results showed that the hybrid model was superior to other models. Poursaeid et al. (2022) used different artificial intelligence models (extreme learning machine (ELM), least square support vector machine (LSSVM), adaptive neuro-fuzzy inference system (ANFIS), and multiple linear regression (MLR) model) to simulate and estimate the groundwater level. The results revealed that ELM model was more accurate than other methods. Wang et al. (2022) proposed a novel application of the wavelet decomposition-prediction-reconstruction (WDPRM) model. The model consisted of wavelet decomposition, a particle swarm optimization support vector machine (PSO-SVM), and an artificial bee colony algorithm-optimized BP neural network (ABC-BP). The results indicated that the WDPRM model has advantages against single prediction methods. Khosravi et al. (2022) hybridized convolutional neural network (CNN) using a metaheuristic bat-inspired algorithm to improve the test accuracy of streamflow forecasting. The model was compared with well-known benchmarks models (multilayer perceptron, adaptive neuro-fuzzy inference system, support vector regression and random forest). The evaluation metrics showed that the hybrid CNN-BAT algorithm outperformed the other algorithms. Lian et al. (2022a, b) proposed a climate-driven streamflow forecasting framework (CDSF) to improve runoff forecasting accuracy. This framework consisted of principal component analysis (PCA), long short term memory (LSTM), and bayesian optimization (BO). The results revealed that the hybrid model was significantly better than other benchmark models. Hu et al. (2021) suggested a novel hybrid decompose-ensemble model that employed variational mode decomposition (VMD) and back-propagation neural networks (BPNN) in order to improve the accuracy and stability of daily streamflow estimation. The results confirmed that the VMD outperformed a single estimation model. The proposed model also had a better performance in estimating higher-magnitude flows.

While ANN and ANFIS models were frequently studied in the relevant literature, it has been seen the LSTM method was conducted limitedly in river current estimates. In this study, the flow measurement station was located on the Zamanti River of the Seyhan Basin in the South of Turkey. In addition, flow estimation analysis was conducted to support the sustainable water potential of this basin. Thus, the prediction performance of the models was assessed with artificial intelligence methods, with the help of measured daily flow data at the flow measurement station (FMS) throughout the years. Studies on estimation were carried out by means of a hybrid approach similar to the training of the ANFIS model with the GA algorithm. The performance of the proposed hybrid ANFIS-GA approach was compared with the performances of classical ANFIS model. Also, within the scope of simulation studies, the traditional artificial neural networks (ANN) and long-short term memory (LSTM) method which is a quite popular in recent years were used to predict streamflow data. The classical optimizers as 'Adam', 'Adamax', 'Nadam' and 'RMSprop' were applied in the training process of the LSTM method. In this respect, the success of ANFIS-GA, ANFIS, ANN and LSTM methods in the process of determining river flow estimations was combined with comparative analysis. Thus, the relevancy of statistical analysis in forecasting the flow was enhanced. The outcomes indicate that the proposed ANFIS-GA model enhances prediction accuracy more than the others and is viable for predicting river flow.

Some points that show the originality of this study are as follows. The performance of the proposed hybrid ANFIS-GA method has been compared with LSTM, which is a very popular and effective deep learning method in recent years. In other respects, the LSTM method was conducted limitedly in real-time river flow prediction. In addition, a limited number of studies are found for the prediction of river flow in the Seyhan basin in the literature. The remaining article was structured as follows: In Sect. 2, the data set and the components of the applied method were specified. In Sect. 3, the gathered outcomes from the research were presented comparatively and a detailed evaluation was provided. Eventually, the outcomes of the study were summarized in Sect. 4.

2 Materials and Methods

In this study, a hybrid ANFIS-GA model was conducted for the Zamanti River, which was constructed as a result of optimizing the initial and final parameters gathered in the ANFIS network with GA, thus, the river flow rate was estimated.

2.1 Study Area and Dataset

The Seyhan Basin is located in the west of Turkey and extends in a wedge shape from Çukurova to the north. The upper part is located in Central Anatolia, the middle and lower parts are located in the Mediterranean Region. The basin, which contains the catchment areas of the Seyhan River and Göksu and Zamantı branches, is between 36º 33'- 39º 12' North and 34º 24'- 36º 56' east latitude and longitude. The basin is 2.213 km2 wide and covers 2.82% of Turkey. The Seyhan Basin holds a total rainfall area of 20.731 km2, and the average annual rainfall is 624 mm; the average annual flow is 211.07 m3/s. The average annual yield of the basin is 10.18 L/s km3, and the ratio of the flow to precipitation is 0.51. Regarding such factors as the fact that the entire Seyhan Basin is located within the country's borders, instantaneous high flow changes, and the area of sea-connected basin exits, it is essential to study the Seyhan Basin. Furthermore, the catchment potential, sustainable water resources, and accurate river flow forecasts for watershed management are among the essential aspects to be considered (Kilinc 2022).

The research area covers the Zamanti and Körkün FMS located in the upper part of the Seyhan River Basin, as shown in Fig. 1. The Seyhan River, which carries its source from the Taurus Mountains, empties into the sea west of the Çukurova Delta. The size of the river's basin is approximately 21,700 km2. There is the Ceyhan River basin in the east of the Seyhan Basin and the Euphrates River basin in the northeast. It is bounded to the west by the closed basin of Sultan Marshes, to the northwest by the Kızılırmak River basin and to the southwest by the Eastern Mediterranean basin. The lower territories of the Seyhan River basin are under the influence of the Mediterranean climate, while the continental climate is effective in the middle and upper parts. The average annual rainfall is about 700 mm in coastal areas, while in mountainous areas, the value has risen to 1000 mm, and in the north and inland, it falls to 400 mm.

Fig. 1
figure 1

Location of Seyhan River Basin

The Seyhan basin, one of the largest rivers of Turkey pouring into the Mediterranean, is one of the leading basins that will be significantly affected by drought due to climate change. It is predicted that there will be significant decreases up to 30% in surface water resources, snow storage, and groundwater potential in this basin (Acikel and Ekmekci 2021). This situation reveals the importance of the basin in future planning. In addition to climate change, the increasing effect of drought on the basin day by day will significantly affect the economic activities of the basin and the management of water resources. Information on the yields of streams in areas with arid and semi-arid climates is important in terms of utility and irrigation water, determination of irrigation time and reservoir operation. The change in precipitation in the river basin will directly affect the river flows. This situation reveals the importance of analyzing the flow data in different time periods in terms of monitoring the drought of the basin. The fact that the Seyhan Basin has a high sensitivity to climate changes and that precipitation and temperature changes significantly affect both the surface water potential and groundwater recharge of the basin have been effective in considering this region as a study area.

In the study, sample data whose positions are given in Fig. 2 and flow characteristics are shown in Figs. 3 and 4 are indicated. The data were obtained from Zamanti Ergenuşağı (E18A026) and Körkün Hocali FMS (E18A020) of the Seyhan River, which is one of the significant rivers of Turkey. The flow values between October 2000 and September 2011 are used as sample data. The number of river flow measurements belonging to Zamanti and Körkün FMS is 4383 and 3653 per day, respectively. The number of measurements, the continuity of the data among the measurements and the homogeneous distribution of the data were the determining factors in using the flow data of these stations in the study. Within the scope of the study, 80% of the total data set was operated for training and the remaining 20% for testing. Additionally, the statistical parameters of the flow measurement data of Zamanti and Körkun FMSs on the Seyhan River are given in Table 1.

Fig. 2
figure 2

Location of Zamanti and Körkün (FMS) on Seyhan River

Fig. 3
figure 3

The approximate 20-year river flow measurement of Zamanti-FMS

Fig. 4
figure 4

The approximate 20-year river flow measurement of Körkün FMS

Table 1 The statistical characteristics of river flow data

2.2 Long Short Term Memory Network (LSTM)

LSTM networks are primarily an effective method utilized to clarify and predict future information by merging past information with current information. As shown in Fig. 5, LSTM blocks have three gates that perform the cell's write, read and reset operations. All cells are managed by these three gates (Tasabat and Aydin 2021). The entrance gate controls the input information to the open cell, determines how much information is transferred to new data using the past gate, and how much information is used when calculating the output using the exit gate.

$$f_{t} = \sigma \left( {W_{f,x} *X_{t} + W_{f,h} *h_{t - 1} + b_{f} } \right)$$
(1)
$$i_{t} = \sigma \left( {W_{i,x} *X_{t} + W_{i,h} *h_{t - 1} + b_{i} } \right)$$
(2)
$$\tilde{C}_{t} = \tanh\,\left( {W_{c,x} *X_{t} + W_{c,h} *h_{t - 1} + b_{c} } \right)$$
(3)
$$C_{t} = C_{t - 1} *f_{t} + i_{t} *\tilde{C}_{t}$$
(4)
$$o_{t} = \sigma \left( {W_{o,x} *X_{t} + W_{o,h} *h_{t - 1} + b_{o} } \right)$$
(5)
$$h_{t} = o_{t} *\tanh \left( {C_{t} } \right)$$
(6)
Fig. 5
figure 5

The structure of the LSTM model cell

In a situation where \(\sigma\) represents a logistic sigmoid function, \(W\) are the input weights of different gates. LSTM has occurred from the entrance gate \(\left( {i_{t} } \right)\), the entrance modulation gate \(\left( {\tilde{C}_{t} } \right)\), forget gate \(\left( {f_{t} } \right)\) and exit gate \((o_{t} )\). Also, \((b)\) bias vector, \(\left( {C_{t} } \right)\) current cell state and \(\left( {h_{t} } \right)\) hidden state have been stated in this network. All these controllers determine the amount of information gathered from the previous cycle and the information transferred to the new one.

2.3 Adaptive Neuro-Fuzzy Inference System (ANFIS)

ANFIS is an artificial system developed by Jang and inspired by the Takagi–Sugeno fuzzy model (Haznedar et al. 2021). It is capable of back-propagation learning of artificial neural networks with the ability to draw conclusions from fuzzy systems. Both methods are combined, and thus a hybrid artificial intelligence model called neuro-fuzzy network is created. ANFIS calculates the output value with a certain systematic, which makes it possible by distributing the input data performed with membership functions over the neural network with IF–THEN rules. This process provides inference ability and is less dependent on experience. As a result, the success of the ANFIS model is quite high in forecasting problems. ANFIS is mainly composed of five layers. Figure 6 shows a basic ANFIS structure consisting of two inputs and one output (Haznedar et al. 2017).

  • 1st Layer

    This layer is called the fuzzification layer. Nodes in this layer are adaptive so that they can be optimized. At this stage, the output signal for each node is generated, which relies on the input values and the type of conducted membership function. The outputs of the nodes in this layer \(\left( {O{}_{1i}} \right)\) are indicated in Eqs. (7) and (8) (Karaboga and Kaya 2019).

    $${O}_{1i}=\mu {A}_{i}\left(x\right)\qquad\qquad i=\mathrm{1,2}$$
    (7)
    $$O_{1i} = \mu B_{i - 2} (x)\qquad\qquad i = 3,4$$
    (8)

    \(A_{i}\) and \(B_{i}\) represent any membership function and \(\mu A_{i}\) and \(\mu B_{i}\) illustrate the membership degree. In a situation where the bell curve is operated as a membership function, \(\mu A_{i}\) is calculated with Eq. (9). Here, \(a_{i}\) is the sigma of the membership function, \(b_{i}\) is the slope of the membership function and \(c_{i}\) is the center of the membership function.

    $$\mu {A}_{i}=\frac{1}{1+{\left[{\left(\frac{x-{c}_{i}}{{a}_{i}}\right)}^{2}\right]}^{{b}_{i}}}\qquad\qquad i=\mathrm{1,2}$$
    (9)
  • 2nd Layer

    The second layer is known as the rule layer, and each node in the layer specifies the rules of the fuzzy system and the number of rules. As a result, the firing strength \(\left( {w_{i} } \right)\) of each rule is calculated by multiplying the membership degrees coming from the fuzzification layer.

    $${O}_{2i}={w}_{i}=\mu {A}_{i}\left(x\right).\mu {B}_{i}\left(y\right)\qquad\qquad i=\mathrm{1,2}$$
    (10)
  • 3rd Layer

    The third layer, in which the firing strength of each rule is normalized, is called the normalization layer.

    $${O}_{3i}=\overline{{w }_{i}}=\frac{{w}_{i}}{{w}_{1}+{w}_{2}}\qquad\qquad i=\mathrm{1,2}$$
    (11)
  • 4th Layer

    This layer is the defuzzification layer. The weighted output value given in Eq. (12) is calculated as a result of multiplying the normalized firing strength value of each rule coming from the normalization layer with the first-order polynomial.

    $${O}_{4i}=\overline{{w }_{i}}.{f}_{i}=\overline{{w }_{i}}.\left({p}_{i}x+{q}_{i}y+{r}_{i}\right) i=\mathrm{1,2}$$
    (12)
  • 5th Layer

    The fifth layer, known as the output layer, collects the weighted output value of each node from the fourth layer, as stated in Eq. (13). Thus, the actual output value of the ANFIS system is found.

    $${O}_{5i}=f=\sum \overline{{w }_{i}}.{f}_{i}=\frac{\sum \overline{{w }_{i}}.{f}_{i}}{\sum {w}_{i}}\qquad\qquad i=\mathrm{1,2}$$
    (13)

    ANFIS has two fuzzy logic system parameters, the premise-consequent parameters, which merge the fuzzy rules. One of the main objectives of ANFIS is to optimize these parameters with a learning algorithm. To optimize the parameters, it is provided by using the input–output data set. Additionally, parameter optimization is performed in order to minimize the error value between the actual output and the target output. In this context, the traditional derivative-based back propagation (BP) algorithm is commonly used to optimize the premise-consequent parameters of the classical ANFIS model developed by Jang.

Fig. 6
figure 6

A basic structure of ANFIS that has two inputs and one output

2.4 Genetic Algorithm (GA)

GA was developed in the 1970s by John Holland, whose basic principles were laid down and inspired by the mechanisms of evolutionary development of living things (Holland 1975). Furthermore, GA is a heuristic algorithm designed to locate the global optimum in the search space. Due to the fact that it performs with a set of solutions, it holds the ability to access the region where the global optimum is located rapidly. For this reason, the success of obtaining optimal results is high even in multidimensional problems with a large search space and a large number of variables. Moreover, these results appear within acceptable periods. Operators such as crossover, mutation, inheritance and selection in the flow diagram are used to produce the most appropriate new solutions for the cost function of the problem. The flow diagram is demonstrated in Fig. 7.

  • Inıtial Populations

    The initial population is created using randomly selected candidate solutions. Although there are not any specific criteria for the number of selected candidate solutions, it is considered that selection of large number of individuals in the initial phase did not influence the goodness of the solution and solution time.

  • Evaluation of Fitness Function

    The fitness function value represents the solution quality of each individual and determines the individuals which are transferred to the next generation. It is obtained by subjecting each chromosome into fitness function.

  • Selection

    Roulette Wheel and Tournament Methods are often used for the selection criteria. In the Roulette Wheel, parent chromosomes are selected from chromosomes having high fitness values rather than the ones having relatively low fitness values. On the other hand, in the tournament, the best chromosomes are selected as the parent chromosomes among randomly selected ‘n’ chromosomes from the population (Haznedar et al. 2017).

  • Crossover

    Crossover is usually carried out by genes from two parent chromosomes. From the first gene to predetermine the number of a gene is taken from the first parent chromosome while the rest of them are taken from the second parent chromosome. Crossover operator provides better solutions by combining the good features of individuals.

  • Mutation

    The mutation process produces new solution candidates by increasing the diversity of chromosomes in the population. In this method, more than one gene is changed to generate a different individual. The mutation rate is usually between 0.5% to 5%. The role of the mutation operator is that it prevents the algorithm from sticking to a local optimum since the mutation operator randomly selects a gene on a chromosome.

Fig. 7
figure 7

The flow chart of GA

As mentioned earlier, GA can be described as a population-based and artificial intelligence-based heuristic optimization algorithm. The term for candidate solutions forming the algorithm's population is chromosomes. Chromosomes are transformed into solutions that represent better results through various evolutionary processes. This process persists until stopping criteria are met. For example, it continues until an acceptable fitness value is reached or criteria such as a predetermined processing time or several generations are completed. The fundamental steps of the Genetic Algorithm are given below.

  1. 1.

    Generate the initial population.

  2. 2.

    Calculate the fitness value for each solution in the population.

  3. 3.

    Repeat the following steps until a new population is brought into existence.

    1. (a)

      Select two parental chromosomes from the population, depending on the fitness values.

    2. (b)

      Apply crossover to n parents depending on the probability of a crossover to generate a new product, which is also known as a child.

    3. (c)

      Mutate the new product at the rate of the probability of mutation.

  4. 4.

    Use the newly produced population in the algorithm's next run.

  5. 5.

    If the stop criterion is provided, STOP.

  6. 6.

    GO to Step 2.

2.5 Estimation of Streamflow Data Based on ANFIS-GA (The Proposed Method)

As mentioned previously, ANFIS is a hybrid AI method that is generated by combining the neural networks’ learning ability with the inference of fuzzy logic (Simon 2002). ANFIS is trained by utilizing the current input and output data set to solve prevailing problems. This section clarifies the components of the offered innovative approach for training the ANFIS model with the GA algorithm. ANFIS has two fuzzy logic system parameters, the premise-consequent parameters, which need to be optimized. The premise parameters pertain to the membership functions provided in Eq. (9). They are operated to place input values in the fuzzification layer. The consequent parameters are the p, q, and r parameters given in Eq. (12) belonging to the linear functions in the defuzzification layer. Generally, traditional derivative-based algorithms are operated to optimize the parameters of ANFIS. Nevertheless, for reasons such as slow convergence of derivative-based algorithms, local minimization, largely depending on the initial values, and so on, optimizing ANFIS parameters has turned into one of the primary concerns. Population-based algorithms, performing with a set of solutions, often allow them to access the region rapidly where the global optimum is located. ANFIS is a challenging model for optimizing parameters. Hence, a population-based GA algorithm, which is a mighty algorithm, was employed to optimize the parameters and eliminate the drawbacks. The RMSE function provided in Eq. (14) is applied to calculate the fitness value of the solutions. GA tries to find the best premise and consequent parameters with the chromosome with acceptable fitness value. Considering the training process of ANFIS, the block diagram of the proposed method is given in Fig. 8.

$$RMSE = \sqrt {\frac{{\sum\nolimits_{i = 1}^{N} {\left( {F\left( i \right) - F_{d} \left( i \right)} \right)}^{2} }}{N}}$$
(14)
Fig. 8
figure 8

Schematic representation of the proposed model

The F and Fd in Eq. (14) demonstrate the value estimated through ANFIS and the actual output value of the data. In addition, N represents the total number of samples in the training dataset. Optimizing the premise and consequent parameters minimizes the RMSE function presented in Eq. (14). The most acceptable RMSE value is obtained when the actual and the predicted value are closest to each other. GA seeks to discover the most appropriate RMSE value until the stop criterion is formed. In this study, the number of iterations is determined as the stopping criterion.

3 Comparative Results

In this study, solutions to the hydrological time series issues of Zamanti and Körkün FMS, as presented in Figs. 3 and 4, were searched by optimizing the parameters of ANFIS structures using GA. While forecasting, data sets were organized by designing 5-offset scenarios for the time delay. For the data set of the 5 time-shifted scenarios, the daily flow values starting from 5 days before and up to the day before being used as input in the 6th day forecasting. Numerous attempts were made to determine the GA algorithm's initial parameter values. As a consequence of the attempts, it was uncovered that the number of iterations was 1000, the population size was 50, the crossover rate was 0.8, and the mutation rate was 0.01. The performance of the proposed hybrid ANFIS-GA approach was compared with the performances of the classical ANFIS model. Thus, the classical ANFIS network is additionally trained with the traditional BP algorithm to estimate the river flow data. To serve and achieve this purpose, the number of iterations was determined as 1000, the learning coefficient was 0.2, and the momentum ratio was 0.4.

Moreover, the performance of the presented ANFIS-GA approach was compared with the performance of deep learning methods owing to the fact that deep learning has become relatively widespread and has been involved in numerous various areas. In this respect, the Long Short Term Memory (LSTM) network, one of the deep learning models, was involved in the aforementioned time series issues. The LSTM network consists of 1 input layer, 4 hidden layers and 1 output layer. The number of iterations is established as 1024 for the 5-offset scenario. 'Adam', 'Adamax', 'Nadam' and 'RMSprop' were applied as optimization functions and the loss function was determined as 'MAE'. In order to show the performance of traditional methods for the current study, the ANN method was also applied in the estimation studies.

The performance measures of the methods were as indicated; average square root error (RMSE), mean absolute error (MAE), average absolute percentage error (MAPE), determination coefficient (R2), and standard deviation (SD). The evaluation metrics is among the measurement tools widely operated in multiple studies to predict daily flow values and determine effectiveness. (Abyaneh et al. 2011; Arslan and Sekertekin 2019). Performance measurements of ANFIS-GA, ANFIS, ANN, and LSTM methods are shown in Tables 2 and 3. It was detected that the performance of the hybrid ANFIS-GA model was relatively prospering compared to the ANFIS, ANN and LSTM network. Furthermore, the methods applied in the test data obtained from Zamanti and Körkün FMSs and the scatter charts are given in Figs. 9, 10, 11, 12, 13, and 14, respectively. In the graphs demonstrated, the 'ADAM' optimizer function provided the LSTM network's most satisfactory results compared with the obtained results. The scatter graph indicates that the results of the hybrid ANFIS-GA method are closer to the actual river current values. In addition, it has been revealed to hold the highest success prediction with an R2 value of 0.9409 for the Zamanti River and an R2 value of 0.9263 for the Körkün River.

Table 2 Performance Measurements for Zamanti FMS
Table 3 Performance Measurements for Körkün FMS
Fig. 9
figure 9

Comparative plots of the observed and predicted flow of the ANFIS-GA model during the testing phase for Zamanti FMS

Fig. 10
figure 10

Comparative plots of the observed and predicted flow of the classical ANFIS model during the testing phase for Zamanti FMS

Fig. 11
figure 11

Comparative plots of the observed and predicted flow of the LSTM-ADAM model during the testing phase for Zamanti FMS

Fig. 12
figure 12

Comparative plots of the observed and predicted flow of the ANFIS-GA model during the testing phase for Körkün FMS

Fig. 13
figure 13

Comparative plots of the observed and predicted flow of the classical ANFIS model during the testing phase for Körkün FMS

Fig. 14
figure 14

Comparative plots of the observed and predicted flow of the LSTM-ADAM model during the testing phase for Körkün FMS

Among the results obtained, it is noticed that the hybrid ANFIS-GA method offers quite a success in river flow estimation compared to LSTM, ANN and ANFIS methods. The ANFIS-GA method exhibited the highest R2 (≈0.9409) values with minimum RMSE (≈6.6517 m3/s), MAE (≈2.7875 m3/s), MAPE (≈4.3105 m3/s) and SD (≈0.1056) for the FMS test set over time. In addition, it was the most successful model by achieving the minimum RMSE (≈2.6433 m3/s), MAE (≈1.0308 m3/s), MAPE (≈6.6980 m3/s), SD (≈0.2764), and the highest R2 (≈0.9263) values for the Körkün FMS test set with different characters in the same basin.

The success of the presented method is the result of the evolutionary approach that the elitism conservation strategy of the GA algorithm can evaluate and select the optimal solution from 1000 solutions in each generation. Furthermore, by overcoming the minimum disruption problem of the population-based GA and derivative-based BP algorithm, optimal solutions were achieved in the global region, and thus the ANFIS model was optimized more successfully. The results of various optimization functions of the LSTM network are also displayed in Table 1 even though the findings indicated the LSTM network revealed a superior R2 value than the ANFIS model, it was encountered that it lagged behind the ANFIS models in the RMSE, MAE and MAPE measurements. The LSTM network is a fairly popular and frequently used method in recent years. Nowadays, the LSTM network is quite prevalent and a frequently conducted method. Nevertheless, it is not able to perform more pleasingly than the traditional ANFIS method, and it lags behind the hybrid ANFIS-GA model, including the coefficient of determination (R2) in the study. In addition, as depicted in Tables 1 and 2, traditional ANN method performance for the current study lagged behind all other methods with R2 (≈0.7684) and R2 (≈0.8514), respectively. However, results in terms of other measurements are similar, which indicates the fact that the generalization ability of the traditional ANN method is poor.

All evaluation criteria confirm that the hybrid ANFIS-GA model is quite successful. The performance of the LSTM model remains inadequate compared to other models. In addition, the correlation of river currents carries LSTM networks one step further. One of the most significant aspects causing the matter is the volatilities noticed in the flow rates between the two stations. The distinctions in the instantaneous variability of the currents in the Zamanti River are instances of the matter. Following the trend line at Körkün station, the current values make the prediction capacity higher than anticipated. Even though LSTM is approaching ANFIS at this station, the success of the hybrid ANFIS-GA model remains at the forefront.

In addition, all evaluation criteria for the two stations have indicated that the hybrid ANFIS-GA model has achieved considerable progress in percentage terms. The values with the highest R2 and lowest standard deviation are indicated in the Zamanti station in the general evaluation. Zamanti Ergenuşağı station is located on the transition level of the Ceyhan water collection basin includes an association with proximity to the upstream point. On the other hand, the Körkün station is placed on the same line as the final hydrological station on the Seyhan River and manages the river's flow to the sea. Elements such as the capacities of various precipitation areas and the climate characteristics of the two stations differ when compared to each other. These elements impact the correlation between observed and estimated values. In the light of all the details, it has been recognized that it is essential to transform the results into LSTM parameters and optimize the learning rate and the number of hidden neurons, which are its two primary parameters.

4 Discussions

The training of the ANFIS parameters is one of the main problem. The GA algorithm was used to overcome this challenge in this study. It is seen that the proposed hybrid ANFIS-GA has high accuracy in forecasting streamflow. In addition, the proposed ANFIS-GA method's results compared with the results of many existing studies in the literature using the ANFIS based hybrid models to predict different time series problems. These comparisons showed that artificial intelligence algorithms successfully optimize ANFIS parameters.

Adnan et al. (2022) developed a novel hybrid model where gradient-based optimization (GBO) algorithm was employed to adjust adaptive neuro-fuzzy systems (ANFIS) hyperparameters. Several benchmark methods for optimizing ANFIS parameters were compared, which included particle swarm optimization (PSO), grey wolf optimization (GWO), and colony optimization (ACO). The results contained that all ANFIS hybridized model successfully enhanced the prediction accuracy. Kayhomayoon et al. (2022) built a hybrid ANFIS model based on several metaheuristic algorithms such as genetic algorithm (GA), particle swarm optimization (PSO), and colony optimization for continuous domains (ACOR). All hybridized models compared with benchmark models. The results revealed that the best prediction was achieved by ANFIS-ACOR model. Also, the findings of the study displayed that metaheuristic algorithms could significantly improve the performance of the ANFIS model. Fadaee et al. (2020) investigated the capability of butterfly optimization algorithm (BOA) and genetic algorithm (GA) integrated with ANFIS, ANN and multiple linear regression (MLR) models. The results indicated that the performances of metaheuristic algorithms increased the accuracy of the ANFIS model. Karami et al. (2022) combined the ANFIS model with GA, PSO, ACO, and differential evolution (DE) algorithms. The comparison results showed that hybrid ANFIS-ACO extended to continous domain algorithm with the best prediction. Rathnayake et al. (2022) assessed the capacity of the novel Cascaded-ANFIS algorithm and compared this benchmark model with the gated recurrent unit (GRU), LSTM and recurrent neural network (RNN) models. The results clearly revealed that Cascaded-ANFIS outperformed better than single models.

The proposed hybrid ANFIS-GA method successfully predicted the daily flow rate in flow measurement station data of two different hydrological conditions. In addition, the study demonstrated the success of the hybrid model in predicting the optimal level of streamflow when compared with the hybrid models and linear regression models in the literature.

5 Conclusions

In conclusion, the performance of ANFIS-GA, ANFIS, ANN and LSTM models for Seyhan river flow forecasting was examined. Therefore, the current values which were obtained from Zamanti Ergenuşağı and Körkün current stations between the dates of October 2000 and September 2011 were operated as sample data. In the study, 3502 and 2918 data, which were 80% of the total data set, were used for training, and 876 and 730 data, the remaining 20%, were used for testing. Derivative-based algorithms, such as BP, are widely used in training parameters of the ANFIS model. However, there are difficulties such as slope calculation in derivative-based algorithms, as well as causing problems such as local minimum. The aforementioned situations also reveal fundamental problems in training and updating ANFIS with derivative-based algorithms. In this context, the most appropriate premise and consequent parameters for the river current velocity estimation of the ANFIS network were determined using the population-based GA algorithm, which is a powerful algorithm to overcome the aforementioned problems. The hybrid ANFIS-GA model's performance was compared with the classic ANFIS model, the traditional ANN and deep learning LSTM network models that have been pretty prevalent recently. RMSE, MAE, MAPE, STD and R2 different statistical indicators were conducted to compare the models. The results of the performance indicators in Tables 2 and 3 presented the suggested hybrid ANFIS-GA model can deliver more accurate and reliable estimates than other methods. The results demonstrate that the ANFIS-GA model may suit current river estimation.

The ANFIS-GA model turned out to provide promising results in river flow forecasts. However, there are some limitations in the study: only streaming data was conducted as input. However, while flow time series are stochastic, they are not linear in nature. In addition, the data rely on various meteorological parameters such as evaporation, humidity, snowmelt and temperature. Consequently, the presented study can be extended in the future by gathering more input data. Decomposition techniques can be incorporated into the model to taken the flow data's motion and nonlinearity. In the following studies, offered model's performance can be considered at shorter time intervals such as hourly, 30 min, 10 min for the predictions. Hydrological variables such as evaporation, precipitation, and sediment can be involved in hydrology to examine and develop the application of the suggested model since the hybrid model's contribution to the study is promising.

Overall, the study of population-based algorithms like GA with a set of candidate solutions usually allows them to quickly access the region of the global optimum. However, negative aspects have arisen due to many parameters and the network's training formed from a random point. For population-based algorithms, it usually carries a long computational time to find the optimal solution in the region where they are located, which is due to high probability-based search strategies. This case is the most obvious limitation of the proposed ANFIS-GA method. In this respect, forthcoming studies are arranged to be carried out to train the ANFIS network by utilizing different meta-heuristic search techniques while examining its implementation on related problems. In contrast, new regions with various flow aspects can be analysed with the help of the ANFIS-GA model. Hence, data with various aspects are monitored, and forthcoming research paths can be chosen. In addition, the GA algorithm used in the proposed hybrid ANFIS-GA method can be involved in the training process of the LSTM network and so, new studies can be carried out to increase the prediction performance of the studied basin.