Introduction

A side orifice refers to a type of outlet located on the side of a channel that is commonly used to redirect water flow from the main channel. It is a useful mechanism for managing water levels in canals, controlling floods, and diverting water from irrigation or drainage systems. Side orifices are also employed for various purposes such as chemical distribution in factories and redirecting water flow from dam reservoirs. Side weirs and side orifices are characterized by variable flow with a declining discharge, and they have been widely researched due to their practical applications in environmental engineering and water management (Hussain et al. 2010; Hussain and Haroon 2019; Zaji and Bonakdari 2014). In addition to side orifices, there are several other hydraulic structures used for redirecting water flow. These include sluice gates, weirs, intake structures, spillways, and others. These structures have been studied extensively in the field of hydraulic engineering and have various applications in water management, irrigation, and flood control. Some of the research studies that have investigated these hydraulic structures include Hashid et al. (2015), Hussain et al. (2010, 2011, 2014, 2016), Ramamurthy et al. (1986, 1987), and Shariq et al. (2018). Wastewater treatment plants commonly utilize hydraulic structures such as weirs, sluice gates, and intake structures to distribute incoming flow to parallel processing units like flocculation tanks, sedimentation tanks, and aeration basins. Understanding the science of accurately directing discharge from a channel is crucial in wastewater treatment plants for optimal processing of wastewater (Hussain and Haroon 2019).

Side weirs have been extensively studied due to their wide range of applications in environmental and hydraulic engineering. The first analysis of flow over a side weir was conducted by De Marchi (1934), and since then, numerous researchers have conducted experimental, theoretical, and numerical studies on the hydraulics and flow characteristics of side weirs of various shapes, including rectangular, triangular, trapezoidal, elliptical, and circular (Emiroglu et al. 2011; Hager 1987; Mohammed et al. 2014; Mohammed and Golijanek-Jędrzejczyk 2020; Ranga Raju et al. 1979; Shariq et al. 2018; Vatankhah 2012; Vatankhah and Rafeifar 2020). Ramamurthy et al. (1986, 1987) developed an analytical method for calculating the flow rate of side orifices and, using laboratory measurements for validation, demonstrated that the analytical model accurately predicted the discharge data. A survey of the literature on flow through sharp-crested side weirs reveals that the coefficient of discharge depends mostly on the approach flow Froude number, flow depth in the main channel, weir crest height, weir length, top flow width, and channel slope (Borghei et al. 1999).

Hussain et al. (2010) conducted experiments to determine the coefficient of discharge for a circular side orifice under various geometric and hydraulic conditions. Based on their experimental results, they developed multiple regression formulas for estimating the discharge coefficient from the width of the channel, the diameter of the side orifice, and the Froude number. Hussain et al. (2011) investigated the flow pattern at a square side orifice and conducted tests to establish the discharge coefficient. They concluded that the coefficient of discharge for a square side orifice depends on the Froude number, the Reynolds number, and the geometric parameters of the orifice. They also developed a relationship for the coefficient of discharge that could be applied to both circular and square side orifices. Hussain et al. (2016) analysed and experimentally tested the hydraulic behaviour of circular side orifices under submerged and non-submerged conditions. From their experimental results, they derived separate formulas for calculating the discharge coefficient and reported errors of 10% and 5% for the submerged and free conditions, respectively. Moreover, Vatankhah and Mirnia (2018) measured discharge coefficients for triangular side orifices under a variety of geometric and hydraulic conditions. They identified the parameters influencing the discharge coefficient and proposed formulae for estimating it. The coefficient of discharge for a sharp-crested side orifice under free flow conditions was estimated using three linear data-driven models: locally weighted linear regression (LWLR), multiple linear regression with interaction (MLRI), and multivariate linear regression (MLR) (Jamei et al. 2021).

Recently, Regularized Extreme Learning Machine (RELM) models have been proposed to predict the discharge coefficient of triangular side orifices, and their performance has been compared with that of the Extreme Learning Machine (ELM) model. The findings suggest that the RELM model (R = 0.995, RMSE = 0.003) performed better than the ELM model (R = 0.982, RMSE = 0.010) (Mahmoudian et al. 2022; Gerami Moghadam et al. 2022). Shen et al. (2022) also used artificial intelligence to predict the discharge coefficient (Cd) of triangular side orifices. They trained a backpropagation neural network (BPNN) and optimized its weights and thresholds using a sparrow search algorithm (SSA). Their results showed that the SSA-BPNN model had high accuracy and strong generalization ability, with a maximum error of 6.56% and an average error of 1.73%. Mahmodian et al. (2019) presented a self-adaptive extreme learning machine (SAELM) to model the discharge coefficient of rectangular and circular side orifices in open channels. The models were evaluated using Monte Carlo simulations and k-fold cross validation. Including the shape coefficient parameter was found to increase the accuracy of the models. The most effective parameter was identified as the ratio of the flow height above the weir to the weir crest height, with R = 0.995 and RMSE = 0.004. Evidently, artificial intelligence (AI) models and learning machines have been successfully used for modelling and forecasting the discharge coefficient of diversion structures. These models are precise, dependable tools for simulating the discharge capacity and estimating the degree of influence of various variables on the discharge coefficient. Moreover, AI techniques can save researchers a significant amount of time, money, and effort.

Previous researchers have proposed models for estimating the discharge coefficient of weirs using Gene Expression Programming (GEP) (Ebtehaj et al. 2015a; Azimi et al. 2017a; Hussain et al. 2021). Hybrid neuro-fuzzy models have also been employed to predict the discharge coefficient of weirs and side orifices (Khoshbin et al. 2016; Azimi et al. 2017b). Ebtehaj et al. (2015b) used the Group Method of Data Handling (GMDH) to predict the discharge coefficient of square side orifices, while Akhbari et al. (2017) determined the discharge coefficient of triangular weirs using radial basis function neural networks. In addition, Azimi et al. (2017c) utilized the Extreme Learning Machine (ELM) to identify variables that influence the discharge coefficient of weirs positioned in trapezoidal canals. Support vector regression (SVR) was also used by Azimi et al. (2019) to simulate the discharge coefficient of rectangular side weirs. Furthermore, Bagherifar et al. (2020) used computational fluid dynamics (CFD) to model the flow field in a circular flume along a rectangular side weir. They found that the predicted energy loss along the side weir was minimal and that the specific energy along the side weir was nearly constant, with an average difference between upstream and downstream specific energy of 2.1%.

Gerami Moghadam et al. (2022) developed a new algorithm called the generalized structure group method of data handling (GSGMDH) to simulate the coefficient of discharge (Cd) of triangular lateral orifices. They identified that the upstream Froude number and orifice height ratio were the most influential factors in the model. The study concluded that the GSGMDH algorithm performed better than the classical group method of data handling (GMDH) in predicting the Cd.

Extensive research on the prediction of discharge through side orifices has been conducted using various artificial intelligence techniques. However, a comparative study of different ANN training algorithms for predicting the discharge coefficient of a side orifice is lacking. In the present study, four training algorithms, namely Gradient Descent (ANN-GD), Gradient Descent with Momentum (ANN-GDM), Gradient Descent with Adaptive Learning rate (ANN-GDA), and Levenberg–Marquardt (ANN-LM), are compared for estimating the discharge of a circular side orifice.

Experimental setup and data collection

Hussain et al. (2010) conducted experiments in a rectangular main channel 9.15 m long, 50 cm wide, and 50 cm deep. A sluice gate was installed at the end of the main channel to regulate the flow depth. Two 20 cm diameter supply pipes delivered water to the main channel. A circular orifice was provided on the left side of the channel, 5.18 m from the upstream end of the main channel. The discharge from the orifice was conveyed into a 3.80 m long, 26 cm wide, and 41 cm deep diversion channel, followed by a return channel. Experiments were performed with orifice diameters (D) of 5, 10, and 15 cm and orifice crest heights (W) of 5, 10, 15, and 20 cm. Three to four main channel discharges were observed for each combination of D and W. By adjusting the sluice gate, different flow depths were maintained in the main channel for each discharge value. For each run, the water level in the main channel near the orifice and across the crests was measured using a digital point gauge with a precision of 0.01 mm.

A total of 214 experimental data sets were acquired from Hussain et al. (2010). The schematic diagram of the experimental setup and further description of the design of the experiment can be found in Hussain et al. (2010). Table 1 contains the descriptive statistics of the variables employed in the present investigation. Of the available data sets, a randomly selected 70% were used for training, 15% for testing, and the remaining 15% for validation of the ANN models.
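The random 70/15/15 partition described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_indices`, the fixed seed, and the rounding of the subset sizes (which gives 150/32/32 for 214 samples) are assumptions, since the exact subset sizes and random generator are not stated here.

```python
import random


def split_indices(n_samples, train_frac=0.70, test_frac=0.15, seed=0):
    """Randomly partition sample indices into training, testing and validation subsets.

    The validation subset receives whatever remains after the training and
    testing subsets are drawn (about 15% here). Rounding and seed are assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = round(train_frac * n_samples)
    n_test = round(test_frac * n_samples)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]
    return train, test, validation


# 214 data sets, as reported for the experiments used in this study
train_idx, test_idx, val_idx = split_indices(214)
```

Each index appears in exactly one subset, so no experimental run is shared between training, testing, and validation.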

Table 1 Basic statistics of the data

Modeling techniques

The modelling techniques and performance assessment parameters adopted in the present study are discussed herein.

Artificial neural networks

Artificial neural networks (ANNs) are a popular machine learning method inspired by the way the human nervous system processes information. ANNs are used to simulate complex non-linear processes and to map inputs to outputs, including in classification problems. ANNs consist of artificial neurons as the primary processing units and typically have an input layer, one or more hidden layers, and an output layer. The number of neurons in the hidden layers is determined by the complexity of the problem being studied, and neurons in adjacent layers are connected to one another through weighted connections. Training an ANN involves adjusting these connection weights using input-output data sets, typically with the back-propagation algorithm. The training process consists of two phases: feed-forward and back-propagation. During the feed-forward phase, the input layer receives data and transmits it to the output layer via the hidden layers, with each neuron receiving signals from all neurons in the preceding layer and forming a weighted sum of these incoming inputs. This can be represented mathematically as Eq. (1) (Zurada 1994; Schalkoff 1997).

$$Ne{t_j}=\sum\limits_{{i=1}}^{{{n_1}}} {{w_{ij}}{x_i}+b}$$
(1)

where, Netj = Input at neuron j; wij = associated weight between the connection of ith and jth neuron; n1 = number of neurons in the layer; xi = input at the ith neuron and b = bias weight. The input received (Netj) at jth neuron is then transformed using a non-linear sigmoid activation function to get the output yj for each neuron, as shown in Eq. (2).

$${y_j}=f(Ne{t_j})=\frac{1}{{1+\exp ( - \alpha Ne{t_j})}}$$
(2)

where, f is a sigmoid activation function; yj = output at jth neuron and α is a slope parameter. In the back-propagation phase of training an artificial neural network (ANN), the weights between the connections wij are initialized, and the total error function for the training data set at the output layer is calculated. After that, in the conventional back-propagation step, an optimization technique is used to minimize the total error function, and this is usually done using the Gradient Descent (GD) algorithm. However, there are alternative optimization techniques that can be used, such as the Gradient Descent with Momentum (GDM), Gradient Descent with Adaptive Learning (GDA), and Levenberg–Marquardt (LM) algorithms. These techniques can help to improve the speed and accuracy of the optimization process and can be particularly useful in situations where the GD algorithm may struggle to find an optimal solution or may converge too slowly.
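The feed-forward computation in Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The function names and the numerical weights below are hypothetical and chosen only for illustration; they are not the trained weights of the models in this study.

```python
import math


def sigmoid(net, alpha=1.0):
    """Eq. (2): sigmoid activation with slope parameter alpha."""
    return 1.0 / (1.0 + math.exp(-alpha * net))


def layer_forward(inputs, weights, biases, alpha=1.0):
    """Apply Eqs. (1) and (2) to every neuron j of one layer.

    weights[j][i] is the weight w_ij between input i and neuron j,
    and biases[j] is the bias b of neuron j.
    """
    outputs = []
    for w_j, b_j in zip(weights, biases):
        net_j = sum(w * x for w, x in zip(w_j, inputs)) + b_j  # Eq. (1)
        outputs.append(sigmoid(net_j, alpha))                  # Eq. (2)
    return outputs


# Hypothetical 2-input, 2-hidden-neuron, 1-output network
x = [1.0, 2.0]
hidden = layer_forward(x, weights=[[0.5, -0.25], [0.1, 0.3]], biases=[0.0, -0.2])
output = layer_forward(hidden, weights=[[0.7, -0.4]], biases=[0.1])
```

Chaining `layer_forward` calls in this way reproduces the layer-by-layer signal flow from the input layer to the output layer.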

By incorporating a momentum term, the Gradient Descent with Momentum (GDM) algorithm can learn faster and is less likely to become trapped in shallow local minima. Both regular Gradient Descent (GD) and GDM use a fixed learning rate throughout the training process. However, since GD and GDM are highly sensitive to the choice of learning rate, their performance can be enhanced by allowing the learning rate to change during training. The Gradient Descent with Adaptive Learning rate (GDA) algorithm adjusts the learning step size during training to keep it as large as possible while maintaining stable learning. The Levenberg–Marquardt (LM) algorithm improves upon these methods by approximating second-order (curvature) information from the first-order derivatives of the network errors. However, the performance of all these algorithms depends on the weight initialization and is problem-specific, so it is necessary to test them for each specific situation (Hagan and Menhaj 1994; Marques de Sa et al. 2007).
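The difference between the three gradient-descent variants can be seen on a one-weight toy problem. The sketch below is an assumption-laden illustration rather than the training code used in this study: the quadratic error function is invented for the example, the momentum update uses the classical velocity form, and the adaptive rule is a simplified accept/reject scheme (toolbox implementations usually tolerate a small error increase before rejecting a step).

```python
def error(w):
    """Toy error function E(w) = (w - 3)^2 with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """dE/dw for the toy error function."""
    return 2.0 * (w - 3.0)


def train_gd(w, lr=0.01, epochs=1000):
    """Plain gradient descent with a fixed learning rate."""
    for _ in range(epochs):
        w -= lr * gradient(w)
    return w


def train_gdm(w, lr=0.01, momentum=0.9, epochs=1000):
    """Gradient descent with momentum: part of the previous step is reused."""
    velocity = 0.0
    for _ in range(epochs):
        velocity = momentum * velocity - lr * gradient(w)
        w += velocity
    return w


def train_gda(w, lr=0.01, lr_inc=1.05, lr_dec=0.7, epochs=1000):
    """Gradient descent with an adaptive learning rate (simplified rule).

    A step that lowers the error is accepted and the learning rate grows;
    otherwise the step is discarded and the learning rate shrinks.
    """
    err = error(w)
    for _ in range(epochs):
        w_trial = w - lr * gradient(w)
        err_trial = error(w_trial)
        if err_trial < err:
            w, err = w_trial, err_trial
            lr *= lr_inc
        else:
            lr *= lr_dec
    return w


w_gd, w_gdm, w_gda = train_gd(0.0), train_gdm(0.0), train_gda(0.0)
```

All three variants approach the minimum at w = 3 from the same starting weight; they differ only in how the step size is chosen at each epoch.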

Statistical metrics

To evaluate the performance of the developed ANN models, five statistical measures were used in this study: root mean square error (RMSE), average absolute relative error (AARE), Pearson’s coefficient of correlation (R), Nash–Sutcliffe efficiency (E), and mean squared error (MSE) (Eqs. (3)–(7)).

$$RMSE=\sqrt {\frac{1}{n}\sum {{{\left( {{Q_{observed}} - {Q_{predicted}}} \right)}^2}} }$$
(3)
$$AARE=\frac{1}{n}\sum {\left| {\frac{{{Q_{observed}} - {Q_{predicted}}}}{{{Q_{observed}}}}} \right|} \times 100$$
(4)
$$R=\frac{{n\sum {\left( {{Q_{observed}} \times {Q_{predicted}}} \right)} - \sum {{Q_{observed}}} \sum {{Q_{predicted}}} }}{{\sqrt {\left[ {n\sum {{(Q_{observed})}^{2} - {{\left( {\sum {{Q_{observed}}} } \right)}^2}} } \right]\left[ {n\sum {{(Q_{predicted})}^{2} - {{\left( {\sum {{Q_{predicted}}} } \right)}^2}} } \right]} }}$$
(5)
$$E=1 - \frac{{\sum {{{\left( {{Q_{observed}} - {Q_{predicted}}} \right)}^2}} }}{{\sum {{{\left( {{Q_{observed}} - {Q_{mean}}} \right)}^2}} }}$$
(6)
$$MSE=\frac{1}{n}\sum {{{\left( {{Q_{observed}} - {Q_{predicted}}} \right)}^2}}$$
(7)
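Eqs. (3)–(7) translate directly into code. The following Python sketch uses only the standard library; the function names and the four-point sample data are hypothetical and serve only to show how the measures are evaluated.

```python
import math


def mse(observed, predicted):
    """Eq. (7): mean squared error."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n


def rmse(observed, predicted):
    """Eq. (3): root mean square error."""
    return math.sqrt(mse(observed, predicted))


def aare(observed, predicted):
    """Eq. (4): average absolute relative error, in percent."""
    n = len(observed)
    return sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n * 100.0


def pearson_r(observed, predicted):
    """Eq. (5): Pearson's coefficient of correlation."""
    n = len(observed)
    s_o, s_p = sum(observed), sum(predicted)
    s_op = sum(o * p for o, p in zip(observed, predicted))
    s_oo = sum(o * o for o in observed)
    s_pp = sum(p * p for p in predicted)
    numerator = n * s_op - s_o * s_p
    denominator = math.sqrt((n * s_oo - s_o ** 2) * (n * s_pp - s_p ** 2))
    return numerator / denominator


def nash_sutcliffe(observed, predicted):
    """Eq. (6): Nash-Sutcliffe efficiency."""
    mean_o = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical observed and predicted values, for illustration only
q_obs = [1.0, 2.0, 3.0, 4.0]
q_pred = [1.1, 1.9, 3.2, 3.8]
scores = {
    "RMSE": rmse(q_obs, q_pred),
    "AARE": aare(q_obs, q_pred),
    "R": pearson_r(q_obs, q_pred),
    "E": nash_sutcliffe(q_obs, q_pred),
    "MSE": mse(q_obs, q_pred),
}
```

RMSE and MSE carry the units of the discharge variable (or its square), whereas AARE, R, and E are dimensionless, which makes the latter three easier to compare across data sets.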

Development of ANN model

Discharge (Q) through a sharp-crested side orifice is determined by the flow depth in the main channel (Ym), crest height (W), orifice diameter (D), main channel discharge (Qm), and channel width (B). Using dimensional analysis, all variables were non-dimensionalized to generalise the ANN model. In this study, a feed-forward back-propagation ANN model was created to determine the discharge through a side orifice with a sharp crest. The constructed ANN model comprises three layers: an input layer, a hidden layer, and an output layer. The input layer has four neurons, while the output layer contains a single neuron. The ANN model uses the non-dimensional terms D/B, Ym/B, W/B, and \(\frac{{Q}_{m}}{\sqrt{g{B}^{5}}}\) as inputs corresponding to the four neurons of the input layer and the non-dimensionalized discharge \(\frac{Q}{\sqrt{g{B}^{5}}}\) as the output, as shown in Fig. 1. The experimental data summarized in Table 1 were used to train the ANN model on the input-output patterns. The connection weights between the artificial neurons were initialised, and after the feed-forward step the total error function was determined at the output layer. In the back-propagation step, four distinct algorithms, GD, GDM, GDA, and LM, were employed to adjust the weights so as to reduce the total error function at the output layer.
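As an illustration of the non-dimensionalization, the Python sketch below builds the four input terms from dimensional quantities and converts a non-dimensional discharge back to physical units. The function names and the sample values are hypothetical (only the channel width of 0.5 m corresponds to the experimental setup), and SI units with g = 9.81 m/s² are assumed.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def nondimensional_inputs(D, Ym, W, Qm, B, g=G):
    """Return [D/B, Ym/B, W/B, Qm/sqrt(g*B^5)] for lengths in m and Qm in m^3/s."""
    scale = math.sqrt(g * B ** 5)  # discharge scale with units of m^3/s
    return [D / B, Ym / B, W / B, Qm / scale]


def to_dimensional_discharge(q_star, B, g=G):
    """Convert the network output Q/sqrt(g*B^5) back to a discharge in m^3/s."""
    return q_star * math.sqrt(g * B ** 5)


# Hypothetical run: 10 cm orifice, 30 cm flow depth, 5 cm crest height,
# 0.05 m^3/s main channel discharge, 50 cm wide channel
inputs = nondimensional_inputs(D=0.10, Ym=0.30, W=0.05, Qm=0.05, B=0.50)
```

Because every term is scaled by the channel width B, a model trained on these ratios is not tied to the absolute size of the laboratory channel, although its validity outside the tested ranges of the ratios is not guaranteed.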

Fig. 1 Artificial neural network architecture

Results and discussion

The performance of the ANN models developed with the four training algorithms (ANN-GD, ANN-GDM, ANN-GDA, and ANN-LM) in predicting circular side orifice discharge was evaluated using the experimental data collected by Hussain et al. (2010). First, the best architecture for each ANN model was determined using the training and testing data sets. A trial-and-error method was employed: the number of neurons in the hidden layer was varied from 1 to 20, and the RMSE values obtained for the different hidden layer sizes were compared. The architecture that provided the lowest RMSE was selected as the best architecture, as shown in Fig. 2. The performance statistics of ANN-GD, ANN-GDA, ANN-GDM, and ANN-LM for different numbers of hidden neurons are given in Tables 3, 4, 5, and 6, respectively. From Fig. 2, it can be observed that 4-1-1, 4-1-1, 4-2-1, and 4-8-1 are the best architectures for the ANN-GD, ANN-GDM, ANN-GDA, and ANN-LM models, respectively. Among all the trial values, the following algorithm parameters were found to perform best during training: for GD, learning rate (α) = 0.01; for GDM, α = 0.01 and momentum factor (η) = 0.9; for GDA, α = 0.01, ratio to increase α = 1.05, and ratio to decrease α = 0.7; and for LM, initial μ = 0.001, decrease factor for μ = 0.1, and increase factor for μ = 10.
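The role of the LM parameters listed above (initial μ, decrease factor, and increase factor) can be illustrated with a small Python sketch. It fits a single sigmoid neuron with one weight and one bias to synthetic, noise-free data, so the damped system is only 2 × 2 and can be solved by hand. The problem, the data, and the function names are assumptions made for illustration; this is not the training code used in this study.

```python
import math


def sigmoid(z):
    """Numerically safe sigmoid."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sse(w, b, xs, ts):
    """Sum of squared errors of the neuron y = sigmoid(w*x + b)."""
    return sum((sigmoid(w * x + b) - t) ** 2 for x, t in zip(xs, ts))


def lm_fit(xs, ts, w=0.0, b=0.0, mu=0.001, mu_dec=0.1, mu_inc=10.0,
           max_iter=100, tol=1e-20):
    """Levenberg-Marquardt iterations for the two parameters (w, b)."""
    err = sse(w, b, xs, ts)
    for _ in range(max_iter):
        if err < tol:
            break
        # Accumulate J^T J and J^T e, where e is the vector of errors.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, t in zip(xs, ts):
            y = sigmoid(w * x + b)
            e = y - t
            d = y * (1.0 - y)      # derivative of the sigmoid
            jw, jb = d * x, d      # Jacobian row: de/dw, de/db
            a11 += jw * jw
            a12 += jw * jb
            a22 += jb * jb
            g1 += jw * e
            g2 += jb * e
        # Solve (J^T J + mu*I) * delta = -J^T e with Cramer's rule.
        det = (a11 + mu) * (a22 + mu) - a12 * a12
        dw = (-g1 * (a22 + mu) + g2 * a12) / det
        db = (-g2 * (a11 + mu) + g1 * a12) / det
        new_err = sse(w + dw, b + db, xs, ts)
        if new_err < err:
            w, b, err = w + dw, b + db, new_err
            mu *= mu_dec           # success: move toward Gauss-Newton
        else:
            mu *= mu_inc           # failure: move toward gradient descent
    return w, b


# Synthetic data generated from w = 2, b = -1 (illustrative values)
xs = [-2.0 + 0.5 * i for i in range(9)]
ts = [sigmoid(2.0 * x - 1.0) for x in xs]
w_fit, b_fit = lm_fit(xs, ts)
```

A small μ makes the step close to a Gauss-Newton step, which converges quickly near a minimum, while a large μ shortens the step toward the gradient-descent direction, which is safer far from it.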

Fig. 2 Architecture selection for various ANN models

After development of these models, their performances in discharge prediction from the circular side orifice are compared with each other. The performances of the developed ANN models have been evaluated using five different statistical measures, namely, root mean square error (RMSE), average absolute relative error (AARE), Pearson coefficient of correlation (R), Nash–Sutcliffe efficiency (E), and mean squared error (MSE) (Eqs. (3)–(7)).

Table 2 displays the statistical measures obtained from various models created in this study. The ANN-LM model, which used the Levenberg–Marquardt algorithm as the optimization technique during the backpropagation step, outperformed all other models for predicting discharge from a circular side orifice. During training, the ANN-LM model achieved an AARE value of 3.13, an R value of 0.9994, an E value of 0.9987, and an RMSE value of 0.0005. During testing, the model achieved an AARE value of 4.43, an R value of 0.9976, an E value of 0.9952, and an RMSE value of 0.0010.

Table 2 Performance evaluation of ANN models

Scatter plots of the discharge predicted by the different ANN models against the experimental discharge data are shown in Fig. 3. From Fig. 3, it can be observed that the discharge predicted by the ANN-LM model, which used the LM algorithm during training, is in excellent agreement with the discharge obtained from the experiments.

Fig. 3 Scatter plot between Q (ANN) and Q (experiment)

Table 3 Performance of ANN-GD algorithm for different number of hidden neurons
Table 4 Performance of ANN-GDA algorithm for different number of hidden neurons
Table 5 Performance of ANN-GDM algorithm for different number of hidden neurons
Table 6 Performance of ANN-LM algorithm for different number of hidden neurons

Comparison of developed ANN model with Hussain et al. (2010) model

Hussain et al. (2010) investigated the discharge through a circular side orifice and developed Eq. (8) to estimate the discharge.

$$Q=\frac{\pi }{4}\left( {0.678 - 0.072{F_r} - 0.130\frac{D}{B}} \right)\sqrt {2gH} {D^2}$$
(8)

To further assess the performance of the proposed ANN-LM model, the predicted discharge values were compared with those obtained from Eq. (8) proposed by Hussain et al. (2010). Figure 4 shows that the ANN-LM predictions lie closer to the line of agreement than those of Eq. (8). Additionally, the ANN-LM predictions fall within the ±5% error lines, while the discharges predicted using Eq. (8) fall within the ±10% error lines, as shown in Figs. 5 and 6, respectively. These results indicate that the ANN-LM model proposed in this study halves the error band, from ±10% to ±5%, that is, a 50% reduction.
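For reference, Eq. (8) and the error-band check can be written as a short Python sketch. Several points are assumptions, because they are not defined in this section: Fr is taken as the approach-flow Froude number in the main channel, H as the head on the orifice, SI units are used, and the function names and sample values are hypothetical.

```python
import math


def discharge_eq8(Fr, D, B, H, g=9.81):
    """Eq. (8): discharge through a circular side orifice, in m^3/s.

    Assumed meanings: Fr = approach-flow Froude number, D = orifice diameter (m),
    B = main channel width (m), H = head on the orifice (m).
    """
    cd = 0.678 - 0.072 * Fr - 0.130 * D / B      # bracketed coefficient of Eq. (8)
    area = math.pi / 4.0 * D ** 2                # orifice area
    return cd * area * math.sqrt(2.0 * g * H)


def within_error_band(observed, predicted, band_percent):
    """True if the prediction lies within +/- band_percent of the observation."""
    return abs(predicted - observed) / observed * 100.0 <= band_percent


# Hypothetical values, for illustration only
q_eq8 = discharge_eq8(Fr=0.3, D=0.10, B=0.50, H=0.20)
in_5_percent = within_error_band(observed=0.0100, predicted=0.0104, band_percent=5.0)
in_10_percent = within_error_band(observed=0.0100, predicted=0.0108, band_percent=10.0)
```

Counting the share of predictions for which `within_error_band` holds at 5% and at 10% gives a numerical counterpart to the error lines drawn in the scatter plots.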

Fig. 4 Comparison of performance of discharge prediction models proposed in this study and that of Hussain et al. (2010)

Fig. 5 Observed discharge versus predicted discharge using ANN-LM model

Fig. 6 Observed discharge versus predicted discharge using Eq. (8)

Conclusions

This research used artificial neural networks to predict the discharge from a circular side orifice and compared the performance of four ANN training algorithms: ANN-GD, ANN-GDM, ANN-GDA, and ANN-LM. To determine the optimal neural network architecture, a trial-and-error method was employed in which the number of neurons in the hidden layer was varied from 1 to 20 and the RMSE values for the different hidden layer sizes were compared. The optimal structures were found to be 4-1-1, 4-1-1, 4-2-1, and 4-8-1 for the ANN-GD, ANN-GDM, ANN-GDA, and ANN-LM models, respectively. The ANN-LM model, which applied the LM algorithm as the optimization approach in the back-propagation step, outperformed all other models, with AARE, R, E, and RMSE values of 3.13, 0.9994, 0.9987, and 0.0005 during training and 4.43, 0.9976, 0.9952, and 0.0010 during testing, respectively. Moreover, when compared with the discharge equation proposed in the literature, the ANN-LM model was found to reduce the error band in discharge prediction by 50%, from ±10% to ±5%. This indicates that the proposed ANN-LM model is highly accurate and can be used for discharge prediction from circular side orifices.