Introduction

Surface water pollution is a significant concern for communities worldwide and requires more attention from environmental researchers (Hameed et al. 2016). In addition, the implications of contaminated surface water are deterioration of water quality (WQ) (Panaskar et al. 2016), a direct threat to human health (Kadam et al. 2019), and disruption of the aquatic ecosystem's balance (Koranga et al. 2022). Accordingly, the United Nations (UN) announced that approximately 1.5 million people die annually from diseases driven by polluted water. Moreover, it is reported that 80% of health troubles in developing countries are brought on by polluted water (Aldhyani et al. 2020a; b). However, surface WQ is affected by various factors, including natural components such as temperature or precipitation (Gupta and Gupta 2021; Michalak 2016; Shanley 2017) and anthropogenic factors such as manufacturing practices, urbanization, and agriculture (Asadollah et al. 2021; Ustaoğlu et al. 2020). In view of this, managing WQ in diverse climate conditions seems crucial to prevent the population from suffering from diseases and health troubles. Hence, various techniques have been created and developed to evaluate the WQ, such as the Water Quality Index (WQI) technique (Uddin et al. 2021). WQI has been commonly utilized "as a classification indicator" to evaluate and categorize the water bodies' quality based on measuring a wide range of physiochemical and organic variables and plays a vital role in water management(Das Kangabam et al. 2017; Judran and Kumar 2020).

Prediction is crucial to WQ monitoring and is part of contemporary environmental management. Over the past few decades, many traditional WQ prediction methods have been used, e.g. auto-regressive integrated moving averages (ARIMA) (Araghinejad 2013) and multiple linear regression (MLR) (Rajaee and Boroumand 2015). However, with the expansion in the volume of data, traditional methods cannot effectively suit the requirements of researchers due to the increase in computing power and the inability to capture non-stationarity (Chang et al. 2016) and nonlinear (Huo et al. 2013) WQ owing to its sophisticated and complicated nature(Chen et al. 2020). Hence, artificial intelligence techniques such as artificial neural networks (ANNs) are becoming more popular in recent years because they can surmount the limitations or drawbacks of traditional models. In addition, due to their similarities with the brain's nervous system, ANNs are suitable for analyzing nonlinear and unpredictable problems and have become a hotspot in environmental quality research (Hajirahimi and Khashei 2022). The ANN's architecture comprises an input, hidden, and output layers (Kadam et al. 2019). artificial neural networks (ANNs) (Sakizadeh 2016). For instance, Vijay and Kamaraj (2021) employed ANN to predict WQI Drinking Water Distribution System. Also, Yilma et al. (2018) confirmed that ANN is useful for modelling the WQI.

Moreover, the imperative for raising data-driven techniques' reliability, accuracy, and capability has encouraged scientists to develop creative models. The fundamental goal of these new models is to maximise the potential of existing models by combining the benefits of several methodologies (Faruk 2010). These combined approaches generally integrate methods in a process where conventionally, one technology is considered the basic method, the others functioning as pre-processing or postprocessing techniques (Modaresi and Araghinejad 2014). In this context, various optimization techniques were utilized to integrate the machine learning models by finding their hyperparameters, leading to increased accuracy and saving time (Ahmed et al. 2017). For example, particle swarm optimization (PSO) was employed in various fields, such as drought (Nabipour et al. 2020) and WQ (Aghel et al. 2018; Azad et al. 2019). Also, Faramarzi et al. (2020) proposed the marine predator's algorithm (MPA), which is applied in various applications, including friction stir welded (Abd Elaziz et al. 2020) and photovoltaic systems (Yousri et al. 2020). In addition, the slime mould algorithm (CPSOCGSA) has been devised by Rather and Bala ( 2019b) to solve optimization problems such as engineering design problems (Rather and Bala 2019a) and water drought forecasting (Alawsi et al. 2022).

Finally, this research aims to predict long-term WQI considering several WQ parameters and rainfall data. The following objectives will be carried out to accomplish this goal: (1) calculating the WQI for the Yesilirmak River. (2) Applying three stages of data pre-processing to increase data quality and choice the optimal independent factors. (3) To reduce uncertainty, combine the ANN model with the CPSOCGSA algorithm and compare the MPA and PSO methods.

Based on the authors' investigation, this is the first time utilizing update algorithms (i.e., CPSOCGSA and MPA) to predict the WQI.

Methodology

The framework of the proposed methodology comprises five stages: study area and data collection, calculating the WQI, data pre-processing, CPSOCGSA algorithm, ANN model, and evaluation of the model's performance, as shown in (Fig. 1).

Fig. 1
figure 1

Framework of the architecture of ANN to predict WQI

Area of study and data set

The Yesilirmak River is considered one of the longest rivers in Turkey and located between approximately 29′ and 40′ N. 31′ 40′ N and 44′ 71′ E. 44′ 61′ E, with 39,628 km2 covering's area and discharges into the Black Sea in Samsun. The basin has transported partially and fully treated municipal wastewater and pollution from various diffuse sources, including farmyards and agricultural areas. Additionally, different delta parts have dried up in recent decades (Dinc et al. 2021). The station case study was located within the provincial borders of Çorum, Eastern Black Sea Basin. Therefore, the coastal sea climate is dominant in this basin zone, with annual precipitation of about 443 mm.

In several countries, the essential obstructive faced by researchers is the lack of data. Therefore, these stations were sufficiently selected for assessing and building prediction models regarding the number of data and availability of parameters. Additionally, the National Oceanic and Atmospheric Administration (NOAA) (NOAA 2021) was adopted for obtaining the climate variables. The dataset was collected for twenty years, from 1995 to 2014. It contained eleven WQ parameters with one influential climatic factor (rainfall), where the values are distributed regularly along the monitoring periods. Table 1 presents the WQ parameters used to calculate the WQI.

Table 1 The water standards, assigned weights, and relative weight for the WQI's equation (Sharma et al. 2014; Kulisz Monika et al. 2021; Kadam et al. 2019; Şener et al. 2017)

Water quality index calculation

The WQI is an evaluation method that reflects the influence of individual WQ variables on the overall quality of aquatic systems (Ramakrishnaiah et al. 2009). For calculating the WQI, 11 significant WQ parameters were specified regarding the World Health Organization standard for drinking WQ (Organization et al. 2004). The physicochemical parameters were: biological oxygen demand (BOD), pH, electrical conductivity (EC), dissolved oxygen (DO), chlorine (CL−1), calcium(Ca+2), magnesium (Mg+2), nitrite (NO2−1), sodium (Na+1), sulfate (SO4), and total dissolved solids (TDS) (Ewaid et al. 2018; Kulisz Monika et al. 2021). In this research, the WQI calculation can be categorized into three stages:

Firstly, weights (Wi) were assigned to all WQ parameters with a scale ranging from 1 to 5, reflecting their significance in affecting water bodies. Table 1 shows the WQ standards, assigned weights (AW.), and relative weights (RW.) of the WQI's equation.

Secondly, calculate the relative weights (Wi) by (Eq. 1):

$$Wi=\frac{wi}{{\sum }_{i=1}^{n}wi}$$
(1)

\(\mathrm{Wi}\) Is the relative weight, wi is the weight of every parameter, and n is the number of parameters.

$$qi=(ci/si)*100$$
(2)
$$qi=((ci-vi)/(si-vi))*100$$
(3)

Thirdly, from (Eq. 2), a quality rating scale (qi) was assigned by dividing every parameter concentration in the water samples by its respective standard in WHO guidelines. In contrast, the quality rating for pH and DO was calculated by (Eq. 3).

Where Ci is the measured concentration of parameters in each simple, Si is the drinking water standard for each parameter, Vi is the ideal value (for pH 7.0 and DO = 14.6), and SIi is the subindex of the parameter. Finally, WQI was derived from formulas (4) and (5).

$$SIi=qi*Wi$$
(4)
$$WQI=\sum SIi$$
(5)

Table 2 shows the classification of WQI according to the range proposed by previous studies (Aldhyani et al. 2020a; b; Ewaid et al. 2018).

Table 2 water quality classification (WQC)

Data pre-processing

Three strategies were adopted in this research as pre-processing data methods, which are normalization, cleaning, and selecting the optimal model's input.

Normalization

Data normalization is a crucial step for mining datasets in soft computing. In various scientific research, techniques and assessment criteria mostly have multiple scale sizes and units, producing various analysis data results. Therefore, normalization will be a fundamental task to achieve comparability among the variables and realize the expectations of optimization data. Moreover, it minimizes the dimension's impact of different time series and eliminates the influence of outliers and multicollinearity of model parameters (Wu and Wang 2022). In this work, the SPSS 26 statistics package was employed to normalize the time series by applying the natural logarithm technique.

Cleaning

Data cleaning methods comprise locating outliers and noise and then treating outliers and eliminating noise to optimize the data analysis outcomes (Tabachnick et al. 2007). So, the box whisker technique was adopted in this study to identify the outliers were existed outside of the period in the formula below (Kossieris and Makropoulos 2018):

$$\pm 1.5 \times \mathrm{ CQR }(\mathrm{CQR }=\mathrm{ third\, quartile }\left(\mathrm{Q}3\right)-\mathrm{ first\, quartile }(\mathrm{Q}1))$$
(6)

Also, the technique of singular spectrum analysis (SSA) is used to detect and remove noise from time series. It is an efficient method for analyzing the initial time series into various components (Golyandina et al. 2018; Karami and Dariane 2022). This approach has shown to be effective in diverse speciality areas, such as rainfall prediction(Reddy et al. 2022), hydrology (Ouyang and Lu 2018), drought forecasting (Pham et al. 2021), and groundwater prediction (Polomčić et al. 2017).

Identifying the best model input

Determining the appropriate predictors is fundamental in establishing a forecast model's structure and enhancing the performance of model. Therefore, the tolerance approach was applied to identify the optimal scenario for the predictors to avoid multicollinearity (Calì et al. 2016; Pallant 2020). It recommended a tolerance coefficient value equal to or higher than 0.2 for selecting the model's predictors.

Constriction coefficient-based particle swarm optimization and chaotic gravitational search algorithm (CCPSOCGSA)

CPSOGSA is a hybrid heuristic optimization that utilizes the intensification potential of the CPSO algorithm with the diversification capability of GSA's algorithm. The components of this hybridization technique will be clarified in the subsections below.

A. Constriction Coefficient based Particle Swarm Optimization (CCPSO.)

The PSO algorithm is a popular optimization approach inspired by natural swarm behaviour for birds or fish. The PSO architecture consists of three principal parameters: abest, bbest and inertia weight. Where the abest and bbest assist the finding of the search-space region. The inertia weight has a significant impact on the process of global exploration. In (Eqs. 7,8), the Particle Swarm mathematical formulation explains the updating of the process for the particle's location and velocity during the alteration of the particle values.

$${v}_{x}^{d}\left(t+1\right)=w\left(t\right){v}_{x}^{d}\left(t\right)+{c}_{1}{r}_{x1}\left(abest- {x}_{x}^{d}(t)\right)+{c}_{2}{r}_{x2}\left(bbest- {x}_{x}^{d}(t)\right)$$
(7)
$${x}_{x}^{d}\left(t+1\right)={x}_{x}^{d}\left(t\right)+ {v}_{x}^{d}\left(t+1\right)$$
(8)

where the c1, c2 are the learning constants, while rx1 and rx2 are the numbers range from 0 to 1.

Constriction coefficients were employed to improve Particle Swarm Optimisation (PSO) exploitation stage by minimizing the impact of particle movements outside the solution space and hastening convergence during the optimization stage. The coefficients are described as follows:

$${\varphi }_{1}=2.05, {\varphi }_{2}=2.05, \varphi ={\varphi }_{1}+{\varphi }_{2}$$
(9)
$$K=2/\left(\varphi -2+\sqrt{\left({\varphi }^{2}-4\right)}\right)$$
(10)

Substituting the inertia weight by the notation K, (Eq. 7) can be rewritten as in below:

$${v}_{x}^{d}\left(t+1\right)=K{v}_{x}^{d}\left(t\right)+K{\varphi }_{1}{r}_{x1}\left({abest}_{x}\left(t\right)- {x}_{x}^{d}\left(t\right)\right)+K{\varphi }_{2}{r}_{x2}\left(bbest- {x}_{x}^{d}\left(t\right)\right)$$
(11)

where Kφ1 = c1, Kφ2 = c2.

B. Chaotic Gravitational Search algorithm CGSA

Gravity and motion Newton's law is considered fundamental to the configuration of the GSA algorithm. Newton's law states that "the gravitational force between two masses is directly proportional to the product of their masses and inversely proportional to the square of the distance between them". Accordingly, the gravitational force (F_ij) between masses (i.e., searching agents) (x) and (y) at time (t) can be represented as in the following (Eq. 12):

$${F}_{xy}=G\left(t\right)\frac{{m}_{px}\left(t\right){m}_{ay}\left(t\right)}{{R}_{xy}\left(t\right)+\in }\left({x}_{x}^{d}\left(t\right)+{x}_{y}^{d}\left(t\right)\right)$$
(12)

where \({m}_{px} \mathrm{is the} \)attractive mass, \(and {m}_{ay}\) is the passive masses. While Rxy(t)  is the Euclidian distance between the two masses at the time (t), while \(\in \) is a small coefficient, and G is the constant help for controlling the solution space to secure a feasible region.

The (G) constant can be represented by (Eq. 13):

$$G\left(t\right)=G\left({t}_{o}\right){e}^{\left(-\alpha \frac{CI}{MI}\right)}$$
(13)

where \(G\left({t}_{o}\right)\&G\left(t\right)\) is the initial and final values of G, respectively. \(CI\) is the current iteration, \(\alpha \) is the small constant, and \(MI\) is the maximum value of iterations. The behaviour of G over time is proposed using a chaotic normalization (Rather and Bala 2019a, b). Hence, (Eq. 14) descript the final formula of the gravitational constant's:

$${G}^{c}\left(t\right)={C}_{i}^{norm}\left(t\right)+G\left({t}_{o}\right){e}^{\left(-\alpha \frac{CI}{MI}\right)}$$
(14)

The aggregate force exerted by the masses (i.e., searching agents) could be found as shown in (Eq. 15):

$${F}_{x}^{d}\left(t\right)= \sum_{y=1, y\ne x}^{m}{\gamma }_{y}{F}_{xy}$$
(15)

where \(\gamma \) value ranges from 0 to 1, it is essential to calculate the position and velocity to find the global optimum, which could be represented according to (Eqs. 16, 17):

$${v}_{x}^{d}\left(t+1\right)={\gamma }_{y}{v}_{x}^{d}\left(t\right)+{a}_{x}^{d}\left(t\right)$$
(16)
$${x}_{x}^{d}\left(t+1\right)={x}_{x}^{d}\left(t\right)+{v}_{x}^{d}\left(t+1\right)$$
(17)

where \({a}_{x}^{d}\left(t\right) \) is the mass acceleration.

C. The Combination algorithm (CPSOCGSA)

Combining the two heuristic techniques (CPSO and CGSA) could assist to exceeded each technique's imperfections. The hybridization equation formula can be described in (Eq. 18):

$${v}_{x}^{d}\left(t+1\right)=\left(2/\left(\varphi -2+\sqrt{{\varphi }^{2}-4}\right)\right){v}_{x}^{d}\left(t\right)+K{\varphi }_{1}{r}_{x1}\left({a}_{x}^{d}\left(t\right)-{x}_{x}^{d}\left(t\right)\right) +K{\varphi }_{2}{r}_{x2}\left(bbest-{x}_{x}^{d}\left(t\right)\right)$$
(18)

The particle's location is indicated by (Eq. 19):

$${x}_{x}^{d}\left(t+1\right)={x}_{x}^{d}\left(t\right)+{v}_{x}^{d}\left(t+1\right)$$
(19)

Artificial neural network (ANN)

ANNs are an information-processing approach designed to simulate the functioning of the human brain by attempting the same connectivity and processes as biological neurons (Kouadri et al. 2022). In this research, the ANN structure was composed of four layers of neurons: an input layer which has WQ parameters and rainfall data, two hidden layers to address the nonlinearity relationship, and an output layer which has the target (WQI)(see Fig. 2). The multilayer perceptron (MLP) is performed with feed-forward backpropagation (FFBB). The efficiency with a low error rate and speed of the learning algorithm (Levenberg–Marquardt, LM) was an underlying factor employed with FFBB for training the ANN model (Payal et al. 2015). Matlab Toolbox was running to implement the ANN technique.

Fig. 2
figure 2

ANN structure for CPSOCGSA-ANN

The data were separated into three groups: training, testing, and validation, with 70%, 15%, and 15% of values utilised for each set, respectively, such as (Kulisz Monika et al. 2021; Zubaidi et al. 2018).

However, the traditional method (trial and error) was time-consuming and did not always introduce the optimal solution. As a result, combining metaheuristic algorithms with the ANN model is regarded as a superior technique for determining the optimal neurons’ number in the hidden layers (N1, N2) and selecting the best learning rate coefficient (Lr). These hyperparameters are responsible for choosing the best predictors and target mapping to avoid underfitting or overfitting the model.

Model performance assessment

Three statistical criteria are used in this study to evaluate the WQI prediction model's performance. The criteria are the root means squared error (RMSE, Eq. 20), mean absolute error (MAE, Eq. 21), and determination coefficient (R2, Eq. 22) (Mohammadi et al. 2020; Mohammadi and Mehdizadeh 2020):

$$MAE=\frac{{\sum }_{i=1}^{N}\left|{O}_{i}-{F}_{i}\right|}{N}$$
(20)
$$RMSE=\sqrt{\frac{{\sum }_{i=1}^{N}{\left({O}_{i}-{F}_{i}\right)}^{2}}{N}}$$
(21)
$${{R}^{2}}^{ }={\left[\frac{\sum_{i=1}^{N}\left({O}_{i}- {\overline{O} }_{i}\right)\left({F}_{i}-{\overline{F} }_{i}\right)}{\sqrt{\sum {\left({O}_{i}- {\overline{O} }_{i}\right)}^{2}\sum {\left({F}_{i}-{\overline{F} }_{i}\right)}^{2}}}\right]}^{2}$$
(22)

where \({O}_{i}\) represents observed WQI, \({F}_{i}\) is the forecast WQI, \(N\) is the sample size \(,\overline{{F }_{i}}\) is the mean of forecast WQI, and \(\overline{{O }_{i}}\) is the mean of observed WQI.

Additionally, the Taylor diagram was utilised to evaluate the outcomes of prediction models by creating a visual comprehension of performance by showing diverse points on a polar plot for multisets of modelling outcomes (Taylor 2001).

Results and discussion

Water quality index assessment

The assigned weight index approach was juxtaposed with the WHO standards for drinking WQ to calculate the WQI for the Yesilrmak river. Table 3 presents the eleven parameters that were combined for the WQI prediction and the descriptive statistics values for them. As can be observed, the WQI varies from 37.41 to 181.06, and the mean was 51.36; this variation reflects the unsuitability for drinking, as shown in Table 2. Calcium (Ca+2), Nitrite (NO2−1), chloride (CL−1), and sodium (Na+1) contents did not exceed the acceptable range in any tested values. While pH, biological oxygen demand (BOD5), electrical conductivity (EC), magnesium (Mg+2), sulfate (SO4), and total dissolved solids (TDS) were higher than the specified values in the WHO guidelines. In the same context, the dissolved oxygen exceeded the minimum acceptable value in six samples during the monitoring periods.

Table 3 Descriptive statistics for the Yesilrmak river

Preparation of predictors and target variables

The input data [i.e., WQ parameters and rainfall) and target data (WQI)] were normalized by applying a natural logarithm (Ln) to minimize the adverse impact of extreme values and achieve the normal time series distribution. Then, if any outliers remained after the normalization, it was adjusted. The box plot technique displayed in (Fig. 3) decreased the variation scale of normalization data compared to the raw data. It also showed how the data had been cleaned from outliers.

Fig. 3
figure 3

Box plot drawings of WQI with some of the selected parameters before and after normalized and cleaned data

After that, the SSA was applied to decompose the normalized and cleaned time series into three different components to eliminate the noise. Also, the data pre-processing technique enhanced the correlation coefficient (R) between the dependent and independent factors, such as improving the correlation between WQI and chlorine from 0.609 to 0.7106.

Then, a tolerance approach was applied to locate the optimal predictors scenario to simulate the WQI with high accuracy and prevent multicollinearity by clearing superfluous parameters. According to (Pallant 2020), the tolerance coefficient for the nominated input variables must be higher than 0.2. As shown in Table 4 below, five variables were chosen; biological oxygen demand, Chlorine, magnesium, potassium and Rainfall as the optimal input data model.

Table 4 Collinearity statistics for the specified predictors

In the last step, according to(Gharghan et al. 2016; Kulisz and Kujawska 2021), the data were categorized into 70% training, 15% testing, and 15% validation to create and evaluate the forecasting model.

Configuring the hybrid model

The artificial neural network was hybridized with a metaheuristic approach to locate the optimum LR, N1, and N2. The Matlab program was used for running the hybrid models (i.e., MPA-ANN, CPSOCGSA-ANN, and PSO-ANN) to identify the optimum ANN's hyperparameters (LR, N1, and N2). This research utilized swarm sizes (10, 20, 30, 40, and 50 pop) for every swarm repeated five times with ( 200 iterations) to achieve the minimum fitness function (MSE). For example, Fig. 4 depicts the CPSOCGSA-ANN performance and exposes the optimum fitness function for all swarms of WQI.

Fig. 4
figure 4

CPSOCGSA-ANN algorithm performance for simulating WQI

Figure 5A shows that the swarm size (305) was superior in the CPSOCGSA-ANN algorithm and produced the minimum fitness function (MSE = 0.001283, with five Iterations). While, swarm size (405) for the MPA-ANN algorithm offers the best solution within the fitness function (MSE = 0.01341, with nine Iterations), as shown in Fig. 5B. Finally, the PSO-ANN algorithm within swarm size (404) was superior to other PSO-ANN swarms and delivers less error (MSE = 0.01479, with 94 Iterations), as shown in Fig. 5C.

Fig. 5
figure 5

Best performed of CPSOCGSA-ANN, PSO-ANN and MPA-ANN algorithms for simulating WQI

Based on the best swarms for each metaheuristic method, Table 5 shows the hyperparameters for ANN models.

Table 5 ANN hyperparameters of all models

Performance evaluation

After determining the optimal hyperparameters for the ANN approach, each ANN method was run several times to identify the optimal network that delivers an accurate result. Three statistical metrics (RMSE, MAE, and R2) were employed to assess the model's effectiveness in predicting WQI data. The results of the performance criteria indices for the CPSOCGSA-ANN, PSO-ANN, and MPA-ANN strategy in the validation section are depicted in Table 6 (Dawson et al. 2007). The CPSOCGSA-ANN was superior to other models because it realized the minimum values of MAE, RMSE criteria and the highest value of R2.

Table 6 Performance evaluation for validation data phase

The Taylor diagram (Fig. 6) was utilized to evaluate the hybrid models’ performance in the validation phase. The measured WQI time series is constituted by the red character (Reference) on the Taylor diagram's X-axis. Taylor's graphical diagram compares three statistics, including correlation coefficient (R), standard deviation (SD), and root mean square error difference (RMSD). It thus delivers a dependable evaluation of the relative performance of various strategy (Ghorbani et al. 2018; Tao et al. 2021). Regarding the Taylor diagram, the CPSOCGSA-ANN model was indicated to be the best forecast model compared with the PSO-ANN and MPA-ANN models because it is the nearest to the reference point.

Fig. 6
figure 6

Taylor diagrams for different forecasting models

Moreover, to increase the reliability of the CPSOCGSA-ANN strategy, the Kolmogorov_Smirnov (K_S) and Shapiro_Wilk (S_W) tests were adopted to test the normality of the error data. The results showed that the (p-values) of K_S and S_W tests were more than 0.05, indicating that the errors have a normal distribution according to Valentini et al. (2021) (see Table 7).

Table 7 Tests of normality

Based on the above results, it is possible to conclude that:

  • The assessment result of Yesilrmak river quality varied from good to poor, except for one point (June 2000) that was unsuitable for drinking.

  • The SSA and tolerance approach has high benefits for enhancing data quality and picking the optimized scenario for input parameters.

  • Comparing the performance accuracy of the hybrid models, including CPSOCGSA-ANN, PSO-ANN, and MPA-ANN, showed that the CPSOCGSA-ANN was a reliable and superior model to predict WQI data.

Conclusion

The present research has assessed the WQ during the monitoring period (1995–2014) for the Yesilrmak river within the provincial borders of Çorum, Turkey. To this end, eleven physicochemical parameters were assigned to calculate the WQI. The result varied from good to poor, except for one point (June 2000) that was unsuitable for drinking, as shown in Table 3. Additionally, this study employed novel hybridization models combining pre-processing data and ANN approach optimized by multiple metaheuristic methods (i.e., CPSOCGSA, MPA, and PSO) to predict the WQI. Also, to raise the accuracy of models, the implications of climate factors cannot be neglected; hence, adopting rainfall beside the WQ parameters as predictors. The results for the models presented two crucial aspects. The first aspect confirms that pre-processing methods (i.e., SSA and tolerance) enhanced the dataset quality and yielded appropriate scenario of predictors. The second aspect, the performance of CPSOCGSA-ANN, was superior to other models (MPA-ANN and PSO-ANN) that yielded R2 = 0.965, MAE = 0.01627, and RMSE = 0.0187.

For future research, it is recommended to use these metaheuristic algorithms to integrate other machine learning models. Also, assess the impact of additional predictors such as an expansion of urbanization, the volume of wastewater discharges, and water consumption.