Introduction

Water pollution has become an urgent environmental issue facing our world. The levels of contaminants, especially total phosphorus (TP), have been increasing at a very high rate, and this has led to a significant impact on ecological systems as well as on the economy and human health. Approximately 15% of the global population is concerned with the issue of water safety (Katz and Dosoretz, 2008). In China, a large amount of industrial and domestic wastewater has been discharged into the environment as a result of decades of rapid economic development and urbanization, and domestic sewage water constitutes approximately 70% of the total wastewater produced by the country. Therefore, it has become an urgent matter to implement intensive wastewater treatment programs to efficiently remove TP from wastewater.

In the past, physical or chemical methods such as adsorption, precipitation, filtration, and oxidation have been popular for treating wastewater (Ho and Babel, 2020). However, there has recently been renewed interest in staged coagulation-flocculation-based methods that are cheaper and require less time to operate (Mortadi et al. 2020). Coagulation plays an important role in reducing the levels of particulates, synthetic organic carbon, precursors of disinfection by-products (DBPs), and some inorganic and metal ions as well as microorganisms in wastewater to improve its quality (Wang et al. 2021). During conventional coagulation treatment, alum coagulant, which undergoes a hydrolysis reaction to form Al3+ species, is the commonly used inorganic coagulant (Santos et al., 2016). It has been shown that aluminum salt-based coagulants are much more effective than iron salt-based coagulants for removing turbidity from wastewater (Kumari and Gupta, 2020; Rizzo et al. 2008). Aluminum ions tend to attract particles in the wastewater to form larger flocs that can then be easily precipitated and removed from the water (Beltran et al. 2009).

In recent years, the use of polymerized forms of metal coagulants (e.g., poly-aluminum chloride (PAC)) has increased because of their excellent flocculating properties (Zhou et al. 2021; Bogunovi et al., 2021). Compared with traditional Al2(SO4)3, PAC has both strong charge neutralization and bridging abilities, which allow it to readily attract particulate or colloidal matter and form large flocs (Rizzo et al. 2019; Usefi and Asadi-Ghalhari, 2019). In addition, organic polymer-based flocculants, such as polyacrylamide (PAM), are widely used in the coagulation process. Their excellent flocculation effect and low required dosage make them more effective in many cases for wastewater treatment plants (WWTPs) (Li et al. 2018). PAC and PAM have been widely used in the coagulation–flocculation process in WWTPs for decades (Liu et al. 2013a, b; Wang et al. 2011; Xu et al. 2020). However, as China’s discharge standards for wastewater treatment plants become gradually stricter, how to further improve the pollutant removal efficiency of the conventional coagulation process has become a hot research topic (Lv et al. 2019).

An important way to enhance the effectiveness of the coagulation-flocculation-based method is to speed up the settling of the floc and its separation from the water, and this might be achieved by adding magnetic powder (Liu et al. 2013a, b; Kumari and Gupta, 2020). Recently, it has been reported that Fe3O4 used as magnetic seeding can remove micro-pollutants from wastewater with the advantages of accelerated floc sedimentation, less sludge production, and low cost (Huangfu et al. 2017). Furthermore, the magnetic powder may be separated from the waste sludge by a magnetic separator installed on the magnetic coagulation device and then reused in the system (Lv et al. 2020; Yao et al. 2014). However, the effectiveness of this method largely depends on the amount of magnetic powder added and the coagulant used. Many factors can influence the efficiency of the magnetic coagulation-flocculation process, such as the doses of coagulant, flocculant, and magnetic powder, as well as pH, mixing, and hydraulic retention time (Lohwacharin et al. 2014; Qasim et al. 2019). Thus, the proper optimization of these factors can significantly increase the treatment efficiency. It has been reported that the water quality of secondary effluent from municipal wastewater treatment plants is stable (Li et al. 2009). Furthermore, the pH of the water source in this work averaged approximately 7.2 ± 8.8%. Hence, pH was not evaluated as a primary variable in this study.

In traditional methods of optimization, the effect of only one variable is studied at a time. This approach not only consumes considerable experimental time but also cannot reveal the combined effect of the variables on the responses. Response surface methodology (RSM) is widely used to design experiments and optimize experimental data. It can reduce the required number of experiments and determine the experimental conditions for obtaining the best response (Mohammad et al. 2019). RSM has become a common mathematical modeling approach in water treatment because it requires fewer tests, lowers costs, saves time, and can evaluate the interactions among various factors.

However, RSM is difficult to apply in practice in wastewater treatment plants (WWTPs) because it cannot incorporate non-controllable variables. Unlike in laboratory experiments, the influent water quality of a WWTP fluctuates daily, and once it has changed significantly beyond the original design range of the RSM, it is difficult for the RSM to obtain the optimal process solution and accurately predict the effluent quality. In contrast, an artificial neural network (ANN) is a computational model inspired by signal transmission between biological neurons in the brain, and it can learn from data and evaluate its own errors (Igwegbe et al. 2019). The greatest advantage of an ANN model is that it can handle a large amount of nonlinear and complex data, including non-controllable variables, and it can be constantly updated with new data obtained during the operation of a WWTP. Thus, combining an ANN with the RSM should help to overcome the limitation of the RSM and predict the optimum conditions and removals for varying influent water quality.

Previous studies have focused on a comparison between the RSM and ANN models for optimizing and modeling predictions, but these models have rarely been used as a combined approach (Hafeez et al. 2020). In this study, a novel approach of a combined RSM-ANN model was used for multiple target optimization by the magnetic coagulation process. This work aimed to improve the discharge water quality and to evaluate this novel optimization/prediction strategy for WWTP applications in the future.

Materials and methods

Materials

Characteristics of the secondary effluent

Wastewater samples were collected from the secondary effluent of domestic wastewater from a municipal wastewater treatment plant (WWTP) in Wenzhou, China. The initial total phosphorus (TP), pH, total nitrogen, chemical oxygen demand, and turbidity were 0.52 mg/L, 7.1, 14.32 mg/L, 18 mg/L, and 33.40 NTU, respectively. The poly aluminum chloride (PAC), polyacrylamide (PAM), and magnetic powders (Fe3O4) were of analytical grade. The PAC and PAM were purchased from the Xinyuan Environmental Protection Company (Henan), and the magnetic powder was acquired from the Kaili Metallurgical Research Institute (Tianjin).

Preparation of the coagulant and flocculant

The coagulant, PAC, was prepared as follows. A total of 5 g of PAC was weighed into a 200 ml beaker and stirred in water until the powder was completely dissolved. The solution was transferred to a 500 ml volumetric flask and shaken well, giving a 10 mg/ml PAC solution that was prepared on the day of the test. Thereafter, the flocculant (PAM) solution was prepared in a 500 ml beaker by weighing 0.3 mg of the PAM, and the beaker was placed on a magnetic stirrer and stirred at 1000 rpm for 4 h. This was done because PAM dissolves in water as a viscous, colorless solution and is extremely difficult to hydrolyze completely. Finally, the water samples were subjected to both conventional coagulation and magnetic coagulation assays, and the results from the two methods were compared. The chosen method was then optimized.
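The stock concentration and the dosing volumes implied by this preparation can be checked with a short calculation. The sketch below assumes the 5 g of PAC is made up to 500 ml and that each jar holds 1 L of sample dosed at 10 to 50 mg/L, as in the jar tests described in the next section; the function names are illustrative and not part of the original protocol.

```python
def stock_concentration_mg_per_ml(mass_g, volume_ml):
    """Concentration of a stock solution in mg/mL."""
    return mass_g * 1000.0 / volume_ml


def stock_volume_ml(target_mg_per_l, sample_volume_l, stock_mg_per_ml):
    """Volume of stock (mL) needed to reach a target dose in a sample."""
    return target_mg_per_l * sample_volume_l / stock_mg_per_ml


# 5 g PAC made up to 500 mL gives the 10 mg/mL stock stated in the text.
pac_stock = stock_concentration_mg_per_ml(5.0, 500.0)

# PAC doses used in the initial jar tests (mg/L) for 1 L samples.
doses = [10, 20, 30, 40, 50]
volumes = [stock_volume_ml(d, 1.0, pac_stock) for d in doses]

for dose, vol in zip(doses, volumes):
    print(f"{dose} mg/L PAC in 1 L -> {vol:.1f} mL of stock")
```

With a 10 mg/ml stock, each 10 mg/L step in a 1 L jar corresponds to 1 ml of stock, which keeps the added volume small relative to the sample.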

Design method of the coagulation jar tests

Initial magnetic coagulation test

The coagulation experiments followed the method of Jiang et al. (2014). The experimental conditions were optimized in terms of the reagent addition order, the stirring rate, and the stirring time. The selected conditions were to add the PAC, the magnetic powder, and the PAM in that order, with stirring at 500 rpm for 45 s, 500 rpm for 45 s, and 80 rpm for 10 min, respectively.

The experimental procedure of this work was conducted as follows. The wastewater sample (1 L) was dispensed into a 2-L beaker, and a total of six such samples were prepared for each test. To each beaker containing the wastewater, a different amount of coagulant (PAC) was added, yielding different final concentrations (10, 20, 30, 40, and 50 mg/L), and the samples were then stirred for 45 s at 500 rpm. Next, the magnetic powder was added to each mixture to a final concentration of 500 mg/L, and the mixture was stirred for another 45 s. After that, the flocculant (PAM) was added to the mixtures to 0.3 mg/L followed by stirring for 10 min at 80 rpm. The mixtures were allowed to stand for 15 min and 200 ml of the water was withdrawn within 2 cm from the surface. The TP concentrations and turbidity in the samples were then measured to determine the optimum concentration of PAC that resulted in the maximal removal of TP and turbidity. The experiment was repeated using 0.3 mg/L of PAM and the PAC concentration that yielded the maximum TP and turbidity removal rates while varying the concentration of the magnetic powder from 300 to 700 mg/L. Finally, the concentrations of the PAC and magnetic powder that gave the maximal TP and turbidity removal rates were used to determine the optimum concentration of PAM by conducting the test under different concentrations of PAM (0 to 0.6 mg/L).

Response surface methodology design

The statistical design of the experiments was performed using the Design Expert software (version 10.0). The Box-Behnken Design (BBD) and RSM were applied to optimize three variables: the coagulant (PAC), the magnetic powder, and the flocculant (PAM) (Barekati-Goudarzi et al. 2016). Data from the initial test were used to set the center points and select the ranges of the PAC, the magnetic powder, and the PAM concentrations, which were 10 to 40 mg/L, 300 to 900 mg/L, and 0 to 0.3 mg/L, respectively. The levels of the three variables are shown in Table 1. The optimized result generated by the RSM was then used to perform an additional experiment as further confirmation. In all cases, the TP and turbidity removals were analyzed as the responses.

Table 1 Experimental range, levels, and coded names of the three variables

Data analysis

The designed RSM model included 15 runs. The modeling and analysis of the experimental data were performed using Design Expert 10.0. To optimize the three variables (PAC, magnetic powder, and PAM concentrations) in the process of magnetic coagulation-flocculation, the data were fitted to a second-order polynomial model (Khalid et al. 2019). The quadratic equation model for the predicted optimal conditions is expressed by the following equation (Zhang et al. 2018).

$$Y={\beta }_{0}+\sum \nolimits_{i=1}^{n}{\beta }_{i}{x}_{i}+\sum \nolimits_{i=1}^{n}{\beta }_{ii}{x}_{i}^{2}+\sum \nolimits_{i=1}^{n-1}\sum \nolimits_{j=i+1}^{n}{\beta }_{ij}{x}_{i}{x}_{j}$$
(1)

where Y, β0, βi, βii, and βij are the predicted response, regression constant coefficient, linear coefficient, quadratic coefficient, and interaction coefficient, respectively; and xi and xj are the coded values of the variables.
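To make the structure of Eq. (1) concrete, the following sketch builds a three-factor Box-Behnken design in coded units and fits the ten-term quadratic model by ordinary least squares. It is an illustration only: the three center replicates are an assumption inferred from the 15 runs stated above, the responses are noise-free synthetic values generated from the rounded coefficients reported later in Eq. (2) rather than the measured data, and NumPy's least-squares solver stands in for Design Expert.

```python
import itertools

import numpy as np


def box_behnken_3(center_runs=3):
    """Three-factor Box-Behnken design in coded units (-1, 0, +1)."""
    runs = []
    # For each pair of factors, take the 2x2 factorial with the third at 0.
    for i, j in itertools.combinations(range(3), 2):
        for a, b in itertools.product((-1.0, 1.0), repeat=2):
            point = [0.0, 0.0, 0.0]
            point[i], point[j] = a, b
            runs.append(point)
    # Replicated center points (the number of replicates is an assumption).
    runs += [[0.0, 0.0, 0.0]] * center_runs
    return np.array(runs)


def quadratic_design_matrix(x):
    """Columns: 1, A, B, C, AB, AC, BC, A^2, B^2, C^2."""
    a, b, c = x[:, 0], x[:, 1], x[:, 2]
    return np.column_stack(
        [np.ones(len(x)), a, b, c, a * b, a * c, b * c, a**2, b**2, c**2]
    )


design = box_behnken_3()
X = quadratic_design_matrix(design)

# Synthetic, noise-free responses from the rounded coefficients of Eq. (2).
beta_true = np.array(
    [83.04, 2.41, 0.42, 0.71, 0.15, 0.31, 0.29, -2.55, -2.42, -2.32]
)
y = X @ beta_true

# Ordinary least squares recovers the coefficients of the quadratic model.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(design.shape)
print(np.round(beta_hat, 2))
```

Because the design matrix has full column rank, the 15 runs are enough to estimate all ten coefficients, leaving the remaining degrees of freedom for the lack-of-fit and pure-error terms used in the ANOVA.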

The experimental data were analyzed using an analysis of variance (ANOVA) to determine the interactions between the variables and the responses. The coefficient of determination (R2) was used to express the quality of the fit produced by the RSM model (Momeni et al. 2018). The statistical significance of the model was assessed using an F-test. P-values at the 95% confidence level were used to determine the significance of the variables (Pambi and Musonge 2016). Differences between the predicted and experimental results were considered statistically significant at P < 0.05, and the smaller the P-value, the more significant the corresponding model or term (Tetteh and Rathilal 2018). For each of the tested variables (coagulant, magnetic powder, and flocculant), a 3D surface plot and its respective contour plot were also obtained at the three levels.

Artificial neural network

The ANN model, a feed-forward network, was developed with a back-propagation (BP) algorithm using MATLAB 2014a software. The concentrations of PAC, magnetic powder, and PAM were selected as the three process variables for the input layer, whereas the removals of TP and turbidity were chosen as the output layer neurons to build a three-layer BP-ANN model. The experimental data were normalized between 0 and 1. The tansig function and the purelin function were selected as the transfer functions of the hidden layer and output layer, respectively. To determine the optimal network topology, networks with different numbers of hidden neurons were trained and the best-performing number was selected, after which the number of training sessions was varied to select the network with the best performance. The optimal network model was selected by comparing metrics such as R2, the root mean square error (RMSE), and the average absolute deviation (AAD).
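The forward computation of such a three-layer network can be sketched in a few lines. The example below is written in Python with NumPy rather than MATLAB and is only an illustration: the weights are random and untrained, the hidden-layer size of 7 follows the 3 × 7 × 2 architecture reported later in the paper, and the normalization bounds are taken from the dose ranges of the BBD. It relies on the fact that tansig is mathematically identical to the hyperbolic tangent and purelin is the identity.

```python
import numpy as np


def min_max_scale(x, lo, hi):
    """Normalize values to the range 0..1 using fixed bounds."""
    return (x - lo) / (hi - lo)


def forward(x, w1, b1, w2, b2):
    """Forward pass: tanh hidden layer (tansig), linear output (purelin)."""
    hidden = np.tanh(x @ w1 + b1)
    return hidden @ w2 + b2


rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 7, 2  # PAC, magnetic powder, PAM -> TP, turbidity

# Random, untrained weights used only to show the shapes involved.
w1 = rng.normal(size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(size=(n_hidden, n_out))
b2 = np.zeros(n_out)

# Two example dose combinations: PAC (mg/L), magnetic powder (mg/L), PAM (mg/L).
doses = np.array([[25.0, 600.0, 0.15], [40.0, 900.0, 0.30]])
lo = np.array([10.0, 300.0, 0.0])
hi = np.array([40.0, 900.0, 0.3])

x = min_max_scale(doses, lo, hi)
out = forward(x, w1, b1, w2, b2)
print(x)
print(out.shape)
```

Training (for example with the Levenberg-Marquardt routine used in the study) would adjust `w1`, `b1`, `w2`, and `b2`; the sketch covers only the prediction step.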

Similar to the RSM model, the BP-ANN model used the experimental data generated from the BBD design to determine the optimum architecture and network parameters. The trainlm function was then selected for training the BP-ANN model. The experimental data were divided into three portions: the training set, the validation set, and the testing set, representing 60%, 20%, and 20% of the data, respectively. The aim of this three-part split was to evaluate how well the model predicted unseen experimental data that were not used for training and to improve the generalization capability of the model.
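As a rough illustration of the 60/20/20 division, the sketch below splits the indices of 15 runs into training, validation, and testing subsets. The random assignment and the seed are assumptions made for the example; the text does not state how the runs were allocated to each subset.

```python
import numpy as np


def split_indices(n_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly split sample indices into training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(15)
print(len(train_idx), len(val_idx), len(test_idx))
```

With only 15 runs, the validation and testing subsets contain three runs each, so their R2 values are sensitive to which runs happen to be held out.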

Results and discussion

Conventional coagulation/magnetic coagulation process evaluation

The results obtained using the conventional coagulation and magnetic coagulation methods under their best conditions were compared to evaluate which process gave better removals under the same settling times. Figure 1 shows the TP and turbidity removals using the two methods under settling times of 1, 2, 5, 10, 15, and 30 min.

Fig. 1 Effect of settling time on TP and turbidity removal by conventional and magnetic coagulation

In general, magnetic coagulation gave better removals than conventional coagulation. In addition, the settling time required for magnetic coagulation was much shorter than that of the conventional coagulation process. For conventional coagulation, both the TP and turbidity removals kept increasing with the settling time, and the highest removals were obtained and had stabilized by 30 min of settling. Under the same conditions, the sedimentation in magnetic coagulation was nearly complete at approximately 5 min and remained stable after 15 min. Similar behavior of magnetic coagulation was reported by Dai (2017) and Ding et al. (2021). This might be explained by the formation of magnetic flocs with the magnetic seeds as the core. The coagulant would hydrolyze and wrap around the magnetic powder particles, which would greatly increase the chance of collision between pollutants and flocs, thus increasing the pollutant removal while also accelerating floc settling. This would be a considerable advantage for engineering applications. Therefore, it can be concluded that adding magnetic powder to enhance coagulation can improve pollutant removal and significantly reduce the settling time, which plays an important role in advanced treatment. Thus, the magnetic coagulation process was chosen and further optimized by the RSM.

RSM modeling and optimization

The results obtained from the initial magnetic coagulation test revealed that the PAC, magnetic powder, and PAM concentrations were the primary factors that affected the TP and turbidity removals from the wastewater samples. The TP and turbidity removals were further investigated by performing 15 experimental runs according to the BBD design (Table 2), and the optimum conditions were obtained by the RSM (Table S1). The experimental data acquired from the BBD were subsequently used to obtain the predictive values via the RSM and the BP-ANN models.

Table 2 Box-Behnken design with the observed data and predicted responses

The values predicted by the RSM displayed good agreement with the actual data, with coefficients of determination close to one (R2 = 0.9922 for TP and 0.9920 for turbidity; Table 3). The results were analyzed using Design Expert 10.0, and the second-order regression equations for the TP and turbidity removal rates were obtained as Eq. (2) and Eq. (3), respectively. The fitting quality of the RSM model was also evaluated by an analysis of variance (ANOVA). The coefficients of the quadratic equations and the p-values for the TP and turbidity removals were obtained from the experimental data. In both equations, A, B, and C represent the concentrations of the coagulant, the magnetic powder, and the flocculant, respectively. A positive coefficient indicates a positive effect of the term on the response, while a negative coefficient indicates a negative effect. Based on the linear terms, the PAC, magnetic powder, and PAM all exerted a positive effect on the efficiency of TP removal, but only the PAM exerted a positive effect on turbidity removal. Equations (2) and (3) are as follows:

Table 3 ANOVA for the response quadratic model of TP removal and turbidity removal
$${Y}_{TP}=83.04+2.41A+0.42B+0.71C+0.15AB+0.31AC+0.29BC-2.55{A}^{2}-2.42{B}^{2}-2.32{C}^{2}$$
(2)
$${Y}_{turbidity}=60.38-3.28A-1.61B+1.66C+9.21AB+4.99AC+8.12BC-8.33{A}^{2}-15.26{B}^{2}-5.83{C}^{2}$$
(3)
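Equations (2) and (3) can be evaluated directly once the doses are converted to coded units. The sketch below assumes that A, B, and C are coded from -1 to +1 over the BBD ranges (10 to 40 mg/L PAC, 300 to 900 mg/L magnetic powder, 0 to 0.3 mg/L PAM) and evaluates both equations at the optimum doses reported in this paper (28.42 mg/L, 623 mg/L, and 0.18 mg/L). The helper `to_coded` is illustrative and not part of the original analysis.

```python
def to_coded(actual, low, high):
    """Map an actual dose onto the coded -1..+1 scale of the design."""
    center = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return (actual - center) / half_range


def tp_removal(a, b, c):
    """Eq. (2): predicted TP removal (%) in coded units."""
    return (83.04 + 2.41 * a + 0.42 * b + 0.71 * c
            + 0.15 * a * b + 0.31 * a * c + 0.29 * b * c
            - 2.55 * a**2 - 2.42 * b**2 - 2.32 * c**2)


def turbidity_removal(a, b, c):
    """Eq. (3): predicted turbidity removal (%) in coded units."""
    return (60.38 - 3.28 * a - 1.61 * b + 1.66 * c
            + 9.21 * a * b + 4.99 * a * c + 8.12 * b * c
            - 8.33 * a**2 - 15.26 * b**2 - 5.83 * c**2)


# Reported optimum doses converted to coded units.
a = to_coded(28.42, 10.0, 40.0)
b = to_coded(623.0, 300.0, 900.0)
c = to_coded(0.18, 0.0, 0.3)

print(f"coded A={a:.3f}, B={b:.3f}, C={c:.3f}")
print(f"TP removal: {tp_removal(a, b, c):.2f} %")
print(f"Turbidity removal: {turbidity_removal(a, b, c):.2f} %")
```

Under these assumptions the two values fall within one percentage point of the reported predictions (83.28% and 59.80%); an exact match is not expected because the published coefficients are rounded to two decimals.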

Optimization of TP removal

Figure 2 shows the effects of the three factors (A, B, and C) and the interaction between any two factors on TP removal. It can be seen from Fig. 2a that when the magnetic powder dose was held constant, the removal efficiency of TP gradually increased as the PAC concentration increased from 10 mg/L and peaked at approximately 30 mg/L. With further PAC addition, the TP removal efficiency decreased slowly. However, the TP removal remained above 86%, probably because the concentration of PAC was the primary factor in the removal of TP from water, and once the coagulant reached a certain concentration, the TP removal no longer improved. The addition of excess PAC would lead to the formation of too many PAC hydrolysis products, which in turn can interfere with the adsorption and precipitation of TP, resulting in a lower removal rate (Krupińska 2020).

Fig. 2 a Response plot showing the effects of PAC and magnetic powder on TP removal and b response plot showing the effects of PAM and PAC on TP removal

When the PAC concentration was fixed and the magnetic powder dose increased from 300 to 900 mg/L, the TP removal efficiency was highest at a magnetic powder dose of 600–700 mg/L and then gradually decreased. The magnetic powder itself does not have the hydrolytic properties of a coagulant; the purpose of adding it to water is to form magnetic flocs with the magnetic powder as the core, which can better adsorb suspended matter and make it settle. Therefore, the TP removal rate was improved by appropriately increasing the magnetic powder dose. In addition, it can be seen from Fig. 2 that when the concentration of the magnetic powder was 300 mg/L, a PAC concentration of 40 mg/L could only reach a removal rate of approximately 79%. However, as the concentration of magnetic powder gradually increased, the PAC concentration required gradually decreased, and the maximum TP removal increased. This indicated that the magnetic powder could considerably reduce the use of PAC, reduce the cost, and improve pollutant removal. However, the best TP removal in the magnetic coagulation process depends on forming a good magnetic composite floc. Thus, there is an optimal matching ratio between the amount of magnetic powder and the PAC concentration. If either the magnetic powder or the PAC is overdosed, the pollutant removal will decrease.

Figure 2b shows the interaction between PAC and PAM. It can be seen from the graph that the TP removal changed rapidly with the PAC concentration and reached its peak when the PAC was 30 mg/L. In contrast, a change in the PAM concentration had less of an effect on the TP removal rate, with the highest point near 0.15 mg/L. It could be concluded that PAC was the most important factor influencing TP removal by magnetic coagulation. This was because PAC hydrolysis produced positively charged Al3+ species that could effectively adsorb and subsequently complex phosphorus for a better sedimentation process.

Optimization of turbidity removal

Figure 3 shows the interaction effects of A, B, and C on turbidity removal. Similarly, the contour plot shows a circle between two factors in the plane diagram, and this indicates that the interaction effects were significant in terms of turbidity (Fig. 3a). The turbidity removal rate increased slowly with an increase in the PAC dosage and then decreased rapidly once the PAC dosage exceeded a certain concentration. The magnetic powder had the same effect on turbidity. This is because when the magnetic powder dosage is too large and approaches saturation, residual magnetic powder that does not adsorb pollutants and settle remains suspended in the water and interferes with turbidity removal. This also explains why turbidity was used as an indicator of the magnetic coagulation process.

Fig. 3 a Response plot showing the effects of the PAC and magnetic powder on turbidity removal and b response plot showing the effects of the PAM and magnetic powder on turbidity removal

As shown in Fig. 3b, at PAM concentrations less than 0.12 mg/L, it was difficult to obtain a satisfactory turbidity removal rate regardless of the variation in the magnetic powder. This indicated that magnetic coagulation removes pollutants mainly through adsorption onto the adsorption sites of the hydrolysis products of the coagulant and flocculant. When the concentration of the polymer flocculant was too low, it was difficult to achieve effective removal even if the amount of magnetic powder was increased. The above results demonstrate that the turbidity of the effluent can directly reflect whether the amount of magnetic powder is excessive. More precisely, the amount of magnetic powder is an important factor that affects the turbidity of the effluent.

Optimization of the process parameters

The optimum conditions for both TP and turbidity removal were discovered by using the three variables, namely, concentrations of the PAC, magnetic powder, and PAM. To obtain the optimum conditions for multiple responses, the desired conditions for each response were selected from the options provided (Gadekar and Ahammed 2019). In this study, the desired goal for turbidity removal was set as “maximize,” while TP removal was set as “in range.” Finally, the optimum conditions were obtained at a PAC of 28.42 mg/L, magnetic powder of 623 mg/L, and PAM of 0.18 mg/L. The predicted TP and turbidity removals were 83.28% and 59.80%, respectively.

ANOVA

The data generated by ANOVA in the RSM model for TP and turbidity removals are summarized in Table 3. The models were statistically significant, as indicated by the high F values (70.69 and 69.17) and low p-values (< 0.0001 and 0.0001). The lack of fit (LOF) measures the variation of the data around the fitted model, and poor fitting of the model to the data would result in a significant LOF. The large p-values for the lack of fit (Prob > F = 0.5289 and 0.4576) showed that the LOF was insignificant, which implied that the quadratic model adequately described the relationship between the variables and the process responses.

For both TP and turbidity removal, the model revealed a significant effect exerted by each of the three variables and their interactions. The model significance was assessed based on the F-value and p-value. A p-value below 0.05 indicated a significant interaction between two of the three variables, and the smaller the p-value, the more significant the interaction. The model was chosen largely on the basis of the experimental design. Variables A, B, and C had significant effects on both removals, and the quadratic terms A2, B2, and C2 were also significant (Table 3).

BP-ANN modeling

Selection of the processing units and the hidden layer neurons

The generalization ability of the BP-ANN model is closely related to the number of hidden layers and the number of neurons in them. Too few hidden neurons limit the accuracy of the network, whereas too many can lead to overfitting and poorer generalization. The mean square error (MSE) values for different numbers of neurons in the hidden layer are shown in Table S2, and the neuron number with the smallest MSE was selected to construct the BP-ANN model. Therefore, 3 × 7 × 2 was taken as the optimal architecture for the model (Fig. S1).

Network training and evaluation

To analyze the fitting and predictive performance of the two models, namely, the RSM and the BP-ANN models, four indicators were calculated for each model: the root mean squared error (RMSE), the average absolute deviation (AAD), the mean squared error (MSE), and the coefficient of determination (R2) (Table 4). The MSE values of the TP and turbidity obtained from the RSM and the BP-ANN model were 0.23, 1.23, and 4.45, 11.56, respectively. Larger values of the RMSE, AAD, and MSE are associated with a greater model prediction error, consistent with a previous study (Ghritlahre and Prasad 2018). The results in Table 4 show that the RSM model gave lower values for the AAD, MSE, and RMSE than the BP-ANN model. Thus, from the perspective of these three indicators, the RSM model could be more reliable, yielding a better fit and prediction than the BP-ANN model under that specific water quality (Ganapathy et al. 2020). The value of R2, in turn, indicates the goodness of fit between the experimental data and the data predicted by the model. Both the RSM (R2 = 0.9922 for TP and 0.9920 for turbidity) and the BP-ANN (R2 = 0.9461 for TP and 0.9461 for turbidity) models fitted the data well, as shown in Table 4, because the R2 values were relatively close to one. The performance of the BP-ANN was further demonstrated by the R2 values of the training, validation, and testing sets. The validation R2 of the response given by the BP-ANN was 0.96 (Fig. 4), which implies a good, stable prediction ability for the TP and turbidity removal by magnetic coagulation within the given range of concentrations.
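The four indicators can be computed as in the sketch below. The paper does not give explicit formulas, so the definitions here are assumptions based on common usage: in particular, the AAD is taken as the mean absolute relative deviation expressed in percent. The three data points are hypothetical values chosen only to exercise the functions and are not the data of Table 2.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(mse(y_true, y_pred)))


def aad_percent(y_true, y_pred):
    """Average absolute deviation as a percentage of the observed value
    (one common definition, assumed here)."""
    return float(100.0 * np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))


def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical observed and predicted removals (%), for illustration only.
y_true = np.array([80.0, 82.0, 84.0])
y_pred = np.array([81.0, 82.0, 83.0])

print(mse(y_true, y_pred), rmse(y_true, y_pred))
print(aad_percent(y_true, y_pred), r_squared(y_true, y_pred))
```

Because the RMSE is the square root of the MSE, the two always rank models in the same order; the AAD adds information by scaling each error to the size of the observed value.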

Table 4 Comparison of the predictive capabilities of the RSM and BP-ANN models
Fig. 4 BP-ANN model with the training, validation, test, and all prediction sets

The experimental and predicted data given by the two models in terms of the TP and turbidity removals are shown in Fig. 5a and c. For TP removal, the actual results are slightly larger than the values predicted by the BP-ANN at experimental runs 6, 12, and 13. For turbidity removal, the BP-ANN errors are larger than those of the RSM at runs 3, 6, 11, and 12. This indicates that the RSM is superior to the single BP-ANN model. The residuals of the two models for the removal rates are shown in Fig. 5b and d, where the residuals of the RSM follow a more regular curve than those of the BP-ANN model. The smaller the fluctuation of the residuals around zero, the more accurate the model prediction. This indicates that the RSM could optimize the process better and had a better fitting and prediction performance for pollutant removal under the specific water quality than the single BP-ANN. However, the limitation of the RSM is its inability to incorporate non-controllable variables. Thus, by exploiting the self-learning and updating ability of the BP-ANN, it is possible to better predict the removal by the magnetic coagulation process based on the RSM model. Therefore, the combination of the two models appeared to offer a good fit together with enhanced flexibility, feasibility, and accuracy in practical applications for wastewater treatment plants.

Fig. 5 a Comparison of the experimental with the predicted value for TP removal by the RSM and BP-ANN models, b distribution of the residual value of TP removal, c comparison of the experimental with the predicted value for turbidity removal by the RSM and BP-ANN models, and d distribution of the residual values of turbidity removal

RSM-ANN modeling and validation

The RSM-ANN model was constructed using the same experimental data that were used for the BP-ANN model, with two additional uncontrollable variables: the raw water TP concentration and the raw water turbidity. The same back-propagation algorithm was applied for the RSM-ANN modeling. The RSM-ANN was tested by varying the number of hidden layers and neurons in order to select the optimum topology by comparing the MSE and the four R2 values. R2 (all) increased gradually as the number of neurons in the hidden layer increased from seven to 10 and decreased gradually when the number of neurons exceeded 10. The training R2 values of the topologies 5 × 7 × 2 and 5 × 8 × 2 were 0.9998 and 1, respectively, which were higher than that of the topology 5 × 10 × 2 (0.9997). However, validation is the most important indicator in the analysis of network performance. The MSE values were smaller when the number of neurons was 10, 12, and 14, and their training R2 values were all close to 1. However, with 12 and 14 neurons, the validation R2, testing R2, and R2 (all) were further from 1 than with 10 neurons. This indicated overfitting during training (12 and 14 neurons), resulting in a decrease in the model prediction performance. Therefore, 5 × 10 × 2 was used as the optimal architecture for the model (Table 5). The training and predicted R2 were 0.9997 and 0.9336 (Fig. 6), respectively, which were closer to one than the predicted R2 of the RSM (0.9175, 0.9092). In additional experiments conducted under the optimum conditions, the TP and turbidity removals were 88.79 ± 5.45% and 63.48 ± 9.60%, respectively. The corresponding values predicted by the RSM-ANN were 87.71 ± 5.74% and 64.62 ± 10.75%, and those predicted by the RSM were 83.28% and 59.80%. Furthermore, the TP concentration and turbidity of the treated water were 0.17 ± 6.69% mg/L and 2.46 ± 5.09% NTU, respectively, which met the surface water class IV standard. This result indicates that the RSM-ANN provided a better prediction performance than a single RSM model. The RSM-ANN model overcomes the limitations of the RSM while allowing for more practical applications for guiding wastewater treatment plants.
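The gap between each model's prediction and the validation experiment can be made explicit with a short calculation. The sketch below uses only the mean values quoted above (the ± ranges are ignored) and reports the absolute difference in percentage points; the dictionary layout is illustrative.

```python
# Mean removals (%) from the validation experiments and the two models.
experimental = {"TP": 88.79, "turbidity": 63.48}
predicted = {
    "RSM-ANN": {"TP": 87.71, "turbidity": 64.62},
    "RSM": {"TP": 83.28, "turbidity": 59.80},
}

# Absolute prediction error in percentage points for each model and response.
abs_error = {
    model: {name: abs(value - experimental[name]) for name, value in preds.items()}
    for model, preds in predicted.items()
}

for model, errors in abs_error.items():
    for name, err in errors.items():
        print(f"{model:8s} {name:10s} |error| = {err:.2f} percentage points")
```

On these mean values the RSM-ANN predictions differ from the experiments by roughly one percentage point for both responses, compared with about 5.5 (TP) and 3.7 (turbidity) percentage points for the RSM alone.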

Table 5 MSE values and R2 of the different topologies of the RSM-ANN model
Fig. 6 RSM-ANN model with the training, validation, test, and all prediction sets

Conclusion

In this study, a coagulation process enhanced by Fe3O4 as the magnetic powder was employed to further treat the secondary effluent from a WWTP to improve the effluent quality. Multiple responses in terms of the TP and turbidity removals were simultaneously optimized and predicted with the help of the RSM and the BP-ANN approaches. The optimum conditions were obtained using the RSM at a PAC of 28.42 mg/L, magnetic powder of 623 mg/L, and PAM of 0.18 mg/L. Both models had good fitness and prediction abilities, while the RSM provided a slightly better prediction performance under known water quality conditions compared to the single BP-ANN model.

A novel approach combining the RSM and BP-ANN was then developed by adding uncontrollable variables. The RSM-ANN model was optimized and validated using the additional experimental data. Under optimum conditions, the TP and turbidity removals obtained were 88.79 ± 5.45% and 63.48 ± 9.60%, respectively, and the predictions of the RSM-ANN (87.71 ± 5.74% for TP and 64.62 ± 10.75% for turbidity) were closer to the experimental values than the single RSM predictions (83.28% for TP and 59.80% for turbidity). Moreover, the effluent TP and turbidity were 0.17 ± 6.69% mg/L and 2.46 ± 5.09% NTU, respectively, which were far below the water quality standard of Class IV. Therefore, the results of the study showed that the magnetic coagulation process could be used as an advanced treatment to improve the discharged water quality from a WWTP, and the combined RSM-ANN could be an effective and powerful approach for predicting TP and turbidity with instant adjustments of the operational variables.