1 Introduction

Soil aquifer treatment (SAT) is considered an attractive unconventional water resource for Egypt, which is suffering from water scarcity. SAT can both treat wastewater and recharge groundwater aquifers. While guidelines are available for the use of SAT systems in Egypt for the removal of nitrogen and organic matter, no guidelines are available for the SAT removal potential of organic micropollutants (OMPs). This chapter discusses this issue and provides a prediction model for OMP removal in SAT systems, based on the author’s work on soil aquifer treatment systems and analysis models [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21].

2 Soil Aquifer Treatment System

Urbanization and population growth have increased the need for water resource pollution control and proper recharge of soil aquifers. Organic micropollutants (OMPs) may enter the aquifer during groundwater aquifer recharge. OMPs include toxic compounds such as endocrine disruptors, pharmaceutically active compounds, and personal care products. Their entry into aquifer soil increases environmental and health risks. Therefore, the control and elimination of OMPs in soil aquifer treatment systems are of paramount importance to protect the aquifer from pollution. Because of this importance, many studies have been conducted on the removal of these compounds from water resources. Drillia et al. [22] experimentally studied the removal of six different pharmaceutical compounds by various soils for municipal wastewater. Xu et al. [23] experimentally studied the removal of personal care products and pharmaceuticals in agricultural soils by degradation and adsorption. Maeng et al. [24] examined OMP removal during bank filtration and aquifer recharge and recovery, and reported the travel times and distances needed to remove pharmaceutically active compounds. Yu et al. [25] studied the removal of five pharmaceuticals and personal care products (PPCPs) in three different soil types. They also studied the seasonal changes of organic micropollutant compounds in a sewage treatment plant (STP). Personal care products and endocrine-disrupting compounds were removed from California wastewater.

Soil aquifer treatment (SAT) is an attractive option for artificial recharge of groundwater aquifers. However, organic micropollutants (OMPs) are also likely to penetrate the soil aquifer during soil aquifer treatment. Hence, OMP removal in SAT systems is of considerable importance. Figure 1 shows a schematic plan of a soil aquifer treatment system at a sewage treatment plant.

Fig. 1 Schematic plan of soil aquifer treatment system at a sewage treatment plant

Lin et al. [26] studied the removal of heavy metals by soil aquifer treatment at a wastewater treatment plant. Fox et al. [27] experimentally analyzed the organic carbon content during soil aquifer treatment for five different types of soil. Amy and Drewes [28] investigated the removal and transformation mechanisms of organic matter from a wastewater stream during soil aquifer treatment at a wastewater treatment plant. Sharma et al. [29] then analyzed the removal of organic matter from wastewater during soil aquifer treatment. Their results showed that the redox conditions, input flow rate, and residence time are important for the removal of organic matter. Xu et al. [23] studied the characteristics and behavior of dissolved organic matter during soil aquifer treatment. It was also shown that 70% of the organic material is removed during soil aquifer treatment. Caballero [30] studied the SAT system as a pretreatment process to remove organic micropollutant compounds from the wastewater stream. Sharma et al. [29] analyzed the effects of horizontal roughing filtration, coagulation, and sedimentation as pretreatment operations on the mechanism of soil aquifer treatment. It was shown that sedimentation and coagulation lead to less head loss and reduce clogging effects. Abel et al. [31] examined the effects of temperature and redox conditions on the reduction of wastewater organic matter in soil columns and SAT systems. Abel et al. [31] also studied the reduction of organic matter, pharmaceutical compounds, and nitrogen in the soil aquifer treatment process in a laboratory model. Onesios-Barry et al. [32] conducted an experimental study of the SAT system for the removal of pharmaceuticals and personal care products (PPCPs) at different concentrations in wastewater treatment plants. Suzuki et al. [33] experimentally studied the mechanism of organic matter removal and disinfection by-product formation potential in the upper layer of the soil aquifer treatment system.

3 Simulations of SAT Organic Micropollutant Removal

3.1 Model Setup

To capture the change in characteristics of OMPs during infiltration in the vadose zone, Sattar [13] chose a 2D vertical section in the soil beneath the SAT pond. The study domain measured 10 m × 30 m. The upper boundary was a variable flux boundary 5 m wide, centered on the domain width, to simulate the intermittent water infiltration from a spreading basin. The lower boundary was set as free drainage to allow flow to pass below the study domain. OMP attenuation was simulated for 90 days from the day wastewater was applied in the ponds. This time was considered sufficient for the OMP plume to infiltrate through the soil layers and become attenuated.

3.2 Attenuation of OMPs in SAT System

Using the average values of the SAT system parameters presented in Sattar [11], HYDRUS simulations were carried out to model the fate of OMPs in the SAT system under the effects of sorption and biodegradation, both individually and combined. Figure 2 shows the variation of OMP concentration with depth in the vadose zone along the vertical centerline of the study domain. Biodegradation was more efficient than sorption in reducing the contaminant concentration, although the plume sizes for both simulations were almost the same after 90 days. Moreover, the OMP concentration in the topsoil layer was higher when both sorption and biodegradation were considered than when only biodegradation was considered. This was attributed to sorption, which distributes the contaminant mass between the sorbed phase and the liquid phase soon after the contaminant injection, and which explains the lower concentration after day 86. In contrast, biodegradation is active only for the contaminant mass in the liquid phase, so the mass sorbed onto the soil is not available for biodegradation. This leads to higher biodegradation rates in the biodegradation-only case than in the combined sorption and biodegradation case.

Fig. 2 Variation of OMPs along depth

4 Extreme Learning Machine

Various studies have been conducted on the use of artificial intelligence in the performance evaluation of SAT systems. Recently, Sattar [11] predicted organic micropollutant removal in soil aquifer treatment systems using a gene expression programming (GEP) model.

One of the most popular techniques in the data mining field is the feedforward neural network (FFNN), which is trained by a gradient-descent algorithm such as back-propagation (BP). The disadvantages of FFNN-BP that lead to its low performance are an imprecise learning rate, a slow rate of convergence, and the presence of local minima. Therefore, Huang et al. [34] introduced a new training algorithm for single hidden layer feedforward neural networks (SLFNs), namely, the extreme learning machine (ELM). ELM requires the user to set only the activation function type and the number of hidden nodes, whereas the input weights and hidden layer biases, which other algorithms adjust during execution, are assigned randomly. Compared to other learning algorithms such as BP, ELM learns very fast and shows good generalization performance. In this study, the removal mechanism of organic micropollutants (OMPs) in soil aquifer treatment (SAT) is modeled using the ELM method.

4.1 Architecture of ELM

Huang et al. [34] presented the extreme learning machine as a new algorithm to train the single-layer feedforward neural network (SLFFNN), as it does not require iterative tuning and has the ability to achieve global minima. Using ELM to train the SLFFNN leads to a significant reduction in training time compared to algorithms based on gradient descent. Also, using ELM as the training algorithm does not require additional parameters such as a stopping criterion and a learning rate. Experimental observations by Huang et al. [34] showed that ELM has an excellent universal approximation ability and useful generalization. In an SLFFNN with random hidden nodes, the input data set X and the actual output Y are determined first. Subsequently, the number of hidden nodes K and the type of activation function g(∙) are selected. Then, the input weights W and biases b are assigned randomly. Finally, the hidden layer matrix H is computed, and the output weights β are calculated analytically. Input variables can be defined as the following matrix:

$$ X={\left[\begin{array}{ccc}{X}_{11}& \cdots & {X}_{1j}\\ \vdots & \ddots & \vdots \\ {X}_{n1}& \cdots & {X}_{nj}\end{array}\right]}_{n\times j} $$
(1)

where n and j are the numbers of samples and variables, respectively. Also, the actual output is defined as follows:

$$ Y={\left[{Y}_1\kern0.5em \cdots \kern0.5em {Y}_n\right]}^T $$
(2)

Next, a positive integer number of hidden nodes K and a differentiable activation function g(∙) should be defined. An input weight matrix W is then generated randomly to connect the input and hidden nodes:

$$ W={\left[\begin{array}{ccc}{W}_{11}& \cdots & {W}_{1K}\\ \vdots & \ddots & \vdots \\ {W}_{j1}& \cdots & {W}_{jK}\end{array}\right]}_{j\times K} $$
(3)

The hidden layer matrix H is calculated by multiplying the input matrix X by the weight matrix W:

$$ H= XW $$
(4)

Applying the activation function g(∙) to H gives the hidden layer output matrix:

$$ {H}_{\mathrm{out}}=g(H) $$
(5)

The hidden layer output matrix \( {H}_{\mathrm{out}} \) is connected to the output vector \( \widehat{Y} \) through the output layer weights β. The ELM network output is calculated as follows:

$$ \widehat{Y}={H}_{\mathrm{out}}\beta $$
(6)

and

$$ \left\Vert {H}_{\mathrm{out}}\widehat{\beta}-Y\right\Vert =\underset{\beta }{\min}\left\Vert {H}_{\mathrm{out}}\beta -Y\right\Vert $$
(7)

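The minimum-norm least-squares solution is calculated as follows:

$$ \widehat{\beta}={H}_{\mathrm{out}}^{+}Y $$
(8)

where \( {H}_{\mathrm{out}}^{+} \) is the Moore-Penrose generalized inverse of \( {H}_{\mathrm{out}} \), which, when \( {H}_{\mathrm{out}}^T{H}_{\mathrm{out}} \) is invertible, is calculated as follows:

$$ {H}_{\mathrm{out}}^{+}={\left({H}_{\mathrm{out}}^T{H}_{\mathrm{out}}\right)}^{-1}{H}_{\mathrm{out}}^T $$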
(9)
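
The training steps in Eqs. (1)–(9) can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in this study: the tanh activation, the uniform [-1, 1] range of the random weights, the additive bias term, and the function names `train_elm` and `predict_elm` are all assumptions, and the demonstration data are synthetic.

```python
import numpy as np


def train_elm(X, Y, K, seed=0):
    """Train a single-hidden-layer ELM, following Eqs. (1)-(9).

    X: (n, j) input matrix, Y: (n,) observed outputs, K: hidden nodes.
    """
    rng = np.random.default_rng(seed)
    j = X.shape[1]
    # Random input weights, Eq. (3); the [-1, 1] range is an assumption.
    W = rng.uniform(-1.0, 1.0, size=(j, K))
    # Random hidden biases b; adding them to XW is an assumption,
    # since Eq. (4) is written without a bias term.
    b = rng.uniform(-1.0, 1.0, size=K)
    # Hidden layer output, Eqs. (4)-(5), with g = tanh (assumed).
    H_out = np.tanh(X @ W + b)
    # Output weights from the Moore-Penrose inverse, Eq. (8).
    beta = np.linalg.pinv(H_out) @ Y
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Network output, Eq. (6)."""
    return np.tanh(X @ W + b) @ beta


# Synthetic demonstration data (not from the chapter).
rng = np.random.default_rng(1)
X_demo = rng.uniform(-1.0, 1.0, size=(200, 2))
Y_demo = X_demo[:, 0] + 2.0 * X_demo[:, 1]
W, b, beta = train_elm(X_demo, Y_demo, K=30)
Y_hat = predict_elm(X_demo, W, b, beta)
```

Here `np.linalg.pinv` is used instead of the explicit normal-equation form of Eq. (9) because it also handles a rank-deficient hidden layer output matrix.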

4.2 Performance Evaluation Criteria

In this study, the accuracy of the numerical models is investigated using the following statistical indices: root-mean-square error (RMSE), mean absolute relative error (MARE), correlation coefficient (R), BIAS, scatter index (SI), and ρ:

$$ \mathrm{RMSE}=\sqrt{\frac{1}{n^{\prime }}\sum \limits_{i=1}^{n^{\prime }}{\left({T}_{{\left(\mathrm{Predicted}\right)}_i}-{T}_{{\left(\mathrm{Observed}\right)}_i}\right)}^2} $$
(10)
$$ \mathrm{MARE}=\frac{1}{n^{\prime }}\sum \limits_{i=1}^{n^{\prime }}\left(\frac{\left|{T}_{{\left(\mathrm{Predicted}\right)}_i}-{T}_{{\left(\mathrm{Observed}\right)}_i}\right|}{T_{{\left(\mathrm{Observed}\right)}_i}}\right) $$
(11)
$$ \mathrm{BIAS}=\frac{1}{n^{\prime }}\sum \limits_{i=1}^{n^{\prime }}\left({T}_{{\left(\mathrm{Predicted}\right)}_i}-{T}_{{\left(\mathrm{Observed}\right)}_i}\right) $$
(12)
$$ R=\frac{\sum \limits_{i=1}^{n^{\prime }}\left({T}_{{\left(\mathrm{Observed}\right)}_i}-{\overline{T}}_{\left(\mathrm{Observed}\right)}\right)\left({T}_{{\left(\mathrm{Predicted}\right)}_i}-{\overline{T}}_{\left(\mathrm{Predicted}\right)}\right)}{\sqrt{\sum \limits_{i=1}^{n^{\prime }}{\left({T}_{{\left(\mathrm{Observed}\right)}_i}-{\overline{T}}_{\left(\mathrm{Observed}\right)}\right)}^2\sum \limits_{i=1}^{n^{\prime }}{\left({T}_{{\left(\mathrm{Predicted}\right)}_i}-{\overline{T}}_{\left(\mathrm{Predicted}\right)}\right)}^2}} $$
(13)
$$ \mathrm{SI}=\frac{\mathrm{RMSE}}{{\overline{T}}_{\left(\mathrm{Observed}\right)}} $$
(14)
$$ \rho =\frac{\mathrm{SI}}{1+R} $$
(15)

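where \( {T}_{{\left(\mathrm{Observed}\right)}_i} \) are the observed values, \( {T}_{{\left(\mathrm{Predicted}\right)}_i} \) the values predicted by the numerical model, \( {\overline{T}}_{\left(\mathrm{Observed}\right)} \) and \( {\overline{T}}_{\left(\mathrm{Predicted}\right)} \) the means of the observed and predicted values, and n′ the number of observed data.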
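
As an illustration, the six indices in Eqs. (10)–(15) can be computed with NumPy as follows. The function name `evaluate` and the sample arrays are hypothetical and serve only to show the calculation.

```python
import numpy as np


def evaluate(pred, obs):
    """Return the indices of Eqs. (10)-(15) for predicted vs. observed values."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    err = pred - obs
    rmse = np.sqrt(np.mean(err ** 2))              # Eq. (10)
    mare = np.mean(np.abs(err) / obs)              # Eq. (11)
    bias = np.mean(err)                            # Eq. (12)
    d_obs = obs - obs.mean()
    d_pred = pred - pred.mean()
    r = np.sum(d_obs * d_pred) / np.sqrt(
        np.sum(d_obs ** 2) * np.sum(d_pred ** 2))  # Eq. (13)
    si = rmse / obs.mean()                         # Eq. (14)
    rho = si / (1.0 + r)                           # Eq. (15)
    return {"RMSE": rmse, "MARE": mare, "BIAS": bias,
            "R": r, "SI": si, "rho": rho}


# Hypothetical values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = evaluate(predicted, observed)
```

Note that MARE, as written in Eq. (11), divides by the observed value, so it is undefined when an observation equals zero.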

5 ELM Prediction of SAT Organic Micropollutant Removal

Table 1 shows the parameters controlling the operation of a SAT system and the removal of OMPs, while Table 2 ranks the parameters according to their contribution to the removal of OMPs in a SAT system [35]. Three parameters had the highest ranking: first-order biodegradation rate, saturated hydraulic conductivity, and dry to wet ratio. These highly influential parameters were chosen as predictors for developing prediction models in Sattar [11] and in this study. Of the 50,000 data sets simulated by Sattar [11], 25,000 (50%) were used to develop the models, 12,500 (25%) were used to test the models, and 12,500 (25%) were used to validate the developed models.
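
The 50/25/25 partition described above can be sketched as follows. The random shuffling and the fixed seed are assumptions; the source does not state how the data sets were assigned to each subset, and `split_indices` is a hypothetical helper name.

```python
import numpy as np


def split_indices(n_samples, fractions=(0.50, 0.25, 0.25), seed=0):
    """Split sample indices into development, testing, and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)   # random assignment (assumed)
    n_dev = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    dev = order[:n_dev]
    test = order[n_dev:n_dev + n_test]
    valid = order[n_dev + n_test:]       # remainder goes to validation
    return dev, test, valid


dev_idx, test_idx, valid_idx = split_indices(50_000)
```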

Table 1 Parameters controlling the fate and transport of OMPs in SAT system [11]
Table 2 Ranking parameters controlling the fate and transport of OMPs in SAT system according to their contribution in system output uncertainty [11]

To develop an OMP attenuation prediction model, the OMP plume mass, normalized mass ratio, and zero concentration depth are addressed. The mass stored in the contaminant plume can be calculated as:

$$ \mathrm{Plume}\ \mathrm{Mass}=V\times \sum \limits_{i=1}^N{C}_i\cdot {\theta}_i $$
(16)

where N is the total number of 2D finite element (FE) mesh nodes in the study domain, i is the node index, \( {C}_i \) is the contaminant concentration at node i, \( {\theta}_i \) is the unsaturated moisture content at node i, and V is the soil volume of each node.

To assess the OMP removal efficiency of the SAT system, the normalized plume mass has to be calculated; a system with high removal efficiency yields a smaller normalized mass. The normalized plume mass is the ratio of the plume mass to the cumulative total injected mass over time.
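
A minimal sketch of Eq. (16) and of the normalized plume mass is given below. The function names and the three-node example values are hypothetical, and a uniform soil volume V per node is assumed, as in Eq. (16).

```python
import numpy as np


def plume_mass(conc, theta, node_volume):
    """Eq. (16): V times the sum over nodes of C_i * theta_i."""
    conc = np.asarray(conc, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return node_volume * np.sum(conc * theta)


def normalized_plume_mass(conc, theta, node_volume, injected_mass):
    """Ratio of the plume mass to the cumulative total injected mass."""
    return plume_mass(conc, theta, node_volume) / injected_mass


# Hypothetical three-node example, for illustration only.
mass = plume_mass([1.0, 0.5, 0.0], [0.3, 0.2, 0.1], node_volume=2.0)
ratio = normalized_plume_mass([1.0, 0.5, 0.0], [0.3, 0.2, 0.1],
                              node_volume=2.0, injected_mass=1.6)
```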

To estimate the plume mass, normalized mass ratio, and zero concentration depth, the five parameters presented in Table 3 are used as ELM model inputs. Therefore, a total of 15 different ELM models are introduced. It should be noted that Monte Carlo simulation (MCS) with 1,000 realizations is used to generate random inputs for the ELM models and to determine the uncertainty of the plume mass, normalized mass ratio, and zero concentration depth. Sattar [11] stated that the plume mass of organic micropollutants in the soil, 90 days after the start of SAT operation, is calculated as follows:

$$ \mathrm{Plume}\kern0.17em \mathrm{mass}=6.14{K}_s^{1.1}{C}_0{\mu}_1^{-1}{\mathrm{DWR}}^{1/2} $$
(17)
Table 3 The quantities needed for modeling the OMP removal by ELM model

Here, \( {K}_s \) is the saturated hydraulic conductivity, \( {C}_0 \) the contaminant concentration, \( {\mu}_1 \) the first-order biodegradation rate, and DWR the dry to wet ratio, which describes the intermittent application of wastewater in the soil aquifer treatment system. Also, to calculate the normalized mass ratio, the following equation was used:

$$ \mathrm{Mass}\ \mathrm{ratio}=0.0055{C}_0{\mu}_1^{-1}{\mathrm{DWR}}^{1/5} $$
(18)

The minimum depth under a SAT system required to remove more than 98% of the contamination, referred to as the zero concentration depth, is defined as follows:

$$ {Y}_{\mathrm{zero}}=0.17{e}^{{\mathrm{DWR}}^{0.5}{K}_s^{-1}}{\mu}_1^{-1/2}{\mathrm{HLR}}^{1/2}{\mathrm{DWR}}^{-3/8}{K}_s^{1/4} $$
(19)

where HLR is the hydraulic loading rate.
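
For illustration, Eqs. (17) and (19) can be evaluated directly as shown below. The function names are hypothetical, and the inputs must be supplied in the units used by Sattar [11], which are not restated here.

```python
import math


def plume_mass_gep(K_s, C_0, mu_1, DWR):
    """Eq. (17): plume mass 90 days after the start of SAT operation."""
    return 6.14 * K_s ** 1.1 * C_0 / mu_1 * DWR ** 0.5


def zero_concentration_depth(K_s, mu_1, HLR, DWR):
    """Eq. (19): zero concentration depth Y_zero."""
    return (0.17 * math.exp(DWR ** 0.5 / K_s)
            * mu_1 ** -0.5
            * HLR ** 0.5
            * DWR ** (-3.0 / 8.0)
            * K_s ** 0.25)
```

Both expressions decrease as the first-order biodegradation rate increases, which is consistent with its high ranking among the controlling parameters.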

5.1 Plume Mass

The plume mass is calculated by Eq. (16). The results of ELM models 1–5 for this parameter were compared with the GEP model provided by Sattar [11]. Table 4 lists the statistical indices for ELM models 1–5 and the Sattar [11] model in the prediction of the plume mass, and Fig. 3 shows the corresponding scatter plots. ELM 2 and ELM 5 have the highest correlation coefficients, while ELM 4 has the lowest (R = 0.964). The SI and ρ values for ELM 4 are 0.811 and 0.413, respectively. Among all ELM models, ELM 5 has the highest correlation coefficient and the lowest errors; its BIAS value is −113,364. In comparison, the MARE and correlation coefficient of the Sattar [11] model are 38.274 and 0.933, respectively.

Table 4 Statistical indices of ELM models 1–5 and [11] to predict the plume mass parameter
Fig. 3 Scatter plot for prediction of plume mass parameter (a) Sattar [11,12,13] (b) ELM 1 (c) ELM 2 (d) ELM 3 (e) ELM 4 (f) ELM 5

5.2 Mass Ratio

Next, the results of ELM models 1–5 in predicting the mass ratio are evaluated. Table 5 shows the statistical indices for the prediction of the mass ratio by the ELM models and the model of [11], and Fig. 4 shows the corresponding scatter plots. Among the ELM models, ELM 1 has the highest RMSE (0.019), with R = 0.974, while ELM 3 has the highest correlation (R = 0.977) and the lowest MARE (1.661). For ELM 3, BIAS and ρ are 0.0036 and 0.239, respectively. In contrast, the GEP model introduced by Sattar [11] has a lower correlation (R = 0.823), with RMSE and MARE values of 0.521 and 341.867, respectively. Therefore, the ELM models predict the mass ratio with acceptable accuracy.

Table 5 Statistical indices of ELM models 1–5 and [11] to predict mass ratio parameter
Fig. 4 Scatter plots for prediction of mass ratio parameter (a) Sattar [11,12,13] (b) ELM 1 (c) ELM 2 (d) ELM 3 (e) ELM 4 (f) ELM 5

5.3 Zero Concentration Depth

The accuracy of ELM models 1–5 in modeling \( {Y}_{\mathrm{zero}} \) is also examined (see Fig. 5). Table 6 shows the statistical indices calculated for the ELM models and the GEP model proposed by Sattar [11]. ELM 1 is the least accurate of the ELM models, with RMSE and ρ values of 1.479 and 0.145, respectively, and a correlation coefficient of 0.959. ELM 5 predicts \( {Y}_{\mathrm{zero}} \) more accurately than the other ELM models; its MARE, BIAS, and ρ values are 0.237, 0.00106, and 0.141, respectively. The model of [11] predicts \( {Y}_{\mathrm{zero}} \) less accurately than the ELM models, with a correlation coefficient of 0.944. Therefore, based on the analysis of the simulation results, the ELM models estimate the plume mass, mass ratio, and \( {Y}_{\mathrm{zero}} \) with reasonable accuracy.

Fig. 5 Scatter plot for prediction of \( {Y}_{\mathrm{zero}} \) parameter (a) Sattar [11,12,13] (b) ELM 1 (c) ELM 2 (d) ELM 3 (e) ELM 4 (f) ELM 5

Table 6 Statistical indices of ELM models 1–5 and [11] to predict the parameter \( {Y}_{\mathrm{zero}} \)

To further analyze the ELM results, the discrepancy ratio (DR) is introduced as the ratio of modeled to measured values, \( \mathrm{DR}={T}_{\left(\mathrm{Predicted}\right)}/{T}_{\left(\mathrm{Observed}\right)} \) [11]. The closer the discrepancy ratio is to 1, the closer the predicted values are to the measured ones. In Table 7, DRmax, DRmin, and DRave are the maximum, minimum, and mean discrepancy ratios. For the plume mass, ELM 4 has the lowest DRave, whereas the average discrepancy ratio of the model of [11] is 39.175. For the mass ratio, ELM 3 has the lowest DRave (2.508), with DRmax and DRmin values of 335.291 and 0.0004, respectively, whereas the DRave of the GEP model [11] is 342.861. For \( {Y}_{\mathrm{zero}} \), ELM 5 has the lowest DRave among the ELM models (1.060), with DRmax and DRmin values of 65.876 and 0.022, respectively. For the model of [11], the DRmax, DRmin, and DRave values are 109.803, 0.545, and 1.250, respectively. Based on the discrepancy ratio analysis, the ELM predictions are closer to the measured values than those of the GEP model introduced by Sattar [11].
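
The discrepancy ratio statistics reported in Table 7 can be computed as in the following sketch. The function name and the sample values are hypothetical.

```python
import numpy as np


def discrepancy_ratio_summary(pred, obs):
    """Return DRmax, DRmin, and DRave for DR = predicted / observed."""
    dr = np.asarray(pred, dtype=float) / np.asarray(obs, dtype=float)
    return {"DRmax": dr.max(), "DRmin": dr.min(), "DRave": dr.mean()}


# Hypothetical values, for illustration only.
summary = discrepancy_ratio_summary([2.0, 1.0, 3.0], [1.0, 2.0, 3.0])
```

Because DRave is an arithmetic mean, a few large overpredictions can dominate it, which is why DRmax and DRmin are reported alongside it.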

Table 7 DRmax, DRmin, and DRave values for ELM models 1–5 and [11]

6 SAT Site Selection in Egypt

SAT systems would be an attractive unconventional water resource in Egypt, especially in rural communities, and Egyptian researchers are aware of this potential. Recently, RIGW [36] published the recommended characteristics of a groundwater aquifer for successful and efficient SAT operation and removal of organic matter. These included an infiltration rate of more than 0.25 m/day, a minimum depth to groundwater of 5 m, high porosity and saturated zone transmissivity, and, most importantly, an aquifer that does not flow into the Nile River. El Arabi et al. [37] have also provided guidelines for the selection of potential SAT sites in Egypt. The primary target of these guidelines was to ensure adequate removal of nitrogen and biochemical oxygen demand (BOD) from treated wastewater. However, OMP removal has not been considered in these guidelines despite the ecological and health risks imposed by the presence of OMPs in Egyptian soils and native groundwater. Figure 6 shows the potential locations for construction of SAT systems in Egypt [38]. The best sites were found on the western fringes of the Nile delta, W1 to W6, as shown in Fig. 6. SAT systems in these areas would improve groundwater quality by reducing salinity and would help treat more than 350 million m³/year of wastewater produced by nearby treatment plants, I–V.

Fig. 6 Potential locations for construction of SAT systems in Egypt [38]

For the potential SAT sites in Egypt (as shown in Fig. 6), the average depth to groundwater is given in Table 8 [36, 39, 40]. Table 8 also presents the results of the Sattar model [11] and of the model developed in this study for the zero concentration depth, i.e., the depth at which the concentration of OMPs reaches zero. The best locations for potential SAT systems with the highest OMP removal efficiency are Wadi El Natrun, Sadat City, and Alexandria, in that order, with an average removal efficiency of 90%. On the other hand, Rashed and Abu Rawash had the shallowest groundwater tables, which prevents the vadose zone soil from completely attenuating OMPs during wastewater infiltration and makes these sites less favorable for SAT system construction. With detailed hydrogeological investigations of the potential SAT sites, the uncertainty in model predictions [41,42,43,44,45] of OMP removal can be significantly reduced, and the influence of SAT operational aspects can be studied.

Table 8 Potential SAT system locations in Egyptian Western delta fringes and corresponding OMP removal

7 Conclusions and Recommendations

Soil aquifer treatment (SAT) is one of the most important methods for removing organic micropollutants (OMPs) at water and wastewater treatment plants. In this study, the mechanism of organic micropollutant removal was evaluated using the extreme learning machine (ELM) method. Five different ELM models were defined for each of the parameters plume mass, mass ratio, and zero concentration depth (\( {Y}_{\mathrm{zero}} \)). The ELM results were also compared with those of the gene expression programming model provided by [11]. Analysis of the numerical model results indicated acceptable accuracy of the ELM models in predicting the pollutant removal efficiency during SAT operation. Moreover, the prediction accuracy of the optimum ELM model was assessed using statistical performance parameters, including MARE, correlation coefficient, and scatter index, which were found to be 5.473, 0.970, and 0.746, respectively. Also, the discrepancy ratio for the parameter \( {Y}_{\mathrm{zero}} \) by the best ELM model was calculated to be 1.060.