1 Introduction

Permeable reactive barrier (PRB) is a well-known technology for groundwater treatment (U.S. EPA 2002; Das 2002; Chandrappa and Das 2012, 2014). It is a passive in-situ treatment wall (porous) filled with a reactive material and installed perpendicular to the groundwater flow in the subsurface (Fig. 1). The PRBs capture the contaminant plumes, break down the contaminants and release the treated water into the surroundings. The contaminants are chemically, physically and/or biologically treated through the main processes of immobilization and transformation depending on the type of the reactive material. Conventionally, zero-valent iron (ZVI) is the reactive material used in PRBs but other materials such as surfactant-modified zeolites and peat moss have also been used (Chandrappa and Das 2012, 2014). ZVI has been widely used in PRBs as it is a highly reactive material and is suitable for treatment of various kinds of contaminants, i.e., heavy metals and hydrocarbons (Scherer et al. 2000; Vignola et al. 2011). One of the main limitations of this technology has to do with somewhat unpredictable longevity of the treatment system. This is mainly due to the intricate physico-chemical processes, which occur in the PRB during the treatment process, one of which is permeability losses (Phillips et al. 2003; Li et al. 2005; Ruhl et al. 2011; Wilkin et al. 2014).

Fig. 1
figure 1

Schematic diagram showing contaminated groundwater flowing through a PRB

Permeability decline of ZVI in PRB primarily takes place due to mineral precipitation (Liang et al. 2003; Hosseini et al. 2011). Apart from participating in reactions that are capable of degrading the contaminants, ZVI will also react with dissolved oxygen and other mineral constituents in the groundwater (mainly carbonates) as well as the water itself to form iron hydroxides. The formation of these secondary precipitates produces a coating on the ZVI particles surface and clogs the pores in the ZVI barrier. As a result, a decline in the porosity, and therefore, permeability of the ZVI barrier occurs, and the access of the contaminants to ZVI becomes constrained.

As PRB is a passive treatment system, there are no additional forces that drive the flow of contaminant plumes through the PRB (Das 2005; Chandrappa and Das 2014). Therefore, the contaminated groundwater may bypass if the barrier is blocked significantly due to the reduction in permeability, and thus, the main function of the PRB, i.e., contaminant treatment, may be lost. As such, it is important to have an idea of the permeability loss over time for particular design of a PRB. It is evident from several literatures (e.g., Wilkin et al. 2002; Reardon 2005; Johnson et al. 2008) that the permeability loss occurs after the PRB has been installed in the subsurface. However, the data collected from field scale are difficult to use directly to determine the long-term permeability losses due to the fact that there are many uncontrolled parameters that affect the process. On the other hand, due to the time limitations, the data collection which is prolonged for many years in laboratory scale studies is often not practical. In principle, an approach based on computational fluid dynamics (CFD) may be employed to determine permeability losses (e.g., see, Liu et al. 2011) and the mathematical modelling should be capable of determining if there is a decline in the permeability (Jeen et al. 2012). However, the CFD tools generally require complex solution procedures for the governing equations for mass transport and fluid flow besides any other constitutive equations for changes in the physico-chemical properties of the PRB. The CFD models generally assume that the in-situ processes in the PRB can be described by well-defined parameters and governing equations; however, this is not necessarily the case.

In a different context, it can be seen that several researchers have applied the artificial neural networks (ANNs) in predicting the permeability for porous domains (e.g., oil reservoirs, membrane filters for water treatment) and have concluded that ANN performed well and provided accurate results (Afshari et al. 2014; Inthata et al. 2013; Karimpouli et al. 2010). The ANN is a computational tool composed of simple elements working in parallel commonly known as neurons (Hanspal et al. 2013), which imitate the workings of the human brain and nervous system. The neurons are grouped into three interconnected layers (the input, hidden and output layers) (Modrogan et al. 2010; Deka and Quddus 2014). It is used as an alternative to logistic regression, which is a statistical technique with the same goal of predicting an outcome variable based on pre-determined set of input parameters. However, ANN is not a physically based approach and it relies on the network’s ability to understand given information and/or outputs from physically based relationships derived using other methods, e.g., CFD and/or experiments (Tu 1996; Deka and Quddus 2014). For example, ANN architectures were developed to determine dynamic capillary pressure effects in two-phase flow in heterogeneous porous domains which used data from computational flow physics-based studies (Das et al. 2015).

In the context of this work, ANN is an attractive option as it can be used to determine the slow process of permeability decline in PRB much beyond the span of typical laboratory experiments. ANN allows a continuous learning process where the input data can continuously be updated creating a larger database for better training and validation of the ANN structure, hence, enabling better predictions. The input and output parameters for the ANN structures can be controlled and varied to simulate a process where data are missing or only discreet data points are available, e.g., see the work by Zargari et al. (2013). Zargari et al. claimed that the most accurate permeability data from the laboratory experiments do not provide a continuous profile along the line of measurement. Therefore, they used ANN to predict the porosity and permeability of their porous domain of interest which were then compared with the experimental results. It was concluded that the ANN models provided lower errors and the results from porosity modelling is better than from the permeability modelling.

Motivated by the above background, this work aims to develop artificial neural networks (ANNs) for predicting the permeability losses in PRB. The main objective is to develop the ANNs and use them to model the permeability losses that take place in zero-valent iron permeable reactive barriers (PRBs). A variety of single and double hidden layer ANN structures were designed to predict permeability decline at different pressure points of a laboratory scale PRB which was carried out mainly by varying the number of neurons in the hidden layer(s). Bearing in mind that increasing the number of neurons and the number of hidden layers do not necessarily determine how good an ANN structure is, a statistical analysis of the obtained results was carried out to identify the best ANN structures by comparing them to regression models.

2 Methodology

2.1 Data Collection

The reference datasets used for the development and implementation of the ANN were obtained from well-defined in-house laboratory experiments. For this purpose, two clear acrylic square tubes (Hindleys Ltd., Sheffield, UK) with the dimensions of 10 cm height × 4.45 cm width × 4.45 cm length and wall thickness of 0.63 cm were packed with ZVI of two different particle sizes, namely, coarse (2.35 ± 0.01 mm) and fine (0.40 ± 0.02 mm) sizes. The density of then ZVI particles obtained from pycnometer measurement was 7.25 g/cm3. For the fine and coarse ZVI beds, the initial porosities at the time of packing the particles were 0.62 and 0.52, respectively.

Figure 2 illustrates the schematic diagram of the experimental setup where water flowed through the column at the initial flow rates of 0.29 and 0.24 mL/min for coarse and fine ZVI, respectively. The fluid pressure was measured using pressure meters at different time intervals from the 4 pressure measuring ports, P1, P2, P3 and P4. The differential pressure used in this study is denoted as ΔP with the subscript of port location, i.e. ΔP12 means the difference in pressure between P1 and P2. The permeability values at any particular time were calculated from Darcy’s law (Holdich 2002; Das et al. 2005; Nassehi and Das 2007) using the measured data from the experiments. Please note that although the data are collected from laboratory scale (small) experiments, the designed set up represent real field scale values. This is because the relevant parameters were kept similar to the field conditions as far as possible. For example, the flow rate was set to simulate typical groundwater flow and the ZVI packing in the cell was similar to a real PRB.

Fig. 2
figure 2

Schematic diagrams of a experimental setup for collecting data of permeability losses over time and b location of pressure ports

2.2 ANN Modeling

In this study, MATLAB’s ANN toolbox was used to develop single and double hidden layer ANNs. The reference data were imported into MATLAB using appropriate calling functions after which each network was trained. The training was carried out by dividing the collected data into three sets: 70 % for training, 15 % for validation and 15 % for testing the ANN structures. During training, the number of neurons hidden layer(s) was varied from 2 to 12 neurons to determine the optimum architecture. A correlation coefficient, R, and slope, m, with values close to 1 and an intercept value close to 0, were the criteria used to indicate a good ANN structure. Each structure was trained at least 20 times and an average of the best 10 network training data results were further analysed to determine the best single- and double- hidden-layered structures for each of the 4 sets of data. Details of the extensive analysis carried out on the results obtained are discussed in the following sections.

Before training the neural networks, some default processing functions were altered by changing some of the network parameters to scale the input and targets to be within the desired range. The epoch limit was set to 200 iterations, the goal for error was set to 0 and the {mapstd} function (i.e., the function that processes the data matrices by mapping each row’s means and deviations to the default value) was used to normalise the training dataset values to lie between 0 and 1. In addition, regression analysis was carried on the reference experimental data to validate the ANN performance, details of which will be discussed subsequently.

As shown in Table 1, the input variables include fluid residence time, pressure drops at four (4) measuring ports in the ZVI filled cells representing small laboratory scale PRBs (see Fig. 2), inlet flow rate, dynamic viscosity of fluid, average porosity (initial values), average particle size (initial values) and the length of the laboratory scale reactor, while permeability was the output variable. The input and output variables were selected as they are known to affect the flow of a fluid through the porous materials and hence, the permeability losses.

Table 1 Statistics of the input and output variables used for ANN modeling

In artificial neural networking, the output values can be determined at a time from the given set of input variables, combining the pressure drop results obtained from the 4 measuring ports (∆P12, ∆P13, ∆P34 and ∆P24). The fine and coarse ZVI data were combined in order to have a substantial number of datasets. However, the process of developing the ANN architectures would be very complex, and could introduce artificial over-fitting of the data. Hence, relatively simple ANN structures, shown in Fig. 3, were used for single-hidden-layer model and double-hidden-layer model with seven inputs. The mentioned structures were used to model the permeability decline in the PRB utilising the data obtained from the 4 pressure measuring ports separately using the same six other input variables in the input layer of the ANN architecture.

Fig. 3
figure 3

The two proposed ANN structures containing seven inputs a single-hidden-layer model and b double-hidden-layer model

2.3 Performance Testing

After developing various ANNs, the data collected were used as the testing data. Utilizing Microsoft Excel, the following criteria were employed to compare the performance of the ANN and regression models: average absolute relative error, sum squared error, threshold statistics, correlation coefficient, and Nash-Sutcliffe efficiency. These are discussed below.

2.3.1 Average Absolute Relative Error (AARE)

This measure of accuracy is the average of the relative errors usually expressed as a percentage and is defined as follows:

$$ \mathrm{AARE}=\frac{1}{N}{\displaystyle {\sum}_{i=1}^N\left|\frac{S_{cal}-{S}_{obs}}{S_{obs}}\right|\times 100} $$
(1)

where N is the total number of data points predicted; S obs and S cal are the observed and calculated permeabilities, respectively. Good model performances are indicated by lower AARE values.

2.3.2 Sum Squared Error (SSE)

This represents the total squared deviation between the observed and calculated values. It is also used as a measure of variation within a group of data. It is defined by the formula

$$ \mathrm{S}\mathrm{S}\mathrm{E}={\displaystyle {\sum}_{i=1}^N{\left({S}_{obs}-{S}_{cal}\right)}^2} $$
(2)

If all model results were identical the SSE would be 0.

2.3.3 Threshold Statistics (TS)

Here, a set of threshold values are used to distinguish ranges of values when the predicted behavior of a model varies in an important way. The threshold statistics for a level of absolute relative error (ARE) of x% is defined by the formula

$$ \mathrm{T}\mathrm{S}=\frac{N_x}{N} $$
(3)

where N x is the predicted number of data points for which the absolute relative error falls below the x%. The percentages used in this work are 5, 10, 25, 50 and 100 %. This equation measures the consistency in prediction errors. Large threshold values indicate better model performance.

2.3.4 Correlation Coefficient (R)

The Pearson product moment coefficient R is defined by the equation

(4)

where and are the average observed and calculated permeability values, k. This equation was used to show the dependency between simulated and observed data. A good model is indicated by having R-values nearing 1.0.

2.3.5 Nash-Sutcliffe Efficiency (E)

The Nash-Sutcliffe efficiency coefficient E is defined by the formula:

(5)

Values of E nearing unity signify high accuracy of predicted data and hence, indicate a good model.

2.4 Multiple Regression Analysis

Multiple regression analysis (MRA) is a statistical analysis method. It construes the variance of a dependent variable using given independent variables. In order to find the best permeability-predicting model, MRA was employed by using experimentally collected and observed permeability results to serve as independent and dependent variables, respectively. In this paper, both linear and non-linear MRA were adopted. They were used as comparisons to the predictions acquired using ANN. The regression data analysis tool of Microsoft Excel was utilized for all regression modelling work.

2.4.1 Linear multiple regression (LMR)

LMR was employed to generate a mathematical relationship that describes variations on the permeability, regressed against all seven independent input parameters

$$ k={\beta}_0+{\beta}_1{x}_1+{\beta}_2{x}_2\dots +{\beta}_7{x}_7 $$
(6)

where: k is the permeability; β 0 → β 7 are the regression coefficients to be estimated using the ANOVA regression tool; and x 1 → x 7 are the independent variables which are all raised to the power of 1.

2.4.2 Non-linear multiple regression (NLMR)

In this work the non-linear multiple regression (NLMR) was also applied to evaluate the ANN result. By applying the same variables as in LMR, NLMR uses a polynomial range of different orders as investigated by Jain and Indurthy (2003). The general forms of the equations for NLMR have been chosen as recommended by Jain and Indurthy (2003), where the permeability values are regressed as shown below.

$$ k={\beta}_0+{\beta}_1{\left({x}_1\right)}^{0.05}+{\beta}_2{\left({x}_2\right)}^{0.05}\dots +{\beta}_7{\left({x}_7\right)}^{0.05} $$
(7)
$$ k={\beta}_0+{\beta}_1{\left({x}_1\right)}^{0.5}+{\beta}_2{\left({x}_2\right)}^{0.5}\dots +{\beta}_7{\left({x}_7\right)}^{0.5} $$
(8)
$$ k={\beta}_0+{\beta}_1{\left({x}_1\right)}^{1.25}+{\beta}_2{\left({x}_2\right)}^{1.25}\dots +{\beta}_7{\left({x}_7\right)}^{1.25} $$
(9)

3 Results and Discussion

3.1 Data Collection

The experimentally determined plot of flow rate at the outlet (mL/h) versus time (day) for both coarse and fine ZVI particles is shown in Fig. 4. For the same inlet flow rate, and, hence, fluid pressure at the inlet, it can be seen that the outlet flow rate is continuously decreasing with time. The outlet flow rate is significantly reduced for both particle sizes, i.e., from 18 and from 14.5 to 8 mL/h for the coarse and fine ZVI particles, respectively. This indicates that the permeability is decreasing, thus reducing the flow rate for the same imposed inlet fluid pressure.

Fig. 4
figure 4

Outlet flow rate (mL/h) versus time (day) for ZVI beds consisting of coarse and fine ZVI particles. Please note that data are presented starting from 2 days as the PRBs take approximately 2 days to reach steady values

As shown in Fig. 2, the pressure drop values were measured across 4 different reference points of ΔP12, ΔP14, ΔP13 and ΔP24. The permeability (k) values were calculated from the directly measured data from the experiment using Darcy’s law. Figure 5 illustrates the permeability across the 4 different reference points at different time periods for the coarse and fine particles. The trend is more obvious for coarse particles but it can still be observed that the permeability is decreasing in the similar patterns for both particle sizes. As the reactor is small, the differences in the losses in permeability values at different locations are small and it seems that the permeability values are quite close to each other as well. These permeability values are used as a reference data for developing the ANN structure which will be discussed in the following section.

Fig. 5
figure 5

The intrinsic permeability values (k, m2) versus time (day) for a coarse and b fine ZVI particles

3.2 ANN Modeling

3.2.1 Training and validation of ANN structures

Figure 6 depicts typical plots for training and post-training analysis of an ANN structure for a selected pressure point, i.e., ∆P13. It was observed that the mean-square errors decreased as the network training process progressed.

Fig. 6
figure 6

a Trained network analysis and (b) post-training analysis for ANN [7-4-6-1] model of ∆P13 data

The number of iterations (epochs) was varied for all ANN models trained. This was due to the fact that the validation test stops the training of the network when the peak performance is achieved. It can be seen from Fig. 6(b) that the post-training analysis provided the best linear fit of the data points (shown in red linear line) between the plot of the outputs (Y) on the y-axis and the targets (T) on the x-axis. It is also confirmed by the high R value of 0.997. The line of best fit is then used to determine the correlation coefficient, slope and intercept which were the results used to decide the best ANN structure. Table 2 shows the post analysis results of the best ANN structures using each of the 4 sets of reference data. Generally, good results were obtained using the ∆P12, ∆P13 and ∆P34 but not with the ∆P24 data. We believe this is due to the shortage of data points, but as more experimental data become available in the future, better prediction would be achieved.

Table 2 Post training data of the best ANN structures using experimental data from 4 different pressure points

3.2.2 ANN Performance Testing

From the AARE plots depicted in Fig. 7, it is evident that the ANN structures have AARE values with a magnitude 10 times lower than that of the regression models, i.e., the ANN structures performed significantly better than the linear and non-linear regression models. However, there were two exceptions where the ANN structures designed with the ∆P12 and ∆P24 data performed in a similar manner to the non-linear regression (Fig. 7a) and linear regression (Fig. 7d), respectively, possibly due to data shortage (74 data points). Nevertheless, the ANN [7-10-12-1] model showed some improvement. It can be seen from Fig. 7b that the ANN [7-4-6-1] model had a low value when the ∆P13 data was used, while in Fig. 7a the AARE values for the ANN structures designed using the ∆P12 data are all of similar magnitudes. Also, from Fig. 7c, it can be observed that the ANN [7-2-1] performed notably better than the other ANN structures having a value of 15.61 when the ∆P34 data was utilized.

Fig. 7
figure 7

Comparisons of the AARE of the regression (linear and non-linear) and ANN structures at different reference points a ΔP12; b ΔP13; c ΔP34; and d ΔP24

Figure 7 illustrates the SSE comparisons and it signifies that the regression models perform poorly for all the 4 datasets. Particularly, the x0.05 non-linear regression model performed the worst in all 4 cases with errors of 1.19 × 10−20, 2.96 × 10−20, 2.79 × 10−21 and 6.15 × 10−20. From Fig. 8a, it can be seen that the ANN structures designed using the ∆P12 data performed similarly with the exceptions of ANN [7-2-1] which had higher error of 2.61 × 10−22. This trend with the ANN [7-2-1] model was also seen with the ∆P13 data as shown in Fig. 8b with the value of 3.55 × 10−22, the ANN [7-4-6-1] model is seen to perform the best using the ∆P13 data with the value of 1.56 × 10−23. From Fig. 8c, it can be observed that the ANN [7-10-12-1] performed well with the ∆P34 data (error: 1.20 × 10−25). In terms of the ANN structures designed with the ∆P24 data, although the SSE values were higher than anticipated, the double hidden layer structures are seen to perform better than the single layer structures. The results improved as the number of neurons in the double hidden layer was increased, i.e., ANN [7-10-12-1] performed the best with an error value of 7.74 × 10−21.

Fig. 8
figure 8

Comparisons of the SSE between regression and ANN structures at different reference points (a) ΔP12; (b) ΔP13; (c) ΔP34; and (d) ΔP24

Comparisons between the regression (linear and non-linear) and ANN models using efficiency (E) and correlation coefficient (R) as presented in Fig. 9 also highlight how poorly the regression models performed with the non-linear regression models performing the worst. For example from Fig. 9d, the x0.05 non-linear regression model is seen to have an efficiency E of −0.267 and a correlation coefficient R of 0.000121, which are very low values. On the other hand, the ANN structures had high efficiency and correlation coefficient values, with most values nearing 1 except the models designed using the ∆P24 data with the highest efficiency and correlation coefficient values being 0.7072 and 0.6383, respectively. It is logical to expect that the ANNs would perform better than the LR and NLR models as the ANNs are trained to the data obtained while the LR and NLR are simply regressed using the available data points. The data collected from the experiments are governed by Darcy’s law which is a linear relationship between permeability and other factors that affect the flow hydrodynamics (e.g., pressure drop). However, the permeability losses in the PRBs seem to be a non-linear process and consequently, none of the LR and NLRs performs well.

Fig. 9
figure 9

Comparisons of E and R between regression and ANN structures at different reference points a ΔP12; b ΔP13; c ΔP34; and d ΔP24

Observing the threshold statistics plots, the underperformance of the regression models is again apparent as Fig. 10 shows. From Fig. 10a, the poor performance of the NLR (x0.05) is visible where none of the AARE values fell into the threshold statistics categories. In contrast, it can be seen that the ANN [7-4-1] structure performed the best with a TS-5 value of 87.84 and 100 % of its AARE values were within the TS-100 category. The TS values obtained from the ∆P24 data were again lower than anticipated. Even for TS-100, the highest value obtained was only 48.65 obtained by both ANN [7-8-10-1] and [7-10-12-1]; these might be due to the data shortage as discussed previously.

Fig. 10
figure 10

Comparison of threshold values between regression and the ANN structures at different reference points a ΔP12; b ΔP13; c ΔP34; and d ΔP24

It is evident from the performance testing by comparing the results from ANN prediction to that of the regression models (both linear and non-linear) that the ANN has higher accuracy. From the comparisons of all ANN models and the error statistics, it also seems that the best performing models are single-hidden-layer model ANN [7-2-1] with the data at the pressure point of ∆P34 and double-hidden-layered model ANN [7-4-6-1] with the data at the pressure point of ∆P13. In order to demonstrate how the permeability data compare, Fig. 11 is prepared which depicts the permeability versus time plots using the data from one of these ANN structures, namely: (a) single-hidden-layered model ANN [7-2-1] and the regression models; and (b) double-hidden-layered model ANN [7-4-6-1] and regression models. The structures of these two ANNs are shown in Fig. 3 which we used to explain the proposed ANNs. The data from these ANNs were compared with the reference permeability data correspondingly to determine the accuracy of the ANN and regression models. The figure reflects that the predictive abilities of regression (linear and non-linear) models are significantly poor. Although the linear regression models performed slightly better than the non-linear regression models, they do not provide sufficient representation of the reference permeability data. On the other hand, ANN structures, both single and hidden layer equally provide a much more accurate depiction of the characteristic behavior of the reference permeability data. In Fig. 11, the LR and NLR equations at ∆P34 are given by k = 3.75E-12 + 7.51E-15 (X1) - 2.74E-12(X2) + 5.02E-12(X7) and k = 3.75E-12 + 7.51E-15(X1 0.5)-2.74E-12(X2 0.5) + 5.02E-12(X7 0.5), respectively. The LR and NLR equations are k = 8.997E-12-1.47E-13(X1)-2.44E-12(X2) + 1.11E-11(X7) and k = 8.997E-12-1.47E-13(X1 0.5)-2.44E-12(X2 0.5) + 1.11E-11(X7 0.5) at ∆P13, respectively.

Fig. 11
figure 11

The plot of permeability versus time for: aReference data, ANN [7-2-1], LR and NLR (0.5) models at ∆P34; and b Reference data, ANN [7-4-6-1], LR and NLR (0.5) models at ∆P13

4 Conclusions

The main objective of this study, i.e., to create a permeability-predicting ANN structure has been achieved. ANNs (single and double hidden layered) and regression (linear and non-linear) models were attempted and the complex relationships between the observed permeability decline and the physical parameters characterizing the process were determined. The deployed data used for model development, network training, performance evaluation and further analysis were gathered from in-house experiments data. Permeability loss is one of the main limitations of the ZVI-PRB technology, and it has been established that ANNs can model these permeability losses better than regression models. From the performance statistics parameters, which comprise the average absolute relative error (AARE), sum squared errors (SSE), threshold statistics (TS), correlation coefficient (R) and efficiency (E), the high performance of ANN for predicting the permeability decline in PRBs is demonstrated. It can be seen from the correlation coefficient value in the post-training data that the best single and double-hidden-layer ANN structures are ANN [7-2-1] at ∆P34 and ANN [7-4-6-1] at ∆P13, respectively. According to all performance comparison criteria, these two structures reflect the best results. In addition, these structures are matched with the proposed ANN structures. In summary, ANN is shown to be a reliable tool in predicting the permeability loss in PRB system and the best ANN structures that should be adopted are ANN [7-2-1] and ANN [7-4-6-1].