Introduction

In geotechnical engineering practice, soils are rarely used in their natural state and must often be compacted to meet geotechnical criteria. Compacted soil is widely used in landfill liners and waste impoundments, to cap new waste disposal units, and to close old waste disposal sites because of its relatively low cost, accessibility, durability, high resistance to heat, and other factors (Wang and Huang 1984).

Almost all the regulatory agencies in the world require that compacted soil liners and covers be designed to have a hydraulic conductivity of less than or equal to a specified maximum value. According to the US Environmental Protection Agency regulations, the Brazilian Standard (NBR 13896-1997), and German standards, compacted clay liners are required to have a hydraulic conductivity of \(10^{-7}\) cm/s or less; however, the Chinese Ministry of Construction has specified a permeability of \(10^{-8}\) cm/s or less for this purpose (Du et al. 2009).

For the successful design and construction of compacted soil liners and covers, hydraulic conductivity is not the only consideration. Chemical compatibility, construction method, slope stability, bearing capacity, and subsidence should also be taken into account, as should environmental factors such as desiccation and the development and execution of a construction quality assurance plan (Daniel 1984; Oakley 1987; EPA 1988; Elsbury et al. 1990; Daniel and Benson 1990). Figure 1 shows different instabilities that can occur in compacted soil liners in landfills (Dixon and Jones 2005).

In practice, design engineers traditionally require that soil liners be compacted within a specified range of water content and to a minimum dry unit weight. According to Hermann and Elsbury (1987), this minimum value is 95 % of \(\gamma_{d\max}\) from standard Proctor compaction (ASTM D-698) or 90 % of \(\gamma_{d\max}\) from modified Proctor compaction (ASTM D-1557). The range of acceptable water content for soil liners and covers is typically about zero to four percentage points wet of the standard or modified Proctor optimum (Fig. 2).
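The traditional specification described above can be expressed as a simple check on a compacted point. The following is a minimal sketch, assuming the standard Proctor limits quoted in the text (95 % relative compaction, 0 to 4 percentage points wet of optimum); the function name, argument names, and the example soil are hypothetical.

```python
def in_traditional_zone(w, gamma_d, w_opt, gamma_dmax,
                        min_relative_compaction=0.95, max_wet_of_optimum=4.0):
    """Return True if a compacted point (w, gamma_d) lies in the traditional zone.

    w, w_opt: water content and Proctor optimum water content (%)
    gamma_d, gamma_dmax: dry unit weight and Proctor maximum (same units)
    """
    # Minimum dry unit weight as a fraction of the Proctor maximum
    dense_enough = gamma_d >= min_relative_compaction * gamma_dmax
    # Water content between optimum and the allowed margin wet of optimum
    wet_of_optimum = 0.0 <= (w - w_opt) <= max_wet_of_optimum
    return dense_enough and wet_of_optimum


# Hypothetical soil: optimum water content 15 %, gamma_dmax = 18.0 kN/m3
print(in_traditional_zone(w=17.0, gamma_d=17.5, w_opt=15.0, gamma_dmax=18.0))  # True
print(in_traditional_zone(w=14.0, gamma_d=17.5, w_opt=15.0, gamma_dmax=18.0))  # False
```

The second point fails only because it is dry of optimum, which illustrates that the zone is defined by water content and density alone, not by hydraulic conductivity or shear strength.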

Fig. 1 Potential failure modes in soil liners, after Dixon and Jones (2005)

Fig. 2 Traditional method for specification of acceptable water contents

The shape of the acceptable zone evolved empirically from construction practices applied to roadway bases, structural fills, embankments, and earth dams. The specification is based primarily upon the need to achieve a minimum \(\gamma_d\) for adequate strength and limited compressibility. Soil liners are compacted wet of optimum because wet-side compaction minimizes hydraulic conductivity due to the change in the texture of the soil (Bjerrum and Huder 1957; Lambe 1951; Mitchell et al. 1965; Boynton and Daniel 1985).

Daniel and Benson (1990) analyzed the results of Mitchell et al. (1965) and Boutwell and Hedges (1989) to investigate this traditional approach for the design of compacted soil liners. They concluded that the traditional approach does not properly address the geotechnical requirements of compacted soil liners.

In Fig. 3a, which is drawn from the results of Mitchell et al. (1965), a significant portion of the acceptable zone superimposed on the hydraulic conductivity contours corresponds to a hydraulic conductivity of more than \(10^{-7}\) cm/s, the stated criterion for the design and construction of compacted soil liners. In Fig. 3b, the contours of hydraulic conductivity and shear strength are superimposed on the acceptable zone for 95 % of \(\gamma_{d\max}\) and water content 0–4 % wet of optimum (Boutwell and Hedges 1989). Although all \(w\)–\(\gamma_d\) points within the acceptable zone correspond to test specimens with a hydraulic conductivity of less than \(10^{-7}\) cm/s, the shape and boundaries of the acceptable zone in this figure correlate with neither the hydraulic conductivity nor the shear strength. In addition, the shear strength varies dramatically within the acceptable zone, which could cause considerable deficiencies in the performance of compacted soil liners.

Fig. 3 Evaluation of the traditional method for designing compacted soil liners based on hydraulic conductivity and shear strength: (a) Mitchell et al. (1965); (b) Boutwell and Hedges (1989)

Based on the results of these analyses, Daniel and Benson (1990) proposed a new approach for the design and construction of compacted soil liners. In this method, the compaction curve of the soil is first developed for the compactive effort used in the field or for a range of compactive efforts. Permeability tests are then carried out to determine the hydraulic conductivity of each compacted specimen. The \(w\)–\(\gamma_d\) relationship is re-plotted with different symbols for the compacted specimens whose hydraulic conductivities meet the design criterion. Finally, the acceptable zone is modified based on other considerations, e.g., shear strength, interfacial friction with an overlying geomembrane, shrink/swell considerations, concern over cracking when settlement occurs, concern for constructability, or local practices (Fig. 4).

Fig. 4 Developing acceptable zone for design of compacted soil liners based on hydraulic conductivity and shear strength (Daniel and Benson 1990)

This method was further developed by Daniel and Wu (1993) for the design and construction of compacted soil liners in arid regions, taking shrinkage into account to minimize desiccation cracking in liners. Although the method proposed by Daniel and Benson (1990) and Daniel and Wu (1993) is very precise and efficient, developing an acceptable zone is not an easy task, especially in the early stages of earthworks when the correct source of soil must be chosen among several different sources.

Performing laboratory tests to develop an acceptable zone for the design and construction of compacted soil liners is very time consuming, especially in the case of permeability tests on samples with a high clay content. Because of these difficulties, models that can reasonably predict the geotechnical properties of soils from their index properties are of interest to geotechnical engineers. The correlation between the compaction characteristics of soils and their index properties has been the subject of many investigations; based on these properties, some judgments can be made about the hydraulic conductivity and shear strength of soils.

Rowan and Graham (1948), Davidson and Gardiner (1949), Turnbull (1948), Jumikis (1946), Ring et al. (1962), Ramiah et al. (1970), Nagaraj (1994), etc. are among the researchers who tried to relate compaction characteristics of soils to index properties such as specific gravity and Atterberg limits (liquid limit, plastic limit, shrinkage limit, and plasticity index) and some factors related to their grain size distribution.

Similar models have been developed to estimate hydraulic conductivity of soils based on some of their basic properties. Researchers including Hazen (1911), Zunker (1930), Carman (1937), Burmister (1954), Michaels and Lin (1954), Olsen (1962), Mitchell et al. (1965), Wang and Huang (1983), Koltermann and Gorelick (1995), Boadu (2000), Chapuis (2004), Sinha and Wang (2008), and Cote et al. (2011) among others have tried to predict the hydraulic conductivity of soils from some factors related to the grain size distribution of the soil, Atterberg limits, and density of soils; however, the impact of soil’s structure and texture and definitely the type of permeant can increase the uncertainty in the prediction of hydraulic conductivity.

As opposed to the latter two geotechnical properties, several studies have been carried out to predict the shear strength of soils based on their basic properties. Kayadelen et al. (2009) used artificial neural network (ANN), genetic programming (GP), and adaptive neuro-fuzzy inference system (ANFIS) methods to predict the φ′ value of soils from their index properties, and Mousavi et al. (2011) used GP and the orthogonal least squares (OLS) algorithm to present a correlation between the internal friction angle and the physical properties of soils such as fine and coarse content, density, and liquid limit. Sezer (2013) used nonlinear multiple regression (NMR), neuro-fuzzy (NF), and ANN methods to predict the shear strength of soil (Tizpa et al. 2014).

Database

A database with 595 data sets was compiled, in which 155 data sets were used for modeling the permeability, 320 data sets for modeling maximum dry density (MDD) and optimum moisture content (OMC), and 120 cases for modeling effective friction angle of shearing. The database includes test results performed on different types of soils; therefore, the results of this research should be valid for all types of soils.

For each data set, the permeability, OMC, MDD, compaction degree, friction angle, and soil index properties (grain size curve, Atterberg limits, and specific density) were available. In addition, a soil type index (STI) was introduced to take the USCS classification of the soil into account when modeling the compaction characteristics. The STI values were assigned in the following sequence: GW(1), GP(2), SW(3), SP(4), GM(5), SM(6), GC(7), SC(8), ML(9), CL(10), MH(11), and CH(12). For mixed classifications (e.g., CL-ML), the STI was calculated as the average of the individual classification values. Table 1 gives the descriptive statistics of the input variables used for model development, and Table 2 summarizes the ranges of the output parameters. Note that the gravel content (\(G_{\mathrm{c}}\)) comprises particles coarser than 4.75 mm, and the sand content (\(S_{\mathrm{c}}\)) comprises particles from 4.75 to 0.075 mm. Sand content includes coarse sand (\(S_1\)) ranging from 4.75 to 0.6 mm, medium sand (\(S_2\)) ranging from 0.6 to 0.2 mm, and fine sand (\(S_3\)) ranging from 0.2 to 0.075 mm. Particles from 0.075 to 0.002 mm were classified as silt, and particles smaller than 0.002 mm as clay.
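The STI assignment can be illustrated with a short lookup. This is a minimal sketch of the rule stated above (a fixed value per USCS group symbol, averaged for dual classifications); the function name and the hyphen-splitting convention are assumptions made for illustration.

```python
# STI values per USCS group symbol, in the sequence given in the text
STI_VALUES = {"GW": 1, "GP": 2, "SW": 3, "SP": 4, "GM": 5, "SM": 6,
              "GC": 7, "SC": 8, "ML": 9, "CL": 10, "MH": 11, "CH": 12}


def soil_type_index(uscs_symbol):
    """Return the STI for a single (e.g. 'CL') or mixed (e.g. 'CL-ML') symbol."""
    parts = uscs_symbol.upper().split("-")
    # Mixed classifications take the average of the individual values
    return sum(STI_VALUES[part] for part in parts) / len(parts)


print(soil_type_index("CL"))     # 10.0
print(soil_type_index("CL-ML"))  # 9.5
```

A symbol outside the twelve listed groups raises a `KeyError`, which seems preferable to silently guessing a value.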

Table 1 Descriptive statistics of the input variables used in the model developments
Table 2 Descriptive statistics of the output variables

As presented in Table 3, the database was obtained from different sources but mainly from the geotechnical engineering laboratory at the Federal University of Bahia (UFBA), Brazil. Some other cases from Wang and Huang (1984), Kayadelen et al. (2009), and Mousavi et al. (2011) were also added.

Table 3 Data sources used to compile the database

Evolutionary polynomial regression

Evolutionary polynomial regression (EPR) is a data-driven regression method that was developed by Giustolisi and Savic (2006) based on evolutionary computing. To avoid the problem of mathematical expressions growing rapidly in length with time, in EPR the evolutionary procedure searches for the exponents of a polynomial function with a fixed maximum number of terms. During one execution, it returns a number of expressions with increasing numbers of terms up to a limit set by the user to allow the optimum number of terms to be selected (Ahangar-Asr et al. 2011). In general, EPR is a two-stage method to construct symbolic models using polynomial structures. In the first stage, EPR searches for exponents of polynomial expressions by employing a genetic algorithm (GA). In the second stage, numerical regression is used to compute the constant values of the previously selected terms by solving a least squares (LS) problem. The general expression in EPR can be formulated as:

$$ y={\displaystyle \sum_{i=1}^n} F\left( X, f(X),{a}_i\right)+{a}_0 $$
(1)

where y is the computed vector of outputs, F is an n-dimensional function, X is the matrix of inputs, and n is the number of terms in the expression. Also, f is a function defined by the user and \(a_i\) is a constant.
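The second EPR stage can be sketched in a few lines. The example below assumes the simplest term structure, in which each term is a constant times a product of inputs raised to candidate exponents (no user-defined function f), and it takes the exponent matrix as given; the genetic-algorithm search of the first stage is omitted. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def build_terms(X, exponents):
    """Column j of the result is prod_k X[:, k] ** exponents[j][k]."""
    X = np.asarray(X, dtype=float)
    exponents = np.asarray(exponents, dtype=float)
    # Broadcast to (n_samples, n_terms, n_inputs), then multiply over inputs
    return np.prod(X[:, None, :] ** exponents[None, :, :], axis=2)


def fit_coefficients(X, y, exponents):
    """Least-squares estimate of [a_0, a_1, ..., a_n] for fixed exponents."""
    terms = build_terms(X, exponents)
    design = np.column_stack([np.ones(len(terms)), terms])  # bias column for a_0
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


# Synthetic data generated from y = 1 + 2 * x1^2 * x2 + 3 * x1^0.5
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 5.0, size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] ** 2 * X[:, 1] + 3.0 * X[:, 0] ** 0.5

exponents = [[2.0, 1.0], [0.5, 0.0]]  # one row of exponents per term
print(np.round(fit_coefficients(X, y, exponents), 6))
```

With noise-free data and the correct exponents, the fitted constants should match the generating values closely; in a full EPR run the evolutionary search would propose many such exponent matrices and keep the best-performing ones.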

To apply the EPR procedure, the evolutionary process starts from a constant mean of output values. By increasing the number of evolutions, it gradually picks up different participating parameters in order to form equations describing the relationship between the parameters of the system. The EPR procedure stops when the termination criterion (the maximum number of terms in the mathematical expression, the maximum number of generations, or a particular allowable error) is satisfied. Figure 5 shows a typical flow diagram of the EPR procedure.

Fig. 5 Typical flow diagram for EPR procedure (Rezania et al. 2008)

Performance analysis

Fitting parameters analysis

Several statistical measures have been used to evaluate the performance of the prediction models: the coefficient of determination (COD), the root mean squared error (RMSE), and the coefficient of residual mass (CRM). They are defined as follows:

$$ \mathrm{COD}=1-\frac{{\displaystyle {\sum}_{i=1}^n}{\left({M}_i-{P}_i\right)}^2}{{\displaystyle {\sum}_{i=1}^n}{\left({M}_i-\overline{M}\right)}^2} $$
(2)
$$ \mathrm{RMSE}=\sqrt{\frac{{\displaystyle {\sum}_{i=1}^n}{\left({P}_i-{M}_i\right)}^2}{n}}\times 100 $$
(3)
$$ \mathrm{CRM}=1-\frac{{\displaystyle {\sum}_{i=1}^n}\left({P}_i\right)}{{\displaystyle {\sum}_{i=1}^n}\left({M}_i\right)} $$
(4)

where \(M_i\) and \(P_i\) are the measured and predicted values, respectively, \( \overline{M} \) is the mean of the measured values, and n is the number of samples. The RMSE measures the typical magnitude of the residual error and should be minimized when the outputs are fitted to a set of data. For a perfect fit, the RMSE is zero; the lower the RMSE, the higher the accuracy of the model predictions. The CRM represents the relative difference between the sums of the measured and predicted values. The optimum value of CRM is zero. Positive values of CRM indicate underestimation, and negative values indicate overestimation.
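The three measures are straightforward to compute. The sketch below assumes COD in its usual form, with squared deviations in both numerator and denominator, and keeps the factor of 100 that appears in Eq. 3; the function names and the sample values are hypothetical.

```python
import math


def cod(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean squared error, scaled by 100 as in Eq. 3."""
    n = len(measured)
    mse = sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n
    return math.sqrt(mse) * 100


def crm(measured, predicted):
    """Coefficient of residual mass: positive when the model underestimates."""
    return 1.0 - sum(predicted) / sum(measured)


measured = [10.0, 20.0, 30.0]
predicted = [9.0, 20.0, 30.0]   # one value underestimated by 1
print(f"COD  = {cod(measured, predicted):.3f}")
print(f"RMSE = {rmse(measured, predicted):.3f}")
print(f"CRM  = {crm(measured, predicted):.4f}")
```

In this example the single underestimated value yields a positive CRM, consistent with the sign convention described above.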

Parametric analysis

For further verification of the EPR prediction models, parametric analyses have been performed. The method of parametric analysis is based on changing one predictor variable at a time while the other predictor variables are kept constant at the average values of their entire data sets.

Parametric analysis investigates the response of the values predicted by the EPR models to a set of input data generated between the minimum and maximum values of the training data. These inputs are presented to the prediction model and the output is calculated. The procedure is repeated with another variable until the model response has been tested for all input variables.
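The one-variable-at-a-time procedure can be sketched as follows. This is a minimal illustration, assuming the model is any callable that maps one row of inputs to a prediction; `parametric_sweep`, `toy_model`, and the data are hypothetical and stand in for an EPR model and its training set.

```python
import numpy as np


def parametric_sweep(model, X, var_index, n_points=20):
    """Vary one input over its training range with the others at their means."""
    X = np.asarray(X, dtype=float)
    base = X.mean(axis=0)  # average value of every input
    values = np.linspace(X[:, var_index].min(), X[:, var_index].max(), n_points)
    grid = np.tile(base, (n_points, 1))
    grid[:, var_index] = values  # only the studied variable changes
    responses = np.array([model(row) for row in grid])
    return values, responses


def toy_model(x):
    """Hypothetical two-input model used only to exercise the sweep."""
    return 2.0 * x[0] - 0.5 * x[1]


X = np.array([[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]])
values, responses = parametric_sweep(toy_model, X, var_index=0, n_points=3)
print(values.tolist())     # [1.0, 3.0, 5.0]
print(responses.tolist())  # [-8.0, -4.0, 0.0]
```

Repeating the call for each `var_index` gives one response curve per input, which is the kind of plot shown in the parametric study figures.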

Results and discussions

EPR model for maximum dry density

Eight input parameters have been used in the EPR model for maximum dry density (kN/m³): gravel content (\(G_{\mathrm{c}}\)); coarse, medium, and fine sand content (\(S_1\), \(S_2\), and \(S_3\)); silt and clay content (\(S_{\mathrm{c}}\) and \(C_{\mathrm{c}}\)); plastic limit (PL); and STI. The following EPR model was obtained for predicting MDD:

$$ \mathrm{MDD}=1.11{\mathrm{PL}}^{0.5}-0.256\mathrm{PL}-7.609\times {10}^{-6}{S_{\mathrm{c}}}^{0.5}{C_{\mathrm{c}}}^2\mathrm{STI}+0.054{S_1}^{0.5}{S_3}^{0.5}+0.0001{G_{\mathrm{c}}}^2{S}_2+17.58 $$
(5)

Figure 6 shows the predicted values of MDD versus measured values for training and testing data sets. Among 320 measured data sets, 290 sets (90 %) have been used for training and 30 sets (10 %) have been used for testing the model. Table 4 also presents the performance of the EPR prediction model.
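Eq. 5 can be evaluated directly. The sketch below transcribes the equation term by term; it assumes that the grain size fractions and PL are entered in percent, that \(S_{\mathrm{c}}\) in Eq. 5 denotes silt content as stated for this model, and the sample soil is hypothetical rather than taken from the database.

```python
import math


def predict_mdd(gravel, coarse_sand, medium_sand, fine_sand, silt, clay,
                plastic_limit, sti):
    """Maximum dry density (kN/m3) from Eq. 5; contents and PL assumed in percent."""
    return (1.11 * math.sqrt(plastic_limit)
            - 0.256 * plastic_limit
            - 7.609e-6 * math.sqrt(silt) * clay ** 2 * sti
            + 0.054 * math.sqrt(coarse_sand) * math.sqrt(fine_sand)
            + 0.0001 * gravel ** 2 * medium_sand
            + 17.58)


# Hypothetical clayey sand (SC, STI = 8); values are illustrative only
mdd = predict_mdd(gravel=5, coarse_sand=10, medium_sand=20, fine_sand=25,
                  silt=20, clay=20, plastic_limit=18, sti=8)
print(f"Predicted MDD: {mdd:.2f} kN/m3")
```

As with any regression model, predictions are only meaningful for inputs inside the ranges covered by the training data.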

Fig. 6 Comparison between the predicted values of MDD and the actual data

Table 4 Performance of the EPR model for MDD

Figure 7 presents the results of the parametric study on the EPR MDD model. As the graphs show, increasing the coarse content (gravel content and coarse, medium, and fine sand content) increases the maximum dry density of the soil. It is also evident that the gravel content has the greatest effect on the predicted MDD values. Increasing the fine content (silt and clay) causes the MDD to decrease. Moreover, MDD decreases as the plastic limit increases, which could imply a greater fine content in the soil.

Fig. 7 Parametric study results on the MDD prediction model

EPR model for optimum moisture content

Five input parameters have been used in the EPR model for OMC including specific gravity (Gs), clay content (C), PL, MDD, and STI. The obtained model from EPR for prediction of OMC is:

$$ \mathrm{OMC}=0.0025{\mathrm{PL}}^2{\mathrm{STI}}^{0.5}+0.0025{C}^2+32.82\mathrm{Gs}-0.229{\mathrm{Gs}}^2\mathrm{MDD}-0.0006 G{s}^2 C\left(\mathrm{PL}\right)-46.39 $$
(6)

Figure 8 shows a comparison between the results of the EPR model and experimental data for both the training and testing sets. Among the 320 measured data sets, 290 sets (90 %) were used for training and 30 sets (10 %) were used for testing the model. Table 5 also presents the performance of the EPR prediction model.
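Eq. 6 can be evaluated in the same way. The sketch below transcribes the equation term by term, assuming clay content and PL in percent and MDD in kN/m³; the function name and the sample inputs are hypothetical.

```python
def predict_omc(specific_gravity, clay, plastic_limit, mdd, sti):
    """Optimum moisture content (%) from Eq. 6."""
    gs_squared = specific_gravity ** 2
    return (0.0025 * plastic_limit ** 2 * sti ** 0.5
            + 0.0025 * clay ** 2
            + 32.82 * specific_gravity
            - 0.229 * gs_squared * mdd
            - 0.0006 * gs_squared * clay * plastic_limit
            - 46.39)


# Hypothetical low-plasticity clay (CL, STI = 10); values are illustrative only
omc = predict_omc(specific_gravity=2.65, clay=30, plastic_limit=20,
                  mdd=17.0, sti=10)
print(f"Predicted OMC: {omc:.1f} %")
```

Because MDD enters only through the term −0.229 Gs² MDD, the predicted OMC falls linearly as MDD rises when the other inputs are held fixed.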

Fig. 8 Comparison between the predicted values of OMC and the actual data

Table 5 Performance of the EPR model for OMC

Figure 9 presents the results of the parametric study of the developed EPR model for OMC. As the graphs show, increasing clay content, which implies a higher plastic limit, leads to higher values of OMC. The graphs also show that MDD and OMC are clearly dependent: as the maximum dry density increases, the OMC decreases. Furthermore, Gs seems to have a minor effect on OMC values.

Fig. 9 Parametric study results on the OMC prediction model

EPR model for permeability coefficient (K)

Four input parameters have been used in the EPR model for the permeability coefficient (cm/s): effective grain size (\(D_{10}\)), mean grain size (\(D_{50}\)), plasticity index (PI), and compaction degree (Cd) expressed in percent. In the model, \(D\) denotes the ratio of effective grain size to mean grain size (\(D_{10}/D_{50}\)). The following EPR model was obtained for predicting the permeability:

$$ \mathrm{Log}\ k=-0.00051{\mathrm{Cd}}^2-0.263{\mathrm{PI}}^{0.5}-1.629{D}^{0.5}+5.76\times {10}^{-6} D\left(\mathrm{PI}\right){\mathrm{Cd}}^2-0.0001 D\left({\mathrm{PI}}^2\right)+0.0286 $$
(7)

Figure 10 shows a comparison between the results of the EPR model and experimental data for both training and testing sets. Among the 155 measured data sets, 140 sets (90 %) were used for training and 15 sets (10 %) were used for testing the model. Table 6 illustrates the performance of the developed EPR model for training and testing datasets.
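Eq. 7 can be evaluated as below. The sketch assumes that "Log" is the base-10 logarithm, that D is the ratio D10/D50, and that PI and Cd are in percent; the function names and the sample inputs are hypothetical.

```python
import math


def log10_permeability(d_ratio, plasticity_index, compaction_degree):
    """log10 of the permeability coefficient (k in cm/s) from Eq. 7."""
    cd_squared = compaction_degree ** 2
    return (-0.00051 * cd_squared
            - 0.263 * math.sqrt(plasticity_index)
            - 1.629 * math.sqrt(d_ratio)
            + 5.76e-6 * d_ratio * plasticity_index * cd_squared
            - 0.0001 * d_ratio * plasticity_index ** 2
            + 0.0286)


def permeability(d_ratio, plasticity_index, compaction_degree):
    """Permeability coefficient in cm/s (base-10 logarithm assumed)."""
    return 10.0 ** log10_permeability(d_ratio, plasticity_index, compaction_degree)


# Hypothetical clay compacted to 100 % of MDD; values are illustrative only
k = permeability(d_ratio=0.1, plasticity_index=30, compaction_degree=100)
print(f"Predicted k: {k:.2e} cm/s")
```

The result can then be compared with a regulatory limit such as the 10⁻⁷ cm/s criterion cited in the Introduction.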

Fig. 10 Comparison between the predicted values of permeability coefficient and the actual data

Table 6 Performance of the EPR model for permeability coefficient

Figure 11 presents the results of the parametric study of the EPR permeability model. Figure 11a shows that the permeability of the soil increases with the dimensionless ratio \(D_{10}/D_{50}\). A higher ratio of \(D_{10}\) to \(D_{50}\) implies a more uniform particle size distribution, which causes the permeability to increase. Figure 11b indicates that increasing the compaction degree decreases the permeability of the soil due to the reduction in void ratio. As expected, increasing the plasticity index also causes the permeability of the soil to decline (Fig. 11c). The results also show that the plasticity index has the greatest effect on the permeability of soils.

Fig. 11 Parametric study results of the permeability coefficient model against (a) \(D_{10}/D_{50}\); (b) compaction degree; (c) plasticity index

EPR model for angle of shearing resistance

Four input parameters were used to build the EPR model for predicting the effective angle of shearing resistance. These parameters are as follows: coarse-grained content (Cc), fine-grained content (Fc), soil bulk density (γ), and shearing rate (Sr). These important factors, representing the φ′ behavior, were selected based on the literature review (Kayadelen et al. 2009; Mousavi et al. 2011). As expected, the main parameters affecting the soil strength parameters should be the soil type and soil density as well as the shearing rate at which the shear tests are performed. The obtained model for the effective angle of shearing resistance is:

$$ \tan {\varphi}^{\prime }=0.0084\mathrm{Fc}\left(\mathrm{Sr}\right)+2.71\times {10}^{-5}\mathrm{Fc}\left({\gamma}^2\right)\left({\mathrm{Sr}}^{0.5}\right)+0.021{\mathrm{Cc}}^{0.5}-0.154{\mathrm{Fc}}^{0.5}\left({\mathrm{Sr}}^{0.5}\right) + 0.572 $$
(8)

A comparison between the measured and predicted values of φ' illustrates that the model described here provides highly accurate prediction of the effective angle of shearing resistance (Fig. 12). Table 7 illustrates the performance of the developed EPR model for the training and testing datasets. Note that among the 120 data sets, 105 sets were used for training and 15 sets were used for testing the model.
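Eq. 8 returns tan φ′, so the angle itself follows from the arctangent. The sketch below transcribes the equation term by term; the units of bulk density and shearing rate are not stated in this section and are assumed to match those used in the database, and the function name and sample inputs are hypothetical.

```python
import math


def predict_friction_angle(coarse, fine, bulk_density, shear_rate):
    """Return (tan(phi'), phi' in degrees) from Eq. 8; contents assumed in percent."""
    tan_phi = (0.0084 * fine * shear_rate
               + 2.71e-5 * fine * bulk_density ** 2 * math.sqrt(shear_rate)
               + 0.021 * math.sqrt(coarse)
               - 0.154 * math.sqrt(fine) * math.sqrt(shear_rate)
               + 0.572)
    return tan_phi, math.degrees(math.atan(tan_phi))


# Hypothetical purely coarse-grained soil; values are illustrative only
tan_phi, phi_deg = predict_friction_angle(coarse=100, fine=0,
                                          bulk_density=18, shear_rate=1)
print(f"tan(phi') = {tan_phi:.3f}, phi' = {phi_deg:.1f} degrees")
```

For a soil with no fines, only the coarse-content term and the constant contribute, so tan φ′ = 0.021·√100 + 0.572 = 0.782.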

Fig. 12 Comparison between the predicted values of effective friction angle and the actual data

Table 7 Performance of the EPR model for effective friction angle

Figure 13 shows the results of the parametric study on the EPR model for the effective angle of shearing resistance. As expected, φ′ increases as the coarse content increases and decreases with increasing fine content. The effective friction angle also increases with increasing soil bulk density.

Fig. 13 Parametric study results of the effective friction angle model

Conclusion

The accurate determination of the geotechnical properties of soils from their index parameters is of paramount importance in the design of geotechnical structures. Because laboratory tests to determine the MDD, OMC, permeability, and effective friction angle are time consuming and expensive, it is desirable to develop models that are capable of predicting these parameters. This paper presents prediction models based on evolutionary polynomial regression. A database of 595 data sets was compiled. The database contains classification, Atterberg limits, compaction, permeability, direct shear, and consolidated-drained triaxial compression test results obtained on different types of soils (SC, SM, SP, CL, CH, ML, and MH). The database was obtained from tests carried out in the geotechnical laboratory at the UFBA, Brazil, as well as some experimental data from the literature.

The results of the prediction models have been compared with the experimental data. Comparisons of the results demonstrate that the developed EPR models provide highly accurate predictions. In the EPR approach, there is no need to preprocess, normalize, or scale the data. An intriguing feature of EPR is its ability to present more than one model for a complex phenomenon. The best models are chosen based on their performance on a set of data. For further verification of the EPR prediction models, parametric analyses were performed.