Introduction

Soft soil deposits pose some of the most critical problems in geotechnical engineering. Soft soils are widespread all over the world, and some are located in important cities. Civil engineering construction on soft soil deposits is limited by their tendency to undergo excessive settlement and by their low shear strength. Because of the large void ratio and inherent compressibility of clays, consolidation and displacements under construction loads can be considerable and can take a long time to complete, which complicates the work of the structural engineer. Persistently low shear strength is particularly hazardous when a large embankment is constructed on a soft clay base because it facilitates potential circular or sliding failure planes. Therefore, ground improvement schemes are necessary. In this study, ground improvement by stone columns (SC) is considered because it has been shown to be effective in improving soft soil properties. Many studies on SC-reinforced soft soil have been carried out all over the world (Guetif et al. 2007; Kousik et al. 2008; Zhou et al. 2002; Cimentada et al. 2011; Zahmatkesh and Choobbasti 2010). However, only a few have discussed the enhancing effect of SC on soft soil properties. In general, SC can increase the bearing capacity of soft soils and enhance the drainage and dissipation of excess pore water pressure (Bo and Choa 2004). Lo et al. (2010) compared numerical results and field measurements of several ground structures and found that the finite element program yields highly accurate results.

Artificial intelligence methods are computational models capable of executing complex input–output mapping (Tien Bui et al. 2011; Pradhan and Pirasteh 2010). The computational simplicity of the processing elements makes a neural network much easier to work with than rigorous mechanistic computational methods. The capability of neural networks to model complex systems precisely lies in the links between the different neurons in the network. However, early artificial neural networks (ANNs) could not provide accurate results. Rumelhart et al. (1986) developed the back-propagation algorithm for artificial neurons, which was later accepted as a workable model. This important step enabled advances in many applications across various scientific disciplines. The most widely used neural network is the back-propagation neural network (BPN), which uses a sigmoid function to process input signals and the back-propagation algorithm to correct prediction errors. BPN has a superior function approximation capability compared with the radial basis function network, which in turn has superior capability in pattern classification (Haykin 1998). However, the downside of BPN is the limited dynamic range of the sigmoid function, which sometimes makes a large number of neurons necessary to achieve the desired accuracy.

Approaches such as ANN have been applied in wide areas of research because of their capability to represent any nonlinear process given sufficient complexity of the trained network (Shahin and Indraratna 2006; Choobbasti et al. 2009; Kasa et al. 2011a, b; Zarea et al. 2012; Kia et al. 2011; Tien Bui et al. 2012). In this study, non cross validation (NCV) and tenfold cross validation (TFCV) neural network models were applied to predict the settlement of soft clay reinforced by SC under a highway embankment. The two proposed models and their predicted settlements are then compared and discussed.

Methodology

Settlement prediction of soft clay soil improved by stone column

Currently, a number of methods are available for predicting and calculating settlement. These methods can be classified as either approximate methods, which make important simplifying assumptions, or complex methods based on fundamental elasticity and plasticity theory, which model material and boundary conditions, such as the finite element method. The approaches considered are as follows:

Equilibrium method

The equilibrium method is one of the simple approximate methods for calculating the settlement of sand compaction piles, as described by Aboshi et al. (1979), Barksdale and Bachus (1983), and Barksdale and Takefumi (1990). Because of its simplicity, engineers use it to calculate the settlement reduction obtained by ground improvement with stone columns. This method, like others, is derived from the unit cell idealization, whereby the stone column is modeled as a concentric body in a composite soil mass. A load applied to the composite soil mass develops stress and causes stress concentration in the column because the stone column is stiffer than the surrounding soft soil (Bergado et al. 1996). The magnitude of the stress concentration depends on the relative stiffness of the stone column and the surrounding soil. The stress in the surrounding clayey ground (σ_c) is then given by Eq. (1).

$$ \sigma_c = \frac{\sigma}{1 + \left(n - 1\right) a_s} = \mu_c \sigma $$
(1)

where n = stress concentration factor, a_s = area replacement ratio, μ_c = ratio of stress in cohesive soil to average stress, and σ = applied stress.
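The relation in Eq. (1) can be checked from vertical equilibrium over the unit cell. As a minimal sketch, assume the average applied stress is shared as σ = a_s σ_s + (1 − a_s) σ_c with n = σ_s/σ_c, where σ_s (the stress carried by the column) is a symbol introduced here only for illustration. Solving for the clay stress gives μ_c = 1/[1 + (n − 1) a_s]. The function name and the numerical values below are hypothetical and are not taken from the study.

```python
def stress_in_clay(sigma, n, a_s):
    """Return (mu_c, sigma_c) for the unit cell idealization.

    sigma : average applied stress on the unit cell
    n     : stress concentration factor (column stress / clay stress)
    a_s   : area replacement ratio (column area / unit cell area)
    """
    if not 0.0 <= a_s <= 1.0:
        raise ValueError("a_s must lie between 0 and 1")
    # Equilibrium: sigma = a_s * (n * sigma_c) + (1 - a_s) * sigma_c
    mu_c = 1.0 / (1.0 + (n - 1.0) * a_s)
    return mu_c, mu_c * sigma


# Illustrative values only (assumed, not from the LPT2 data)
mu_c, sigma_c = stress_in_clay(sigma=100.0, n=3.0, a_s=0.25)
sigma_s = 3.0 * sigma_c  # stress carried by the column for n = 3
```

With n = 1 (no stress concentration) the function returns μ_c = 1, so the clay carries the full average stress, which is a quick sanity check on the expression.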

Settlement occurring below the stone-column-reinforced ground is generally calculated by conventional methods.

The level of improvement of soft soil by stone columns depends on the stress concentration factor n (as reflected in μ_c), the initial effective stress in the clay, and the magnitude of the applied stress σ. Equation (2) indicates that, if other factors are constant, a greater reduction in settlement is achieved for longer columns and smaller applied stress increments. The settlement reduction is calculated from conventional one-dimensional consolidation theory.

$$ \frac{S_t}{S} = \frac{\log_{10}\left(\frac{\overline{\sigma}_o + \mu_c \sigma}{\overline{\sigma}_o}\right)}{\log_{10}\left(\frac{\overline{\sigma}_o + \sigma}{\overline{\sigma}_o}\right)} $$
(2)

where S_t is the total settlement, σ_c is the change in stress in clay soil, σ is the acting stress, σ_o is the initial effective stress, C_c is the compression index, e_o is the initial void ratio, and H is the vertical height of stone column.

When using the equilibrium method, settlements occurring beneath the reinforced ground must be considered separately using conventional consolidation or elastic settlement analysis.
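The settlement ratio of Eq. (2) is simple to evaluate numerically. The sketch below assumes that the numerator uses the clay stress μ_c σ from Eq. (1) and that all stresses share the same units; the function name and the input values are illustrative assumptions, not data from the project.

```python
import math


def settlement_ratio(sigma_0, sigma, mu_c):
    """Ratio S_t / S from one-dimensional consolidation theory, Eq. (2).

    sigma_0 : initial effective stress in the clay
    sigma   : applied stress increment
    mu_c    : ratio of stress in the clay to the average applied stress
    """
    numerator = math.log10((sigma_0 + mu_c * sigma) / sigma_0)
    denominator = math.log10((sigma_0 + sigma) / sigma_0)
    return numerator / denominator


# Illustrative values only (assumed): stresses in kPa
ratio = settlement_ratio(sigma_0=50.0, sigma=100.0, mu_c=2.0 / 3.0)
```

Raising the initial effective stress (as happens at greater depth along a longer column) or lowering the applied stress increment reduces the ratio in this sketch, which matches the trend described for Eq. (2).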

The Priebe method

Priebe (1991), Arukrajah and Affendi (2002), and Bo and Choa (2004) provide a design procedure for vibro-replacement construction of stone columns. Priebe (1995) adapted and extended the approach and provided design procedures with design charts for various aspects of stone column design, including settlement reduction, bearing capacity, shear strength values of improved ground, and liquefaction. The equations below predict the basic improvement factor n_o from the cross-sectional area of the column, the area of the unit cell, and the coefficient of active earth pressure. The series of equations used to calculate settlement depends on the basic improvement factor n_o and considers the coefficient of earth pressure to be one, as presented below.

$$ n_o = 1 + \frac{A_c}{A} \left(\frac{1/2 + f\left(\mu_s, A_c/A\right)}{K_{ac}\, f\left(\mu_s, A_c/A\right)} - 1\right) $$
(3)

where

$$ f\left(\mu_s, A_c/A\right) = \frac{\left(1 - \mu_s\right)\left(1 - A_c/A\right)}{1 - 2\mu_s + A_c/A} $$
(4)
$$ K_{ac} = \tan^2\left(45^{\circ} - \phi_c/2\right) $$
(5)
$$ n_o = S_t/S $$
(6)

where n_o = basic improvement factor, A_c = area of column, A = unit cell area, μ_s = Poisson’s ratio, K_ac = Rankine’s coefficient of active earth pressure, and φ_c = friction angle of the stone column material.
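As a hedged illustration of Eqs. (3) to (5), the sketch below evaluates the basic improvement factor for a given area ratio and column friction angle. It assumes the form of f used by Priebe (1995), f = (1 − μ_s)(1 − A_c/A)/(1 − 2μ_s + A_c/A), and a default Poisson’s ratio of 1/3; the function name, the default, and the input values are assumptions made for this example.

```python
import math


def priebe_basic_improvement_factor(area_ratio, phi_c_deg, mu_s=1.0 / 3.0):
    """Basic improvement factor n_o from Eqs. (3) to (5).

    area_ratio : A_c / A, column area over unit cell area (0 < ratio < 1)
    phi_c_deg  : friction angle of the column material in degrees
    mu_s       : Poisson's ratio of the soil (1/3 is an assumed default)
    """
    if not 0.0 < area_ratio < 1.0:
        raise ValueError("area_ratio must lie strictly between 0 and 1")
    f = (1.0 - mu_s) * (1.0 - area_ratio) / (1.0 - 2.0 * mu_s + area_ratio)
    # Rankine active earth pressure coefficient of the column material
    k_ac = math.tan(math.radians(45.0 - phi_c_deg / 2.0)) ** 2
    return 1.0 + area_ratio * ((0.5 + f) / (k_ac * f) - 1.0)


# Illustrative values only (assumed): 20 % area replacement, 40 degree stone
n_o = priebe_basic_improvement_factor(area_ratio=0.2, phi_c_deg=40.0)
```

For μ_s = 1/3 the expression reduces to n_o = 1 + (A_c/A)[(5 − A_c/A)/(4 K_ac (1 − A_c/A)) − 1], which offers a convenient cross-check on the implementation.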

The Priebe method quantifies the improvement that results from the inclusion of the stone column without densification of the soil between stone columns. This design method refers to the improving effect of stone column in a soil.

Greenwood and Kirsch (1984) concluded that the simplicity of the Priebe method, which applies an improvement ratio to conventional consolidation analysis, is attractive to engineers, and the method is therefore widely used in design practice.

Description of data

A highway project called Lebuhraya Pantai Timur 2 (LPT2), which has a length of 173 km, is currently being constructed between Kuantan and Kuala Terengganu in the state of Terengganu in Malaysia. The map in Fig. 1 indicates the location of the two project sites. The geotechnical design works include ground improvement of the existing foundation to sustain the imposed dead and traffic loads for highways. A proposal for the improvement of soft clay soil requires borehole information acquired from several stretches of soft soil to be analyzed thoroughly. This information provides details of the soil layers, water content, and position of the groundwater level. Piezometer tubes are installed in the ground to measure changes in the groundwater level over a specific period. Ground site investigations also involve in situ and laboratory tests, aerial photographs, and geological maps. From these data, the probable soil conditions and the limits within which each method can be employed in the design can be determined.

Fig. 1 Location plan of the LPT2 Expressway project in Malaysia

These data were also used to calibrate and validate the neural network models, which were built on results from the Plaxis v8 finite element program (PLAXIS 2002). Each condition had 288 cases.

Neural network modeling

An ANN is a system formed by highly interconnected computational units called neurons. The ANN system can learn, recall, and generalize from training data (Attoh-Okine 2002). In this study, ANNs with TFCV and NCV models were used to predict settlement. The data for the models were divided into two sets: training and test. Table 1 and Fig. 2 show the values of some variables used in the ANN and the structure of the ANN used in this research for predicting settlement. To design the neural network, several ANN architectures were examined by varying the number of hidden layers, the number of neurons in each hidden layer, and the training function parameters (Beale and Jackson 1990; Flood and Kartam 1994a, b). The best-performing neural network model was obtained after a number of trials and uses three layers (i.e., input, hidden, and output) with 30, 15, and 1 nodes, respectively. In the structure of a neural network, the number of neurons in each hidden layer is settled once the error of the network reaches a minimum value (Banimahd et al. 2005). All neural models were trained with three algorithms, namely trainlm, trainscg, and traingdx. The mean square error (MSE) and the coefficient of determination (R²) of the parameters in both training and testing are defined as follows:

$$ MSE = \frac{1}{n} \sum_{1}^{n} \left(y_p - y_m\right)^2 $$
(7)
Table 1 Soil parameters adopted and input parameters used in ANN
Fig. 2 Structure of ANN used in this research for predicting settlement

The coefficient of determination R² measures the scatter, or lack of it, between two sets of data and is given as follows:

$$ R^2 = \frac{\left(n \sum_{1}^{n} y_m y_p - \left(\sum_{1}^{n} y_m\right)\left(\sum_{1}^{n} y_p\right)\right)^2}{\left(n \sum_{1}^{n} y_m^2 - \left(\sum_{1}^{n} y_m\right)^2\right)\left(n \sum_{1}^{n} y_p^2 - \left(\sum_{1}^{n} y_p\right)^2\right)} $$
(8)
$$ \mathrm{Efficiency} = 1 - \frac{MSE}{\sigma^2} $$
(9)

where y_m and y_p are the measured and predicted parameters, respectively, and σ is the standard deviation.
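The three performance measures in Eqs. (7) to (9) can be computed directly from paired lists of measured and predicted values. The following is a minimal sketch; it assumes that σ in Eq. (9) is the population standard deviation of the measured values, which the text does not state explicitly, and the function names and sample numbers are hypothetical.

```python
def mse(y_m, y_p):
    """Mean square error between measured and predicted values, Eq. (7)."""
    n = len(y_m)
    return sum((p - m) ** 2 for m, p in zip(y_m, y_p)) / n


def r_squared(y_m, y_p):
    """Coefficient of determination in the form of Eq. (8)."""
    n = len(y_m)
    s_m, s_p = sum(y_m), sum(y_p)
    s_mp = sum(m * p for m, p in zip(y_m, y_p))
    s_mm = sum(m * m for m in y_m)
    s_pp = sum(p * p for p in y_p)
    numerator = (n * s_mp - s_m * s_p) ** 2
    denominator = (n * s_mm - s_m ** 2) * (n * s_pp - s_p ** 2)
    return numerator / denominator


def efficiency(y_m, y_p):
    """Efficiency from Eq. (9); variance of measured values is assumed."""
    n = len(y_m)
    mean_m = sum(y_m) / n
    variance = sum((m - mean_m) ** 2 for m in y_m) / n
    return 1.0 - mse(y_m, y_p) / variance


# Illustrative values only (assumed, not measured settlements)
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = (mse(measured, predicted),
          r_squared(measured, predicted),
          efficiency(measured, predicted))
```

Note that Eq. (8) is the squared linear correlation, so it equals one for any exact linear relation between the two series, whereas the efficiency in Eq. (9) penalizes bias as well as scatter.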

Multiple linear regression model

A multiple linear regression (MLR) model was built to test the relationship between the settlement of SC-reinforced soil and its determinants. This model was compared with the ANN model. The following multiple regression equation was used to predict the settlement for the dataset:

$$ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi} + e_i $$
(10)

where, for a set of i successive observations, the predicted variable y is a linear combination of an offset β_0, a set of p predictor variables x with matching β coefficients, and a residual error e. The β values are commonly derived through the ordinary least squares method. When the regression equation is used in a predictive mode, e is omitted because its expected value is zero. Regression models are inherently linear, although curvilinear relationships can be incorporated through polynomial terms in the regression. Known relationships can be prespecified by transforming a nonlinear predictor variable into a more linear form before using it in the model.
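To make the ordinary least squares step behind Eq. (10) concrete, the sketch below fits the β coefficients by solving the normal equations with Gauss-Jordan elimination. It uses only the standard library so that it stays self-contained; this is an illustrative choice, not the software used in the study, and the function names and toy data are hypothetical.

```python
def fit_ols(X, y):
    """Return [beta_0, beta_1, ..., beta_p] minimizing squared residuals.

    X : list of observations, each a list of p predictor values
    y : list of observed responses
    """
    rows = [[1.0] + [float(v) for v in obs] for obs in X]  # offset column
    k = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y)
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(beta, x):
    """Predictive form of Eq. (10): the residual e is omitted."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


# Toy data generated from y = 1 + 2*x1 - 3*x2 (assumed, for illustration)
X_demo = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y_demo = [1.0, 3.0, -2.0, 0.0, 2.0]
beta = fit_ols(X_demo, y_demo)
```

Because the toy responses are exactly linear in the predictors, the fitted coefficients recover the generating values, which is a simple way to validate the solver before applying it to real data.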

Network training and validation

NCV model

The soft soil settlement data were partitioned into training and validation data, with the former used to develop the network and the latter to verify the predictive quality of the trained model. In total, 75 % of the database was used for training, and the remaining 25 % was used for testing the network prediction. As far as possible, the training data were selected to capture the widest variations in input and output patterns in the database. This selection avoids having extreme data in the testing set, which, as in the case of completely randomized selection, could make the true generalization capability of the model impossible to assess within the domain of the training data (Shahin et al. 2004). The MSE values obtained with the three training algorithms (i.e., trainlm, trainscg, and traingdx), used as an indicator of the accuracy of the results, are shown in Table 2.
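One way to realize such a 75/25 partition is sketched below. As a simplifying assumption, only the cases with the smallest and largest output value are forced into the training set and the remaining cases are shuffled; the study's actual selection rule over all input and output patterns is not described in enough detail to reproduce, so the function name, the seed, and the rule itself are illustrative.

```python
import random


def holdout_split(targets, train_fraction=0.75, seed=0):
    """Return (train_indices, test_indices) for a hold-out partition.

    The cases with the minimum and maximum target are kept in the
    training set so that the test set stays inside the training range.
    """
    n = len(targets)
    order = sorted(range(n), key=lambda i: targets[i])
    forced = {order[0], order[-1]}          # extreme cases go to training
    rest = [i for i in range(n) if i not in forced]
    random.Random(seed).shuffle(rest)       # reproducible shuffle
    n_train = round(train_fraction * n)
    cut = n_train - len(forced)
    return sorted(forced) + rest[:cut], rest[cut:]


# 288 synthetic target values (assumed), matching the number of cases
demo_targets = [((i * 37) % 288) / 10.0 for i in range(288)]
train_idx, test_idx = holdout_split(demo_targets)
```

With 288 cases this gives 216 training and 72 testing cases.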

Table 2 Summary of training and testing results of NCV model (30 15 1)
Fig. 3 Input data before and after training for NCV model (training algorithm trainlm)

TFCV model

TFCV, a standard machine learning method, was used to train and test the ANN models. The test dataset should never be used in the training process of an ANN. In TFCV, the data were separated into ten subsets of roughly equal size. In the first iteration, nine of these subsets were combined and used for training. The remaining subset was used to test the performance of the ANN on unseen cases. This process was repeated for ten iterations until every subset had been used once for testing. The TFCV model was also used to assess the robustness of the ANN.
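The tenfold procedure described above can be sketched as an index generator. The shuffling, the seed, and the function name are assumptions made for this example; the study does not state how cases were assigned to the ten subsets.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding gives k folds whose sizes differ by at most one
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# 288 cases, as in the dataset size quoted in the text
splits = list(kfold_indices(288, k=10))
```

Each case appears in exactly one test fold, so every case is used once for testing and nine times for training, which is the property that distinguishes TFCV from a single hold-out split.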

The early stopping procedure was used to prevent the ANN model from overfitting and to keep it generalizable to future cases (Bishop 1995; Mitchell 1997). Generalizability is the capability of a model to retain a similar predictive performance and to give correct results for input data not seen during the training process. It can be assessed by comparing the error rate on the training data with the error rate on novel data. The more powerful the model (i.e., the network), the greater the risk that it learns peculiarities specific to the training samples and loses generalization; therefore, training must be stopped before this happens (“early stopping”) to avoid overfitting (Haykin 1999).
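A common way to implement early stopping is to monitor the error on held-out data after each epoch and stop once it has not improved for a fixed number of epochs. The sketch below assumes such a patience rule; the stopping criterion actually used in the study is not stated, and the function name, the patience value, and the error sequence are hypothetical.

```python
def early_stopping_epoch(val_errors, patience=5):
    """Return (best_epoch, best_error) under a patience-based rule.

    val_errors : validation error recorded after each training epoch
    patience   : epochs without improvement tolerated before stopping
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # validation error has stopped improving
    return best_epoch, best_error


# Illustrative validation errors (assumed): improvement, then overfitting
errors = [1.0, 0.6, 0.4, 0.35, 0.36, 0.38, 0.41, 0.45, 0.2]
stopped = early_stopping_epoch(errors, patience=3)
```

In practice the network weights saved at the best epoch are restored after stopping. A small patience may stop before a later improvement is reached, as the last value in the example sequence shows, so the choice is a trade-off between training time and the risk of stopping too soon.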

The ANN model is designed to have a large number of hidden nodes, as ANNs with a large number of hidden nodes generalize better than networks with a small number of hidden nodes when trained with back propagation and “early stopping” (Caruana et al. 2001; Lawrence et al. 1997; Weigend 1994).

The TFCV neural network model is an alternative neural network type with trainable activation, classified data, and the capability to achieve the desired accuracy. In this study, the capability of the TFCV neural network model to simulate the settlement behavior of reinforced soft clay under a highway embankment was investigated. In the training process, the model attempts to distinguish and recognize the data and brings the forecast values very close to the actual values (Figs. 3 and 4).

Fig. 4 Input data before and after training for TFCV model (training algorithm trainlm)

Relative importance of input (RI)

In 1991, Garson proposed a simple technique for interpreting the relative importance of input parameters by examining the connection weights of the trained network (Goh 1994; Shahin et al. 2002a). A connection weight approach was used to evaluate the importance of the inputs (i.e., SC and embankment parameters) in predicting the output (settlement) of the ANNs. The connection weight method sums the products of the input-hidden and hidden-output connection weights between each input neuron and the output neuron for all input variables (Olden et al. 2004). The relative importance of input variable i is determined from the following formula:

$$ RI_i = \frac{\sum_{j=1}^{m} W_{ij} W_{jk}}{\sum_{i=1}^{n} \sum_{j=1}^{m} W_{ij} W_{jk}} \times 100\%, \qquad i = 1, 2, 3, \dots, n, \quad j = 1, 2, 3, \dots, m $$
(11)

where RI_i is the relative importance (expressed in percentage) of the variable i in the input layer on the output variable, j is the index number of the hidden node, W_ij is the connection weight between input variable i and hidden node j, and W_jk is the connection weight between hidden node j and the output node k.
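Equation (11) can be evaluated directly from the two weight matrices of a trained single-hidden-layer network. The sketch below assumes one output node and signed weight products exactly as written in Eq. (11); the function name and the weights are hypothetical and far smaller than the 30-15-1 network of the study.

```python
def relative_importance(w_input_hidden, w_hidden_output):
    """Relative importance (in percent) of each input, Eq. (11).

    w_input_hidden  : w_input_hidden[i][j] links input i to hidden node j
    w_hidden_output : w_hidden_output[j] links hidden node j to the output
    """
    products = [sum(w_ij * w_jk for w_ij, w_jk in zip(row, w_hidden_output))
                for row in w_input_hidden]
    total = sum(products)
    if total == 0:
        raise ValueError("weight products sum to zero; RI is undefined")
    return [100.0 * p / total for p in products]


# Hypothetical weights: 3 inputs, 2 hidden nodes, 1 output
w_ih = [[0.8, 0.1],
        [0.1, 0.2],
        [0.2, 0.1]]
w_ho = [1.0, 2.0]
ri = relative_importance(w_ih, w_ho)
```

Because the products keep their signs, contributions of opposite sign can cancel, and individual RI values can be negative or exceed 100 % when the weights have mixed signs. This should be kept in mind when interpreting the percentages.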

Results and discussion

Result comparisons

The capability of the ANN to learn is supported by the degree of acceptability of the model in training and testing. The convergence of the testing data with the forecast data is taken as the standard for the quality and robustness of the model. The MLR model was developed using the same input and output parameters and then compared with the ANN model. The results indicate that the R estimated for the MLR model is relatively low (R = 0.9117).

Figures 3a, b and 4a, b show the training data of the BPN using the training algorithm trainlm for the NCV and TFCV models, respectively. Before training, the data points appeared scattered and did not match the field data. After training, however, the model brought the predicted data points close to the actual data.

The BPN with the trainlm training algorithm provided better prediction quality than the other algorithms for both the TFCV and NCV models. As shown in Tables 2 and 3, TFCV produced higher efficiency for training (0.9935) and testing (0.9716) than did NCV (0.9909 and 0.9691, respectively).

Table 3 Summary of training and testing results of TFCV model (30 15 1)

In Fig. 5a, b, the BPN outputs are compared with the training data for the TFCV and NCV models, respectively. The TFCV simulations correlated better with the training data than the NCV simulations did (R = 0.996771 for TFCV and 0.99605 for NCV).

Fig. 5 Comparison of actual versus predicted settlement. a Tenfold cross validation ANN and b non cross validation ANN

The predictions of the TFCV and NCV models are compared with the measured and predicted data in Fig. 6a, b, respectively. After tracking and monitoring all data points, all measured and predicted data points matched, except for a few points along the curve (Fig. 6a). The NCV model in Fig. 6b had a different data behavior (measured with predicted data). However, a difference was noted in the third plot. This result indicates that the TFCV model has a high level of learning and a high quality of prediction. Further comparison of forecasting values of ANN, MLR, and Priebe models was also made against the measured values (Fig. 7). The figure clearly shows that ANN (TFCV model) provides the closest estimate of settlement and is therefore the most accurate.

Fig. 6 Comparison of actual versus predicted settlement. a Non cross validation ANN and b tenfold cross validation ANN

Fig. 7 Performance comparison of various settlement prediction methods

Parametric study and sensitivity analysis

An attempt was made to identify which of the input parameters has the greatest effect on the settlement behavior of SC embedded in soft clay (model output). Thus, a sensitivity analysis was carried out on the neural network. The whole computation was repeated for each output neuron. Figure 8 summarizes the relative importance of the input variables of the TFCV model.

Fig. 8 Relative importance of input variables of the TFCV artificial neural model using Eq. (11)

The results of the parametric study carried out to assess the generalization ability of the TFCV model are presented in Fig. 8. The results showed good agreement between the TFCV model response and the expected settlement behavior of SC when subjected to embankment loading. The angle of internal friction had the highest importance (72 %), whereas the diameter of SC had the lowest importance (2 %) when compared with the other input variables. The predicted settlement decreased as the angle of internal friction of the SC material, the diameter of SC, and the length of SC increased (Fig. 9). The predicted settlement increased with increasing spacing between SC and height of the embankment. The consistent decrease in the predicted settlement with increases in the first three parameters (Fig. 9) and the increase with increases in the other two parameters indicate good agreement between the TFCV model and the actual settlement. Soft clay with SC shows improved settlement behavior, and soft soil properties affect the settlement behavior of soft clay (Priebe 1991, 1995). The settlement varied proportionally with spacing between SC and height of the embankment. With regard to the behavior of foundation soil, the height of the embankment varied inversely with settlement. The settlement started with small values under light load (1 to 3 m fill embankment), after which a heavy load of embankment material produced high values of settlement. This result agrees with the study carried out.

Fig. 9 Sensitivity of tenfold cross validation model to the internal friction angle of SC, spacing between SC, diameter of SC, length of SC, and height of embankment

Conclusions

In this study, two geotechnical applications were performed using ANN to simulate the settlement curve. The development of the two models, TFCV and NCV, was based on the proposed simple input data. The input data consisted of the angle of internal friction, spacing between SC, diameter and length of SC, and height of the embankment. The other parameters were soil types and installation methods. Based on R², MSE, and settlement quality, a significant improvement was observed when the results of the models using various algorithms were compared. The trainlm algorithm gave better results than the other algorithms. The ANN model had the lowest RMSE values for in-sample and out-of-sample forecasting. These results indicate that the nonlinear ANN model can generate a better fit than the MLR model.

The proposed TFCV model was more accurate in prediction than the NCV model. The sensitivity analysis indicates that the settlement predicted by the TFCV model agrees with the underlying physical behavior of settlement documented in prior knowledge. The angle of internal friction of the SC material had a high relative importance (72 %) compared with the other parameters. Based on the parametric study, the TFCV model responded reasonably well to various input parameters in a manner consistent with the anticipated behavior of soft clay soil reinforced with SC under a highway embankment in the LPT2 project.