1 Introduction

Pile foundations are slender structural elements situated beneath superstructures, commonly used to transfer loads and control settlement at sites where the sub-soil layers are inadequate. Pile bearing capacity and the associated settlement play a key role in the design of pile foundations (Shahin 2013; Tschuchnigg and Schweiger 2015; Unsever et al. 2015; Feng et al. 2016). According to Das (2015), the allowable pile bearing capacity can be determined by dividing the ultimate applied load by a safety factor that depends on the strength of the structure and its serviceability. The associated settlement, on the other hand, is a consequence of an increase in effective stress, which results in elastic compression and a reduction in soil volume in the effective stress zone (Momeni et al. 2014).

The stress–strain relationship of soil is non-linear, and settlement can be driven by an increase in relative vertical stress (Loria et al. 2015; Jebur et al. 2017a). In conventional procedures, pile settlement is determined by dividing the sub-soil into layers and summing the compression of each layer (Tomlinson 2014). Uncertainties associated with a range of factors, i.e. soil stress history, soil compressibility, the nonlinear stress–strain relationship of soil and the stress distribution due to sampling, have been cited as barriers to accurately determining pile settlement (Shahin et al. 2002). Because of these difficulties, there has recently been an increase in the number of experimental and numerical studies concerning pile bearing capacity (Mullapudi and Ayoub 2010; Xu et al. 2013; Naghibi et al. 2014; Madhusudan and Ayothiraman 2015). However, for simplification purposes and by necessity, several assumptions have been made about the significant parameters that govern pile settlement. As a result, the majority of current approaches fail to achieve the required levels of accuracy with respect to pile settlement (Nejad et al. 2009).

Due to this failing, in situ static pile load tests (SPLT), dynamic load tests (DLT), standard penetration tests (SPT) and cone penetration tests (CPT) are still the most accepted methods of measuring pile capacity and its associated settlement. However, while essential, these come with their own difficulties: they are time consuming, complicate the construction process, are not environmentally friendly and are costly (Momeni et al. 2014).

There are situations where computational intelligence (CI), based on artificial neural networks (ANNs), has been introduced and found to be a more robust and accurate approach than other modelling methods (Nejad et al. 2009). An ANN is a data-driven mathematical approach used to mimic the biological structure of the human brain and nervous system (Momeni et al. 2014; Schmidhuber 2015; Jaeel et al. 2016). Recently, the feasibility of ANN applications has been tested and ANNs have been successfully applied to a range of problems in geotechnical engineering, giving acceptable levels of accuracy (Momeni et al. 2014; Alkroosh et al. 2015; Jaeel et al. 2016).

Momeni et al. (2014) conducted a study examining pile-bearing capacity using a hybrid genetic algorithm (GA) approach. To develop the database for network training, 50 in situ pile-loading tests were performed on concrete piles. Four factors were used as the most effective model input parameters affecting pile-bearing capacity: pile set, pile geometrical properties, drop height and hammer weight. Trial and error was also used to select the optimum model. A good fit was achieved when comparing the measured data with the predicted (GA) model output, with a mean square error (MSE) of 0.002. This substantiates the use of ANN as a practical and efficient approach to modelling pile capacity.

Shahin (2014) addressed the feasibility of recurrent neural networks (RNN) to evaluate the bearing capacity of steel piles. Six model input parameters were found to be the most important factors influencing the steel pile bearing capacity: pile diameter, pile effective length, the weighted average cone point resistance over the pile tip zone of failure, the weighted average friction resistance over the pile effective depth, the average cone point resistance over the penetrated depth and the weighted average friction ratio over the pile embedment depth. The results revealed that the RNN model had the ability to simulate the bearing capacity of steel piles with some degree of success.

Despite many investigations highlighting the use of artificial neural networks (ANNs) to simulate pile bearing capacity and settlement, specific gaps remain in the subject knowledge. In particular, comprehensive experimental tests evaluating the bearing capacity of rigid and flexible concrete model piles driven in sand at three different densities, carried out to create an accurate database with which to develop and verify a new Levenberg–Marquardt (LM) algorithm for predicting pile load-settlement response, would represent a breakthrough in deep foundation research.

2 Aim and Objectives

The current investigation has been performed to address gaps in the geotechnical literature in relation to the accurate determination of pile settlement. The specific objectives are to:

  • Perform a series of comprehensive experimental tests to explore the bearing capacity of concrete piles with three aspect ratios (lc/d) of 12, 17 and 25, simulating the response of rigid and flexible piles embedded in sand at three relative densities (Dr): loose, medium and dense.

  • Develop an accurate laboratory database for the ANN model.

  • Utilise a new MATLAB training algorithm, i.e. the Levenberg–Marquardt (LM) based ANN, to develop a predictive model of pile settlement.

  • Carry out hypothesis testing (T-tests and F-tests) to establish how representative the training, validation and testing sub-divisions of the database are with respect to each other.

  • Assess the relative importance (‘Beta’ value) and the statistical significance (‘Sig’ value) of all variables on the model output, and screen the data for outliers, using SPSS-23 software.

3 Materials and Methods

3.1 Sand Properties

The sand consisted of sub-rounded particles, as confirmed by scanning electron microscopy (SEM) observations (Fig. 1a, b). Based on the Unified Soil Classification System (USCS), this sand is classified as poorly graded (SP). The coefficient of uniformity (Cu) and the coefficient of curvature (Cc) are 1.786 and 1.142, respectively. The sand was prepared at three relative densities, Dr, of 18, 51 and 83%, as this represented the entire range of in situ sand density. The minimum and maximum sand unit weights were 15.33 and 17.5 kN/m³, respectively. To limit the impact of the grain size distribution on the soil-pile interaction, the ratio of the pile diameter to the median grain diameter (d50) of the sand specimen should be at least 45 (Nunez et al. 1988). To minimise the scale effect and give a precise simulation of the sand-pile interaction, Remaud (1999) suggested that this ratio should be at least 60, while Taylor (1995) proposed that it should be at least 100. In this study, the ratio of the pile diameter to the median grain diameter (d/d50) is 133, as shown in Fig. 2, meeting the scaling law criteria. Testing was carried out following the procedure stated by Akdag and Özden (2013).

Fig. 1 a, b Scanning electron microscopy (SEM) views of the sand specimen

Fig. 2 Profile of particle size gradation in the sand sample
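The sand preparation figures quoted in Sect. 3.1 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original test programme. It assumes that the reported unit weights are dry unit weights and uses the standard relative density definition to estimate the target unit weight for each Dr. It also checks the reported d/d50 against the three thresholds cited above. The function names and the implied d50 value are introduced here for illustration.

```python
# Illustrative sketch; assumes the reported unit weights are dry unit weights.
GAMMA_MIN = 15.33         # kN/m^3, minimum unit weight reported in the text
GAMMA_MAX = 17.5          # kN/m^3, maximum unit weight reported in the text
PILE_DIAMETER_MM = 40.0   # outer pile diameter reported in Sect. 3.2
RATIO_D_D50 = 133.0       # reported ratio of pile diameter to d50

# Minimum d/d50 ratios attributed to each source in the text.
THRESHOLDS = {"Nunez et al. (1988)": 45, "Remaud (1999)": 60, "Taylor (1995)": 100}


def unit_weight_at(dr):
    """Unit weight for a relative density dr (0 to 1).

    Rearranged from the standard definition
    Dr = gamma_max * (gamma - gamma_min) / (gamma * (gamma_max - gamma_min)).
    """
    return GAMMA_MAX * GAMMA_MIN / (GAMMA_MAX - dr * (GAMMA_MAX - GAMMA_MIN))


def scaling_checks(ratio):
    """Return, per cited source, whether the d/d50 ratio meets its threshold."""
    return {source: ratio >= limit for source, limit in THRESHOLDS.items()}


if __name__ == "__main__":
    for dr in (0.18, 0.51, 0.83):
        print(f"Dr = {dr:.0%}: unit weight ~ {unit_weight_at(dr):.2f} kN/m^3")
    # d50 is not stated directly in the text; this value is only implied by d/d50.
    print(f"implied d50 ~ {PILE_DIAMETER_MM / RATIO_D_D50:.2f} mm")
    print(scaling_checks(RATIO_D_D50))
```

The relative density formula returns the minimum and maximum unit weights at Dr = 0 and Dr = 1, which is a quick sanity check on the rearrangement.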

3.2 Pile Loading Procedure

Precast concrete piles were used in this study, with aspect ratios (lc/d) of 12, 17 and 25 and an outer diameter of 40 mm, to investigate the behaviour of rigid and flexible piles (Madhusudan and Ayothiraman 2015). Model concrete piles were chosen since they are highly recommended for deep foundation systems (Feng et al. 2016). For the mechanical loading of the pile, a static load test was run in accordance with the procedure described in ASTM D1143-07 (American Society for Testing and Materials ASTM 2013). Compression loads were applied in increments using a new hydraulic jack system connected at the top of the load cell (type DBBSM), which has a maximum capacity of 10 kN. This was secured between the pile head loading system and the hydraulic ram (model ZE3408E-T). A polytetrafluoroethylene (PTFE) sheet was used in the pile-testing chamber in order to reduce the friction between the chamber and the sand specimen. The PTFE sheet has a friction coefficient of < 0.04, compared with about 0.605 for steel sheet (Young and Freedman 2000). The loads were applied directly onto an aluminium pile cap with a diameter of 150 mm and a thickness of 25 mm.

4 The Levenberg–Marquardt (LM) Algorithm Model Development

The LM training algorithm is a data-driven computing method that is particularly suited to cases where the relationship between the model input and output parameters is nonlinear (Nguyen-Truong and Le 2015). An LM-based artificial neural network (ANN) comprises three types of processing layer, each made up of nodes: an input layer, one or more hidden layers and an output layer (Bashar 2013). Through these layers, the ANN learns and stores the patterns governing the dataset on which the network is built. The hidden layer takes the model input parameters, multiplies them by connection weights, adds or subtracts a bias and passes the transformed result to the output layer. This computational intelligence (CI) approach has been cited as a versatile and efficient computational tool that successfully solves problems that may be difficult to tackle using numerical approaches. The multi-layer back propagation (MLBP) method developed by Rumelhart et al. (1986) is the most robust and popular process for training networks in many fields of engineering and science (Bashar 2013). In this study, the network was trained with the Levenberg–Marquardt (LM) algorithm using MLBP, with the training parameters shown in Table 1.

Table 1 The Levenberg–Marquardt (LM) training parameters
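For orientation, the weight update usually associated with the LM algorithm can be written as below. This is the standard textbook form (see Hagan et al. 1996) rather than a statement of the exact MATLAB implementation used in this study. The symbols \(\mathbf{J}\), \(\mathbf{e}\), \(\mathbf{I}\) and \(\Delta \mathbf{w}\) are notation introduced here for illustration.

$$\Delta \mathbf{w} = - \left( \mathbf{J}^{T} \mathbf{J} + \mu \mathbf{I} \right)^{-1} \mathbf{J}^{T} \mathbf{e}$$

where \(\mathbf{J}\) is the Jacobian of the network errors with respect to the weights and biases, \(\mathbf{e}\) is the vector of network errors, \(\mathbf{I}\) is the identity matrix and \(\mu\) is the Marquardt adjustment parameter. For large \(\mu\) the step approaches a small gradient descent step, while for small \(\mu\) it approaches a Gauss–Newton step, which is why \(\mu\) is typically decreased after a successful step and increased after an unsuccessful one.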

4.1 Pre-processing and Data Classification

To construct the Levenberg–Marquardt (LM) based ANN model and to limit overfitting, the database is randomly divided into three sets: training, validation and testing. The goal of the training dataset is to create the network and fit the model, while the testing set provides an independent check of network performance during the training process. The task of the validation set is to evaluate the final performance and generalisation ability of the ANN model, as reported by Shahin (2016) and Jaeel et al. (2016). To develop an optimum ANN model, all patterns that are present in the database need to be included in the training set. Because the testing dataset is used to check the quality of the ANN model, it needs to be representative of the training set and should consequently comprise all patterns that exist in the available database (Shahin and Jaksa 2005). The database was normalised between 0.0 and 1.0 before the training of the network, to prevent any one factor from dominating another and to allow each independent variable (IV) to receive the same attention during the training process (Majeed et al. 2013; Nejad and Jaksa 2017). It is crucial that the datasets used for training, testing and cross validation represent similar populations (Masters 1993). Accordingly, T-tests and F-tests were conducted on the normalised data, as shown in Table 2, to ensure that the training, cross validation and testing datasets have similar statistical parameters.

Table 2 T-test and F-test results for the (ANN) model inputs and output
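The pre-processing steps described in Sect. 4.1 can be sketched as follows. This is a minimal illustration, not the SPSS or MATLAB procedure used in the study. The source does not state which form of the T-test was applied, so Welch's unequal-variance statistic is used here as an assumption. The helper names and the sample load values are hypothetical.

```python
import math
import statistics


def min_max_normalise(values):
    """Scale a list of values linearly onto the range 0.0 to 1.0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples
    (assumed form; the source does not specify the T-test variant)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        va / len(a) + vb / len(b)
    )


def f_ratio(a, b):
    """Ratio of the larger sample variance to the smaller one."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)


if __name__ == "__main__":
    # Hypothetical applied loads in N, for illustration only.
    loads = [100.0, 200.0, 350.0, 400.0, 650.0, 800.0, 950.0, 1050.0]
    scaled = min_max_normalise(loads)
    subset_a, subset_b = scaled[0::2], scaled[1::2]
    print("t statistic:", welch_t(subset_a, subset_b))
    print("F ratio:", f_ratio(subset_a, subset_b))
```

Each statistic would then be compared with its tabulated critical value at the chosen significance level. Subsets drawn from similar populations give a t statistic near zero and an F ratio near one.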

4.2 Statistical Significance of Each Independent Variable (IV)

The contribution of each independent variable (IV) to the dependent variable (DV) in the constructed model was ascertained by calculating the relative importance, or Beta value, and the statistical significance (p value) using SPSS-23. Any IV with p > 0.05 can be discounted as it has no substantial influence on the model target (Field 2008; Hashim et al. 2017c). Statistically, the closer the absolute Beta value is to one, the more significant the impact of that IV on the model (Pallant 2005; Hashim et al. 2017a, b). Table 3 shows that the applied load and the sand-pile interface friction angle make the highest contributions to the model output, with Beta values of 0.787 and 0.613, respectively. Pile slenderness ratio, axial rigidity and pile length made a lesser contribution to the model output. Moreover, the results in Table 3 show that the maximum Sig value across all variables is less than 0.05, satisfying the statistical criterion. Based on these analyses, the ANN model was trained with five input parameters: applied load, P; pile slenderness ratio, lc/d; pile axial rigidity, EA; pile effective length, lc; and the interface friction angle, δ. The model output was pile settlement, as illustrated in Fig. 3.

Table 3 Results of the statistical analysis

Fig. 3 Sketch of the optimised ANN topology

4.3 Outliers

An outlier can be defined as a data point, or a group of points, that appears to be incompatible with the other observations in the dataset (Walfish 2006). The performance and generalisation ability of the developed model can be strongly influenced by the presence of such extreme points (Hashim et al. 2017c). Therefore, all IVs and DVs should be screened before the training process. The presence of outliers can be tested by determining the Mahalanobis distances (MDs), following the statistical criteria reported by Tabachnick and Fidell (2013). In this investigation, for five IVs, the maximum allowable (critical) MD is 20.52, whereas the highest MD in the experimental dataset was found to be 19.26, as given in Table 3, which indicates the absence of outliers. A summary of the statistical parameters for the model inputs and output is given in Table 4.

Table 4 A statistical summary for the ANN inputs and output variables
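The Mahalanobis screening in Sect. 4.3 can be illustrated with a reduced two-variable example. The sketch below is not the SPSS-23 procedure used in the study, which involved five IVs. It assumes that the reported MDs are squared distances compared against a chi-square critical value at a significance level of 0.001, which is the common convention and gives 20.515 for five variables and 13.816 for two. The data points and function names are hypothetical.

```python
import statistics

# Chi-square critical values at alpha = 0.001 (assumed screening convention).
CHI2_CRITICAL = {2: 13.816, 5: 20.515}


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean,
    using the sample covariance matrix (n - 1 denominator)."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = statistics.variance(xs)
    syy = statistics.variance(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy  # determinant of the 2 x 2 covariance matrix
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^T S^-1 d written out with the explicit 2 x 2 inverse
        distances.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


if __name__ == "__main__":
    # Hypothetical (load in N, settlement in mm) pairs, for illustration only.
    data = [(200, 0.5), (400, 1.1), (600, 1.4), (800, 2.6), (1000, 3.1), (1200, 5.0)]
    md = mahalanobis_sq_2d(data)
    flagged = [p for p, d in zip(data, md) if d > CHI2_CRITICAL[2]]
    print("largest squared MD:", max(md))
    print("flagged as outliers:", flagged)
```

A useful check on any implementation is that the squared distances of a sample sum to (n − 1) times the number of variables.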

4.4 Dataset Size

The dataset must be large enough to develop a reliable relationship between the independent variables (IVs) and the model output, and to obtain efficient model performance (Pallant 2005; Hashim et al. 2017c). For the five input parameters, according to Eq. 1, the dataset used to train the LM algorithm must contain more than 90 points (Tabachnick and Fidell 2013). In this paper, 254 experimental data points were used to run the LM training algorithm, satisfying this criterion.

$$N > 50 + 8*IVs$$
(1)

where N and IVs denote the required sample size and the number of independent variables used to run the LM training algorithm, respectively.
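As a quick worked check of Eq. 1, the snippet below evaluates the threshold for the five input parameters and compares it with the 254 data points reported above. The function name is introduced here for illustration.

```python
def sample_size_threshold(n_ivs):
    """Right-hand side of Eq. 1: the sample size N must exceed 50 + 8 * IVs."""
    return 50 + 8 * n_ivs


if __name__ == "__main__":
    threshold = sample_size_threshold(5)  # five input parameters
    dataset_size = 254                    # data points reported in the text
    print(f"threshold = {threshold}, satisfied = {dataset_size > threshold}")
```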

5 Results and Discussion

5.1 Architecture and ANN Model Performance

The ANN was trained using the Levenberg–Marquardt (LM) algorithm in MATLAB (version R2017a), as it has been reported to be more reliable and faster than other ANN training approaches (Jeong and Kim 2005). Full details of the LM algorithm are beyond the scope of this study but can be found in Hagan et al. (1996). The generalisation ability and performance of the proposed algorithm can be evaluated using different performance indicators suggested in the open literature. In the present paper, the mean square error (MSE) function was used to measure the model performance, with the error goal set at zero. The LOGSIG transfer function (TF) was utilised in the hidden layer and the PURELIN transfer function was used between layers two and three, as shown in Eqs. 2 and 3 and as recommended by Alizadeh et al. (2012). A transfer function is essential in the hidden and output layers in order to transform the weighted sum of all signals arriving at a neuron and so determine its “firing intensity” (Majdi and Beiki 2010; Jebur et al. 2017b).

The experimental dataset, a total of 254 data points, was randomly divided into three subsets: 70% training (178 data points), 15% testing (38 data points) and 15% validation (38 data points). After training, the results revealed that the optimum ANN model consisted of three layers: the input layer, one hidden layer with 10 neurons and an output layer. As mentioned previously, the performance of the LM algorithm was characterised by the mean square error (MSE), as shown in Eq. 4. The main objective of the training dataset is to learn the patterns present in the dataset by updating the ANN biases and weights (Trigo 2000; Jaeel et al. 2016). This training process normally ends when the error value is sufficiently small (Yadav et al. 2014). The performance of the model under training is displayed in Fig. 4, which shows that the minimum mean square error (MSE) was 0.0025192 at epoch 215. It can also be seen that the training process was terminated to avoid overfitting once the cross-validation error started to increase. The variation in the error gradient, the Marquardt adjustment parameter (mu) and the validation checks are presented in Fig. 5. It can be seen that the error gradient is 0.004691, while the mu factor and the number of validation checks are 1e-06 and 6, respectively, at iteration 221.

Fig. 4 Graph presenting the optimum mean square error (MSE) selected during the training process

Fig. 5 Performance diagrams for the ANN trained network
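The data division and the stopping rule described above can be sketched in a few lines. This is an illustration under stated assumptions and not the MATLAB routine used in the study. The split function simply reproduces the 178/38/38 division of 254 points. The stopping function assumes that training halts after six consecutive epochs without an improvement in the monitored error, which is consistent with the best epoch of 215 and the stop at iteration 221 reported above. The error curve is synthetic and all names are hypothetical.

```python
import random


def split_indices(n, train_frac=0.70, test_frac=0.15, seed=0):
    """Randomly divide n sample indices into training, testing and validation
    subsets; the validation subset takes whatever remains."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_frac)
    n_test = round(n * test_frac)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]
    return train, test, validation


def early_stop(errors, max_fail=6):
    """Return (stop_epoch, best_epoch) for a sequence of monitored errors.

    Training stops once max_fail consecutive epochs fail to improve on the
    best error seen so far (assumed stopping rule).
    """
    best_error = float("inf")
    best_epoch = 0
    fails = 0
    for epoch, error in enumerate(errors):
        if error < best_error:
            best_error, best_epoch, fails = error, epoch, 0
        else:
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return len(errors) - 1, best_epoch


if __name__ == "__main__":
    train, test, validation = split_indices(254)
    print(len(train), len(test), len(validation))
    # Synthetic curve: error falls up to epoch 215, then rises.
    curve = [1.0 / (e + 1) for e in range(216)] + [0.01 * k for k in range(1, 11)]
    print(early_stop(curve))
```

Keeping the weights from the best epoch rather than the final one is what prevents the later, overfitted iterations from affecting the reported model.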

The error histogram (EH) is presented in Fig. 6 to provide additional verification of network performance. The EH can also give an indication of outliers, i.e. data points where the fit is significantly poorer than for the majority of the data (Yadav et al. 2014; Abdellatif et al. 2015). In Fig. 6, the red, green and blue bars signify testing, validation and training data, respectively. The majority of the data lie close to the zero error line, which serves as a further check for outliers and for whether the dataset is adequate.

$$z_{j} = f\left( \sum\limits_{i=1}^{5} w_{ij}^{(1)} x_{i} + b_{j}^{(1)} \right) = \frac{1}{1 + \exp\left( - \left( \sum\nolimits_{i=1}^{5} w_{ij}^{(1)} x_{i} + b_{j}^{(1)} \right) \right)}$$
(2)
$$y = \sum\limits_{j=1}^{10} w_{j}^{(2)} z_{j} + b^{(2)}$$
(3)
$$MSE = \frac{1}{n} \sum\limits_{i=1}^{n} \left( measured_{i} - predicted_{i} \right)^{2}$$
(4)

where \(w_{ij}^{(1)}\) and \(b_{j}^{(1)}\) are the synaptic connection weights and threshold biases between the input and hidden layers, respectively, identified during the training process. \(w_{j}^{(2)}\) and \(b^{(2)}\) are the synaptic connection weights and threshold bias of the output layer, also determined during the training process. \(x_{i}\) denotes the input parameters used in the first layer (input layer) and \(z_{j}\) is the output of hidden neuron j. \(f\) is the log-sigmoid transfer function, which is used to transform the weighted sum in the hidden layer. MSE is the mean square error indicator that is utilised to evaluate the performance of the LM trained network by measuring the mean squared difference between the n measured and predicted values.
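Equations 2 to 4 can be sketched directly in code to show how a single prediction is produced by a network with five inputs, ten log-sigmoid hidden neurons and one linear output. The weights and biases below are placeholders chosen for illustration; they are not the trained values from this study, and the function names are introduced here.

```python
import math


def logsig(a):
    """Log-sigmoid transfer function used in the hidden layer (Eq. 2)."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(x, w1, b1, w2, b2):
    """Forward pass: inputs x -> hidden outputs z (Eq. 2) -> linear output y (Eq. 3).

    w1[j][i] is the weight from input i to hidden neuron j, b1[j] its bias,
    w2[j] the weight from hidden neuron j to the output and b2 the output bias.
    """
    z = [
        logsig(sum(w1[j][i] * x[i] for i in range(len(x))) + b1[j])
        for j in range(len(b1))
    ]
    return sum(w2[j] * z[j] for j in range(len(z))) + b2


def mse(measured, predicted):
    """Mean square error between measured and predicted values (Eq. 4)."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


if __name__ == "__main__":
    n_inputs, n_hidden = 5, 10
    # Placeholder parameters, not the trained network.
    w1 = [[0.1 * (j + 1) * (-1) ** i for i in range(n_inputs)] for j in range(n_hidden)]
    b1 = [0.05 * j for j in range(n_hidden)]
    w2 = [0.1] * n_hidden
    b2 = 0.0
    x = [0.5, 0.2, 0.8, 0.4, 0.6]  # normalised inputs in the order P, lc/d, EA, lc, delta
    print("predicted normalised settlement:", forward(x, w1, b1, w2, b2))
```

With all hidden weights and biases set to zero, every hidden neuron outputs 0.5, so the network output reduces to half the sum of the output weights plus the output bias, which gives a simple way to verify an implementation.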

Fig. 6 Error histogram during training, testing and validation

5.2 Evaluation of the Robustness of the ANN Model

In this section, the results of the experimental load-settlement (Q–S) tests are compared with the values predicted by the optimum model of the LM trained network. A series of experimental pile load tests was carried out on concrete model piles. The testing programme comprised three piles with slenderness ratios (lc/d) of 12, 17 and 25 and a diameter of 40 mm, to examine the behaviour of rigid and flexible piles. In total, 254 points were recorded from the experimental pile load-settlement results using two strain type displacement transducers (SDTs) with a 50 mm travel distance, connected to a P3 strain indicator. In addition, the applied loads were recorded using a calibrated load cell (DBBSM series S-Beam) with a maximum capacity of 10 kN. As previously mentioned, the supervised Levenberg–Marquardt (LM) algorithm was utilised to develop and train the ANN in MATLAB (version R2017a).

Figures 7, 8 and 9 illustrate the extent of the fit between the experimental and predicted normalised load-carrying capacity of concrete piles subject to axial loads at different stages of mechanical loading, for loose, medium and dense sand. The load carrying capacity variations are typical for pile foundations subject to axial mechanical loading, i.e. varying from pile head to pile toe due to the increase in developed shaft resistance in the effective soil zone. The results revealed that the load-settlement curves display a clear elastic branch up to a load of about 200 N in loose sand, 400 N in medium sand and 800 N in dense sand, beyond which local nonlinearity is observed. Furthermore, plastic mechanisms in the soil surrounding the pile are the leading cause of the non-linearity of the load-settlement curve; as the applied load increases, the pile response shows nonlinearity until reaching a maximum capacity at a settlement of about 10% of the pile diameter, following the pile load test criteria reported by BSI (BS EN 8004:1986). For model piles with a slenderness ratio (lc/d) of 12 driven in loose sand, the maximum pile capacity is about 520 N, while for piles with lc/d of 17 and 25 the maximum pile capacity is about 750 and 950 N, respectively. Comparing the results of the model piles tested in medium and dense sand, for a model pile with a slenderness ratio of 12, the maximum pile capacity is about 1050 N in medium sand and about three times this value (3150 N) in dense sand. Moreover, the maximum bearing capacity for piles with an aspect ratio of 17 driven in medium and dense sand is about 1400 and 4350 N, respectively. Furthermore, for a pile with a slenderness ratio of 25, the maximum pile capacity in dense sand is almost three times the pile capacity in medium sand.

This increase in pile capacity with increasing pile effective depth (lc) and sand relative density can be attributed to an increase in the point bearing (end bearing) and mobilised skin friction resistance developed within the contacted soil in the effective stress zone. According to the graphical comparisons between the measured and predicted values, for loose sand, the model slightly overestimates the pile load-test curves in the pre-yield range of working settlement. Moreover, there was an excellent fit between the proposed LM training algorithm and the measured values in the post-yield pile load test responses for all cases, with a correlation coefficient (R) of 0.99 for all data. This demonstrates that the proposed algorithm is a reliable method that can be applied to predict pile load-settlement curves with an acceptable level of accuracy.
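The failure criterion used above, a settlement equal to 10% of the pile diameter, can be applied to a measured load-settlement curve by linear interpolation. The sketch below illustrates that step only. The readings are hypothetical and are not the measured data from this study, and the function name is introduced here.

```python
def load_at_settlement(curve, target_mm):
    """Linearly interpolate the load at a target settlement.

    curve is a list of (load_N, settlement_mm) pairs with strictly
    increasing settlement.
    """
    for (q0, s0), (q1, s1) in zip(curve, curve[1:]):
        if s0 <= target_mm <= s1:
            return q0 + (q1 - q0) * (target_mm - s0) / (s1 - s0)
    raise ValueError("target settlement lies outside the measured range")


if __name__ == "__main__":
    pile_diameter_mm = 40.0
    target = 0.10 * pile_diameter_mm  # 4 mm for the 40 mm model pile
    # Hypothetical readings (load in N, settlement in mm), for illustration only.
    readings = [(0, 0.0), (100, 0.4), (200, 1.0), (300, 2.0), (400, 3.5), (450, 5.5)]
    print(f"capacity at {target:.1f} mm settlement:", load_at_settlement(readings, target))
```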

Fig. 7 Profiles of measured versus predicted pile load tests for model piles embedded in loose sand

Fig. 8 Profiles of measured versus predicted pile load tests for model piles embedded in medium sand

Fig. 9 Profiles of measured versus predicted pile load tests for model piles embedded in dense sand

To further evaluate the reliability and performance of the proposed LM algorithm, the predictions were plotted against the corresponding experimental settlement in the form of a regression calibration curve (Fig. 10). As can be seen, the training algorithm satisfies the robustness test. The measured and predicted points match well and lie close to the best-fit line, with correlation coefficients of 0.99139, 0.98565, 0.98819 and 0.99008 for the training, validation, testing and all data, respectively, substantiating the application of the LM algorithm based on ANN as an effective predictive tool that behaves in an acceptable manner.

Fig. 10 Regression profiles of the experimental versus predicted settlement for the training, validation, testing and all data

Lastly, the performance of the LM algorithm was also examined graphically, as demonstrated in Fig. 11. The testing dataset was used to plot a regression calibration curve of measured versus predicted values, with a 95% confidence interval (CI). Close agreement can be observed between the measured and predicted values, with a root mean square error (RMSE) and correlation coefficient (R) of 0.0478 and 0.988, respectively. This also confirms that an ANN trained with the Levenberg–Marquardt (LM) algorithm in MATLAB can successfully reproduce the experimental pile settlement with a high degree of accuracy.

Fig. 11 Calibration plot of resulting model for the testing dataset at a 95% confidence interval (CI)
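The two indicators quoted for the calibration plot can be computed as sketched below. This is a generic illustration of RMSE and the Pearson correlation coefficient rather than the exact routine used in the study, and the sample values are hypothetical. Since RMSE is the square root of MSE, the MSE of 0.0025192 reported for the training run corresponds to an RMSE of about 0.0502.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Hypothetical normalised settlements, for illustration only.
    measured = [0.05, 0.12, 0.20, 0.33, 0.47, 0.61, 0.80]
    predicted = [0.07, 0.11, 0.22, 0.30, 0.49, 0.63, 0.78]
    print("RMSE:", rmse(measured, predicted))
    print("R:", pearson_r(measured, predicted))
    print("RMSE implied by MSE = 0.0025192:", math.sqrt(0.0025192))
```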

6 Concluding Remarks

A series of experimental studies has been conducted to examine the bearing capacity of piles embedded in sandy soil with relative densities (Dr) of 18, 51 and 83%. According to the statistical parameters, the applied load (P), sand-pile friction angle (δ), pile axial rigidity (EA), pile slenderness ratio (lc/d) and pile effective length (lc) were identified as the most important input parameters for the model output, with different weights, in the order P > δ > lc/d > lc > EA. The results of the dataset screening test reveal that the maximum MD is less than the critical value (20.52), which confirms the absence of outliers in the experimental dataset. The LM training algorithm based ANN has favourable features such as simplicity, high efficiency, ease of application and generalisation, which make it an attractive choice for capturing highly non-linear load-settlement responses. In conclusion, based on the graphical comparison of pile carrying capacity and the regression calibration curve, the proposed algorithm can be used as an efficient data-driven approach to accurately model pile settlement, with a root mean square error (RMSE), correlation coefficient (R) and mean square error (MSE) of 0.050192, 0.98819 and 0.0025192, respectively. One of the advantages of the proposed method is that pile settlement can be successfully simulated using the LM algorithm with five input parameters that can be easily determined without the need to perform expensive and time-consuming tests.