Abstract
Driven piles are commonly used to transfer loads from the superstructure through weak strata onto stiffer soils or rocks. For driven piles, the impact of the piling hammer induces compression and tension stresses in the piles. Hence, an important design consideration is to check that the strength of the pile is sufficient to resist the stresses caused by the impact of the pile hammer. Due to its complexity, pile drivability lacks a precise analytical theory or understanding of the phenomena involved. In situations where measured or numerical data are available, various soft computing methods have been shown to offer great promise for mapping the nonlinear interactions between a system's inputs and outputs. In this study, two soft computing methods, the back-propagation neural network (BPNN) and multivariate adaptive regression splines (MARS) algorithms, were used to assess pile drivability in terms of the Maximum compressive stresses, Maximum tensile stresses, and Blow per foot. A database of more than four thousand piles is utilized for model development and a comparative assessment of the BPNN and MARS predictions.
Keywords
- Back propagation neural network
- Multivariate adaptive regression splines
- Pile drivability
- Computational efficiency
- Nonlinearity
1 Introduction
Driven piles are commonly used to transfer the loads from the superstructure through weak strata onto stiffer soils or rocks. For these piles, the impact of the piling hammer induces compression and tension stresses in the piles. Hence, an important design consideration is to ensure that the strength of the pile is sufficient to resist the stresses introduced by the impact of the pile hammer. One common method of calculating the driving stresses is based on the stress-wave theory [18] which involves the discrete idealization of the hammer-pile-soil system. Considering that the conditions at each site are different, generally a wave equation based computer program is required to generate the pile driving criteria for each individual project. The pile driving criteria include:
- Hammer stroke versus Blow per foot BPF (1/set) for required bearing capacity,
- Maximum compressive stress versus BPF,
- Maximum tension stress versus BPF.
However, this process can be rather time-consuming and requires very specialized knowledge of the wave equation program.
The essence of modeling/numerical mapping is prediction, which is obtained by relating a set of variables in input space to a set of response variables in output space through a model. The analysis of pile drivability involves a large number of design variables and nonlinear responses, particularly with statistically dependent inputs. Thus, the commonly used regression models become computationally impractical. Another limitation is the strong model assumptions made by these regression methods.
An alternative soft computing technique is the artificial neural network (ANN). The ANN structure consists of one or more layers of interconnected neurons or nodes. Each link connecting each neuron has an associated weight. The “learning” paradigm in the commonly used Back-propagation (BP) algorithm [14] involves presenting examples of input and output patterns and subsequently adjusting the connecting weights so as to reduce the errors between the actual and the target output values. The iterative modification of the weights is carried out using the gradient descent approach and training is stopped once the errors have been reduced to some acceptable level. The ability of the trained ANN model to generalize the correct input-output response is performed in the testing phase and involves presenting the trained neural network with a separate set of data that has never been used during the training process.
This paper explores the use of the ANN and another soft computing technique known as multivariate adaptive regression splines (MARS) [3] to capture the intrinsic nonlinear and multidimensional relationships associated with pile drivability. As with neural networks, no prior information on the form of the numerical function is required for MARS. The main advantages of MARS lie in its capacity to capture the intrinsically complicated data mapping in high-dimensional data patterns, to produce simpler, easier-to-interpret models, and to perform analysis of parameter relative importance. Previous applications of the MARS algorithm in civil engineering include predicting doweled pavement performance, estimating the shaft resistance of piles in sand and the deformation of asphalt mixtures, analyzing shaking table tests of reinforced soil walls, determining the undrained shear strength of clay, predicting liquefaction-induced lateral spread, assessing the ultimate and serviceability performances of underground caverns, estimating EPB tunnel-induced ground surface settlement, and inverse analysis for braced excavation [1, 7, 8, 12, 13, 15–17, 19–23]. In this paper, back-propagation neural network (BPNN) and MARS models are developed for pile drivability predictions in relation to the Maximum compressive stresses (MCS), Maximum tensile stresses (MTS), and Blow per foot (BPF). A database of more than four thousand piles is utilized for model development and a comparative assessment of the BPNN and MARS predictions.
2 Methodologies
2.1 Back-Propagation Algorithm
A three-layer, feed-forward neural network topology, shown in Fig. 1, is adopted in this study. As shown in Fig. 1, the back-propagation algorithm involves two phases of data flow. In the first phase, the input data are presented forward from the input to the output layer to produce an actual output. In the second phase, the errors between the target values and the actual values are propagated backwards from the output layer to the previous layers, and the connection weights are updated to reduce these errors. No effort is made to keep track of the characteristics of the input and output variables. The network is first trained using the training data set. The objective of the network training is to map the inputs to the output by determining the optimal connection weights and biases through the back-propagation procedure. The number of hidden neurons is typically determined through a trial-and-error process; normally the smallest number of neurons that yields satisfactory results (judged by the network performance in terms of the coefficient of determination R2 on the testing data set) is selected. In the present study, a Matlab-based back-propagation algorithm (BPNN) with the Levenberg-Marquardt (LM) algorithm [2] was adopted for the neural network modeling.
2.2 Multivariate Adaptive Regression Splines Algorithm
MARS was first proposed by Friedman [3] as a flexible procedure for organizing relationships between a set of input variables and the target dependent variable that are nearly additive or involve interactions with fewer variables. It is a nonparametric statistical method based on a divide-and-conquer strategy in which the training data sets are partitioned into separate piecewise linear segments (splines) of differing gradients (slopes). MARS makes no assumptions about the underlying functional relationships between the dependent and independent variables. In general, the splines are connected smoothly together, and these piecewise curves (polynomials), also known as basis functions (BFs), result in a flexible model that can handle both linear and nonlinear behavior. The connection/interface points between the pieces are called knots. Marking the end of one region of data and the beginning of another, the candidate knots are placed at random positions within the range of each input variable.
MARS generates BFs by stepwise searching over all possible univariate candidate knots and across interactions among all variables. An adaptive regression algorithm is adopted for automatically selecting the knot locations. The MARS algorithm involves a forward phase and a backward phase. The forward phase places candidate knots at random positions within the range of each predictor variable to define a pair of BFs. At each step, the model adapts the knot and its corresponding pair of BFs to give the maximum reduction in sum-of-squares residual error. This process of adding BFs continues until the maximum number is reached, which usually results in a very complicated and overfitted model. The backward phase involves deleting the redundant BFs that made the least contributions. An open MARS source code from [10] is adopted in performing the analyses presented in this paper.
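The forward-phase knot search can be sketched in Python for a single input variable. Because the paired hinges \( \max(0, x - t) \) and \( \max(0, t - x) \) are never simultaneously nonzero, the least-squares fit decouples into two one-variable regressions; the intercept and any existing BFs are omitted for brevity, so this is a minimal sketch rather than the full MARS algorithm.

```python
def hinge_pair_fit(xs, ys, t):
    """Least-squares fit of y ~ a*max(0, x-t) + b*max(0, t-x) for knot t.

    The two hinges are never nonzero at the same x, so the normal
    equations decouple into two one-variable regressions."""
    u = [max(0.0, x - t) for x in xs]
    v = [max(0.0, t - x) for x in xs]
    a = sum(ui * yi for ui, yi in zip(u, ys)) / (sum(ui * ui for ui in u) or 1.0)
    b = sum(vi * yi for vi, yi in zip(v, ys)) / (sum(vi * vi for vi in v) or 1.0)
    sse = sum((yi - a * ui - b * vi) ** 2 for yi, ui, vi in zip(ys, u, v))
    return a, b, sse


def forward_step(xs, ys):
    """Place candidate knots at the data points and keep the knot giving
    the maximum reduction in sum-of-squares residual error."""
    return min(xs, key=lambda t: hinge_pair_fit(xs, ys, t)[2])
```

For data generated as y = 2·max(0, x − 3), the search recovers the knot at t = 3, since the hinge pair placed there reduces the residual error to zero.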
Let y be the target dependent response and X = (X1, …, XP) be a matrix of P input variables. It is then assumed that the data are generated by an unknown "true" model. For a continuous response, this would be

$$ y = f(X_{1}, \ldots, X_{P}) + e = f(X) + e $$

in which e is the fitting error and f is the built MARS model, comprising BFs, which are spline (piecewise polynomial) functions. For simplicity, only the piecewise linear function is expressed and considered in this paper. Piecewise linear functions follow the form \( \max (0,x - t) \) with a knot defined at value t; the expression \( \max ( \cdot ) \) means that only the positive part of the argument is used, and it is otherwise assigned a zero value. Formally,

$$ \max (0, x - t) = \begin{cases} x - t, & x \ge t \\ 0, & \text{otherwise} \end{cases} $$
The MARS model f(X), which is a linear combination of BFs and their interactions, is expressed as

$$ f(X) = \beta_{0} + \sum\limits_{m = 1}^{M} \beta_{m} \lambda_{m} (X) $$

where each \( \lambda_{m} \) is a BF. It can be a spline function, or an interaction BF produced by multiplying an existing term with a truncated linear function involving a new/different variable (higher orders can be used only when the data warrant it; for simplicity, at most second-order interactions are adopted). The terms \( \beta \) are constant coefficients, estimated using the least-squares method.
Figure 2 illustrates how the MARS algorithm uses piecewise linear spline functions to fit the provided data patterns. The resulting MARS equation is as follows
in which BF1 = max(0, x − 17), BF2 = max(0, 17 − x) and BF3 = max(0, x − 5), where max(a, b) equals a if a > b and b otherwise. The knots are located at x = 5 and x = 17. These two knots divide the x range into three intervals in which different linear relationships are identified.
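The three basis functions above can be written directly in code; the regression coefficients of the original equation are not reproduced here, so this minimal sketch only shows how the two knots split the range of x into three intervals.

```python
def bf1(x):
    # active only to the right of the knot at x = 17
    return max(0.0, x - 17)

def bf2(x):
    # active only to the left of the knot at x = 17
    return max(0.0, 17 - x)

def bf3(x):
    # active only to the right of the knot at x = 5
    return max(0.0, x - 5)
```

At x = 10 (between the knots) only BF2 and BF3 are active, so the fitted model there is a different linear combination than on x < 5 or x > 17.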
The MARS modeling is a data-driven process. To construct the model in Eq. (3), the forward phase is first performed on the training data, starting initially with only the intercept \( \beta_{0} \). At each subsequent step, the basis pair that produces the maximum reduction in the training error is added. For a current model with M basis functions, the next pair added to the model takes the form

$$ \hat{\beta}_{M + 1} \lambda_{l} (X)\max (0, X_{j} - t) + \hat{\beta}_{M + 2} \lambda_{l} (X)\max (0, t - X_{j}) $$

in which \( \lambda_{l} \) is a BF already in the model, \( X_{j} \) is a candidate input variable, t is a candidate knot, and each \( \beta \) is estimated by the least-squares method. This process of adding BFs continues until the model reaches some predetermined maximum number of terms, generally leading to a purposely overfitted model.
The backward phase improves the model by removing the less significant terms until it finds the best sub-model. Model subsets are compared using the computationally inexpensive method of Generalized Cross-Validation (GCV). The GCV is the mean squared residual error divided by a penalty that depends on the model complexity. For training data with N observations, the GCV is calculated as [9]

$$ \mathrm{GCV} = \frac{\frac{1}{N}\sum\limits_{i = 1}^{N} \left( y_{i} - f(x_{i}) \right)^{2}}{\left( 1 - \frac{M + d(M - 1)/2}{N} \right)^{2}} $$

in which M is the number of BFs, d is a penalty for each basis function included in the developed sub-model, N is the number of data sets, and \( f(x_{i} ) \) denotes the MARS predicted values. The numerator is the mean squared error of the evaluated model on the training data, penalized by the denominator, which accounts for the increasing variance of an increasingly complex model. Note that \( (M - 1)/2 \) is the number of hinge-function knots, so the GCV penalizes not only the number of BFs but also the number of knots. A default value of 3 is assigned to the penalizing parameter d; further suggestions on choosing d can be found in [3]. At each deletion step, the basis function whose removal most reduces the GCV is pruned, until an adequately fitting model is found.
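The GCV criterion can be computed directly from this description; the sketch below follows the standard form with complexity C(M) = M + d(M − 1)/2 and the default penalty d = 3.

```python
def gcv(y_obs, y_pred, n_bfs, d=3.0):
    """Generalized Cross-Validation score for a MARS sub-model.

    Numerator: mean squared residual error on the N training points.
    Denominator: (1 - C(M)/N)^2, where C(M) = M + d*(M-1)/2 penalizes
    both the number of BFs and the number of hinge-function knots."""
    n = len(y_obs)
    mse = sum((y - f) ** 2 for y, f in zip(y_obs, y_pred)) / n
    complexity = n_bfs + d * (n_bfs - 1) / 2.0
    return mse / (1.0 - complexity / n) ** 2
```

Given two sub-models with the same residual error, the one with more BFs receives the worse (larger) GCV score, which is what drives the backward pruning.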
After the optimal MARS model is determined, by grouping together all the BFs involving one variable and another grouping of BFs involving pairwise interactions, the analysis of variance (ANOVA) decomposition procedure [3] can be used to assess the parameter relative importance based on the contributions from the input variables and the BFs.
3 Performance Measures
Table 1 shows the performance measures and the corresponding definitions utilized for prediction comparison of the two surrogate methods.
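The four measures compared in Sect. 7 (R2, r, RRMSE and ρ) can be sketched in Python using their commonly used definitions; these definitions are assumed to match those in Table 1, and the performance index ρ = RRMSE/(1 + r) follows Gandomi and Roke [4].

```python
import math

def r2(y, f):
    # coefficient of determination: 1 - SS_res / SS_tot
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def pearson_r(y, f):
    # Pearson correlation between observed and predicted values
    n = len(y)
    ybar, fbar = sum(y) / n, sum(f) / n
    num = sum((yi - ybar) * (fi - fbar) for yi, fi in zip(y, f))
    den = math.sqrt(sum((yi - ybar) ** 2 for yi in y)
                    * sum((fi - fbar) ** 2 for fi in f))
    return num / den

def rrmse(y, f):
    # root-mean-square error relative to the mean observed value
    n = len(y)
    rmse = math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, f)) / n)
    return rmse / abs(sum(y) / n)

def rho(y, f):
    # performance index: low RRMSE and high correlation give a small rho
    return rrmse(y, f) / (1.0 + pearson_r(y, f))
```

A perfect prediction gives R2 = r = 1 and RRMSE = ρ = 0, so smaller ρ indicates a better model.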
4 Pile Drivability Data Sets
In this paper, a database containing 4072 piles with a total of seventeen variables is developed from information on piles already installed for bridges in the State of North Carolina [11]. The seventeen variables, including hammer characteristics, hammer cushion material, pile and soil parameters, ultimate pile capacities, and stroke, were regarded as inputs to estimate the three dependent responses comprising the Maximum compressive stresses (MCS), Maximum tensile stresses (MTS), and Blow per foot (BPF). A summary of the input variables and outputs is listed in Table 2.
To simplify the analyses, given the extensive number of parameters and the large data set, Jeon and Rahman [11] divided the data into five categories (Q1–Q5) based on the ultimate pile capacity, as detailed in Table 3. In this paper, for each category 70% of the data patterns were randomly selected as the training dataset and the remaining data were used for testing. For details of the entire data set as well as of each design variable and response, refer to the report by Jeon and Rahman [11].
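The per-category 70/30 split described above can be sketched as follows; the record layout and the fixed random seed are illustrative assumptions, not details from the report.

```python
import random

def split_by_category(records, train_frac=0.7, seed=0):
    """Randomly pick train_frac of the data patterns within each capacity
    category (Q1-Q5) for training; the remainder is kept for testing.

    records: list of (category, pattern) pairs, e.g. ("Q1", {...})."""
    rng = random.Random(seed)
    grouped = {}
    for cat, pattern in records:
        grouped.setdefault(cat, []).append((cat, pattern))
    train, test = [], []
    for cat in sorted(grouped):
        group = grouped[cat][:]
        rng.shuffle(group)
        k = round(train_frac * len(group))
        train.extend(group[:k])
        test.extend(group[k:])
    return train, test
```

Splitting within each category, rather than over the pooled database, keeps the capacity ranges Q1–Q5 represented in both the training and testing sets.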
5 BPNN Models
For simplicity, only BPNN models with a single hidden layer are considered. The optimal BPNN model is selected from models with different numbers of hidden neurons, since the other main parameters of the BPNN algorithm have been fixed as:
logsig transfer function from the input layer to the hidden layer;
tansig transfer function from the hidden layer to the output layer;
maxepoch = 500;
learning rate = 0.01;
min_grad = 1 × 10−15;
decrease factor mu_dec = 0.7;
increase factor mu_inc = 1.03.
5.1 The Optimal BPNN Model
The BPNN with the highest coefficient of determination R2 value for the testing data sets is considered to be the optimal model. Figure 3 plots the R2 values of the testing data sets for BPNN models with different numbers of neurons (from 5 to 15) in the hidden layer for the MCS, MTS and BPF predictions. It can be observed that for the optimal MCS, MTS, and BPF models, the number of neurons in the hidden layer is 9, 7 and 11, respectively.
5.2 Modeling Results
Figures 4, 5 and 6 show the BPNN predictions for the training and testing data patterns for MCS, MTS, and BPF, respectively. For the MCS predictions, considerably high R2 values (>0.97) are obtained for both the training and testing patterns. Compared with the MCS predictions, the developed BPNN model is slightly less accurate in predicting the MTS, mainly as a result of the bias (errors) caused by the significantly smaller tensile stress values in comparison with the compressive stresses. For the BPF estimation, high R2 values are also obtained for both the training and testing patterns, with the testing value slightly greater than the training value. These results indicate that the three optimal BPNN models can serve as reliable tools for the prediction of MCS, MTS and BPF.
5.3 Parameter Relative Importance
The parameter relative importance determined by the BPNN is based on the method of Garson [5], as discussed by Goh [6]. Figure 7 gives the plot of the relative importance of the input variables for the three BPNN models. It can be observed that MCS is mostly influenced by the input variable x11 (Slenderness) and MTS is mostly influenced by the input variable x8 (Penetration). Interestingly, BPF is primarily influenced by the input variable x16 (Ultimate pile capacity).
5.4 Model Interpretability
For brevity, only the developed BPNN MCS model is expressed in mathematical form through the trained connection weights, the biases, and the transfer functions. The mathematical expression obtained from the optimal MCS model is shown in Appendix 1.
6 MARS Models
It is assumed that at most second-order interactions are considered for the prediction of MCS, MTS and BPF using MARS. The number of basis functions examined ranges from 2n to n2 (n = 17 in this study); numerical trials indicate that overfitting occurs when the number of BFs exceeds 80.
6.1 The Optimal MARS Model
The MARS model with the highest R2 value and fewer BFs for the testing data set is considered to be the optimal one. Figure 8 plots the R2 values of the testing data sets for the MARS models with different numbers of BFs (from 34 to 78) for the MCS, MTS and BPF predictions. It can be observed that for the optimal MCS, MTS, and BPF models, the number of BFs is 52, 36 and 38, respectively.
6.2 Modeling Results
Figures 9, 10 and 11 show the MARS predictions for the training and testing data patterns for MCS, MTS, and BPF, respectively. For the MCS prediction, considerably high R2 values (>0.95) are obtained for both the training and testing patterns. As in the BPNN analysis, the developed MARS model is less accurate in predicting MTS than MCS, mainly due to the bias brought about by the smaller tensile stress values. For the BPF estimation, high R2 values (>0.90) are also obtained for both the training and testing patterns, with the testing value slightly greater than the training value. Consequently, the three optimal MARS models can serve as reliable tools for the prediction of MCS, MTS and BPF.
6.3 Parameter Relative Importance
Table 4 displays the ANOVA decomposition of the built MARS models for MCS, MTS and BPF, respectively. For each model, the ANOVA functions are listed. The GCV column provides an indication of the significance of the corresponding ANOVA function by listing the GCV value for a model with all BFs corresponding to that particular ANOVA function removed. It is this GCV score that is used to assess whether the ANOVA function makes a significant contribution to the model, or whether it only marginally improves the global GCV score. The #basis column gives the number of BFs comprising the ANOVA function, and the variable(s) column lists the input variables associated with it.
Figure 12 gives the plot of the relative importance of the input variables for the three HP pile drivability models developed by MARS. It can be observed that both MCS and BPF are mostly influenced by the input variable x1 (hammer weight). Interestingly, MTS is primarily influenced by the input variable x6 (the weight of the helmet). It should be noted that since the BPNN and MARS algorithms adopt different methods for assessing parameter relative importance, it is understandable that the two algorithms give different results.
6.4 Model Interpretability
Table 5 lists the BFs of the MCS model. The MARS model is in the form of
7 Discussions
Comparisons of R2, r, RRMSE and ρ, as well as of the interpretability of the built models, between MARS and BPNN are shown in Table 6. It can be observed that the BPNN models are generally slightly more accurate than the MARS models. In terms of model interpretability, however, MARS outperforms BPNN through its easy-to-interpret models. Thus, the two methods can be used to cross-validate each other.
8 Summary and Conclusions
A database containing 4072 pile data sets with a total of 17 variables is adopted to develop the BPNN and MARS models for drivability predictions. Performance measures indicate that the BPNN and MARS models provide similar predictions of pile drivability and can thus be used to cross-validate each other. In addition, the MARS algorithm builds flexible models using simple linear regressions and data-driven stepwise searching, adding and pruning. The developed MARS models are much easier to interpret.
References
Attoh-Okine, N. O., Cooger, K., & Mensah, S. (2009). Multivariate adaptive regression (MARS) and hinged hyperplanes (HHP) for doweled pavement performance modeling. Journal of Construction and Building Materials, 23, 3020–3023.
Demuth, H., & Beale, M. (2003). Neural network toolbox for MATLAB-user guide version 4.1. The Math Works Inc.
Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19, 1–141.
Gandomi, A.H., Roke, D.A. (2013). Intelligent formulation of structural engineering systems. In Seventh MIT Conference on Computational Fluid and Solid Mechanics- Focus: Multiphysics & Multiscale, 12–14 June, Cambridge, USA.
Garson, G. D. (1991). Interpreting neural-network connection weights. AI Expert, 6(7), 47–51.
Goh, A. T. C. (1994). Seismic liquefaction potential assessed by neural networks. Journal of Geotechnical Engineering, ASCE, 120(9), 1467–1480.
Goh, A. T. C., & Zhang, W. G. (2014). An improvement to MLR model for predicting liquefaction-induced lateral spread using multivariate adaptive regression splines. Engineering Geology, 170, 1–10.
Goh, A. T. C., Zhang, W. G., Zhang, Y. M., Xiao, Y., & Xiang, Y. Z. (2016). Determination of EPB tunnel-related maximum surface settlement: A Multivariate adaptive regression splines approach. Bulletin of Engineering Geology and the Environment. https://doi.org/10.1007/s10064-016-0937-8.
Hastie, T., Tibshirani, R., Friedman, J. (2009). The elements of statistical learning: Data mining, inference and prediction, 2nd ed., Springer.
Jekabsons, G. (2010). VariReg: A software tool for regression modelling using various modeling methods. Riga Technical University. http://www.cs.rtu.lv/jekabsons/.
Jeon, J. K., & Rahman, M. S. (2008). Fuzzy neural network models for geotechnical problems. Research Project FHWA/NC/2006-52. North Carolina State University, Raleigh, NC.
Lashkari, A. (2012). Prediction of the shaft resistance of non-displacement piles in sand. International Journal for Numerical and Analytical Methods in Geomechanics, 37, 904–931.
Mirzahosseini, M., Aghaeifar, A., Alavi, A., Gandomi, A., & Seyednour, R. (2011). Permanent deformation analysis of asphalt mixtures using soft computing techniques. Expert Systems with Applications, 38(5), 6081–6100.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representation by error propagation. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing (Vol. 1, pp. 318–362). Cambridge: MIT Press.
Samui, P. (2011). Determination of ultimate capacity of driven piles in cohesionless soil: A multivariate adaptive regression spline approach. International Journal for Numerical and Analytical Methods in Geomechanics, 36, 1434–1439.
Samui, P., Das, S., & Kim, D. (2011). Uplift capacity of suction caisson in clay using multivariate adaptive regression splines. Ocean Engineering, 38(17–18), 2123–2127.
Samui, P., & Karup, P. (2011). Multivariate adaptive regression splines and least square support vector machine for prediction of undrained shear strength of clay. Applied Metaheuristic Computing, 3(2), 33–42.
Smith, E. A. L. (1960). Pile driving analysis by the wave equation. Journal of the Engineering Mechanics Division ASCE, 86, 35–61.
Zarnani, S., El-Emam, M., & Bathurst, R. J. (2011). Comparison of numerical and analytical solutions for reinforced soil wall shaking table tests. Geomechanics & Engineering, 3(4), 291–321.
Zhang, W. G., & Goh, A. T. C. (2013). Multivariate adaptive regression splines for analysis of geotechnical engineering systems. Computers and Geotechnics, 48, 82–95.
Zhang, W. G., & Goh, A. T. C. (2014). Multivariate adaptive regression splines model for reliability assessment of serviceability limit state of twin caverns. Geomechanics and Engineering, 7(4), 431–458.
Zhang, W. G., & Goh, A. T. C. (2017). Reliability assessment of ultimate limit state of twin cavern. Geomechanics and Geoengineering, 12(1), 48–59.
Zhang, W. G., Zhang, Y. M., & Goh, A. T. C. (2017). Multivariate adaptive regression splines for inverse analysis of soil and wall properties in braced excavation. Tunneling and Underground Space Technology, 64, 24–33.
Acknowledgements
The authors are grateful for the support of the National Natural Science Foundation of China (No. 51608071) and the Advanced Interdisciplinary Special Cultivation program (No. 106112017CDJQJ208850).
Appendix 1
Calculation of the BPNN Output for the MCS Model
The transfer functions used for the MCS model are the ‘logsig’ transfer function from the input layer to the hidden layer and the ‘tansig’ transfer function from the hidden layer to the output layer. The calculation process of the BPNN output for MCS is elaborated in detail as follows:
From connection weights for a trained NN, it is possible to develop a mathematical equation relating input parameters and the single output parameter Y using
in which b0 is the bias at the output layer, ωk is the weight connection between neuron k of the hidden layer and the single output neuron, bhk is the bias at neuron k of the hidden layer (k = 1, …, h), ωik is the weight connection between input variable i (i = 1, …, m) and neuron k of the hidden layer, xi is input parameter i, and fsig is the sigmoid (logsig and tansig) transfer function.
Using the connection weights of the trained neural network, the following steps can be followed to mathematically express the BPNN model:
Step 1: Normalize the input values for x1, x2, … and x17 linearly using \( X_{norm} = 2(x_{actual} - x_{min} )/(x_{max} - x_{min} ) - 1 \).
Let the actual \( x_{i} = X_{i{\text{a}}} \) and the normalized \( x_{i} = X_{i} \) for \( i = 1, \ldots, 17 \).
Step 2: Calculate the normalized value (Y1) using the following expressions:
Step 3: De-normalize the output to obtain MCS
Note: \( logsig\left( x \right) = 1/\left( {1 + exp\left( { - x} \right)} \right) \) while \( tansig\left( x \right) = 2/\left( {1 + exp\left( { - 2x} \right)} \right) - 1 \).
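The three steps above can be sketched in Python for a generic set of trained weights; the tiny weight matrices in the usage example below are placeholders, not the trained values of the Appendix 1 model.

```python
import math

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def normalize(x, x_min, x_max):
    # Step 1: scale an actual input linearly into [-1, 1]
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def bpnn_output(x_norm, w_ih, b_h, w_ho, b_o):
    # Step 2: logsig activations in the hidden layer, tansig at the output;
    # w_ih[k] holds the weights from all inputs to hidden neuron k
    hidden = [logsig(sum(w * x for w, x in zip(weights, x_norm)) + b)
              for weights, b in zip(w_ih, b_h)]
    return tansig(sum(w * h for w, h in zip(w_ho, hidden)) + b_o)

def denormalize(y_norm, y_min, y_max):
    # Step 3: map the normalized output back to engineering units
    return (y_norm + 1.0) * (y_max - y_min) / 2.0 + y_min
```

With a single input, a single hidden neuron and all weights zero, the network returns tansig(0) = 0, the midpoint of the normalized output range, which is a quick sanity check of the wiring.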
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
Cite this chapter
Zhang, W., Goh, A.T.C. (2018). Modelling of Pile Drivability Using Soft Computing Methods. In: Roy, S., Samui, P., Deo, R., Ntalampiras, S. (eds) Big Data in Engineering Applications. Studies in Big Data, vol 44. Springer, Singapore. https://doi.org/10.1007/978-981-10-8476-8_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8475-1
Online ISBN: 978-981-10-8476-8