1 Introduction

Sand deposits are generally much more heterogeneous than their clay counterparts [1]. Therefore, differential settlements are likely to be higher in sand deposits than in clay profiles [2]. Settlement occurs in cohesionless soils in a short time due to their high permeability [3]. This immediate settlement causes relatively rapid deformation of superstructures, leaving little opportunity to remedy damage and prevent further deformation [1]. Furthermore, excessive settlement occasionally leads to structural failure [4]. Usually, the settlement of shallow foundations, such as pad or strip footings, is limited to 25 mm [5].

Two major criteria, bearing capacity and settlement, control the design of shallow foundations. On cohesionless soils, the settlement criterion is more critical than the bearing capacity criterion. Thus, the settlement criterion usually controls the design process, especially when the breadth of the footing exceeds 1 m [6].

Statistical methods are traditionally used to propose indirect estimations by means of empirical equations [7]. In recent years, new techniques such as artificial neural networks (ANNs) and fuzzy inference systems have been employed to develop predictive models for estimating the needed parameters [7–13]. ANNs are now being used as an alternative statistical tool [7]. They are very sophisticated modeling techniques capable of modeling extremely complex functions [14]. Recently, ANNs have been applied successfully to many problems in geotechnical engineering owing to their performance in modeling nonlinear multivariate problems. ANNs currently attract many researchers studying the settlement prediction of shallow foundations on cohesionless soils (e.g., Shahin et al. [1], Sivakugan et al. [15]). The basic characteristics of ANNs in tackling quantitative and qualitative indexes include large-scale parallel-distributed processing, continuously nonlinear dynamics, collective computation, high fault tolerance, self-organization, self-learning, and real-time treatment [16].

In this study, ANNs, with respect to the above advantages, were utilized to predict the settlement of one-way strip footings without the need for manual work such as using tables or charts. To achieve this, a computer program [17] was developed in the Matlab programming environment to calculate the settlement of one-way footings from five traditional settlement prediction methods, namely Meyerhof [18], Terzaghi and Peck [19], Parry [20], Peck et al. [21], and Burland and Burbidge [22]. The footing geometry (length, L, and width, B), the footing embedment depth, D_f, the bulk unit weight, γ, of the cohesionless soil, the footing applied pressure, Q, and the corrected standard penetration test value, N_cor, were varied during the settlement analyses, and the settlement of each one-way footing was calculated for each method using the written program. Then, an ANN model was developed for each traditional method from the results of the analyses to predict the settlement. For each method, the settlement values predicted by the ANN model were compared with those calculated from the traditional method. Additionally, several performance indices, namely the determination coefficient (R²), variance accounted for (VAF), mean absolute error (MAE), root mean square error (RMSE), and scaled percent error (SPE), were calculated to check the prediction capacity of the ANN models developed. Sensitivity analyses were also carried out to examine the relative importance of the factors affecting settlement prediction.

2 Calculation of settlement of one-way footings on cohesionless soils

In this study, a computer program [17] was written in the Matlab programming environment to calculate the settlement, Δh, of one-way footings on cohesionless soils from standard penetration test (SPT) results using five traditional methods, namely Meyerhof [18], Terzaghi and Peck [19], Parry [20], Peck et al. [21], and Burland and Burbidge [22], given by Eqs. (1)–(5), respectively.

$$ \Updelta h = \Updelta h_{\text{a}} \frac{{q_{\text{net}} }}{{q_{\text{a}} }} $$
(1)
$$ \Updelta h = \frac{{q_{\text{net}} }}{{q_{\text{a}} }}25 $$
(2)
$$ \Updelta h = \frac{{\alpha Bq_{\text{net}} }}{{N_{\text{m}} }}C_{\text{D}} C_{\text{T}} C_{\text{w}} $$
(3)
$$ \Updelta h = \frac{{q_{\text{net}} }}{{q_{\text{a}} C_{\text{w}} }}25 $$
(4)
$$ \Updelta h = q_{\text{net}} B^{0.7} I_{\text{c}} $$
(5)

In Eqs. (1)–(5), B is the footing width, I_c is the compressibility index, q_net is the net applied pressure, q_a is the allowable bearing capacity, Δh_a is the absolute maximum allowable settlement, C_w is the correction for water table depth, α is a constant taken as 200 in SI units, C_D is the factor for the influence of excavation, C_T is the factor for the thickness of the compressible layer, and N_m is the measured average standard penetration value.
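As a minimal sketch, Eqs. (1)–(5) can be written as plain functions of the quantities defined above. The function and argument names are illustrative and not taken from the program in [17]. How q_a, I_c, and the correction factors are derived from SPT data is method-specific and is not given in this section, so they are treated here as inputs supplied by the caller, in units consistent with each method.

```python
def settlement_meyerhof(q_net, q_a, dh_a):
    """Eq. (1): allowable settlement scaled by the pressure ratio."""
    return dh_a * q_net / q_a


def settlement_terzaghi_peck(q_net, q_a):
    """Eq. (2): pressure ratio times 25."""
    return q_net / q_a * 25.0


def settlement_parry(q_net, B, N_m, C_D, C_T, C_w, alpha=200.0):
    """Eq. (3): alpha is the constant taken as 200 in SI units."""
    return alpha * B * q_net / N_m * C_D * C_T * C_w


def settlement_peck_et_al(q_net, q_a, C_w):
    """Eq. (4): as Eq. (2), with the water table correction in the denominator."""
    return q_net / (q_a * C_w) * 25.0


def settlement_burland_burbidge(q_net, B, I_c):
    """Eq. (5): net pressure times B**0.7 times the compressibility index."""
    return q_net * B ** 0.7 * I_c


# Hypothetical input values, chosen only to exercise the formulas.
print(settlement_terzaghi_peck(q_net=100.0, q_a=200.0))
print(settlement_peck_et_al(q_net=100.0, q_a=200.0, C_w=0.5))
```

Written this way, the only difference between Eqs. (2) and (4) is the factor C_w, so a value of C_w below 1 increases the predicted settlement.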

The footing geometry (length, L, and width, B), the footing embedment depth, D_f, the bulk unit weight, γ, of the cohesionless soil, the footing applied pressure, Q, and the corrected standard penetration test value, N_cor, were varied during the settlement analyses as follows. The B value was taken as 1, 2, and 3 m. For each B value, the L value was taken as 10, 20, and 30 m. The γ value for each B–L pair was taken as 16, 18, 20, and 22 kN/m³. The D_f value was varied from 0.5 to 3.5 m in steps of 1.0 m. The Q value was varied from 2,500 to 5,000 kN in steps of 500 kN. The N_cor value was varied from 5 to 45 in steps of 10. Then, the settlement of each one-way footing was calculated for each method using the written program.

The effect of the water table is already reflected in the measured SPT blow count [18]. Thus, the depth of the water table is not included in this study. Square and rectangular footings are taken into account in this study. As found by Burbidge [23], there is no important difference between the settlement of circular and square footings having the same width (B) on the same soil. Therefore, circular footings are also considered equivalent to square footings. A summary of the results is given in Table 1. It can be noted from the table that the Terzaghi and Peck [19] method generally yielded the highest settlement values; the Parry [20] and Burland and Burbidge [22] methods yielded lower settlement values; and the Meyerhof [18] and Peck et al. [21] methods generally yielded similar settlement values, lower than those predicted by Terzaghi and Peck [19] and higher than those predicted by the Parry [20] and Burland and Burbidge [22] methods.
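The parameter ranges listed above can be enumerated as a Cartesian product. The sketch below is illustrative only, and the variable names are assumptions. Note that the full product contains 4,320 combinations, whereas the data set sizes reported in Section 3.2 total 3,840. The text does not say which combinations were excluded, so no filtering is attempted here.

```python
from itertools import product

# Values taken from the ranges stated in the text.
B_VALUES = [1.0, 2.0, 3.0]                          # footing width, m
L_VALUES = [10.0, 20.0, 30.0]                       # footing length, m
GAMMA_VALUES = [16.0, 18.0, 20.0, 22.0]             # bulk unit weight, kN/m^3
DF_VALUES = [0.5, 1.5, 2.5, 3.5]                    # embedment depth, m
Q_VALUES = [2500.0 + 500.0 * k for k in range(6)]   # applied load, kN
NCOR_VALUES = [5 + 10 * k for k in range(5)]        # corrected SPT value


def parameter_grid():
    """Return every (B, L, gamma, D_f, Q, N_cor) combination as a tuple."""
    return list(product(B_VALUES, L_VALUES, GAMMA_VALUES,
                        DF_VALUES, Q_VALUES, NCOR_VALUES))


grid = parameter_grid()
print(len(grid))  # 3 * 3 * 4 * 4 * 6 * 5 = 4320
```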

Table 1 A summary of the results

3 Artificial neural network models

3.1 Brief overview of artificial neural networks

ANNs are a form of artificial intelligence based on the function of the human brain and nervous system [24]. An ANN consists basically of simple processing elements called neurons, which are highly interconnected. Typically, the neurons are organized logically into groupings called layers. An ANN architecture (Fig. 1) consists of three or more layers: an input layer, one or more hidden layers, and an output layer. This ANN architecture is commonly referred to as a fully interconnected feedforward multi-layer perceptron (MLP). Each neuron in a given layer is connected to all the neurons in the next layer by means of weighted connections.

Fig. 1 The ANN architecture

ANNs learn from the data examples fed to them and utilize these data to adjust their weights in an attempt to find a relationship between model inputs and corresponding outputs [24]. Once the learning or training phase of the model has been successfully accomplished, the performance of the trained model has to be validated using an independent validation set. Details of ANNs are beyond the scope of this study and are given elsewhere (e.g., Flood and Kartam [25]).

3.2 Development of artificial neural network models

An ANN model for each traditional method was designed to predict the settlement, Δh, of the one-way footing on cohesionless soils using the neural network toolbox in the Matlab environment (MATLAB 7.0, The MathWorks Inc. 2006). In each ANN model, the footing geometry (length, L, and width, B), the footing embedment depth, D_f, the bulk unit weight, γ, of the cohesionless soil, the footing applied pressure, Q, and the corrected standard penetration test value, N_cor, were used as the input parameters, while the calculated Δh value was the output parameter. The boundaries of the input and output parameters for each method are given in Table 2. The input and output data were then scaled to lie between 0 and 1 using Eq. (6), where x_norm is the normalized value, x is the actual value, x_max is the maximum value, and x_min is the minimum value.

Table 2 Boundaries of the parameters used for the models developed
$$ x_{\text{norm}} = \frac{{(x - x_{\hbox{min} } )}}{{(x_{\hbox{max} } - x_{\hbox{min} } )}} $$
(6)
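The scaling in Eq. (6) and its inverse can be sketched as follows. The function names are illustrative, and the example range uses the D_f limits stated in Section 2 rather than values from Table 2.

```python
def normalize(x, x_min, x_max):
    """Eq. (6): map x from [x_min, x_max] onto [0, 1]."""
    return (x - x_min) / (x_max - x_min)


def denormalize(x_norm, x_min, x_max):
    """Inverse of Eq. (6): recover the actual value from a scaled one."""
    return x_norm * (x_max - x_min) + x_min


# Example with the embedment depth range of 0.5 to 3.5 m.
d_f_scaled = normalize(2.0, 0.5, 3.5)
print(d_f_scaled)                         # 0.5
print(denormalize(d_f_scaled, 0.5, 3.5))  # 2.0
```

The inverse is needed because the network output is also scaled, so a predicted value has to be mapped back to a settlement using the output bounds.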

Overfitting makes MLPs memorize training patterns in such a way that they cannot generalize well to new data [14, 26]. As a result, the cross-validation technique [27], considered to be the most effective method to ensure overfitting does not occur [28], was used as the stopping criterion in this study. In this technique [27], the database is divided into three subsets: training, testing, and validation. The training set is used to adjust the connection weights [29]. The testing set is utilized to check the performance of the model at various stages of training and to determine when to stop training to prevent overfitting [29]. The validation set is used to predict the performance of the trained network in the deployed environment [29]. Shahin et al. [29] investigated the impact of the proportion of the data used in the various subsets on the performance of an ANN model developed for estimating the settlement of shallow foundations and found no exact relationship between the proportion of the data and model performance. However, they obtained the optimal model performance when 20 % of the data were utilized for validation and the remaining data were divided into 70 % for training and 30 % for testing. Therefore, to avoid overfitting, the database was randomly divided into three sets in the same proportions. In total, 56 % of the data (i.e., 2,150 data sets), 24 % (i.e., 922 data sets), and 20 % (i.e., 768 data sets) were used for the training, testing, and validation sets, respectively, in each ANN model developed in this study; the first two figures correspond to 70 % and 30 % of the 80 % left after the validation set is removed.
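The random division described above can be sketched as follows. This is an illustration using Python's standard library, not the procedure of the Matlab toolbox; the function name, the seed, and the use of rounding to obtain whole record counts are assumptions.

```python
import random


def split_indices(n_records, val_frac=0.20, train_frac_of_rest=0.70, seed=0):
    """Shuffle record indices and split them into training, testing,
    and validation subsets (validation is taken first, then the rest
    is divided between training and testing)."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    n_val = round(val_frac * n_records)
    n_train = round(train_frac_of_rest * (n_records - n_val))
    validation = indices[:n_val]
    training = indices[n_val:n_val + n_train]
    testing = indices[n_val + n_train:]
    return training, testing, validation


train_idx, test_idx, val_idx = split_indices(3840)
print(len(train_idx), len(test_idx), len(val_idx))  # 2150 922 768
```

With 3,840 records, these proportions reproduce the subset sizes reported in the text.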

The neural network toolbox of MATLAB 7.0, a popular numerical computation and visualization software [19], was used for training, validation, and testing of MLPs in each ANN model. The Levenberg–Marquardt back-propagation learning algorithm [30] was used in the training stage. One hidden layer with a sufficient number of hidden neurons is capable of approximating any continuous function [31]. Therefore, in this study, one hidden layer was used. The optimum number of neurons in the hidden layer was then determined by starting with one neuron and adding one neuron at a time. The log-sigmoid transfer (activation) function, the one most commonly used in constructing neural networks, was used in each ANN model to achieve the best performance in training as well as in testing. Two momentum factors, μ (0.01 and 0.001), were selected for the training process to search for the most efficient ANN architecture in each ANN model. The coefficient of determination, R², and the MAE were utilized to evaluate the performance of each developed ANN model. The performance of the network during the training and testing processes was examined for each network size until no significant improvement occurred. The flow chart showing the determination of the ANN weights is given in Fig. 2. The optimal ANN performance for each traditional method was obtained with the model having four neurons in the hidden layer and a momentum factor of 0.001.
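To make the selected architecture concrete, the following sketch evaluates a forward pass through a network with six inputs, four hidden neurons, and one output. It is not the trained model: the weights are random placeholders, the function names are illustrative, and applying the log-sigmoid function in the output layer as well as the hidden layer is an assumption, since the text does not specify the output transfer function.

```python
import math
import random


def logsig(z):
    """Log-sigmoid transfer function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W_ih, b_h, w_ho, b_o):
    """Forward pass of a one-hidden-layer MLP with a single output.

    W_ih[j][i] is the weight from input i to hidden neuron j,
    w_ho[j] is the weight from hidden neuron j to the output.
    """
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W_ih, b_h)]
    return logsig(sum(w * h for w, h in zip(w_ho, hidden)) + b_o)


# Placeholder weights (NOT the trained values from the study).
rng = random.Random(0)
n_in, n_hidden = 6, 4
W_ih = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_h = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w_ho = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_o = rng.uniform(-1, 1)

# Inputs ordered as L, B, D_f, gamma, Q, N_cor, already scaled to [0, 1].
x_scaled = [0.5, 0.0, 0.33, 1.0, 0.2, 0.75]
dh_scaled = forward(x_scaled, W_ih, b_h, w_ho, b_o)
print(0.0 < dh_scaled < 1.0)  # True
```

Under this assumption the output always lies in (0, 1), which matches the scaled range of the target settlement values.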

Fig. 2 The flow chart showing the determination of the ANN weights [13]

4 Results and discussion

A comparison of the Δh values calculated from the five traditional methods with the Δh values predicted from the ANN models developed is depicted in Figs. 3, 4, 5, 6 and 7. As seen from the figures, the predicted Δh values are quite close to the calculated Δh values, with R² values very close to unity, which indicates no significant difference between the calculated and predicted Δh values.

Fig. 3 Comparison of calculated Δh values from the Meyerhof [18] method with predicted Δh values from the ANN model developed for (a) training, (b) testing, and (c) validation data sets

Fig. 4 Comparison of calculated Δh values from the Terzaghi and Peck [19] method with predicted Δh values from the ANN model developed for (a) training, (b) testing, and (c) validation data sets

Fig. 5 Comparison of calculated Δh values from the Parry [20] method with predicted Δh values from the ANN model developed for (a) training, (b) testing, and (c) validation data sets

Fig. 6 Comparison of calculated Δh values from the Peck et al. [21] method with predicted Δh values from the ANN model developed for (a) training, (b) testing, and (c) validation data sets

Fig. 7 Comparison of calculated Δh values from the Burland and Burbidge [22] method with predicted Δh values from the ANN model developed for (a) training, (b) testing, and (c) validation data sets

In fact, the coefficient of correlation between the measured and predicted values is a good indicator of the prediction performance of any model developed. In this study, the variance accounted for (VAF), given by Eq. (7), and the RMSE, given by Eq. (8), were also computed to check the prediction capacity of the predictive models developed, as employed by [12, 13, 32–36].

$$ {\text{VAF}} = \left[ {1 - \frac{{\text{var} (y - \hat{y})}}{{\text{var} (y)}}} \right] \times 100 $$
(7)
$$ {\text{RMSE}} = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} {(y_{i} - \hat{y}_{i} )^{2} } } $$
(8)

where var denotes the variance, y is the measured value, \( \hat{y} \) is the predicted value, and N is the number of samples. If VAF is 100 % and RMSE is 0, the model is treated as excellent. The performance indices calculated for the ANN models developed in this study are given in Table 3. Each ANN model exhibited high prediction performance based on the computed performance indices (Table 3).
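Eqs. (7) and (8) translate directly into code. The sketch below also includes the MAE, whose formula is not written out in the text; the usual definition (mean of the absolute errors) is assumed. The function names and sample values are illustrative, not data from the study.

```python
import math
from statistics import pvariance


def vaf(y, y_hat):
    """Eq. (7): variance accounted for, in percent."""
    residuals = [a - b for a, b in zip(y, y_hat)]
    return (1.0 - pvariance(residuals) / pvariance(y)) * 100.0


def rmse(y, y_hat):
    """Eq. (8): root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def mae(y, y_hat):
    """Mean absolute error (standard definition, assumed here)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


# Hypothetical calculated and predicted settlements.
calculated = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]
print(vaf(calculated, predicted))
print(rmse(calculated, predicted))
print(mae(calculated, predicted))
```

Because VAF is built on the variance of the residuals, a constant offset between predicted and calculated values leaves it at 100 % while RMSE and MAE are nonzero, which is one reason to report the indices together.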

Table 3 The details of the performance indices of the ANN models

In addition to the performance indices, the SPE, given by Eq. (9) and employed by Kanibir et al. [37] and Erzin et al. [38], was plotted against cumulative frequency in Figs. 8, 9, 10, 11 and 12 for the Meyerhof [18], Terzaghi and Peck [19], Parry [20], Peck et al. [21], and Burland and Burbidge [22] methods, respectively, to show the performance of the models developed.

Fig. 8 Scaled percent error of the settlements predicted from the ANN model for the Meyerhof [18] method

Fig. 9 Scaled percent error of the settlements predicted from the ANN model for the Terzaghi and Peck [19] method

Fig. 10 Scaled percent error of the settlements predicted from the ANN model for the Parry [20] method

Fig. 11 Scaled percent error of the settlements predicted from the ANN model for the Peck et al. [21] method

Fig. 12 Scaled percent error of the settlements predicted from the ANN model for the Burland and Burbidge [22] method

$$ {\text{SPE}} = \frac{{(\Updelta h_{\text{p}} - \Updelta h_{\text{c}} )}}{{((\Updelta h_{\text{c}} )_{\hbox{max} } - (\Updelta h_{\text{c}} )_{\hbox{min} } )}} $$
(9)

where Δh_p and Δh_c are the predicted and the calculated settlements, and (Δh_c)_max and (Δh_c)_min are the maximum and minimum calculated settlements, respectively. As seen from Figs. 8, 9, 10, 11 and 12, about 95, 97, 95, 91, and 96 % of the settlements predicted from the ANN models developed for the Meyerhof [18], Terzaghi and Peck [19], Parry [20], Peck et al. [21], and Burland and Burbidge [22] methods, respectively, fall within ±2 of the SPE, indicating a very good estimate of the settlement of one-way strip footings. It can therefore be concluded that, for each traditional method, the Δh value of one-way footings can be predicted from the footing geometry (length, L, and width, B), the footing embedment depth, D_f, the bulk unit weight, γ, of the cohesionless soil, the footing applied pressure, Q, and the corrected standard penetration test value, N_cor, using the trained ANN models with acceptable accuracy at the preliminary stage of designing a one-way strip footing.
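The SPE of Eq. (9) and the share of predictions falling inside a given band can be computed as sketched below. The function names and sample values are illustrative. Eq. (9) yields a fraction, and the band of ±2 quoted above is interpreted here as ±2 % (that is, ±0.02); this interpretation is an assumption, so the band is left as a parameter.

```python
def scaled_percent_error(predicted, calculated):
    """Eq. (9): error scaled by the range of the calculated settlements."""
    span = max(calculated) - min(calculated)
    return [(p - c) / span for p, c in zip(predicted, calculated)]


def fraction_within(spe_values, band=0.02):
    """Share of SPE values whose magnitude does not exceed the band."""
    inside = sum(1 for e in spe_values if abs(e) <= band)
    return inside / len(spe_values)


# Hypothetical settlements, not data from the study.
calculated = [10.0, 20.0, 30.0, 40.0, 60.0]
predicted = [10.5, 20.0, 29.5, 42.0, 60.0]

spe = scaled_percent_error(predicted, calculated)
print(fraction_within(spe))  # 0.8
```

Here four of the five errors lie within the band; only the fourth prediction, with an SPE of 0.04, falls outside it.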

Sensitivity analyses were also carried out on the trained network to determine which of the input parameters has the most significant effect on the settlement predictions. A simple and innovative technique proposed by Garson [39], as employed by Shahin et al. [1], was utilized to interpret the relative importance of the input parameters by examining the connection weights of the trained network. For a network with one hidden layer, the technique involves partitioning the hidden–output connection weights into components associated with each input node [1]. When the ratio of the number of free parameters (e.g., connection weights) to the number of data points in the training set is too large, it is difficult to interpret the physical meaning of the relationship found by the ANN [1]. The sensitivity analyses were therefore repeated for networks trained with different initial random weights to check the robustness of the model in its ability to provide information about the relative importance of the physical factors influencing the settlement of one-way footings. In this study, the ratio of the number of weights to the number of data points in the training set is approximately 1:77, and training of the network was repeated four times with different random starting weights. The results of the sensitivity analysis for each traditional method are given in Table 4. From these results (Table 4), N_cor is found to be the most important parameter for each traditional method, followed by L, B, Q, D_f, and γ for the Terzaghi and Peck [19], Parry [20], Peck et al. [21], and Burland and Burbidge [22] methods, and by B, L, Q, D_f, and γ for the Meyerhof [18] method.
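The weight-partitioning idea can be sketched as follows for a network with one hidden layer and a single output. This follows the commonly cited form of Garson's method: the absolute hidden–output weight of each hidden neuron is shared among the inputs in proportion to their absolute input–hidden weights. The function name and the example weights are hypothetical and are not taken from the trained networks. As a side check on the ratio quoted above, six inputs and four hidden neurons give 6 × 4 + 4 = 28 connection weights excluding biases, and 2,150/28 ≈ 77.

```python
def garson_importance(W_ih, w_ho):
    """Relative importance (in percent) of each input.

    W_ih[j][i] is the weight from input i to hidden neuron j,
    w_ho[j] is the weight from hidden neuron j to the output.
    Assumes every hidden neuron has at least one nonzero input weight.
    """
    n_in = len(W_ih[0])
    share = [0.0] * n_in
    for row, v in zip(W_ih, w_ho):
        total = sum(abs(w) for w in row)
        for i, w in enumerate(row):
            # Portion of |v| attributed to input i through this neuron.
            share[i] += abs(w) / total * abs(v)
    overall = sum(share)
    return [100.0 * s / overall for s in share]


# Hypothetical weights for two inputs and two hidden neurons.
W_ih = [[1.0, -3.0],
        [2.0, 2.0]]
w_ho = [1.0, -1.0]
importance = garson_importance(W_ih, w_ho)
print(importance)  # [37.5, 62.5]
```

Only absolute values enter the calculation, so the result ranks the inputs by influence but does not indicate whether an input increases or decreases the predicted settlement.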

Table 4 The results of the sensitivity analysis

5 Conclusions

In this study, efforts were made to develop ANN models that can be employed for estimating the settlement, Δh, of one-way footings without the need for manual work such as using tables or charts. To achieve this, a computer program was developed in the Matlab programming environment to calculate the Δh value of one-way footings from five traditional settlement prediction methods, namely Meyerhof [18], Terzaghi and Peck [19], Parry [20], Peck et al. [21], and Burland and Burbidge [22]. The footing geometry (length, L, and width, B), the footing embedment depth, D_f, the bulk unit weight, γ, of the cohesionless soil, the footing applied pressure, Q, and the corrected standard penetration test value, N_cor, were varied during the settlement analyses, and the Δh value of each one-way footing was calculated for each method using the written program. From the results, the Terzaghi and Peck [19] method generally yielded the highest Δh values; the Parry [20] and Burland and Burbidge [22] methods yielded lower Δh values; and the Meyerhof [18] and Peck et al. [21] methods generally yielded similar Δh values, lower than those predicted by Terzaghi and Peck [19] and higher than those predicted by the Parry [20] and Burland and Burbidge [22] methods.

Then, an ANN model was developed for each traditional method to predict the Δh value of one-way footings by using the results of the settlement analyses. For each method, the Δh values predicted from the ANN model were compared with those calculated from the traditional method to examine the prediction capacity of the models developed. The results demonstrated that the Δh values predicted from each ANN model are in good agreement with the calculated Δh values.

To check the prediction performance of the ANN models developed, several performance indices, namely R², VAF, MAE, and RMSE, were calculated. Each ANN model has shown high prediction performance based on the performance indices. In addition, about 95, 97, 95, 91, and 96 % of the settlements predicted from the ANN models developed for the Meyerhof [18], Terzaghi and Peck [19], Parry [20], Peck et al. [21], and Burland and Burbidge [22] methods, respectively, fall within ±2 of the SPE, indicating a very good estimate of the settlement of one-way strip footings. Therefore, the ANN models developed in this study can be employed for estimating the settlement, Δh, of one-way footings without the need for manual work such as using tables or charts.

Sensitivity analyses were also carried out on the trained network for each traditional method to identify which of the input parameters has the most significant influence on settlement predictions. The results of the sensitivity analysis demonstrated that N_cor is the most important parameter while γ is the least important parameter for each method.