Introduction

In the rapidly developing urban landscape of Ho Chi Minh City, the construction industry frequently faces complex geotechnical challenges. Accurate estimation of the load-bearing capacity of piles is crucial for the structural safety and economic efficiency of high-rise construction. Traditionally, this capacity has been determined through experimental or empirical methods. Common field techniques include the static pile load (SPL) test and the pile driving analyzer (PDA) test, as discussed by Burland et al. [7] and Birid [6]. Numerous empirical formulas have also been developed to calculate load-bearing capacity from soil properties and pile geometry [15, 22, 25], from standard penetration tests (SPT) [4, 28, 29, 30], and from cone penetration tests (CPT) [9, 32]. However, both approaches have significant limitations: empirical methods offer limited accuracy, while experimental methods are costly and time-consuming, which can delay construction projects.

The advent of machine learning, and particularly artificial neural networks (ANNs), has introduced a new paradigm in geotechnical engineering. These techniques are known for their ability to handle the nonlinear relationships and complex interactions between variables typically found in geotechnical data. Recent studies have shown that machine learning contributes significantly to handling uncertainties in geotechnical engineering [16, 27]. ANNs are a particularly powerful subset of machine learning and have been widely applied in various aspects of geotechnical engineering [1, 8, 11, 13, 14, 17, 18, 19, 20, 21, 23, 24, 26]. For example, Ardalan et al. [2] demonstrated how polynomial neural networks, optimized with genetic algorithms, could effectively predict pile shaft capacity from CPT data, showcasing the potential for enhanced accuracy over traditional methods. Benbouras et al. [5] applied deep neural network models across multinational datasets to predict the bearing capacity of driven piles, confirming the superior capabilities of machine learning models compared to conventional empirical approaches. The findings from ANN applications have contributed to the development of new methods that further enhance performance in this field. With the rapid development of machine learning, the potential applications in geotechnical engineering have become more diverse and extensive [10, 12].

Additionally, the integration of ANNs with other optimization techniques has been explored to improve prediction accuracy. Armaghani et al. [3] developed a hybrid model that combined ANNs with particle swarm optimization (PSO) to estimate the ultimate bearing capacity of rock-socketed piles. This approach not only optimized the predictive accuracy but also highlighted the adaptability of ANNs to complex geological datasets. Similarly, Xun and Abdullah [33] employed the XGBoost algorithm to predict the geotechnical axial capacity of reinforced concrete driven piles with high precision, illustrating the effectiveness of ensemble learning methods in handling geotechnical engineering problems. In complex scenarios, ANNs and nature-inspired optimization techniques emerge as adept solutions, showcasing their prowess in attaining satisfactory levels of accuracy [31].

Despite these advancements, the application of machine learning techniques in the Vietnamese construction sector, particularly in adapting these models to the local construction environments of Ho Chi Minh City, remains limited. This study seeks to fill the research gap by developing an ANN model tailored to the local geotechnical conditions and construction practices. Utilizing a dataset of 876 pile samples from various construction sites, this model incorporates data on pile driving force, depth, and measured load capacities.

This paper presents the development and validation of this novel ANN model and compares its performance against traditional methods using a comprehensive analysis of static load test results and field data. By doing so, it assesses the model’s accuracy and reliability under real-world conditions, aiming to provide a predictive tool that can significantly enhance the design and safety of pile foundations in urban construction projects.

Methods

This study was methodically designed to develop an artificial neural network (ANN) model that estimates the ultimate load-bearing capacity of driven piles (including both frictional resistance and end-bearing capacity) using a comprehensive dataset collected from various construction projects in Ho Chi Minh City. It is worth noting that the soil types studied primarily consist of cohesive clay in soft to hard plastic states, interspersed with thin layers of sand in medium to dense states. The methodology followed a structured sequence of steps, ensuring a systematic progression from data collection to model validation.

The initial phase of the study involved collecting data from static load tests conducted on 16 precast high-strength concrete (PHC) piles across three major construction sites: DQH, C1–C2, and center mall. These tests included eleven piles with a diameter of 600 mm (D600) and five piles with a diameter of 500 mm (D500), offering a comprehensive range of load capacities at various depths along with corresponding driving forces. In addition to data from static load tests, the load capacities were also estimated using empirical formulas that took into account the physical and mechanical properties of the soil, as well as results from field tests such as the standard penetration test (SPT). These empirically calculated capacities were then cross-verified with the actual test results to validate the accuracy of the empirical methods and refine the dataset for input into the ANN.

Once the data were collected, they were integrated into a unified dataset of 876 samples that included key parameters such as pile driving force (P), depth (L), and load-bearing capacity (Rcu). The correlation among these parameters is shown in Fig. 1. All variables are positively correlated with each other. L and P correlate with Rcu with coefficients of 0.78 and 0.77, respectively, whereas L and P correlate more weakly with each other, with a coefficient of 0.58.

Fig. 1 Correlation between variables
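As an illustration of how a correlation matrix such as the one in Fig. 1 can be obtained, the following minimal sketch computes pairwise correlation coefficients in plain Python. It assumes that the reported coefficients are Pearson coefficients, which the text does not state explicitly. The function name `pearson` and the three short lists are hypothetical placeholders and are not taken from the study's dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy values, NOT taken from the study's dataset.
depth = [5.0, 10.0, 15.0, 20.0, 25.0]            # L
force = [900.0, 1500.0, 1400.0, 2600.0, 2500.0]  # P
capacity = [800.0, 1900.0, 2400.0, 4100.0, 4600.0]  # Rcu

names = ["L", "P", "Rcu"]
columns = [depth, force, capacity]

# Symmetric matrix of pairwise coefficients, with ones on the diagonal.
matrix = [[pearson(a, b) for b in columns] for a in columns]
for name, row in zip(names, matrix):
    print(name, [round(r, 2) for r in row])
```

A coefficient close to 1 indicates a strong positive linear relationship; the matrix is symmetric, so only the upper or lower triangle needs to be inspected.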

In this study, P and L are the inputs and Rcu is the output of the ANN model. The distribution of Rcu values is shown in Fig. 2; the values span a wide range (from 0 to 8200). The dataset was normalized to stabilize training and speed up convergence. Specifically, MinMaxScaler is used to normalize the L and P values to the range [−1, 1], while the Rcu values are simply divided by 1000 to narrow their range. For training purposes, the dataset is divided into a training set, a validation set, and a testing set containing 64%, 16%, and 20% of the samples, respectively. The training set is used to fit the model, which learns the relationships between inputs and outputs. The validation set is used to tune the hyperparameters of the model and to detect underfitting and overfitting during training. The testing set is used to evaluate the model’s generalization capability on unseen data.

Fig. 2 Distribution of Rcu variable
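The preprocessing described above can be sketched as follows. This is a minimal illustration in plain Python rather than the authors' code: `minmax_scale` mimics the linear mapping that a min-max scaler applies for a target range of [−1, 1], and `split_indices` is a hypothetical helper that produces a 64%/16%/20% split by first holding out 20% for testing and then 20% of the remainder for validation. The two-stage procedure, the rounding of subset sizes, and the random seed are assumptions; the text only gives the final proportions.

```python
import random

def minmax_scale(values, lo=-1.0, hi=1.0):
    """Linearly map values so that min(values) -> lo and max(values) -> hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot scale a constant column")
    span = v_max - v_min
    return [lo + (hi - lo) * (v - v_min) / span for v in values]

def scale_target(rcu_values, divisor=1000.0):
    """Narrow the range of Rcu by a fixed division, as described in the text."""
    return [r / divisor for r in rcu_values]

def split_indices(n, test_frac=0.20, val_frac_of_rest=0.20, seed=0):
    """Shuffle 0..n-1 and split into train/validation/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    n_val = round((n - n_test) * val_frac_of_rest)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test

print(minmax_scale([0.0, 5.0, 10.0]))   # [-1.0, 0.0, 1.0]
print(scale_target([0.0, 8200.0]))      # [0.0, 8.2]

train_idx, val_idx, test_idx = split_indices(876)
print(len(train_idx), len(val_idx), len(test_idx))
```

One design choice worth noting: computing the minimum and maximum on the training subset only, and reusing them for the validation and testing subsets, avoids leaking information from held-out data into training. The text does not say which subset the scaler was fitted on.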

The ANN model is a multilayer perceptron composed of fully connected (dense) layers, as shown in Fig. 3. The input layer receives the driving force and depth, which are processed by two hidden layers with 12 and 8 neurons, respectively. The ReLU activation function is used in both hidden layers to capture the nonlinear relationships in geotechnical data. The output layer uses a linear activation function, suitable for predicting the continuous value of the pile’s load-bearing capacity. The model is trained with the Adam optimizer, with an initial learning rate of 0.005, and the mean squared error (MSE) as the objective function. Training runs for 500 epochs with a batch size of 128. The proposed ANN model is summarized in Table 1; it has 149 trainable parameters.

Fig. 3 Architecture of the proposed artificial neural network

Table 1 Summary of the proposed ANN model’s architecture
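The parameter count of the architecture can be checked with a short calculation: each dense layer contributes (inputs × outputs) weights plus one bias per output. The sketch below, in plain Python, reproduces this count for the 2-12-8-1 layout and also illustrates the forward pass (ReLU in the hidden layers, linear output). The weights are random placeholders, so the printed prediction is meaningless; the helper names `count_parameters`, `init_layers`, and `forward` are hypothetical and do not come from the original implementation.

```python
import random

# Inputs (P, L) -> 12 hidden -> 8 hidden -> 1 output (scaled Rcu).
LAYER_SIZES = [2, 12, 8, 1]

def count_parameters(sizes):
    """Weights plus biases for each fully connected layer."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

def init_layers(sizes, seed=0):
    """Random placeholder weights; a trained model would supply real ones."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """ReLU on hidden layers, linear (identity) activation on the output."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        x = z if is_output else [max(0.0, v) for v in z]
    return x[0]

# (2*12 + 12) + (12*8 + 8) + (8*1 + 1) = 36 + 104 + 9
print(count_parameters(LAYER_SIZES))

layers = init_layers(LAYER_SIZES)
print(forward(layers, [0.2, -0.4]))  # untrained output, for illustration only
```

The total of 36 + 104 + 9 = 149 agrees with the number of trainable parameters stated in the text.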

Following the training phase, the model’s performance was evaluated using the test dataset. Common performance metrics in regression tasks, namely mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and R-squared (R2), were calculated using Eqs. (1) to (4) to assess its accuracy and reliability. These metrics indicate how well the model predicts the target variable.

  • MSE is the average of the squared differences between the actual and predicted values. It penalizes larger errors more than smaller ones because the errors are squared, making it sensitive to outliers. It is calculated as follows:

    $$ {\text{MSE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {y_{i} - \widehat{{y_{i} }}} \right)}^{2} $$
    (1)
  • RMSE is the square root of the average of the squared differences between the actual and predicted values. It is similar to MSE but provides an error metric in the same units as the target variable, which can be easier to interpret. Like MSE, it is also sensitive to outliers. It is calculated as follows:

    $$ {\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {y_{i} - \widehat{{y_{i} }}} \right)}^{2} } $$
    (2)
  • MAE is the average of the absolute differences between the actual and predicted values. It provides a linear score, meaning all individual differences are weighted equally in the average. It is calculated as follows:

    $$ {\text{MAE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {y_{i} - \widehat{{y_{i} }}} \right|} $$
    (3)
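  • R2 is a statistical measure that explains how much of the variance in the dependent variable is predictable from the independent variables. It equals 1 for a perfect fit and 0 for a model that always predicts the mean of the actual values, and it can become negative for a model that performs worse than that. It is calculated as follows: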

    $$ R^{2} = 1 - \frac{{\sum\limits_{i = 1}^{n} {\left( {y_{i} - \widehat{{y_{i} }}} \right)^{2} } }}{{\sum\limits_{i = 1}^{n} {\left( {y_{i} - \overline{y}} \right)^{2} } }} $$
    (4)

    where n is the number of data points; \({y}_{i}\) is the actual value of the ith sample; \(\widehat{{y}_{i}}\) is the predicted value of the ith sample; and \(\overline{y }\) is the mean of the actual values. Additionally, the predictions of the ANN model were compared to the results from static load tests to evaluate its effectiveness relative to traditional experimental methods.
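Equations (1) to (4) translate directly into code. The following minimal sketch implements them in plain Python; the function name `regression_metrics` and the four-sample example are hypothetical and serve only to show the arithmetic.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 as defined in Eqs. (1) to (4)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # sum of squared residuals
    mse = ss_res / n                               # Eq. (1)
    rmse = math.sqrt(mse)                          # Eq. (2)
    mae = sum(abs(e) for e in errors) / n          # Eq. (3)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R2 is undefined when all actual values are equal")
    r2 = 1.0 - ss_res / ss_tot                     # Eq. (4)
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

# Hypothetical toy values, not taken from the study.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

For this toy example the residuals are −0.5, 0, 0.5, and 0, giving MSE = 0.125, MAE = 0.25, and R2 = 1 − 0.5/5 = 0.9.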

Results and Discussion

The artificial neural network (ANN) model for estimating the ultimate load-bearing capacity of driven piles in Ho Chi Minh City was assessed through the performance metrics defined above and through a comparison with static load test results, as reported in the following figures and tables.

Evaluate Model Performance

Figures 4 and 5 present the model’s learning progression and its predictive performance. Figure 4 shows the loss curves on the training and validation sets during training. The loss values drop sharply in the first epochs and then level off near zero for both sets. Neither underfitting nor overfitting is apparent, indicating an effective learning process and a stable model. Figure 5 presents a scatter plot comparing the predicted and actual load capacities for piles in the testing set. The closer the blue dots are to the red line, the closer the model’s prediction is to the actual value. The blue points cluster around the red line, which indicates close agreement between the model predictions and the actual load-bearing capacities.

Fig. 4 Loss values under training and validation subsets

Fig. 5 Comparison of predicted values and actual values

The ANN model achieved high accuracy on all three subsets. Because Rcu was divided by 1000 before training, the error metrics below are expressed in these scaled units. On the training set, the model achieved an MAE of 0.2926, an MSE of 0.1845, and an RMSE of 0.4295. On the validation set, it achieved an MAE of 0.3048, an MSE of 0.2016, and an RMSE of 0.4490. The testing set exhibited slightly higher errors, with an MAE of 0.3230, an MSE of 0.2396, and an RMSE of 0.4895. The R-squared values were high, at 0.9469 for the training set, 0.9389 for the validation set, and 0.9180 for the test set, indicating that the model explains most of the variance in pile load capacities. These metrics are summarized in Table 2.

Table 2 Performance statistics of the proposed ANN model
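To relate the reported errors to the original Rcu scale, the metrics can be converted back by undoing the division by 1000. The sketch below assumes that the reported values were computed on the scaled target, which is consistent with their magnitude but is not stated explicitly. MAE and RMSE scale linearly with the target, MSE scales with its square, and R2 is unaffected by the scaling. The function name `to_original_units` is hypothetical, and the unit of Rcu is left unspecified because it is not given in this section.

```python
import math

SCALE = 1000.0  # Rcu was divided by 1000 before training

# Test-set metrics as reported in the text (assumed to be in scaled units).
reported_test = {"MAE": 0.3230, "MSE": 0.2396, "RMSE": 0.4895}

def to_original_units(metrics, scale=SCALE):
    """MAE and RMSE scale linearly with the target; MSE scales quadratically."""
    return {
        "MAE": metrics["MAE"] * scale,
        "MSE": metrics["MSE"] * scale ** 2,
        "RMSE": metrics["RMSE"] * scale,
    }

original = to_original_units(reported_test)
print(original)

# Sanity check from Eq. (2): RMSE should equal sqrt(MSE) up to rounding
# of the reported four decimal places.
consistent = abs(math.sqrt(reported_test["MSE"]) - reported_test["RMSE"]) < 1e-3
print(consistent)
```

Under this assumption, the test-set MAE of 0.3230 corresponds to roughly 323 in the original Rcu units and the RMSE of 0.4895 to roughly 490, against a reported Rcu range of 0 to 8200.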

Comparison with Static Load Test Results

The model’s predictions closely matched the actual load-bearing capacities obtained from static load tests, validating its effectiveness as a predictive tool. This close correspondence is detailed in Table 3, where predicted load capacities from the ANN model are compared side by side with the results from static load testing for each pile. The data show minimal discrepancies, underscoring the model’s accuracy in real-world applications.

Table 3 Comparison of model’s predictions and static load test results

Further evaluations were conducted to test the model’s robustness and its ability to generalize to new datasets. These evaluations confirmed that the ANN maintained high accuracy levels across different pile types and construction conditions, indicating its robustness. The model’s performance did not significantly vary with changes in pile diameters or specific site conditions, suggesting strong generalization capabilities across various geotechnical settings.

The findings from this study suggest substantial practical implications for urban construction projects. The ability to accurately predict pile load capacities can lead to more efficient project planning, cost reductions, and enhanced safety. These benefits are particularly pertinent in geotechnically diverse and challenging environments like those found in Ho Chi Minh City.

Conclusion

The research presented demonstrates the effective application of an artificial neural network (ANN) model to estimate the load-bearing capacity of driven piles in urban construction sites in Ho Chi Minh City. The proposed ANN model achieved high accuracy, with performance metrics such as mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and R-squared (R2) reaching values of 0.3230, 0.2396, 0.4895, and 0.9180, respectively, on the testing set, indicating strong predictive capability. The model’s performance was validated against static load test results, confirming its reliability. Additionally, the model displayed robust generalization capabilities across different pile diameters and construction sites. These findings suggest that machine learning techniques, specifically ANNs, can significantly improve the predictive accuracy and efficiency of pile foundation design in geotechnical engineering. The study underscores the potential for broader applications of ANN models in urban construction projects, leading to more efficient project planning, cost reductions, and enhanced safety.

One limitation of this study is that the dataset used is limited (only 876 samples), which does not fully utilize the power of the ANN model. In addition, fine-tuning the hyperparameters of the ANN model, such as model architecture (number of layers, number of neurons for each layer), activation function, loss function, optimizer, number of epochs, and batch size, was not performed in this study. Fine-tuning these parameters is crucial for improving the model’s performance. To address these limitations, future work will involve collecting more data and using Bayesian optimization to search for optimal hyperparameters for the model.