Introduction

Owing to its special qualities and prospective advantages, PET fiber-reinforced concrete (PFRC) has attracted considerable interest in the field of civil engineering. In PFRC, concrete mixtures are infused with PET fibers, which enhances the mechanical and durability characteristics of the resulting composite material (Marthong & Marthong, 2016). Enhancing the tensile strength and toughness of concrete is one of the main benefits of PET fiber reinforcement. Concrete, although strong in compression, is inherently weak in tension (Weckert et al., 2011). The inclusion of PET fibers helps to bridge micro-cracks that occur during the early stages of loading, effectively distributing the stress and preventing crack propagation (Benkharbeche et al., 2021). This property makes PFRC particularly suitable for applications where enhanced crack resistance is desired, such as pavements, industrial floors, and precast elements. Another significant benefit of PFRC is its impact on the durability and service life of concrete structures. PET fibers act as a barrier against the ingress of water, chloride ions, and other harmful substances that can corrode reinforcing steel or deteriorate the concrete matrix (Naidu Gopu & Joseph, 2022). By mitigating the potential damage caused by chemical attack, PFRC offers improved resistance to environmental factors and can extend the lifespan of structures, reducing maintenance and repair costs. PFRC also contributes to sustainable construction practices by utilizing recycled PET materials. The incorporation of PET fibers in concrete provides an environmentally friendly solution to the growing issue of plastic waste (Rao et al., 2022, 2023). By diverting PET bottles and other plastic waste from landfills and repurposing them into construction materials, PFRC offers a valuable avenue for recycling and waste reduction. The procedure to develop PET fibers of various aspect ratios (AR) is illustrated in Fig. 1.

Fig. 1
figure 1

Generation of PET fiber (Meza et al., 2021)

Strength is an important characteristic of any cementitious material (Parhi et al., 2023; Pradhan et al., 2022c). Higher strength is generally an indicator of good durability (Pradhan et al., 2022a, 2022b). The prediction of compressive strength (CS) plays a vital role in the evaluation and optimization of construction materials, in particular PET fiber-reinforced concrete. PET fiber-reinforced concrete is gaining increasing attention in the construction industry due to its improved mechanical properties and environmental benefits. Making informed judgments throughout the planning, design, construction, and management of infrastructure projects requires accurate prediction of the CS of PFRC (Nafees et al., 2023). Machine learning algorithms have proven their potential in accurately predicting material properties, including compressive strength, by learning from historical data and identifying complex relationships (Kaveh & Khalegi, 1998; Kaveh & Khavaninzadeh, 2023; Parhi & Patro, 2023; Singh et al., 2023; Parhi & Panigrahi, 2023). Developing reliable prediction models empowers engineers to optimize mixture proportions, select suitable reinforcement strategies, and meet strength specifications. This optimization enhances the structural performance, durability, cost-effectiveness, and sustainability of construction projects.

To predict the CS of PFRC, three decision tree-based machine learning models, namely Decision Tree, Random Forest, and Gradient Boosting Machine regressors, were utilized in this study. The hyperparameters of all the models were optimized using the metaheuristic Dolphin echolocation optimization technique. SHAP and Sobol sensitivity analyses were used to assess the feature sensitivity toward compressive strength. All the models were developed in Google Colab’s Python interface.

Research significance

This study implements three decision tree-based machine learning regressors, i.e., Decision Tree, Random Forest, and Gradient Boosting Machine, to predict the CS of PFRC. Decision trees offer easy interpretability as their decision-making process can be visualized in a tree-like structure with clear if–else rules. They can handle both numerical and categorical data without requiring extensive normalization or scaling. Moreover, decision trees are less sensitive to outliers in the data and provide faster training and prediction, especially for smaller data sets. They also make minimal assumptions about the data distribution, making them suitable for non-linear scenarios. Random forests, on the other hand, construct multiple decision trees and combine their predictions, resulting in improved accuracy and reduced overfitting compared to a single decision tree. They effectively mitigate the impact of noisy data and offer valuable insights into feature importance, aiding feature selection in the data set. In addition, the training of individual decision trees in a random forest can be easily parallelized. Gradient Boosting Machines (GBMs) achieve high predictive accuracy by combining the predictions of multiple weak learners, typically decision trees. They handle missing data effectively without extensive imputation and can accommodate various data types, including numerical and categorical variables. GBMs also work well with a variety of loss functions, making them suitable for regression tasks. These are some of the advantages of these ML methods.

These methods also have some limitations. Decision trees, for instance, tend to overfit, particularly when they become deep and complex, resulting in poor generalization and performance on unseen data. In addition, the discrete splits used by decision trees may not effectively capture the continuous nature of certain data sets or variables. Moreover, the model’s instability can be attributed to the production of different trees with slight variations in the training data. Random forests, consisting of multiple decision trees, can pose challenges in terms of interpretation and understanding due to their ensemble nature. They also require more computational resources and time compared to individual decision trees, especially when handling large data sets or numerous trees. Moreover, the storage of multiple decision trees leads to higher memory consumption. Gradient boosting machines (GBMs) can be sensitive to noisy data as the boosting process attempts to fit the noisy samples during training, thereby increasing the risk of overfitting. GBMs with a large number of iterations or deep trees can be computationally expensive and demand more memory. Furthermore, proper tuning of GBMs, including the learning rate, number of iterations, and tree depth, can be a non-trivial task, necessitating careful parameter optimization.

To overcome these limitations, a novel metaheuristic Dolphin echolocation optimization method was used for hyperparameter optimization. Its adaptability, efficient exploration, and robustness increase the prediction accuracy of the models. A database was developed from published literature and pre-processed. Tenfold cross-validation was used for training and testing to obtain the best-performing model. Two sensitivity analysis methods, SHAP and Sobol, were used in this study.

Machine learning algorithms

Decision tree (DT)

A Decision Tree Regressor is a machine learning algorithm used for regression tasks, aiming to predict continuous numerical values. This algorithm utilizes a hierarchical structure of decision nodes and leaf nodes to make predictions based on the input features (Gu et al., 2021). The Decision Tree Regressor functions by recursively splitting the data based on the values of the input features (de Ville, 2013). It identifies the most informative feature at each decision node by maximizing the reduction in variance or another suitable metric. The goal is to split the data into subsets that are as homogeneous as possible in terms of the target variable. Each leaf node in the decision tree represents a prediction value for the target variable. When new data points are encountered, they traverse the tree from the root node to a specific leaf node based on the feature values, and the prediction value associated with that leaf node is assigned as the output. One of the key advantages of a Decision Tree Regressor is its interpretability: the resulting tree structure provides insight into the relationships between the features and the target variable, helping to explain the decision-making process. In addition, decision trees can handle both numerical and categorical features and can handle missing values without requiring extensive pre-processing. The Decision Tree Regressor is therefore a versatile algorithm for regression tasks; its hierarchical structure, interpretability, and ability to handle various feature types make it a popular choice in the field of machine learning. Figure 2 represents the flow chart of the decision tree regressor algorithm.

Fig. 2
figure 2

Flow chart of decision tree algorithm (Gajowniczek & Ząbkowski, 2021)
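As an illustration of how such a regressor can be set up in the Python environment used in this study, the sketch below fits a decision tree to a PFRC-style data set with scikit-learn. The file name, column names, and hyperparameter values are illustrative assumptions only; the actual hyperparameters in this study were obtained by DEO tuning.

```python
# Minimal sketch of a decision tree regressor for PFRC compressive strength.
# File name, column names, and hyperparameters are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

df = pd.read_csv("pfrc_dataset.csv")               # hypothetical data file
X = df[["Binder", "W/B", "FA", "CA", "FVF"]]       # input features used in this study
y = df["CS"]                                       # compressive strength (target)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Placeholder hyperparameters; in the study these are tuned with DEO.
dt = DecisionTreeRegressor(max_depth=6, min_samples_leaf=3, random_state=42)
dt.fit(X_train, y_train)
print("Held-out R^2:", dt.score(X_test, y_test))
```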

Random forest (RF)

The Random Forest Regressor is a potent and popular machine learning method that excels at prediction tasks involving continuous numerical variables (Chen & Ishwaran, 2012). It is an ensemble learning technique in which multiple decision trees are combined to produce a reliable and precise predictive model (Sagi & Rokach, 2018). In contrast to a single decision tree, which can be vulnerable to overfitting and instability, a random forest integrates the concepts of bagging and random feature selection to help alleviate these problems. The technique generates a variety of models by selecting random subsets of the training data and features for each tree. Each decision tree in the forest autonomously learns from the selected features and data, producing a set of distinct but individually imperfect forecasts. During prediction, the random forest aggregates the predictions from all the decision trees by averaging (for regression tasks) or voting (for classification tasks). This ensemble approach helps to reduce the variance and bias of the final prediction, resulting in improved generalization and robustness. The strength of the random forest regressor lies in its ability to handle complex relationships and interactions among variables. It can capture nonlinearities, handle high-dimensional data, and effectively deal with missing values and outliers. Moreover, random forests are less sensitive to the choice of hyperparameters compared to other machine learning algorithms, making them relatively easy to use and tune. Random forests also provide valuable insights into feature importance, which aids in feature selection, dimensionality reduction, and understanding the underlying factors influencing the outcome. Due to their robustness, accuracy, and interpretability, random forest regressors have found applications in various domains. Figure 3 depicts the flow chart of the random forest regression algorithm.

Fig. 3
figure 3

Flow chart of random forest model (Singh et al., 2021)
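A comparable sketch for the random forest, reusing the hypothetical split from the decision tree example, is shown below; the ensemble size and other settings are placeholders rather than the DEO-tuned values.

```python
from sklearn.ensemble import RandomForestRegressor

# Placeholder settings; the study tunes these with DEO.
rf = RandomForestRegressor(n_estimators=200, max_depth=None,
                           n_jobs=-1, random_state=42)
rf.fit(X_train, y_train)
print("Held-out R^2:", rf.score(X_test, y_test))

# Impurity-based feature importances, one source of the feature-importance
# insight mentioned above.
for name, importance in zip(X_train.columns, rf.feature_importances_):
    print(f"{name}: {importance:.3f}")
```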

Gradient boosting machine (GBM)

The Gradient Boosting Machine Regressor, a powerful algorithm in machine learning, is widely used for regression tasks due to its exceptional predictive capabilities (Zhou et al., 2021). It is a member of the ensemble learning family, which brings together several weak learners to produce a reliable and precise predictive model. The Gradient Boosting Machine Regressor works in a sequential manner, where weak learners, typically decision trees, are trained in an additive fashion (Badirli et al., 2020). Each subsequent tree focuses on reducing the errors made by the previous trees. By iteratively fitting new trees to the residuals of the previous trees, the algorithm gradually improves its predictive performance. One of the key strengths of the Gradient Boosting Machine Regressor lies in its ability to handle complex relationships and non-linear interactions within the data. It automatically captures intricate patterns and nonlinearities by leveraging the hierarchical structure of decision trees. This makes it highly effective in capturing both local and global dependencies in the data, leading to accurate predictions. The algorithm incorporates regularization techniques, such as shrinkage and tree pruning, to prevent overfitting and improve generalization. Shrinkage reduces the impact of each tree, allowing for a more conservative and robust model. Tree pruning, on the other hand, removes unnecessary branches and nodes, simplifying the model and enhancing its interpretability. Figure 4 illustrates the flow chart of the gradient boosting method.

Fig. 4
figure 4

Flow chart of GBM model (Zhang et al., 2021)
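The GBM can be sketched in the same style; the learning rate (shrinkage), tree depth, and number of boosting stages below are illustrative defaults, not the tuned values reported later.

```python
from sklearn.ensemble import GradientBoostingRegressor

# learning_rate acts as the shrinkage term described above;
# values shown are placeholders, not the DEO-tuned configuration.
gbm = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                max_depth=3, subsample=0.8, random_state=42)
gbm.fit(X_train, y_train)
print("Held-out R^2:", gbm.score(X_test, y_test))
```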

Metaheuristic optimization

Dolphin echolocation optimization (DEO)

Dolphin Echolocation Optimization (DEO) is an innovative optimization algorithm inspired by the remarkable echolocation abilities of dolphins (Kaveh, 2017a). This nature-inspired algorithm mimics the biological behavior of dolphins in using echolocation to locate objects and navigate their surroundings effectively (Kaveh & Farhoudi, 2016b). DEO has gained attention in the field of optimization due to its ability to solve complex problems and find near-optimal solutions (Kaveh, 2017b). The DEO algorithm employs a multi-solution approach, where multiple candidate solutions are generated simultaneously to explore the search space efficiently (Kaveh & Farhoudi, 2013). Similar to how dolphins emit sound waves and listen to the echoes to perceive their environment, the DEO algorithm uses a set of candidate solutions and evaluates their fitness based on predefined objectives (Kaveh et al., 2018). By iteratively refining these solutions through a combination of exploration and exploitation, DEO aims to converge toward optimal or near-optimal solutions. One key advantage of the DEO algorithm lies in its ability to balance exploration and exploitation effectively. Like dolphins that continuously adapt their echolocation strategies to changing environmental conditions, DEO dynamically adjusts its search process. This adaptability enables the algorithm to escape local optima and explore diverse regions of the search space, ultimately improving the quality of the solutions obtained. DEO exhibits robustness and versatility, allowing it to handle various types of optimization problems. Its ability to handle both single-objective and multi-objective optimization tasks makes it applicable to a wide range of real-world scenarios. Whether it is used in engineering design, financial modeling, or data analysis, DEO showcases its potential in finding optimal or near-optimal solutions efficiently. Its unique characteristics, including the balance between exploration and exploitation, adaptability, and versatility, make it a valuable tool for solving complex optimization problems across different domains. Figure 5 shows the echolocation of dolphins in nature.

Fig. 5
figure 5

Echolocation of Dolphin (Kaveh & Farhoudi, 2016a)
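For intuition only, the sketch below shows a highly simplified population-based search loop in the spirit of the behaviour described above (a set of candidate solutions, fitness evaluation, and a search radius that shrinks from exploration toward exploitation). It is not the published DEO formulation of Kaveh and Farhoudi; the loop structure and parameters are assumptions made purely for illustration.

```python
import numpy as np

def deo_like_search(objective, bounds, n_candidates=20, n_iters=50, seed=0):
    """Loose illustration of an echolocation-style search: evaluate a population
    of candidates and progressively tighten the search around the best one.
    This is NOT the published DEO algorithm."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T
    pop = rng.uniform(low, high, size=(n_candidates, len(bounds)))
    best = min(pop, key=objective).copy()
    for it in range(1, n_iters + 1):
        # Broad exploration early, tighter exploitation around the best later.
        radius = (1.0 - it / n_iters) * (high - low) * 0.5
        pop = np.clip(best + rng.uniform(-radius, radius, size=pop.shape), low, high)
        candidate = min(pop, key=objective)
        if objective(candidate) < objective(best):
            best = candidate.copy()
    return best

# Tiny usage example: minimise a quadratic with its optimum at (3, 3).
print(deo_like_search(lambda x: float(((x - 3.0) ** 2).sum()),
                      bounds=[(-10.0, 10.0), (-10.0, 10.0)]))
```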

Hyperparameter optimization

Hyperparameter optimization is a critical step in developing accurate regression models that can effectively capture and model the underlying relationships within a data set (Feurer & Hutter, 2019). Regression models rely on various hyperparameters, which are adjustable settings that determine the model's behavior and performance. Optimizing these hyperparameters is essential to improve the model’s predictive capabilities and ensure optimal performance. The process of hyperparameter optimization involves systematically searching through different combinations of hyperparameter values to identify the configuration that yields the best results. The objective is to identify the set of hyperparameters that maximizes the performance metric of the model or minimizes its error. By carefully tuning the hyperparameters of regression models, researchers and practitioners can unlock their full potential and improve the model’s ability to accurately predict outcomes. This optimization process allows for better generalization and increased model robustness, and ultimately enhances the model's performance in real-world applications.
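In practice, hyperparameter optimization reduces to minimising an objective that maps a candidate hyperparameter vector to a cross-validated error. A sketch of such an objective for the random forest is given below (the hyperparameters chosen and their bounds are illustrative assumptions); a metaheuristic such as DEO, or the simplified loop sketched earlier, can then search over it.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def rf_objective(params, X, y):
    """Map a candidate hyperparameter vector to a loss (negative mean CV R^2).
    Only two hyperparameters are shown here for brevity."""
    n_estimators = int(round(params[0]))
    max_depth = int(round(params[1]))
    model = RandomForestRegressor(n_estimators=n_estimators, max_depth=max_depth,
                                  random_state=42, n_jobs=-1)
    return -cross_val_score(model, X, y, cv=10, scoring="r2").mean()

# Example search space (illustrative bounds): n_estimators in [50, 500],
# max_depth in [2, 20]; the metaheuristic minimises rf_objective over it.
bounds = [(50, 500), (2, 20)]
```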

In this study, the hyperparameters of all the decision tree-based algorithms were optimized using the DEO algorithm. The pseudo-code of the optimized regression models is shown in Tables 1, 2, and 3.

Table 1 Hyperparameter optimization of DT
Table 2 Hyperparameter optimization of RF
Table 3 Hyperparameter optimization of GBM

Data set preparation

A robust and well-organized database is of utmost importance in the field of machine learning. Machine learning models heavily rely on data to learn patterns, make accurate predictions, and provide meaningful insights (Asteris et al., 2021). A good database serves as the foundation for successful machine learning models. The accuracy and usefulness of the data have a direct bearing on how well these models work. A good database facilitates efficient data preprocessing and feature engineering. Data preprocessing involves tasks such as cleaning, normalization, and handling missing values, while feature engineering involves transforming raw data into meaningful features that capture relevant information (Ahmed & Iqbal, 2023). A well-structured database simplifies these processes, enabling researchers and practitioners to prepare the data effectively and extract valuable insights.

To predict the CS of PFRC, a database was constructed from published literature (Adnan & Dawood, 2020; Foti, 2011, 2013; Fraternali et al., 2011; Irwan et al., 2013; Kim et al., 2010; Marthong, 2015; Marthong & Sarma, 2016; Mohammed & Rahim, 2020; Mohammed Ali, 2021; Nibudey et al., 2013; Ochi et al., 2007; Pelisser et al., 2012; Rahmani et al., 2013). The database consisted of a total of 120 data points. The relatively small number of data points reflects the limited number of studies in the field of PET fiber-reinforced concrete. The data set contained binder content (mostly cement), water-to-binder ratio (W/B), fine aggregate (FA), coarse aggregate (CA), and fiber volume fraction (FVF) as input features, while the compressive strength was taken as the output. The statistics of the data set are shown in Table 4.

Table 4 Statistical details of the database

The Pearson correlation matrix, shown in Fig. 6, is a statistical tool that provides valuable insights into the relationships between variables in a data set. It measures the strength and direction of the linear association between pairs of variables. The matrix displays the correlation coefficients, which range from −1 to 1, where −1 represents a perfect negative correlation, 1 represents a perfect positive correlation, and 0 indicates no linear correlation. By examining the Pearson correlation matrix, researchers can identify the degree of association between different variables. A high positive correlation between two variables suggests that they tend to increase or decrease together, while a high negative correlation indicates an inverse relationship. On the other hand, a correlation close to zero suggests no linear relationship between the variables. Here, the correlations among the input features ranged from negligible to medium.

Fig. 6
figure 6

Correlation matrix
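A sketch of how such a matrix can be produced with pandas and seaborn is shown below; it assumes the hypothetical data frame `df` and column names from the earlier examples.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Pearson correlation among inputs and the target (column names are assumptions).
corr = df[["Binder", "W/B", "FA", "CA", "FVF", "CS"]].corr(method="pearson")
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Pearson correlation matrix")
plt.tight_layout()
plt.show()
```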

Figure 7 represents a density distribution plot, also known as a kernel density plot, which is a visual representation of the distribution of a continuous variable. It provides valuable insights into the shape, spread, and concentration of data points along the range of the variable. This type of plot is widely used in data analysis and statistics to understand the underlying distribution of a data set. The density distribution plot is constructed by estimating the probability density function (PDF) of the data. It smooths out the individual data points and presents a continuous curve that represents the overall distribution. The curve is derived by placing a kernel function, such as a Gaussian kernel, on each data point and summing up the contributions to create a smooth density estimate.

Fig. 7
figure 7

Density distribution plot
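The density curves described above can be reproduced with a Gaussian-kernel density estimate; the sketch below again assumes the hypothetical `df` and column names.

```python
import matplotlib.pyplot as plt
import seaborn as sns

fig, axes = plt.subplots(2, 3, figsize=(12, 6))
for ax, col in zip(axes.ravel(), ["Binder", "W/B", "FA", "CA", "FVF", "CS"]):
    sns.kdeplot(df[col], fill=True, ax=ax)   # Gaussian kernel by default
    ax.set_title(col)
plt.tight_layout()
plt.show()
```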

Metrics for model evaluation

Several criteria are frequently employed to judge the accuracy, precision, and generalizability of regression machine learning models. These metrics shed light on the model's capability to forecast continuous numerical values. R² measures the proportion of the target variable's variance that the regression model accounts for; it ranges from 0 to 1, with higher values indicating a better fit of the model to the data. The explained variance (EV) score measures the percentage of variance in the target variable that is explained by the model, with higher scores denoting a better fit (it is expressed as a percentage in Eq. (5)). The mean absolute error (MAE) measures the average absolute difference between the predicted and actual values. The mean absolute percentage error (MAPE) measures the average percentage difference between the predicted and actual values; it is particularly helpful for assessing the relative accuracy of the model's predictions when the target variable spans a wide range of values. The root mean squared error (RMSE), the square root of the mean squared error (MSE), measures the average magnitude of the prediction errors and is interpreted in the original scale of the target variable. The equations for the evaluation metrics are given below.

$${R}^{2}=1-\frac{\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left({y}_{i}-\overline{y}\right)}^{2}}$$
(1)
$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{y}_{i}-{\widehat{y}}_{i}\right|$$
(2)
$$\mathrm{MAPE}=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{{y}_{i}-{\widehat{y}}_{i}}{{y}_{i}}\right|\times 100$$
(3)
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}$$
(4)
$$\mathrm{EV}=\left(1-\frac{\mathrm{var}\left(y-\widehat{y}\right)}{\mathrm{var}\left(y\right)}\right)\times 100$$
(5)

where ${y}_{i}$ is the measured compressive strength, ${\widehat{y}}_{i}$ is the predicted value, $\overline{y}$ is the mean of the measured values, and $n$ is the number of samples.
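These metrics correspond to standard scikit-learn/NumPy computations; a small helper implementing Eqs. (1)–(5) is sketched below (MAPE and EV are rescaled to percentages to match the equations).

```python
import numpy as np
from sklearn.metrics import (r2_score, mean_absolute_error,
                             mean_absolute_percentage_error,
                             mean_squared_error, explained_variance_score)

def evaluate(y_true, y_pred):
    """Return the five evaluation metrics used in this study."""
    return {
        "R2": r2_score(y_true, y_pred),
        "MAE": mean_absolute_error(y_true, y_pred),
        "MAPE_%": mean_absolute_percentage_error(y_true, y_pred) * 100,
        "RMSE": np.sqrt(mean_squared_error(y_true, y_pred)),
        "EV_%": explained_variance_score(y_true, y_pred) * 100,
    }
```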

Results and discussion

Model training and hyperparameter optimization

For the training and testing of all the models, a tenfold cross-validation method was utilized. Tenfold cross-validation is a commonly used technique in machine learning for assessing the performance and generalization ability of a model. It involves dividing the data set into ten equal-sized subsets or “folds”. The model is trained and evaluated ten times, with each iteration using nine folds for training and one fold for validation. This process ensures that every sample in the data set is used for both training and validation exactly once. The results from each fold are then averaged to provide an overall assessment of the model's performance. Tenfold cross-validation helps to reduce the impact of data partitioning on model evaluation and provides a more robust estimation of the model's effectiveness on unseen data. It is a valuable tool for selecting models, tuning hyperparameters, and comparing different algorithms while avoiding overfitting and optimizing generalization.
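A sketch of tenfold cross-validation for one of the models, using the hypothetical `rf` estimator and data from the earlier examples, is given below.

```python
from sklearn.model_selection import KFold, cross_val_score

cv = KFold(n_splits=10, shuffle=True, random_state=42)   # ten equal-sized folds
scores = cross_val_score(rf, X, y, cv=cv, scoring="r2")  # each fold validated once
print(f"Mean CV R^2: {scores.mean():.3f} +/- {scores.std():.3f}")
```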

Hyperparameter optimization of all the models was done using DEO. By emulating the adaptive and exploratory nature of dolphin echolocation, DEO aims to enhance optimization processes in machine learning tasks. It incorporates mechanisms to dynamically adjust search strategies, detect relevant cues or patterns, and adapt to changing problem landscapes. Table 5 shows the optimum hyperparameters of the three ML models.

Table 5 Optimum hyperparameters of the models

Decision tree (DT)

A DEO (Dolphin Echolocation Optimization)-tuned decision tree regressor was employed to predict the CS of PFRC. The goal was to investigate the effectiveness of this approach in accurately estimating the compressive strength based on the selected input features. The data set used for this study consisted of a wide range of concrete mixtures with varying proportions of PET fibers. The results indicated that the model achieved a moderate level of accuracy. The optimized model attained an accuracy of 94% in training, while the accuracy fell to 83% in testing. While this drop in accuracy may initially seem concerning, it is not uncommon in machine learning models. It suggests that the optimized model might have become slightly overfitted to the training data, leading to a slight decrease in its generalization capabilities. The regression plot for training is shown in Fig. 8 and that for testing is shown in Fig. 9. Table 6 shows the details of the metric scores for both training and testing.

Fig. 8
figure 8

Regression plot in training: a DT, b RF, c GBM

Fig. 9
figure 9

Regression plot in testing: a DT, b RF, c GBM

Table 6 Model training and testing metric score

Random forest (RF)

The performance of the DEO (Dolphin Echolocation Optimization)-tuned random forest regressor in predicting the CS of PFRC was investigated. The accuracy of the random forest model is influenced by the number of decision trees and the way their outputs are aggregated, as the prediction of the random forest is the average of the predictions of its constituent trees. During the training phase, the random forest regressor demonstrated strong predictive capabilities, forecasting the compressive strength values with an accuracy of 97%, and the model also predicted with an accuracy of 91% in the testing phase. These high values indicate that the model captured a significant portion of the variance in both the training and testing data sets. To visually assess the model's performance, regression plots comparing the actual and predicted values were generated for the training and testing data sets, as shown in Figs. 8 and 9, respectively. These plots provide a visual representation of how closely the predicted values aligned with the actual values. Additional performance metrics of the model are presented in Table 6.

Gradient boosting machine (GBM)

The prediction accuracy of the DEO-tuned gradient boosting machine regressor was evaluated in predicting the CS of PFRC. In the training phase, the model showed a high accuracy of 96% (Fig. 8), but in the testing stage the accuracy dropped to 87% (Fig. 9). The high accuracy achieved during the training phase indicates that the DEO tuning process effectively optimized the GBM regressor to capture the underlying patterns and relationships between the input features and the compressive strength of the concrete samples. The model successfully learned from the training data, resulting in accurate predictions within that specific data set. The drop in accuracy observed during the testing stage suggests that the optimized GBM regressor might have slightly overfitted the training data. Overfitting occurs when a model becomes overly complicated and begins to memorize training instances rather than learning the underlying patterns; as a result, its performance on unseen data may suffer. It is important to note that, owing to the complexity of the material and the several contributing factors, predicting concrete strength with high precision is a difficult task. Table 6 presents additional details about the model metrics.

Comparison among models

To further compare and visualize all the optimized models in detail, six different plots were used: the error distribution plot, the Bland–Altman plot, the error line plot, the cumulative distribution function (CDF) plot, the quantile–quantile (Q–Q) plot, and the box plot. The models were compared in the testing phase.

The error distribution histogram plot (Fig. 10) was utilized to analyze the performance of all the optimized models in predicting the CS of PFRC. This plot provides valuable insights into the distribution and magnitude of errors made by the model. The histogram plot displayed the frequency of errors across different ranges or bins. The x-axis represented the error intervals, while the y-axis depicted the frequency or count of samples falling within each interval. Analyzing the error distribution histogram plot, it was observed that the RF and GBM errors exhibited a roughly symmetric distribution. This indicates that, on average, the DEO-tuned RF and GBM regressor achieved predictions that were relatively close to the actual compressive strength values of the PET fiber-reinforced concrete samples.

Fig. 10
figure 10

Error distribution plot in testing: a DT, b RF, c GBM

In this study, the Bland–Altman plot (Fig. 11) was utilized to assess the level of agreement between the predicted and actual compressive strength values, in the same way that it is commonly used to compare a measurement method against a reference standard. The Bland–Altman plot provides valuable insights into the bias and variability of the predictions, offering a visual representation of the agreement between the two sets of values. The deviation of the mean difference from zero in the DT and GBM models suggests a systematic difference between the predicted and actual values, indicating the presence of a consistent bias.

Fig. 11
figure 11

Bland–Altman plot in testing: a DT, b RF, c GBM
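A Bland–Altman plot of this kind can be constructed from the means and differences of the actual and predicted values; the sketch below uses the hypothetical `rf` model and test split from the earlier examples, with ±1.96 SD limits of agreement.

```python
import matplotlib.pyplot as plt

y_pred = rf.predict(X_test)
means = (y_test.values + y_pred) / 2.0     # x-axis: mean of actual and predicted
diffs = y_test.values - y_pred             # y-axis: actual minus predicted

bias = diffs.mean()
loa = 1.96 * diffs.std()                   # limits of agreement
plt.scatter(means, diffs, alpha=0.7)
plt.axhline(bias, color="k", label="mean difference (bias)")
plt.axhline(bias + loa, color="r", linestyle="--", label="±1.96 SD")
plt.axhline(bias - loa, color="r", linestyle="--")
plt.xlabel("Mean of actual and predicted CS (MPa)")
plt.ylabel("Actual − predicted CS (MPa)")
plt.legend()
plt.show()
```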

The line plot (Fig. 12) provided a clear and concise representation of how the compressive strength varied between the actual and predicted values for all the models. The trend was found to be more consistent for the RF model than for the other two models; in the RF model, a close association can be seen between the actual and predicted values.

Fig. 12
figure 12

Error line plot in testing: a DT, b RF, c GBM

The cumulative distribution function (CDF) plot (Fig. 13) was employed to analyze the distribution of the prediction errors in this study. This plot provides valuable insights into the probability distribution of the error and allows for a better understanding of its characteristics. The CDF plot revealed that the prediction errors follow a relatively normal distribution, as evidenced by the smooth and gradually increasing curve observed in the plot. The majority of the errors tend to cluster around zero for the RF model, indicating a central tendency in the error distribution.

Fig. 13
figure 13

CDF plot in testing: a DT, b RF, c GBM

The QQ plot of errors (Fig. 14) was used to assess the distributional assumptions and the goodness-of-fit of the regression model for predicting the CS of PFRC. The QQ plot provides a visual comparison between the observed errors and the expected errors under a theoretical distribution, typically a normal distribution in regression analysis. In this study, the QQ plot was generated by plotting the quantiles of the observed errors against the quantiles of the standard normal distribution. The goal was to evaluate whether the errors followed a normal distribution, which is a common assumption in regression models. Upon analyzing the QQ plot of errors, it was observed that the majority of the data points fell relatively close to the expected line for the RF model, indicating a reasonably good fit to a normal distribution. However, there were slight deviations in the tails of the QQ plot for the other two models, suggesting that the errors did not perfectly adhere to the normal distribution assumption.

Fig. 14
figure 14

QQ-plot in testing: a DT, b RF, c GBM
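Such a Q–Q plot can be generated by comparing the error quantiles against a standard normal distribution, for example with SciPy, as sketched below for the hypothetical RF errors from the earlier examples.

```python
import matplotlib.pyplot as plt
from scipy import stats

errors = y_test.values - rf.predict(X_test)
stats.probplot(errors, dist="norm", plot=plt)   # error quantiles vs. standard normal
plt.title("Q-Q plot of prediction errors (RF)")
plt.show()
```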

To gain further insights into the performance of the DEO-tuned models in predicting the CS of PFRC, a box plot of errors (Fig. 15) was constructed. This analysis aimed to provide a visual representation of the distribution and characteristics of the errors made by the models. The box plot of errors summarizes the distribution of errors, including measures such as the median, quartiles, and potential outliers. Upon analyzing the box plot of errors, several observations can be made. The median of the errors displays the central tendency and illustrates the typical size of a model's errors; the lower median values of RF and GBM suggest that these models tend to have smaller errors on average. The quartiles displayed in the box plot provide insights into the spread of the errors. The interquartile range (IQR), defined as the difference between the third quartile (Q3) and the first quartile (Q1), indicates the spread of the errors around the median. The narrower IQR of RF suggests a more consistent performance, whereas the wider IQRs of the other two models indicate greater variability in their errors. Overall, the RF model exhibited a relatively low median error, suggesting that, on average, its predictions were close to the actual compressive strength values.

Fig. 15
figure 15

Box-plot of error in testing: a DT, b RF, c GBM

From the above evaluation, it can be concluded that the DEO-tuned RF shows maximum accuracy, robustness, and generalizability in predicting CS of PFRC compared to the other two models.

Sensitivity analysis (SA)

SHAP sensitivity analysis

SHAP (Shapley Additive Explanations) sensitivity analysis is a powerful technique used to understand the contribution of different features in a machine learning model toward its predictions. This analysis provides insights into the importance and impact of individual features on the model's output. Unlike traditional feature importance methods that consider features in isolation, SHAP offers a unified framework based on cooperative game theory, specifically the concept of Shapley values (Lundberg & Lee, 2017). It takes into account the interaction and dependence between features when attributing importance to each feature. By applying SHAP sensitivity analysis, we gain a comprehensive understanding of how changes in input features affect the model's predictions, uncovering complex relationships and interactions that may not be apparent through simple feature importance ranking. For each instance in the data set, SHAP assigns each feature a distinct value indicating how much it contributed to the prediction: positive SHAP values represent a favorable influence on the prediction, while negative values represent a detrimental influence. The sum of the SHAP values for all features equals the difference between the model's output for that instance and the average model output. By interpreting SHAP values, the most important features can be identified and their effects on the model's predictions understood. This feature-level attribution enhances transparency and interpretability, helps in building trust in the model, aids in identifying potential biases or confounding factors, and supports informed decisions based on the model's insights.
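For tree ensembles such as the RF model used here, SHAP values can be computed exactly and summarised with the standard violin and bar plots; a sketch with the `shap` package, assuming the hypothetical `rf` and `X_test` from the earlier examples, is shown below.

```python
import shap

explainer = shap.TreeExplainer(rf)              # exact SHAP values for tree ensembles
shap_values = explainer.shap_values(X_test)

shap.summary_plot(shap_values, X_test)                   # violin-style summary (cf. Fig. 16)
shap.summary_plot(shap_values, X_test, plot_type="bar")  # mean |SHAP| per feature (cf. Fig. 17)
```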

Sobol sensitivity analysis

Sobol sensitivity analysis is a powerful and widely used method for quantifying the relative importance of input variables in a mathematical or computational model. It provides valuable insights into the behavior and interactions of variables, enabling researchers to understand the factors that significantly influence the output of the model. The Sobol technique is a variance-based method that decomposes the overall variance of the model's output into contributions from the individual input variables and their interactions (Saltelli et al., 1999). Sobol analysis determines sensitivity indices, such as the first-order and total-order indices, by methodically varying the values of each input variable while holding the others constant. The first-order sensitivity index captures only the direct impact of a single input variable on the output, whereas the total-order sensitivity index additionally accounts for the contributions from interactions with other variables. These sensitivity indices take values between 0 and 1, with values closer to 1 indicating a greater influence on the model output. Sobol sensitivity analysis offers several advantages. First, it provides a quantitative assessment of the importance of input variables, allowing researchers to prioritize their efforts in further understanding and refining the influential factors. Second, it enables the identification of non-linear and interaction effects, which may not be evident from simple correlation analysis. Third, it assists in reducing the dimensionality of the problem by identifying and eliminating non-influential variables, thus improving computational efficiency.
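A typical workflow computes Sobol indices on a trained surrogate model with the SALib package: sample the input space with a Saltelli scheme, predict with the surrogate, and decompose the output variance. The sketch below assumes the hypothetical `rf` model from the earlier examples and uses illustrative variable bounds, not the actual ranges of the study database.

```python
import pandas as pd
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 5,
    "names": ["Binder", "W/B", "FA", "CA", "FVF"],
    # Illustrative ranges only; replace with the bounds of the actual database.
    "bounds": [[250, 550], [0.3, 0.65], [500, 950], [800, 1300], [0.0, 3.0]],
}
param_values = saltelli.sample(problem, 1024)                  # Monte-Carlo sample of the inputs
X_sample = pd.DataFrame(param_values, columns=problem["names"])
y_sample = rf.predict(X_sample)                                # surrogate evaluations
Si = sobol.analyze(problem, y_sample)

print(dict(zip(problem["names"], Si["S1"])))   # first-order indices
print(dict(zip(problem["names"], Si["ST"])))   # total-order indices
```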

Sensitivity analysis result

The SHAP sensitivity analysis aimed to provide insights into the relative importance of the input features and their influence on the model's predictions. The analysis was performed on the optimized RF model, which was found to be the most accurate. The violin plot in Fig. 16 shows the impact of each feature on the model output, while the bar plot in Fig. 17 shows the average impact of each feature on the model output. The results of the sensitivity analysis revealed that the binder content had the most significant impact on the model's predictions of compressive strength, followed by FA and CA.

Fig. 16
figure 16

Violin plot of SHAP SA

Fig. 17
figure 17

Bar plot of SHAP SA

The total-order sensitivity indices were calculated using the Sobol method, as shown in Fig. 18. The total-order sensitivity takes into account both the direct and indirect effects, including interactions with other variables, and therefore provides more complete information. Based on the results, the FVF and W/B ratio were found to be the most influential factors in determining the compressive strength of PET fiber-reinforced concrete. These variables had high total-order sensitivity indices, indicating their significant contributions to the model's output.

Fig. 18
figure 18

Sobol total order SA

A noticeable deviation can be seen between the results of the two sensitivity analyses. SHAP identifies the binder content and aggregate contents as the most sensitive features, while the Sobol SA identifies the fiber volume fraction and W/B ratio as the most sensitive parameters. SHAP is a local SA method that depends on the trained model and the variation present in the database, whereas the Sobol method is a global SA method (Zhang et al., 2015). The Sobol method is a Monte-Carlo simulation-based approach that highlights the importance of considering the interactions between input variables; these interaction effects were captured by the total-order sensitivity indices. It was observed that the interaction between the water–binder ratio and the fiber volume fraction had a notable impact on the compressive strength, suggesting that the combined influence of these variables is greater than their individual contributions. Other input features, such as the binder content and aggregate content, were found to have relatively lower total-order sensitivity indices. The SHAP analysis provides more information about the behavior of the trained model, while the Sobol method can be regarded as giving a more accurate picture of the features that affect the CS of PFRC.

Conclusion

This study successfully implements DEO-tuned decision tree-based machine learning algorithms to predict the CS of PFRC. The following conclusions can be drawn from the research study.

  • Three decision tree-based machine learning models, namely Decision Tree, Random Forest, and Gradient Boosting Machine regressors, were applied for predicting the compressive strength of PET fiber-reinforced concrete.

  • The hyperparameters of all models were optimized using the Dolphin echolocation optimization technique, a metaheuristic algorithm known for its efficiency in solving complex optimization problems.

  • SHAP and Sobol sensitivity analysis were employed to evaluate the feature sensitivity concerning the CS of PFRC.

  • The Dolphin echolocation optimization technique effectively optimized the hyperparameters of the machine learning models, enhancing their predictive capabilities.

  • The optimized Random Forest model showed the highest accuracy in both the training and testing phase compared to other models.

  • The results obtained from the SHAP analysis provided insights into the individual feature contributions and their impact on the predicted compressive strength.

  • The Sobol sensitivity analysis helped quantify the relative importance of input variables and identified the key drivers influencing compressive strength.

  • Binder content, fiber volume fraction, and W/B ratio were found to be the most sensitive input features.

  • Further research and validation using independent data sets are recommended to confirm the applicability and generalizability of the proposed methodology.