Introduction

Carbon dioxide (CO2) emission is a primary environmental concern, with cement production contributing approximately 8 to 10 percent of total CO2 emissions (Suhendro, 2014); this process plays a substantial role in greenhouse gas emissions and global warming (Bildirici, 2019). Tackling climate change is currently of paramount importance worldwide. Concrete is highly valued in construction for its mechanical strength and cost-effectiveness (Andrew, 2019). However, the construction industry has one of the largest environmental footprints of any human activity. Integrating supplementary cementitious materials (SCMs) into concrete is a viable method for reducing CO2 emissions (Scrivener et al., 2018); utilizing SCMs in concrete is therefore an effective and environmentally responsible approach. Among these SCMs, fly ash (FA) is considered the predominant substitute for cement in concrete mixtures (Li et al., 2022).

FA is a pozzolanic material abundant in silica and alumina, recognized for its fine powder consistency, even finer than cement. It is a by-product of the coal combustion process. According to ASTM C618, FA is categorized into Class F and Class C based on its chemical composition. Researchers have extensively investigated the effect of FA on concrete performance, considering factors such as its type, chemical composition, quantity, and replacement level (Tkaczewska, 2021). Beyond its role in reducing carbon emissions, FA concrete offers several benefits. FA enhances concrete's flow, binding, and water-retention properties, improving workability and performance during placement (Nayak et al., 2022). Its inclusion also moderates heat release during concrete hydration, reducing the risk of temperature-induced cracking. Furthermore, through secondary hydration effects, FA increases compactness and improves the interfacial structure of concrete, enhancing impermeability and resistance to sulfate attack. Additionally, the prolonged pozzolanic reaction in FA concrete improves its durability compared to conventional cement concrete.

Achieving the desired compressive strength (CS) of FA concrete usually requires numerous adjustments to the concrete mix ratio when conventional methods are used. This involves casting laboratory concrete specimens and performing compression tests to evaluate CS. If the measured strength does not meet the desired standard, new specimens must be prepared, which is time-consuming and increases labor expenses. An effective alternative that could predict the CS of a particular mix before compression testing would therefore be highly beneficial, providing valuable insights in advance, enabling more efficient adjustments to the mix ratio, and reducing the need for repeated specimen preparation and testing.

The emergence and development of machine learning (ML) have significantly impacted civil engineering (Kaveh, 2024; Manzoor et al., 2021). Various ML models have been successfully applied to predict the compressive strength of concrete, yielding promising results (Al-Gburi & Yusuf, 2022; Sathiparan, 2024; Sathiparan et al., 2023). These techniques rely on extensive datasets to build precise models, and the accuracy of their predictions depends primarily on the quality and completeness of the data samples collected from experimental procedures during specimen casting or from literature studies. Researchers utilize these algorithms to predict the mechanical properties of concrete with improved reliability and efficiency.

Kaveh et al. (1999) developed a hybrid method integrating graph theory with neural networks for domain decomposition, enhancing accuracy and efficiency in structured finite element meshes. Iranmanesh and Kaveh (1999) introduced a neurocomputing strategy combining neural networks with structural optimization techniques. Singh et al. (2023) used ML models with 14 input parameters on a dataset of 400 points to predict the CS of red mud (RM)-based concrete; decision tree (DT) and extra tree regressor (ET) models provided the best fit. Microstructural analysis and leaching tests confirmed the safety and compliance of RM concrete, making it suitable for eco-friendly construction, especially for low-traffic or rural roads. Albostami et al. (2023) applied data-driven approaches to predict the CS of self-compacting concrete (SCC) with recycled plastic aggregates (RPA). Using 400 experimental datasets, they employed multi-objective genetic algorithm evolutionary polynomial regression (MOGA-EPR) and gene expression programming (GEP); these models outperformed the traditional linear regression (LR) model. Kaveh et al. (2021) applied ML to relate fiber angle to buckling capacity under bending-induced loads. Their deep learning model, trained on a dataset of 11,000 cases, outperformed random forest (RF), DT, and LR models, demonstrating superior accuracy and generalization. Kaveh et al. (2023) developed metaheuristic-trained artificial neural networks (ANNs) to predict the ultimate buckling load of high-strength steel columns. Using particle swarm optimization and genetic algorithms to optimize ANN weights and biases, their models achieved up to 99.8% accuracy.

In the context of FA-based concrete, Ahmad et al. (2021) studied the utilization of ML techniques to predict the CS of concrete incorporating SCMs. They employed bagging, DT, adaptive boosting, and GEP models; among these, the bagging regressor provided the best prediction results. In their study, coarse aggregate, fine aggregate, and cement contributed 24.6%, 18.4%, and 16.3%, respectively, to the prediction outcomes. Jiang et al. (2022) used ML algorithms to predict the CS of concrete made with FA. They employed four ML models: RF, extreme learning machine, support vector regression (SVR), and SVR with grid search (SVR-GS). The SVR-GS model produced the most accurate predictions, with age and water-cement ratio being the most influential features affecting CS. Mahajan and Bhagat (2022) investigated ANN, DT, GEP, and bagging regressor models to predict the CS of concrete with FA admixture. Their prediction model used seven input elements (cement content, fine aggregate, coarse aggregate, fly ash, superplasticizer, water content, and curing days) to predict the output parameter. The bagging algorithm outperformed ANN, DT, and GEP, achieving an R2 value of 0.97, compared with 0.81, 0.78, and 0.82, respectively. Chopra et al. (2016) utilized genetic programming and ANN to forecast concrete CS, both with and without FA, collecting the relevant data from controlled laboratory experiments at various curing periods. Their results indicated that the ANN model, using the Levenberg–Marquardt (LM) training function, was the most effective tool for predicting concrete CS.

Research significance

Several experimental studies have investigated the impact of FA addition on concrete CS. However, only a few have focused on predicting FA concrete CS using ML models, and many of these have relied on a limited number of dataset points and input parameters. Notably, the use of the chemical composition of FA (silica content, lime content, iron oxide content, aluminum oxide content, and loss on ignition) as input parameters for predicting concrete CS has rarely been reported in the literature. Including the chemical composition addresses the variability in FA properties, which significantly influences concrete's mechanical properties.

Addressing these gaps, the current research aims to employ six distinct ML models with 1,089 dataset points and 12 input parameters to predict the FA concrete CS. The research objectives are:

  • To develop ML models that can accurately predict the FA concrete CS.

  • To compare the performance of the models using the metrics MSE, MAE, and R2.

  • To examine the relative significance and impact of each input feature on the CS.

  • To develop a comprehensive graphical user interface (GUI) to facilitate user interaction with the prediction models.

Methodology

The flowchart for the methodology used in the current study is shown in Fig. 1.

Fig. 1

Flowchart of the methodology adopted in the current study

Data collection

A total of 1,089 dataset points based on the use of FA in concrete were collected from the existing literature (Alaka & Oyedele, 2016; Balakrishnan & Awal, 2014; Barbhuiya et al., 2009; Chen et al., 2019; Chindaprasirt et al., 2007; Atis, 2003; Durán-Herrera et al., 2011; Felekoglu, 2006; Hashmi et al., 2020; Golewski, 2018; Hansen, 1990; Huang et al., 2013; Kumar et al., 2007; Kumar et al., 2021; McCarthy & Dhir, 2005; Mehta & Gjorv, 1982; Mukherjee et al., 2013; Nochaiya et al., 2010; Oner et al., 2005; Reiner & Rens, 2006; Saha, 2018; Shaikh & Supit, 2015; Siddique, 2004; Siddique & Khatib, 2010; Sun et al., 2019; Woyciechowski et al., 2019; Yazici et al., 2012) in terms of twelve input parameters: water-binder (w/b) ratio, cement content (kg/m3), coarse aggregate (kg/m3), fine aggregate (kg/m3), silicon dioxide (%), calcium oxide (%), ferric oxide (%), aluminum oxide (%), loss on ignition (%), superplasticizer (kg/m3), curing days, and replacement percentage, and one output parameter: compressive strength (MPa). The dataset included 35 different types of fly ash, each characterized by diverse chemical and physical properties. The database incorporated data from concrete specimens of varying shapes and sizes, with four distinct configurations utilized; relevant shape factors were employed in analyzing these specimens. Of the 1,089 data points, 872 (80%) were allocated for training the models, while 217 (20%) were reserved for testing. The various input parameters are depicted in Fig. 2.

Fig. 2

Input parameters used in the current study

Pre-processing

In the pre-processing phase, the dataset was subjected to standard scaling to ensure all numeric features were on a comparable scale. This involved centering the data around zero and rescaling it to unit variance using Python's standard scaling functionality. By standardizing the features in this manner, potential issues stemming from varying scales were mitigated, ensuring that each feature contributed equally to the model's learning process.
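A minimal sketch of this scaling step is shown below, assuming the dataset is loaded into a pandas DataFrame; the file and column names are illustrative, not those of the actual study files.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = pd.read_csv("fa_concrete.csv")            # hypothetical file name
X = data.drop(columns=["compressive_strength"])  # twelve input features
y = data["compressive_strength"]                 # output: CS (MPa)

# 80/20 train-test split, as used in the study
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training data only, then apply it to both splits,
# centering each feature at zero with unit variance.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training split alone avoids leaking test-set statistics into the models.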

Statistical analysis

Descriptive statistical analysis of the input variables and the output variable (CS) is summarized in Table 1, where ‘mean’ represents the average value, ‘std’ the standard deviation, ‘min’ and ‘max’ the minimum and maximum values, ‘25%’, ‘50%’, and ‘75%’ the first, second, and third quartiles, and ‘skew’ and ‘kurt’ the skewness and kurtosis, respectively.

Table 1 Descriptive statistical analysis of variables

Machine learning models employed

Linear regression (LR)

LR is a foundational statistical technique used to model the relationship between an outcome variable and one or more predictor variables. This is achieved by fitting a linear equation to the observed data to capture underlying patterns and trends (Su et al., 2012). The model determines the coefficients for each input feature by minimizing the sum of squared errors between the predicted and actual values. The predicted value is calculated as a linear combination of the input features, where each feature is multiplied by its respective coefficient and summed together. For this study, an LR model was initialized and trained using the LinearRegression class from the scikit-learn library in Python. The mathematical equation for the trained LR model is represented in Eq. (1) below.

$$\begin{aligned}\text{y} = {} & 104.295 - 76.280\,(\text{w/b ratio}) - 0.004\,(\text{Cement Content}) + 0.011\,(\text{Fine Aggregate}) \\ & + 0.001\,(\text{Coarse Aggregate}) - 0.478\,(\text{SiO}_2) - 0.512\,(\text{CaO}) + 0.005\,(\text{Fe}_2\text{O}_3) - 0.116\,(\text{Al}_2\text{O}_3) \\ & - 0.233\,(\text{Loss on Ignition}) + 0.397\,(\text{Superplasticizer}) + 0.109\,(\text{Curing Days}) \\ & - 0.483\,(\text{Replacement Percentage})\end{aligned}$$
(1)
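As an illustration, the LR model can be initialized and trained as sketched below; variable names carry over from the pre-processing sketch, and the model is fitted on unscaled features here so that the learned intercept and coefficients can be read in the units of Eq. (1).

```python
from sklearn.linear_model import LinearRegression

# Fit on the raw (unscaled) training features so the intercept and
# coefficients are expressed in the original units of each feature.
lr = LinearRegression()
lr.fit(X_train, y_train)

print("Intercept:", lr.intercept_)
for name, coef in zip(X_train.columns, lr.coef_):
    print(f"{name}: {coef:+.3f}")   # one coefficient per input feature
```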

Decision tree (DT)

DT is a supervised learning algorithm employed for predictive modeling. The model functions by recursively dividing the feature space into regions (Myles et al., 2004). Each internal node represents a decision based on a particular attribute, while each leaf node represents a predicted value. This approach allows the model to capture nonlinear relationships between input features and the target variable. In this study, a DT model was initialized using the DecisionTreeRegressor class from the scikit-learn library in Python. Unlike ensemble methods such as random forests, decision tree regression constructs a single tree trained on the entire dataset. The hyperparameters chosen for the model, selected to balance complexity and predictive accuracy, are shown in Table 2. The decision tree, up to a depth of three, is shown in Fig. 3.
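A brief sketch of this step is given below; the hyperparameter values are placeholders rather than the values actually adopted (those are listed in Table 2).

```python
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeRegressor, plot_tree

# Placeholder hyperparameters; the adopted values are in Table 2.
dt = DecisionTreeRegressor(max_depth=10, min_samples_leaf=5, random_state=42)
dt.fit(X_train, y_train)

# Render the upper part of the fitted tree (cf. Fig. 3)
plt.figure(figsize=(14, 6))
plot_tree(dt, max_depth=3, feature_names=list(X_train.columns), filled=True)
plt.show()
```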

Table 2 Hyperparameters for the DT model
Fig. 3

DT regressor up to tree depth three

Random forest (RF)

RF is an ensemble learning technique that builds multiple decision trees during training and averages their predictions to produce a final output (Biau & Scornet, 2016). This method enhances predictive performance and reduces overfitting by training each tree on a random subset of features and data samples. In this research, an RF model was initialized using the RandomForestRegressor class from the scikit-learn library in Python. To determine the best hyperparameters, a grid search was conducted, wherein the mean square error was computed for different leaf sizes and plotted against the number of estimators, allowing us to visualize the relationship between the number of estimators and model performance. By analyzing the graph, hyperparameters were tuned to minimize MSE and improve model accuracy. The chosen hyperparameters and the aforementioned graph are shown in Table 3 and Fig. 4, respectively.
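The sketch below illustrates such a grid search; the grids for leaf size and number of estimators are assumed for illustration and may differ from those actually swept.

```python
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Sweep the number of estimators for several leaf sizes and plot the test
# MSE, mirroring the tuning curve in Fig. 4 (grid values are illustrative).
n_trees = list(range(10, 210, 10))
for leaf_size in [1, 2, 5, 10]:
    errors = []
    for n in n_trees:
        model = RandomForestRegressor(n_estimators=n,
                                      min_samples_leaf=leaf_size,
                                      random_state=42)
        model.fit(X_train, y_train)
        errors.append(mean_squared_error(y_test, model.predict(X_test)))
    plt.plot(n_trees, errors, label=f"min_samples_leaf={leaf_size}")

plt.xlabel("Number of estimators")
plt.ylabel("MSE")
plt.legend()
plt.show()

# Final model with the selected hyperparameters (placeholders; cf. Table 3)
rf = RandomForestRegressor(n_estimators=100, min_samples_leaf=2,
                           random_state=42)
rf.fit(X_train, y_train)
```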

Table 3 Hyperparameters for the RF model
Fig. 4

RF MSE vs. number of estimators for different leaf sizes

Extreme gradient boosting (XGB)

XGB is a highly optimized and scalable implementation of gradient-boosting machines. It is renowned for its exceptional performance in various ML tasks, particularly in regression and classification problems. XGB operates by iteratively incorporating decision trees into an ensemble, wherein each tree is trained to rectify the errors made by the preceding ones (Chen & Guestrin, 2016). This boosting process focuses on minimizing a loss function by optimizing the predictions of the ensemble. An XGB model was initialized and trained using the XGBRegressor class from the xgboost library in Python. To determine the optimal hyperparameters, a grid search was conducted, wherein the mean square error was computed for different learning rates and plotted against the number of estimators. This allowed us to visualize the relationship between the number of estimators and model performance, aiding in the decision about hyperparameters. The chosen hyperparameters and the relevant graph are shown in Table 4 and Fig. 5, respectively.
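A comparable sketch for XGB is shown below, sweeping the learning rate and number of estimators; the grid values and the final hyperparameters are illustrative (the adopted values are in Table 4).

```python
import matplotlib.pyplot as plt
from xgboost import XGBRegressor
from sklearn.metrics import mean_squared_error

# Sweep boosting rounds for several learning rates and plot the test MSE,
# mirroring Fig. 5 (grid values are illustrative).
n_rounds = list(range(50, 550, 50))
for lr_rate in [0.05, 0.1, 0.2, 0.3]:
    errors = []
    for n in n_rounds:
        model = XGBRegressor(n_estimators=n, learning_rate=lr_rate,
                             random_state=42)
        model.fit(X_train, y_train)
        errors.append(mean_squared_error(y_test, model.predict(X_test)))
    plt.plot(n_rounds, errors, label=f"learning_rate={lr_rate}")

plt.xlabel("Number of estimators")
plt.ylabel("MSE")
plt.legend()
plt.show()

# Final model with the selected hyperparameters (placeholders; cf. Table 4)
xgb = XGBRegressor(n_estimators=300, learning_rate=0.1, random_state=42)
xgb.fit(X_train, y_train)
```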

Table 4 Hyperparameters for the XGB model
Fig. 5

XGB MSE vs. number of estimators for different learning rates

Support vector regression (SVR)

SVR leverages the principles of support vector machines for regression analysis, offering a robust technique. Its objective is to identify the optimal hyperplane that maximizes the margin while minimizing the error between predicted and observed values (Pisner & Schnyer, 2019). In this study, an SVR model was trained using the SVR class from the scikit-learn library in Python. Prior to training, the features were standardized with the StandardScaler from the same library to ensure consistent scaling across features, thereby enhancing model performance. To fine-tune the hyperparameters and find the best combination of C and gamma, an R2 heatmap was plotted for various combinations. The selected hyperparameters and the accuracy heatmap are shown in Table 5 and Fig. 6, respectively.
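The sketch below shows one way to generate such a heatmap, scoring R2 on the test set over an assumed grid of C and gamma values (the adopted values are in Table 5).

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Illustrative grid of C and gamma values
C_values = [1, 10, 100, 1000]
gamma_values = [0.001, 0.01, 0.1, 1]
scores = np.zeros((len(C_values), len(gamma_values)))

for i, C in enumerate(C_values):
    for j, gamma in enumerate(gamma_values):
        # Pipeline standardizes the features before fitting the SVR
        svr = make_pipeline(StandardScaler(), SVR(C=C, gamma=gamma))
        svr.fit(X_train, y_train)
        scores[i, j] = svr.score(X_test, y_test)   # R2 for regressors

sns.heatmap(scores, annot=True, fmt=".2f",
            xticklabels=gamma_values, yticklabels=C_values)
plt.xlabel("gamma")
plt.ylabel("C")
plt.show()
```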

Table 5 Hyperparameters for the SVR model
Fig. 6

Accuracy heatmap for the SVR model for different combinations of C and gamma

Artificial neural network (ANN)

ANNs are computational models consisting of interconnected nodes arranged in layers: an input layer, one or more hidden layers, and an output layer. Each node performs a transformation on its input and forwards the outcome to the nodes in the subsequent layer. Through a process termed training, ANNs adjust the weights of connections between nodes to minimize a loss function and enhance predictive accuracy (Khan, 2018). In the current research, the model architecture was defined using the Keras library, with a sequential model featuring an input layer, a dense hidden layer with a variable number of neurons, and an output layer. The hidden layer utilized the rectified linear activation function (ReLU), while the output layer used a linear activation function, suitable for regression tasks. To determine the optimal number of neurons in the hidden layer, a graph was plotted showing the mean squared error versus the number of neurons. This visualization helped select the model complexity that best balanced underfitting and overfitting. The chosen hyperparameters and the above-mentioned graph are shown in Table 6 and Fig. 7, respectively.
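A minimal sketch of this architecture in Keras is shown below; the neuron count and training settings are placeholders (the adopted values are in Table 6).

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# One hidden ReLU layer and a linear output, as described above; 64 neurons
# is a placeholder for the value selected via the curve in Fig. 7.
ann = Sequential([
    Input(shape=(12,)),               # twelve standardized input features
    Dense(64, activation="relu"),     # hidden layer
    Dense(1, activation="linear"),    # linear output for regression
])
ann.compile(optimizer="adam", loss="mse", metrics=["mae"])
ann.fit(X_train_scaled, y_train, epochs=200, batch_size=32,
        validation_split=0.1, verbose=0)
```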

Table 6 Hyperparameters for the ANN model
Fig. 7

ANN MSE vs. number of neurons

Results and discussion

Data visualization plots

Marginal plot

A marginal plot combines a scatter plot of input variables against the output variable with histograms or density plots of each variable along the axes. This plot allows for a simultaneous examination of the relationship between predictor variables and the output variable, while also visualizing the distribution of each variable. It facilitates understanding of how the input variables collectively affect the output variable and provides insights into their individual distributions. The marginal plot for all input variables with respect to the output variable is shown in Fig. 8.

Fig. 8

Marginal plot between FA concrete CS and a water-binder ratio, b cement content, c fine aggregate, d coarse aggregate, e SiO2, f CaO, g Fe2O3, h Al2O3, i loss on ignition, j superplasticizer, k curing days, and l replacement percentage

Correlation heatmap

A correlation heatmap visually represents a correlation matrix, using colors to indicate the magnitude and direction of correlations between variables. Typically, warmer colors denote positive correlations, cooler colors represent negative correlations, and neutral colors signify no correlation. These heatmaps illustrate linear correlations between all possible combinations of variables in a dataset, offering insights into relationships and patterns that may exist among them. The heatmap for the employed dataset is presented in Fig. 9.
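Such a heatmap can be produced in a few lines, e.g. with seaborn, as sketched below for the DataFrame from the pre-processing step.

```python
import seaborn as sns
import matplotlib.pyplot as plt

# Pearson correlation matrix of all thirteen variables, drawn as a
# heatmap (cf. Fig. 9); `data` is the DataFrame loaded earlier.
plt.figure(figsize=(10, 8))
sns.heatmap(data.corr(), annot=True, fmt=".2f", cmap="coolwarm", center=0)
plt.show()
```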

Fig. 9

Correlation matrix heatmap for the employed dataset

Curing days (0.56) and cement content (0.34) exhibit the highest positive correlation coefficients with the output CS, while the water-binder ratio (-0.39) and replacement percentage (-0.29) show the strongest negative correlations. Furthermore, since no feature is entirely uncorrelated with the output, all twelve input parameters can be utilized for predicting the CS.

Performance metrics

The comparison of the six regression models revealed distinct performance differences, highlighted through three key metrics: MSE, MAE, and R2. These metrics are essential for understanding how well the models predict outcomes. MSE emphasizes larger errors by squaring the differences between predicted and observed values (Allen, 1971). MAE reflects the average magnitude of errors without considering their direction (Willmott & Matsuura, 2005). R2 indicates how effectively the model explains the variability of the dependent variable using the independent variables; higher R2 values suggest that the model better captures the patterns and relationships in the data.

The mathematical expressions for MSE, MAE, and R2 are given in Eqs. (2), (3), and (4) (Chicco et al., 2021).

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
(2)
$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
(3)
$$\text{R}^{2} = 1 - \frac{\text{SS}_{\text{res}}}{\text{SS}_{\text{tot}}}$$
(4)

where \(n\) represents the number of samples, \(y_i\) the actual value, \(\hat{y}_i\) the predicted value, \(\text{SS}_{\text{res}}\) the sum of squared residuals (errors), and \(\text{SS}_{\text{tot}}\) the total sum of squares.
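All three metrics can be computed directly with scikit-learn, as sketched below for a fitted model on the held-out test set.

```python
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Test-set metrics for a fitted model (here the XGB regressor, cf. Table 7)
y_pred = xgb.predict(X_test)
print("MSE:", mean_squared_error(y_test, y_pred))
print("MAE:", mean_absolute_error(y_test, y_pred))
print("R2: ", r2_score(y_test, y_pred))
```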

The MSE, MAE, and R2 values for the employed models are presented in Table 7.

Table 7 Performance metrics for employed models for training and testing datasets

The ensemble models, namely XGB and RF, outperform the others, exhibiting low MSE and high R2 values on both training and testing datasets. DT also performs well on the training data but generalizes only moderately to the testing data, while SVR and ANN show moderate performance on both datasets.

Prediction plot

A scatter plot of actual versus predicted values visually assesses how well the model predictions align with the true values. Each point on the plot represents a data instance, where the x-coordinate denotes the actual value, and the y-coordinate represents the predicted value. Ideally, all points would lie on the diagonal line (the identity line), indicating perfect alignment between forecasted and actual values. Figure 10 displays the scatter plot of actual vs. predicted values for the FA concrete CS for all the models used in this study.

Fig. 10

Scatter plot of actual vs. predicted values of FA concrete CS for a LR, b DT, c RF, d XGB, e SVR, and f ANN model for training and testing datasets

Residual plot and distribution of residuals

In regression analysis, residual plots and the distribution of residuals play pivotal roles in assessing the adequacy and validity of the regression model (Suleiman et al., 2015). A residual plot visually depicts the differences between observed and predicted values, typically plotted against the independent variable(s) or the predicted values themselves. This graphical representation enables researchers to scrutinize key aspects of the model's performance: linearity and homoscedasticity. Specifically, a horizontal pattern in the residual plot suggests a linear relationship between the independent and dependent variables, while a consistent spread of residuals across all levels of the independent variable(s) or predicted values indicates homoscedasticity.

The distribution of residuals provides insights into normality, skewness, and kurtosis, aiding in the assessment of the assumptions underlying the regression model. Deviations from normality or symmetry in the residual distribution may signal issues with the model's validity and highlight areas for refinement or further investigation. The percentage of predictions within ± 5 MPa of the actual values for the employed models is shown in Table 8.

Table 8 Percentage of predictions within ± 5 MPa for employed models

In examining all six models, the consistent presence of random scatter in the residual plots indicates that our modeling techniques effectively capture the diverse relationships within the data, without exhibiting systematic patterns. Furthermore, the residuals' normal distribution, centered around zero for most models, reinforces the reliability and robustness of our approach. In summary, the collective analysis of these models provides strong evidence supporting the validity of our statistical modeling framework in comprehensively explaining the inherent variability within the dataset.

The residual plot and distribution of residuals for employed models are shown in Fig. 11.

Fig. 11

Residual plot and distribution of residuals for a LR, b DT, c RF, d XGB, e SVR, and f ANN model

K-fold cross validation

K-fold cross-validation is a method employed to evaluate the performance of ML models reliably. The dataset is randomly divided into k equally sized subsets; one subset is set aside for validation, while the remaining k-1 subsets are used for training. This procedure is repeated k times, with each subset serving as the validation set exactly once. Averaging the results of these iterations gives a more reliable assessment of the model's performance and reduces potential biases. In the current study, a 10-fold cross-validation approach was utilized, and the outcomes were evaluated using MSE, MAE, and R2, as shown in Fig. 12.
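A sketch of this procedure using scikit-learn's cross-validation utilities is shown below; the XGB regressor stands in for any of the six models.

```python
from sklearn.model_selection import KFold, cross_validate

# 10-fold cross-validation scoring MSE, MAE, and R2 (cf. Fig. 12); sklearn
# reports error metrics as negative scores, hence the sign flips below.
cv = KFold(n_splits=10, shuffle=True, random_state=42)
results = cross_validate(
    xgb, X, y, cv=cv,
    scoring=("neg_mean_squared_error", "neg_mean_absolute_error", "r2"))

print("MSE per fold:", -results["test_neg_mean_squared_error"])
print("MAE per fold:", -results["test_neg_mean_absolute_error"])
print("R2 per fold: ", results["test_r2"])
```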

Fig. 12

K-fold cross validation results using a MAE, b MSE, and c R2 for employed models

Across the folds, RF and XGB consistently demonstrate the lowest MSE and highest R2 values, indicating robust performance and strong predictive accuracy. SVR also maintains competitive performance, with moderate MSE and high R2 values. Conversely, DT exhibits higher variability and generally higher MSE, while LR consistently displays the highest MSE and lowest R2 values, suggesting less reliable predictive capability.

Regression error characteristics (REC)

The regression error characteristic (REC) curve is a graphical tool for evaluating regression models. It plots absolute error values on the x-axis and cumulative distribution function (CDF) values on the y-axis (Bennett et al., 2003). This curve illustrates how prediction error varies across different levels of accuracy. The CDF, representing the cumulative proportion of data points with absolute errors less than or equal to a certain threshold, provides valuable information about the distribution of errors in the predictions made by the regression model. REC curves are instrumental for comparing models and understanding how the dataset size affects prediction accuracy. The REC curve for the employed models is depicted in Fig. 13.
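Since an REC curve is simply the empirical CDF of the absolute prediction errors, it can be constructed as sketched below; the set of models plotted is illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def rec_curve(y_true, y_pred):
    """Sorted absolute errors and their empirical CDF."""
    errors = np.sort(np.abs(np.asarray(y_true) - np.asarray(y_pred)))
    cdf = np.arange(1, len(errors) + 1) / len(errors)
    return errors, cdf

# Models fitted in the earlier sketches; the selection is illustrative
for name, model in {"LR": lr, "RF": rf, "XGB": xgb}.items():
    err, cdf = rec_curve(y_test, model.predict(X_test))
    plt.plot(err, cdf, label=name)

plt.xlabel("Absolute error")
plt.ylabel("Fraction of predictions within tolerance")
plt.legend()
plt.show()
```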

Fig. 13

REC analysis of employed models

XGB, RF, ANN, and SVR show strong performance, as evidenced by their close alignment with the x-axis, indicating lower error rates across various thresholds. In contrast, LR and DT perform poorly and moderately, respectively.

Shapley additive explanation (SHAP) analysis

SHAP, or Shapley additive explanations, is a mathematical method used to interpret the predictions of ML models. A SHAP summary plot provides a comprehensive overview of feature importance and influence on model predictions: features are displayed along the y-axis, ranked by their importance, while the x-axis shows the SHAP values, indicating the direction and magnitude of each feature's impact on predictions across all data points. For this study, SHAP analysis was performed using the XGB model due to its superior performance, as shown in Fig. 14.
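A minimal sketch of this analysis with the shap library is given below, assuming the fitted XGB model from the earlier sketches.

```python
import shap

# Tree SHAP values for the fitted XGB regressor; the summary plot ranks
# features by their mean absolute SHAP value (cf. Fig. 14).
explainer = shap.TreeExplainer(xgb)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
```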

Fig. 14

SHAP analysis values for the XGB model

Curing days, water-binder ratio, cement content, and replacement percentage are the most impactful parameters in the prediction of FA concrete CS for the given data set.

Partial dependence plot

Partial dependence plots (PDPs) are visual tools that show the relationship between a subset of input features and the predicted outcome of a model. PDPs display how changes in specific features affect the predicted response while averaging out the effects of all other features, helping to explain each feature's influence on the model's predictions and providing insights into model behavior and feature importance. PDPs can also serve as a validation tool, ensuring that the model's predictions are consistent with domain knowledge or expectations. In this study, PDPs were constructed for the most influential parameters (curing days, water-binder ratio, and cement content) using the XGB model while holding the other features constant at their mean values, as shown in Fig. 15.
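The sketch below reproduces this construction directly, varying one feature over its observed range while all other features are fixed at their means; the column names are illustrative stand-ins.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def mean_fixed_dependence(model, X, feature, n_points=50):
    """Predicted response over one feature's range, all others at their mean."""
    grid = np.linspace(X[feature].min(), X[feature].max(), n_points)
    rows = pd.concat([X.mean().to_frame().T] * n_points, ignore_index=True)
    rows[feature] = grid
    return grid, model.predict(rows)

# Illustrative column names for the three most influential features
for feat in ["curing_days", "w_b_ratio", "cement_content"]:
    grid, preds = mean_fixed_dependence(xgb, X_train, feat)
    plt.plot(grid, preds, label=feat)
plt.xlabel("Feature value")
plt.ylabel("Predicted CS (MPa)")
plt.legend()
plt.show()
```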

Fig. 15

Partial dependence plot for a curing days, b water-binder ratio, and c cement content using the XGB model

Graphical user interface (GUI)

The development of a graphical user interface (GUI) for the prediction models marks a major step toward enhancing the practicality and accessibility of ML applications. The GUI was built using the Flask framework and deployed on Render. The interface features a dedicated space for users to input values for all relevant features, ensuring comprehensive data entry, and a drop-down menu that allows users to select the model they wish to employ for predictions, providing flexibility across model architectures and algorithms. Users can trigger predictions with a single click and receive immediate feedback on the predicted outcome. The GUI and related code files are available at https://fa-cs-pred-ekc0.onrender.com/ and https://github.com/abhinavkapil/FA_CS_PRED, respectively. The interface of the developed GUI is shown in Fig. 16.
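A stripped-down sketch of how such a Flask service might look is given below; the routes, form fields, and model file names are illustrative assumptions, not the actual implementation in the linked repository.

```python
import pickle
from flask import Flask, request, render_template

app = Flask(__name__)

# Hypothetical field names for the twelve inputs and pickled model files
FEATURE_NAMES = ["w_b_ratio", "cement", "fine_agg", "coarse_agg", "sio2",
                 "cao", "fe2o3", "al2o3", "loi", "superplasticizer",
                 "curing_days", "replacement_pct"]
MODELS = {name: pickle.load(open(f"models/{name}.pkl", "rb"))
          for name in ("lr", "dt", "rf", "xgb", "svr")}

@app.route("/", methods=["GET", "POST"])
def predict():
    prediction = None
    if request.method == "POST":
        # Collect the twelve feature values entered in the form
        features = [[float(request.form[f]) for f in FEATURE_NAMES]]
        model = MODELS[request.form["model"]]      # drop-down selection
        prediction = round(float(model.predict(features)[0]), 2)
    return render_template("index.html", prediction=prediction)

if __name__ == "__main__":
    app.run()
```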

Fig. 16

The interface of the prepared GUI to predict the FA concrete CS

Conclusions

This study employed six distinct ML models on 1,089 dataset points, collected from the literature on FA use in concrete and described by twelve input parameters, to predict the FA concrete CS. The following key findings emerged from this study:

  1. The wide range of the input variables and the output variable, as evidenced by the statistical analysis and marginal plots, validated the reliability of the collected dataset.

  2. Correlation analysis revealed that no features were uncorrelated, so all input features were utilized to increase the accuracy of the developed models.

  3. The ensemble ML models (RF and XGB) showed better performance, as indicated by higher R2 values and lower statistical errors (MSE and MAE), with XGB being the most accurate (R2 value of 0.95). SVR and ANN performed moderately on both training and testing datasets, while DT and LR were the least effective, with R2 values of 0.80 and 0.70, respectively.

  4. K-fold cross-validation, used to confirm the accuracy of the developed models, yielded similar results, with the XGB regressor showing superior performance across all folds.

  5. Based on REC analysis, XGB, RF, SVR, and ANN showed strong performance, with low error rates across various thresholds, while DT and LR performed moderately and poorly, respectively.

  6. Based on SHAP analysis, curing days, water-binder ratio, cement content, and replacement percentage were the most critical parameters in FA concrete CS prediction for the given dataset.

  7. The partial dependence plots for curing days, water-binder ratio, and cement content were consistent with the general trend.

  8. A graphical user interface (GUI) was successfully developed, enabling users to predict the FA concrete CS from their own set of input values.