1 Introduction

Management of industrial waste materials is a global problem; fly ash (FA) is a waste product of coal combustion in power plants. Supplementary cementitious materials (SCMs) are used in concrete plants to partially replace Portland cement in cement-based mortar and concrete. The hydration of cement with water forms calcium silicate hydrate gel (C–S–H) and calcium hydroxide (C–H). SCMs such as fly ash react with C–H to form additional C–S–H, which mitigates the durability problems related to C–H, a phase that is vulnerable to chemical attack [1]. Fly ash-modified cementitious materials generate less heat during hydration and are therefore suitable for mass concrete [2,3,4,5,6]. Their strength is greatly influenced by the physical characteristics and chemical composition of the fly ash; these properties depend on the coal type, the equipment used in the power plant, and the reactivity of the fly ash [6, 7]. The pozzolanic reactivity of fly ash has been investigated in various studies. It can be measured through chemical analysis to determine the quantity of silica or by measuring the heat developed during hydration. However, since the silicate forms a gel at a pH greater than 10, the amount of silica used in the gel formation must be considered [8, 9]. After 28 days of curing, the consumption of C–H through the pozzolanic reaction of fly ash can be measured by X-ray diffraction (XRD) or thermal analysis [10]. Natural hydraulic lime is produced by the calcination and subsequent slaking of marly limestones (limestones with clay impurities, which after calcination become reactive silicates and aluminates). It thus sets through both hydration and carbonation, leading to the formation of hydraulic compounds and calcite, respectively.
Because of the relatively low calcination temperatures required, it is considered an eco-friendly material compared with modern binders such as cement [11]; furthermore, during hardening, part of the CO2 emitted through limestone calcination is consumed by carbonation, further lowering the total environmental impact associated with greenhouse gases [11]. Cho et al. [12] evaluated the effect of fly ash chemical composition on the compressive strength of fly ash-modified cement mortar using sixteen different types of fly ash as cement replacement. They concluded that the pozzolanic reactivity of fly ash is mainly affected by the percentages of SiO2, Al2O3, and Fe2O3, and that the effect of fly ash on compressive strength is greater at 90 days of curing than at 28 days. Chindaprasirt et al. [13] studied the effect of fly ash fineness on the mechanical properties, sulfate resistance, and drying shrinkage of cement mortar. The results showed that fly ash with higher fineness improves strength, drying shrinkage, and resistance to sulfate attack. Chindaprasirt et al. [14] evaluated the workability and chloride ion resistance of cement mortar modified with fly ash. Replacing cement with fly ash improved both the resistance to chloride ion penetration and the workability of the cement mortar.

Modeling the properties of materials can be performed in various ways, including computational modeling, statistical techniques, and more recent tools such as regression analysis, M5P-tree, and artificial neural networks (ANN) [15,16,17,18,19].

Mohammed et al. [20] used ANN, M5P-tree, and nonlinear regression to predict the compressive strength of cement-based mortar modified with fly ash. They concluded that the ANN model can be used efficiently, with a high correlation coefficient (R) and minimum RMSE. An ANN model was also used by Apostolopoulou et al. [11] to predict the compressive strength of natural hydraulic lime mortars; the results revealed that ANN could accurately forecast the CS of natural hydraulic lime mortars, implying that it can be used as a decision-making tool when developing such mortars. Armaghani and Asteris [21] investigated the application of ANN and adaptive neuro-fuzzy inference system (ANFIS) models to predict the compressive strength of cement mortar with or without metakaolin and concluded that ANFIS performed better than ANN, although overfitting was observed for some of the data. They noted that, despite the extensive use of mortar in construction over the last decades, no reliable and robust method was yet available in the literature to estimate its strength from its mix parameters, a limitation attributed to the highly nonlinear relation between the mortar's compressive strength and the mix components. Surrogate models (ANN and ANFIS) were therefore developed and trained using experimental data available in the literature. The comparison of the derived results with the experimental findings demonstrated the ability of both the ANN and ANFIS models to approximate the compressive strength of mortars reliably and robustly.
Although ANFIS predicted the compressive strength of mortars more accurately than the ANN model, verification against additional data showed that the ANFIS model had overfitted the data. Therefore, the developed ANN model was put forward as the best predictive technique for the compressive strength of mortars. Furthermore, an ambitious attempt to reveal the nature of mortar materials has been made [22, 23]. Soft computing techniques have also been used to estimate concrete's compressive strength (CS) from two non-destructive tests, namely ultrasonic pulse velocity and the rebound hammer test. Specifically, six conventional soft computing models were used: back-propagation neural network (BPNN), relevance vector machine, minimax probability machine regression, genetic programming, Gaussian process regression, and multivariate adaptive regression spline (MARS). To construct and validate these models, 629 datasets were collected from the literature. The results showed that the BPNN attained the most accurate prediction of concrete CS based on both ultrasonic pulse velocity and rebound number values. The results of the employed MARS and BPNN models were significantly better than those obtained in earlier studies; thus, these two models can assist engineers in the design phase of civil engineering projects in estimating the concrete CS with greater accuracy [23]. In another study, an experimental database consisting of 1030 records was compiled from the machine learning repository of the University of California, Irvine. The database was used to train and validate four conventional machine learning (CML) models, namely Artificial Neural Network (ANN), Linear and Non-Linear Multivariate Adaptive Regression Splines (MARS-L and MARS-C), Gaussian Process Regression (GPR), and Minimax Probability Machine Regression (MPMR).
Subsequently, the predicted outputs of the CML models were combined and trained using ANN to construct the Hybrid Ensemble Model (HENSM). The proposed HENSM produced higher predictive accuracy than the CML models used in that study. The predictive performance of all models for CS prediction was compared using the testing dataset, and the HENSM attained the highest predictive accuracy in both phases. Based on these results, the HENSM has strong potential as a new alternative for handling the overfitting issues of CML models and hence can be used to predict the concrete CS, including in the design of less polluting and more sustainable concrete constructions [24]. Metakaolin is used as an additive in cement mortars, substituting the cement fraction to a certain extent, to enhance the sustainability of cement mortars, both in terms of the environmental impact of raw materials production and in terms of improving the durability of cement-based mortars under environmental actions. However, as metakaolin affects the mechanical performance of cement-based mortars, it is important for structural design to know the compressive strength that these blended mortars achieve at 28 days. To this end, metaheuristic models such as ANN and Genetic Programming (GP) models have been developed and trained using a database compiled from experimental works available in the literature on cement and blended cement-metakaolin mortars. In the model development phase, the most important parameters affecting the strength of concrete-based mortars were investigated and selected. In addition, the effect of the selected transfer functions and the initial values of weights and biases on the performance of the ANN models was investigated.
Based on this analysis, it was shown that ANNs with selected transfer functions (such as the Radial Basis transfer function, the Soft-Max transfer function, and the Normalized Radial Basis transfer function) were able to reliably simulate the 28-day compressive strength of the cement-based mortars. In addition, it was shown that parameters such as the cement grade and the maximum diameter of aggregates are very important in determining the compressive strength of cement-based mortars; this is an important finding because these parameters are usually not taken into account in research studies concerned with the prediction of compressive strength through computational models [25].

In this study, the MEP model was used to predict the compressive strength of fly ash-modified cement mortar using 450 data collected from previous research on modified cement-based mortar, and the outcomes were compared with other approaches, including ANN, nonlinear regression, and M5P-tree models. Various statistical evaluations were applied to assess the accuracy of the models. In addition, the compressive strength was correlated with the flexural and splitting tensile strengths of fly ash-modified cement-based mortar using different nonlinear models.

2 Objectives

This study aims to investigate the application of the MEP model to forecast the compressive strength of cement-based mortar with or without fly ash up to 360 days of curing. The main objectives are as follows:

(i) Statistically analyze the collected data to evaluate the effect of the mix proportions of cement-based mortar modified with fly ash on the compressive strength.

(ii) Develop a reliable model to predict the compressive strength of cement mortar modified with fly ash and assess the sensitivity of the models using different statistical approaches.

(iii) Correlate the compressive strength of cement mortar modified with fly ash with its flexural and splitting tensile strengths.

3 Methodology

Figure 1 presents the steps followed during this study. The methodology consists of the following steps:

(i) Collecting a considerable number of datasets (450 datasets) from different studies published in reputable journals.

(ii) Considering w/c, curing time, and fly ash content as the independent variables (predictors) and the compressive strength of the cement-based mortar as the target.

(iii) Dividing the collected data into three datasets: 70% for training and 30% for testing and validation.

(iv) Performing statistical analysis, visualizing the data, and determining the correlation between the independent and dependent variables.

(v) Modeling the compressive strength using the MEP, NLR, ANN, and M5P-tree models.

(vi) Evaluating the developed models based on R2, RMSE, SI, MAE, OBJ, t-test, 95% uncertainty, and performance index for actual and predicted compressive strength.

(vii) Performing sensitivity analysis to detect the most dominant parameter for the compressive strength of cement-based mortar modified with fly ash.

Fig. 1 Methodology flowchart of the current study

3.1 Data Collection

A total of 450 data on the compressive strength of cement-based mortar modified with fly ash, together with flexural strength data, were collected from the literature [20, 24,25,26,27,28,29,30,31,32,33,34,35,36,37,38]. The dataset was divided randomly into three groups (training, testing, and validating) using the RAND function in Microsoft Excel. The largest group contained 300 data (nominally 70% of the dataset), and each of the other two groups contained 75 data (nominally 15% of the dataset). The training data are used to develop the model, while the validating and testing data are used to test the developed model against unseen data, so that overfitting of the developed model can be minimized [39]. A summary of the statistical analysis of the input and output parameters, with details of the collected data, is shown in Table 1.
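
The random division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used the random function of Microsoft Excel, whereas the sketch uses Python's `random` module, and the function name `split_dataset`, the fixed seed, and the placeholder record identifiers are all hypothetical.

```python
import random

def split_dataset(records, n_train=300, n_test=75, n_val=75, seed=0):
    """Shuffle the records and cut them into training, testing, and validating groups."""
    records = list(records)
    if n_train + n_test + n_val != len(records):
        raise ValueError("group sizes must add up to the number of records")
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is repeatable
    rng.shuffle(records)
    train = records[:n_train]
    test = records[n_train:n_train + n_test]
    val = records[n_train + n_test:]
    return train, test, val

# Placeholder identifiers standing in for the 450 collected mixes.
train, test, val = split_dataset(range(450))
print(len(train), len(test), len(val))
```

Each record lands in exactly one group, so the testing and validating groups stay unseen during model development.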

Table 1 Summary of statistical analysis of model input parameters

3.2 Statistical Analysis

(i) Water to cement ratio (w/c)

According to the statistical evaluation of the collected data, w/c ranged from 0.24 to 1.2, with a mean, standard deviation (SD), variance (Var), skewness (Skew), and kurtosis (Kur) of 0.44, 0.18, 0.03, 1.34, and 3.32, respectively. The relation between w/c and compressive strength and the histogram for w/c are shown in Fig. 2a.

Fig. 2 Marginal plot for (a) compressive strength (CS) with water to cement ratio, (b) CS with curing time, and (c) CS with fly ash content

(ii) Curing time (t), (days)

The collected dataset contained experimental results from previous studies; the curing time ranged from 1 to 360 days, with a median of 7 days and an SD, Var, Skew, and Kur of 51 days, 2672.98, 4.19, and 21.64, respectively. The histogram for curing time and the variation of compressive strength with curing time are presented in Fig. 2b.

(iii) Fly ash content, FA (%)

Based on the data collected from the literature, the maximum percentage of cement replacement with fly ash was 55%, with a mean, SD, Var, Skew, and Kur of 6.77%, 11.87, 140.97, 1.76, and 2.13, respectively. The variation of compressive strength with the percentage of fly ash replacement and the histogram for fly ash content are displayed in Fig. 2c.

(iv) Compressive strength (CS)

From the 450 datasets, the compressive strength of cement-based mortar modified with fly ash up to 360 days ranged from 3.9 to 84 MPa, with a median of 30.3 MPa and an SD, Var, Skew, and Kur of 14.02 MPa, 196.59, 0.57, and 0.18, respectively. The histogram of the compressive strength of cement-based mortar modified with fly ash, together with the Weibull distribution function, is shown in Fig. 3a.

Fig. 3 Histogram for (a) compressive strength, (b) flexural strength, and (c) splitting tensile strength for fly ash modified cement mortar from 1 to 360 days of curing

(v) Flexural strength (FS)

Based on 56 data for tested samples collected from the literature, the flexural strength of cement-based mortar up to 360 days ranged from 0.5 to 8.5 MPa, with a median of 6.8 MPa and an SD, Var, Skew, and Kur of 1.88 MPa, 3.526, −1.01, and −0.394, respectively. The histogram for flexural strength with the smallest extreme value distribution function is shown in Fig. 3b.

(vi) Splitting tensile strength (STS)

According to the 26 data collected from previous research on fly ash modified cement-based mortar up to 360 days, the splitting tensile strength varied from 1.2 to 4 MPa, with a median of 2.77 MPa and an SD, Var, Skew, and Kur of 0.837 MPa, 0.7, −0.228, and −1.1212, respectively. The histogram for splitting tensile strength with the smallest extreme value distribution function is displayed in Fig. 3c.
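
The descriptive statistics reported in this section (mean, SD, Var, Skew, Kur) can be reproduced with a short Python sketch. The exact definitions used by the statistical software in this study are not stated, so the choices below are assumptions: sample variance with an n − 1 denominator, and moment-based skewness and excess kurtosis. The function name `describe` is hypothetical.

```python
import math

def describe(values):
    """Return mean, SD, variance, skewness, and excess kurtosis of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with an n denominator.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    var = m2 * n / (n - 1)  # sample variance (assumed definition)
    return {
        "mean": mean,
        "sd": math.sqrt(var),
        "var": var,
        "skew": m3 / m2 ** 1.5,      # moment-based skewness (assumed definition)
        "kur": m4 / m2 ** 2 - 3.0,   # excess kurtosis (assumed definition)
    }

summary = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(summary)
```

A positive skewness, as reported for w/c, curing time, and fly ash content, indicates a long tail toward high values, which is consistent with most mixes having low w/c, short curing times, and little or no fly ash.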

3.3 Modeling

The correlation between the independent variables and the dependent variable showed no direct relationship between the cement-based mortar composition and the compressive strength. As can be seen from the correlation matrix (Fig. 4), the correlation coefficients (R) of CS with w/c, curing time, and fly ash content are −0.386, 0.541, and −0.279, respectively. The individual relations between the dependent and independent variables are therefore weak, which indicates that the compressive strength of the cement mortar is a multivariable function. Accordingly, MEP is used to develop a model to predict the compressive strength of cement-based mortar modified with fly ash from the mortar composition, namely w/c, curing time, and fly ash content.
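
Each entry of a correlation matrix of this kind is a Pearson correlation coefficient between two variables. The following Python sketch shows the calculation; the function name `pearson_r` and the four data pairs are hypothetical and are not taken from the collected database.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values: strength falling as w/c rises gives a negative R.
wc = [0.30, 0.40, 0.50, 0.60]
cs = [55.0, 47.0, 36.0, 30.0]
r_wc_cs = pearson_r(wc, cs)
print(round(r_wc_cs, 3))
```

A coefficient close to ±1 would indicate that a single variable explains the strength almost linearly; the moderate values reported above are the reason a multivariable model is needed.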

Fig. 4 Correlation matrix for independent variables and dependent variable

3.3.1 Multi-Expression Programming (MEP) Model

The Genetic Algorithm (GA) was first introduced by Holland [40] and was motivated by evolution theory, as was Genetic Programming (GP), proposed by Cramer [40,41,42]. Several linear variants of GP have been proposed to deal with difficulties (such as bloat) caused by the tree representations of GP. A few examples are Cartesian Genetic Programming, Grammatical Evolution (GE), Linear GP, and Gene Expression Programming (GEP) [43]. MEP individuals are strings of genes encoding complex computer programs; when MEP individuals encode expressions for symbolic regression problems, they represent them in a way similar to how compilers translate C or Pascal expressions into machine code [44]. Multiple solutions are stored in a single chromosome in MEP individuals, and the best of them is generally chosen. This is known as strong implicit parallelism, and it is a distinctive characteristic of MEP [45, 46]. This feature does not make MEP more complex than GE and GEP. The MEP model incorporates different fitting parameters to generate a generalized relationship. Simple mathematical operators were employed to generate simple expressions in this investigation, and a trial-and-error procedure was used to determine the fitting parameters [47], as presented in Table 2.

Table 2 Optimal parameters for MEP model

3.3.2 Nonlinear Regression Model (NLR)

The following formula can be considered a general form for developing a nonlinear regression model [39, 48] to predict the compressive strength from the cement-mortar components, including the FA content (Eq. 1).

$$CS={ \beta }_{1}{\left(\frac{w}{c}\right)}^{{ \beta }_{2}}{\left(t\right)}^{{ \beta }_{3}}+{ \beta }_{4}{\left(\frac{w}{c}\right)}^{{ \beta }_{5}}{\left(t\right)}^{{ \beta }_{6}}{\left(FA\right)}^{{ \beta }_{7}}$$
(1)

CS, w/c, t, and FA are the compressive strength, water to cement ratio, curing time, and fly ash content, respectively, and β1 to β7 are model parameters.

3.3.3 ANN Model

ANN is a computing system designed to simulate the way the human brain processes and analyzes information. It is also a machine learning system used for various numerical prediction problems in construction engineering. An ANN includes an input layer, one or more hidden layers, and an output layer. The hidden layer is connected to the other layers by weights, transfer functions, and biases. A multi-layer feed-forward network was programmed with the mix proportions (w/b, curing time, and FA content) as inputs and compressive strength as output. There is no standard method for designing or selecting a network architecture. Therefore, the number of hidden layers and neurons was determined by trial and error based on the lowest mean squared error criterion. The second step of the optimal network design process was to choose the number of epochs during training that gave the minimum MAE and RMSE and a high R-value. The same preliminarily designed networks with hyperbolic tangent transfer functions were used to examine the effect of the number of epochs on reducing the MAE and RMSE. The MAE variations with the number of epochs are presented for the preliminarily designed networks. After designing the optimum architecture, the available dataset (450 data in total) was divided into two parts: two-thirds of the data (300) for training the network and one-third (150) for testing and validating it [17]. Several transfer functions and ANN structures with varied numbers of hidden layers and neurons were tested to design the optimal network structure for predicting the cement mortar compressive strength. Among the networks, one hidden layer with seven neurons and a hyperbolic tangent transfer function was chosen because it had the minimum mean absolute error (MAE) (Fig. 9).
In this part of the research, the ANN model was used to estimate the compressive strength of cement mortar containing FA as a cement replacement from w/b, curing time, and FA content.

The Artificial Neural Network (ANN) is a computing system that resembles the human brain and the way it analyzes information. In addition, this model is a machine learning system employed in construction engineering for various numerical forecasting problems [49]. An ANN consists of three types of layers (input, hidden, and output), which are connected through biases and weights. The behavior of an ANN is influenced by the pattern of connections between neurons, which also determines the class of the network. A network can be trained to enhance its performance. In more technical terms, the topology of the network and the connection weights are changed repeatedly so that the error at each output layer node is minimized [21]. In this study, a multi-layer feed-forward network was designed with the mortar composition (w/c, t, FA) as input and CS as output, and a sigmoid activation function was used in the output layer.

$$Output=f\left(\sum_{j=1}^{n}{w}_{j}{x}_{j}+bias\right)$$
(2)

where n is the number of input variables, xj is the j-th input, wj is the weight associated with the j-th input, and bias is the threshold for the sigmoid activation function. The typical process for computing the output of an ANN node is illustrated in Fig. 5.
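
Eq. 2 can be written as a short Python function for a single node. This is a minimal sketch: the function names and the example weights are hypothetical, and the inputs are assumed to be already scaled in whatever way the network expects.

```python
import math

def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, bias, activation=sigmoid):
    """Eq. 2: activation of the weighted sum of the inputs plus the bias."""
    if len(inputs) != len(weights):
        raise ValueError("one weight is needed per input")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical scaled inputs (w/c, t, FA) and weights, for illustration only.
out = node_output([0.5, 0.2, 0.1], [0.4, -0.3, 0.8], bias=0.05)
print(out)
```

Passing `math.tanh` as the `activation` argument gives the hyperbolic tangent transfer function mentioned for the hidden layer.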

Fig. 5 Typical procedure for output of ANN network in a single node

3.3.4 M5P-Tree Model

Quinlan [50] first devised the M5 algorithm, which was later developed into the M5P-tree algorithm [51]. One of the most significant advantages of model trees is their ability to solve problems efficiently when dealing with large datasets with a substantial number of attributes and dimensions. They are also noted for being robust when dealing with missing data. The M5P-tree approach partitions the data space into several separate regions and establishes a multivariate linear regression model at the terminal node of each region. The error is estimated based on the default variance value inserted into the node. The general formula for the M5P-tree model is shown in Eq. 3.

$$CS= a+b\left(\frac{w}{c}\right)+c\left(t\right)+d\left(FA\right)$$
(3)

CS, w/c, t, and FA are the compressive strength, water to cement ratio, curing time, and fly ash content, respectively, and a, b, c, and d are model parameters (Table 3).
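
The way a model tree combines partitioning with Eq. 3 can be sketched in Python as follows. The split variables, the thresholds, and every coefficient below are hypothetical placeholders chosen only to make the sketch runnable; the fitted splits and parameters of this study are the ones reported in the pruned tree figure and in Table 3.

```python
# Hypothetical leaf models: (a, b, c, d) of Eq. 3 for each leaf.
LEAVES = {
    "LM1": (20.0, -25.0, 2.0, -0.10),
    "LM2": (15.0, -15.0, 1.5, -0.10),
    "LM3": (60.0, -50.0, 0.05, -0.20),
    "LM4": (45.0, -30.0, 0.04, -0.15),
}

def select_leaf(wc, t, fa):
    """Route a mix to a leaf using hypothetical split rules."""
    if t <= 7:
        return "LM1" if wc <= 0.45 else "LM2"
    return "LM3" if wc <= 0.45 else "LM4"

def predict_cs(wc, t, fa):
    """Eq. 3 evaluated with the coefficients of the selected leaf."""
    a, b, c, d = LEAVES[select_leaf(wc, t, fa)]
    return a + b * wc + c * t + d * fa

print(select_leaf(0.40, 3, 0.0), predict_cs(0.40, 3, 0.0))
```

The prediction is linear within each region but can change slope from one region to the next, which is how a model tree approximates a nonlinear relationship.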

Table 3 Model parameters for M5P-tree model

3.3.5 Correlation of Compressive Strength with Flexural and Splitting Tensile Strengths

(i) Vipulanandan Correlation Model

A Vipulanandan correlation model was used to develop the relationship between CS and FS of cement mortar modified with fly ash [16, 52,53,54,55,56,57]. The model is displayed in Eq. 4.

$$FS \, or\, STS =\frac{CS}{a+b(CS)}$$
(4)

FS, STS, and CS are flexural strength, splitting tensile strength, and compressive strength.

a and b are model parameters. The performance of Eq. 4 was compared with that of the following models (Eqs. 5, 6, and 7).

(ii) Exponential Association 2 model

The Exponential Association 2 model is also used to correlate the flexural strength with the compressive strength of cement-based mortar; the model is shown in Eq. 5 [58, 59].

$$FS=\alpha \left(1-{e}^{-\beta \left(CS\right)}\right)$$
(5)

FS and CS are the flexural and compressive strengths, and α and β are model parameters.

(iii) DR-Hill-Zero background model

Additionally, the DR-Hill-Zero background model is used to predict flexural and splitting tensile strengths from compressive strength, which is displayed in Eq. 6 [60].

$$FS\, or\, STS=\frac{\theta {(CS)}^{\eta }}{{\kappa }^{\eta }+{(CS)}^{\eta }}$$
(6)

FS, STS, and CS are flexural strength, splitting tensile strength, and compressive strength. θ, η, and κ are model parameters.

(iv) Power Model

The power model formula is presented in Eq. 7 [61].

$$FS\, or\, STS=\varphi {(CS)}^{\omega }$$
(7)

FS, STS, and CS are flexural strength, splitting tensile strength, and compressive strength. φ and ω are model parameters.

4 Performance criteria for model evaluation

The developed models are evaluated based on different assessment tools to choose the best model to predict the CS of the mortar; the following are efficiency measurements for the models:

$${R}^{2}=1- \frac{\sum_{1}^{n}{\left(yp-ye\right)}^{2}}{\sum_{1}^{n}{\left(ye-\overline{ye }\right)}^{2}}$$
(8)
$$R=\sqrt{{R}^{2}}$$
(9)
$$RMSE= \sqrt{\frac{SSE}{n}} $$
(10)
$$MAE= \frac{\sum_{1}^{n}\left|yp-ye\right|}{n}$$
(11)
$$MBE= \frac{\sum_{1}^{n}\left(yp-ye\right)}{n}$$
(12)
$$SI= \frac{RMSE}{\overline{ye} }$$
(13)
$$OBJ= \left(\frac{{n}_{tr}}{{n}_{to}}*\frac{{RMSE}_{tr}+{MAE}_{tr}}{{{R}^{2}}_{tr}+1}\right)+\left(\frac{{n}_{te}}{{n}_{to}}*\frac{{RMSE}_{te}+{MAE}_{te}}{{{R}^{2}}_{te}+1}\right)+\left(\frac{{n}_{val}}{{n}_{to}}*\frac{{RMSE}_{val}+{MAE}_{val}}{{{R}^{2}}_{val}+1}\right)$$
(14)
$$t_{test}= \sqrt{\frac{\left(n-1\right){MBE}^{2}}{{RMSE}^{2}-{MBE}^{2}}}$$
(15)
$${U}_{95}=1.96*\sqrt{{SD}^{2}+{RMSE}^{2}}$$
(16)
$$\rho = \frac{SI}{1+R}$$
(17)

where R2, RMSE, MAE, MBE, SI, OBJ, t-test, U95, and ρ are the coefficient of determination, root mean squared error, mean absolute error, mean bias error (average of errors), scatter index, objective function, t-test statistic, 95% confidence uncertainty, and performance index, respectively. yp, ye, and \(\overline{ye }\) are the predicted compressive strength, measured compressive strength, and average of the measured compressive strength, respectively. n is the number of data, and the subscripts tr, te, and val denote the training, testing, and validating datasets, respectively.

For all of the assessment parameters except R2, the ideal value is zero; the best value for R2 is 1. According to the SI value, the performance of a model is excellent, good, fair, or poor if SI < 0.1, 0.1 < SI < 0.2, 0.2 < SI < 0.3, or SI > 0.3, respectively [62].
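
Several of the criteria above can be computed with a short Python sketch. It covers Eqs. 8 to 14 and 17 together with the SI rating; the t-test and U95 are left out for brevity. The function names, the handling of values that fall exactly on a rating boundary, the clamp applied before taking the square root of R2, and the four data pairs are assumptions made for illustration.

```python
import math

def evaluate(measured, predicted):
    """R2, RMSE, MAE, MBE, SI, and rho for paired measured and predicted values."""
    n = len(measured)
    mean_e = sum(measured) / n
    errors = [p - e for p, e in zip(predicted, measured)]
    sse = sum(err ** 2 for err in errors)
    sst = sum((e - mean_e) ** 2 for e in measured)
    r2 = 1.0 - sse / sst                        # Eq. 8
    rmse = math.sqrt(sse / n)                   # Eq. 10
    mae = sum(abs(err) for err in errors) / n   # Eq. 11
    mbe = sum(errors) / n                       # Eq. 12
    si = rmse / mean_e                          # Eq. 13
    r = math.sqrt(max(r2, 0.0))                 # Eq. 9; the clamp is an added guard
    rho = si / (1.0 + r)                        # Eq. 17
    return {"r2": r2, "rmse": rmse, "mae": mae, "mbe": mbe, "si": si, "rho": rho}

def si_rating(si):
    """SI classes; a value exactly on a boundary goes to the higher class (assumption)."""
    if si < 0.1:
        return "excellent"
    if si < 0.2:
        return "good"
    if si < 0.3:
        return "fair"
    return "poor"

def objective(groups):
    """Eq. 14 for a list of (n, RMSE, MAE, R2) tuples, one per dataset."""
    n_total = sum(n for n, _, _, _ in groups)
    return sum(n / n_total * (rmse + mae) / (r2 + 1.0) for n, rmse, mae, r2 in groups)

# Hypothetical strengths in MPa, for illustration only.
result = evaluate([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 37.0])
print(result, si_rating(result["si"]))
```

Because OBJ weights each dataset by its share of the data, a model that fits the training data well but generalizes poorly is still penalized through the testing and validating terms.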

5 Analysis of outputs

5.1 Relation between predicted and measured compressive strength

5.1.1 MEP model

The comparison of measured and predicted CS values using the MEP model is presented in Fig. 6. The model performed well, with R2 of 0.87, 0.87, and 0.897 for the training, testing, and validating datasets, respectively. Figure 6a contains −20 and +25% error lines for the training phase, and Fig. 6b and c contain −10 and +15% error lines for the testing and validating phases.

Fig. 6 Variation of CS Predicted with CS Measured using MEP model (a) training data, (b) testing data, and (c) validating data

$$CS=A+B+C+25-\frac{D-B-C-25}{D+\frac{2}{3}}-E-\frac{B+C}{F}$$
(18a)
$$A= \frac{2\left(\frac{w}{c}\right)\left(FA\right)}{25}-15{\left(\frac{w}{c}\right)}^{2}$$
(18b)
$$B=\frac{2}{15(t-15{\left(\frac{w}{c}\right)}^{2})}$$
(18c)
$$C=\frac{2(FA)}{{15(\frac{w}{c})}}$$
(18d)
$$D=\frac{225{\left(\frac{w}{c}\right)}^{2}}{2\left(t-15{\left(\frac{w}{c}\right)}^{2}\right)}$$
(18e)
$$E= \frac{4{(FA)}^{2}}{375}$$
(18f)
$$F=225 {(\frac{w}{c})}^{3}$$
(18g)

No. of Data = 300, R2 = 0.858, RMSE = 4.943 MPa.
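
For readers who want to evaluate Eqs. 18a to 18g numerically, the following Python sketch transcribes them term by term. It rests on several assumptions: FA is entered in percent and t in days, as in Section 3.2; the denominator of Eq. 18e is read as 2(t − 15(w/c)²), in line with Eq. 18c; and the function name `mep_cs` is hypothetical. The expression is undefined when t equals 15(w/c)², so inputs near that value should be avoided.

```python
def mep_cs(wc, t, fa):
    """Compressive strength (MPa) from Eqs. 18a to 18g as printed."""
    g = t - 15.0 * wc ** 2                        # term shared by Eqs. 18c and 18e
    a = 2.0 * wc * fa / 25.0 - 15.0 * wc ** 2     # Eq. 18b
    b = 2.0 / (15.0 * g)                          # Eq. 18c
    c = 2.0 * fa / (15.0 * wc)                    # Eq. 18d
    d = 225.0 * wc ** 2 / (2.0 * g)               # Eq. 18e (denominator as assumed above)
    e = 4.0 * fa ** 2 / 375.0                     # Eq. 18f
    f = 225.0 * wc ** 3                           # Eq. 18g
    # Eq. 18a
    return a + b + c + 25.0 - (d - b - c - 25.0) / (d + 2.0 / 3.0) - e - (b + c) / f

# Example mix: w/c = 0.5, 28 days of curing, no fly ash.
cs_28 = mep_cs(0.5, 28.0, 0.0)
print(round(cs_28, 2))
```

For this example mix the sketch gives a strength of about 34.3 MPa, which lies within the range of the collected data.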

5.1.2 NLR Model

The variation of predicted compressive strength with measured compressive strength is displayed in Fig. 7. The modeling result shows that w/c and curing time affect the CS more than the fly ash content, and that w/c has the most significant effect on the compressive strength of cement mortar. The model parameters were determined using the least squares method and the Solver technique [63]. The NLR model is presented in Eq. 19.

Fig. 7 Variation of CS Predicted with CS Measured using NLR model (a) training data, (b) testing data, and (c) validating data

$$CS=0.62\times \frac{{\left(t\right)}^{0.273}}{{\left(\frac{w}{c}\right)}^{0.872}}\times {\left(FA\right)}^{0.208}+7.681\times \frac{{\left(t\right)}^{0.235}}{{\left(\frac{w}{c}\right)}^{0.759}}$$
(19)

No. of Data = 300, R2 = 0.85, RMSE = 5.34 MPa.
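
Eq. 19 is straightforward to evaluate in Python. In this sketch, FA is assumed to be in percent and t in days, as in Section 3.2, and the function name `nlr_cs` is hypothetical. With FA = 0 the first term vanishes, so the second term alone describes mortar without fly ash.

```python
def nlr_cs(wc, t, fa):
    """Compressive strength (MPa) from the fitted NLR model, Eq. 19."""
    fly_ash_term = 0.62 * t ** 0.273 / wc ** 0.872 * fa ** 0.208
    base_term = 7.681 * t ** 0.235 / wc ** 0.759
    return fly_ash_term + base_term

# Example mix: w/c = 0.5 at 28 days, without fly ash and with 20% fly ash.
cs_plain = nlr_cs(0.5, 28.0, 0.0)
cs_fa20 = nlr_cs(0.5, 28.0, 20.0)
print(round(cs_plain, 2), round(cs_fa20, 2))
```

The exponents show the direction of each effect directly: the negative power on w/c lowers the predicted strength as w/c increases, while the positive power on t raises it with curing time.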

5.1.3 ANN Model

Figure 8 shows the optimal ANN network structures. The best network structure selected contains one hidden layer with six hidden neurons (Fig. 8a), with a momentum, learning rate, and learning time of 0.1, 0.2, and 2000, respectively. These network parameters were determined by trial and error based on RMSE and MAE, as illustrated in Fig. 9. Figure 10 shows the variation of predicted CS with measured CS for the training dataset, with error lines of −20 to +20% that bound the measurements and predictions; the R2 and RMSE are 0.859 and 5.179 MPa, respectively.

Fig. 8 Optimal ANN network structures (a) one hidden layer and 6 hidden neurons, (b) one hidden layer and 7 hidden neurons, and (c) one hidden layer and 10 hidden neurons

Fig. 9 Optimal ANN network selection based on RMSE and MAE

Fig. 10 Variation of CS Predicted with CS Measured using ANN model (a) training data, (b) testing data, and (c) validating data

5.1.4 M5P-Tree Model

Figure 11 shows the division of the input space by the M5P-tree algorithm into four linear regression functions, named LM 1 to LM 4. The relationship between the predicted and measured CS for the M5P-tree model is shown in Fig. 12, with an R2 and RMSE of 0.824 and 5.771 MPa, respectively. The error lines are −20 to +25% for the training dataset, −15 to +20% for the testing dataset, and −15 to +25% for the validating dataset. The pruned M5P-tree in Fig. 11 classifies the training dataset into four parts based on the criteria shown in the figure; each part is described by a single regression model of the form given in Eq. 3, and the model parameters are summarized in Table 3.

Fig. 11 Pruned M5P-tree model

Fig. 12 Variation of CS Predicted with CS Measured using M5P-tree model (a) training data, (b) testing data, and (c) validating data

5.2 Relationship Between Compressive, Flexural, and Tensile Strengths

Based on the collected data, four different models were used to predict the flexural and splitting tensile strengths from the measured compressive strength: the Vipulanandan correlation model, Exponential Association 2, DR-Hill-Zero background, and Power model, as given in Eqs. 20 to 25. Figure 13a shows the variation of FS with CS for the data collected from the literature together with the FS predicted using the developed models. The residual error for FS predicted from CS ranged between −1 and 1 MPa, as shown in Fig. 13b. The variation of splitting tensile strength with CS is shown in Fig. 13c, and the residual error for STS predicted from CS ranged between −0.35 and 0.15 MPa (Fig. 13d).

Fig. 13 Comparing models for flexural strength, splitting tensile strength, and compressive strength correlation using (a) variation of FS with CS, (b) residual error to predict FS, (c) variation of STS with CS, and (d) residual error to predict STS

(i) Vipulanandan correlation model

$$FS=\frac{CS}{3.06+0.073(CS)}$$
(20)

No. of data = 56, R2 = 0.955, RMSE = 0.396 MPa

$$STS =\frac{CS}{5.144+0.108(CS)}$$
(21)

No. of data = 27, R2 = 0.981, RMSE = 0.115 MPa.

(ii) Exponential Association 2 model

$$FS=9.446(1-{e}^{-0.032 \left(CS\right)})$$
(22)

No. of data = 56, R2 = 0.958, RMSE = 0.386 MPa.

(iii) DR-Hill-Zero background model

$$FS= \frac{10.789{(CS)}^{1.293}}{{26.574}^{1.293}+{(CS)}^{1.293}}$$
(23)

No. of Data = 56, R2 = 0.958, RMSE = 0.382 MPa

$$STS = \frac{71.87{(CS)}^{0.741}}{{1598.864}^{0.741}+{(CS)}^{0.741}}$$
(24)

No. of Data = 27, R2 = 0.982, RMSE = 0.11 MPa.

(iv) Power Model

$$STS= 0.316{(CS)}^{0.714}$$
(25)

No. of Data = 27, R2 = 0.982, RMSE = 0.11 MPa.

Based on R2 and RMSE, the DR-Hill-Zero background model performs marginally better than the other models for predicting flexural strength from compressive strength. For the correlation of splitting tensile strength with compressive strength, the DR-Hill-Zero background and Power models perform equally well (R2 = 0.982, RMSE = 0.11 MPa) and slightly better than the Vipulanandan correlation model.
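For readers who want to evaluate Eqs. 20 to 25 directly, the following minimal sketch implements them with the fitted coefficients reported above. The function names are illustrative, the Power model is read as 0.316·CS^0.714, and all strengths are in MPa.

```python
import math


def fs_vipulanandan(cs):
    """Eq. 20: flexural strength from compressive strength."""
    return cs / (3.06 + 0.073 * cs)


def sts_vipulanandan(cs):
    """Eq. 21: splitting tensile strength from compressive strength."""
    return cs / (5.144 + 0.108 * cs)


def fs_exponential(cs):
    """Eq. 22: Exponential association 2."""
    return 9.446 * (1.0 - math.exp(-0.032 * cs))


def fs_hill(cs):
    """Eq. 23: DR-Hill-Zero background model for flexural strength."""
    return 10.789 * cs ** 1.293 / (26.574 ** 1.293 + cs ** 1.293)


def sts_hill(cs):
    """Eq. 24: DR-Hill-Zero background model for splitting tensile strength."""
    return 71.87 * cs ** 0.741 / (1598.864 ** 0.741 + cs ** 0.741)


def sts_power(cs):
    """Eq. 25: Power model, read here as 0.316 * CS**0.714."""
    return 0.316 * cs ** 0.714


# Evaluate every correlation at one illustrative compressive strength.
cs = 40.0
for name, model in [
    ("FS, Vipulanandan", fs_vipulanandan),
    ("FS, Exponential association 2", fs_exponential),
    ("FS, DR-Hill-Zero background", fs_hill),
    ("STS, Vipulanandan", sts_vipulanandan),
    ("STS, DR-Hill-Zero background", sts_hill),
    ("STS, Power", sts_power),
]:
    print(f"{name}: {model(cs):.2f} MPa")
```

The correlations should only be applied within the range of compressive strengths covered by the collected data.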

5.3 Model Evaluations

The proposed models are compared using the relationship between predicted and measured CS for the testing dataset. The MEP predictions show less variation and plot close to the Y = X line, which indicates small errors in the predicted values, as shown in Fig. 14a. The residual errors ranged from -19 to 18 MPa for the MEP model, compared with -12 to 14 MPa, -14 to 14 MPa, and -21 to 19 MPa for the NLR, ANN, and M5P-tree models, respectively. In terms of residual error alone, the NLR model therefore performs better than the other developed models, as shown in Fig. 14b. The residual errors of the ANN and M5P-tree models are compared with those of the MEP model in Fig. 14c and d.

Fig. 14 Comparison of developed models based on (a) variation between measured and predicted CS values for testing data, (b) residual error for the MEP and NLR models, (c) residual error for the MEP and ANN models, and (d) residual error for the MEP and M5P-tree models

The SI values of the MEP, NLR, ANN, and M5P-tree models for the training dataset were 0.148, 0.16, 0.155, and 0.173, respectively. For the validating dataset, the SI value of the MEP model is less than that of the NLR, ANN, and M5P-tree models by 8, 6, and 16.5%, respectively. For the testing dataset, the SI value of the MEP model is 0.159, which is less than that of the ANN and M5P-tree models by 10 and 5%, respectively, and greater than that of the NLR model by 5%, as shown in Fig. 15a.

Fig. 15 Comparison of developed models based on (a) SI and (b) MAE

The comparison of the developed models based on MAE is presented in Fig. 15b. The MAE of the MEP model is less than that of the other developed models for the training and validating datasets; for the testing dataset, however, the MAE of the MEP model is less than that of the ANN and M5P-tree models by 8 and 4%, respectively, and greater than that of the NLR model by 6%.
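As a rough illustration of how SI and MAE comparisons of this kind can be computed, the sketch below assumes the common definitions SI = RMSE / mean(measured) and MAE = mean(|measured - predicted|), and expresses "less than by x%" relative to the model being compared against. These definitions and the function names are assumptions; the measured and predicted values are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def scatter_index(measured, predicted):
    """Assumed definition: RMSE normalised by the mean measured value."""
    return rmse(measured, predicted) / (sum(measured) / len(measured))


def percent_lower(value, reference):
    """How much lower `value` is than `reference`, as a percentage of `reference`."""
    return 100.0 * (reference - value) / reference


# Hypothetical compressive strengths in MPa.
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 47.0]
print(f"MAE = {mae(measured, predicted):.3f} MPa")
print(f"SI  = {scatter_index(measured, predicted):.4f}")

# Training-dataset SI values reported in the text: MEP 0.148, NLR 0.16.
print(f"MEP SI lower than NLR by {percent_lower(0.148, 0.16):.1f}%")
```

A lower SI or MAE indicates a closer fit; the percentage helper only restates the difference between two models on a relative scale.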

The OBJ values of the proposed models were also evaluated; the OBJ of the MEP model is less than that of the NLR, ANN, and M5P-tree models by 7, 6, and 14%, respectively, as displayed in Fig. 16a.
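The OBJ value combines the error measures of the three datasets into a single number. A minimal sketch is given below, assuming the weighted form often used in soft computing studies, in which each dataset contributes (n_i / n_all) · (RMSE_i + MAE_i) / (R2_i + 1). This form is an assumption and may differ from the definition used earlier in this paper; every number in the example is hypothetical.

```python
def obj_value(splits):
    """Weighted objective over the training, testing, and validating datasets.

    Each entry of `splits` is a dict with keys "n", "rmse", "mae", and "r2".
    Assumed form: sum over datasets of (n / n_all) * (rmse + mae) / (r2 + 1).
    """
    n_all = sum(s["n"] for s in splits)
    return sum(
        (s["n"] / n_all) * (s["rmse"] + s["mae"]) / (s["r2"] + 1.0)
        for s in splits
    )


# Hypothetical statistics for one model (not taken from Table 4).
example = [
    {"n": 120, "rmse": 5.0, "mae": 4.0, "r2": 0.8},  # training
    {"n": 40, "rmse": 6.0, "mae": 4.8, "r2": 0.8},   # testing
    {"n": 40, "rmse": 5.5, "mae": 4.4, "r2": 0.8},   # validating
]
print(f"OBJ = {obj_value(example):.2f}")
```

Because the dataset sizes act as weights, the training dataset dominates the result, and a lower OBJ indicates a better overall fit.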

Fig. 16 Comparison of developed models based on (a) OBJ and (b) T-stat and U95

The comparison of t-test and U95 values for the developed models is illustrated in Fig. 16b. As can be seen from the figure, the uncertainty of the predicted compressive strength at the 95% confidence level for the MEP model is less than that of the ANN and M5P-tree models by 2 and 6%, respectively, and greater than that of the NLR model by 4%. However, the t-test value of the MEP model is less than that of the other developed models. The t-test value is used to decide whether to accept or reject the null hypothesis; a larger t-test value indicates a more significant difference between the measured and predicted CS of the cement mortar.
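The two statistics can be sketched as follows, assuming the definitions commonly used for model comparison: T-stat = sqrt((n - 1) · MBE² / (RMSE² - MBE²)) and U95 = 1.96 · sqrt(SD² + RMSE²), where MBE is the mean bias error and SD is the standard deviation of the prediction errors (taken here as the population standard deviation). These formulas are assumptions rather than the paper's stated definitions, and the data are hypothetical.

```python
import math
import statistics


def t_stat_and_u95(measured, predicted):
    """Return (T-stat, U95) for a set of predictions.

    Assumed definitions:
      T-stat = sqrt((n - 1) * MBE^2 / (RMSE^2 - MBE^2))
      U95    = 1.96 * sqrt(SD^2 + RMSE^2)
    """
    errors = [p - m for m, p in zip(measured, predicted)]
    n = len(errors)
    mbe = sum(errors) / n                       # mean bias error
    rmse_sq = sum(e * e for e in errors) / n    # squared RMSE
    spread = rmse_sq - mbe ** 2                 # population variance of the errors
    if spread <= 0.0:
        raise ValueError("all errors are identical; T-stat is undefined")
    sd = statistics.pstdev(errors)
    t_stat = math.sqrt((n - 1) * mbe ** 2 / spread)
    u95 = 1.96 * math.sqrt(sd ** 2 + rmse_sq)
    return t_stat, u95


# Hypothetical compressive strengths in MPa.
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 47.0]
t_stat, u95 = t_stat_and_u95(measured, predicted)
print(f"T-stat = {t_stat:.3f}, U95 = {u95:.3f} MPa")
```

Under these definitions, an unbiased model (MBE close to zero) gives a T-stat close to zero, which matches the interpretation that a smaller value is better.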

Also, the performance index of the MEP model was lower than that of the other developed models for the training and validating datasets, whereas for the testing dataset it is greater than that of the NLR model by 4%, as presented in Fig. 17.

Fig. 17 Comparison of developed models using the performance index

Box plots of actual and predicted CS are shown in Fig. 18a, b, and c. The box plot for the MEP model follows the same pattern as the actual data in its minimum, maximum, mean, and median CS values. According to the box plots, the MEP model performs better than the other developed models.

Fig. 18 Comparison of developed models using box plots of actual and predicted compressive strength values: (a) training data, (b) testing data, and (c) validating data

A summary of the R2, RMSE, and MAE values of the developed models is presented in Table 4.

Table 4 Summary of developed models performance

5.4 Sensitivity Evaluation

The most influential parameter on the compressive strength of cement-based mortar modified with fly ash was determined using the MEP model. In each trial, a single input parameter is removed from the training dataset and the regression is run again; the MAE (MPa) and RMSE (MPa) of the resulting model are recorded. The trials are then ranked according to the recorded MAE, and the parameter removed in the trial with the highest MAE is taken as the most sensitive variable in predicting the compressive strength of cement mortar modified with fly ash. Based on this sensitivity analysis, the most influential parameter is the curing time of the tested samples, as summarized in Table 5.
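The leave-one-input-out procedure described above can be sketched as follows. An ordinary least-squares fit is used here only as a stand-in for the MEP model so that the example is self-contained; the input names, the synthetic dataset, and the helper functions are all hypothetical.

```python
def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def sensitivity_ranking(X, y, names):
    """Refit with each input removed in turn; rank inputs by the resulting MAE."""
    results = []
    for drop, name in enumerate(names):
        X_red = [[v for j, v in enumerate(row) if j != drop] for row in X]
        coef = fit_ols(X_red, y)
        pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], row)) for row in X_red]
        results.append((name, mae(y, pred)))
    # Highest MAE first: removing that input hurts the model most.
    return sorted(results, key=lambda item: item[1], reverse=True)


# Synthetic data (hypothetical): w/c ratio, fly ash content (%), curing time (days).
names = ["w_c", "fly_ash", "curing_time"]
X = [[0.3, 0, 7], [0.4, 10, 28], [0.5, 20, 90], [0.6, 30, 7],
     [0.3, 30, 28], [0.4, 20, 90], [0.5, 10, 7], [0.6, 0, 28]]
y = [50.0 - 40.0 * wc + 0.1 * fa + 0.3 * t for wc, fa, t in X]

for name, error in sensitivity_ranking(X, y, names):
    print(f"without {name}: MAE = {error:.2f} MPa")
```

In this synthetic example the response was constructed so that curing time dominates, so removing it gives the largest MAE; with a real dataset the ranking depends on the data and on the model being refitted.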

Table 5 Sensitivity analysis for the model parameters using MEP model

6 Conclusions

Accurate models can be developed using different soft computing techniques. In this study, four different approaches were used to establish a reliable model for predicting the compressive strength of cement mortar modified with fly ash. The main conclusions are as follows:

  1. Based on the data collected from the literature, the maximum fly ash content is 55% and w/c ranged from 0.24 to 1.2. The addition of fly ash to cement mortar increased the compressive strength for the same mixture and w/c.

  2. The SI of the MEP model is less than that of the NLR, ANN, and M5P-tree models for the training and validating datasets; for the testing dataset, on the other hand, the SI of the MEP model is greater than that of the NLR model by 5%. The objective (OBJ) value of the MEP model is less than that of the other developed models. The 95% uncertainty (U95) value of the MEP model is smaller than that of the ANN and M5P-tree models but greater than that of the NLR model by 4%. The t-test value of the MEP model is less than that of the other developed models.

  3. The performance index of the MEP model for the training and validating datasets is less than that of the other developed models.

  4. Based on the box plots of actual and predicted compressive strength, the MEP model predictions followed the distribution of the actual compressive strength (maximum, minimum, mean, and median) more closely than those of the other developed models.

  5. According to the statistical evaluation tools, the MEP model is better than the NLR, ANN, and M5P-tree models for compressive strength prediction. Its predicted compressive strength shows less scatter relative to the measured compressive strength.

  6. According to the sensitivity evaluation, the curing time of the sample is the most influential parameter on the compressive strength of cement mortar modified with fly ash.

  7. Reliable nonlinear models were used to predict the splitting tensile strength and the flexural strength of cement mortar modified with FA with a minimal prediction error.