1 Introduction

In many developing countries such as Nigeria, rapid growth in human population and urban settlements has increased the demand for highways, access roads and housing to guarantee connectivity among urban settlements. This, together with the scarcity of quality aggregates, has compelled governments and road authorities to use naturally occurring geomaterials in road construction [1, 2]. Laterite, a residual soil restricted to the humid tropical parts of the world, is one such geomaterial and is widely used in engineering structures such as roads, compacted landfill liners and embankments, as well as a foundation material [3, 4]. Residual soils such as laterite (lateritic soil) often occur in a loose, structured state and can collapse on loading or wetting, resulting in sudden settlement [5]. Similarly, lateritic soils can lose strength because their varying silt and clay contents often make them sensitive to changes in moisture content [6]. In addition, under stress the soil particles are often unstable and crushable, because finer particles are aggregated into coarse-grained size fractions held together by a cementing agent or clay matrix, and this may lead to volumetric compression [7]. For such soil to be suitable for highway and pavement design, a ground improvement method is usually deployed to address the ground condition problems and improve its undesirable engineering properties. Improvement can be achieved in several ways, including soil densification (such as compaction or preloading), hydraulic modification (such as dewatering or electro-osmosis), admixture stabilization (mechanical, chemical, and biological stabilization), geosynthetic reinforcement and structural inclusions [8].

Chemical stabilization has been shown to be an effective method of improving the index and engineering properties of residual soils such as lateritic soil [9,10,11]. Cement, lime, limestone, rice husk ash, wood ash, and fly ash are a few examples of admixtures available for stabilizing lateritic soils, and Reddy et al. [12] classify these admixtures into cementitious, non-cementitious and chemical additives. The literature shows that cementitious additives such as lime and cement are widely used in the stabilization of lateritic soils. Studies such as [13,14,15,16] reported that the addition of lime and cement improved the CBR, unconfined compressive strength, plasticity and compressibility of lateritic soil. However, the use of these conventional mineral additives is associated with challenges of cost, the formation of hazardous compounds that could contaminate soil and groundwater, and greenhouse gas emissions [17].

Consequently, these shortcomings have compelled researchers to seek alternative materials that are both economical and environmentally friendly to replace lime and cement. Researchers have also used locally sourced agricultural and industrial wastes, which are otherwise disposed of in large quantities in landfills with attendant environmental hazards, to stabilize and improve the engineering properties of lateritic soils [18]. Waste derivatives with pozzolanic behavior, such as fly ash [19], saw dust ash [20], bagasse ash [21], steel slag [22], rice husk ash [23, 24], groundnut shell ash [25], and oil palm fiber ash [26], have produced the desired improvement in the properties of lateritic soils when used as stabilizers alone or together with cement/lime [27, 28].

Sawdust is a byproduct of wood processing factories. These factories normally dump the sawdust at different locations and leave it to decay, or resort to incinerating it in the open, which poses a serious challenge to the environment [29]. To find an environmentally friendly disposal method and a creative use for this industrial residue, sawdust has been processed into ash, which is used as an eco-efficient construction material to treat lateritic soil whose properties do not meet the standard specification [30]. There is, moreover, a need to optimize the mixture design when using saw dust ash to stabilize lateritic soil. This study therefore concentrates on treating lateritic soil with sawdust ash in a two-component mixture design, experimentally investigating its mechanical properties to establish its suitability for sustainable construction, and using Extreme Vertex Design (EVD) to optimize the mixture components and develop a model of the mixture design [31, 32]. The EVD approach is suited to experiments in which the mixture components are subject to imposed limits. Because the restrictions reduce the size of the factor space, the experimental design points are located in the feasible region within the simplex. The components of a mixture experiment are varied as fractions, and the effects of these changes on the response parameter are evaluated [33]. When the factor levels have dependencies that are expressed through component constraints, EVD offers an effective method for mixture experiment design. The generated experimental points are located at the vertices and edge centers of the feasible constrained region [34].

Mixture design is a special type of response surface experiment in which the ratios of the ingredients matter more than their quantities and the product under study consists of several ingredients whose fractions sum to unity. This design process further leads to response surface optimization [35]. Mixture designs can be subdivided into simplex lattice, simplex centroid and extreme vertex designs. Extreme vertex design uses a limited number of experimental runs to assess the behaviour of the obtained responses, which is then used to predict actual or experimental results [36]. In this type of mixture design, upper and lower values, referred to as component constraints, are assigned to each of the factor levels on the basis of expert judgment to minimize the number of experimental points required to evaluate the response function. These constraints arise from economic and safety considerations [37]. Mixture experiments are used widely in blending experiments, in assessing the viability of experimental choices or systems, and in the formulation of mixture ingredients to obtain the preferred composition of a given product in terms of cost effectiveness and quality. An experiment in which more than one component is mixed to obtain a homogeneous end product is termed a mixture experiment [38].

Because the mixture components are subject to the sum-to-one constraint presented in Eq. (1) for i = 1, 2, …, q, the standard linear model presented in Eq. (2), which includes an intercept term, becomes redundant in mixture designs. Scheffé therefore proposed the linear and quadratic mixture models shown in Eqs. (3) and (4) [39].

$$\sum_{i = 1}^q x_i - 1 = 0$$
(1)
$$y = \beta_0 + \sum_{i = 1}^q \beta_i x_i + \varepsilon$$
(2)
$$y = \sum_{i = 1}^q {\beta_i } x_i + \varepsilon$$
(3)
$$y = \sum_{i = 1}^q \beta_i x_i + \sum_{i = 1}^{q - 1} \sum_{j = i + 1}^q \beta_{ij} x_i x_j + \varepsilon$$
(4)

where \(\beta_i\) is the model coefficient at the simplex vertex where \(x_i = 1\), otherwise known as the pure blend, and \(\beta_{ij}\) is the model coefficient that describes the quadratic curvature along the edge of the simplex region formed by binary mixtures of \(x_i\) and \(x_j\). A negative \(\beta_{ij}\) indicates an antagonistic blend, while a positive \(\beta_{ij}\) indicates a synergistic blend [40]. McLean and Anderson [33] showed that the extreme vertices can be calculated by forming all combinations of the lower (\(a_i\)) and upper (\(b_i\)) limits of the first (q − 1) components, as in a two-level factorial design, and then deriving the level of the qth component. A combination is an extreme vertex if the calculated \(x_q\) value lies between or on the lower (\(a_q\)) and upper (\(b_q\)) limits of \(x_q\), as shown in Eq. (5) [41].
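As an illustration of how the sign of \(\beta_{ij}\) affects a blend, the following minimal sketch evaluates the quadratic model of Eq. (4) without the error term. The function name `scheffe_quadratic` and all coefficient values are hypothetical and chosen only for the example; they are not results from this study.

```python
from itertools import combinations


def scheffe_quadratic(x, beta, beta_ij):
    """Evaluate a Scheffe quadratic mixture model (no intercept, no error term).

    x       : component fractions, which must sum to one
    beta    : linear coefficients, one per component (pure-blend responses)
    beta_ij : dict mapping (i, j) with i < j to the binary blending coefficient
    """
    if abs(sum(x) - 1.0) > 1e-9:
        raise ValueError("mixture fractions must sum to one")
    linear = sum(b * xi for b, xi in zip(beta, x))
    binary = sum(beta_ij[(i, j)] * x[i] * x[j]
                 for i, j in combinations(range(len(x)), 2))
    return linear + binary


# Hypothetical coefficients, for illustration only.
beta = [10.0, 20.0]
synergistic = {(0, 1): 8.0}   # positive beta_12
no_blending = {(0, 1): 0.0}   # purely linear blending

blend = scheffe_quadratic([0.5, 0.5], beta, synergistic)
linear_only = scheffe_quadratic([0.5, 0.5], beta, no_blending)
pure_first = scheffe_quadratic([1.0, 0.0], beta, synergistic)
```

With a positive \(\beta_{12}\) the 50:50 blend responds above the straight-line average of the two pure blends, and at a vertex the model returns the corresponding \(\beta_i\).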

$$a_q \le x_q \le b_q$$
(5)

This computation produces \(2^{q - 1}\) candidate experimental points, and the process is repeated q times, each time allowing a different component \(x_i\) to be the computed variable, to produce a total of \(N = q \times 2^{q - 1}\) candidate points. Extreme vertex designs are mixture designs that cover a subregion of the simplex when the components or factor levels are constrained by upper and lower bounds, and these imposed constraints must be consistent, as shown in Eq. (6) [42].

$$0 \le L_i \le x_i \le U_i \le 1,\quad {\text{for}}\;i = 1,2, \ldots ,q$$
(6)
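The vertex enumeration described above can be sketched as follows. This is a minimal illustration rather than the routine used by the design software; the function name `extreme_vertices` is hypothetical, and the bounds in the usage lines are the limits reported for this study (0.5 to 1.0 for soil, 0 to 0.5 for SDA).

```python
from itertools import product


def extreme_vertices(lower, upper, tol=1e-9):
    """Enumerate candidate points and admissible extreme vertices.

    For each component k in turn, the other q - 1 components are set to every
    combination of their lower and upper limits, and x_k is computed from the
    sum-to-one constraint. A candidate is kept when x_k lies within its limits.
    """
    q = len(lower)
    n_candidates = 0
    vertices = set()
    for k in range(q):
        others = [i for i in range(q) if i != k]
        for combo in product(*[(lower[i], upper[i]) for i in others]):
            n_candidates += 1
            x_k = 1.0 - sum(combo)
            if lower[k] - tol <= x_k <= upper[k] + tol:
                point = [0.0] * q
                for i, value in zip(others, combo):
                    point[i] = value
                point[k] = x_k
                vertices.add(tuple(round(v, 9) for v in point))
    return n_candidates, sorted(vertices)


# Component order: (soil, SDA).
n_candidates, vertices = extreme_vertices([0.5, 0.0], [1.0, 0.5])
```

For q = 2 the procedure examines \(2 \times 2^{1} = 4\) candidates, which collapse to the two distinct end points of the constrained line segment.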

In this study, using the EVD mixture design method, a two-component blend of loose lateritic soil and saw dust ash was optimized in terms of its mechanical characteristics to achieve a sustainable construction material with robust performance. The optimal fractions of the factor levels were determined through the desirability function using multiple optimization criteria in Design Expert software. This experimental work offers insight into the application of constrained simplex mixture design to the evaluation of the engineering properties of soil-additive blends [2, 43].

2 Methodology

2.1 Materials

The lateritic soil for the experiments was collected from a borrow pit in Olokoro, Umuahia, Abia State, Nigeria, at a depth of 2 m. The sample was taken in a solid condition and had a reddish brown hue. It was air dried, ground with a pestle in a tray, and sieved through a stack of British Standard sieves in accordance with ASTM-Vol. 04.08 [44]. The saw dust residue was acquired from a timber workshop in Abia State, Nigeria. The timber processed in the workshop is mainly hardwood used for furniture, such as mahogany, walnut, maple, oak, and cherry. The saw dust, the processed ash and the ash mixed with the weak soil are shown in Fig. 1. The saw dust was burnt by controlled incineration in a muffle furnace to obtain ash samples, which were then processed for laboratory testing in line with ASTM C618 [45] and BS 8615-1 (2019) [46] and passed through a 150 μm sieve to obtain finer particles.

Fig. 1 Materials for the experiment. a Saw dust waste; b saw dust ash; c soil–saw dust ash mixture

2.2 Test Methods

The test program was conducted in accordance with the guidelines outlined in BS 1377 (1990) [47] and BS 1924 (2018) [48] for the enhancement of the engineering characteristics of lateritic clayey soil using sawdust ash (SDA). The general engineering properties of the test soil were determined, and the soil was classified using the AASHTO system, through specific gravity, sieve analysis, compaction, consistency limit, CBR and UCS tests [49]. The mixture experiment consists of two components, namely saw dust ash (SDA) and lateritic soil, and the optimum moisture content derived during the preliminary tests was used throughout the investigation. An I-optimal design with a cubic mixture model was used to compute the ingredient ratios and the number of experimental runs required, taking into consideration the imposed component constraints [50, 51]. The CBR and UCS experiments were conducted at the formulated constituent proportions, and the resulting responses were used to model the mechanical characteristics of the soil–SDA blend. Once the experimental responses were obtained, statistical influence and diagnostic tests were conducted to support the developed EVD model [52]. Furthermore, the optimal combination of the soil–SDA blend for maximum mechanical response was determined through graphical and numerical optimization, conducted using the desirability function to take into account the factor levels while maximizing the output variables (Design Expert 11; Minitab 18) [53, 54]. The program flowchart is presented in Fig. 2.

Fig. 2 Program flowchart (source: Aju et al. 2021) [55]
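The desirability-function step in the program can be illustrated with a minimal sketch of a larger-is-better desirability and its geometric-mean combination. This is a commonly used form and is not claimed to reproduce the software's internal computation; the function names are hypothetical, and using the observed minimum and maximum responses of this study (CBR 7 to 35%, UCS 64 to 248 kN/m²) as the lower and upper limits is an assumption made for the example.

```python
def desirability_max(y, low, high, weight=1.0):
    """Larger-is-better desirability: 0 at or below low, 1 at or above high."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** weight


def overall_desirability(values):
    """Geometric mean of the individual desirabilities."""
    product = 1.0
    for d in values:
        product *= d
    return product ** (1.0 / len(values))


# Limits assumed equal to the observed response ranges.
CBR_LIMITS = (7.0, 35.0)
UCS_LIMITS = (64.0, 248.0)

best = overall_desirability([
    desirability_max(35.0, *CBR_LIMITS),
    desirability_max(248.0, *UCS_LIMITS),
])
midway = overall_desirability([
    desirability_max(21.0, *CBR_LIMITS),
    desirability_max(156.0, *UCS_LIMITS),
])
```

Because the overall value is a geometric mean, a mix that scores zero on either response is rejected regardless of how well it performs on the other.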

2.2.1 California Bearing Ratio (CBR)

The CBR test provides an essential indicator of soil strength and was carried out in accordance with the BS 1924 [48] and BS 1377 [47] specifications. Using the formulated proportions of the soil–SDA blend and the required number of runs, the experiments were carried out on mixed soil specimens compacted at British Standard Light (BSL) compaction energy. The blended soil samples were compacted in three layers using a 2.5 kg rammer with about 62 blows per layer. The compacted samples were cured for 7 days and then immersed in water for two days, after which the specimens were loaded in the CBR machine's static loading mechanism until failure [56].
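The figure of about 62 blows per layer can be checked with a short compaction-energy calculation. The sketch below assumes nominal values that are not stated in the text: a rammer drop of 300 mm, a CBR mould of 152 mm diameter and 127 mm height, and, for comparison, 27 blows per layer in a one-litre mould. These should be verified against the standard before use.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def compaction_energy_kj_per_m3(blows_per_layer, layers, rammer_mass_kg,
                                drop_m, mould_volume_m3):
    """Compactive effort per unit volume of soil, in kJ/m^3."""
    energy_j = blows_per_layer * layers * rammer_mass_kg * G * drop_m
    return energy_j / mould_volume_m3 / 1000.0


# Assumed nominal CBR mould dimensions (m).
cbr_mould_volume = math.pi * (0.152 / 2) ** 2 * 0.127

# 62 blows per layer, three layers, 2.5 kg rammer (values from the text).
cbr_energy = compaction_energy_kj_per_m3(62, 3, 2.5, 0.300, cbr_mould_volume)

# Assumed reference case: 27 blows per layer in a one-litre mould.
one_litre_energy = compaction_energy_kj_per_m3(27, 3, 2.5, 0.300, 0.001)
```

Under these assumptions both cases deliver close to 600 kJ/m³, which is consistent with the larger CBR mould needing more blows to receive the same compactive effort per unit volume.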

2.2.2 Unconfined Compressive Strength (UCS)

UCS is defined as the maximum axial compressive stress that a cylindrical molded or compacted soil–SDA sample can resist under unconfined conditions, and the test was carried out in accordance with the BS 1377 [47] specifications. The mixture samples were compacted at British Standard Light compaction energy and cured for seven days, after which they were placed without lateral support in the loading frame of the UCS testing apparatus to determine the failure load at a constant rate of strain [55].

2.3 Mixture Components Constraints Formulation

Literature reviews, professional opinion, evidence-based interrelations, experience, and practical, environmental, and financial factors were used as the starting point for formulating the component constraints. These boundary limits moderate the formulation of the ratios of the mixture's ingredients, which are also subject to the sum-to-one constraint [57]. The upper and lower limits for each component are presented in the design constraints shown in Table 1 and range from 0.5 to 1.0 for the soil sample and from 0 to 0.5 for the saw dust ash. Because of the multiple limits placed on the factor levels, the feasible experimental region lies inside a constrained area of the simplex and takes the form of a hyper-polyhedron. The degrees of freedom (df) are evaluated by computing the design matrix for the design trials using the cubic mixture model with L_Pseudo mixture component coding, as shown in Table 2. Table 3 displays the information-matrix measures for leverage, space type, and mean. Six experimental runs were generated in order to improve the optimality and performance of the established EVD model [58].

Table 1 Design constraints
Table 2 Design matrix evaluation
Table 3 Measures derived from the information matrix

2.3.1 Two-Component Factor Space Plot

Figure 3 illustrates the plot created for the two-component constrained simplex, with the space type on the x-axis and the component distribution on the y-axis. The plot shows the placement of the experimental points inside the constrained area of the simplex in designated regions, namely the vertex, axial CB, and center [59].

Fig. 3 Factor-space of two-component simplex

2.3.2 Fraction of Design Space (FDS)

Fraction of design space (FDS) is a graphical approach that offers a straightforward means of evaluating and comparing designs, applied here to the two-component mixture experiment of the soil-additive blend. The FDS plot summarizes the scaled prediction variance over the design region and is reported alongside the G-efficiency, which together provide the basis for design comparisons, as shown in Figs. 4, 5 [60]. The maximum, minimum, and mean standard-error values derived from the FDS plot were 0.993, 0.618, and 0.792, respectively. Furthermore, a coefficient matrix condition number of 6.477 was obtained, with a G-efficiency of 78.2%, which is inversely related to the maximum prediction variance. The computed determinant of \((X'X)^{-1}\) was 2.095E+2 and the trace of \((X'X)^{-1}\) was 84.667, while the scaled D-optimality criterion and I-optimality result were 15.12 and 0.859, respectively; these measures informed the formulation of the component fractions and the number of experimental runs [61].

Fig. 4 Fraction design space plot for two-component
Fig. 5 Two-component standard error design plot

2.3.3 Experimental Mix Proportions

Six experimental runs were obtained from the information-matrix computation for the simplex cubic mixture design, which accounts for the imposed component constraints and uses the optimality-criterion algorithm in the Design Expert software to select suitable experimental points within the constrained design space. To achieve accurate optimization results, the two-component proportions and the number of experimental runs derived by the statistical software must be followed strictly in the experiments that generate the CBR and UCS responses. The determined proportions of the components are used to construct an EVD model capable of predicting the mechanical characteristics of the soil–SDA blend. The mixes, given as actual and pseudo fractions, specify the specimens to be tested in the laboratory to obtain the responses, as shown in Table 4 [62]. In constrained simplex mixture designs, pseudo values are coded components that are used to facilitate model fitting by reducing multicollinearity arising from the component bounds. The soil–SDA mixes for the six experimental runs were prepared at the derived optimum moisture content of 17% to achieve a homogeneous binary blend, with SDA fractions added to the soil from 0 to 50%.

Table 4 Mixtures of component proportions for the experimental research
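The conversion between actual and pseudo fractions can be sketched with the standard lower-bound (L-pseudo) coding, \(x_i' = (x_i - L_i)/(1 - \sum_j L_j)\). The function names below are hypothetical, and the lower bounds are those reported for this study (0.5 for soil, 0 for SDA).

```python
def to_l_pseudo(actual, lower):
    """Convert actual fractions to L-pseudo components."""
    scale = 1.0 - sum(lower)
    return [(x - low) / scale for x, low in zip(actual, lower)]


def from_l_pseudo(pseudo, lower):
    """Convert L-pseudo components back to actual fractions."""
    scale = 1.0 - sum(lower)
    return [low + scale * z for z, low in zip(pseudo, lower)]


# Component order: (soil, SDA).
LOWER = [0.5, 0.0]

pseudo = to_l_pseudo([0.875, 0.125], LOWER)
actual = from_l_pseudo(pseudo, LOWER)
```

In pseudo units the constrained region is rescaled to the full 0 to 1 range, so a soil:SDA mix of 0.875:0.125 becomes 0.75:0.25.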

3 Discussion of Results

3.1 Characterization of Test Soil and Admixture

Preliminary tests were carried out on the soil samples to determine the engineering behavior of the test materials and to classify the soil using the Unified Soil Classification System (USCS) and the AASHTO method, as shown in Table 5. The laboratory results showed a specific gravity of 2.35, a plasticity index of 25%, a shrinkage limit of 13% and a liquid limit of 59%, which signifies high plasticity with high swelling potential; the soil is also poorly graded, with AASHTO and USCS classifications of A-7 and CH, respectively [63]. With a CBR of 7%, a maximum dry density of 1.74 Mg/m³ and an optimum moisture content of 17%, the soil is unsuitable for construction works when compared with the Federal Ministry of Works specification. The particle size distribution of the test materials is presented in Fig. 6, which shows 96.6–42% passing the 2 mm–75 μm sieve apertures for the soil and 99.57–53.4% passing the same apertures for the SDA [64].

Table 5 General test soil properties
Fig. 6 Particle size distribution

3.1.1 Chemical Properties

Table 6 shows the chemical properties of the test mixture ingredients. The SDA contains 65.28% SiO2, 2.75% Fe2O3 and 5.52% Al2O3, giving a cumulative sum (SiO2 + Fe2O3 + Al2O3) of 73.55%, which signifies good pozzolanic behavior according to the ASTM C618, 98 [65] standard requirements. The test soil is high in alumina, iron oxide and silica, with 17.91%, 2.32% and 49.74%, respectively. From these results, the oxides in the soil and SDA are expected to react in the presence of water with the abundant alumina and silica in the blended mixture through pozzolanic reaction to produce stable calcium-silicate-hydrate as a hydration product, which improves the soil’s mechanical properties [66].

Table 6 Chemical properties of SDA and Soil
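The oxide-sum check for the SDA can be written out as a short calculation. The oxide contents are those reported in the text; the 70% minimum used for comparison is the figure commonly quoted from ASTM C618 for Class N and Class F pozzolans and is stated here as an assumption to be confirmed against the edition cited.

```python
# Oxide contents of the SDA in percent by mass (from the text).
SDA_OXIDES = {"SiO2": 65.28, "Fe2O3": 2.75, "Al2O3": 5.52}

# Assumed minimum oxide sum for a Class N or Class F pozzolan, in percent.
MIN_OXIDE_SUM = 70.0


def pozzolanic_oxide_sum(oxides):
    """Return SiO2 + Al2O3 + Fe2O3 in percent."""
    return sum(oxides[name] for name in ("SiO2", "Al2O3", "Fe2O3"))


sda_sum = pozzolanic_oxide_sum(SDA_OXIDES)
sda_meets_minimum = sda_sum >= MIN_OXIDE_SUM
```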

3.2 Mechanical Properties of Soil–SDA Blend

The laboratory responses used to assess the mechanical behavior of the soil–SDA blend at the formulated mixture proportions for the six designed experimental runs are presented in Table 7 and in the contour plot in Fig. 7, which show the effect of the additive on the soil's engineering properties. The experimental results showed maximum CBR and UCS values of 35% and 248 kN/m², respectively, at a soil-to-SDA mixture fraction of 0.875:0.125 [67]. The unblended soil, with a soil-to-SDA ratio of 1:0, produced the minimum CBR and UCS values of 7% and 64 kN/m², respectively. The increased strength is attributed to the pozzolanic reaction in which the pozzolana in the SDA combines with the Ca(OH)2 in the soil to produce cementitious products. The drop in strength beyond the optimal value is attributed to the low strength of the SDA itself, which occupies space within the weak soil sample. Weak bonds between the soil and the cementitious compounds are created when too much SDA is added to the soil, which agrees with the results of Obeta et al. [68]. Judged against the Federal Ministry of Works standard for subgrade materials, the laboratory findings show improved mechanical strength of the poor soil for road foundation applications when SDA is mixed with it at 12–25% [69].

Table 7 Mixtures of components for the experimental research
Fig. 7 Response of the mechanical strength to SDA interaction with soil

4 EVD Model Development, Formulation, and Validation

After the experiments at the formulated mixture proportions, the results were taken as the database for developing the EVD model, which was used to analyze the two-component mixture design optimization problem with a square-root (λ = 0.5) power transformation. The responses obtained for the CBR and UCS experiments range from 7 to 35%, with a maximum-to-minimum ratio of 5, and from 64 to 248 kN/m², with a maximum-to-minimum ratio of 3.875, respectively [69]. To derive the optimal two-component mixture ratio for the mechanical behavior of the soil–SDA blend, fit statistics, statistical influence and diagnostic tests, and numerical and graphical optimization were executed using Design Expert and Minitab software [53, 54]. Furthermore, post-analysis, confirmation and point prediction computations were carried out to derive the EVD model coefficients, after which the developed model was simulated to validate its applicability and performance for field construction works [70].
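The maximum-to-minimum ratios and the effect of the square-root transformation on the response ranges can be reproduced with a few lines. The response limits are those reported in the text; the variable and function names are illustrative.

```python
import math

CBR_RANGE = (7.0, 35.0)     # percent
UCS_RANGE = (64.0, 248.0)   # kN/m^2


def max_min_ratio(response_range):
    """Ratio of the largest to the smallest observed response."""
    low, high = response_range
    return high / low


def sqrt_transform(response_range):
    """Apply the lambda = 0.5 power transformation to both range limits."""
    low, high = response_range
    return math.sqrt(low), math.sqrt(high)


cbr_ratio = max_min_ratio(CBR_RANGE)
ucs_ratio = max_min_ratio(UCS_RANGE)
cbr_sqrt = sqrt_transform(CBR_RANGE)
ucs_sqrt = sqrt_transform(UCS_RANGE)
```

On the transformed scale the CBR response spans roughly 2.6 to 5.9 and the UCS response 8 to about 15.7, which is the scale on which the model is fitted.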

4.1 Fit-Summary Statistics

Fit statistics were first computed on the experimental datasets to ascertain their suitability for EVD modeling using L_Pseudo mixture component coding. The summary from this computation gathers the details needed to select the most appropriate starting point for EVD model development. The suggested model, identified with the help of the Whitcomb score, was taken as the starting point for model fitting [71]. The fit summary searches for the right model using evaluation indicators such as the standard deviation, the coefficient of determination (R-sqd.), the predicted residual sum of squares (PRESS), which measures how well the model fits the design points, the sequential model sum of squares and the lack of fit, as presented in Tables 8, 9, 10, 11, 12, and 13. The results for the CBR and UCS response variables help to select the polynomial order with significant terms. Cubic models were suggested for the two responses under study, with R-sqd. of 0.9753 and 0.9901 for CBR and UCS, respectively. The sequential sum of squares evaluation gives p-values of 0.0383 and 0.0395 for CBR and UCS, respectively, which indicates that the selected model terms are statistically significant for the target variables [72].

Table 8 CBR response model summary statistics
Table 9 Lack of fit tests for CBR response
Table 10 Sum of squares [type I] sequential model for CBR response
Table 11 Model summary statistics for UCS response
Table 12 Lack of fit tests for UCS response
Table 13 Sequential model sum of squares [type I] for UCS response
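For reference, the fit indicators named above are conventionally defined as follows. The notation is an assumption introduced here: \(n\) is the number of runs, \(p\) the number of model terms, \(e_i\) the ordinary residual and \(h_{ii}\) the leverage of run \(i\), \(SS_{res}\) the residual sum of squares, and \(SS_{tot}\) the total sum of squares about the mean.

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}},\qquad R_{adj}^2 = 1 - \frac{SS_{res}/(n - p)}{SS_{tot}/(n - 1)}$$

$$PRESS = \sum_{i = 1}^n \left( \frac{e_i}{1 - h_{ii}} \right)^2,\qquad R_{pred}^2 = 1 - \frac{PRESS}{SS_{tot}}$$

Because PRESS can exceed \(SS_{tot}\) when a model predicts left-out runs poorly, \(R_{pred}^2\) can be negative even when \(R^2\) is close to one, which is more likely in small designs with high-leverage runs.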

4.2 Results of an Analysis of Variance (ANOVA) Test

ANOVA was performed once the preferred model had been selected through the fit statistics, with p-value criteria used to identify the significant model terms. A univariate ANOVA, as provided by the Design Expert software, was used in this study because the effects of the model terms were tested on one dependent variable at a time, first the CBR response and then the UCS response. The cubic models with L_Pseudo coding were adopted for the CBR and UCS responses, and the partial (type III) sums of squares and R-sqd. statistics were computed as shown in Tables 14, 15, 16, and 17. For the CBR response, a p-value of 0.0368, a model F-value of 26.35 and a lack-of-fit F-value of 11.79 were obtained. For the UCS response, the p-value, model F-value and lack-of-fit F-value were 0.0148, 66.78 and 2.7, respectively, which indicates that the model terms are statistically significant and that the lack of fit is not significant relative to the pure error [73].

Table 14 Cubic mixture model for CBR response using ANOVA
Table 15 R-sqd. Calculations for CBR Response
Table 16 Cubic mixture model for UCS Response using ANOVA
Table 17 R-sqd. calculations for UCS response

A negative Pred. R-Squared suggests that the overall mean predicts the response better than the present model. Adeq.-Prec. measures the signal-to-noise ratio, and a value greater than 4 is desirable. The resulting ratio of 11.023 therefore indicates an adequate signal, and the model can be used to navigate the design space [74].

The Pred. R-Sqd. of 0.9901 is in reasonable agreement with the Adjd. R-Sqd. of 0.9753; that is, the difference is less than 0.2. Adeq.-Prec. measures the signal-to-noise ratio, and a value greater than 4 is desirable. The ratio of 16.887 indicates an adequate signal, and this model can be used to navigate the design space [75].
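The two checks described above can be sketched as follows. The adequate-precision expression used here, the range of the predicted values divided by the square root of the average prediction variance \(p\hat{\sigma}^2/n\), is the form commonly documented for this statistic and is stated as an assumption; the function names and the numbers passed to `adequate_precision` are hypothetical and are not the data of this study.

```python
import math


def adequate_precision(predictions, n_terms, residual_mean_square):
    """Signal-to-noise ratio: prediction range over average prediction error."""
    n_runs = len(predictions)
    average_variance = n_terms * residual_mean_square / n_runs
    return (max(predictions) - min(predictions)) / math.sqrt(average_variance)


def r_squared_in_agreement(adjusted, predicted, tolerance=0.2):
    """True when adjusted and predicted R-squared differ by less than 0.2."""
    return abs(adjusted - predicted) < tolerance


# Hypothetical predictions and residual mean square, for illustration only.
ratio = adequate_precision([2.0, 4.0, 6.0], n_terms=2, residual_mean_square=0.5)
signal_is_adequate = ratio > 4.0

# R-squared values quoted in the text.
agreement = r_squared_in_agreement(0.9753, 0.9901)
```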

4.3 Estimates of the Coefficients and Model Equations

The model coefficient estimates and equations were generated through statistical analysis performed with the Design Expert software. For the CBR and UCS responses, the computations give the coefficient estimate, standard error, variance inflation factor (VIF), and low and high confidence limits, as presented in Tables 18, 19, 20, and 21. The VIF measures how much the lack of orthogonality in the design inflates the variance of each model coefficient; the standard error of a coefficient therefore increases in proportion to the square root of its VIF [76].
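The relationship between the VIF and the standard error can be made concrete with the usual definition \(VIF_i = 1/(1 - R_i^2)\), where \(R_i^2\) is the coefficient of determination obtained by regressing the ith model term on the remaining terms. The sketch below is illustrative, with hypothetical function names and input values.

```python
import math


def variance_inflation_factor(r_squared_i):
    """VIF of a model term from its R-squared against the other terms."""
    if not 0.0 <= r_squared_i < 1.0:
        raise ValueError("r_squared_i must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared_i)


def standard_error_inflation(r_squared_i):
    """Factor by which the coefficient standard error grows: sqrt(VIF)."""
    return math.sqrt(variance_inflation_factor(r_squared_i))


orthogonal_vif = variance_inflation_factor(0.0)    # no correlation with other terms
correlated_vif = variance_inflation_factor(0.75)   # hypothetical correlated term
correlated_se_factor = standard_error_inflation(0.75)
```

A term that is orthogonal to the others has a VIF of one, whereas a term with \(R_i^2 = 0.75\) has a VIF of four and a standard error twice as large as in the orthogonal case.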

Table 18 Results of the model coefficients computation for CBR response
Table 19 Final equation in terms of L_Pseudo components
Table 20 Results of the model coefficients computation for UCS response
Table 21 Final Equation in Terms of L_Pseudo Components:

The developed model equations in pseudo coded factor terms can be used to carry out investigative predictions about the response for given levels of each factor to evaluate the response variables [53, 54].

4.4 Diagnostics Plots

Diagnostic plots are scatter plots of the model's prediction errors, or residuals, against quantities such as the predicted results, used to investigate how the model estimation can be further improved. The diagnostic plots also use studentized residuals to assess the goodness of fit of the developed EVD model and to confirm that the regression assumptions are fulfilled, so as to identify unusual observations that could substantially influence the outcome of the analysis [77]. The standard errors of the raw residuals differ unless the leverages of the experimental runs are similar, which means that residuals that are not studentized are inadequate for evaluating the regression model assumptions. Studentized residuals are therefore recommended, because they map residuals with differing variances onto a single common distribution. The diagnostic analysis of the target responses comprised residual vs. predicted, normal probability, residual vs. run, predicted vs. actual and Box-Cox power transformation plots, which help to detect outliers and other problems with the analysis [78].
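The effect of leverage on residual scaling can be illustrated with the standard formulas for internally and externally studentized residuals. In this sketch, `s` is the residual standard deviation of the fitted model, `n` the number of runs and `p` the number of model terms; the function names and all numerical inputs are hypothetical.

```python
import math


def internally_studentized(residual, leverage, s):
    """Raw residual divided by its estimated standard error."""
    return residual / (s * math.sqrt(1.0 - leverage))


def externally_studentized(residual, leverage, s, n, p):
    """Studentized residual with the run itself excluded from the error estimate."""
    r = internally_studentized(residual, leverage, s)
    return r * math.sqrt((n - p - 1) / (n - p - r * r))


# The same raw residual at two different leverages (hypothetical values).
r_low_leverage = internally_studentized(1.2, 0.36, 1.0)
r_high_leverage = internally_studentized(1.2, 0.84, 1.0)
t_low_leverage = externally_studentized(1.2, 0.36, 1.0, n=10, p=3)
```

The same raw residual is twice as large on the studentized scale at the higher leverage, which is why raw residuals from runs with unequal leverage cannot be compared directly.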

4.4.1 Normal-Probability Plot

The first diagnostic assessment is the normal probability plot, which checks that the EVD model residuals follow a normal distribution by lying close to the reference line, as shown in Fig. 8 for the CBR and UCS response parameters. The normal probability plot presents the externally studentized residuals on the x-axis and the normal probability in percent on the y-axis. Patterns in the plot, such as curved rather than linear shapes, are studied to determine whether a transformation of the response variables is necessary [79].

Fig. 8 Residuals normal probability plots for the target responses

4.4.2 Predicted vs. Studentized Residuals Plot

This diagnostic plot presents the externally studentized residuals on the vertical axis and the EVD model predicted results on the horizontal axis to check the constant-variance assumption of the residuals for the target responses, as presented in Fig. 9. The plots indicate clustering of the predicted data around zero externally studentized residual, within the boundary limits of \(\pm 76.39\), for both the CBR and UCS response variables [80].

Fig. 9 Residual vs. predicted diagnostic statistical graphs

4.4.3 Experimental Run vs. Studentized Residuals Plot

This diagnostic plot presents the experimental run on the x-axis against the externally studentized residuals on the y-axis to reveal any effect of run order on the response variables under study. The plot essentially searches for lurking variables that may have influenced the responses during the experiment, as presented in Fig. 10. As in the residual vs. predicted plots, the results for the experimental runs lay close to zero externally studentized residual for the CBR response, while the results for the UCS response spread out more, to about \(\pm 25\) for run numbers 2, 3, 5 and 6. Overall, however, the results lay within the \(\pm 76.39\) studentized residual limits [81].

Fig. 10 Residual vs. experimental run diagnostic graphs

4.4.4 Experimental vs. Predicted Results

This diagnostic graph plots the experimental (actual) results on the horizontal axis against the EVD model predicted values on the vertical axis to evaluate the prediction accuracy of the generated model. The reference line is used to examine how well the two datasets agree on the square-root transformed scale (λ = 0.5) of the two response variables, as shown in Fig. 11. The purpose of this diagnostic is to identify the experimental groups or design points that the developed EVD model cannot predict accurately and to evaluate the relationship between the actual and predicted responses. The plots show predicted and actual points ranging from 2 to 6 for the CBR response and from 8 to 16 for the UCS response [82].

Fig. 11 Actual vs. model predicted diagnostic graphs

4.4.5 Box-Cox Plot Power Transformation

This plot provides the basis for choosing an appropriate power transformation when investigating the significant effects of the factor levels on the response variables. It shows lambda on the x-axis against the natural logarithm of the residual sum of squares on the y-axis; the best lambda lies at the minimum point of the curve, and the preferred transformation is endorsed accordingly, as presented in Fig. 12 [83]. For the CBR response, the plot showed a current lambda of 0.5, a best lambda of 0.2, and high and low confidence interval (C.I.) limits of 2.8 and −2.07, respectively. For the UCS response, the current and best lambda were 0.5 and 0.67, respectively, with high and low C.I. limits of 3.54 and −3.45 [84]. In both cases the current lambda of 0.5 lies within the confidence interval, which is consistent with retaining the square-root transformation.
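The curve on a Box-Cox plot can be reproduced in a few lines. The following is a minimal sketch, assuming a straight-line model in a single predictor and the geometric-mean-scaled form of the Box-Cox transform; the data are hypothetical (chosen so that the square root of the response is exactly linear) and the function names are illustrative, not those of the software used in this study.

```python
import math

def boxcox_normalized(y, lam):
    """Box-Cox transform scaled by the geometric mean, so that the
    residual sums of squares are comparable across lambda values."""
    gm = math.exp(sum(math.log(v) for v in y) / len(y))
    if abs(lam) < 1e-12:
        return [gm * math.log(v) for v in y]
    return [(v ** lam - 1.0) / (lam * gm ** (lam - 1.0)) for v in y]

def sse_straight_line(x, z):
    """Residual sum of squares of the least-squares line z = a + b*x."""
    n = len(x)
    xbar, zbar = sum(x) / n, sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z)) / sxx
    a = zbar - b * xbar
    return sum((zi - (a + b * xi)) ** 2 for xi, zi in zip(x, z))

def boxcox_profile(x, y, lambdas):
    """Return (lambda, ln(SSE)) pairs, i.e. the curve of a Box-Cox plot."""
    profile = []
    for lam in lambdas:
        sse = sse_straight_line(x, boxcox_normalized(y, lam))
        # Guard against log(0) when the fit is numerically exact.
        profile.append((lam, math.log(max(sse, 1e-300))))
    return profile

# Hypothetical data (not the paper's): sqrt(y) is exactly linear in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [xi ** 2 for xi in x]
lambdas = [i / 10 for i in range(-30, 31)]  # grid from -3.0 to 3.0

profile = boxcox_profile(x, y, lambdas)
best_lambda = min(profile, key=lambda pair: pair[1])[0]
print("best lambda on the grid:", best_lambda)
```

For these synthetic data the minimum of the curve falls at the square-root transformation, which is the situation the plot is designed to reveal. The confidence limits shown on the actual plots require an additional chi-square cut-off that is not sketched here.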

Fig. 12 Box-Cox plots for power transformation

4.5 Diagnostic-Influence Plots

These diagnostic influence plots make it possible to assess the potential influence of individual experimental runs on the results of the study. They offer a clearer statistical view when one or a few cases stand out from the rest of the experimental design. The influence diagnostics were carried out using three tools: leverage vs. experimental run, Cook's distance, and difference in fits (DFFITS) vs. experimental run [85].

4.5.1 The Cook’s Distance

Cook’s distance is a diagnostic influence measure used in regression analysis to identify influential outliers among the design points with respect to the response parameters. It is used to locate design points that have an adverse impact on the fitted EVD model. It also helps to locate regions of the design with an apparently strong correlational link, as illustrated in Fig. 13. The computed Cook’s distance, which ranges from 0 to 40, is shown on the y-axis of the graph, and the six experimental runs are shown on the x-axis. For both the CBR and UCS responses, experimental run number 2 was observed to lie above the zero-to-one Cook’s distance boundary, which indicates that this run strongly influences the developed model [86].
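As a point of reference, the usual textbook form of Cook’s distance is given below. The notation is an assumption made for illustration and is not stated in the source: \(e_i\) is the ordinary residual of run \(i\), \(h_{ii}\) its leverage, \(s\) the residual standard deviation, \(p\) the number of model terms, and \(r_i\) the internally studentized residual.

\[
D_i = \frac{r_i^2}{p}\cdot\frac{h_{ii}}{1 - h_{ii}}, \qquad r_i = \frac{e_i}{s\sqrt{1 - h_{ii}}}
\]

The factor \(h_{ii}/(1 - h_{ii})\) grows rapidly as the leverage approaches 1, so a run with high leverage can produce a large Cook’s distance even when its residual is moderate.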

Fig. 13 Cook’s distance influence plot

4.5.2 Leverage vs. Run

In diagnostic influence computations, leverage measures the extent to which each design point in an experimental design affects the goodness-of-fit of the developed model. It is also a measure of how far the factor levels (independent variables) of a given observation lie from those of the other observations in the mixture experiment design. A high-leverage point is an observation whose factor settings are unusual relative to the other observations under study. Fig. 14 shows the experimental runs on the x-axis and the calculated leverages, ranging from 0 to 1, on the vertical axis. A leverage of 1 indicates that the developed EVD model fits the observation at that point exactly. The plots for the CBR and UCS responses have a leverage boundary of 0–0.66667. Experimental runs 1, 4, and 6 were situated below the limit at about 0.5, while runs 2, 3, and 5 were found above it, at about 0.75–1.0 [31, 39].
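The leverage calculation can be illustrated with a minimal sketch. It assumes a straight-line model in one predictor (not the mixture model fitted in this study), uses hypothetical SDA fractions, and flags runs against the common rule of thumb of twice the average leverage, \(2p/n\). The reported limit of 0.66667 happens to equal \(2p/n\) for \(p = 2\) and \(n = 6\), although the source does not state how the limit was obtained.

```python
def leverages_straight_line(x):
    """Diagonal of the hat matrix for the model y = a + b*x (p = 2 terms)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]

# Hypothetical SDA fractions for six runs (not the paper's design points).
sda_fraction = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]

h = leverages_straight_line(sda_fraction)
p, n = 2, len(sda_fraction)
limit = 2 * p / n  # rule of thumb: twice the average leverage p/n

# Run numbers (1-based) whose leverage exceeds the limit.
flagged = [run + 1 for run, hi in enumerate(h) if hi > limit]

print("leverages:", [round(hi, 4) for hi in h])
print("limit:", round(limit, 5), "flagged runs:", flagged)
```

The leverages always sum to the number of model terms, so a model with more terms relative to the number of runs pushes individual leverages closer to 1.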

Fig. 14 Diagnostic influence plot of leverage vs. run

4.5.3 Runs vs. Statistical Difference in Fits (DFFITS)

The DFFITS statistic, which provides a scaled measure of the change in the predicted response for the ith observation, is a key studentized diagnostic of influence. The difference in fits (DFFITS) describes how much the EVD-model prediction at a particular design point changes when that point is excluded from the model fitting, as shown in Fig. 15. The plots display the six experimental runs on the horizontal axis and the DFFITS values on the y-axis, with axis ranges of −30 to 10 and −150 to 50 for CBR and UCS, respectively, and boundary limits of \(\pm 2.44949\). The plotted DFFITS values for the CBR and UCS responses lay at approximately −1 to 5 and −10 to 30, respectively [87].
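The standard textbook definition of DFFITS is given below for reference. The notation is assumed for illustration and does not come from the source: \(\hat{y}_i\) is the prediction for run \(i\) from the full fit, \(\hat{y}_{i(i)}\) the prediction for the same run when it is left out of the fit, \(s_{(i)}\) the residual standard deviation with run \(i\) left out, \(h_{ii}\) the leverage, and \(t_i\) the externally studentized residual.

\[
\mathrm{DFFITS}_i = \frac{\hat{y}_i - \hat{y}_{i(i)}}{s_{(i)}\sqrt{h_{ii}}} = t_i\sqrt{\frac{h_{ii}}{1 - h_{ii}}}
\]

The second form shows that DFFITS combines the size of the studentized residual with the leverage of the run, so either a large residual or a high leverage can move a point outside the boundary limits.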

Fig. 15 Experimental run vs. diagnostic influence plot of DFFITS

The diagnostic case statistics and influence measures computed in this experimental investigation and EVD model development for the CBR and UCS responses are shown in Tables 22 and 23. The tables present the actual and predicted values, the computed residuals, leverages, internally and externally studentized residuals, Cook’s distance and the influence on fitted value [35].

Table 22 Diagnostics case statistics for CBR response
Table 23 Diagnostics case statistics for UCS response

4.6 Optimization Overview

After the fit statistics, analysis of variance (ANOVA), diagnostic tests, and influence plots have been used to generate, analyze, and validate the EVD model, optimization is carried out to maximize the response variables using the desirability function, which evaluates the constraints imposed on the model variables. To do this, the goal functions are adjusted through weight functions in accordance with the predetermined criteria for the model variables. On a scale bounded by \(0 \le d\left( {y_i } \right) \le 1\), these criteria define the conditions that must be met to reach a desirability score of 1.0 [88]. The optimization procedure for this two-component mixture design explores the formulated ingredient combinations in the factor space for the sought-after responses that satisfy the imposed criteria or component constraints, optimizing the factor levels and the corresponding dependent variables simultaneously. The goal is set to "in range" for the factor levels and to "maximize" for the target responses, in order to ascertain the optimal ratio of the mixture ingredients with maximized responses, bounded by the upper and lower limits derived from the experiments, as shown in Table 24 [89].
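The desirability calculation can be sketched as follows. This is a minimal illustration assuming the widely used "larger is better" desirability form with a weight exponent and a geometric-mean combination of the individual scores; the limits and response values are hypothetical and are not those of Table 24, and the function names are illustrative.

```python
import math

def desirability_maximize(y, low, high, weight=1.0):
    """Individual desirability d(y) in [0, 1] for a response to be maximized.

    d = 0 at or below the lower limit, d = 1 at or above the upper limit,
    and a power curve in between (weight = 1 gives a straight ramp).
    """
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** weight

def overall_desirability(scores):
    """Combined desirability as the geometric mean of the individual scores."""
    if any(d == 0.0 for d in scores):
        return 0.0
    return math.exp(sum(math.log(d) for d in scores) / len(scores))

# Hypothetical limits and responses, for illustration only.
d_cbr = desirability_maximize(30.0, low=10.0, high=35.0)
d_ucs = desirability_maximize(250.0, low=100.0, high=250.0)
combined = overall_desirability([d_cbr, d_ucs])

print("d(CBR) =", round(d_cbr, 3), "d(UCS) =", round(d_ucs, 3))
print("combined desirability =", round(combined, 3))
```

Because the combination is a geometric mean, the combined score reaches 1.0 only when every individual desirability equals 1.0, and it drops to zero as soon as any single criterion is not met.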

Table 24 Variables constraints for optimization

Table 25 shows the mixture design optimization solution computed from the results of the mixture experiment designs. A combination ratio of 0.8125:0.1875 produced maximum CBR and UCS responses of 35.053% and 257.152 kN/m2, respectively, with an ideal desirability score of 1.0 [90].

Table 25 Computed mixture design optimization solutions

4.6.1 Graphical Optimization Plots of Ramps and Bar

The outcomes of the mixture design optimization were presented graphically as ramps and bar charts for clearer understanding of the derived results. As shown in Figs. 16, 17 and 18, the optimization ramps and bar graphs illustrate the optimal solution, with the mixture components in red and the dependent parameters in blue. The results indicate a desirability score of 1.0 for the explanatory and response variables, and a combined desirability of 1.0, which denotes satisfactory performance with respect to the prescribed factor criteria. The optimization results are also in agreement with the findings of Aju et al. [55] and Sahni et al. [86].

Fig. 16 Ramps graph for the desirability function

Fig. 17 Bar graph for the desirability function

Fig. 18 Graphical plot of two-component mixture optimization

4.7 EVD-Model Post Analysis and Simulation

This stage is executed after development of the EVD model and numerical optimization to further evaluate the generated factor values and to derive the sample means and the coefficient table through point prediction and confirmation. The generated EVD model is then simulated to assess its performance and applicability and the validity of the statistical analysis results [91].

4.7.1 Point Prediction Computation

Point prediction, as shown in Table 26, is a post-analysis tool that allows the factor values to be examined; once the analysis is complete, it uses the fitted model and the recommended factor settings to provide interval estimates. The calculated results showed standard deviations of 3.865 and 17.81, with 95% confidence intervals of 23.26–49.27 and 200.699–320.601, for CBR and UCS, respectively [71].
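The reported intervals are not symmetric about the optimized responses on the original scale. One plausible explanation, offered here as an assumption rather than something stated in the source, is that the intervals were built on the square-root scale (λ = 0.5) and then back-transformed. The short check below computes the midpoint of each interval on the square-root scale and squares it back; the function name is illustrative.

```python
import math

def sqrt_scale_midpoint(low, high):
    """Midpoint of an interval taken on the square-root scale,
    then squared back to the original units."""
    mid = (math.sqrt(low) + math.sqrt(high)) / 2.0
    return mid ** 2

# 95% confidence limits as reported for the point prediction.
cbr_center = sqrt_scale_midpoint(23.26, 49.27)
ucs_center = sqrt_scale_midpoint(200.699, 320.601)

print("CBR center:", round(cbr_center, 3))
print("UCS center:", round(ucs_center, 3))
```

Under this assumption the back-transformed midpoints land very close to the optimized responses of 35.053% and 257.152 kN/m2, which suggests that the intervals are centered on the predicted values once the transformation is taken into account.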

Table 26 Point prediction

4.7.2 Coefficient-Table

The coefficient table presents the magnitude of the coefficients for the two-component soil–SDA blend, which was aimed at enhancing the soil's engineering properties for construction works, as shown in Table 27. The factor-level coefficients were derived from the multi-criteria optimization using the EVD method. Cubic models were adopted for the CBR and UCS responses, based on the recommendations of the fit statistics, to attain better generalization of the experimental data [92]. The p-value of each model term was calculated and displayed according to the legend: p-values < 0.01 represent strong significance (presented in red), p-values between 0.01 and 0.05 represent moderate significance (presented in green), p-values between 0.05 and 0.10 represent slight significance (presented in blue), and p-values ≥ 0.10 represent non-significance (presented in black) [93].
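The legend can be expressed as a small lookup. The sketch below is illustrative only: the band labels and the function name are hypothetical, and because the stated ranges share their end points, the handling of values exactly equal to 0.01, 0.05 or 0.10 is an assumption made here (0.01 and 0.05 fall in the moderate band, 0.10 in the slight band).

```python
def significance_band(p_value):
    """Map a model-term p-value to a significance band and legend color.

    Boundary handling is an assumption: the source ranges overlap at
    0.01, 0.05 and 0.10.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.01:
        return ("strong", "red")
    if p_value <= 0.05:
        return ("moderate", "green")
    if p_value <= 0.10:
        return ("slight", "blue")
    return ("not significant", "black")

# Hypothetical p-values for a few model terms.
for p in (0.004, 0.03, 0.08, 0.45):
    print(p, significance_band(p))
```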

Table 27 Coefficient table

4.7.3 EVD Model Simulation

The EVD-model simulation follows the post analysis and the derivation of the coefficient table, and is the concluding stage of the statistical analysis of the two-component design of experiment optimization. Statistical methods are used to simulate the non-trivial scenario and so give consultants, contractors, field operators, and designers a practical guide to the expected performance of the EVD model in real-life applications. The results of this exercise help to validate the rigor of the fit computations, diagnostic statistics and influence measures followed in developing the model. The laboratory (actual) responses are plotted against the EVD-model simulated results in Fig. 19 [94, 95]. The plotted datasets were compared statistically using Minitab and Microsoft Excel to evaluate the degree of correlation and statistical significance at the 95% confidence level, using Student’s t-test and the F-test, as shown in Tables 28, 29, 30 and 31. The t-test and F-test are statistical tests in which the test statistic follows the t-distribution and F-distribution, respectively, under the null hypothesis. The t-test is used to judge whether the difference between group means is significant, and the F-test whether the group variances differ, which helps to identify whether the observed differences could have occurred by chance. For the t-test, P(T ≤ t) two-tail values of 0.960 and 0.977, both larger than 0.05, were obtained for the CBR and UCS responses, respectively. For the F-test, P(F ≤ f) one-tail values of 0.490 and 0.499 were obtained for the CBR and UCS responses, respectively. These results show that there is no statistically significant difference between the experimental and model outcomes, indicating that the model performed well, consistent with the findings of Alaneme et al. [59].
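The two test statistics behind these comparisons can be sketched with the Python standard library. The example assumes the equal-variance (pooled) form of the two-sample t-test and the simple ratio of sample variances for the F-test; the data are hypothetical, and the p-values, which require the t- and F-distributions, are left to a statistics package.

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal variances (pooled)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    std_err = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err

def variance_ratio(a, b):
    """F statistic for comparing two sample variances."""
    return statistics.variance(a) / statistics.variance(b)

# Hypothetical actual and simulated responses (not the paper's data).
actual = [12.0, 18.5, 24.0, 30.5, 35.0, 28.0]
simulated = [12.4, 18.0, 24.6, 30.1, 34.7, 28.3]

t_stat = pooled_t_statistic(actual, simulated)
f_stat = variance_ratio(actual, simulated)

print("t statistic:", round(t_stat, 4))
print("F statistic:", round(f_stat, 4))
```

A t statistic near zero and an F statistic near one, as in this illustration, correspond to large p-values, which is the pattern reported for the actual and simulated responses.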

Fig. 19 Actual vs. EVD-model predicted responses

Table 28 T-test two-sample for variances for CBR response
Table 29 F-test two-sample for variances for CBR response
Table 30 T-test two-sample for variances for UCS response
Table 31 F-test two-sample for variances for UCS response

5 Conclusion

A two-component mixture design optimization using the extreme vertex design (EVD) method was adopted in this experimental research to evaluate the mechanical strength behavior of soil–SDA blends for sustainable foundation construction. From the results derived, the following conclusions can be drawn:

  • The mixture ingredient constraints were set in the 0–50% range to identify the fractions that give optimum strength performance of the blended mixture, and so enhance the utilization of industrial waste for eco-friendly stabilization of weak soils.

  • Chemical property tests carried out on SDA indicated good pozzolanic behavior, with 65.28% SiO2, 2.75% Fe2O3 and 5.52% Al2O3, giving a total of 73.55%. The preliminary tests carried out on the test soil showed inadequate strength performance and potential swelling behavior when compared with the Federal Ministry of Works specifications.

  • The mixture ingredients formulated using the I-optimal design for factor space evaluation were tested experimentally to obtain the corresponding UCS and CBR responses of the treated soil samples.

  • The results demonstrated a considerable improvement in the strength behavior of the soil as a pavement foundation material, while promoting the reuse of industrial residues, which is a crucial component of waste management. The improvement is attributed to the pozzolanic reactions of SDA, in which cementitious products form between the Ca(OH)2 present in the soil and the pozzolana present in SDA. Beyond the optimum SDA content, the reduced strength is attributed to the excess SDA, which has low strength of its own and takes up space in the sample.

  • The EVD model was analyzed and developed from the experimental data, including the formulated fractions of the mixture components and the derived responses. The steps involved statistical evaluation, ANOVA, diagnostic tests, and influence statistics to analyze the datasets.

  • By applying the desirability function, numerical and graphical optimization was carried out to find the optimal proportions of the mixture components that maximize the responses while meeting the criteria set for the variables in the EVD model.

  • At a mix ratio of 0.8125:0.1875 for the two-component soil–SDA blend, the optimization produced maximum CBR and UCS responses of 35.053% and 257.152 kN/m2, respectively, with a desirability score of 1.0.

  • Finally, statistical validation and simulation using Student's t-test and the F-test were used to evaluate the suitability of the model. For the CBR and UCS responses, respectively, the computed results gave P(F ≤ f) one-tail values of 0.490 and 0.499 for the F-test and P(T ≤ t) two-tail values of 0.960 and 0.977 for the t-test. These results indicate close agreement between the experimental values and those simulated with the EVD model.