1 Introduction

Improving the properties of soils prone to expansion, shrinkage, and liquefaction is of major importance for preventing infrastructure failures (Jalal et al. 2020). Saturated fine sands are particularly susceptible to liquefaction, which can have catastrophic consequences (Jamhiri et al. 2020). To mitigate this hazard, artificial cementation methods such as cement or lime addition, microbial-induced cementation, and fibre-reinforced cementation have been suggested in practice (Hamdan et al. 2017; Sharma et al. 2018; Tran et al. 2018). The main issue concerning artificially cemented sands is their complex shearing mechanism (He et al. 2014). For example, during undrained shearing, when cemented soils experience an increase in confining stress, the degree of saturation controls the pore water response, and the pore water pressure coefficient, or B-ratio, is used to evaluate the degree of saturation. Confining pressure and peak shearing stress also affect the failure mechanism of sands under undrained shear loading (Sugiyama et al. 2016; Mun et al. 2016). More importantly, the void ratio has long been recognized as a key factor controlling the shear strength of sands (Zamani and Montoya 2018).

Among the binding agents used to treat sands, natural zeolite, a hydrated aluminosilicate pozzolan that contains alkaline-earth metals, has shown promising geotechnical and environmental benefits (Ahmadi and Shekarchi 2010). Unlike treatment with cement or lime alone, alkali activation of zeolite with lime followed by treatment of the soil with the blend does not compromise the safety of the treated soil and averts the toxicity that is often produced by the buildup of soil pollutants (Shon and Kim 2013). However, most current studies have used zeolite as a partial cement replacement (Mola-Abasi et al. 2019, 2020), while the remaining studies investigated lime–zeolite stabilization only in terms of chemical reactions and shear strength development (Jamhiri 2020; Jamhiri and Pakbaz 2020). A comprehensive relationship that correlates or predicts the parameters associated with the shear failure mechanism of zeolite–lime treated sands, such as void ratio, peak deviatoric stress, and B-ratio, is still lacking.

A few empirical equations have been proposed to correlate or predict the parameters that govern the compressive and tensile strength of sands amended with or without additives. Among these studies, Consoli et al. (2010, 2012, 2013) and Consoli and Foppa (2014) presented an adjusted porosity/binding-agent index to evaluate the unconfined compressive and tensile strength of lime- and cement-treated sands. Subramanian et al. (2019) indirectly predicted the compressive strength of treated sands as a normalized strength against curing time to capture the time dependence of strength evolution, while Mola-Abasi et al. (2019) directly predicted the compressive strength of treated sands without explicit consideration of curing conditions.

Regarding the undrained shear of treated sands, Okamura and Soga (2006) introduced a relationship that accounts for the increase in cyclic strength of natural saturated sands with the change in the degree of saturation. They suggested that the normalized undrained shear strength of sands, expressed as the liquefaction resistance ratio (LRR), is related to the B-ratio and the potential volumetric strain \(\varepsilon_{v}^{*}\) as follows:

$$\text{LRR} = \log (6500\varepsilon_{v}^{*} + 10).$$
(1)

He et al. (2014) extended Eq. (1) by introducing a calibrated coefficient α, equal to 14,000 for compression and 1800 for extension conditions in undrained triaxial testing, as follows:

$$\text{LRR} = \log (\alpha \times \varepsilon_{v}^{*} + 10).$$
(2)
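As a minimal sketch, Eqs. (1) and (2) can be evaluated with one function in which α is a parameter. The logarithm is assumed here to be base 10, since that choice gives LRR = 1 when the potential volumetric strain is zero; the function name and the input value 0.001 are illustrative assumptions, not values from the cited studies.

```python
import math


def liquefaction_resistance_ratio(eps_v_star, alpha=6500.0):
    """Evaluate LRR = log10(alpha * eps_v* + 10), i.e. Eq. (1) or Eq. (2).

    alpha = 6500 reproduces Eq. (1); the text quotes alpha = 14,000
    (compression) and 1800 (extension) for Eq. (2).
    Base-10 logarithm is an assumption (see lead-in).
    """
    if eps_v_star < 0:
        raise ValueError("potential volumetric strain must be non-negative")
    return math.log10(alpha * eps_v_star + 10.0)


# Illustrative (hypothetical) strain value, evaluated for each alpha in the text
for alpha in (6500.0, 14000.0, 1800.0):
    print(alpha, liquefaction_resistance_ratio(0.001, alpha))
```

For a fixed strain, a larger α yields a larger LRR, which is why the calibrated compression and extension coefficients produce different resistance ratios.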

However, Eq. (2) is only applicable for the predetermined α values. To go beyond Eq. (2), Mun et al. (2016) proposed a general equation indicating that \(\left( {\sigma_{1} - \sigma_{3} } \right)_{f}\) increases log-linearly with the axial strain rate \(\dot{\varepsilon }\), where \(\left( {\sigma_{1} - \sigma_{3} } \right)_{f}\) is the principal stress difference at failure and A and B are the slope and intercept parameters, respectively, which depend on the specific undrained loading:

$$\left( {\sigma_{1} - \sigma_{3} } \right)_{f} = A \times \,\log \left( {\dot{\varepsilon }} \right) + B.$$
(3)

Perhaps the most inclusive relationships were proposed by Belkhatir et al. (2011). They employed multivariate linear regression and used the intergranular void ratio (es) to forecast the undrained residual shear strength (Sus) of sand–silt mixtures. Because their study covered low to high fines contents and several confining pressures (σc), separate equations were proposed for different relative densities (Dr) as follows:

$$\begin{gathered} S_{{us}} /\sigma^{\prime}_{c} = 0.22 - 0.056\left( {e_{s} } \right)\quad {\rm for} \,D_{{r}} = 12\% \hfill \\ S_{{us}} /\sigma^{\prime}_{c} = 0.22 - 0.046\left( {e_{s} } \right)\quad {\rm for} \,D_{{r}} = 50\% \hfill \\ S_{{us}} /\sigma^{\prime}_{c} = 0.23 - 0.049\left( {e_{s} } \right)\quad {\rm for} \,D_{{r}} = 90\% \hfill \\ \end{gathered}$$
(4)
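The three linear relationships in Eq. (4) can be sketched as a small lookup keyed by relative density. The coefficients are copied from Eq. (4); the function names and the sample inputs are hypothetical and serve only to show how the normalized strength converts to a strength for a given confining pressure.

```python
# (intercept, slope) pairs copied from Eq. (4), keyed by relative density Dr in %
EQ4_COEFFS = {12: (0.22, 0.056), 50: (0.22, 0.046), 90: (0.23, 0.049)}


def residual_strength_ratio(e_s, relative_density):
    """Return S_us / sigma'_c for an intergranular void ratio e_s."""
    intercept, slope = EQ4_COEFFS[relative_density]
    return intercept - slope * e_s


def residual_strength(e_s, relative_density, sigma_c):
    """Return S_us in the same units as the confining pressure sigma_c."""
    return residual_strength_ratio(e_s, relative_density) * sigma_c


# Hypothetical inputs: e_s = 1.0, Dr = 50 %, confining pressure of 100 kPa
print(residual_strength(1.0, 50, 100.0))
```

Only the three relative densities listed in Eq. (4) are supported; the sketch deliberately does not interpolate between them, because the source gives no basis for doing so.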

The relationships in Eqs. (1)–(4) generally work well for loose fine to naturally compacted sand mixtures, but they have their own shortcomings and none of them accounts for alkali-activated agents in the soil structure. Hence, considering the scarcity of studies on the undrained shear response of zeolite–alkali (zeolite–lime) activated sands, relationships that generalize the effects of curing period and additive content by associating the undrained shear parameters will establish a solid basis for evaluating the performance of treated soils. Furthermore, advanced artificial intelligence (AI) methods such as gene expression programming (GEP) and deep neural networks are suitable when high uncertainty or nonlinearity exists among variables (Jalal et al. 2021a, b, c; Jamhiri et al. 2021; Shahmansouri et al. 2021). Otherwise, hybridizing several AI approaches, so as to improve the precision of each step in machine learning without extra demands for resources, is preferable.

Therefore, through a chain of analyses, this research aims to define new correlational equations between the B-ratio, the void ratio at failure, and the principal stress difference at failure (hereafter called USS). First, an experimental program was performed, including unconfined compression strength (UCS) tests and unconsolidated undrained (UU) triaxial tests on reconstituted specimens of zeolite–lime activated sands. Then, on the basis of the experimental results, a unified forecasting model is proposed by generalizing the effects of curing period, zeolite–lime content, and confining pressure. Forecast values were derived directly from the experimental results by a hybrid time series analysis, namely exponential growth smoothing, and by identifying logically visible patterns over the extension of the curing period. Finally, with the aid of multivariate linear regression analyses on the forecast values, unique relationships are proposed that achieve good agreement between the measured and predicted undrained shear properties of zeolite–alkali activated sands.

2 Materials, experimental program and methodology

2.1 Materials

The results of the soil characterization tests are given in Table 1, and the grain-size distribution curve of the sand and the laser particle size analysis of the natural zeolite are shown in Fig. 1. The soil is classified as poorly graded sand with less than 5% fines content, and the natural zeolite used was clinoptilolite. To achieve the desired fineness, the zeolite was micronized by high-energy ball milling. Conventional dry hydrated lime with a mean particle size of 2 µm and a specific gravity of 2.34 was used as the bonding agent. Distilled water and tap water were used for the characterization and remolding of specimens, respectively.

Table 1 Physical and mechanical characteristics of studied sands
Fig. 1 Grain size distribution of sands and laser particle size distribution of natural zeolite

To reconstitute the samples, sand and zeolite were first dry mixed thoroughly before lime was added. To reduce dispersion in the grain size distribution of the mixed batches, zeolite with a mean particle size of nearly 10 μm and a Si/Al ratio of 5.6% was used. The batch was then mixed with lime until a consistent mixture was obtained. Finally, the water required to reach the optimum moisture content was added and the wet batch was mixed again until it was homogeneous. To ensure sample uniformity, the undercompaction procedure was adopted for sample remolding, and precise measures were taken to reach a relative density of 95% with respect to the maximum dry unit weight. Specimens were compacted in cylindrical molds with a height-to-diameter ratio of 2.2. The molds, with the samples in place, were then sealed in plastic bags and stored in a humid room at 90% humidity and 20 ± 1 °C.

2.2 Experimental program and methodology

The experimental program began with a series of unconfined compression (UC) tests on reconstituted specimens to determine the optimum lime content on the basis of the highest compressive strength. Following other studies (Consoli et al. 2010, 2012, 2013), four lime contents, i.e., 3%, 4%, 5%, and 6% by mass of dry soil, were used. Three zeolite contents, i.e., 8%, 10%, and 12%, which can be considered a limit for its effective use, were adopted for the UC tests. Similar percentages were adopted in other studies investigating zeolite as a cement replacement material (Ahmadi and Shekarchi 2010; Nagrockiene and Girskas 2016).

At the end of the experimental program, a series of undrained triaxial tests was carried out to investigate the principal stress difference at failure and the corresponding variation of void ratio with the B-ratio of treated specimens. Notably, only the zeolite content was varied in the samples prepared for undrained shearing, while the lime content was kept at its optimized value. In addition to the previous 8%, 10%, and 12% zeolite contents, a 14% content was also used to investigate the effect of zeolite content on undrained shear strength development. The AI modeling in this research comprises two main parts: first, time series analysis with exponential smoothing to forecast the growth of pozzolanic reactions; second, multivariate robust regression with outlier treatment through Huber estimation. The flowchart of the steps taken in the hybrid regression modelling is shown in Fig. 2.

Fig. 2 Flowchart of steps in the hybrid multivariate robust regression
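The outlier treatment through Huber estimation can be illustrated with iteratively reweighted least squares (IRLS). The sketch below is restricted to a single-predictor line for brevity, whereas the study uses several variables; the tuning constant k = 1.345, the MAD-based scale estimate, and the synthetic data are conventional choices assumed here and are not taken from the paper.

```python
from statistics import median


def weighted_line(xs, ys, ws):
    """Weighted least-squares line y = a + b*x; returns (a, b)."""
    w_sum = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def huber_line(xs, ys, k=1.345, max_iter=50, tol=1e-10):
    """Huber-weighted IRLS fit of a line (assumed settings, see lead-in)."""
    a, b = weighted_line(xs, ys, [1.0] * len(xs))  # start from ordinary LS
    for _ in range(max_iter):
        res = [y - (a + b * x) for x, y in zip(xs, ys)]
        centre = median(res)
        # Robust scale from the median absolute deviation of the residuals
        scale = median([abs(r - centre) for r in res]) / 0.6745
        if scale < 1e-12:  # residuals are essentially zero; nothing to reweight
            break
        # Huber weights: 1 inside the threshold, decaying as k*scale/|r| outside
        ws = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in res]
        a_new, b_new = weighted_line(xs, ys, ws)
        converged = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


# Synthetic (hypothetical) data: y = 2x + 1 with one gross outlier at the end
xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
ys[-1] += 30.0
print(weighted_line(xs, ys, [1.0] * len(xs)), huber_line(xs, ys))
```

Observations with large residuals receive weights below one, so a single extreme reading pulls the fitted slope far less than it does in the ordinary least-squares fit.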

3 Unconfined compressive strength and undrained triaxial shearing tests

Following curing periods of 7 and 28 days, the hardened samples were subjected to UCS tests in accordance with ASTM D-2166 to verify the effectiveness of the stabilization with lime. Afterwards, triaxial tests were performed at a constant rate of strain according to ASTM D 2850 on samples with the optimized lime content and zeolite contents of 8%, 10%, 12%, and 14%. Stiff or brittle materials such as cemented samples are expected to exhibit small deformations at failure, so they should be tested in the lower portion of the recommended strain rate range of 0.5 to 2% per minute. Accordingly, a rate of 0.75 mm/min was adopted for both UCS and triaxial tests. Confining pressures of 50, 100, and 150 kPa were applied to the specimens in the triaxial chamber. Finally, the UCS and the undrained shear strength at failure, corresponding to the maximum strain during loading, were recorded. Each specimen type was tested at least twice to ensure repeatability and validity of the results, and the mean of the results was reported.

4 The B‐value test procedure

During undrained triaxial shearing, the B-ratio is measured to account for variation in the degree of saturation. For cemented samples, back pressure is applied to the specimen to increase the degree of saturation by raising the pore water pressure (Sugiyama et al. 2016).

One efficient method for sample saturation combines a vacuum procedure with the application of back pressure (Rad and Clough 1984). During saturation, both the cell pressure and the back pressure must be applied in increments, and whenever a back pressure increment is applied, an equal increment is simultaneously added to the cell pressure to keep the effective confining pressure constant. Specimen evacuation starts with a vacuum of 50 kPa below atmospheric pressure applied to the water in the specimens.

Following the vacuum procedure, at the saturation stage of the triaxial test, the specimen has a preliminary degree of saturation (S0), an initial void ratio (e0), and a pore pressure (u0); its total volume includes the volume of free air (Va) and the volume occupied by dissolved air, which is negligible even during saturation. After the incremental back pressure has been applied successfully, as visualized both graphically and theoretically in Fig. 3, the specimen is further saturated and the pore pressure rises to (u0 + Δu). The initial B-ratio at the onset of shearing then corresponds to the final degree of saturation Sf, which is not necessarily full saturation.
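The saturation procedure described above can be sketched in two small helpers: one that generates paired cell and back pressure increments at constant effective confining pressure, and one that computes the B-ratio. The B-ratio is taken here as the standard pore pressure coefficient, B = Δu/Δσ3, measured under an undrained increase in cell pressure; this definition and all numerical values below are assumptions for illustration and are not readings from this study.

```python
def back_pressure_steps(cell_0, back_0, increment, n_steps):
    """List (cell, back, effective) pressures for equal paired increments.

    Raising cell and back pressure by the same amount keeps the
    effective confining pressure (cell - back) constant.
    """
    steps = []
    for i in range(n_steps + 1):
        cell = cell_0 + i * increment
        back = back_0 + i * increment
        steps.append((cell, back, cell - back))
    return steps


def b_ratio(delta_u, delta_sigma3):
    """Pore pressure coefficient B = delta_u / delta_sigma3 (undrained)."""
    if delta_sigma3 <= 0:
        raise ValueError("cell pressure increment must be positive")
    return delta_u / delta_sigma3


# Hypothetical values in kPa: start at 70/50 and step both pressures by 50
for cell, back, effective in back_pressure_steps(70.0, 50.0, 50.0, 4):
    print(cell, back, effective)

# Hypothetical B-check: a 50 kPa cell increment raises pore pressure by 45 kPa
print(b_ratio(45.0, 50.0))
```

A B-check of this kind is typically repeated after each back pressure stage; for cemented specimens the value may level off below unity, as discussed in Sect. 5.3.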

Fig. 3 Theoretical steps in the incremental application of back pressure with simultaneous confining pressure up to full equilibrium (modified after Lade 2016)

5 Results and discussions

5.1 Unconfined compression (UC) test

The UCS variations with different percentage amounts of zeolite and lime for two distinct curing periods of 7 and 28 days are shown in Fig. 4. All treated samples with zeolite and lime showed compressive strength improvement due to the formation of cemented compounds by pozzolanic reaction in treated soils leading to stronger bonds and increase in interlocking forces among soil particles, which is similar to the findings of Rao and Rajasekaran (1996).

Fig. 4 Variation of UCS of treated samples with different percentage amounts of zeolite and lime after (a) 7 days of curing and (b) 28 days of curing

As shown in Fig. 4a, after 7 days of curing, samples with the same lime content showed higher compressive strength as the zeolite content increased. The results in Fig. 4b indicate that 28-day cured samples containing 5% lime also developed higher compressive strength with increasing zeolite content. However, for samples with 4% and 6% lime, the maximum compressive strength belonged to the samples with less zeolite. Comparing Fig. 4a with Fig. 4b shows that samples with 5% lime hardened steadily, with little or no deviation at either curing age.

Additionally, the UCS results of specimens with 4% and 6% lime and different zeolite contents contrast with each other between the two curing ages. At 7 days of curing, samples with more zeolite showed higher strength growth, while at 28 days of curing, samples with less zeolite showed higher strength. These differences are attributed to an imbalance between the zeolite and lime available for the pozzolanic reaction. For example, in samples with 4% lime (Fig. 4b), the lime was not sufficient for the pozzolanic reaction to progress, and in samples with 6% lime, the zeolite was insufficient to consume the extra lime. In other words, as the lime percentage increased, the pH increased accordingly and strongly influenced the early reactivity (Mertens et al. 2009; Jamhiri and Pakbaz 2020).

Following the results of the UCS tests, the optimum lime content for an effective zeolite–lime treatment of the sand is 5%. Accordingly, in the next part of the experimental program, an additional series of tests was conducted on reconstituted samples with 14% zeolite, along with the previous 8%, 10%, and 12%, to investigate the effect of zeolite content on undrained shear strength.

5.2 Unconsolidated undrained triaxial test

The UU stress–strain responses of samples with the optimized 5% lime and different zeolite contents are illustrated in Figs. 5 and 6 for curing periods of 7 and 28 days, respectively. The samples under undrained shearing showed strain-hardening behavior (Nataatmadja and Parkin 1989), with no clear drop in strength or residual softening until the termination of loading. Hence, the failure points on the curves are breaking points of the samples corresponding to a given strain range rather than total breaking points. As shown in Figs. 5 and 6, in the initial part of each curve the stress increases linearly with strain, and the curve then bends into a semi hook-shaped final part in which the stress increases at a slower rate. In other words, the rate of increase of shear stress falls as the stress state approaches failure. This is probably associated with the development of higher friction due to the increase in slippage forces among cemented particles as they begin to break apart (Cai et al. 2006; Muntohar et al. 2012).

Fig. 5 The UU stress–strain response of samples with optimized 5% lime and different percentage amounts of zeolite at 7 days of curing under 50, 100, and 150 kPa of confining pressure

Fig. 6 The UU stress–strain response of samples with optimized 5% lime and different percentage amounts of zeolite at 28 days of curing under 50, 100, and 150 kPa of confining pressure

Given the importance of confining pressure for the undrained response of treated samples, the variation of peak deviatoric stress with confining pressure for samples with the optimized 5% lime and different zeolite contents is shown in Fig. 7a, b. As depicted in Fig. 7a, the curve for samples containing 8% zeolite has a steep initial slope, which could be due to the closure of low-aspect-ratio micropores resulting from immature cementation. Beyond the initial range, the strength envelopes are essentially straight lines with nearly the same slope. Additionally, as can be seen in Fig. 7b, samples with 12% and 14% zeolite show a sharp strength increase, especially beyond a confining pressure of 100 kPa, which is visible as a steep slope at the end of their curves. The reason is that at lower confining pressures, cemented bond breakage contributes more to the final strength than particle interlocking friction.

Fig. 7 Variation of peak deviatoric stress with confining pressure for treated samples with optimized 5% lime and different percentage amounts of zeolite: (a) 7 days of curing, (b) 28 days of curing

Also, samples containing 12% and 14% zeolite cured for 7 days have peak deviatoric strengths well above those of the 8% and 10% samples at each confining pressure. This behavior can be traced to the relatively higher zeolite content, which contributes substantially to the degree of cementation. However, the envelopes for both specimens are parallel to each other and closely follow a straight line, which indicates a logical pattern in strength development.

5.3 Analysis of measured B-ratios and void ratios

For cemented soils, the B-ratio is expected to be less than unity, even for fully saturated specimens (Lade 2016). Moreover, large voids or a fissured microstructure have a significant effect on the value of B at non-zero differential pressures. For example, cracked sandstones apparently have a value of B around 0.6, while sandstones with low crack densities are expected to have values closer to 0.8 for non-zero differential pressures (Berge et al. 1993).

Table 2 shows that as the confining pressure increases and the curing period extends, the B-ratio decreases owing to two factors: first, the pozzolanic reaction and the evolution of a soft mixture into a rigid cemented body; second, the decrease in bulk compressibility due to the closure of highly compliant small voids at low confining pressures and the compression of relatively soft inter-granular contacts (Green and Wang 1986).

Table 2 Variation of pore pressure coefficient B-ratio with confining pressures at curing periods of 7 and 28 days

Measured B-ratios in this study are similar to those observed by Green and Wang (1986) who measured B = 0.95–1.0 for differential pressures below 1 MPa for fully saturated sandstones. Fredrich et al. (1995) showed that in the fully saturated silicified zeolite sandstone, the B-ratio at near-zero applied confining minus induced pore pressure is close to 0.9, and it decreases systematically to approximately 0.7–0.8 at effective pressures of about 25 MPa, and tends to increase again at higher confining pressures. Namikawa et al. (2017) studied cement-treated Toyoura sand containing a small amount of Kaolinite clay using triaxial compression and tension tests under a back-pressure of 200 kPa and under confining pressures varying from 50 to 200 kPa. B-ratios measured during their tests ranged from 0.8 to 0.95 for samples prepared with 4.5% cement and from 0.7 to 0.95 for samples prepared with 7% of cement.

The influence of USS, curing age, void ratio, and confining pressure on the B-ratio is shown in Fig. 8a. It is clear from Fig. 8a that the void ratio increases with increasing B-ratio, but decreases over curing time owing to the further development of calcium silicate hydrate (C–S–H) bonds as the growing pozzolanic reaction produces a coarser mass of reaction products. As highlighted in Fig. 8b, under similar confining pressures the void ratio differs significantly between zeolite contents, and this difference develops over the curing age. For samples with an identical zeolite content, however, the void ratio and peak deviatoric strength change less over the curing age, especially for samples treated with 12% and 14% zeolite, whereas for samples treated with 8% and 10% zeolite the changes in peak failure strength and void ratio are much more noticeable.

Fig. 8 (a) Influence of peak deviatoric stress, age and confining pressure on B-ratio; (b) variation of void ratio at peak failure stress for samples containing 8%, 10%, 12%, and 14% of zeolite over curing ages

In view of Fig. 8, the data in Table 2 can be better explained by noting that the proportions of air and water occupying the voids depend mainly on the degree of saturation and on the dissolution of air in water. Since the B-ratio is also a function of the degree of saturation, an increase in the degree of saturation results in a higher B-ratio and is associated with a higher void ratio. These observations support previous studies indicating that the increase in the degree of C–S–H bonding and in the pozzolanic reaction rate with time leads to a finer void space (Consoli et al. 2010, 2012, 2013).

6 Exponential smoothing time series analysis

Forecasting the shear strength of treated samples requires analyzing the time-dependent variables that influence strength development. One major variable is the pozzolanic reaction, which continues to exert its influence over the curing time. Although the pozzolanic reaction itself is not measured on a quantitative scale, its kinetics can be assessed through solvent–reagent dissolution and SEM analysis. However, the former is not easily comparable across different pozzolan contents, and the latter yields results of low precision (Wang 2014). Thus, the pozzolanic reaction is treated as a qualitative quantity that is explained mainly by discrete empirical observations over time.

In contrast, the development of compressive strength at each curing period, which is a direct result of the pozzolanic reaction, is evaluated by comparing the UCS values. Because these values are quantitative, a precise correlational pattern can be established. Forecasting, however, means providing functional correlations among all influential variables, both quantitative and qualitative. To determine a multivariate relationship among all influential variables in this study, a time series forecast was built directly from curve fitting to the recorded data. Time series analysis can be performed by curve fitting with smoothing and interpolation among data points. To investigate the variation of the input variables over time beyond the four weeks of available experimental evidence, an exponentially smoothed curve fit can also be used to infer data points in a projected forecast. Therefore, by extrapolating the smoothed curve beyond the range of the experimental data in Fig. 8b, the logical trend between the experimental values of void ratio and USS under each confining pressure over the curing time was used to develop the time series analysis in Fig. 9, which serves as the basis for an exponentially smoothed time series forecast. The variation of USS was adopted as the study variable in the forecast model because of the previously established significant effect of confining pressure on the deviator stress and its consistent pattern of influence, which follows a rising straight line as shown in Fig. 7a, b.

Fig. 9 R-squared validated exponential trend among experimental readings of void ratio with peak deviatoric stress over curing ages of 7 and 28 days

The exponential trend observed in Fig. 9 was then used as a high-confidence modeling tool for a time-dependent forecast that integrates exponential smoothing and identifies past patterns, such as the growth of strength over the curing period. Exponential smoothing methods allow the smoothing parameters to change over time in order to adapt to changes in the characteristics of the time-dependent forecast series (Taylor 2004; Billah et al. 2006). The predicted exponential growth was then calculated from the triaxial test results by fitting the following equation, in the least squares sense, through the data points in Fig. 9:

$$y = b \times m^{x},$$
(5)

where b and m are unknown constants fitted to a set of data points (xi, yi). Each set of data points, comprising the variation of void ratio with the corresponding USS, was assigned to a given zeolite content, and the resulting function was evaluated at each set of points. The values of b and m were obtained by reducing the exponential fit to a linear one: the logarithm of the y values is taken and the function log y(x) = α + βx, where α = log b and β = log m, is fitted to the data points (xi, log yi). Values of b and m obtained in this way do not generally minimize the sum of squared errors of Eq. (5), but they do minimize the sum of squared logarithmic errors:

$$\sum\nolimits_{i} {(\log y_{i} - \log \hat{y}_{i} )^{2} } = \sum\nolimits_{i} {\left( {\log \frac{{y_{i} }}{{\hat{y}_{i} }}} \right)}^{2},$$
(6)

where yi is the experimental value and \(\hat{y}_{i}\) is the predicted value for i = 1, …, n. Equation (6) is the sum of squared logarithmic errors over the set of data points. Furthermore, to avoid introducing an asymmetry in the residuals, the mean squared logarithmic error (MSLE) was ruled out so that all estimates are treated in terms of their real differences. By exponentially smoothing the existing trend among the undrained triaxial test results shown in Fig. 9 for curing periods of 1 to 4 weeks, the growth-adjusted forecast for curing periods up to 16 weeks was obtained, as illustrated in Fig. 10. This allows the forecast to integrate the effect of the curing period in the predicted values and to be smoothed without losing the trend of the time-dependent forecast, while giving the best-fit accuracy. Also, Table 3 provides a comparison between the exponentially smoothed growth forecast and the experimental results over the projected curing time.
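The log-linear fitting route behind Eqs. (5) and (6) can be sketched as follows: take logarithms of the y values, fit a straight line by ordinary least squares, and back-transform the intercept and slope to b and m. The function names and the data are hypothetical; the synthetic series is generated from known constants only to show that the procedure recovers them and how the fitted curve is extrapolated.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = b * m**x by least squares on the points (x, log y)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    x_mean = sum(xs) / n
    log_mean = sum(logs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxl = sum((x - x_mean) * (lg - log_mean) for x, lg in zip(xs, logs))
    beta = sxl / sxx                  # beta = log m
    alpha = log_mean - beta * x_mean  # alpha = log b
    return math.exp(alpha), math.exp(beta)


def sum_squared_log_error(ys, y_hats):
    """Sum of squared logarithmic errors, Eq. (6)."""
    return sum((math.log(y) - math.log(yh)) ** 2 for y, yh in zip(ys, y_hats))


# Hypothetical series over weeks 1 to 4, generated from b = 3 and m = 1.5
weeks = [1, 2, 3, 4]
values = [3.0 * 1.5 ** w for w in weeks]
b, m = fit_exponential(weeks, values)

# Extrapolate the fitted curve to week 16, mirroring the forecast horizon
forecast = [b * m ** w for w in range(1, 17)]
print(b, m, forecast[-1])
```

Because the fit is performed on log y, all y values must be strictly positive, and relative rather than absolute deviations are what the fit balances.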

Fig. 10 Exponentially smoothed time series forecast of experimental readings over curing ages of 7 and 28 days to a forecast projection up to 16 weeks

Table 3 Best-fitted values of RMSE and MAPE of the predicted data compared to the experimental values

6.1 Exponentially smoothed growth projection validation

Two prominent measures are used to evaluate the precision of time projection analyses, both based on the error or deviation between the forecast and actual values: the mean absolute percentage error (MAPE) and the root mean square error (RMSE). The performance of a forecast is best judged by the lowest RMSE or MAPE (Hyndman and Koehler 2006).
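For reference, both measures can be computed in a few lines. This is a minimal sketch using the usual textbook definitions; the function names and the sample numbers are hypothetical and are not results from this study.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between measured and forecast values."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


# Hypothetical measured and forecast strengths in kPa
measured = [100.0, 200.0]
predicted = [110.0, 190.0]
print(rmse(measured, predicted), mape(measured, predicted))
```

RMSE is expressed in the units of the data and squares the errors, so it reacts strongly to a few large deviations, whereas MAPE is scale-free and weights each observation by its relative error.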

The best-fitted values of RMSE and MAPE of the predicted data compared to the experimental ones shown in Fig. 9 are presented in Table 4. As can be seen in Table 4, the data are well fitted, with low deviations in both RMSE and MAPE values. The high R-squared shows that more than 95% of the variation in USS due to changes in void ratio over the 12 predicted weeks plus the 4 measured weeks of curing can be explained by the forecast projection. Where a high RMSE appears in the data, it is because RMSE is sensitive to extreme values, or outliers: a few extreme values, or even a single one, can completely change the estimates. This is the price of excluding the mean squared logarithmic error from the forecast projection when adopting Eq. (6). Still, running error diagnostics on the residuals will establish whether the adopted models produce less noise and will determine the extent of outlier influence.

Table 4 Comparison of values obtained by exponentially smoothed growth projection against experimental results

7 Multivariate analyses

7.1 Multiple linear regression (MLR)

The growth-adjusted exponential forecast predicted the variation of void ratio at failure with USS accurately over the 16 weeks of experimental and forecast curing. It is now advantageous to use more than one independent variable in the forecast, which extends the domain of the forecast model. As previously mentioned, a high RMSE means that trend projection with exponential smoothing cannot guarantee accuracy when outliers are likely to appear. This is particularly the case when more variables are added to the forecast or when the forecast curing period extends far beyond the measured 4 weeks, which is beyond the reach of trend projection. In that event, the model cannot avoid outliers in the forecast period. Forecasting more variables directly from these data values therefore requires regression analyses.

A regression analysis applies least-squares analysis to find the best-fitting line. The best line is the one that minimizes the mean square error, by application of Eq. (6), between the historical behavior of the samples, such as the growth in strength due to higher zeolite content and the consequently higher pozzolanic activity, and the predicted behavior of cemented particles under triaxial shearing based on the forecast values. Since the quantitative variables affecting each other in this study comprise the B-ratio, void ratio, and USS, there is more than one independent variable and a simple linear regression cannot be adopted. Instead, by extending the concept of linear regression, MLR should be adopted (Piñeiro et al. 2008; Khanlari et al. 2012). MLR relates one dependent variable Y to n independent variables Xi (i = 1, …, n) as:

$$Y = a_{0} + a_{1} X_{1i} + a_{2} X_{2i} + ... + a_{n} X_{ni} + \varepsilon_{i},$$
(7)

where Y is the dependent variable that expresses the co-relationship of the variables, \(a_{0}\) is the intercept, the coefficients \(a_{1}, \ldots, a_{n}\) are the slopes of the best-fit line for each variable, and \(\varepsilon_{i}\) is the error for the ith observation. The coefficients are determined such that the sum of the squared residuals is at its minimum.
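The least-squares fit of Eq. (7) can be sketched as follows. This is a minimal, dependency-free illustration that solves the normal equations by Gaussian elimination; it is not the software used in this study, and the function names and sample data are assumptions made for the example.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_mlr(X, y):
    """Return [a0, a1, ..., an] that minimize the sum of squared residuals."""
    rows = [[1.0] + list(x) for x in X]  # leading 1.0 carries the intercept a0
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * y_k for r, y_k in zip(rows, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)

# Hypothetical observations generated from Y = 1 + 2*X1 - 3*X2 (no noise)
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0, 3.0, -2.0, 0.0, 2.0]
print(fit_mlr(X, y))
```

For larger problems a numerically more stable routine (for example a QR-based solver from a linear algebra library) would normally be preferred over the normal equations.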

As is evident in Fig. 10, the variance of void ratio with USS along the exponentially smoothed line of best fit remains similar as curing age extends, showing a continuous scale in the data. However, over curing time, the void ratio decreases with decreasing B-ratio (Fig. 8a), whereas it decreases with increasing USS (Fig. 8b). Hence, there is no direct collinearity among the variables, which allows the regression analysis to distinguish which independent variable contributes most to the dependent variable. Above all, sound performance of an MLR analysis relies on avoiding significant outliers or influential points. Any untreated outlier would change the statistics produced by the model and reduce the predictive accuracy of the results. Hence, every MLR should be examined with diagnostic measures to evaluate its accuracy and to treat possible outliers (Sohn and Kim 1997).

7.2 Error diagnostics and treatment of MLR

To measure the accuracy of the multivariate regression analysis, error diagnostic tools (Schützenmeister et al. 2012) are plotted for each zeolite content-based dataset. Accordingly, normal probability (QQ) plots of the residuals are shown in Fig. 11. These QQ plots compare the quantiles of each dataset of a given zeolite content with the quantiles that dataset would have if it followed a normal distribution. Figure 11 shows that, despite the sound performance of the regression analysis, the residuals are randomly distributed and the points fall only partially on the diagonal line. Some points (the numbered ones) deviate sharply from the normal distribution line, and the deviation grows as zeolite content increases to its highest amount, which clearly shows the influence of outliers.

Fig. 11

Normal QQ error diagnostic plots for distinct datasets including a 8% zeolite, b 10% zeolite, c 12% zeolite, d 14% zeolite
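The construction of a normal QQ plot can be sketched as follows, using only the Python standard library. The plotting positions (i − 0.5)/n are one common convention and are an assumption here, as are the function name and the sample residuals.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(residuals):
    """Return (theoretical quantile, standardized residual) pairs for a normal QQ plot."""
    n = len(residuals)
    mu, sd = mean(residuals), stdev(residuals)
    # Sort the standardized residuals so they pair with increasing quantiles
    standardized = sorted((r - mu) / sd for r in residuals)
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, standardized))

# Hypothetical residuals; the value 2.5 plays the role of an outlier
residuals = [0.3, -1.2, 0.1, 2.5, -0.4]
for q_theory, q_sample in normal_qq_points(residuals):
    print(round(q_theory, 3), round(q_sample, 3))
```

Points that lie far from the line where the two coordinates are equal correspond to the sharply deviating cases described for Fig. 11.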

Figure 11 also shows that many outliers are present in each dataset, which means that some points are not approximated well by the model. These outliers have large residual errors, which can significantly influence the accuracy of the regression. Not all outliers, however, are truly influential in a regression analysis. Nevertheless, considering the high RMSE values in Table 4 and the current error diagnostic plots, the extent of outlier influence should be investigated. For this purpose, Cook (1977) developed a metric called Cook’s distance. This diagnostic tool identifies influential outliers that could significantly affect the output and accuracy of the analysis and the fit to the rest of the dataset.

Subsequently, residual vs leverage error diagnostic plots (Cook’s plots) are shown in Fig. 12. In this type of diagnosis, important outlying values are generally located at the upper right or lower right corner, outside the dashed lines (Cook’s distance). When cases lie outside the Cook’s distance lines, the results will change if those cases are excluded. Figure 12 shows that almost all outliers lie outside the Cook’s distance lines, which means that those points may significantly influence any relationship among variables obtained by a regression analysis, and they must therefore be treated.

Fig. 12

Residual vs leverage error diagnostic plots for distinct datasets including a 8% zeolite, b 10% zeolite, c 12% zeolite, d 14% zeolite
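To make the residual vs leverage diagnosis concrete, the sketch below computes leverage and Cook’s distance for the simplest case of a one-predictor linear fit, for which the leverage has a closed form. The data are hypothetical, the function name is illustrative, and the threshold of 1 used in the final line is a common rule of thumb rather than a criterion taken from this study.

```python
def cooks_distance_simple(x, y):
    """Cook's distance for each point of the fit y = a0 + a1 * x (p = 2 parameters)."""
    n, p = len(x), 2
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate
    # Leverage of each point: large when x_i is far from the mean of x
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [e ** 2 / (p * s2) * h / (1.0 - h) ** 2 for e, h in zip(residuals, leverage)]

# Hypothetical data: the last point has high leverage and lies off the trend
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 4.0]
distances = cooks_distance_simple(x, y)
influential = [i for i, d in enumerate(distances) if d > 1.0]  # assumed rule of thumb
print(distances, influential)
```

A point combines a large residual with a large leverage when it sits in the upper right or lower right corner of the plot, which is why those regions are the ones inspected in Fig. 12.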

7.3 Outlier treatment with robust regression

Robust regression is an alternative to least squares regression that accounts for residual errors and for the variance inflation factor (VIF) due to multicollinearity. It treats the observed residual errors differently according to how well behaved they are. Additionally, robust regression makes it possible to evaluate and remove variables that produce high VIF values without additional analyses such as principal component analysis. Based on the residuals, it reduces the skewness caused by the outliers, trains on all the data accordingly, removes the points with the highest errors if necessary, and finally refits the model (Davies 1993). Robust regression offers several methods of outlier treatment, among which two-stage Huber estimation (Huber 1983) is one of the most reliable. In Huber estimation, the objective function is applied to observations with different residual errors, each of which then receives a weight according to the magnitude of its residual and a tuning constant, k, as follows:

$$\rho_{H} (e) = \begin{cases} \frac{1}{2}e^{2} & \text{for } \left| e \right| \le k \\ k\left| e \right| - \frac{1}{2}k^{2} & \text{for } \left| e \right| \ge k \end{cases}$$
(8)

and

$$w_{H} (e) = \begin{cases} \frac{k}{\left| e \right|} & \text{for } \left| e \right| \ge k \\ 1 & \text{otherwise} \end{cases},$$
(9)

where \(\rho_{H} (e)\) is the Huber objective function, \(w_{H} (e)\) is the Huber weight function, and k is the tuning constant relative to the residual error (e); smaller values of k produce more resistance to outliers. Since all datasets with different zeolite contents show a similar presence of outliers (Figs. 11, 12), it is reasonable to combine all datasets into a unified model and to run the error diagnostics again, this time with Huber outlier treatment. Figure 13a, b shows the resulting diagnostics of the robust regression with Huber estimation. Figure 13a indicates that the outliers are evenly distributed and the points lie roughly on the diagonal line; the small deviations at the first and last points of the red line are symmetrically balanced and compensate for the associated errors between measured and predicted values. The dashed lines in Fig. 13b can now barely be seen, because all cases lie well inside the Cook’s distance lines and there is no evidence of influential outliers.
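Equations (8) and (9) can be illustrated with a short Python sketch that applies the Huber weights through iteratively reweighted least squares to a one-predictor line fit. This is an assumed, simplified procedure for illustration and not necessarily the two-stage implementation used in this study: the tuning constant k is applied directly to the raw residuals (in practice residuals are usually divided by a robust scale estimate first), and the data and function names are hypothetical.

```python
def huber_rho(e, k):
    """Huber objective, Eq. (8): quadratic for small residuals, linear for large ones."""
    return 0.5 * e * e if abs(e) <= k else k * abs(e) - 0.5 * k * k

def huber_weight(e, k):
    """Huber weight, Eq. (9): full weight within k, down-weighted beyond it."""
    return 1.0 if abs(e) <= k else k / abs(e)

def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = a0 + a1 * x; returns (a0, a1)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    a1 = sxy / sxx
    return y_bar - a1 * x_bar, a1

def huber_line_fit(x, y, k, iterations=100):
    """Refit repeatedly, recomputing the Huber weights from the latest residuals."""
    w = [1.0] * len(x)  # first pass is ordinary least squares
    for _ in range(iterations):
        a0, a1 = weighted_line_fit(x, y, w)
        residuals = [yi - (a0 + a1 * xi) for xi, yi in zip(x, y)]
        w = [huber_weight(e, k) for e in residuals]
    return a0, a1, w

# Hypothetical data on the line y = 1 + 2x, with the last value replaced by an outlier
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0, 31.0]
print("least squares:", weighted_line_fit(x, y, [1.0] * len(x)))
print("Huber, k = 1:", huber_line_fit(x, y, k=1.0)[:2])
```

In this example the outlier keeps only a small weight after reweighting, so the Huber slope stays much closer to the underlying trend than the least squares slope.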

Fig. 13

a Normal QQ, and b residual vs leverage error diagnostic plots of treated outliers with Huber estimation of robust regression in the unified dataset

Figure 13a, b shows that, with Huber estimation, the outliers in each zeolite database are treated and placed in one unified database, implying that the results would not change whether the datasets are treated separately or collectively. This does not mean that zeolite content plays no role in the prediction steps; rather, its role is now integrated into the more pronounced quantitative variables, which produce the same results. Integrating the qualitative influence of zeolite into a unified database makes it possible to establish generalized relationships irrespective of zeolite content and its curing time influence.

Now that a well-ordered unified database has been created, employing Eq. (7) with robust multivariate regression makes it possible to compute a unified fitting model that forecasts and correlates the behavior of zeolite–lime treated sands in terms of the major influential variables, as follows:

$$e_{f} = 4.89 - 0.043B - 0.0003\left( {\sigma_{1} - \sigma_{3} } \right)_{f}$$
(10)
$$B = 113 - 23e_{f} - 0.007\left( {\sigma_{1} - \sigma_{3} } \right)_{f}$$
(11)

where \(e_{f}\) is the void ratio at peak failure, B is the coefficient of pore pressure, and \(\left( {\sigma_{1} - \sigma_{3} } \right)_{f}\) is the principal stress difference at failure. The fitting model of the co-relationships between \(e_{f}\), \(\left( {\sigma_{1} - \sigma_{3} } \right)_{f}\), and B-ratio expressed by Eqs. (10) and (11) is expected to predict future observations. Consequently, target undrained shear strengths can be achieved by using the expected void ratio at failure and the B-ratio at the saturation stage, under the conditions of the current study. Of note, Eqs. (10) and (11) remain subject to further optimization with a wider range of observations.
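As a worked illustration, Eqs. (10) and (11) can be evaluated directly, and Eq. (10) can be rearranged to estimate the principal stress difference at failure from a target void ratio and B-ratio. The sketch below assumes that B is expressed in percent and the stress difference in kPa; the input values are hypothetical and the function names are illustrative.

```python
def void_ratio_at_failure(b_ratio, uss_kpa):
    """Eq. (10): void ratio at failure from B-ratio and principal stress difference."""
    return 4.89 - 0.043 * b_ratio - 0.0003 * uss_kpa

def b_ratio_from(e_f, uss_kpa):
    """Eq. (11): B-ratio from void ratio at failure and principal stress difference."""
    return 113.0 - 23.0 * e_f - 0.007 * uss_kpa

def uss_from_eq10(e_f, b_ratio):
    """Eq. (10) solved for the principal stress difference at failure."""
    return (4.89 - 0.043 * b_ratio - e_f) / 0.0003

# Hypothetical inputs: B assumed in percent, stress difference assumed in kPa
e_f = void_ratio_at_failure(b_ratio=90.0, uss_kpa=1000.0)
print("e_f from Eq. (10):", round(e_f, 3))
print("B from Eq. (11):", round(b_ratio_from(e_f, 1000.0), 2))
print("USS from rearranged Eq. (10):", round(uss_from_eq10(e_f, 90.0), 1))
```

Because Eqs. (10) and (11) are fitted separately, feeding the output of one into the other returns a value close to, but not identical with, the original input.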

7.4 Measuring the accuracy of the hybrid model

Considering the precision of the experimental program, the reproduced values are close to the measured ones. However, the prediction accuracy should be quantified for practical use. For that reason, the accuracy of the values predicted by Eqs. (10) and (11) against the experimental results for B-ratio, void ratio (\(e_{f}\)), and USS is given in Tables 5, 6 and 7, respectively. The average discrepancy between the values forecast by the proposed relationships and the experimental results is about 8.6% for B-ratio, 10% for void ratio, and 8.8% for USS. Furthermore, the pair-wise correlation matrix of all parameters linked by Eqs. (10) and (11) is shown in Fig. 14. Figure 14 shows a strong link between each pair of variables: B-ratio is inversely proportional to the void ratio with 90% collinearity, USS is directly proportional to the void ratio with 92% collinearity, and, remarkably, the collinearity between the produced values of B-ratio and USS is about 96%. This confirms the solid performance of the derived equations without major multicollinearity, and demonstrates a normal distribution of the data values and residuals in the unified model’s database.

Table 5 Average discrepancy between predicted and experimental values by proposed relationships for the value of B-ratio based on different percentage amounts of zeolite
Table 6 Average discrepancy between predicted and experimental values by proposed relationships for the value of Void-ratio at failure based on different percentage amounts of zeolite
Table 7 Average discrepancy between predicted and experimental values by proposed relationships for the values of principal stress difference at peak failure (kPa) based on different percentage amounts of zeolite
Fig. 14

Pair-wise correlation plot matrices of the studied variables obtained by the proposed relationships through multivariate linear regression. (Asterisk) Correlation values in the upper right portion of the graph belong to neighboring parameters, which are linked by an arrow pointing at the corresponding correlation value. The diagonal graphs and the graphs in the lower left portion show the normal probability distribution of each variable and the random distribution of values, respectively
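The pair-wise correlation values of the kind reported in Fig. 14 can be obtained from the Pearson correlation coefficient, as in the following minimal sketch. The variable names and data are hypothetical and do not reproduce the values in Fig. 14.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict holding the correlation of every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical values for three variables
data = {
    "B_ratio": [95.0, 92.0, 90.0, 87.0, 85.0],
    "void_ratio": [0.80, 0.76, 0.75, 0.70, 0.69],
    "USS_kPa": [600.0, 750.0, 800.0, 950.0, 1000.0],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

The sign of each coefficient indicates whether a pair of variables moves in the same or in opposite directions, and its magnitude indicates the strength of the linear link.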

Keeping in mind that the proposed relationships include outliers in their regression model, integrate several zeolite contents, and cover a forecast period of up to 16 weeks, some errors are expected. However, as can be seen in Tables 5, 6 and 7, the mean discrepancies are satisfactory, with a confidence rate above 90%. The high correlation between B-ratio and void ratio suggests that these two effects are closely bound together, and that interpreting the B-ratio to determine geotechnical parameters such as degree of saturation should be coupled with measuring the variation of void ratio. Moreover, according to the probability values of Eqs. (10) and (11), void ratio also plays a statistically significant role in this model for predicting undrained shear strength from B-ratio.

The relationships proposed by this research establish logical correlations between the governing variables that influence the undrained shear properties of zeolite–lime treated sands with less than 5% fines content, considering the initial confining pressure (σ3) and peak deviatoric stress. Additionally, the successful application of robust multivariate regression provides a solid basis for predicting conditions beyond those of the current study, especially where the available data are in discord under the influence of outliers. Finally, the results presented in this study recommend the neglected yet beneficial use of an optimized zeolite–lime blend as an eco-friendly soil improvement method.

7.5 Relative importance analysis

To assess the isolated contribution of each variable to the overall performance, the relative importance of the parameters must be assessed. Based on the multivariate regression, a relative importance analysis among deviatoric stress, void ratio, B-ratio and zeolite content was performed using the averaging over orderings method proposed by Grömping (2007). This method averages sequential sums of squares over all orderings of the independent variables and identifies the most influential parameters in the regression model. The result of the analysis on the hybrid model is shown in Fig. 15. The deviatoric stress has the governing effect on the overall regression output, followed by B-ratio and void ratio, respectively. However, the influence of zeolite is negligible, which agrees with the previous findings. This outcome means that lime content and zeolite content should be optimized together in order to assess the overall effect of the pozzolanic reaction on performance.

Fig. 15

Relative importance analysis of variables in the overall Hybrid model
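The averaging over orderings idea can be sketched in a few lines of Python: each predictor is credited with the increase in R-squared it brings when added to the model, and this increase is averaged over all possible orders of entry. The code below is a minimal, dependency-free illustration under that description, not the software used in this study; the data and function names are hypothetical.

```python
from itertools import permutations

def _solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def r_squared(columns, y):
    """R-squared of a least-squares fit of y on the given columns plus an intercept."""
    if not columns:
        return 0.0  # intercept-only model explains no variation
    rows = [[1.0] + [col[k] for col in columns] for k in range(len(y))]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yk for r, yk in zip(rows, y)) for i in range(p)]
    coef = _solve(XtX, Xty)
    y_bar = sum(y) / len(y)
    ss_res = sum((yk - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yk in zip(rows, y))
    ss_tot = sum((yk - y_bar) ** 2 for yk in y)
    return 1.0 - ss_res / ss_tot

def importance_over_orderings(predictors, y):
    """Average each predictor's sequential R-squared gain over all entry orders."""
    names = list(predictors)
    shares = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        included, previous = [], 0.0
        for name in order:
            included.append(name)
            current = r_squared([predictors[n] for n in included], y)
            shares[name] += current - previous
            previous = current
    return {name: total / len(orderings) for name, total in shares.items()}

# Hypothetical predictors and response
predictors = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]}
y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.0]
print(importance_over_orderings(predictors, y))
```

By construction the shares add up to the R-squared of the full model, so each share can be read as the part of the explained variation attributed to that variable. The number of orderings grows factorially with the number of predictors, which is manageable for the four variables considered here.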

8 Conclusions

This study proposes a novel forecasting procedure to correlate the influential variables of zeolite–alkali activated sand. Based on a series of experimental tests such as UC and UU triaxial tests, the principal stress difference at failure (USS), void ratio at failure, and B-ratio at the saturation stage were measured. Based on this experimental evidence, a chain of growth-adjusted forecast projections and a series of multivariate regression analyses, including robust analysis along with residual error diagnostics, were used to define unified correlational relationships. The validity of the proposed relationships was confirmed by comparing the average discrepancy between predicted and measured results. This research also established the following findings about zeolite–lime treated sands:

  1. Strain hardening behavior of samples under shearing indicates quick mobilization of resistance forces at earlier strain ranges. However, for a given strain rate at loading termination, confining pressure contributes strongly to the occurrence of shear failure, while zeolite content defines the shear strength.

  2. Early in curing, because of the lack of sufficient hardening in samples containing more lime, a sharper increase in the compressibility of the soil body is observed. This is due to the presence of higher void ratios, which leads to an increase in B-ratio and a consequent decrease in the rate of shear mobilization.

  3. Hybridization of multivariate robust regression with exponentially smoothed time series analysis enables a wider range of zeolite contents to be handled by time projection. However, integrating separate datasets into one requires directing the residuals by defining a tuning constant that assigns an equal weight to all variables, considering their nature in description.

  4. Employing Huber estimation in robust analyses merges outliers without compromising the precision of the forecasting model. Substituting robust regression for least squares regression then enables building large datasets with the least residual error while binding quantitative and qualitative variables together.

  5. The collinearity between B-ratio and void ratio at failure in zeolite–alkali activated sands implies that a high B-ratio should not be interpreted directly as the degree of saturation; the result should be adjusted according to the capacity of the void ratio to be minimized.

  6. Comparison of the predicted and measured results showed that the adopted forecast model and proposed relationships can predict the void ratio at failure, principal stress difference at failure, and B-ratio of zeolite–lime treated sands with a confidence rate of 90% for up to 16 weeks of curing.

  7. Relative importance analysis of the hybrid model indicates that zeolite should be optimized with alkali activators such as lime before it is used in modeling. In this way, any collinearity due to the limited range of zeolite dosages, or unreliability in the estimated regression coefficients, can be eliminated, considering the inherent importance of the zeolite–alkali activator in stabilization.