1 Introduction

Jet grouting is a popular soil stabilization technology widely used in the construction industry to cope with various geotechnical challenges, such as liquefaction [1], improvement of the soil in front of the entrance shaft during shield tunneling [39, 50] and sealing of potential leakage in diaphragm walls [51], thereby decreasing potential risks in engineering practice [26–29]. The method consists of loosening the soil with a high-velocity jet in a predrilled borehole and mixing it with cement slurry to create rigid bodies from a predefined depth to the ground surface [38, 40, 45, 48, 54]. Various jet grouting methods, including single-, double- and triple-fluid systems, are used to form the jet grouted bodies, whose properties are influenced by the implementation conditions [46]. Field observations have demonstrated that the diameter of jet grout columns mainly depends on the resistance of the in situ soil and the jetting energy [2, 3]. In general, the maximum distance achievable by the cutting jet determines the diameter of a column. Experimental investigations [6] and theoretical studies [20, 32] have been conducted to scrutinize and interpret these general observations on jet grouted bodies.

Indeed, one of the major issues associated with the practice of jet grouting is the prediction of the column diameter, which has attracted considerable research interest. Predicting the diameter of jet grout columns is very challenging because both soil parameters and jetting parameters must be considered. Prediction models have been continuously developed and improved over the years. These methods can in general be grouped into four categories, namely empirical, conceptual, theoretical and machine learning-based models. Ribeiro and Cardoso [37] provided a comprehensive review of these prediction methods. It has been acknowledged that, although very efficient under some conditions, the most sophisticated analytical models to date [12, 38, 40, 41] still have a few limitations. To overcome these shortcomings, some studies adopted machine learning algorithms such as neural networks [34] and support vector machines (SVMs) [42, 43] to predict jet grout column diameters. In general, their performance was found satisfactory; nonetheless, improvements are still needed, in particular with regard to “real-time assessment.”

The existing machine learning models can only predict the final (or average) diameters of jet grout columns based on the given data. Their intrinsic structures cannot vary dynamically to follow the variation in input data with time. As a result, they cannot account for the fact that jet grouting columns are constructed by lifting the nozzle upward from a specified depth at a given speed. In other words, the variation in the jet grout column with depth is not considered. Because the soil resistance changes between soil layers, the diameter of a jet grouting column varies with depth. Consequently, in the case of complex geological profiles [8, 36, 46], adopting the average diameter of a jet grouting column as the key metric can be misleading and unreliable for real design [47]. Recent efforts to tackle this issue have resulted in the development of probabilistic models [30, 31, 35], which, although relatively effective for some applications, lack strong physical meaning regarding the diameter prediction. A dynamic prediction model is thus in high demand, since it can improve the precision of prediction by providing a “real-time assessment” of the column diameter. It should be noted that this assessment is contingent on the precise sequencing of subsoil and jet grout column profiles; on the other hand, it enables operators to adjust the treatment parameters to reach the desired goal. In fact, taking advantage of the tremendous ability of artificial intelligence-based models to deal with nonlinear problems, a dynamic prediction allows capturing the random variability of jet grouting columns with depth by suitably integrating the soil properties, depth and injection parameters.

Furthermore, traditional machine learning models require guidance to learn a task and/or perform efficiently [49, 55]. An experienced engineer needs to step in and tune the model if the machine learning algorithm returns an erroneous prediction. In contrast, deep learning models can perform these adjustments on their own, as they are able to learn high-level features from data. This can be regarded as a distinctive advance of deep learning over conventional machine learning techniques. Another prominent advantage of deep learning models over conventional machine learning is their ability to deal with time series problems. In other words, they are capable of predicting future values based on previous observations [16, 19, 22, 24, 44, 57].

In this paper, a novel framework based on the Bi-LSTM method is proposed to provide a possible way to achieve a real-time prediction of the varying diameter during the entire jet grouting process. Compared with traditional machine learning algorithms (neural networks, SVM, etc.) that cannot handle sequential data (or learn a hierarchical representation of a time series), LSTM approaches have the advantage of learning and predicting a sequence of future data based on previous observations. The Bi-LSTM, in particular, integrates the notion of order in time by performing its learning process in two directions (from the past to the future, then from the future to the past). We adapted the Bi-LSTM model to make it suitable for the dynamic modeling of jet grouting. This was achieved by defining a framework that incorporates Bi-LSTM and data sequencing so that, at any point in time, information about a jet grout column can be preserved from both previous and future measurements. The proposed model can predict the variation in diameter with depth, particularly considering the time lag between each lifting step and the soil properties at the specific depth. This paper first describes the Bi-LSTM and data sequencing strategies necessary to fulfill this task and then systematically compares the predictive performance of the Bi-LSTM, LSTM and SVR methods. Finally, some recommendations for improving the current design routine of jet grout columns are given.

2 Methodology

The sequential nature of the jet grouting treatment leads to a continuous variation in column diameters with depth. Hence, modeling the interrelationship among chronological data points is crucial to accurately forecast this variation. This section describes the computation strategies and data sequencing methods for the real-time assessment of jet grouting column diameters based on artificial intelligence and optimization technologies [8–10, 13, 23, 52, 53]. In particular, the bidirectional LSTM, which is a variant of the traditional LSTM, is discussed, as well as the two sequence patterns adopted for assessing its performance.

2.1 Bidirectional LSTM networks

To comprehend the functioning of Bi-LSTM, it is crucial to first understand that of LSTM [21]. The LSTM architecture learns long-term dependencies between time steps of sequence data to predict future occurrences. Compared with feedforward networks [15, 55, 56], which cannot handle the notion of order in time, or recurrent neural networks (RNNs) [11], which are hampered by the vanishing gradient problem, LSTM models are more robust alternatives. Sequential data are usually presented as vectors in a sequence that are not independent of each other but appear in a given order. Moreover, LSTM architectures tackle the issue of inefficient weight updating during the training process (vanishing gradient) by capturing long-range dependencies.

The core concept of LSTM involves a cell state and its various gates. The cell state works as a transport highway that conveys relevant information throughout the processing of the sequence. It can, in theory, be considered the “memory” of the network. Along the cell state, gates are used to add information to or remove information from the cell state. These gates are simplified neural networks that learn which data in a sequence are crucial to keep or forget. In this way, the gates continuously transfer relevant information along the sequence chain to make predictions. Being simplified neural networks, gates generally integrate either a sigmoid (σ) or a hyperbolic tangent (tanh) activation function. The former maps values into the range between 0 and 1, where 1 means important and 0 means not important. The tanh function, on the other hand, regulates the network by squashing values between − 1 and 1.

As an illustration, Fig. 1 gives a breakdown of an LSTM memory cell through which the functioning of LSTM can be visualized. The cell state working process usually involves three major operations (steps) carried out by the gates. (1) Considering the output of the previous cell (hidden unit at time t − 1), \(h_{t - 1}\), and the input data at a given time t, \(x_{t}\), the first step (forget gate) consists of removing information that is irrelevant to the cell state (Fig. 1a). The sigmoid squashing function is used to define the value of that decision, \(f_{t}\) (Eq. 1). (2) The second operation involves the input gate (\(i_{t}\)) and the input node (\(g_{t}\)) (see Fig. 1b), which are mainly responsible for updating the cell state. This operation therefore selects which values to store in the cell state. Specifically, the sigmoid function is first employed to decide which values to pass through (Eq. 2). Then, the tanh function is adopted to decide the level of importance of the values that entered the cell (Eq. 3). In the standard LSTM formulation, the new cell state is then obtained as \(C_{t} = f_{t} C_{t - 1} + i_{t} g_{t}\), with element-wise products. (3) The last operation (output gate, Fig. 1c) consists of gauging what should be output at a certain time. In other words, it decides what the next hidden state should be. The sigmoid is first applied to determine which parts of the cell state reach the output (Eq. 4). Then, tanh is employed to weight the cell state, which is multiplied by the output of the sigmoid gate (Eq. 5). The hidden state is obtained as a result of this operation, and the new cell state and new hidden state are subsequently carried over to the next time step. In the equations below, W and b represent the weights and biases of the above-mentioned basic neural networks:

$$f_{t} = \sigma (W_{xf} x_{t} + W_{hf} h_{t - 1} + b_{f} )$$
(1)
$$i_{t} = \sigma (W_{xi} x_{t} + W_{hi} h_{t - 1} + b_{i} )$$
(2)
$$g_{t} = \tanh (W_{xg} x_{t} + W_{hg} h_{t - 1} + b_{g} )$$
(3)
$$o_{t} = \sigma (W_{xo} x_{t} + W_{ho} h_{t - 1} + b_{o} )$$
(4)
$$h_{t} = o_{t} \tanh (C_{t} )$$
(5)
Fig. 1 Descriptive diagram of the functioning of LSTM: (a) step 1, (b) step 2 and (c) step 3
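As a minimal sketch of Eqs. 1–5, a single LSTM time step can be written in plain Python. The function name `lstm_step`, the helper `affine` and the dictionary layout of the parameters are illustrative assumptions and do not come from the original implementation. The cell-state update used here is the standard LSTM form, which is assumed because it is not given as a numbered equation above.

```python
import math


def sigmoid(z):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_x, x, W_h, h, b):
    """Return W_x @ x + W_h @ h + b using plain nested lists."""
    return [
        sum(w * v for w, v in zip(row_x, x))
        + sum(w * v for w, v in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p is a dict of weights and biases (hypothetical layout)."""
    f_t = [sigmoid(z) for z in affine(p["W_xf"], x_t, p["W_hf"], h_prev, p["b_f"])]  # Eq. 1
    i_t = [sigmoid(z) for z in affine(p["W_xi"], x_t, p["W_hi"], h_prev, p["b_i"])]  # Eq. 2
    g_t = [math.tanh(z) for z in affine(p["W_xg"], x_t, p["W_hg"], h_prev, p["b_g"])]  # Eq. 3
    o_t = [sigmoid(z) for z in affine(p["W_xo"], x_t, p["W_ho"], h_prev, p["b_o"])]  # Eq. 4
    # Assumed standard cell-state update (not numbered in the text).
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]  # Eq. 5
    return h_t, c_t
```

With all weights and biases set to zero, every sigmoid gate equals 0.5 and the input node equals 0, so the step simply halves the previous cell state; this gives a quick sanity check of the gating logic.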

Bi-LSTM is an extension of the conventional LSTM [18]. It learns bidirectional long-term dependencies between time steps of time series or sequence data. These dependencies can be useful in improving model performance as the network learns from the complete time series at each time step. As illustrated in Fig. 2, the Bi-LSTM can be discretized into two hidden layer networks, one working in forward direction and the other in the backward direction. These forward and backward propagations are analogous to the ordinary feedforward and back-propagation networks, except that the hidden states for all time steps are unfolded [17]. Therefore, the main difference between the Bi-LSTM and the LSTM is that the latter only preserves information of the past, whereas using the two hidden states combined, both past and future information can be preserved at any point in time in Bi-LSTM. In this study, the bidirectional LSTM scheme was trained using back-propagation through time (BPTT) [14].

Fig. 2 Comparison between LSTM and Bi-LSTM networks (recreated after [33])
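The two-direction idea can be illustrated with a short sketch, independent of any particular recurrent cell. The function `bidirectional_states` and its arguments are hypothetical names; `step` stands for any recurrence that maps an input and a previous state to a new state (an LSTM step in the actual networks, a running sum in the toy usage below).

```python
def bidirectional_states(sequence, step, init_state):
    """Pair, at every position, a state built from the past with one built from the future.

    step(x, state) -> new_state is any recurrent update (assumed interface).
    """
    forward = []
    state = init_state
    for x in sequence:  # past -> future
        state = step(x, state)
        forward.append(state)

    backward = []
    state = init_state
    for x in reversed(sequence):  # future -> past
        state = step(x, state)
        backward.append(state)
    backward.reverse()  # realign with the original ordering

    return list(zip(forward, backward))
```

For example, with a running sum as the recurrence, `bidirectional_states([1, 2, 3], lambda x, s: s + x, 0)` pairs at each position the sum of all values up to that position with the sum of all values from that position onward, which is the sense in which both past and future information are available at any point in the sequence.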

2.2 Sequence patterns definition

As previously mentioned, sequential data carry a nonindependent order; it is thus critical to find a way to convey this valuable information to the Bi-LSTM model. As a main objective of this work was to provide a robust and accurate dynamic model, two data sequencing schemes were adopted to comprehensively investigate the aptitude of both LSTM and Bi-LSTM networks to dynamically forecast the diameters of jet grouting columns. Recall that the dynamic nature of this prediction stems from the fact that each time a new vertical measurement is obtained, it can be used to update the previous state. Specifically, sequence-to-one and sequence-to-sequence systems were defined (as shown in Fig. 3) and integrated in the LSTM algorithms. Sequence-to-sequence learning consists of training the model to convert sequences from one domain to sequences in another domain. In this case, the output of the LSTM layer is a sequence, which is fed into a fully connected layer. The LSTM consumes each point in the sequence in turn while taking its previous state into account. Conversely, in sequence-to-one learning, the output of the LSTM layer is taken to be the last element of the sequence, which is fed into a fully connected layer. In other words, only the final result obtained after integrating the whole sequence of a time series is used.

Fig. 3 Schematic illustration of (a) sequence-to-one and (b) sequence-to-sequence models
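The difference between the two patterns lies only in which recurrent outputs reach the fully connected layer. The sketch below assumes that the hidden states of the LSTM layer are already available as a list of vectors; the names `dense`, `sequence_to_sequence` and `sequence_to_one`, and the single-output dense layer, are illustrative assumptions.

```python
def dense(hidden, weights, bias):
    """Fully connected layer with a single output (e.g. a diameter)."""
    return sum(w * h for w, h in zip(weights, hidden)) + bias


def sequence_to_sequence(hidden_states, weights, bias):
    """One prediction per step: every hidden state is passed to the dense layer."""
    return [dense(h, weights, bias) for h in hidden_states]


def sequence_to_one(hidden_states, weights, bias):
    """One prediction per sequence: only the last hidden state is used."""
    return dense(hidden_states[-1], weights, bias)
```

By construction, the sequence-to-one output coincides with the last element of the sequence-to-sequence output when both share the same dense layer; in practice the two schemes are trained separately, so their weights differ.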

3 Case study

3.1 Data acquisition

As previously mentioned, the jet grouting data commonly used in machine learning are fixed data and not suitable for time series prediction. However, the field investigation of jet grouting in pyroclastic soils reported in [5] represents an exceptional case for testing our models. That investigation mainly focused on the effect of single-fluid jet grouting on the soil. To this end, several vertical columns were constructed over a depth of 10 m at a regular center-to-center spacing of 1.20 m. The columns were constructed with a constant injection pressure of 45 MPa and a lifting step of 40 mm. As shown in Table 1, the other treatment parameters varied from one column to another.

Table 1 Summary of jet columns’ injection parameters

Furthermore, field and laboratory testing programs revealed that the soil at the construction site was mainly sandy–silty ashes, rather heterogeneous, with a mean cohesion of c’ = 54.6 kPa and a mean friction angle of ϕ’ = 35.4°. The soil compression index was found to have an average value of 0.223, while the degree of saturation over the treatment depth ranged from 0.83 to 0.98.

To dynamically account for the variation in soil properties with depth in this particular case, the original soil porosity (input variable) was computed at each monitoring depth using the formula below (Eq. 6). This equation is derived from the continuity condition applied to the volume of the jet grout column per unit length, considering the mechanical effects of grouting as well as the mixture hardening process [4]. The use of this variable is justified by the fact that the soil porosity varies noticeably with depth [7]. Another reason is to involve the jet–soil interaction mechanism [32] in the prediction, given that in the jet grouting method each column is a snapshot of the interaction between the jetting parameters and the soil properties:

$$V = \frac{k\alpha }{{n(1 - \beta ) + \beta - n_{sc} }}V_{j}$$
(6)

where \(V = \pi D^{2} /4\) is the volume of the jet column per unit length, \(V_{j}\) is the volume of grout injected per unit length, \(n\) is the porosity of the original soil and \(n_{sc}\) is the porosity of the soil–cement mixture. \(k\) is the volumetric ratio between hydrated cement and retained grout (during the jetting process), which is equal to 0.37. \(\alpha\) is the volumetric percentage of cement slurry absorbed by the soil, which is set to 0.775. \(\beta\) is the volumetric percentage of soil excavated by the jet action, which is equal to 0.45.
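Since the porosity is the quantity computed at each monitoring depth, Eq. 6 has to be solved for \(n\), which gives \(n = (k\alpha V_{j}/V - \beta + n_{sc})/(1 - \beta)\). The sketch below is one possible way to carry out this step; the function names are hypothetical, and the values of \(n_{sc}\) and \(V_{j}\) must be supplied from the site data (they are not fixed by the text).

```python
import math

K = 0.37       # hydrated cement / retained grout ratio (value from the text)
ALPHA = 0.775  # share of cement slurry absorbed by the soil (value from the text)
BETA = 0.45    # share of soil excavated by the jet action (value from the text)


def column_volume(n, n_sc, v_j):
    """Eq. 6: column volume per unit length from porosity n and injected volume v_j."""
    return K * ALPHA * v_j / (n * (1.0 - BETA) + BETA - n_sc)


def column_diameter(n, n_sc, v_j):
    """Diameter D obtained from V = pi * D**2 / 4."""
    return math.sqrt(4.0 * column_volume(n, n_sc, v_j) / math.pi)


def original_porosity(diameter, n_sc, v_j):
    """Eq. 6 solved for n, given a measured diameter at one monitoring depth."""
    v = math.pi * diameter ** 2 / 4.0
    return (K * ALPHA * v_j / v - BETA + n_sc) / (1.0 - BETA)
```

A round trip through `column_diameter` and `original_porosity` with the same `n_sc` and `v_j` should return the starting porosity, which is a convenient check that the inversion is consistent with Eq. 6.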

After the treatment, the operation area was excavated, and the column diameters were measured at every 0.5 m depth. This information is crucial because it gives the variation in column diameters with depth. Moreover, using the lifting step and lifting speed, the incremental time of the jet grouting construction at each depth can be determined, which allows the real-time assessment by using LSTM architectures. Table 2 provides synoptic statistics of the jet grout columns diameters investigated.
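The incremental time can be sketched as follows, under the assumptions that the nozzle starts at the bottom of the treated depth and is lifted at a constant average speed, with no pauses. The function names and the 8 s per step used in the usage note are hypothetical; only the 40 mm lifting step comes from the case study.

```python
def lifting_speed(lifting_step_m, time_per_step_s):
    """Average lifting speed from the lifting step and the injection time per step."""
    return lifting_step_m / time_per_step_s


def elapsed_time(depth_m, start_depth_m, speed_m_per_s):
    """Time since the start of jetting when the nozzle reaches depth_m.

    Assumes jetting starts at start_depth_m and proceeds upward at constant speed.
    """
    if depth_m > start_depth_m:
        raise ValueError("depth_m lies below the starting depth")
    return (start_depth_m - depth_m) / speed_m_per_s
```

For instance, with a 0.04 m lifting step and a hypothetical 8 s per step, the speed is 0.005 m/s, so a monitoring point 0.5 m above the starting depth is reached after about 100 s.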

Table 2 Range of diameter values per column

3.2 Implementation

Considering the two prediction schemes introduced in Sect. 2.2, a total of 11 input variables were considered for the simulation, which include the injection parameters (number of rotations per lifting step (Rot), flow rate (Q), number of nozzles (N), diameter of nozzles (d), injection time per lifting step (Δt), average rotation speed (w), injected volume grout (Vj’), lifting speed (v)), the incremental time, the soil depth and porosity. The output was the column diameters at specific depths. For the first model (sequence-to-one regression), all the diameter values gathered from the first six columns were arranged as a sequence to predict the diameter of the seventh column over the implementation depth. For the second model (sequence-to-sequence regression), the main objective was to predict the last six values (considering that the data sequence is regularly defined upwardly following the monitoring depth of 0.5 m) of the diameter for each of the seven columns.
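One possible way to organize the 11 input variables is a small record per monitoring depth, ordered from the deepest point upward to follow the lifting direction. The class `JetGroutStep`, its field names and the helper `to_sequence` are illustrative assumptions, not the data structure used in the study.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class JetGroutStep:
    """The 11 inputs listed in the text for one monitoring depth (names are hypothetical)."""
    rotations_per_step: float  # Rot
    flow_rate: float           # Q
    n_nozzles: float           # N
    nozzle_diameter: float     # d
    time_per_step: float       # Δt
    rotation_speed: float      # w
    injected_volume: float     # Vj'
    lifting_speed: float       # v
    incremental_time: float
    depth: float
    porosity: float


def to_sequence(steps):
    """Return one 11-value row per depth, ordered from deepest to shallowest."""
    ordered = sorted(steps, key=lambda s: s.depth, reverse=True)
    return [list(astuple(s)) for s in ordered]
```

Each row then forms one time step of the input sequence, with the measured diameter at the same depth as the corresponding target.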

Both models were implemented using MATLAB packages. The LSTM and Bi-LSTM networks consisted of 300 hidden units, and both were trained for 300 epochs. This configuration was selected through comprehensive preliminary experimentation. The Adam algorithm [25] was adopted as the optimizer. It is a first-order gradient-based optimization approach that is efficient in minimizing the validation loss and RMSE. The initial learning rate was set to 0.01. In addition, the gradient threshold was set to 1 to prevent the gradients from exploding.
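To make the optimizer settings concrete, the following sketch shows a single Adam update preceded by gradient clipping, using the learning rate of 0.01 and the threshold of 1 quoted above. It is not the MATLAB implementation: the function names are hypothetical, the decay rates and epsilon are the commonly used Adam defaults, and clipping on the L2 norm of the whole gradient is an assumption about how the threshold is applied.

```python
import math


def clip_by_l2_norm(grads, threshold):
    """Rescale the gradient so that its L2 norm does not exceed the threshold."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= threshold:
        return list(grads)
    scale = threshold / norm
    return [g * scale for g in grads]


def adam_step(params, grads, state, lr=0.01, beta1=0.9, beta2=0.999,
              eps=1e-8, grad_threshold=1.0):
    """One Adam update; state holds the step counter t and the moment estimates m, v."""
    grads = clip_by_l2_norm(grads, grad_threshold)
    state["t"] += 1
    t = state["t"]
    new_params = []
    for k, (p, g) in enumerate(zip(params, grads)):
        state["m"][k] = beta1 * state["m"][k] + (1.0 - beta1) * g
        state["v"][k] = beta2 * state["v"][k] + (1.0 - beta2) * g * g
        m_hat = state["m"][k] / (1.0 - beta1 ** t)  # bias-corrected first moment
        v_hat = state["v"][k] / (1.0 - beta2 ** t)  # bias-corrected second moment
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params
```

The state is initialized as `{"t": 0, "m": [0.0] * n, "v": [0.0] * n}` for `n` parameters. On the very first step, each parameter moves by approximately the learning rate in the direction opposite to its gradient, regardless of the gradient magnitude.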

4 Modeling results

4.1 Performance metric

The root-mean-square error (RMSE) is the standard deviation of the prediction errors, which was used as a standard metric for measuring the performance of the proposed models:

$${\text{RMSE}} = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} {(Y_{\text{obs}} - Y_{\text{pred}} )^{2} } }$$
(7)

where N is the number of data points, \(Y_{\text{obs}}\) is the observed value and \(Y_{\text{pred}}\) is the value predicted by the model. The RMSE is commonly used in forecasting, climatology and regression analysis to verify experimental results. Mathematically, the RMSE is the square root of the mean square error (MSE), which is the average squared difference between the observed values and the values predicted by the model. The square root is introduced so that the errors have the same scale as the targets. A lower RMSE implies a better model performance.
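Eq. 7 translates directly into a few lines of code. The function name `rmse` is an illustrative choice, and the inputs are assumed to be two equally long sequences of diameters in the same unit.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values (Eq. 7)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Because of the squaring, a single large error weighs more heavily on the RMSE than several small ones, which is worth keeping in mind when comparing columns with a few outlying measurements.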

4.2 Training process

Indeed, the training processes of LSTM algorithms embodying either a sequence-to-one or a sequence-to-sequence system are quite analogous. Figure 4 shows the training progress curves pertaining to the sequence-to-sequence learning pattern for both the LSTM and Bi-LSTM algorithms. There are no significant differences between the two approaches regarding either the training loss or the RMSE of the training process. The loss curves show that the Bi-LSTM learned more slowly at the beginning of the training process but tended to improve and stabilize faster with time. It is supposed that the Bi-LSTM did not have access to enough future values at the start of the training, but as this issue was progressively resolved, its performance increased accordingly. Overall, it was observed that the Bi-LSTM performs slightly better than the LSTM.

Fig. 4 Plots of training progress during training: (a) loss and (b) RMSE

4.3 Model predictions

This section presents the prediction results achieved by both the LSTM and Bi-LSTM approaches, which were simulated considering the sequence-to-one and sequence-to-sequence patterns. It should be mentioned that column C6 originally had four missing values, which were recovered by the prediction schemes. As can be seen from Fig. 5, with both the LSTM and Bi-LSTM approaches, the sequence-to-one model agreed well with the experimental results, with RMSE values (in mm) of 15.2585 (Bi-LSTM) and 18.7202 (LSTM). Figure 5a also depicts the variation in the missing values of column diameters with depth. Interestingly, for the Bi-LSTM method, these values are bounded by relatively small error values, which suggests that these predictions are more reliable than those of the LSTM approach. As the injection parameters remained unchanged throughout the construction of this column, it can also be speculated that the soil properties varied only slightly within the depth range concerned by these missing values. However, with regard to the direction of error variation (see Fig. 5b), the LSTM approach appears to be more stable despite its larger RMSE.

Fig. 5 Schematic illustration of the performance of the sequence-to-one model

Figure 6 shows the column diameters predicted using the sequence-to-sequence model. Here also, both the LSTM and Bi-LSTM approaches provided good predictions of the variation in column diameter with depth observed by [5]. Figure 7 shows the statistical analysis of the prediction performance of the two models. The average RMSE (in mm) achieved by the LSTM approach is 27.04198, while that of the Bi-LSTM is 21.80673. Furthermore, Fig. 7a, b shows that: (1) with the LSTM and Bi-LSTM methods, the most accurate predictions were achieved for columns C3 and C4, respectively; (2) the most centered errors with LSTM and Bi-LSTM were observed for columns C0 and C2, respectively; and (3) column C5 presents the largest error variability outside the upper and lower quartiles for both methods. These results demonstrate that the diameter of a jet grouting column can be accurately predicted from specific input parameters using the sequence-to-sequence models. Both computation schemes (LSTM and Bi-LSTM) are capable of assessing the variation in jet grouting column diameter with depth considering the time lag between each step. The comparison between LSTM and Bi-LSTM and the physical meaning of the predictions are discussed in the following section.

Fig. 6 Predicted diameter values of each of the seven investigated columns

Fig. 7 Error variation by column for the second simulation case: (a) LSTM and (b) Bi-LSTM

5 Discussion

5.1 Comparison of the performances between LSTM and Bi-LSTM

From the above results, it can be clearly seen that the Bi-LSTM layer performed better than the LSTM. Table 3 shows an assessment of the predictive performance of the proposed models using different metrics. The RMSE and mean absolute error (MAE) values substantiate the advantage of the Bi-LSTM approach over the LSTM approach, especially for sequence-to-sequence tasks. Importantly, the forecast bias metric provides additional insight into the forecasting process. It should be recalled that this metric does not assess the precision of a model; it only indicates the direction in which its predictions tend to deviate. As shown in Table 3, the forecast bias results suggest that the LSTM approach tends to provide underestimated predictions, while the Bi-LSTM approach tends to provide overestimated predictions. In particular, in the sequence-to-sequence scheme, the LSTM underestimated the actual diameter values and yielded a relatively larger error than the overestimation of the Bi-LSTM. This comparison substantiates the advantage of the Bi-LSTM over the LSTM for sequence-to-sequence tasks. However, as pointed out earlier, the LSTM method tended to be more stable throughout the experimentation process.

Table 3 Performance evaluation using different metrics
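The two additional metrics can be sketched as follows. The MAE is standard; the forecast bias is not defined explicitly in the text, so the mean signed error (predicted minus observed) is assumed here, with a positive value indicating overestimation and a negative value indicating underestimation. The function names are hypothetical.

```python
def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def forecast_bias(observed, predicted):
    """Mean signed error (assumed definition): > 0 overestimation, < 0 underestimation."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)
```

Errors of opposite sign cancel in the bias but not in the MAE, which is why a model can show a small bias and still be imprecise.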

To further demonstrate the ability of LSTM-based models to achieve the real-time assessment of jet grouting column diameters in comparison with traditional machine learning methods, a support vector regression (SVR) model was selected as a representative of the latter. Figure 8 shows that, whereas the SVR achieved relatively unsatisfactory prediction accuracy, the LSTM and Bi-LSTM models handle the sequential data in this case very well. The Bi-LSTM performs better than both the LSTM and the SVR in terms of RMSE, with respective improvements of 7.86 mm (21.93%) and 63.86 mm (69.65%). These results are consistent with the conclusions of [56]. The comparison confirms that the sequential step-by-step jet grouting process can be well modeled using the Bi-LSTM network.

Fig. 8 Scatterplot and bar graph comparing Bi-LSTM with LSTM and SVR

5.2 Physical meaning of the predictions

The physical meaning of the above prediction results is illustrated in Fig. 9. This drawing was plotted to the real configuration scale, and it can be seen that the predicted diameters (in blue) overlap well with the observed values. The variation in diameter with depth (considering the time lag between each depth/step) for a given column can be assessed using the information from columns constructed entirely (Fig. 9a) or partly (Fig. 9b). As the assessment predicts the diameter variation in the grouted column, it can be exploited to tune the treatment parameters accordingly. Indeed, as shown in Fig. 10, the relationship between the volume of injected grout (Vj) and the column diameter established using the Bi-LSTM (sequence-to-sequence) approach was found to be in good agreement with the correlation proposed by [5]. The ability to predict the relationship between the grout volume and the column diameter is encouraging, since the parameters of the jet grouting apparatus can be tuned based on the relevant correlations to ensure the construction quality of the jet grout in terms of the dimension and profile of the columns to be constructed. On the other hand, as it is well acknowledged that the specific energy better captures the effects of jet grouting, it is worth studying the relation with this parameter, instead of the injected volume of grout, in future work.

Fig. 9 Physical meaning of the prediction results of: (a) sequence-to-one and (b) sequence-to-sequence models

Fig. 10 Variation in column diameter with the amount of injected grout

5.3 Recommendations

Figure 11 illustrates a simple workflow for the practical implementation of the proposed approach for evaluating the real-time diameters of jet grouting columns, which can be divided into three main steps.

Fig. 11 Flowchart of the implementation of real-time assessment of jet grouting columns

  • Step 1: collecting the input data. This procedure can be divided into two phases representing the core facets of the jet grouting method. (1) The first phase involves performing site investigations beforehand and gathering relevant soil data. It is recommended to generate a continuous subsoil characterization based on in situ test techniques such as the cone penetration test (CPT). This geotechnical characterization approach is advantageous because it combines practicality and effectiveness. (2) The second phase refers to the traditional in situ jet grouting trial. This phase is important as it provides the initial jetting parameters and the experimental diameters that will be used as the target variable during the Bi-LSTM training process.

  • Step 2: processing the input data using either the sequence-to-one Bi-LSTM or sequence-to-sequence Bi-LSTM depending on the design purpose, as discussed in Sect. 5.2.

  • Step 3: adjusting the operational parameters based on the results of Step 2, thereby obtaining an accurate prediction of future values.

6 Conclusions

This study explored the capability of Bi-LSTM to provide an accurate estimate of the diameter of jet grouting columns. The specific goal was to dynamically predict the variation in jet column diameter with depth considering the time lag between each step. Moreover, the performance of the Bi-LSTM was compared against that of the LSTM and SVR methods. The following conclusions can be drawn:

  1. In the proposed framework, two models, namely the sequence-to-one and sequence-to-sequence models, were investigated and tested using a case study of the implementation of jet grouting in soft soil. The results showed that the diameter of a jet grout column can be accurately forecast using the two models. Both the LSTM and Bi-LSTM have the ability to efficiently evaluate the variation in jet grout column diameter with depth considering sequential data.

  2. The comparison among the Bi-LSTM, LSTM and SVR methods showed that the Bi-LSTM performs better than both the LSTM and SVR in terms of RMSE, with the improvement being 9.8% and 24.3%, respectively. This result substantiates the efficacy of modeling the sequential step-by-step jet grouting process using the Bi-LSTM.

  3. The proposed prediction schemes have proven to be reliable and flexible. They can readily adapt to new data and predict future diameter variation, which has the practical advantage of enabling the tuning of operational parameters.

  4. To further enhance the forecasting performance of the proposed models, it is recommended to include more parameters representative of the variation in soil properties with depth (such as CPT values). The continuous geotechnical characterization of the soil is thus imperative.