Introduction

One of the most important considerations for tunnel boring machines (TBMs) is the analysis of machine performance in different types of rock (Mohammadi et al. 2014). Because it is easy to measure, the uniaxial compressive strength (UCS) of rock has often been used to predict TBM performance (McFeat Smith 1977; Roxborough and Phillips 1975; Ozdemir 1977; Pang et al. 1989; Torabi et al. 2013). A number of other experimental tests (e.g., Brazilian tensile strength (BTS), Schmidt hammer, Taber abrasion, point load index, Shore hardness, and drilling rate index tests) have also been used to estimate TBM performance (Blindheim 1979; Fowel and McFeat Smith 1977; Howarth et al. 1986). Over the past decades, numerous TBM performance prediction models have been introduced based on rock mass classification systems such as rock mass rating (RMR) (Sapigni et al. 2002; Ribacchi and Lembo Fazio 2005; Bieniawski et al. 2007; Ramamurthy 2008; Khademi Hamidi et al. 2010), the rock structure rating (RSR) system (Cassinelli et al. 1982; Innaurato et al. 1991), rock mass index (RMi) (Palmstrom 1995), the rock mass quality system (Q) (Hassanpour et al. 2009), geological strength index (GSI) (Hassanpour et al. 2009), and rock mass quality for tunnel boring machines (QTBM) (Barton 2000; Oraee and Salehi 2013).

The mechanical behavior of rock masses is controlled by several factors related to the environment and the geological processes that formed them. Expressing these factors, especially the geological ones, in mathematical form for subsequent analysis is difficult. In most cases, only some of these factors, rather than all of them, are used as main variables in formulas. As a result, experimentally obtained data are often inconsistent with the results predicted by these formulas. Although tunneling machines were in the past applicable only to moderate-strength rocks and uniform geological conditions, technological developments have broadened their scope of application and increased their advance rates. Mechanized excavation with TBMs is now widely applied around the world for different kinds of tunnels, such as road, rail, and water transfer tunnels. Typically, excavation of a water tunnel starts downstream at the tunnel outlet and proceeds upward on a positive slope so that water runs out of the tunnel without pumping. In machine excavation, many factors are influential, including the characteristics of the rock mass, rock material properties, machine characteristics, geological conditions, operator skill, and technical knowledge. The interaction between the machine and the rock mass is dynamic, complex, uncertain, and nonlinear. Hence, modeling the penetration rate of the machine with mathematical models founded on the basic principles of probability and statistics is a very difficult task. Generally, the methods used in industry to predict advance and penetration rates are divided into two groups: empirical (mostly NTH) and shear (mostly CSM) models (Nilson and Ozdemir 1993; Bruland 2000). Several researchers have used different parameters to assess the performance and penetration rate of TBMs.
For example, Graham (1976) used the uniaxial compressive strength of rock; Farmer and Glossop (1980) applied rock strength and uniaxial compressive strength; Haykin (1991) incorporated rock structure and uniaxial strength; and Barton (2000) used a rock classification system. Empirical methods are developed from observation, practical machine and ground parameters, and fitting between rock properties and machine penetration rate parameters. Some researchers, including Tarkoy (1974), Graham (1976), Farmer and Glossop (1980), and Nelson (1983), predicted penetration rates from a single rock parameter or from a combination of a rock parameter with a machine parameter. Researchers such as Cassinelli, Innaurato, Park, and Hassanpour expressed the penetration rate as a function of rock classification systems such as RMR, RSR, RQD, and Q (Innaurato et al. 1991; Park et al. 2001; Hassanpour et al. 2011; Barton 2000). The NTH method was likewise established from empirical relationships that link mechanical parameters, geological conditions of the rock, and machine productivity to the actual progress of the tunnel. The advantage of the empirical method is that all effects of the ground and the machine are normally considered. On the other hand, the development of technology and the introduction of machines with different capabilities limit further use of this method. Note that the model proposed in this work applies to a special case of the presented conditions and is not applicable to all situations. In recent years, artificial neural networks have been used as a new tool to predict the penetration rate. Artificial neural networks (ANNs), which are inspired by the brain’s structure and its parallel operation, need no explicit mathematical model. Instead, as with humans, the experience gained from examples is generalized in this type of modeling.
In cases where no explicit relationship between input and output parameters is known, or where no clear relationship is detectable by conventional methods, artificial neural networks are a good choice for estimation and modeling. The main advantages of these networks arise from the fact that they do not require an explicit formulation, that the network is dynamic, and that additional effective parameters can be included. Given these capabilities, neural networks can give engineers more accurate results more easily and at lower cost.

To predict the TBM penetration rate with a neural network, a record of all the effective parameters must first be prepared. Two conventional empirical methods (NTH and Innaurato) and the fundamentals of neural networks, the subject of the present study, are discussed in this work. The NTH model includes the influence of the force per disc, the cutterhead rotation speed, disc spacing, machine capacity, disc life, drilling rate, joints, and fractures. The drilling rate and disc life indices are obtained from abrasion values, brittleness tests, and the available relations and corresponding graphs. The penetration rate is then calculated from the drilling rate index, the penetration force per disc, the disc diameter, the disc spacing, and the joints in the rock mass. Next, the delays in boring time due to disc replacement, machine failures, and maintenance and repair of the backup system are calculated, followed by estimation of the utilization factor and advance rate. Here, the degree of fracturing and crushing of the rock mass is more important than other rock mass parameters.

In the Innaurato model, rock mass parameters and rock structure are applied simultaneously through the RSR classification. The most important shortcoming of this model is that it neglects the machine parameters (strategic and geometrical parameters). Because the database chosen by Innaurato to create the model was limited (five tunnels with a length of 19 km) and covered a limited range of rock compressive strength (50 to 150 MPa), the model gives poor results when the rock strength is lower than this range.

Artificial neural network

An artificial neural network (ANN) is an information processing system that simulates both the structure and the functions of the brain. The network consists of numerous simple processing elements (neurons) that together are capable of complex data processing and knowledge representation (Kosko 1994). A neural network is normally trained by processing a large number of input and output patterns until it can match and predict them. It essentially maps input values to output values; hence, it has excellent interpolation capabilities, especially when the input data are noisy. Neural networks may be used as a substitute for auto-correlation, multivariable regression, linear regression, trigonometric, and other statistical analysis techniques. When data are analyzed with a neural network, it is possible to detect important predictive patterns that were not previously apparent to a non-expert. A particular network, which depends on the nature of the problem to be solved, can be defined by three fundamental components: transfer function, network architecture, and learning law (Simpson 1990).

Multilayer perceptron networks

The structure of an ANN consists of many units called neurons. Information processing by a neuron is shown in Fig. 1. The inputs received by the neuron (P1 and P2) are multiplied by the weights (W1 and W2) and then summed to give the total input to the neuron. The weights, which indicate the connection strength between neurons, are determined during the learning process (da Silva et al. 2017). The mathematical form of this operation is given in Eq. (1) (Shahin et al. 2008). In some cases, a fixed amount called the bias is added to each neuron, and consequently Eq. (1) becomes Eq. (2):

$$ Ne{t}_j=\sum_{i=1}^{R}{W}_{ji}{P}_i $$
(1)
$$ Ne{t}_j=\sum_{i=1}^{R}{W}_{ji}{P}_i+{B}_j $$
(2)
Fig. 1 Schematic of a three-layer neural network

In the next step, an activation function is applied to the total input of each neuron to determine the neuron's output. In neural networks, neurons are arranged in layers. In the simplest form of these networks, a layer of input neurons is fully connected to a layer of output neurons. This structure is called a single-layer perceptron network because the calculations are done only at the nodes of the output layer. The second type of network is built using one or more hidden layers. Adding a hidden layer to the network enhances its capability to approximate a nonlinear mapping. Increasing the number of hidden layers enables the network to extract more accurate statistical information. The activation function can be either linear or nonlinear, and it is selected according to the problem that the neural network is supposed to solve. Commonly used functions include sigmoid functions such as the tangent sigmoid (tansig), which is the hyperbolic tangent function. Because the neurons are arranged in layers, the capacity of the network increases, especially when the input value is high. Neurons in the input layer do not perform any computation; they pass the input vector to the second layer (first hidden layer). The outputs of the second layer, multiplied by the weight vectors, form the inputs to the third layer (second hidden layer).
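The layer-by-layer computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the network used in the study: the 2-2-1 layout, the weight and bias values, and the function names (`neuron_net_input`, `layer_forward`, `mlp_forward`) are hypothetical, and the output neuron is assumed to be linear.

```python
import math

def neuron_net_input(weights, inputs, bias=0.0):
    """Net input of one neuron: weighted sum of the inputs plus a bias, as in Eq. (2)."""
    return sum(w * p for w, p in zip(weights, inputs)) + bias

def layer_forward(weight_rows, biases, inputs, activation):
    """Apply every neuron of one layer to the same input vector."""
    return [activation(neuron_net_input(w, inputs, b))
            for w, b in zip(weight_rows, biases)]

def mlp_forward(layers, inputs):
    """Propagate the inputs through a list of (weights, biases, activation) layers."""
    signal = inputs
    for weight_rows, biases, activation in layers:
        signal = layer_forward(weight_rows, biases, signal, activation)
    return signal

# Hypothetical 2-2-1 network: tanh (tansig) hidden layer, linear output neuron.
hidden = ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0], math.tanh)
output = ([[1.0, 2.0]], [0.5], lambda x: x)
prediction = mlp_forward([hidden, output], [1.0, 1.0])
print(prediction)  # single-element list holding the network output
```

The input layer appears here only as the list passed to `mlp_forward`, which mirrors the statement that input neurons perform no computation of their own.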

The study area

The Nowsoud water transfer tunnel project is part of the water supply project for the tropical western terrains of Iran. The tunnel, with a total length of 50 km, is located around Hirvi in the vicinity of Azgele. The tunnel is constructed on the Sirvan River, whose basin is located on the Iran-Iraq border. The study area lies in the Simply Folded Zone of the Zagros fold-and-thrust belt, which shows folds of fairly regular wavelength running for hundreds of kilometers and cut by several major faults (Agard et al. 2011). Rock units along the tunnel route consist of shale and clayey limestone of the Pabdeh and Gurpi formations (Late Cretaceous and Tertiary), respectively; however, the rock layers are mainly Cretaceous limestone. The oldest rocks exposed in the tunnel are Jurassic evaporite units. Based on their lithological characteristics, the rock masses along the tunnel route are divided into different units, including limestone (Li), limestone-shale (Li-Sh), marl (MA), marly limestone (ML), limy shale (LS), crushed zone (CZ), and fractured zone (FZ). According to geological and engineering geological investigations, there are 23 rock units in this area. These rock units are mainly limestone (Li) and limestone-shale (Li-Sh) rocks, as presented on the geological section of the tunnel in Fig. 2.

Fig. 2 Geological section along the tunnel

Data used

In this study, the effects of geotechnical parameters on the penetration rate (PR) were analyzed in 63 sections along the tunnel. The data set for each section included uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), modulus of elasticity (E), cutter life index (CLI), modulus of deformation (M), porosity (n), tunneling quality index (Q), rock mass rating (RMR), geological strength index (GSI), and rock structure rating (RSR). The number of data points and the minimum and maximum values of each parameter are presented in Table 1.

Table 1 Statistics of data used in this study

Prediction of penetration rate by statistical method

To predict the penetration rate with artificial neural networks, the parameters affecting the penetration rate were first selected for building the network architecture. Next, the most effective parameters for optimized network design and simulation were identified. As part of the analysis, a statistical study of the input data was performed with SPSS software so that its results could be compared with those produced by the neural network models. Statistical analyses are common in rock engineering and engineering geology for designing prediction models of the penetration rate. This modeling is conducted with two approaches: simple and multivariate regression. In this study, to obtain a simple regression (SR), nonlinear equations were investigated in addition to linear equations (y = ax + b). Using the data in Table 2, the correlation of UCS, BTS, E, CLI, M, and n with the penetration rate (PR) was studied for the 23 lithological units in the Nowsoud tunnel. All three mechanical parameters mentioned indicated an exponential relationship with the penetration rate (PR) of the TBM. Among these, the uniaxial compressive strength (UCS) has the highest correlation with the TBM penetration rate (PR). The correlation between the engineering classifications of the rock mass (RQD, Q, GSI, RSR, RMi, and RMR) and the PR was also evaluated in this work. Statistical results showed that all classification systems have a statistically insignificant relationship with the PR. Among the classification systems, RSR and RMR indicated the highest (R² = 0.67) and lowest (R² = 0.40) exponential relationship with the PR, respectively (Table 2).
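As an illustration of the kind of simple exponential regression discussed above, the following sketch fits y = a·exp(b·x) by ordinary least squares on ln(y) and reports a determination coefficient. It is only one possible approach and is not claimed to be the procedure SPSS applied in the study; the function names and the numerical values are hypothetical placeholders rather than Nowsoud data.

```python
import math

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on ln(y); requires every y > 0."""
    n = len(x)
    ln_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ln_y = sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ln_y) for xi, li in zip(x, ln_y))
    b = sxy / sxx
    a = math.exp(mean_ln_y - b * mean_x)
    return a, b

def r_squared(observed, predicted):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Synthetic placeholder values generated from PR = 5 * exp(-0.01 * UCS).
ucs = [40.0, 60.0, 80.0, 100.0, 120.0]
pr = [5.0 * math.exp(-0.01 * u) for u in ucs]
a, b = fit_exponential(ucs, pr)
pr_fit = [a * math.exp(b * u) for u in ucs]
print(a, b, r_squared(pr, pr_fit))
```

Because the placeholder data lie exactly on an exponential curve, the fit recovers the generating coefficients; with measured data the determination coefficient would fall below 1.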

Table 2 Determination coefficient values of the SR method

Multiple linear regression

Multiple regression is a modeling approach used to determine the linear relationship between a dependent variable and one or more independent variables. Regression analysis is a statistical method that predicts the value of one variable from the values of one or more other variables. In the statistical model used in this study, the relationship between the independent and dependent variables is written as follows:

$$ Y={b}_0+{b}_1{X}_1+\cdots +{b}_k{X}_k $$
(3)

where Y is the dependent variable, X1, ..., Xk are the independent variables, b0 is a constant, and b1, ..., bk are the regression coefficients. The purpose of regression analysis is to determine the regression coefficients and to compare the model output with the dependent variable (Y). Because the equations obtained from simple regression (other than the relationship between UCS and PR) did not perform reliably, multivariate regression was used in the next step of the statistical analysis. To predict the PR, which is the dependent variable in the multiple regression, parameters of the intact rock, the rock mass, and rock classification systems were used as independent variables. In this study, seven statistical models were used: one model is based only on the mechanical parameters of intact rock, and the other models are based on a combination of intact rock parameters and rock mass classifications (GSI, RSR, and RMR). Details of these models and the equations derived for predicting the PR are presented in Table 3. The validity of the models was assessed using criteria such as the square of the correlation coefficient (R²) and Student’s t test. According to Table 3, models 3, 4, and 5, with R² values of 0.856, 0.861, and 0.870, respectively, predict the PR acceptably.
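To make Eq. (3) concrete, the sketch below estimates b0, b1, ..., bk by ordinary least squares through the normal equations. It is an illustrative assumption about how such coefficients can be obtained, not a reproduction of the SPSS analysis; the function names and the two-variable synthetic data set are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_mlr(rows, y):
    """Return [b0, b1, ..., bk] that minimize the squared error of Eq. (3)."""
    design = [[1.0] + list(r) for r in rows]  # the leading 1 carries b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic placeholder data generated from Y = 1 + 2*X1 - 3*X2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in rows]
coefficients = fit_mlr(rows, y)
print(coefficients)  # approximately [1.0, 2.0, -3.0]
```

The normal equations are adequate for a handful of well-scaled predictors; for strongly correlated inputs a QR or SVD based solver would be numerically safer.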

Table 3 Values obtained using the MLR method

The first model, which was created from the mechanical parameters of intact rock, has an R² value of 0.601. As shown in Table 3, model 7, which was created from a combination of UCS, E, and GSI, differs from model 1 (R² = 0.66) in predicting PR values, implying that GSI has no positive effect on the predictive power of the model. Including GSI in the model not only fails to improve model performance but actually reduces it. The main explanation for this outcome is that this parameter is estimated only from qualitative characteristics of the rock mass, which involves human error. According to Al Atik and Abrahamson (2010), large values of regression coefficients in prediction models do not necessarily indicate superiority. Thus, to assess the correlation between the models and the experimental results, we applied a t test at a confidence level of 95% to all models. If the calculated t value of a variable is greater than the critical value, its significance level (Sign) drops below 0.05, and the variable is considered significant for the success of the prediction model. Therefore, based on the R² values and t statistics of the generated models, the models whose input parameters were effective, including rock material parameters and classification systems, were chosen for the neural network structure. Finally, model 3, with a coefficient of determination of R² = 0.85 and a significance level (Sign) below 0.05, was selected as the optimum model for statistical prediction of the PR.

Prediction of penetration rate by using neural networks method

A set of input and output data is required to design a proper network. According to the principles of artificial neural networks, a network must be trained with a set of input-output training pairs and tested with a set of evaluation data. Prior to training, one of the most important and most difficult stages is defining the neural network structure, which requires choosing the optimal number of layers and the optimal number of neurons in each layer. There is no specific method for determining the optimal network. Generally, it has been suggested that, for a fixed number of layers, the number of neurons in each layer be determined by trial and error. The numbers of neurons in the input and output layers are fixed by the respective numbers of input and output parameters of the model.

Although there is no definite way to determine the number of hidden layers, some researchers believe that many problems can be solved using a single middle (hidden) layer (Hornik 1989; Hecht-Nielsen 2010; Basheer 2000). However, to achieve acceptable results in this study, networks with one and two hidden layers were designed. The number of neurons in the hidden layer is another important parameter in network construction. If the number of hidden-layer neurons is too low, the network cannot reflect the nonlinear mapping between input and output with the required accuracy. On the other hand, an excessive number of hidden-layer neurons has a negative effect: it increases the number of network elements and the inaccuracy involved in learning and adjusting the weight vectors, and it increases the probability of overfitting (Menhag 2011; Kia 2010). To optimize the network, the inputs and outputs were normalized to the range −1 to +1. To predict the penetration rate, feedforward backpropagation networks with an input layer, two hidden layers, and an output layer were built, with a tansig transfer function applied in the hidden layer. To determine the number of neurons in the middle layer, various networks with different numbers of hidden-layer neurons were trained and evaluated, and the best network was selected among them. The PR values estimated by the networks were plotted against the real PR data, with actual values on the x-axis and predicted values on the y-axis. If all predicted PRs equal the actual PR values, all points lie on the line y = x. Deviation of points from this line indicates the degree of difference between the network output and the actual values. The optimum model structure was chosen based on the input parameters, R², and the root mean square error (RMSE) of network training and testing.
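The normalization to the range −1 to +1 and the RMSE criterion mentioned above can be sketched as follows. The linear min-max mapping is an assumption about how the scaling was done, and the function names and sample values are hypothetical.

```python
import math

def scale_to_unit_range(values):
    """Map values linearly so that the minimum becomes -1 and the maximum +1."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def unscale(scaled, lo, hi):
    """Invert scale_to_unit_range for a known original range [lo, hi]."""
    return [(s + 1.0) * (hi - lo) / 2.0 + lo for s in scaled]

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# Placeholder values, not measurements from the tunnel.
ucs = [40.0, 70.0, 100.0]
scaled = scale_to_unit_range(ucs)
restored = unscale(scaled, min(ucs), max(ucs))
print(scaled, restored)
print(rmse([2.0, 3.0], [2.5, 2.5]))
```

Network outputs produced on the scaled range would be passed through `unscale` before being compared with measured penetration rates, so that the RMSE is expressed in the original units.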
Based on the above points, the network model with the input parameters UCS, BTS, RMR, E, and CLI and a network structure of 5 × 12 × 5 × 1 (five inputs, 12 neurons in hidden layer 1, five neurons in hidden layer 2, and one output, which is the penetration rate) was found to be the most accurate in quantitative and qualitative terms for determining the PR (Table 4). The standard error of the mean (SEM) was applied to determine the optimal number of neurons in the middle layer. For this purpose, the error bands of models 9, 10, 11, and 12 were used because these models provide higher correlation coefficients. To evaluate the networks, three bands (Y = X ± 0.25, Y = X ± 0.5, and Y = X ± 1), shown in Table 5, were defined. The frequency of points in each of these three bands was taken as the evaluation criterion. This frequency indicates the accuracy of the network and the probability with which the network can predict the PR. Finally, model 10, with an average error of 0.29, was found to be appropriate for predicting the PR. The probability that the output of the 5 × 12 × 5 × 1 network differs from the actual penetration rate by less than 0.25, 0.5, and 1 m is 0.2, 0.047, 0.047 m, respectively. The three bands for the 5 × 12 × 5 × 1 network are shown in Fig. 3.

Table 4 Input parameters in network structure and values of correlation coefficients obtained from the neural network method
Table 5 Evaluation of network performance in models with different network structure
Fig. 3 Determination of error bands for network structure 5 × 12 × 5 × 1
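One possible reading of the error-band evaluation is to count the share of predictions that fall within Y = X ± 0.25, ± 0.5, and ± 1 of the measured value. The sketch below follows that reading; the function name and the sample values are hypothetical, and the study may have tabulated the frequencies differently.

```python
def band_frequencies(actual, predicted, half_widths=(0.25, 0.5, 1.0)):
    """Fraction of predictions with |predicted - actual| <= each band half-width."""
    errors = [abs(p - a) for a, p in zip(actual, predicted)]
    return {w: sum(e <= w for e in errors) / len(errors) for w in half_widths}

# Placeholder values, not results from the study.
actual = [2.0, 3.0, 4.0, 5.0]
predicted = [2.1, 3.4, 4.9, 6.5]
freq = band_frequencies(actual, predicted)
print(freq)  # one of four points lies within 0.25, two within 0.5, three within 1.0
```

A network whose points cluster on the line y = x would show frequencies close to 1 even for the narrowest band.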

Sensitivity analysis

Sensitivity analysis was used to determine the sensitivity of the model to changes in parameter values and in the structure of the model. Different values were assigned to a parameter to measure the resulting change in model behavior. In effect, sensitivity analysis allows the degree of confidence in a model with uncertain parameters to be determined. One of the major steps after modeling is determining the sensitivity of the modeling target to the input parameters. Typically, to determine the effect of the input parameters on the target, the variation of the test error is examined when one of the input parameters is removed. A larger difference between the actual and predicted values indicates a greater influence of the omitted parameter on the results. After applying this method to the existing data, the effects of the various parameters in the network were calculated. Table 6 shows that the uniaxial compressive strength (UCS) is the most effective parameter on the PR. Two other parameters, CLI and SIGMA, are less effective than the other parameters.

Table 6 Sensitivity analysis of input data and the objective function
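The omit-one-input procedure described above can be sketched generically. To keep the example short, a leave-one-out 1-nearest-neighbour predictor stands in for the trained ANN; this substitution, the function names, the input labels `input_A` and `input_B`, and the synthetic data are all assumptions made for illustration only.

```python
import math

def loo_nearest_neighbour_rmse(rows, targets):
    """Leave-one-out RMSE of a 1-nearest-neighbour predictor (stand-in model)."""
    squared = 0.0
    for i, row in enumerate(rows):
        best_j = min((j for j in range(len(rows)) if j != i),
                     key=lambda j: math.dist(row, rows[j]))
        squared += (targets[i] - targets[best_j]) ** 2
    return math.sqrt(squared / len(rows))

def omission_sensitivity(rows, targets, names):
    """Increase in test error when each input is removed in turn."""
    baseline = loo_nearest_neighbour_rmse(rows, targets)
    result = {}
    for k, name in enumerate(names):
        reduced = [r[:k] + r[k + 1:] for r in rows]  # drop column k
        result[name] = loo_nearest_neighbour_rmse(reduced, targets) - baseline
    return result

# Synthetic data: the target depends only on the first input.
rows = [(0.0, 0.5), (1.0, 0.1), (2.0, 0.9), (3.0, 0.0), (4.0, 0.6)]
targets = [10.0 * a for a, _ in rows]
sensitivity = omission_sensitivity(rows, targets, ["input_A", "input_B"])
print(sensitivity)
```

Removing `input_A` raises the error sharply while removing `input_B` leaves it unchanged, which is the pattern used to rank the inputs: the larger the error increase, the more influential the omitted parameter.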

A comparison was made between the multivariate regression and the neural network using input data consisting of intact rock mechanical parameters and rock mass classification. Table 7 shows that neural network model 10 (with UCS, BTS, RMR, E, and CLI as input data and the structure 5 × 12 × 5 × 1) has a higher R² and a lower RMSE than model 3 (multivariate regression with E, BTS, RSR, CLI, and UCS as input data). This implies that the neural network predicts the PR more accurately than multivariate regression.

Table 7 Comparison of multivariate regression and neural network with input data consisting of combined intact rock mechanical parameters and rock mass classification

Comparison between ANN and the NTH and Innaurato methods

To compare the performance of the different models designed for the Nowsoud tunnel, the average difference between the actual and predicted values of PR in the various models was plotted. Here, the lithological units along the tunnel route were grouped into three main units, and the actual values were compared with those of the different models. As shown in Fig. 4, the neural network model with input parameters E, CLI, UCS, BTS, and RMR and a network structure of 5 × 12 × 5 × 1 gives R² = 0.88, whereas R² is 0.57 for the NTH model. In addition, in units SH-ML and SH-LS the ANN method shows the smallest difference from the actual penetration rates, and in the Li unit the value predicted by the neural network model is closer to the actual penetration rate (Fig. 5).

Fig. 4 Comparison between the predicted PR from the ANN, NTH, and Innaurato methods

Fig. 5 Bar diagrams of average actual penetration rate versus penetration rate predicted by the prediction models in lithologic units a SH-ML, b SH-LS, and c Li

Discussion

The results indicate that three types of parameters affect the penetration rate of the TBM in this study: geotechnical parameters of intact rock, geotechnical parameters of the rock mass, and machine-related parameters. Among these, UCS has the greatest effect on the penetration rate in the rock units of the study area. This is because the strength and toughness of the rock material have a major influence on rock mass boreability and, in turn, on TBM performance (Salimi et al. 2016). For this reason, most previous models for predicting the penetration rate have used the uniaxial compressive strength of intact rock as a main input parameter (e.g., Graham 1976; Rostami and Ozdemir 1993; Bieniawski et al. 2006; Farrokh et al. 2012; Zare Naghadehi et al. 2018). Rock structure and discontinuity characteristics also affect the PR in the rock units of this study, as is clear from the correlation between the penetration rate and the rock mass classification systems RSR, RMR, and GSI. Among the rock mass classification systems, RSR shows the highest correlation with the penetration rate in the regression analyses (Table 2). Rock mass classification systems integrate characteristics related to rock structure and discontinuities, including the spacing and orientation of discontinuities, which have important impacts on TBM performance and penetration rate. In addition to the intrinsic characteristics of intact rock and rock mass, the machine-related factor CLI also influences the PR. Compared with the geotechnical characteristics, CLI has a lower impact on the penetration rate. In this study, regression analysis identified the geotechnical parameters that most strongly influenced the penetration rate of the TBM in the Nowsoud tunnel. Selecting the most influential parameters and applying the ANN method to integrate them resulted in reliable prediction models for the penetration rate. As presented in Fig. 4, the predicted values of the penetration rate are close to the observed ones. This observation supports the effectiveness of the selected geotechnical parameters and the efficiency of the ANN method. The influence of the geotechnical parameters on the penetration rate was also evaluated by sensitivity analysis after applying the artificial neural network. The results of the sensitivity analysis and the ranking of the influence of the geotechnical parameters on the penetration rate are illustrated in Fig. 6. As this figure shows, UCS is the most influential geotechnical parameter on the penetration rate, followed by BTS and RMR.

Fig. 6 Results of sensitivity analysis

Conclusions

The effect of geotechnical parameters on the penetration rate of the TBM in the Nowsoud tunnel was evaluated. Regression analyses showed that the penetration rate correlated well with the strength of intact rock and with discontinuity characteristics captured by rock mass classification systems. The strongest correlation was observed between the penetration rate and UCS. Based on the results of the regression analyses, the most influential parameters were then selected as input parameters for the ANN method. The penetration rates predicted by the ANN method were very close to the observed penetration rates in the study tunnel, indicating the effectiveness of the selected parameters for predicting the penetration rate. Sensitivity analysis of the impact of the input parameters on the ANN results showed that UCS has the highest impact on the ANN results and on the predicted penetration rates in the Nowsoud tunnel. BTS and RMR rank next in importance, respectively. It can be concluded that selecting influential geotechnical parameters and applying powerful methods such as ANN make it possible to predict TBM performance and penetration rates in various rock units, which is essential for time scheduling in tunneling projects.