Introduction

Performance prediction of tunnel boring machines (TBMs) is a significant factor in schedule planning and project cost estimation in mechanized tunneling (Shijing et al. 2006). Proper estimation of the TBM performance parameters is needed to reduce tunneling project risks. Over the past few decades, many techniques have been developed to produce more accurate TBM performance prediction models. These techniques can be categorized into three groups: theoretical and empirical (based on laboratory testing and field observation), statistical (based on mathematical rules), and computational (based on artificial intelligence techniques).

Several theoretical and empirical models have been developed to predict performance parameters of TBMs (Farmer and Glossop 1980; Sato et al. 1991; Rostami and Ozdemir 1993). Farmer and Glossop (1980) proposed a model that calculates penetration rate (PR) values using the rock tensile strength and average cutter force. A strong relationship between the uniaxial compressive strength (UCS) of rock and the specific energy was introduced by Sanio (1985). The Colorado School of Mines (CSM) model is named after the institution where it was developed. The first version of this model was developed by Ozdemir (1977) and updated by Rostami (1997). The CSM model is based on rock properties and on cutter and cutting geometry, and can predict the cutter forces for a given penetration (mm/rev). Yagiz (2002) modified the CSM model by adding the brittleness of intact rock and the fracture properties of the rock mass as indices. A review of previous investigations shows that the empirical and theoretical models are mainly based on the machine performance in given geological conditions, rock mass properties, and laboratory test results. The advantage of these models is their ability to account for the ground conditions, especially the rock mass behavior. However, they are of limited use in scenarios where new machine parameters are introduced, and they cannot be used in machine design and optimization. Some of the empirical models are based on a single rock or machine property and, according to Ramezanzadeh (2005), these are not sufficient to give an acceptable prediction of the TBM performance when many parameters influence it. In general, as mentioned by several studies (e.g., Grima et al. 2000; Yagiz et al. 2009; Gong et al. 2016), the prediction performance of empirical and theoretical models is relatively low and there is a need to develop new methods/models for TBM performance prediction.

Another group of TBM performance prediction models is considered as statistical approaches, which have been widely developed for the estimation of TBM performance. For example, Gong and Zhao (2009) performed a non-linear regression analysis on the data collected from two tunnels that were constructed in granitic rock masses in Singapore. They introduced an equation for estimating the rock mass boreability and stated that there is a meaningful relationship between the rock mass boreability index and four rock material and mass parameters (i.e., brittleness, UCS, orientation of joints, and joint count number). Statistical approaches suggest a relationship (i.e., linear and non-linear) between the independent/input and dependent/output parameters. Several researchers pointed out that these methods are not always robust enough to solve non-linear and complex problems (e.g., Grima and Babuška 1999). Additionally, the performance of the methods may be affected by the outliers and extreme values in the data. It can be mentioned that, in several cases, the performance of the statistical models is higher than that of the empirical and theoretical models (e.g., Benardos and Kaliampakos 2004; Benato and Oreste 2015), while in some other cases, the empirical and theoretical models are better (e.g., Rostami 1997). In order to improve the accuracy and reliability of the TBM performance predictive models, some other computational techniques, such as artificial intelligence (AI), may be considered as a suitable alternative.

AI techniques have been extensively developed to solve various problems in the field of geotechnical engineering (Hasanipanah et al. 2018; Koopialipoor et al. 2018a, b). To estimate TBM performance parameters such as PR and advance rate (AR), several techniques such as fuzzy logic (Ghasemi et al. 2014), artificial neural networks (ANNs) (Benardos and Kaliampakos 2004), particle swarm optimization (PSO) (Yagiz and Karahan 2011), adaptive neuro-fuzzy inference system (ANFIS) (Grima et al. 2000), imperialist competitive algorithm (ICA) (Armaghani et al. 2017), gene expression programming (Armaghani et al. 2018), and support vector regression (SVR) (Mahdevari et al. 2014) have been utilized. Grima et al. (2000) developed an ANFIS model for predicting PR values and found that the ANFIS method is more accurate than statistical methods. Benardos and Kaliampakos (2004) offered an ANN model to predict AR values. The SVR technique was applied by Mahdevari et al. (2014) to a database collected from a water tunnel in the United States. Salimi et al. (2016) utilized ANFIS and SVR models to recommend a model for TBM PR prediction. They demonstrated that SVR was more efficient than the ANFIS model. Furthermore, Armaghani et al. (2017) developed hybrid models (combinations of ANN with ICA and PSO) to predict the PR values of TBMs.

There are many published ANN-based models for solving geotechnical engineering problems (Gordan et al. 2018; Koopialipoor et al. 2018d). Although ANNs benefit from various gradient-based learning approaches to establish the prediction model, two major problems may occur during the learning process: (a) getting trapped in local minima and (b) slow convergence rate (Lee et al. 1991; Wang et al. 2004). One efficient way of alleviating these problems is to design an extended version of a pre-developed ANN, namely, a deep neural network (DNN); such networks have been developed and used recently (Hinton and Salakhutdinov 2006; Hinton et al. 2006; Salakhutdinov et al. 2013). A DNN model can provide a better implementation and, consequently, a higher prediction performance using the same database. DNNs have been successfully utilized in many application areas, such as image classification and speech recognition (Hinton et al. 2012; Zeng et al. 2016). This study investigates the potential of DNNs in predicting the PR of a TBM. A water transfer tunnel in Malaysia is studied and, based on the factors that most influence the PR, a DNN model is established and tested against an ANN model. Finally, to identify the most accurate and reliable predictive model for PR estimation, the two developed models are compared with each other.

Structure of DNNs

In recent years, motivated by the available evidence on structures in the human brain, researchers have tended to focus on the training of deep-layered neural networks (Serre et al. 2007). This is despite the fact that standard training of networks with more than two hidden layers often leads to poor results. When all layers are trained together through a standard criterion, such as the degree of similarity to the inputs or to the desired classes, the results may be worse than those of shallow models (Erhan 2011). This is due to the increase in the number of active local minima in these structures, which depends on the type of functions and the initialization of the network parameters (Bengio 2012). In other words, when training deep network structures, the problem of local minima becomes more pronounced as the number of layers and neurons increases (Plath et al. 2008). Therefore, appropriate initialization of the network weights seems to be necessary for training to converge. This requires several successive stages of pre-training, each of which is a prerequisite for the next, so that the training process can get past the barriers posed by local minima.

Training a deep belief network (DBN) was the first important contribution in the field of DNNs; a DBN is a neural network with many hidden layers and without any intra-layer connections (Hinton et al. 2006). The restricted Boltzmann machine (RBM), a generative stochastic neural network with no intra-layer connections, was used as the building block for DBN training. DBN training was mostly based on greedy layer-wise unsupervised learning of the RBMs. The same greedy layer-wise unsupervised training concept was also applied to DNNs with denoising autoencoders as building blocks (Vincent et al. 2010). A denoising autoencoder is a standard autoencoder trained to minimize the reconstruction error when noise is added to the input. A DNN built from denoising autoencoders in this way is called a stacked denoising autoencoder. Several algorithms have been suggested for training DNNs (Rifai et al. 2011a, b; Goodfellow et al. 2013). Figure 1 shows the architecture of a DNN training procedure, which includes unsupervised pre-training and supervised fine-tuning. The pre-training procedure treats each consecutive pair of layers as an RBM with the joint probability given in Eq. 1 (first row of Table 1). The variables in Fig. 1 are presented in the last column of Table 1.

Fig. 1 Architecture of a deep neural network (DNN)

Table 1 Summary of the equations used in deep neural network (DNN) design

Here, it is assumed that the visible units follow the Gaussian noise model with an identity covariance matrix if the input data are pre-processed by mean and variance normalization (Hinton et al. 2006). The RBM parameters can be trained efficiently in an unsupervised fashion by maximizing the likelihood of the training visible samples, approximated with the contrastive divergence algorithm. In DNN training, a Gaussian–Bernoulli RBM is used for the first layer, and a stack of Bernoulli–Bernoulli RBMs can be placed on top of it. The RBM parameters are then trained layer by layer.

After pre-training has initialized the weights of the first several layers, the parameters of the whole neural network, including the final output layer, are fine-tuned in a supervised manner (see Eq. 4 of Table 1).

To learn the mapping function with the set of parameters Θ, given the set of training sample pairs {(xr, yr)| r = 1,…, R}, the minimum mean squared error (MMSE) function is used as the training criterion; it can be formulated as shown in Eq. 5 of Table 1. In this equation, \( {\widehat{y}}_r \) is the predicted output vector obtained using Eq. 4. Equation 6 is an objective function that can be optimized with a back-propagation procedure using the conjugate gradient method in mini-batch mode.

As mentioned earlier, the pre-training procedure treats each consecutive pair of layers as an RBM with the joint probability defined in Eq. 1. When both v and h are binary, i.e., the Bernoulli–Bernoulli RBM, the energy function is calculated using Eq. 2; when h is binary and v is real-valued, i.e., the Gaussian–Bernoulli RBM, the energy function is given by Eq. 3 of Table 1. The fourth row of Table 1 shows the general multivariate regression problem. In order to decrease MMSE values, Eqs. 5 or 6 can be used. Figure 2 shows a flowchart of the DNN. Based on this flowchart, the data are first divided into two parts: training and testing. The DNN model is then developed using the training data. Each implementation section of the model involves different layers, i.e., training layer, hidden layer, and output layer, and all layers are then examined. This process continues until the system error reaches the desired/defined level.
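Since Table 1 is not reproduced in this text, the following is only a minimal sketch of the two RBM energy functions and the joint probability in their common textbook forms. The symbol names (`W` for the weight matrix, `b` and `a` for the visible and hidden biases) and the toy parameter values are assumptions made for illustration and may differ from the notation of Table 1.

```python
import math
from itertools import product


def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def interaction(v, W, h):
    """Bilinear term v^T W h, with W indexed as W[visible][hidden]."""
    return sum(v[i] * W[i][j] * h[j]
               for i in range(len(v)) for j in range(len(h)))


def energy_bernoulli(v, h, W, b, a):
    """Bernoulli-Bernoulli RBM: E(v, h) = -b.v - a.h - v^T W h (binary v, h)."""
    return -dot(b, v) - dot(a, h) - interaction(v, W, h)


def energy_gaussian(v, h, W, b, a):
    """Gaussian-Bernoulli RBM with unit-variance visible units:
    E(v, h) = 0.5 * ||v - b||^2 - a.h - v^T W h (real v, binary h)."""
    quadratic = 0.5 * sum((vi - bi) ** 2 for vi, bi in zip(v, b))
    return quadratic - dot(a, h) - interaction(v, W, h)


def joint_bernoulli(W, b, a):
    """Joint probability p(v, h) = exp(-E(v, h)) / Z, with Z computed by
    brute-force enumeration (feasible only for very small RBMs)."""
    n_v, n_h = len(b), len(a)
    states = [(v, h)
              for v in product((0, 1), repeat=n_v)
              for h in product((0, 1), repeat=n_h)]
    weights = {s: math.exp(-energy_bernoulli(s[0], s[1], W, b, a))
               for s in states}
    Z = sum(weights.values())
    return {s: w / Z for s, w in weights.items()}


# Toy parameters (hypothetical): 2 visible units, 2 hidden units.
W = [[0.5, -0.2], [0.1, 0.3]]
b = [0.0, 0.1]
a = [-0.1, 0.2]

p = joint_bernoulli(W, b, a)
print("number of joint states:", len(p))
print("sum of probabilities:", round(sum(p.values()), 6))
```

In practice the partition function Z cannot be enumerated for realistic layer sizes, which is why approximations such as contrastive divergence are used during pre-training.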

Fig. 2 Flowchart of the DNN

Case study and data collection

A tunnel project (Pahang–Selangor raw water transfer, PSRWT) with a total length of 44.6 km and a diameter of 5.23 m was constructed in Malaysia. The purpose was to transfer water from Pahang state to Selangor state. At the end of this tunnel, the water is conveyed by gravity flow through pipelines to the receiving basins of a planned treatment plant. The tunnel route and its location between Pahang and Selangor states in Malaysia are shown in Fig. 3. The PSRWT tunnel was excavated in the mountain area of Peninsular Malaysia with an elevation range of 100 to 1400 m. Six major faults were observed in this tunnel. As expected, the rock strength is poor in areas of fault intersections. Additionally, in fault areas, highly to moderately weathered zones were observed. Different rock types, including shale, coarse-grained granite, and medium-grained granite, were observed at various tunnel distances (TDs) of the PSRWT tunnel.

Fig. 3 The tunnel route and its location between Pahang and Selangor states in Malaysia

Three sections of TBMs and four sections of conventional drilling and blasting were planned to be excavated in the PSRWT tunnel project. The mentioned TBMs were used to excavate various ground conditions in different mass weathering zones from fresh to highly weathered. In the PSRWT tunnel project, 11,761 m in mixed ground, 11,761 m in very hard ground, and 11,218 m in blocky ground were excavated by TBM numbers 1 to 3, respectively. The dominant rock type was granite, with average rock quality designation (RQD) and rock mass rating (RMR) values of (40%, 67), (68%, 80), and (34%, 57) in mixed ground, very hard ground, and blocky ground, respectively.

In order to establish a proper database for predicting TBM PR values, the collected data were divided into two categories: (a) field observations and (b) laboratory tests. In total, a distance of 12,649 m of the PSRWT tunnel, excavated by the different TBMs, was investigated. To do this, panels with a typical length of 10 m were selected and, finally, 1286 panels were observed/measured. Parameters such as the degree of weathering, rock mass strength, spacing of discontinuities, number of joint sets, degree of roughness, alteration, infilling material, groundwater condition, cutterhead thrust force (TF), revolutions per minute (RPM), stroke speed, rate of penetration, boring energy, and cutterhead torque were measured and recorded carefully in every panel. It should be noted that, among the studied panels, 560 panels, 553 panels, and 173 panels were identified as fresh, slightly weathered, and moderately weathered zones, respectively. In addition, laboratory tests were conducted on more than 150 block samples collected from the tunnel site. Relevant tests such as Schmidt hammer, Brazilian tensile strength (BTS), UCS, point load strength, and p-wave velocity were carried out in accordance with the suggested methods of the International Society for Rock Mechanics (ISRM) (Ulusay and Hudson 2007).

Effective parameters and input selection

Investigation of the influential parameters on TBM performance is considered as a necessary task to select input parameters in implementing AI techniques. According to Grima et al. (2000), all influential parameters on TBM performance can be classified into three categories: (1) rock mass properties, (2) machine characteristics, and (3) tunnel geometry. In this section, the previous related investigations are reviewed and, subsequently, the most influential factors on TBM performance are selected.

Based on the previous investigations, the most influential parameter of intact rock that can affect the PR is the rock compressive strength (Roxborough and Phillips 1975; Farmer and Glossop 1980; Rostami 1997; Bieniawski and Grandori 2007; Yagiz et al. 2009). Greater rock strength generally results in a lower PR (Grima et al. 2000). Sapigni et al. (2002) mentioned that RMR and rock compressive strength may influence TBM performance significantly. RMR, RQD, and UCS were considered as the most effective rock properties on TBM performance in the study conducted by Benardos and Kaliampakos (2004). Simoes and Kim (2006) conducted a research study of TBM performance estimation and considered RQD, RMR, and groundwater inflow as the most influential factors. In another study, Gong and Zhao (2009) used parameters related to joint condition, BTS, and UCS as independent variables in the analysis of PR prediction. Additionally, rock tensile strength was successfully applied for prediction of the PR in several studies (Eftekhari et al. 2010; Yagiz and Karahan 2011; Salimi and Esmaeili 2013; Shao et al. 2013). As stated by several researchers such as Yagiz (2008), rock mass weathering can affect the TBM performance significantly. Mogana (2007) mentioned that tunneling conditions and TBM performance are sensitive to the effects of increased degrees of weathering. Benardos and Kaliampakos (2004) also used the weathering degree of the rock mass as an influential parameter in their TBM performance model.

According to Grima et al. (2000), the most significant characteristic of a TBM is the TF. The number of cutters mounted on a TBM lets users compute the maximal thrust per cutter, which is employed in the majority of the models. In addition, the maximal torque, maximal power, and maximal RPM are functions of the TF. A database comprising more than 600 TBM projects was established by Grima et al. (2000) to estimate TBM performance when developing an ANFIS predictive technique. In terms of machine characteristics, they used TF, RPM, and cutter diameter in their model and found that the PR values increased as the values of RPM increased. According to Maidl et al. (2008), the spacing of disk tracks, disk type, thrust, and torque are the TBM parameters that most affect the PR. In another study conducted by Farrokh et al. (2012) for prediction of the PR, RPM and normal force of the disk cutter were considered as model inputs and effective parameters. Mahdevari et al. (2014) utilized different types of machine factors in their model, such as cutterhead torque, cutterhead power, TF, and TBM-driven direction, and, after performing a sensitivity analysis, concluded that TF is the most effective factor (in terms of machine characteristics) for TBM performance prediction.

The geometry of the tunnel is known to be a parameter of great importance. According to Grima et al. (2000), many parameters, including the rock mass stability, torque, RPM, and the total amount of consumed power, are affected by the tunnel diameter. Generally, an increase in diameter leads to a decrease of the PR per length (Maidl et al. 2008). Moreover, the tunnel diameter was considered by Farrokh et al. (2012) as an independent variable for TBM performance prediction. Note that, since geometry-related parameters such as the tunnel diameter are constant in the PSRWT tunnel, they are not taken into consideration in this study.

A degree of multicollinearity appears to exist among several parameters, i.e., RQD, UCS, and RMR, and this may impact the final results. However, each of them describes a particular aspect of the rock mass and their information can be used for deeper understanding and clarification of further related problems. The RMR classification comprises six different parameters: RQD, UCS, the groundwater condition, the condition of discontinuities, the spacing of discontinuities, and an adjustment for discontinuity orientation. To include the effects of the other RMR factors, such as the condition of discontinuities, which are significant parameters, RMR can be set as a model input to estimate the PR.

According to the above discussion, five parameters, namely weathering zone (WZ), RQD, RMR, UCS, and BTS, as rock mass and material properties and two parameters, RPM and TF, as machine characteristics were chosen as model inputs to predict the PR of TBMs. Although some other effective factors could be considered as independent variables, according to Armaghani et al. (2015), a model with a lower number of inputs has an advantage as it can reduce the model complexity. A database comprising 1286 datasets (one per panel) was established and used in the modeling of AI techniques in this study. In order to use WZ as an input, a rating was assigned to each zone and used in the established database: values of 1 to 3 were assigned to fresh, slightly weathered, and moderately weathered zones, respectively. For RQD, UCS, RMR, BTS, WZ, TF, and RPM, minimum, maximum, and average values of (6.25, 95, and 44.15), (40, 185, and 107.45), (44, 95, and 64.73), (4.69, 15.68, and 8.43), (1, 3, and 1.7), (80.6, 565.84, and 321.5), and (4.08, 11.95, and 8.84) were considered, respectively. Additionally, values of 0.12, 5.82, and 2.61 were measured for the minimum, maximum, and average PR, respectively. In Table 2, 100 of the 1286 datasets used are presented to give readers a better picture of the data. In the following sections, ANN and DNN models are established in order to predict the PR, then the obtained results are evaluated, and, finally, the best model is selected.
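The weathering-zone rating described above can be sketched in a few lines of Python. The rating values and the panel counts (560, 553, and 173) are taken from the text; the column order, the `encode_panel` helper, and the example panel are hypothetical and serve only to illustrate how a categorical zone label might be turned into a numeric model input.

```python
# Rating assigned to each weathering zone (from the text).
WZ_RATING = {
    "fresh": 1,
    "slightly weathered": 2,
    "moderately weathered": 3,
}

# Number of panels observed in each zone (from the text).
PANEL_COUNTS = {
    "fresh": 560,
    "slightly weathered": 553,
    "moderately weathered": 173,
}

# Assumed ordering of the seven model inputs (hypothetical).
INPUT_ORDER = ("RQD", "UCS", "RMR", "BTS", "WZ", "TF", "RPM")


def encode_panel(panel):
    """Return the seven inputs as floats, replacing the WZ label by its rating."""
    row = dict(panel)
    row["WZ"] = WZ_RATING[row["WZ"]]
    return [float(row[name]) for name in INPUT_ORDER]


# Consistency check: the mean rating implied by the panel counts.
total_panels = sum(PANEL_COUNTS.values())
mean_wz = sum(WZ_RATING[zone] * n for zone, n in PANEL_COUNTS.items()) / total_panels

# Hypothetical panel with values inside the reported min/max ranges.
example_panel = {
    "RQD": 50.0, "UCS": 110.0, "RMR": 65.0, "BTS": 8.5,
    "WZ": "slightly weathered", "TF": 320.0, "RPM": 9.0,
}

print("total panels:", total_panels)
print("mean WZ rating:", round(mean_wz, 2))
print("encoded inputs:", encode_panel(example_panel))
```

The mean rating computed from the panel counts, (560·1 + 553·2 + 173·3)/1286, rounds to the average WZ value of 1.7 reported above.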

Table 2 Some of the data used for modeling the penetration rate (PR)

Model development

ANN model

In this study, an ANN model is developed for TBM PR prediction as a baseline against which the capabilities of the DNN predictive model can be demonstrated. In general, ANNs are appropriate and useful for problems with complex relationships between their variables. The successful use of feed-forward networks in ANN implementations has been reported (Simpson 1990). ANNs must be trained with learning algorithms, which adjust the network weights, in order to approximate the problem. The most commonly used algorithm for the training of ANNs is back-propagation (BP) (Dreyfus 2005). BP algorithms decrease the model error between the output values (obtained by the system) and the target values. The model error, measured, for example, by the root mean square error (RMSE), should be smaller than a defined threshold; otherwise, the error is propagated back to adjust the network weights.

To develop intelligent systems for predicting TBM PR, the first modeling step is to divide the established database randomly into training and testing datasets. To perform this task, five different random datasets were created from all of the prepared data (for use in both ANN and DNN models). Training/testing percentages of (80, 20), (75, 25), and (70, 30) have been suggested in previous investigations. However, according to Swingler (1996), the best model development and evaluation can be obtained using the (80, 20) combination. Therefore, in this study, 80% of the data (1029 datasets) was used for network training or model development and 20% (257 datasets) was used for network testing or model evaluation. Based on the created datasets, five ANN models were developed to predict TBM PR and were evaluated based on the coefficient of determination (R2) and RMSE, as presented in Table 3. The formulas for calculating R2 and RMSE are shown in the following equations:

$$ {\mathrm{R}}^2=1-\frac{\varSigma_{\mathrm{i}=1}^{\mathrm{N}}{\left(y-{y}^{\prime}\right)}^2}{\varSigma_{\mathrm{i}=1}^{\mathrm{N}}{\left(y-\tilde{y}\right)}^2} $$
(1)
$$ \mathrm{RMSE}=\sqrt{\frac{1}{\mathrm{N}}{\varSigma}_{\mathrm{i}=1}^{\mathrm{N}}{\left(y-{y}^{\prime}\right)}^2} $$
(2)

where y and y′ are the measured and predicted values, respectively, \( \tilde{y} \) is the mean of the y values, and N is the total number of data. A perfect model with excellent predictability can achieve an R2 value of 1 and an RMSE of 0.
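The 80/20 random split and the two performance indices can be illustrated with a short, standard-library Python sketch. The function names, the seeds, and the toy measured/predicted values are assumptions for illustration; this is not the software used in the study.

```python
import math
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def r2_score(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - yp) ** 2 for y, yp in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((y - yp) ** 2 for y, yp in zip(measured, predicted)) / n)


# Five random 80/20 partitions of the 1286 panels (seeds are arbitrary).
partitions = [split_indices(1286, 0.8, seed) for seed in range(5)]
train_idx, test_idx = partitions[0]
print("training size:", len(train_idx), "testing size:", len(test_idx))

# Toy values (hypothetical) showing the perfect-model case.
measured = [1.0, 2.0, 3.0]
print("R2:", r2_score(measured, measured), "RMSE:", rmse(measured, measured))
```

Rounding 0.8 × 1286 = 1028.8 to the nearest integer reproduces the 1029/257 partition quoted above.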

Table 3 Results of the artificial neural network (ANN) models for predicting the PR

As shown in Table 3, R2 value ranges of (0.6971–0.7282) and (0.6379–0.6738) were obtained for the five training and testing sets, respectively, which indicate an acceptable accuracy level in predicting TBM PR. However, selection of the best ANN model is often challenging because the obtained values for the R2 and RMSE results are very close to each other (Koopialipoor et al. 2017, 2018c; Ghaleini et al. 2019). A ranking technique developed by Zorlu et al. (2008) was used to solve this problem. Based on this method, the RMSE and R2 were ordered in their class and the best performance index (highest R2 and lowest RMSE) was assigned the highest rating. For instance, values of 0.6971, 0.701, 0.721, 0.7087, and 0.7282 were obtained for R2 of the training dataset for models 1 to 5, respectively. The ranking results of the five mentioned dataset models were respectively obtained as 1, 2, 4, 3, and 5. Then, a summation of all ranking values for each model is calculated as the total rank (the last column of Table 3). As can be seen, the last ANN model with a total rank of 17 outperforms the other ANN models.
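The ranking step can be sketched as follows. The training R2 values are those quoted in the text; the helper names and the tie-breaking behavior (ties keep their input order) are assumptions, since the text does not describe how ties are handled.

```python
def rank_values(values, higher_is_better):
    """Rank models on one index: the best value gets the highest rank (n),
    the worst gets rank 1."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i],
                   reverse=not higher_is_better)
    ranks = [0] * len(values)
    for rank, model_index in enumerate(order, start=1):
        ranks[model_index] = rank
    return ranks


def total_ranks(columns):
    """Sum the per-index ranks of each model.

    `columns` is a list of (values, higher_is_better) pairs, e.g. training R2,
    training RMSE, testing R2, and testing RMSE.
    """
    per_column = [rank_values(values, hib) for values, hib in columns]
    return [sum(model_ranks) for model_ranks in zip(*per_column)]


# Training R2 of ANN models 1 to 5 (from the text).
r2_train = [0.6971, 0.701, 0.721, 0.7087, 0.7282]
r2_train_ranks = rank_values(r2_train, higher_is_better=True)
print("training R2 ranks:", r2_train_ranks)
```

For R2 the flag `higher_is_better` is true, while for RMSE it would be false, so that the lowest error receives the highest rank before the columns are summed.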

DNN model

The DNN structure used for predicting the PR is displayed in Fig. 4. As shown in this figure, the network has three hidden layers, with 8, 32, and 8 neurons in hidden layers 1 to 3, respectively. This structure was adopted to increase the system's efficiency. In these analyses, a layer-wise pre-training method was used: each RBM generates the initial weights of its layer, which are then transferred automatically to the next layer.
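To make the layer sizes concrete, the sketch below builds a 7-8-32-8-1 feed-forward network (seven inputs, the three hidden layers described above, and one PR output) and runs a single forward pass. The sigmoid hidden activation, the linear output unit, the random initial weights, and the normalized input vector are all assumptions; in the study, the initial weights come from RBM pre-training and are then fine-tuned by back-propagation.

```python
import math
import random

# Seven inputs, hidden layers of 8, 32, and 8 neurons, one output (PR).
LAYER_SIZES = [7, 8, 32, 8, 1]


def init_layers(sizes, seed=0):
    """Create (weights, biases) per layer with small random weights.
    Random initialization stands in for the pre-trained weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def forward(layers, inputs):
    """Propagate one input vector through the network.
    Hidden layers use the sigmoid; the output layer is linear (assumed)."""
    activation = list(inputs)
    last = len(layers) - 1
    for k, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + bias
             for row, bias in zip(weights, biases)]
        activation = z if k == last else [sigmoid(v) for v in z]
    return activation


layers = init_layers(LAYER_SIZES)
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

# Hypothetical input vector, assumed to be scaled to [0, 1] beforehand.
x = [0.5] * 7
output = forward(layers, x)
print("trainable parameters:", n_params)
print("untrained network output:", output)
```

Under these assumptions the network has 625 trainable parameters (64 + 288 + 264 + 9), which is small relative to the 1029 training datasets.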

Fig. 4 Structure of the DNN model for predicting the PR

In the next step, the BP algorithm was used to determine the values of the DNN weights accurately. This process is repeated for subsequent layers until the output layer is reached. This network, like the ANN model, was implemented for the five different datasets. The results of the DNN technique are presented in Table 4 for all five datasets. According to the ranking method mentioned previously, total rank values of 12, 12, 12, 13, and 11 were achieved for DNN models 1 to 5, respectively. These total ranks are very close to each other and the models are similar. Model number 4 has a slightly better rank and is selected as the best DNN model. It seems that, by developing DNN models, the performance capacity of the network can be improved significantly in terms of both the R2 values and the RMSE. When the R2 results are considered, an increase of about 0.2 and 0.3 is observed for the training and testing datasets, respectively, when the DNN technique is used in the estimation of the TBM PR.

Table 4 Results of the DNN models for predicting the PR

Results and discussion

In this section, the best developed DNN and ANN models in estimating the TBM PR are evaluated in detail. The evaluation was performed based on the two performance indices of R2 and RMSE. Table 5 shows the obtained R2 and RMSE results for the best DNN and ANN models in estimating the TBM PR.

Table 5 The best DNN and ANN models for predicting the PR

As shown in Table 5, the best model of DNN in comparison with the best model of ANN gives much better results (R2 = 0.7282 and 0.6659 for the training and testing of the ANN and R2 = 0.9339 and 0.9336 for the training and testing of the DNN). Additionally, in terms of system error, a significant reduction can be observed by proposing the DNN model, especially when testing datasets are taken into consideration (from 0.07144 by the ANN to 0.0319 by the DNN). As shown in Fig. 5, in all five datasets, the R2 results of the DNN models are better than those of the ANN models. In addition, Fig. 6 shows a comparison between the errors obtained by the ANN and DNN models. As a result, system errors can be minimized by developing a new learning algorithm based on DNNs in estimating the TBM PR. The predicted and actual PR values for the best ANN and DNN predictive models are displayed in Figs. 7 and 8, respectively. These figures clearly show that the DNN model can provide a higher degree of accuracy because of the higher R2 values. It is worthwhile to mention that the DNN model developed in this study is among the first few models in the field of rock mechanics and tunnel engineering.

Fig. 5 R2 results for all five datasets of the DNN and ANN models for predicting the PR

Fig. 6 RMSE results of the DNN and ANN models for predicting the PR

Fig. 7 Predicted and actual PR values by the ANN model for the training and testing datasets

Fig. 8 Predicted and actual PR values by the DNN model for the training and testing datasets

In order to show the capability of the DNN developed in this study (with an R2 value of 0.934), a comparison (based on the R2 results) with the existing intelligent techniques for prediction of the PR has been made. R2 values of 0.9, 0.69, and 0.72 were obtained by Yagiz et al. (2009), Eftekhari et al. (2010), and Gholami et al. (2012), respectively, for prediction of the PR using ANN models. Ghasemi et al. (2014) proposed a fuzzy logic model with an R2 value of 0.89 in estimating the PR of TBMs. In another study of intelligent systems, a PSO predictive model with an R2 value of 0.67 was developed by Yagiz and Karahan (2011) for PR prediction. Oraee et al. (2012) conducted research based on an ANFIS predictive model to predict the PR and concluded that the performance of their model is suitable (with an R2 value of 0.69) in this field. Armaghani et al. (2017) used two hybrid ANN models (PSO-ANN and ICA-ANN) for predicting TBM PR; their best R2 result, obtained with the ICA-ANN hybrid model, was not higher than 0.92. According to the above discussion, the developed DNN model yields a higher R2 than the existing intelligent systems and it can be used in various applications of geotechnics.

Conclusions

A deep neural network (DNN) model was developed to predict the performance of tunnel boring machines (TBMs) in a water transfer tunnel in Malaysia. The TBM penetration rate was estimated using the DNN model. A review of existing theoretical, empirical, and statistical models for the prediction of TBM performance was provided, and this study focused on an artificial intelligence (AI) technique as a suitable alternative. After describing the structure of the DNN, a case study was presented and the collected database was discussed. A total length of 12,649 m of the tunnel was investigated and used in the model development. To predict the penetration rate (PR) of the TBMs, five parameters, namely weathering zone, rock quality designation (RQD), rock mass rating (RMR), uniaxial compressive strength (UCS), and Brazilian tensile strength (BTS), were selected as the rock mass independent variables, and two parameters, revolutions per minute (RPM) and thrust force (TF), were selected as the machine characteristics independent variables. Using the same database, five DNN models for prediction of the PR were developed and compared with each other and with five artificial neural network (ANN) models. The evaluation was performed based on two performance indices, R2 and the root mean square error (RMSE). The best DNN model provided much better results than the best ANN model. The developed DNN model achieved better performance than the existing intelligent systems and demonstrated potential for use in other geotechnical applications.