Introduction

In recent decades, mechanized longwall mining has been moving toward high-speed extraction under the influence of strict safety and health standards (Peng 2020). In each longwall panel, two tunnels are driven on either side of the panel to carry personnel, supplies, extracted coal, and ventilating air. Figure 1 shows a schematic view of a retreat longwall panel developed by two tunnels. These tunnels are the lifeline of the panel and have distinct functions during mining. The headgate is used for haulage of extracted material, personnel access, and transportation of supplies. The tailgate, which is situated on the other side of the panel, is mainly used for egress and return air.

Fig. 1 A longwall panel along with tailgate and headgate roadways

In practice, adjacent longwall panels are designed to be extracted in sequence, so the headgate of the previous panel serves as the tailgate of the next one. This leads to high stresses around the tailgate roadway, especially around its T-junction, because the abutment stresses from the two adjacent panels are superimposed. Because of these high induced stresses, the stability of the tailgate roadway in a mechanized longwall panel is critical for both safety and continuous production. Given the high investment costs, interruptions to mining operations caused by an unstable tailgate are unacceptable.

Various parameters, including geometrical parameters, geological conditions, advance rate, panel orientation, mining direction, barrier pillar sizes, in-situ and induced stresses, support systems, and other geomechanical conditions, are important, and each may play a significant role in tailgate stability. However, the areas most at risk of roof strata instability during longwall advancement are where the headgate and tailgate roadways intersect the longwall face, i.e., the T-junctions. The problem is more severe at the intersection of the tailgate and the coalface, i.e., the tailgate T-junction (Chen et al. 2017; Peng and Biswas 1994).

An unstable tailgate would not only slow down or delay mining operations but could also cause incidents leading to injuries or fatalities. Reliable tailgate support design is a complicated, case-specific procedure that mostly depends on experience or trial and error. Any tailgate instability is therefore critical in mechanized longwall mining and may be responsible for roof failure, mine downtime, and a threat to personnel safety. Although support system design continues to advance, and many empirical, analytical, and numerical design procedures have been developed to control roof failures, problematic tailgate behavior is still a major concern in mechanized longwall panels. This research aims to develop an Improved Support Vector Regression (ISVR) model to predict unstable zones in the tailgate roadway based on the geomechanical information that is routinely collected during mine development.

Literature review

When a panel in mechanized longwall mining is exploited, the ground tends to move away from the high-stress zones to the low-stress areas. In fact, the in situ stresses are disturbed due to mining and will be redistributed after face advancement to create new equilibrium conditions. Because the induced stresses cannot be transferred through broken rock mass, they bring about stress concentrations at the boundaries of the roadways, especially around the T-junctions.

Numerous studies have been conducted to gain a better understanding of tailgate stability in longwall mines. Techniques based on field experience, experimental and analytical analyses, and numerical simulations have been developed by different researchers to address the problems of tailgate instability and to provide safe and sustainable working conditions at the mine (Chen et al. 2017; Sears et al. 2019; Wang et al. 2018).

Some practical approaches for tunneling in coal mines under challenging ground conditions were introduced by Hudewentz and Luecker (1983). Heidarieh-Zadeh and Smith (1985) investigated the stability behavior of coal mine roadways using closure data at specific distances from the coalface. Cox (1994) investigated this problem through a statistical analysis of tailgate convergence versus face distance over five longwall panels and found an exponential relationship between maximum convergence and the distance from the coalface. Seedsman (2001) described the failure mechanisms of longwall tailgates by following the failure and stress paths in the roof strata in order to propose an appropriate support system for the tailgate roadway. Barczak et al. (2008) also emphasized that the longwall tailgate is subjected to severe loading as the longwall face approaches and passes. Tarrant (2003) combined empirical and analytical methods for statistical assessment of the longwall tailgate layout and the support design procedure. Esterhuizen and Barczak (2006) developed ground response curves to design the support systems for the longwall tailgate. Jiang et al. (2016) presented an analytical model based on an elastic beam resting on a Winkler-type foundation for roof stability analysis of stratified roadways in coal mines. Buddery et al. (2018) introduced a remote reading telltale as a reliable real-time monitoring system to monitor tailgate or headgate stability and to provide valuable data for ongoing support design. Kang et al. (2018) combined physical and numerical methods to gain a better understanding of the failure mechanisms associated with sudden roof collapse in longwall faces. Zhu et al. (2018) indicated that the zone along the tailgate and ahead of its T-junction is important in terms of rockburst potential, and that the support systems installed in such areas can be completely destroyed. Kang et al. (2019) investigated the mechanics of load transfer in longwall coal mining to identify the zone of high pressure in front of the coalface. Esterhuizen et al. (2019) analyzed tailgate stability by numerical modeling to determine the support requirements. Wang et al. (2019) developed a numerical model to investigate the stress redistribution in longwall tailgates during face advancement.

A recent trend in tackling such complex non-linear problems is to resort to machine learning algorithms such as SVR, which has proved effective in solving non-linear regression problems in various engineering fields. Li et al. (2011) established a model based on SVR and time-series analyses to predict surface movements over coal mines. Mahdevari et al. (2013) proposed a dynamic model based on SVR to predict tunnel convergence during excavation. Li et al. (2016) assessed tunnel stability by combining uniform design and SVR. Pu et al. (2019) summarized some applications of SVR to the prediction of rockbursts. Shi et al. (2019) predicted the settlement in shallow tunnels using a time-series model based on SVR. Liu et al. (2019) predicted rock mass parameters in mechanized tunneling with an SVR model. In this research, SVR, as a widespread and robust technique, is improved to predict the roof displacements in the longwall tailgate.

Case study

Tabas mine is an underground coal mine in Iran that is worked by the mechanized longwall mining method. The mine is situated about 85 km south of Tabas County, South Khorasan province (Fig. 2). The annual coal production from each longwall panel is about 1.5 Mt. Geologically, the mine lies in the Parvadeh coalfield, which is delimited by two major north-south trending fault systems, namely the Kalmard and Nayband faults (IRITEC 2003).

Fig. 2 Position of Tabas coal mine, South Khorasan, Iran

The asymmetrical Parvadeh anticline is located on the south side of the Rostam fault. The mine is developed in the central part of the anticline in an area of 1200 km². Rock formations in the vicinity of the Rostam fault have undergone severe deformation due to tight folding and numerous faults.

The rock strata in the Tabas coal basin are typically mudstone with noticeable coarsening-upward siltstone and sandstone sequences. Thin marine limestone layers are observed locally. The main coal horizons in the Parvadeh coalfield are seams D, C2, C1, B2, and B1. The primary coal seam to be extracted is C1, which was developed from its outcrop on the south side of the Parvadeh anticline towards the south-west. The thickness of seam C1 varies from 1.5 to 2.2 m (IRITEC 2003).

Theory

Support vector regression

The Support Vector Regression (SVR) concept, which was developed on the basis of the Vapnik-Chervonenkis (VC) theory, is now a well-established method for tackling problems of high-dimensional function approximation (Cortes and Vapnik 1995; Vapnik et al. 1997). In brief, VC theory characterizes how well a learning machine generalizes to unseen data. The SVR algorithm uses support vectors to solve function estimation problems by introducing a loss function. Furthermore, by applying the Structural Risk Minimization (SRM) principle, SVR simultaneously minimizes the VC dimension and the empirical risk, which yields robust generalization from a limited number of learning patterns (Jap et al. 2015). The SVR approximating function can be written in linear form as:

$$ f(x)=w\cdot \varPhi (x)+b,\quad \mathrm{with}\quad w\in {\mathbb{R}}^d,\quad b\in \mathbb{R} $$
(1)

where b and w are the bias and the weight vector, respectively, and Φ is the non-linear mapping from the input space \( {\mathbb{R}}^d \) to the high-dimensional feature space. The objective of the SVR algorithm is then to estimate, for a given training set \( \left\{\left({x}_1,{y}_1\right),\dots, \left({x}_n,{y}_n\right)\right\}\subset {\mathbb{R}}^d\times \mathbb{R} \), a function f(x) that deviates from the real targets \( {y}_i \) by at most ε and is at the same time as flat as possible (Vapnik 2000). Therefore, errors smaller than ε are accepted, while any deviation larger than ε is penalized by the ε-insensitive loss function:

$$ L\left(\xi \right)=\left\{\begin{array}{ll}0& \mathrm{if}\ \left|\xi \right|\le \varepsilon \\ {}\left|\xi \right|-\varepsilon & \mathrm{otherwise}\end{array}\right. $$
(2)

where ε is the approximation accuracy imposed on the training data. Flatness in Eq. (1) is ensured when the norm \( {\left\Vert w\right\Vert}^2 \) is minimized (Cortes and Vapnik 1995):

$$ \min J=\frac{1}{2}{\left\Vert w\right\Vert}^2+C{\sum}_{i=1}^n\left({\xi}_i^{+}+{\xi}_i^{-}\right) $$
(3)

Subject to: \( \left(w\cdot \varPhi \left({x}_i\right)+b\right)-{y}_i\le \varepsilon +{\xi}_i^{+} \)

$$ {y}_i-\left(w\cdot \varPhi \left({x}_i\right)+b\right)\le \varepsilon +{\xi}_i^{-} $$
$$ {\xi}_i^{+},\ {\xi}_i^{-}\ge 0 $$

where the slack variables \( {\xi}_i^{+} \) and \( {\xi}_i^{-} \) measure the training errors above and below the ε-insensitive tube, respectively. In Eq. (3), the term \( \frac{1}{2}{\left\Vert w\right\Vert}^2 \) controls the complexity of the function, and the term \( C{\sum}_{i=1}^n\left({\xi}_i^{+}+{\xi}_i^{-}\right) \) is the empirical risk. Consequently, the empirical and structural risks are both minimized in the SVR algorithm. The parameter C is the regularization constant and determines the trade-off between structural and empirical risks.
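As a minimal sketch of Eqs. (2) and (3), the snippet below evaluates the ε-insensitive loss and the resulting objective J for a fixed pair (w, b), with each slack variable set to its smallest feasible value. The function names and the toy numbers are illustrative assumptions and are not taken from the Tabas dataset.

```python
import numpy as np


def eps_insensitive_loss(residual, eps):
    """Eq. (2): zero inside the eps-tube, |residual| - eps outside it."""
    return np.maximum(0.0, np.abs(residual) - eps)


def primal_objective(w, b, features, y, C, eps):
    """Eq. (3) at a given (w, b), with the slacks at their smallest feasible values.

    `features` holds Phi(x_i) row by row; for a linear model it is simply X.
    """
    residual = features @ w + b - y
    structural_risk = 0.5 * float(w @ w)
    empirical_risk = C * float(eps_insensitive_loss(residual, eps).sum())
    return structural_risk + empirical_risk


# Toy values for illustration only (hypothetical, not measured data).
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 2.5])
w = np.array([1.0])
J = primal_objective(w, b=0.0, features=X, y=y, C=10.0, eps=0.1)
print(J)
```

Only the third sample lies outside the tube, so it alone contributes to the empirical risk; increasing C makes that violation more expensive relative to the flatness term.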

Various approaches have been proposed to solve Eq. (3). A simple one is the dual formulation, which yields the function f(x) through quadratic optimization. Because the problem is convex, this approach gives a unique solution that is not trapped in a local extremum. Based on the Karush-Kuhn-Tucker (KKT) conditions, the dual formulation can be rewritten in terms of the Lagrange multipliers \( {a}_i^{\ast } \) and \( {a}_i \), the training data, and the constant b (Vapnik 2000):

$$ f(x)=\sum \limits_{i=1}^n\left({a}_i-{a}_i^{\ast}\right)\left\langle {x}_i,x\right\rangle +b $$
(4)

in which:

$$ \sum \limits_{i=1}^n\left({a}_i-{a}_i^{\ast}\right)=0\kern1em \mathrm{and}\kern0.75em 0\le {a}_i^{\ast },{a}_i\le C $$
(5)

The KKT conditions imply that \( {a}_i^{\ast } \) and \( {a}_i \) are both zero for every sample that lies strictly inside the ε-tube, i.e., \( \left|f\left({x}_i\right)-{y}_i\right|<\varepsilon \). Therefore, the whole training set is not needed to calculate f(x); only the training data with an approximation error equal to or larger than ε (\( {a}_i^{\ast } \) or \( {a}_i\ne 0 \)) are used as the support vectors.

Kernel functions are employed to carry out the non-linear mapping of the input data into a high-dimensional feature space. The value of a kernel function equals the inner product of the two vectors \( \varPhi \left({x}_i\right) \) and \( \varPhi \left({x}_j\right) \) in the feature space, i.e., \( K\left({x}_i,{x}_j\right)=\varPhi \left({x}_i\right)\cdot \varPhi \left({x}_j\right) \). Therefore, f(x) can be rewritten in terms of the kernel as (Vapnik 2000):

$$ f(x)=\sum \limits_{i=1}^n\left({a}_i-{a}_i^{\ast}\right)K\left({x}_i,x\right)+b $$
(6)

There are many kernel functions to produce support vectors. This research employs the Gaussian kernel function as a general radial basis function:

$$ K\left({x}_i,{x}_j\right)=\exp \left(-\gamma {\left\Vert {x}_i-{x}_j\right\Vert}^2\right) $$
(7)

where γ > 0 is the kernel parameter that controls the width of the Gaussian kernel (γ = 1/(2σ²) for a Gaussian with standard deviation σ), and ‖ ‖ is the Euclidean norm.
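The kernel expansion in Eq. (6) with the Gaussian kernel of Eq. (7) can be sketched as follows. The support vectors, dual coefficients, and bias used here are made-up illustrative values, not fitted ones, and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Eq. (7): K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def svr_predict(x_new, support_vectors, dual_coef, b, gamma):
    """Eq. (6): f(x) = sum_i (a_i - a_i*) K(x_i, x) + b."""
    K = gaussian_kernel(support_vectors, x_new, gamma)  # shape (n_sv, n_new)
    return dual_coef @ K + b


# Illustrative (hypothetical) support vectors and coefficients.
sv = np.array([[0.0, 0.0], [1.0, 0.0]])
coef = np.array([0.5, -0.5])  # a_i - a_i*, summing to zero as Eq. (5) requires
pred = svr_predict(np.array([[0.0, 0.0]]), sv, coef, b=2.0, gamma=1.0)
print(pred)
```

A larger γ makes the kernel decay faster with distance, so each support vector influences a smaller neighborhood of the input space.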

Improved SVR

One of the vital stages in designing any predictive model based on SVR is the optimal selection of the model's parameters, which may strongly affect the performance of the predictive model. In this research, the SVR algorithm is improved in such a way that the best values of three parameters, namely the penalty factor (C), the insensitivity zone (ε), and the kernel parameter (γ), are selected to control the learning procedure of the proposed model.

To define the regression problem in SVR, suppose a training set of n observations \( \left\{\left({x}_1,{y}_1\right),\dots, \left({x}_n,{y}_n\right)\right\}\subset {\mathbb{R}}^d\times \mathbb{R} \) is given. The regression problem is then to estimate \( {y}_i=f\left({x}_i\right) \), which is obtained by rewriting Eq. (1) as:

$$ f\left(x,\omega \right)={\omega}_0+{\omega}_1{\phi}_1(x)+{\omega}_2{\phi}_2(x)+\dots +{\omega}_m{\phi}_m(x)={W}^{\intercal}\Phi (x) $$
(8)

In order to decrease the training complexity and improve the computing speed of SVR, the constrained optimization problem in Eq. (3) is simplified by replacing the penalty factor C with \( \frac{C}{2} \) and adding the term \( \frac{b^2}{2} \):

$$ \min J=\frac{1}{2}\left({\left\Vert w\right\Vert}^2+{b}^2\right)+\frac{C}{2}{\sum}_{i=1}^n\kern0.1em {\xi}_i^{+}+\frac{C}{2}{\sum}_{i=1}^n\kern0.1em {\xi}_i^{-} $$
(9)

Subject to: \( \left(w\cdot \varPhi \left({x}_i\right)+b\right)-{y}_i\le \varepsilon +{\xi}_i^{+} \)

$$ {y}_i-\left(w\cdot \varPhi \left({x}_i\right)+b\right)\le \varepsilon +{\xi}_i^{-} $$
$$ {\xi}_i^{+},\ {\xi}_i^{-}\ge 0 $$

This problem can be converted to an unconstrained convex quadratic optimization problem by defining \( {z}_i^{+}=\left(w\cdot \varPhi \left({x}_i\right)+b\right)-{y}_i-\varepsilon \) and \( {z}_i^{-}={y}_i-\left(w\cdot \varPhi \left({x}_i\right)+b\right)-\varepsilon \) as (Lee et al. 2005):

$$ \min J=\frac{1}{2}\left({\left\Vert w\right\Vert}^2+{b}^2\right)+\frac{C}{2}{\sum}_{i=1}^n{\left|{z}_i\right|}_{+}^2 $$
(10)

where \( {\left|{z}_i\right|}_{+}=\max \left\{0,{z}_i\right\} \). This is an unconstrained convex minimization problem with a unique solution. Since the objective function in Eq. (10) is not twice differentiable, the p-function (the integral of the sigmoid function) with a smoothing parameter α is used to define a strictly convex and infinitely differentiable smooth approximation (Lee et al. 2005; Xiong et al. 2006):

$$ p\left(u,\alpha \right)=u+\frac{1}{\alpha}\log \left(1+{e}^{-\alpha u}\right),\kern0.5em \alpha >0 $$
(11)

Therefore, the objective function of the developed ISVR, which is minimized with the Newton-Armijo algorithm, becomes (Doreswamy and Vastrad 2013):

$$ \min J=\frac{1}{2}\left({\left\Vert w\right\Vert}^2+{b}^2\right)+\frac{C}{2}{\sum}_{i=1}^n{p}^2\left(u,\alpha \right) $$
(12)
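The behavior of the smoothing function in Eq. (11) can be checked numerically. The sketch below is an illustration only: the function name `smooth_plus` and the sample values are assumptions, and the expression is rewritten in an algebraically identical form, log(1 + exp(αu))/α, to avoid overflow for large |u|.

```python
import numpy as np


def smooth_plus(u, alpha):
    """Eq. (11): p(u, alpha) = u + log(1 + exp(-alpha * u)) / alpha.

    Evaluated as logaddexp(0, alpha * u) / alpha, which is the same quantity
    written in a numerically stable way.
    """
    u = np.asarray(u, dtype=float)
    return np.logaddexp(0.0, alpha * u) / alpha


u = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
plus = np.maximum(0.0, u)          # the non-smooth plus function |u|_+
for alpha in (1.0, 5.0, 50.0):
    gap = smooth_plus(u, alpha) - plus
    print(alpha, gap.max())        # the largest gap occurs at u = 0
```

The gap between p(u, α) and max{0, u} is largest at u = 0, where it equals log(2)/α, so a larger α gives a tighter smooth approximation.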

The procedure of selecting the best parameters for training the ISVR model is then executed to find the optimum values of C, ε, and γ with the highest coefficient of determination (R²) and the lowest cross-validation error, based on the following fitness function:

$$ \mathrm{Fitness}=\operatorname{Max}\left({R}^2+\frac{1}{\mathrm{MSE}}\right) $$
(13)

where, MSE is the mean squared error given as (Demuth and Beale 2002):

$$ \mathrm{MSE}=\frac{1}{N}\sum \limits_{i=1}^N{\left|\frac{Y_i-{Y}_i^{\ast }}{Y_i}\right|}^2 $$
(14)

in which, Yi and \( {Y}_i^{\ast } \) are respectively the measured and predicted values, and N is the number of input-output data pairs.
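Equations (13) and (14) can be sketched as below. The source does not state how R² is computed, so the usual definition 1 − SS_res/SS_tot is assumed here; the function names and the sample values are illustrative only.

```python
import numpy as np


def relative_mse(y_true, y_pred):
    """Eq. (14): mean of squared relative errors (requires non-zero targets)."""
    return float(np.mean(((y_true - y_pred) / y_true) ** 2))


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def fitness(y_true, y_pred):
    """The quantity maximized in Eq. (13): R^2 + 1 / MSE."""
    return r_squared(y_true, y_pred) + 1.0 / relative_mse(y_true, y_pred)


# Hypothetical measured and predicted values, for illustration only.
y_meas = np.array([10.0, 20.0, 40.0])
y_hat = np.array([11.0, 18.0, 40.0])
print(r_squared(y_meas, y_hat), relative_mse(y_meas, y_hat), fitness(y_meas, y_hat))
```

Because R² is bounded above by 1 while 1/MSE grows without bound as the error shrinks, the 1/MSE term tends to dominate the fitness value when the relative errors are small.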

In the ISVR model developed in this research, different combinations of C, ε, and γ are explored over a log2-spaced range of values, so that the optimum value of each parameter is selected by a grid search with a limited step size. The pseudocode for training the proposed ISVR scheme is summarized as:

Inputs: Original training set XTraining, and testing set XTesting

Goal: Finding optimum values of C, ε, and γ

Initializing: The upper and lower bounds of C ∈ [Cmin, Cmax], ε ∈ [εmin, εmax], and γ ∈ [γmin, γmax]; Fitness ← −∞

For log2C = log2Cmin to log2Cmax

For log2γ = log2γmin to log2γmax

For ε = εmin to εmax

Training the SVR model by XTraining

Defining a suitable k-fold cross-validation

Computing R², MSE, and Fitnesscurrent based on Eq. (13)

IF Fitnesscurrent > Fitness

Fitness ← Fitnesscurrent

Cbest ← C

γbest ← γ

εbest ← ε

END IF

END For

END For

END For

Output: Evaluation of the trained ISVR model by XTesting
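The nested loops above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: `fit_predict` is a hypothetical callable that wraps whatever SVR implementation is available, and, to keep the sketch dependency-free, the demonstration plugs in an RBF kernel ridge regressor as a stand-in that ignores ε. The data are synthetic.

```python
import itertools
import numpy as np


def rbf(A, B, gamma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)


def kernel_ridge_fit_predict(X_tr, y_tr, X_va, C, gamma, eps):
    """Stand-in regressor (NOT the ISVR): RBF kernel ridge with penalty 1/C.

    `eps` is accepted only to match the assumed interface and is ignored.
    """
    K = rbf(X_tr, X_tr, gamma)
    coef = np.linalg.solve(K + np.eye(len(y_tr)) / C, y_tr)
    return rbf(X_va, X_tr, gamma) @ coef


def cv_fitness(fit_predict, X, y, folds, C, gamma, eps):
    """Fold-averaged R^2 and relative MSE combined as in Eq. (13)."""
    r2s, mses = [], []
    for i, val in enumerate(folds):
        trn = np.concatenate([f for j, f in enumerate(folds) if j != i])
        y_hat = fit_predict(X[trn], y[trn], X[val], C, gamma, eps)
        ss_res = np.sum((y[val] - y_hat) ** 2)
        ss_tot = np.sum((y[val] - y[val].mean()) ** 2)
        r2s.append(1.0 - ss_res / ss_tot)
        mses.append(np.mean(((y[val] - y_hat) / y[val]) ** 2))
    return float(np.mean(r2s) + 1.0 / np.mean(mses))


def grid_search(fit_predict, X, y, C_grid, gamma_grid, eps_grid, k=4, seed=0):
    """Exhaustive search over (C, gamma, eps), keeping the fittest triple."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), k)
    best_fit, best_params = -np.inf, None
    for C, gamma, eps in itertools.product(C_grid, gamma_grid, eps_grid):
        fit = cv_fitness(fit_predict, X, y, folds, C, gamma, eps)
        if fit > best_fit:
            best_fit, best_params = fit, (float(C), float(gamma), float(eps))
    return best_params, best_fit


# Synthetic, strictly positive targets (illustrative only).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(40, 2))
y = 2.0 + np.sin(3.0 * X[:, 0]) + X[:, 1]
C_grid = 2.0 ** np.arange(0, 6)
gamma_grid = 2.0 ** np.arange(-3, 2)
params, fit = grid_search(kernel_ridge_fit_predict, X, y, C_grid, gamma_grid, [0.01])
print(params, fit)
```

Passing the model as a callable keeps the search loop independent of the regressor, so an actual ε-SVR or the smoothed ISVR solver could be substituted without changing the loop.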

Results

Among the several methods developed for extracting coal seams throughout Iran, longwall mining is the foremost. Tabas coal mine in the Parvadeh coalfield, as the first mechanized underground mine, is worked by the longwall mining method. During longwall mining, a zone of high induced stresses is concentrated ahead of the coalface. The area in the vicinity of the T-junctions, which moves with face advancement, is subjected to the maximum stresses. This leads to large displacements in the roof strata, especially in the tailgate roadway. Although support design knowledge has advanced, problematic tailgate behavior remains a significant problem in the Tabas longwall mine. In this research, an ISVR model is developed to predict unstable zones ahead of time in the tailgate roadway and its T-junction based on the monitored roof displacement data and the geological and geomechanical information collected during mining. In order to investigate the prediction capability of the proposed model, the obtained results are compared with those of Artificial Neural Networks (ANNs) and Multivariable Linear Regression (MLR).

Establishing database

In this study, a geomechanical database is established using geological information and laboratory tests. The geomechanical parameters were obtained by carrying out rock mechanics tests on intact rock samples. In addition, roof displacements are recorded as the face advances using telltale instruments installed at specified distances along the tailgate roadway. A dataset of 72 records from various sections of the 1.2 km long tailgate was compiled for training the ISVR model (Table 1). The independent variables are the uniaxial compressive strength (UCS), tensile strength (σt), cohesion (C), angle of internal friction (ϕ), Young's modulus (E), shear strength (τ), density (ρ), slake durability index (Id2), and rock mass rating (RMR). The maximum roof displacement (dmax) monitored in the tailgate is selected as the dependent variable.

Table 1 Details of the datasets applied for training the ISVR model

Normalizing input data

Because the input variables have different units, the data have to be normalized before training the model. Normalization makes the inputs dimensionless and scales them to the range between 0 and +1. In addition, it increases the learning speed and improves the stability of the model. Input data are normalized using Eq. (15):

$$ {X}_{\mathrm{Norm}}^{ij}=\frac{X^{ij}-{X}_{\mathrm{min}}^j}{X_{\mathrm{max}}^j-{X}_{\mathrm{min}}^j} $$
(15)

where \( {X}_{\mathrm{Norm}}^{ij} \) is the normalized value and \( {X}^{ij} \) is the original value in the ith row and the jth column. \( {X}_{\mathrm{max}}^j \) and \( {X}_{\mathrm{min}}^j \) are respectively the maximum and minimum values of the jth column.
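A minimal sketch of the column-wise min-max scaling in Eq. (15) is given below. The function name and the two-column sample matrix are illustrative assumptions; the values are not taken from Table 1.

```python
import numpy as np


def min_max_normalize(X):
    """Eq. (15): scale every column of X to [0, 1].

    Returns the scaled matrix together with the column minima and maxima.
    A constant column would cause a division by zero and is not handled here.
    """
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return (X - col_min) / (col_max - col_min), col_min, col_max


# Hypothetical records with two variables on very different scales.
raw = np.array([[20.0, 2.4],
                [40.0, 2.6],
                [30.0, 2.5]])
scaled, col_min, col_max = min_max_normalize(raw)
print(scaled)
```

The minima and maxima are returned so that records collected later can be scaled with the same constants as the training data.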

Designing the ISVR model

Training and testing a machine learning model on the same samples leads to a methodological bias. Therefore, to avoid over-fitting, it is essential to separate the training and testing data. Cross-validation is a practical method for dividing the input data into separate training and testing sets.

Cross-validation

The cross-validation technique is used first to tune the hyper-parameters of the proposed ISVR model. For the input data used in this research, 4-fold cross-validation was found to be suitable for ISVR modeling. The generalization error is assessed by randomly splitting the training data into four mutually exclusive subsets of equal size. In each iteration, the decision rule is fitted on three subsets and then tested on the remaining subset. This procedure is repeated four times, and the generalization error is finally approximated by averaging the validation errors over the four iterations.
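The 4-fold splitting described above can be sketched as follows. It is assumed here, for illustration, that the split is applied to all 72 records, which gives four folds of 18 records each; the function name and the random seed are hypothetical.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition range(n_samples) into k mutually exclusive folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)


folds = k_fold_indices(72, 4)
splits = []
for i, val_idx in enumerate(folds):
    # Three folds are used for fitting, the remaining one for validation.
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    splits.append((train_idx, val_idx))
    print(f"fold {i + 1}: {len(train_idx)} training, {len(val_idx)} validation records")
```

Each record appears in exactly one validation fold, so averaging the four validation errors uses every record once for testing.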

Parameters optimization in ISVR

The parameters C, ε, and γ have a great influence on the prediction accuracy of the SVR model. As mentioned, the parameter C determines the trade-off between the training error and the VC dimension. The parameter ε is the insensitivity zone in the ε-insensitive loss function. The parameter γ, which sets the width of the Gaussian kernel function, governs the non-linear mapping from the input space to a multi-dimensional feature space.

Using 4-fold cross-validation, different combinations of C, ε, and γ are explored in the proposed ISVR model over a log2-spaced range of values: C in the range \( \left[{2}^0,{2}^5\right] \) and γ in the range \( \left[{2}^{-3.5},{2}^{1.5}\right] \), both with a multiplicative step of \( {2}^{0.1} \). The parameter ε is examined in the range [0.001, 0.1] with a step size of 0.001. The three nested loops therefore run 260,100 times (51 × 51 × 100), and the optimum values of C, ε, and γ with the highest R² and the lowest MSE are finally chosen based on the fitness function in Eq. (13). The results of the proposed ISVR are summarized in Table 2, in which the 4th fold yields the best values of R² and MSE.
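The size of the search grid can be verified with a few lines of Python. A step of 0.001 is assumed for ε, since that is the spacing that gives 100 values on [0.001, 0.1]; the variable names are illustrative.

```python
import numpy as np

# Exponent grids: 51 values each at a spacing of 0.1 in log2 units.
log2_C = np.linspace(0.0, 5.0, 51)
log2_gamma = np.linspace(-3.5, 1.5, 51)
C_grid = 2.0 ** log2_C
gamma_grid = 2.0 ** log2_gamma

# Linear grid for epsilon: 100 values at a spacing of 0.001 (assumed).
eps_grid = np.linspace(0.001, 0.1, 100)

n_combinations = len(C_grid) * len(gamma_grid) * len(eps_grid)
print(n_combinations)
```

Using `linspace` with an explicit count avoids the floating-point end-point ambiguity that a fractional step in `arange` can introduce.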

Table 2 Optimal values for C, ε, and γ in four folds

Training the ISVR model

To train the ISVR model, a Gaussian kernel function was first introduced to the model. The three parameters C, ε, and γ are then taken from the best results of the 4-fold cross-validation. The training code is executed in MATLAB. To ensure sound training and avoid over-fitting, 75% of the input data is randomly selected for training, and the rest is retained for testing the ISVR model. According to the results presented in Table 2, the best values of R² and MSE for training the ISVR model are 0.94 and 0.009, respectively. In addition, the averaged values of R² and MSE are 0.92 and 0.017, respectively, which indicates suitable accuracy for training the ISVR model. The R² obtained from training the ISVR model on 75% of the input data is shown in Fig. 3. As seen, a high goodness of fit and a low error are obtained from the trained model.

Fig. 3 The measured roof displacements versus ISVR predicted values for the trained model

The changes in MSE and R² during training of the ISVR model are plotted against log2C and log2γ in Fig. 4. As seen in these plots, the maximum R² and minimum MSE, based on the fitness function defined in Eq. (13), are obtained when log2C and log2γ are approximately 0.189 and 0.799, respectively.

Fig. 4 Changes in R² and MSE versus log2C and log2γ during training of the ISVR model

Since the parameter C controls the trade-off between error minimization and margin maximization, selecting an optimum value for this parameter is important. Assigning too large or too small a value to C results in over-fitting or under-fitting, respectively. In the proposed ISVR model, the C and γ parameters are set to the optimum values obtained by the described procedure, which is valuable for designing or updating the model.

Testing the ISVR model

The trained ISVR model is used to predict tailgate stability for the remaining 25% of the input data, which are unseen testing samples. For testing, the ISVR model is configured with C, ε, and γ set to 1.140, 0.008, and 1.741, respectively. Figure 5 shows the regression plot for the testing data, in which the values predicted by the ISVR model are plotted against the measured ones. The best linear fit is shown by a blue line. The MSE and R² between the predicted and measured values are 0.0116 and 0.9092, respectively. According to these results, the roof displacements predicted by the ISVR model agree reasonably well with the measured ones, as indicated by the closeness to the equality line and the high goodness of fit.

Fig. 5 Testing the ISVR model with unseen data in 18 sections

Examining the prediction capability

In order to evaluate the validity and performance of the ISVR model in predicting dmax in the longwall tailgates, the obtained results are compared with those of ANN and MLR, which are presented in two separate subsections.

ANN results

The ANN model was built using the Levenberg–Marquardt algorithm. The relationship between the roof displacement and the geomechanical parameters is established using a Multi-Layer Perceptron (MLP) network in the MATLAB environment. The MLP neural network developed in this research consists of an input layer, two hidden layers, and an output layer with a “9-4-5-1” topology. The hidden layers use logistic sigmoid (LogSig) and hyperbolic tangent sigmoid (TanSig) transfer functions.
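A forward pass through a 9-4-5-1 network of this kind can be sketched as below. Several details are assumptions made only for illustration: LogSig is applied in the first hidden layer and TanSig in the second, the output neuron is linear, and the weights are random rather than trained. Training with the Levenberg–Marquardt algorithm is not shown.

```python
import numpy as np


def logsig(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))


def init_mlp(sizes=(9, 4, 5, 1), seed=0):
    """Random (untrained) weights and zero biases for each layer."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(scale=0.5, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(params, X):
    """Forward pass: LogSig, then TanSig, then a linear output (assumed order)."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = logsig(X @ W1 + b1)
    h2 = np.tanh(h1 @ W2 + b2)
    return (h2 @ W3 + b3).ravel()


params = init_mlp()
n_weights = sum(W.size + b.size for W, b in params)
out = forward(params, np.zeros((3, 9)))  # three dummy records with nine inputs
print(n_weights, out.shape)
```

Counting weights and biases gives 71 adjustable parameters for this topology, which is of the same order as the 72 records in the dataset.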

By trial and error, a four-layer MLP network was tested with various arrangements of neurons, and a “9-4-5-1” topology was finally found to be optimum. The first layer has nine nodes corresponding to the number of inputs. The network has one output neuron since there is only one target. To train the ANN model, the input data are randomly divided into three subsets for training, validation, and testing, which contain 50%, 25%, and 25% of the input data, respectively. Figure 6 illustrates the ANN model outputs, in which the predicted values of dmax are plotted against the desired targets. As seen, the maximum values of R² for the training, validation, test, and overall data are 0.990, 0.911, 0.868, and 0.939, respectively.

Fig. 6 R² for training, validation, test, and overall data (dashed line is the equality line)

The MSE, as a typical performance function usually used for training feed-forward neural networks, is applied in the ANN model as a measure of stopping the training process to prevent over-fitting of the model. Taking into account the MSE curve during training of the ANN model, the best validation performance was obtained at epoch 3, and the value of MSE is calculated as 0.1289, which shows a good level of performance (Fig. 7).

Fig. 7 The best validation performance for the ANN model

MLR results

In statistics, MLR is a widespread method for detecting the linear relationship between several independent variables and a dependent variable. In this research, an MLR function is established from 75% of the input data using SPSS software. The linear relationship obtained from the MLR is:

$$ {d}_{\mathrm{max}}=616.3845+0.1317\ \mathrm{UCS}+4.6579\ {\sigma}_t-13.5096\ C+0.2813\ \phi +2.5463\ E-2.3538\ \tau -118.7582\ \rho -2.3372\ {I}_{d2}+0.2107\ \mathrm{RMR} $$
(16)

where all the parameters were previously defined in the “Establishing database” section. The MLR approximation function was obtained from the same dataset that was used for training the ISVR and ANN models. As presented in Fig. 8, the R² obtained for testing the MLR on the 18 unseen testing records is 0.8086, with an MSE value of 0.439.
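Equation (16) can be evaluated directly, as in the sketch below. The coefficients are copied from Eq. (16); the function and dictionary names are hypothetical, and the inputs must be expressed in the same units and scaling as the data of Table 1, which are not restated here.

```python
# Coefficients of Eq. (16); the key names are illustrative labels.
MLR_INTERCEPT = 616.3845
MLR_COEFFS = {
    "UCS": 0.1317,
    "sigma_t": 4.6579,
    "C": -13.5096,
    "phi": 0.2813,
    "E": 2.5463,
    "tau": -2.3538,
    "rho": -118.7582,
    "Id2": -2.3372,
    "RMR": 0.2107,
}


def predict_dmax_mlr(record):
    """Evaluate Eq. (16) for one record given as a dict of the nine inputs."""
    missing = set(MLR_COEFFS) - set(record)
    if missing:
        raise KeyError(f"missing inputs: {sorted(missing)}")
    return MLR_INTERCEPT + sum(coef * record[name] for name, coef in MLR_COEFFS.items())


# Sanity check with an all-zero record: the result is just the intercept.
zero_record = dict.fromkeys(MLR_COEFFS, 0.0)
baseline = predict_dmax_mlr(zero_record)
print(baseline)
```

The signs of the coefficients show the direction in which each input shifts the linear estimate; for instance, the negative coefficients on cohesion and density lower the predicted dmax as those inputs increase.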

Fig. 8 Testing the MLR function with unseen data in 18 sections

Discussion

There is presently no unique technique that provides mining engineers with a reliable tailgate support design in longwall coal mining. Despite widespread developments in longwall mining, the design of both primary and secondary support systems for longwall tailgate still remains uncertain and often debatable.

The reliance upon experience or trial and error is responsible for many downtimes and delays in longwall mines, which can have calamitous consequences (Tarrant 2003). Therefore, to achieve sustainable coal mining, it is vital to raise the recovery rate and, at the same time, reduce roof instabilities, especially those occurring in the tailgate roadway and T-junctions.

Designing a tailgate roadway in longwall mining is somewhat complicated and is usually the primary task in longwall mine design. This is because coalface advancement leads to a strong stress redistribution in the roof strata. The high stress concentrations may cause large fractures in the roof strata in the vicinity of the tailgate and its T-junction. Unless the roof is supported effectively, adequately, and in time, excessive roof displacements and subsequent roof falls may occur in the tailgate roadway. Therefore, to ensure that the tailgate roadway remains securely functional, a set of standing support systems is also installed, usually determined by trial and error or by experience gained from surrounding longwall panels.

In order to reduce the deformations of the immediate roof rocks, and consequently to prevent further instabilities leading to unwanted incidents or catastrophic failures, an ISVR model is proposed in this research based on the geomechanical characteristics and continuous monitoring of roof displacements. Employing the proposed model, the areas prone to instabilities will be predicted ahead of time by introducing the geomechanical parameters. In fact, the ISVR model uses historical data to detect unstable zones around the longwall tailgate and its T-junction.

The prediction capability of the ISVR model is examined by comparing its results with those of the ANN and MLR models, which reveals a high goodness of fit and superior accuracy for the ISVR model. Figure 9 depicts the results of the ISVR, ANN, and MLR models, in which the predicted roof displacements are compared with the measured ones in 18 sections. The R² and MSE measures for the ISVR, ANN, and MLR models are also presented in Table 3. Based on the results, the measured and predicted values match closely in the ISVR model, and the differences are only a few millimeters in some sections. Therefore, the ISVR model is relatively reliable and may be valuable in predicting tailgate instabilities ahead of time in mechanized longwall mining.

Fig. 9 Results of the ISVR, ANN, and MLR models over testing samples

Table 3 The best values of MSE and R² for ISVR, ANN and MLR models

Since the severely damaged zones coincide with the locations of peak stress along the tailgate roadway (Zhu et al. 2018), the ISVR model can be a flexible tool for providing a safer environment for mine personnel at T-junctions. The ISVR model may also be more cost-effective for optimizing support systems in the tailgate roadway, because it relies on an engineering-based, intelligent procedure rather than trial and error. In addition, unlike black-box models, the ISVR model can quickly be developed and updated by introducing new training data recorded from other panels. The major benefit of the ISVR model is its potential to be continuously updated with new data, and consequently to capture the interaction of the support systems with the ground conditions at the monitored sections. Independent variables are updated directly by introducing the historical data, without the need to change the model parameters (Vapnik et al. 1997). Thus, the ISVR model with the same parameter settings will give reliable results, which implies that the proposed model is repeatable in predicting roof displacements based on the geological and geomechanical information. The training runs required for selecting the optimum parameters are therefore significantly fewer for the ISVR model than for the black-box ANN model.

Compared with other AI algorithms, SVR has the following characteristics (Mahdevari et al. 2013): (1) a globally optimal solution, (2) avoidance of overtraining, (3) a sparse solution to which only a limited set of training points contribute, and (4) efficient calculation of non-linear solutions through the use of inner products.

When cracks initiate and propagate ahead of the coalface, the inelastic zone expands, which may transfer the overburden loads from the yielded or failed rocks to the adjacent unmined areas (Wang et al. 2019). Since understanding the roof strata behavior significantly affects the optimal design of support systems, this study may serve as an intelligent tool for better understanding the roof strata behavior and for timely control of unstable zones in the vicinity of the T-junction and along the tailgate roadway in longwall coal mines.

Conclusions

Although the tailgate roadway in a mechanized longwall coal mine is often subjected to high stresses and deformations, maintaining a sustainable and functional tailgate is vital to safe and efficient coal mining operations. Problematic tailgate conditions in Tabas mine are recognized as a serious concern, given that the consequences vary from production delays to catastrophic failures. This research is intended to predict the unstable zones in the vicinity of the T-junction in the tailgate roadway. For this purpose, the SVR algorithm was improved to approximate the non-linear relationship between the geomechanical features and roof displacements. A dataset containing 72 records from different sections of a tailgate in Tabas mine was introduced to the ISVR model for training. The independent parameters are UCS, σt, C, ϕ, E, τ, ρ, Id2, and RMR, and dmax is selected as the dependent variable. The prediction capability of the ISVR model was assessed by computing the R² and MSE between the ISVR predicted values and the measured roof displacements. Accordingly, R² for the training and testing data was 0.94 and 0.91, respectively, whereas MSE was 0.009 and 0.012 for the training and testing data, respectively. To examine the prediction capability of the ISVR model further, the results were compared with those of the ANN model and the MLR function. With the same dataset, the R² and MSE for testing the ANN model are 0.87 and 0.129, respectively, and for testing the MLR model are 0.81 and 0.439, respectively. These results indicate a satisfactory precision for the proposed ISVR model to monitor and control tailgate stability in mechanized longwall coal mines. In addition, unlike the black-box ANN model, the ISVR model can quickly be developed and updated by introducing new training data recorded from other adjacent panels.