1 Introduction

Generally, the ultimate goal of underground mining is to remove the ore from the ground safely and economically. The performance of this removal depends on the overall conditions of the coal seam and overburden strata as well as on the mining method used. Longwall mining is one of the most commonly used methods in underground mining, particularly for coal seam extraction. In this method, extraction of the coal (ore) seam over a considerable panel width causes a downward movement of the immediate roof layers above the extracted panel. This stratum therefore collapses and caves some distance behind the working face, within the gob region. The downward movement of the rock layers then gradually propagates upward, causing the damaged rock strata to fracture and cave. Consequently, the cover pressure above the caved zone is transferred to the solid sections ahead of the face and along the rib sides. The height to which the disturbed zone (including the caving and fracturing zones) extends upward depends on many variables, e.g., overburden thickness, mined ore (seam) thickness, panel width, the strength properties, relative thickness and number of the roof rock strata, and their bulking factor [14].

Establishing an appropriate approach to assess the behavior of roof rock strata in longwall mining is a prime concern of coal mining researchers. Accurate determination of the height of the induced caving–fracturing zone over the gobs is a principal objective for longwall mining investigators and designers. To properly evaluate the stresses transferred to the neighboring face access tunnels and their related barrier and chain pillars, the height of the caving–fracturing zone must be predicted. Several methods for predicting the height of the caved and fractured zones exist in the literature, namely in situ measurement and physical, empirical, numerical and analytical modeling, as reviewed by Majdi et al. [1] and Rezaei et al. [2]. The performance of the available models in estimating the height of the disturbed zone over longwall gobs is not satisfactory because of the complexity of the longwall mining environment. Empirical methods cannot be applied accurately to all cases because they have commonly been developed from the data of a specific case study with particular properties. Numerical models are commonly used to evaluate roof rock strata disturbance processes, but they require a large number of input parameters that may need to be approximated or assumed. Despite their high accuracy, in situ measurements and physical models are time-consuming and expensive. Conversely, the analytical model is simple and relatively inexpensive. However, the latter method rests on numerous assumptions that may increase the estimation error.

Given the above-mentioned drawbacks, and given that the available predictive models consider only a few of the effective parameters, suitable alternatives are needed. Predictive intelligent systems can be appropriate approaches in this regard. Artificial neural networks (ANNs) are among the most popular intelligent systems and can be used to model complex problems. ANN models are flexible in dealing with ill-defined systems. These networks have been widely used in the field of mining and rock engineering to date [5–23]. These applications reveal that neural network models are powerful techniques for mining and geo-engineering problems in which multivariate relationships must be modeled precisely. Unlike the aforementioned available methods, ANN modeling can consider the influence of all possible effective parameters simultaneously.

In the current paper, the “caving–fracturing zone” is taken to be the combination of the caved zone and the interconnected fractured zone. Generally, the height of the caved zone together with the lower and middle parts of the fractured zone is very important for longwall face stability and for the stress transferred to the front and side abutments. The caved zone is destressed from the beginning, whereas the interconnected fractured zone is continuously caved and destressed during the mining operation. Together, these two parts of the disturbed zone above the mined panel form the destressed zone, which is important in estimating longwall mining-induced stress. Therefore, the combined height of the caved zone and the lower and middle parts of the fractured zone (the combined height of the “caved zone” and the “bedding plane separation zone” in Fig. 1) is considered as the height of the “caving–fracturing zone” in this research. To determine the height of the caving–fracturing zone (HCFZ) over longwall gobs, two predictive models, an artificial neural network (ANN) and multivariable regression analysis (MVRA), are proposed, analyzed and compared with each other as well as with the results obtained from the available models in the literature.

Fig. 1 Zones of overburden movement caused by longwall mining [after 32]

2 Literature review

The failure mechanisms and breakage characteristics of the strata over the extracted panel in longwall mining, and the development of the straining, fracturing and caving process of the roof layers, have been studied extensively by many investigators. Most of them hold that three distinct zones of movement are found in the roof rock strata above the longwall mine gob (goaf) [24–31]. These are the caved, fractured and continuous deformation zones (Fig. 1). A comprehensive literature review of this work is given by Majdi et al. [1] and Rezaei et al. [2]. Here, some more recent related references published in 2015 and 2016 are reviewed.

Tajduś [33] analyzed the horizontal displacement distribution caused by the excavation of a single advancing longwall panel. His research is based on the assumption that the horizontal displacement is proportional to the slope of the subsidence trough. Xue et al. [34] evaluated the movement and fracture of the overlying strata based on experimental analysis. According to this research, the maximum heights of the caving, fracturing and bending zones are 7.5, 25 and 25 times the mining height, respectively. Bai et al. [35] studied the deformation and failure mechanisms of the roof layers over the gobs and evaluated prevention methods for an advancing longwall working face. This research used in situ measurements, numerical simulation and analytical assessments to evaluate the spatial distribution of loads due to longwall mining. Ming-he et al. [36] used numerical investigation to evaluate the distributed loads in the vicinity of the longwall working face based on the height of the caving zone. Their results showed that the caving zone height is a main parameter affecting the stress concentration coefficient over the neighboring solid sections of a longwall panel. Palchik [37] performed in situ measurements to evaluate the height of the caved zone and the mechanical parameters of the overburden. He concluded that the maximum height of the caved zone can reach 20 times the mining height. Qu et al. [38] proposed a conceptual approach to model the behavior of the overlying strata over longwall panels in order to evaluate roof rock strata deformations and the gas flow process. According to this research, the maximum ratio of the key stratum height to the mining height is about 20.7. Yu et al. [39] performed in situ investigations to evaluate the failure of the roof layers over the longwall gob and concluded that the maximum height of the fractured zone reaches 22 times the mining height (thickness of the mined seam). They also report that in situ measurements taken by many authors in Chinese mines place the height of the fractured zone in the range of 11.94–19.97 times the mined seam (ore) thickness.

Jiachen et al. [40] proposed a coalface failure model and concluded that the caved zone height increases with increasing mining height in longwall panels. Meng et al. [41] studied the height of the caving–fracturing zone in a particular case study using in situ tests, physical modeling and numerical simulation. The average heights of the caving–fracturing zone obtained from these three approaches are 20.3, 22.8 and 19.4 times the seam extraction thickness, respectively. Zhu et al. [42] investigated the abutment stress due to longwall mining using key stratum theory in order to assess the mechanism of load transfer. Yu et al. [43] conducted in situ measurements to evaluate the stress and deformation of the pillars and gateroads surrounding longwall panels that are affected by the mining-induced stress. Their investigations indicate that the extraction of a panel causes the main deformation and damage in the adjacent pillars and gateroads.

To summarize the references reviewed above, the results of the models used by several researchers in recent years (2015 and 2016) to calculate the heights of the caved, fractured and destressed zones are shown in Table 1. In this table, the results are expressed as the ratio of the height of the caved (\(H_{\text{c}}\)) and fractured/destressed (\(H_{\text{f}}\)) zones to the extracted coal seam thickness or mined seam (ore) height (\(h_{\text{s}}\)). Numerous empirical and analytical relations for estimating these heights can also be found in [1] and [2].

Table 1 The results of existing models to calculate the height of disturbed zones

3 Basic concept of ANNs

Detailed descriptions of artificial neural networks (ANNs) can be found in the literature [9–12], so they are explained only briefly here. The structure of these networks resembles that of the human brain, and they act similarly in perceiving environments and mapping problems. ANNs can model highly complex systems in which the relations between the dependent parameters and the predictor variables are nonlinear and ill-defined. This characteristic has made neural networks a popular technique in complex engineering fields. Neural networks are composed of interconnected neurons that operate as computation units. These neurons are arranged in consecutive layers with defined interconnections. Generally, a neural network has three fundamental components: network architecture, transfer function and learning law. The definition of these components usually depends on the type and complexity of the problem studied. In the neural network modeling process, the network must first be trained before it is presented with new data. There are many types of artificial neural network in the literature; however, the feed-forward back-propagation neural network is the most widely and efficiently used type in engineering problems. Multilayer back-propagation neural networks usually consist of at least three distinct layers: input, hidden and output layers. The numbers of neurons in the input and output layers equal the numbers of input and output variables of the problem studied. However, the number of hidden layers and the number of neurons in each hidden layer depend on the complexity of the system being modeled. The optimum number of hidden layers and of their neurons is found by trial and error [7, 9–11].

To differentiate between the various processing units (neurons), bias values are introduced into the transfer (activation) functions. These functions transmit the weighted sum of all inputs to a neuron and thereby determine the intensity of the neuron output [44]. In general, two types of transfer function, nonlinear (sigmoid) and linear, are employed in constructing the network structure. Each type is itself divided into several subsets, e.g., LOGSIG and TANSIG belong to the nonlinear set, and POSLIN and PURELIN belong to the linear set. The ultimate goal of the neural network application and the structure of the problem at hand affect the selection of the transfer function type in the network architecture. A survey of the literature indicates that the sigmoid transfer function is highly efficient and is the most widely used [45]. In the training step of a neural network, the data entered into the input layer are processed through the hidden layer(s) and then reach the output layer. This process is called the “forward pass” in neural network modeling. At the end, the neural network output is compared with the real values. Any disagreement between them is distributed back through the connections between the network neurons. This process is known as the “backward pass” and is used to update the bias of each neuron and the connection weights between the layers. The forward and backward passes are repeated for all pairs of training data until the network error meets the threshold value set for the modeling. The threshold value is defined in terms of performance evaluation indices, e.g., root-mean-squared error (RMSE) and summed squared error (SSE) [6, 10].
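The four transfer functions named above can be sketched in a few lines. This is an illustrative sketch only: the function names mirror the LOGSIG, TANSIG, POSLIN and PURELIN labels used in the text, the definitions are the commonly used ones (logistic sigmoid, hyperbolic tangent, positive linear and identity), and the helper `neuron_output` is a hypothetical name introduced here to show how a bias enters the weighted sum.

```python
import numpy as np

def logsig(x):
    """Logistic sigmoid: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def tansig(x):
    """Hyperbolic tangent sigmoid: maps any real input into (-1, 1)."""
    return np.tanh(np.asarray(x, dtype=float))

def poslin(x):
    """Positive linear: passes positive inputs, clips negatives to 0."""
    return np.maximum(0.0, np.asarray(x, dtype=float))

def purelin(x):
    """Pure linear: returns the input unchanged."""
    return np.asarray(x, dtype=float)

def neuron_output(inputs, weights, bias, transfer=logsig):
    """Hypothetical helper: weighted sum of the inputs plus bias,
    passed through the chosen transfer function."""
    return transfer(np.dot(weights, inputs) + bias)
```

For example, `neuron_output(np.array([1.0, 2.0]), np.array([0.5, -0.25]), 0.0)` evaluates the logistic sigmoid at a weighted sum of zero.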

4 Structure of dataset

To develop an intelligent model, e.g., a neural network or a fuzzy inference system, a sufficiently large number of data is essential. In other words, gathering the dataset is the first and most important step in constructing these models. In this study, a dataset of 83 series was collected from the Iranian coalfields and from the comprehensive literature survey given in Sect. 2 (see Table 1), in addition to the results of the literature review conducted by [2]. To predict the height of the caving–fracturing zone (HCFZ) using the ANN and MVRA models, the following were considered as input parameters: overburden depth, panel width, rock mass unit weight, rock mass elastic modulus, rock mass Poisson’s ratio, unconfined compressive strength of the rock mass, rock mass bulking factor, rock mass friction angle and mined seam (ore) height. It should be noted that mean values of the characteristics of the layers overlying the panel are used in the current research.

To construct the ANN and MVRA models, the available datasets were partitioned into two distinct groups: training data and testing data. Seventy-five percent of the datasets were used as training data for the learning stage and the construction of the ANN and MVRA models, while the rest were used to test and evaluate the proposed optimum models. The datasets were divided by random sorting. Table 2 lists the input and output variables along with the symbols used for them in the modeling. Furthermore, all of the datasets were statistically analyzed, and the histograms of the dataset variables are presented in Figs. 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11.
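A random partition of this kind can be sketched as follows. This is a minimal sketch, not the procedure actually used in the study: the function name `split_train_test` and the seed are assumptions, and the sizes (83 series in total, 20 held out for testing, leaving 63 for training) are taken from the figures quoted in the text.

```python
import numpy as np

def split_train_test(n_samples, n_test, seed=0):
    """Shuffle the row indices and hold out n_test of them for testing.

    The seed is an arbitrary assumption, used only to make the
    shuffle reproducible.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    test_idx = order[:n_test]
    train_idx = order[n_test:]
    return train_idx, test_idx

# 83 data series, 20 of which are reserved for testing.
train_idx, test_idx = split_train_test(83, 20)
```

The two index arrays can then be used to select rows of the input matrix and of the HCFZ vector, so that no testing series takes part in training.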

Table 2 Characteristics of input and output variables used in the proposed models
Fig. 2 The histogram of overburden depth

Fig. 3 The histogram of panel width

Fig. 4 The histogram of unit weight

Fig. 5 The histogram of elastic modulus

Fig. 6 The histogram of Poisson’s ratio

Fig. 7 The histogram of unconfined compressive strength

Fig. 8 The histogram of bulking factor

Fig. 9 The histogram of friction angle

Fig. 10 The histogram of mined seam (ore) height

Fig. 11 The histogram of the height of caving–fracturing zone

5 Development of the ANN model to predict HCFZ

Constructing the optimum ANN model involves data normalization, determination of the model architecture, training of the network and, finally, validation and testing of the model. These steps are discussed in the following subsections.

5.1 Data normalization

To improve the training rate and the modeling capability of the neural network, the input datasets should first be normalized to the range 0 to +1 before model construction starts. Normalization makes the datasets dimensionless, so that input parameters with different magnitudes and disparate units have an equivalent influence on the network output. The data are normalized using the following equation:

$$X_{\text{Norm}}^{ij} = \frac{{X_{\text{Max}}^{j} - X^{ij} }}{{X_{\text{Max}}^{j} - X_{\text{Min}}^{j} }}$$
(1)

where \(X^{ij}\) is the original value of the data in the ith row and jth column, \(X_{\text{Norm}}^{ij}\) is its normalized value, and \(X_{\text{Max}}^{j}\) and \(X_{\text{Min}}^{j}\) are the maximum and minimum values of the jth column, respectively.
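As a rough illustration, Eq. 1 can be applied column by column as sketched below. The function name `normalize_columns` and the sample values are assumptions made for the example. Note that Eq. 1 as printed maps the column maximum to 0 and the column minimum to 1; the sketch follows the printed form and assumes no column is constant (otherwise the denominator would be zero).

```python
import numpy as np

def normalize_columns(X):
    """Apply Eq. 1 to every column of X (rows = data series, columns = variables).

    Follows the equation as printed: (max - x) / (max - min), so the
    largest value in a column becomes 0 and the smallest becomes 1.
    Assumes every column contains at least two distinct values.
    """
    X = np.asarray(X, dtype=float)
    col_max = X.max(axis=0)
    col_min = X.min(axis=0)
    return (col_max - X) / (col_max - col_min)

# Hypothetical sample: two variables with very different magnitudes.
sample = [[100.0, 2.0],
          [200.0, 4.0],
          [300.0, 3.0]]
normalized = normalize_columns(sample)
```

After normalization both columns lie in the range 0 to 1 regardless of their original units.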

5.2 Model architecture

To obtain an optimum ANN architecture for predicting the HCFZ, neural networks with different characteristics were tested, and the network with the minimum error was chosen. The characteristics tested include the number of hidden layers and the number of neurons in each, the types of learning and transfer functions, the number of epochs and the learning rate. The root-mean-square error (RMSE) index was computed to evaluate the error of every tested network, and the network with the lowest RMSE was selected as the optimized model for HCFZ determination. The same method was used to obtain the optimum number of neurons in the hidden layers. The results of this trial-and-error procedure are shown in Fig. 12, in which the error of networks with different numbers of hidden layer neurons is calculated. As the figure shows, 10 hidden neurons lead to the minimum RMSE and hence to the highest model efficiency. This optimum number of hidden neurons can be placed in one or more hidden layers. The number of hidden layers and the arrangement of their neurons were also determined by trial and error. Accordingly, the optimum number of neurons (10) was distributed over different network architectures, i.e., one and two hidden layers, and the other network characteristics mentioned above were varied for each distribution. The error (RMSE in this research) was then calculated for all of the resulting networks. The results of the trial-and-error process for some sample network architectures in HCFZ prediction are presented in Table 3. RMSE is computed using this equation [10]:

$${\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {A_{\text{imeas}} - A_{\text{ipred}} } \right)^{2} } }$$
(2)

where \(A_{\text{imeas}}\) is the ith measured element, \(A_{\text{ipred}}\) is the ith predicted element, and n is the number of datasets.
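Eq. 2 translates directly into code. The sketch below is illustrative; the function name `rmse` is an assumption, and the inputs are simply two equal-length sequences of measured and predicted values.

```python
import numpy as np

def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values (Eq. 2)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((measured - predicted) ** 2)))
```

In a trial-and-error search such as the one described above, this value would be computed for each candidate network and the candidate with the smallest value retained.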

Fig. 12 Performances of the network to determine the optimum number of hidden neurons

Table 3 Obtained errors of some sample networks to determine the optimum network

As shown in Fig. 12 and Table 3, a multilayer back-propagation neural network with 9 inputs, 6 neurons in the first hidden layer, 4 neurons in the second hidden layer and one output (a 9-6-4-1 network), with the TRAINLM training function and the LOGSIG transfer function, gives the minimum RMSE (row 8). This architecture is taken as the optimum ANN architecture for predicting the HCFZ. A schematic of the optimum network architecture is presented in Fig. 13. In addition, Table 4 shows the settings and detailed architecture characteristics of the proposed optimum neural network.
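To make the 9-6-4-1 layout concrete, the forward pass of such a network can be sketched as below. This is only a structural sketch under stated assumptions: the weights are random placeholders rather than the trained values, the logistic sigmoid is applied to every layer including the output (the text does not state the output-layer function), and the names `init_network` and `forward` are hypothetical. Training with a Levenberg-Marquardt routine such as TRAINLM is not reproduced here.

```python
import numpy as np

def logsig(x):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-x))

def init_network(layer_sizes, seed=0):
    """Create placeholder (weights, biases) pairs for consecutive layers.

    The random weights are NOT trained values; they only fix the shapes.
    """
    rng = np.random.default_rng(seed)
    return [(rng.normal(scale=0.5, size=(n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, x):
    """Forward pass: propagate one normalized input vector through all layers."""
    a = np.asarray(x, dtype=float)
    for W, b in params:
        a = logsig(W @ a + b)  # assumption: logsig on every layer
    return a

# 9 inputs, hidden layers of 6 and 4 neurons, 1 output (HCFZ).
params = init_network([9, 6, 4, 1])
y = forward(params, np.full(9, 0.5))
n_parameters = sum(W.size + b.size for W, b in params)
```

With this layout the network has 93 adjustable parameters (weights plus biases), which is worth keeping in mind given that the training set contains only a few dozen series.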

Fig. 13 Suggested ANN for the HCFZ prediction

Table 4 Detailed characteristics of the achieved optimum network

5.3 Validation and testing

To evaluate and test the proposed optimum neural network model, twenty-five percent of the data (20 series) were randomly selected from the datasets. These testing data were not used in the learning stages of the network. They were fed to the trained optimum network in order to evaluate and validate its performance. As the performance evaluation tool, the correlation coefficient (R) between the network output and the measured HCFZ was calculated for the validation and testing processes. The network outputs were obtained from the input testing data described in Sect. 4. Figure 14 shows the results of the proposed optimum ANN model for all of the modeling stages, including training, validation and testing.

Fig. 14 Obtained correlation coefficients of optimum ANN model in different stages for HCFZ prediction

6 Multivariate regression analysis

Multivariable regression analysis (MVRA), which is a branch of statistical modeling, is used for statistical modeling of the HCFZ in the current research. This method is usually used to build a statistical equation relating the dependent (output) and independent (input) variables, so that the output parameter(s) can be predicted from the defined input parameters. In effect, a new predictive relation is established to estimate the targets [46]. The relationships between the output (HCFZ) and the input variables were examined on this basis. The statistical software package SPSS 22 was used to produce a multivariable equation from the same data that were used to train the ANN model. The results of the multivariable regression analysis, including the constant and the coefficients relating the input parameters to the output parameter (HCFZ), are presented in Table 5.
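The kind of fit described here, a constant plus one coefficient per input, can be illustrated with an ordinary least-squares sketch. This is not the SPSS procedure used in the study: it assumes a purely linear model, substitutes NumPy's least-squares solver, and uses made-up numbers; the function names are hypothetical.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Least-squares fit of y = constant + X @ coefficients.

    A column of ones is prepended so the first fitted value is the constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])
    solution, *_ = np.linalg.lstsq(design, y, rcond=None)
    return solution[0], solution[1:]

def predict(constant, coefficients, X):
    """Evaluate the fitted linear relation for new input rows."""
    return constant + np.asarray(X, dtype=float) @ coefficients

# Made-up data generated from y = 2 + 3*x1 - 1*x2 (no noise).
X_demo = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y_demo = [2, 5, 1, 4, 7]
constant, coefficients = fit_linear_regression(X_demo, y_demo)
```

With nine input parameters, `X` would have nine columns and the fit would return one constant and nine coefficients, matching the layout of a table such as Table 5.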

Table 5 Outputs of multivariable regression analysis for HCFZ prediction

7 Discussion and results

In this section, the suggested models are first compared with each other based on the performance evaluation indices. Then, the results of the models presented in this study are compared with the results of existing comparable models. Finally, a sensitivity analysis is carried out to evaluate the effect of the input parameters on the HCFZ.

7.1 Models performance evaluation

To evaluate the performance of the proposed models, their results are compared with the measured values. For this purpose, four performance evaluation indices are employed: the determination coefficient (\(R^{2}\)), variance accounted for (VAF), mean absolute error (\(E_{\text{a}}\)) and mean relative error (\(E_{\text{r}}\)). These performance indices are calculated using Eqs. 3–6 [10, 47–49]:

$$R^{2} = 100\left[ {\frac{{\sum\limits_{i = 1}^{n} {\left( {A_{\text{ipred}} - \bar{A}_{\text{pred}} } \right)\left( {A_{\text{imeas}} - \bar{A}_{\text{meas}} } \right)} }}{{\sqrt {\sum\nolimits_{i = 1}^{n} {\left( {A_{\text{ipred}} - \bar{A}_{\text{pred}} } \right)^{2} \sum\nolimits_{i = 1}^{n} {\left( {A_{\text{imeas}} - \bar{A}_{\text{meas}} } \right)^{2} } } } }}} \right]^{2}$$
(3)
$${\text{VAF}} = 100\left( {1 - \frac{{\text{var} \left( {A_{\text{imeas}} - A_{\text{ipred}} } \right)}}{{\text{var} \left( {A_{\text{ipred}} } \right)}}} \right)$$
(4)
$$E_{\text{a}} = \left| {A_{\text{imeas}} - A_{\text{ipred}} } \right|$$
(5)
$$E_{\text{r}} = \left( {\frac{{\left| {A_{\text{imeas}} - A_{\text{ipred}} } \right|}}{{A_{\text{imeas}} }}} \right) \times 100$$
(6)

where \(\bar{A}_{\text{pred}}\) is the average of the predicted sets, \(\bar{A}_{\text{meas}}\) is the average of the measured sets, and the other variables are as defined previously.
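The four indices can be computed as sketched below. The function names are assumptions. Two points are interpretations rather than statements from the text: Eqs. 5 and 6 are written per data point, so the sketch averages them over the testing set to obtain the "mean" errors, and Eq. 4 is implemented exactly as printed, with the variance of the predicted values in the denominator (many references use the variance of the measured values instead).

```python
import numpy as np

def _arrays(measured, predicted):
    return np.asarray(measured, dtype=float), np.asarray(predicted, dtype=float)

def r_squared(measured, predicted):
    """Determination coefficient in percent (Eq. 3)."""
    m, p = _arrays(measured, predicted)
    num = np.sum((p - p.mean()) * (m - m.mean()))
    den = np.sqrt(np.sum((p - p.mean()) ** 2) * np.sum((m - m.mean()) ** 2))
    return float(100.0 * (num / den) ** 2)

def vaf(measured, predicted):
    """Variance accounted for in percent, following Eq. 4 as printed."""
    m, p = _arrays(measured, predicted)
    return float(100.0 * (1.0 - np.var(m - p) / np.var(p)))

def mean_absolute_error(measured, predicted):
    """Average of the per-point absolute errors of Eq. 5 (averaging assumed)."""
    m, p = _arrays(measured, predicted)
    return float(np.mean(np.abs(m - p)))

def mean_relative_error(measured, predicted):
    """Average of the per-point relative errors of Eq. 6, in percent."""
    m, p = _arrays(measured, predicted)
    return float(np.mean(np.abs(m - p) / m) * 100.0)
```

For a perfect prediction the first two indices equal 100 and the two error measures equal 0, which gives a quick sanity check before applying the functions to the testing data.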

The datasets that were not used in training and constructing the models were used to test them. The values of the above-mentioned indices for the two suggested models were computed and are shown in Table 6. The predicted and measured HCFZ values obtained from the ANN and MVRA models are compared in Figs. 15 and 16, respectively. Furthermore, Fig. 17 shows how the ANN and MVRA results follow the measured values of HCFZ for the 20 series of testing data. From this comparison, it can be concluded that the ANN model performs better than the MVRA model and that its predicted outputs agree closely with the measured ones.

Table 6 Calculated indices of the suggested models in HCFZ prediction
Fig. 15 Relationship between measured and predicted HCFZ values for the ANN model

Fig. 16 Relationship between measured and predicted HCFZ values for the MVRA model

Fig. 17 Comparison of the proposed model’s output with the measured HCFZ for 20 series of dataset

7.2 Comparative analysis

The outputs of the suggested models are compared with the results of the available physical, analytical, empirical, numerical and in situ models gathered from the comprehensive literature review. The results of all models are analyzed and compared with each other as multiples of the mined seam (ore) height (h) and of the panel width (L). The relationships between the mined seam (ore) height and the HCFZ values predicted by the proposed neural network and statistical models are presented in Figs. 18 and 19, respectively. Figures 20 and 21 show the relationships between the panel width and the HCFZ values predicted by the ANN and MVRA models, respectively. As Figs. 18 and 19 indicate, the ratio of the HCFZ to the mined seam (ore) height (h) ranges from 3.1 to 86.5 in the ANN model and from 10.02 to 40.93 in the MVRA model. Figures 20 and 21 show that the HCFZ predicted by the ANN model ranges from 0.14 to 1.34 times the panel width, whereas the HCFZ predicted by the MVRA model ranges from 0.17 to 1.03 times the panel width.

Fig. 18 Relation of the height of caving–fracturing zone with the mined seam (ore) height (ANN model)

Fig. 19 Relation of the height of caving–fracturing zone with the mined seam (ore) height (MVRA model)

Fig. 20 Relation of the height of caving–fracturing zone with the panel width (ANN model)

Fig. 21 Relation of the height of caving–fracturing zone with the panel width (MVRA model)

The proposed models are further compared with each other and with the results of methods reported in the literature by other investigators (Tables 7, 8). Table 7 presents the HCFZ results of the proposed models as a multiple of the mined seam height (h), together with a summary of five sets of results obtained from the respective methods reported in the literature, which were reviewed by the author and are presented in Table 1. In this table, the model results are expressed as the ratio of the height of the caving–fracturing zone to the mined seam (ore) height (HCFZ/h). Table 8 shows the model results in terms of the panel width.

Table 7 Results of the comparative analysis as the coefficient of mined seam (ore) height
Table 8 Results of the comparative analysis as the coefficient of panel width

As shown in Table 7, the lower limit of HCFZ/h in the ANN model is quite close to the lower limits of both the empirical and in situ models. The upper limit of HCFZ/h in this model is also closer to the upper limits of the empirical and in situ models than those of the other models are. In contrast, the lower and upper limits of this ratio in the MVRA model are rather far from those of the other models. The upper limit of HCFZ/h in the ANN model lies between the upper limits of the analytical, numerical, physical and MVRA models on one side and the upper limits of the in situ measurements and the empirical model on the other. According to Table 8, the results of both proposed models in terms of the panel width (HCFZ/L) conform closely to the results of comparable models presented by other researchers. Considering the above comparisons, it can be concluded that there is reasonable agreement between the ANN model results and the in situ measurements as well as the results of previous models. Therefore, this technique can be used efficiently to predict the height of the caving–fracturing zone over longwall mine gobs.

7.3 Sensitivity analysis

In general, sensitivity analysis is carried out to assess the influence of the input parameters of a model on its output parameter(s). The cosine amplitude method (CAM), one of the most important methods in this field, is usually employed to determine the relationships between the output parameters and their respective inputs [53]. In the CAM, the impact of each input variable on the output parameter (\(r_{ij}\)) is computed using this equation:

$$r_{ij} = \sum\limits_{k = 1}^{m} {x_{ik} x_{jk} /} \sqrt {\sum\limits_{k = 1}^{m} {x_{ik}^{2} \sum\limits_{k = 1}^{m} {x_{jk}^{2} } } }$$
(7)

where \(x_{ik}\) and \(x_{jk}\) are the kth values of the input and of the corresponding model output, respectively, and m is the number of data points.
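Eq. 7 amounts to the cosine of the angle between an input column and the output column, so it is short to sketch. The function names, the variable labels and the sample numbers below are assumptions for illustration; whether the raw or the normalized data were used in the study is not stated, so the sketch simply takes whatever arrays it is given.

```python
import numpy as np

def cosine_amplitude(x_i, x_j):
    """Strength of relation r_ij between two data vectors (Eq. 7)."""
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    return float(np.dot(x_i, x_j) / np.sqrt(np.dot(x_i, x_i) * np.dot(x_j, x_j)))

def rank_inputs(X, y, names):
    """Compute r_ij for every input column against the output and
    return (name, r_ij) pairs sorted from most to least influential."""
    X = np.asarray(X, dtype=float)
    scores = [(name, cosine_amplitude(X[:, j], y)) for j, name in enumerate(names)]
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Made-up example with two hypothetical inputs "a" and "b".
X_demo = [[1.0, 1.0], [2.0, 0.0], [3.0, 1.0]]
y_demo = [1.0, 2.0, 3.0]
ranking = rank_inputs(X_demo, y_demo, ["a", "b"])
```

Values of \(r_{ij}\) closer to 1 indicate a stronger relation between that input and the output; a bar chart of the ranked values gives a plot of the kind shown in Fig. 22.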

The impact values (\(r_{ij}\)) of the input variables on the HCFZ predicted by the ANN model are shown in Fig. 22. As shown in this figure, the input variables in decreasing order of influence on the HCFZ are overburden depth, panel width, mined seam (ore) height, rock mass unit weight, rock mass bulking factor, rock mass unconfined compressive strength, rock mass friction angle, rock mass elastic modulus and rock mass Poisson’s ratio. Accordingly, overburden depth and panel width are the most influential parameters for the HCFZ, whereas Poisson’s ratio and elastic modulus are the least influential.

Fig. 22 Impact values of input parameters on the HCFZ

8 Conclusion

The caving–fracturing zone above longwall gobs is a key aspect of underground mining that plays a crucial role in determining the face supports and the stress transferred toward the front and rib-side abutments. Because of the complex environment of longwall mining, an ANN approach was proposed in this research for estimating the height of the caving–fracturing zone (HCFZ) over longwall mine gobs, and its results were compared with those of multivariable regression analysis (MVRA). On the basis of minimum error, a feed-forward back-propagation neural network with a 9-6-4-1 architecture, the TRAINLM learning function and the LOGSIG transfer function was found to be the optimum network. To assess the performance of the proposed models, the determination coefficient (\(R^{2}\)), variance accounted for (VAF), mean absolute error (\(E_{\text{a}}\)) and mean relative error (\(E_{\text{r}}\)) were computed on the testing datasets. The results of the ANN and MVRA models, expressed as multiples of the mined seam (ore) height and the panel width, were then compared with the results of the available models in the literature. Finally, a sensitivity analysis based on the cosine amplitude method (CAM) was carried out to determine the impacts of the input parameters on the ANN model output. Comparison of the proposed models on the above performance indices showed that the accuracy of the ANN model is higher than that of the MVRA model and that its outputs conform very closely to the measured values. In addition, the comparative analysis showed that the results of the ANN model agree closely with the in situ results extracted from the literature and with those obtained from the existing empirical, analytical, numerical and physical models. The sensitivity analysis showed that the most and least influential parameters for the HCFZ are overburden depth and Poisson’s ratio, respectively. The main advantage of the ANN model is that the common effective parameters for the HCFZ, including the geometrical and geomechanical properties of the overburden strata and the mined panel, were taken into account in the modeling. According to the results of this research, the proposed optimum ANN model can be a powerful and applicable technique for estimating the height of the caving–fracturing zone, and it can therefore be used to predict the HCFZ above longwall gobs in underground mining.